Full Code of theodi/csvlint.rb for AI

main a770a9448ebb cached

94 files

331.6 KB

92.5k tokens

161 symbols

1 requests

Download .txt

Showing preview only (356K chars total). Download the full file or copy to clipboard to get everything.

Repository: theodi/csvlint.rb
Branch: main
Commit: a770a9448ebb
Files: 94
Total size: 331.6 KB

Directory structure:
gitextract_4wguoljj/

├── .coveralls.yml
├── .gitattributes
├── .github/
│   ├── ISSUE_TEMPLATE.md
│   ├── PULL_REQUEST_TEMPLATE.md
│   ├── dependabot.yml
│   └── workflows/
│       └── push.yml
├── .gitignore
├── .pre-commit-hooks.yaml
├── .ruby-version
├── .standard_todo.yml
├── Appraisals
├── CHANGELOG.md
├── CODE_OF_CONDUCT.md
├── CONTRIBUTING.md
├── Dockerfile
├── Gemfile
├── LICENSE.md
├── README.md
├── Rakefile
├── bin/
│   ├── create_schema
│   └── csvlint
├── csvlint.gemspec
├── docker_notes_for_windows.txt
├── features/
│   ├── check_format.feature
│   ├── cli.feature
│   ├── csv_options.feature
│   ├── csvupload.feature
│   ├── csvw_schema_validation.feature
│   ├── fixtures/
│   │   ├── cr-line-endings.csv
│   │   ├── crlf-line-endings.csv
│   │   ├── inconsistent-line-endings-unquoted.csv
│   │   ├── inconsistent-line-endings.csv
│   │   ├── invalid-byte-sequence.csv
│   │   ├── invalid_many_rows.csv
│   │   ├── lf-line-endings.csv
│   │   ├── spreadsheet.xls
│   │   ├── spreadsheet.xlsx
│   │   ├── title-row.csv
│   │   ├── valid.csv
│   │   ├── valid_many_rows.csv
│   │   ├── w3.org/
│   │   │   └── .well-known/
│   │   │       └── csvm
│   │   ├── white space in filename.csv
│   │   └── windows-line-endings.csv
│   ├── information.feature
│   ├── parse_csv.feature
│   ├── schema_validation.feature
│   ├── sources.feature
│   ├── step_definitions/
│   │   ├── cli_steps.rb
│   │   ├── csv_options_steps.rb
│   │   ├── information_steps.rb
│   │   ├── parse_csv_steps.rb
│   │   ├── schema_validation_steps.rb
│   │   ├── sources_steps.rb
│   │   ├── validation_errors_steps.rb
│   │   ├── validation_info_steps.rb
│   │   └── validation_warnings_steps.rb
│   ├── support/
│   │   ├── aruba.rb
│   │   ├── earl_formatter.rb
│   │   ├── env.rb
│   │   ├── load_tests.rb
│   │   └── webmock.rb
│   ├── validation_errors.feature
│   ├── validation_info.feature
│   └── validation_warnings.feature
├── gemfiles/
│   ├── activesupport_5.2.gemfile
│   ├── activesupport_6.0.gemfile
│   ├── activesupport_6.1.gemfile
│   ├── activesupport_7.0.gemfile
│   ├── activesupport_7.1.gemfile
│   └── activesupport_7.2.gemfile
├── lib/
│   ├── csvlint/
│   │   ├── cli.rb
│   │   ├── csvw/
│   │   │   ├── column.rb
│   │   │   ├── date_format.rb
│   │   │   ├── metadata_error.rb
│   │   │   ├── number_format.rb
│   │   │   ├── property_checker.rb
│   │   │   ├── table.rb
│   │   │   └── table_group.rb
│   │   ├── error_collector.rb
│   │   ├── error_message.rb
│   │   ├── field.rb
│   │   ├── schema.rb
│   │   ├── validate.rb
│   │   └── version.rb
│   └── csvlint.rb
└── spec/
    ├── csvw/
    │   ├── column_spec.rb
    │   ├── date_format_spec.rb
    │   ├── number_format_spec.rb
    │   ├── table_group_spec.rb
    │   └── table_spec.rb
    ├── field_spec.rb
    ├── schema_spec.rb
    ├── spec_helper.rb
    └── validator_spec.rb

================================================
FILE CONTENTS
================================================

================================================
FILE: .coveralls.yml
================================================
service_name: travis-ci

================================================
FILE: .gitattributes
================================================
# Don't mess with my CSV files
*.csv binary

================================================
FILE: .github/ISSUE_TEMPLATE.md
================================================
> Please provide a general summary of the issue in the Issue Title above
> fill out the headings below as applicable to the issue you are reporting,
> deleting as appropriate but offering us as much detail as you can to help us resolve the issue

### Expected Behaviour
> What should happen?

### Desired Behaviour (for improvement suggestions only)
> if relevant include images or hyperlinks to other resources that clarify the enhancement you're seeking

### Current Behaviour (for problems)
> What currently happens that isn't expected behaviour?

### Steps to Reproduce (for problems)
> Provide a link to a live example, or an unambiguous set of steps to reproduce this bug. Include code to reproduce, if relevant
1.
2.
3.
4.

### Your Environment
> Include as many relevant details about the environment you experienced the bug in - this will help us resolve the bug more expediently
* Environment name and version (e.g. Chrome 39, node.js 5.4):
* Operating System and version (desktop or mobile):


================================================
FILE: .github/PULL_REQUEST_TEMPLATE.md
================================================
This PR fixes #

Changes proposed in this pull request:

-
-
-


================================================
FILE: .github/dependabot.yml
================================================
version: 2
updates:
- package-ecosystem: bundler
  directory: "/"
  schedule:
    interval: daily
  open-pull-requests-limit: 10
- package-ecosystem: github-actions
  directory: "/"
  schedule:
    interval: weekly


================================================
FILE: .github/workflows/push.yml
================================================
name: CI
on:
  push:
    branches: [ main ]
  pull_request:
    branches: [ main ]
jobs:
  appraisal:
    name: Ruby ${{ matrix.ruby-version }} / Rails ${{ matrix.activesupport-version }}
    runs-on: ubuntu-latest
    strategy:
      matrix:
        ruby-version: ['2.5', '2.6', '2.7', '3.0', '3.1', '3.2', '3.3', '3.4', '4.0']
        activesupport-version:
          - activesupport_5.2
          - activesupport_6.0
          - activesupport_6.1
          - activesupport_7.0
          - activesupport_7.1
          - activesupport_7.2
        exclude:
          - ruby-version: '2.5'
            activesupport-version: activesupport_7.0
          - ruby-version: '2.6'
            activesupport-version: activesupport_7.0
          - ruby-version: '2.5'
            activesupport-version: activesupport_7.1
          - ruby-version: '2.6'
            activesupport-version: activesupport_7.1
          - ruby-version: '2.5'
            activesupport-version: activesupport_7.2
          - ruby-version: '2.6'
            activesupport-version: activesupport_7.2
          - ruby-version: '2.7'
            activesupport-version: activesupport_7.2
          - ruby-version: '3.0'
            activesupport-version: activesupport_7.2
      fail-fast: false

    env:
      BUNDLE_GEMFILE: gemfiles/${{ matrix.activesupport-version }}.gemfile

    steps:
      - uses: actions/checkout@v4
      - uses: ruby/setup-ruby@v1
        with:
          bundler-cache: true
          ruby-version: ${{ matrix.ruby-version }}
      - name: Install dependencies
        run: bundle install
      - name: Run the tests
        run: bundle exec rake

  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: ruby/setup-ruby@v1
        with:
          bundler-cache: true
          ruby-version: "4.0"
      - name: Install dependencies
        run: bundle install
      - name: Run the tests
        run: bundle exec standardrb


================================================
FILE: .gitignore
================================================
*.gem
*.rbc
.bundle
.config
.yardoc
Gemfile.lock
InstalledFiles
_yardoc
coverage
doc/
lib/bundler/man
pkg
rdoc
spec/reports
test/tmp
test/version_tmp
tmp
coverage/
/.rspec

.idea
.DS_Store
features/csvw_validation_tests.feature
features/fixtures/csvw

bin/run-csvw-tests

csvlint-earl.ttl
.byebug_history

gemfiles/*.lock


================================================
FILE: .pre-commit-hooks.yaml
================================================
- id: csvlint
  name: csvlint
  entry: csvlint
  language: ruby
  files: \.csv$


================================================
FILE: .ruby-version
================================================
4.0.1


================================================
FILE: .standard_todo.yml
================================================
# Auto generated files with errors to ignore.
# Remove from this list as you refactor files.
---
ignore:
- features/support/load_tests.rb:
  - Security/Open
- lib/csvlint/csvw/column.rb:
  - Style/TernaryParentheses
- lib/csvlint/csvw/date_format.rb:
  - Lint/MixedRegexpCaptureTypes
- lib/csvlint/csvw/number_format.rb:
  - Style/SlicingWithRange
  - Style/IdenticalConditionalBranches
- lib/csvlint/csvw/property_checker.rb:
  - Performance/InefficientHashSearch
  - Naming/VariableName
  - Style/SlicingWithRange
  - Security/Open
  - Lint/BooleanSymbol
- lib/csvlint/csvw/table_group.rb:
  - Style/OptionalArguments
- lib/csvlint/field.rb:
  - Naming/VariableName
- lib/csvlint/schema.rb:
  - Security/Open
  - Style/SlicingWithRange
- lib/csvlint/validate.rb:
  - Performance/Count
  - Lint/BooleanSymbol
  - Naming/VariableName
  - Security/Open
  - Lint/NonLocalExitFromIterator
- lib/csvlint/schema.rb:
  - Lint/UselessRescue
- lib/csvlint/validate.rb:
  -  Lint/UselessRescue
- lib/csvlint/cli.rb:
  - Style/SafeNavigation


================================================
FILE: Appraisals
================================================
# After a new entry: `bundle exec appraisal install`
# Add an entry in `.github/workflows/push.yml`'s file

appraise "activesupport_5.2" do
  gem "activesupport", "~> 5.2.0"
end

appraise "activesupport_6.0" do
  gem "activesupport", "~> 6.0.0"
end

appraise "activesupport_6.1" do
  gem "activesupport", "~> 6.1.0"
end

appraise "activesupport_7.0" do
  gem "activesupport", "~> 7.0.0"
end

appraise "activesupport_7.1" do
  gem "activesupport", "~> 7.1.0"
end

appraise "activesupport_7.2" do
  gem "activesupport", "~> 7.2.0"
end


================================================
FILE: CHANGELOG.md
================================================
# Change Log

## [v1.2.0](https://github.com/data-liberation-front/csvlint.rb/tree/v1.2.0) (2023-02-27)

[Full Changelog](https://github.com/data-liberation-front/csvlint.rb/compare/v1.1.0...v1.2.0)

**Closed issues:**

- Pre-commit integration [\#275](https://github.com/Data-Liberation-Front/csvlint.rb/issues/275)

**Merged pull requests:**

- Pre commit hook [\#276](https://github.com/Data-Liberation-Front/csvlint.rb/pull/276) ([jrottenberg](https://github.com/jrottenberg))

## [v1.1.0](https://github.com/data-liberation-front/csvlint.rb/tree/v1.1.0) (2022-12-28)

[Full Changelog](https://github.com/data-liberation-front/csvlint.rb/compare/v1.0.0...v1.1.0)

**Closed issues:**

- Requires ruby \< 3.2 [\#272](https://github.com/Data-Liberation-Front/csvlint.rb/issues/272)
- Release a new version [\#244](https://github.com/Data-Liberation-Front/csvlint.rb/issues/244)

**Merged pull requests:**

- bump version to 1.1.0 [\#274](https://github.com/Data-Liberation-Front/csvlint.rb/pull/274) ([Floppy](https://github.com/Floppy))
- Add support for Ruby 3.2 [\#273](https://github.com/Data-Liberation-Front/csvlint.rb/pull/273) ([Floppy](https://github.com/Floppy))
- fix lint error [\#271](https://github.com/Data-Liberation-Front/csvlint.rb/pull/271) ([youpy](https://github.com/youpy))
- optimize validation with regular expression [\#270](https://github.com/Data-Liberation-Front/csvlint.rb/pull/270) ([youpy](https://github.com/youpy))
- Bump actions/checkout from 2 to 3 [\#269](https://github.com/Data-Liberation-Front/csvlint.rb/pull/269) ([dependabot[bot]](https://github.com/apps/dependabot))
- Add GitHub Actions to Dependabot [\#267](https://github.com/Data-Liberation-Front/csvlint.rb/pull/267) ([petergoldstein](https://github.com/petergoldstein))
- Lint with standardrb [\#266](https://github.com/Data-Liberation-Front/csvlint.rb/pull/266) ([Floppy](https://github.com/Floppy))
- Add Dockerfile and notes for usage on MS Windows. [\#243](https://github.com/Data-Liberation-Front/csvlint.rb/pull/243) ([jespertp-systematic](https://github.com/jespertp-systematic))

## [v1.0.0](https://github.com/Data-Liberation-Front/csvlint.rb/tree/v1.0.0) (2022-07-13)

[Full Changelog](https://github.com/theodi/csvlint.rb/compare/0.4.0...v1.0.0)

Support Ruby 3.x, and DROPPED support for Ruby 2.4 - that's why the major version bump. That and this has been around long enough that it really shouldn't be on a zero version any more :)

## What's Changed

- Don't patch CSV#init_converters for ruby 2.5 compatibility by @rbmrclo in <https://github.com/Data-Liberation-Front/csvlint.rb/pull/217>

- correct typos in README by @erikj in <https://github.com/Data-Liberation-Front/csvlint.rb/pull/216>
- add info about your PATH by @ftrotter in <https://github.com/Data-Liberation-Front/csvlint.rb/pull/222>
- Remove tests on deprecated ruby versions < 2.3 by @Floppy in <https://github.com/Data-Liberation-Front/csvlint.rb/pull/234>
- Drop mime-types gem dependency by @ohbarye in <https://github.com/Data-Liberation-Front/csvlint.rb/pull/221>
- remove specific version of net-http-persistent in gemspec by @kotaro0522 in <https://github.com/Data-Liberation-Front/csvlint.rb/pull/219>
- Replace colorize with rainbow to make licensing consistent. by @cobbr2 in <https://github.com/Data-Liberation-Front/csvlint.rb/pull/215>
- Update rdf requirement from < 2.0 to < 4.0 by @dependabot-preview in <https://github.com/Data-Liberation-Front/csvlint.rb/pull/231>
- Test on Ruby 2.5 and 2.6 by @Domon in <https://github.com/Data-Liberation-Front/csvlint.rb/pull/223>
- Fix load_from_json deprecation warnings. by @jezhiggins in <https://github.com/Data-Liberation-Front/csvlint.rb/pull/237>
- Fix csvw tests by @Floppy in <https://github.com/Data-Liberation-Front/csvlint.rb/pull/239>
- Test on Ruby 2.6 and 2.7 by @Floppy in <https://github.com/Data-Liberation-Front/csvlint.rb/pull/240>
- Create Dependabot config file by @dependabot-preview in <https://github.com/Data-Liberation-Front/csvlint.rb/pull/245>
- Include active_support/object to ensure this works in ruby 2.6 by @mseverini in <https://github.com/Data-Liberation-Front/csvlint.rb/pull/246>
- add CI workflow for github actions by @Floppy in <https://github.com/Data-Liberation-Front/csvlint.rb/pull/255>
- Enable and fix tests for Ruby 2.5 by @Floppy in <https://github.com/Data-Liberation-Front/csvlint.rb/pull/259>
- Support Ruby 2.6 by @Floppy in <https://github.com/Data-Liberation-Front/csvlint.rb/pull/262>
- Ruby 2.7 support by @Floppy in <https://github.com/Data-Liberation-Front/csvlint.rb/pull/263>
- Drop support for Ruby 2.4 by @Floppy in <https://github.com/Data-Liberation-Front/csvlint.rb/pull/265>
- Ruby 3.0 by @Floppy in <https://github.com/Data-Liberation-Front/csvlint.rb/pull/264>

## New Contributors

- @rbmrclo made their first contribution in <https://github.com/Data-Liberation-Front/csvlint.rb/pull/217>

- @erikj made their first contribution in <https://github.com/Data-Liberation-Front/csvlint.rb/pull/216>
- @ftrotter made their first contribution in <https://github.com/Data-Liberation-Front/csvlint.rb/pull/222>
- @ohbarye made their first contribution in <https://github.com/Data-Liberation-Front/csvlint.rb/pull/221>
- @kotaro0522 made their first contribution in <https://github.com/Data-Liberation-Front/csvlint.rb/pull/219>
- @cobbr2 made their first contribution in <https://github.com/Data-Liberation-Front/csvlint.rb/pull/215>
- @dependabot-preview made their first contribution in <https://github.com/Data-Liberation-Front/csvlint.rb/pull/231>
- @Domon made their first contribution in <https://github.com/Data-Liberation-Front/csvlint.rb/pull/223>
- @mseverini made their first contribution in <https://github.com/Data-Liberation-Front/csvlint.rb/pull/246>

## [0.4.0](https://github.com/theodi/csvlint.rb/tree/0.4.0) (2017-xx-xx)

[Full Changelog](https://github.com/theodi/csvlint.rb/compare/0.3.3...0.4.0)

- Support for Ruby 2.4
  - Ruby 2.4 improves detections of unclosed quotes
- Support Rails ~> 5.0
- Added `--werror` flag to command line, to treat warnings as errors
- Deprecated `Schema#load_from_json` and replaced with `Schema#load_from_uri`. Method will be removed in 1.0.0.
- Added `Schema#load_from_string` to load from a string instead of reading a URI

**Closed issues:**

- CLI doesn't handle filenames with spaces [\#182](https://github.com/theodi/csvlint.rb/issues/182)

## [0.3.3](https://github.com/theodi/csvlint.rb/tree/0.3.3) (2016-11-10)

[Full Changelog](https://github.com/theodi/csvlint.rb/compare/0.3.2...0.3.3)

**Closed issues:**

- testing issue alerts, sorry [\#186](https://github.com/theodi/csvlint.rb/issues/186)

**Merged pull requests:**

- Add row + col to foreign key & duplicate key errors [\#188](https://github.com/theodi/csvlint.rb/pull/188) ([nickzoic](https://github.com/nickzoic))
- Trap-and-bin this [\#185](https://github.com/theodi/csvlint.rb/pull/185) ([pikesley](https://github.com/pikesley))
- csvw: common property names can be URLs [\#181](https://github.com/theodi/csvlint.rb/pull/181) ([JeniT](https://github.com/JeniT))
- force UTF-8 if encoding is ASCII-8BIT [\#180](https://github.com/theodi/csvlint.rb/pull/180) ([JeniT](https://github.com/JeniT))

## [0.3.2](https://github.com/theodi/csvlint.rb/tree/0.3.2) (2016-05-24)

[Full Changelog](https://github.com/theodi/csvlint.rb/compare/0.3.1...0.3.2)

**Merged pull requests:**

- Add schema errors to cli json [\#184](https://github.com/theodi/csvlint.rb/pull/184) ([pezholio](https://github.com/pezholio))

## [0.3.1](https://github.com/theodi/csvlint.rb/tree/0.3.1) (2016-05-23)

[Full Changelog](https://github.com/theodi/csvlint.rb/compare/0.3.0...0.3.1)

**Closed issues:**

- Error installing on Windows because of \*escape\_utils\* dependency [\#175](https://github.com/theodi/csvlint.rb/issues/175)

**Merged pull requests:**

- Add CLI option to output JSON [\#183](https://github.com/theodi/csvlint.rb/pull/183) ([pezholio](https://github.com/pezholio))

## [0.3.0](https://github.com/theodi/csvlint.rb/tree/0.3.0) (2016-01-12)

[Full Changelog](https://github.com/theodi/csvlint.rb/compare/0.2.6...0.3.0)

**Merged pull requests:**

- still increment current\_line after invalid\_encoding error [\#174](https://github.com/theodi/csvlint.rb/pull/174) ([wjordan213](https://github.com/wjordan213))
- Support for CSV on the Web transformations [\#173](https://github.com/theodi/csvlint.rb/pull/173) ([JeniT](https://github.com/JeniT))

## [0.2.6](https://github.com/theodi/csvlint.rb/tree/0.2.6) (2015-11-16)

[Full Changelog](https://github.com/theodi/csvlint.rb/compare/0.2.5...0.2.6)

## [0.2.5](https://github.com/theodi/csvlint.rb/tree/0.2.5) (2015-11-16)

[Full Changelog](https://github.com/theodi/csvlint.rb/compare/0.2.4...0.2.5)

**Merged pull requests:**

- Use STDIN instead of ARGF [\#169](https://github.com/theodi/csvlint.rb/pull/169) ([pezholio](https://github.com/pezholio))

## [0.2.4](https://github.com/theodi/csvlint.rb/tree/0.2.4) (2015-10-20)

[Full Changelog](https://github.com/theodi/csvlint.rb/compare/0.2.3...0.2.4)

**Merged pull requests:**

- Fixes for CLI [\#164](https://github.com/theodi/csvlint.rb/pull/164) ([pezholio](https://github.com/pezholio))

## [0.2.3](https://github.com/theodi/csvlint.rb/tree/0.2.3) (2015-10-20)

[Full Changelog](https://github.com/theodi/csvlint.rb/compare/0.2.2...0.2.3)

**Closed issues:**

- Include field name with error [\#161](https://github.com/theodi/csvlint.rb/issues/161)
- Refactor the binary [\#150](https://github.com/theodi/csvlint.rb/issues/150)

**Merged pull requests:**

- Refactor CLI [\#163](https://github.com/theodi/csvlint.rb/pull/163) ([pezholio](https://github.com/pezholio))
- Update schema file example to clarify type [\#162](https://github.com/theodi/csvlint.rb/pull/162) ([wachunga](https://github.com/wachunga))

## [0.2.2](https://github.com/theodi/csvlint.rb/tree/0.2.2) (2015-10-09)

[Full Changelog](https://github.com/theodi/csvlint.rb/compare/0.2.1...0.2.2)

**Closed issues:**

- Eliminate some date and time formats \(for speed\) [\#105](https://github.com/theodi/csvlint.rb/issues/105)

**Merged pull requests:**

- Check characters in validate\_line method [\#160](https://github.com/theodi/csvlint.rb/pull/160) ([pezholio](https://github.com/pezholio))
- Further optimisations [\#159](https://github.com/theodi/csvlint.rb/pull/159) ([pezholio](https://github.com/pezholio))
- More optimizations after \#157 [\#158](https://github.com/theodi/csvlint.rb/pull/158) ([jpmckinney](https://github.com/jpmckinney))
- Memoize the result of CSV\#encode\_re [\#157](https://github.com/theodi/csvlint.rb/pull/157) ([jpmckinney](https://github.com/jpmckinney))
- Don't pass leading string to parse\_line [\#155](https://github.com/theodi/csvlint.rb/pull/155) ([pezholio](https://github.com/pezholio))

## [0.2.1](https://github.com/theodi/csvlint.rb/tree/0.2.1) (2015-10-07)

[Full Changelog](https://github.com/theodi/csvlint.rb/compare/0.2.0...0.2.1)

**Implemented enhancements:**

- Get total rows number about the CSV file that was validated [\#143](https://github.com/theodi/csvlint.rb/issues/143)

**Closed issues:**

- Optimization: Stream CSV [\#122](https://github.com/theodi/csvlint.rb/issues/122)

**Merged pull requests:**

- Add `row\_count` method [\#153](https://github.com/theodi/csvlint.rb/pull/153) ([pezholio](https://github.com/pezholio))
- Streaming validation [\#146](https://github.com/theodi/csvlint.rb/pull/146) ([pezholio](https://github.com/pezholio))

## [0.2.0](https://github.com/theodi/csvlint.rb/tree/0.2.0) (2015-10-05)

[Full Changelog](https://github.com/theodi/csvlint.rb/compare/0.1.4...0.2.0)

**Closed issues:**

- CSV on the web support [\#141](https://github.com/theodi/csvlint.rb/issues/141)

**Merged pull requests:**

- Recover from `ArgumentError`s when attempting to locate a schema and detect bad schema when JSON is malformed [\#152](https://github.com/theodi/csvlint.rb/pull/152) ([pezholio](https://github.com/pezholio))
- Catch errors if link headers are don't have particular values [\#151](https://github.com/theodi/csvlint.rb/pull/151) ([pezholio](https://github.com/pezholio))
- Rescue excel warning [\#149](https://github.com/theodi/csvlint.rb/pull/149) ([quadrophobiac](https://github.com/quadrophobiac))
- CSVW-based validation! [\#142](https://github.com/theodi/csvlint.rb/pull/142) ([JeniT](https://github.com/JeniT))

## [0.1.4](https://github.com/theodi/csvlint.rb/tree/0.1.4) (2015-08-06)

[Full Changelog](https://github.com/theodi/csvlint.rb/compare/0.1.3...0.1.4)

**Merged pull requests:**

- change made to the constraint parameter in order that it is more cons… [\#140](https://github.com/theodi/csvlint.rb/pull/140) ([quadrophobiac](https://github.com/quadrophobiac))

## [0.1.3](https://github.com/theodi/csvlint.rb/tree/0.1.3) (2015-07-24)

[Full Changelog](https://github.com/theodi/csvlint.rb/compare/0.1.2...0.1.3)

**Merged pull requests:**

- Error reporting schema expanded test suite [\#138](https://github.com/theodi/csvlint.rb/pull/138) ([quadrophobiac](https://github.com/quadrophobiac))
- Validate header size improvement [\#137](https://github.com/theodi/csvlint.rb/pull/137) ([adamc00](https://github.com/adamc00))
- Invalid schema [\#132](https://github.com/theodi/csvlint.rb/pull/132) ([bcouston](https://github.com/bcouston))

## [0.1.2](https://github.com/theodi/csvlint.rb/tree/0.1.2) (2015-07-15)

[Full Changelog](https://github.com/theodi/csvlint.rb/compare/0.1.1...0.1.2)

**Closed issues:**

- When an encoding error is thrown the line content is put into the column field in the error object [\#131](https://github.com/theodi/csvlint.rb/issues/131)

**Merged pull requests:**

- Catch invalid URIs [\#133](https://github.com/theodi/csvlint.rb/pull/133) ([pezholio](https://github.com/pezholio))
- Emit a warning when the CSV header does not match the supplied schema [\#127](https://github.com/theodi/csvlint.rb/pull/127) ([adamc00](https://github.com/adamc00))

## [0.1.1](https://github.com/theodi/csvlint.rb/tree/0.1.1) (2015-07-13)

[Full Changelog](https://github.com/theodi/csvlint.rb/compare/0.1.0...0.1.1)

**Closed issues:**

- Add Command Line Support [\#128](https://github.com/theodi/csvlint.rb/issues/128)
- BUG: Incorrect inconsistent\_values error on numeric columns [\#106](https://github.com/theodi/csvlint.rb/issues/106)

**Merged pull requests:**

- Fixes line content incorrectly being put into the row column field when there is an encoding error. [\#130](https://github.com/theodi/csvlint.rb/pull/130) ([glacier](https://github.com/glacier))
- Add command line help [\#129](https://github.com/theodi/csvlint.rb/pull/129) ([pezholio](https://github.com/pezholio))
- Remove stray q character. [\#125](https://github.com/theodi/csvlint.rb/pull/125) ([adamc00](https://github.com/adamc00))
- csvlint utility can take arguments to specify a schema and pp errors [\#124](https://github.com/theodi/csvlint.rb/pull/124) ([adamc00](https://github.com/adamc00))
- Fixed warning - use expect\( \) rather than .should [\#123](https://github.com/theodi/csvlint.rb/pull/123) ([jezhiggins](https://github.com/jezhiggins))
- Fixed spelling mistake [\#121](https://github.com/theodi/csvlint.rb/pull/121) ([jezhiggins](https://github.com/jezhiggins))
- Avoid using \#blank? if unnecessary [\#120](https://github.com/theodi/csvlint.rb/pull/120) ([jpmckinney](https://github.com/jpmckinney))
- eliminate some date and time formats, related \#105 [\#119](https://github.com/theodi/csvlint.rb/pull/119) ([jpmckinney](https://github.com/jpmckinney))
- Match another CSV error about line endings [\#118](https://github.com/theodi/csvlint.rb/pull/118) ([jpmckinney](https://github.com/jpmckinney))
- fixed typo mistake in README [\#117](https://github.com/theodi/csvlint.rb/pull/117) ([railsfactory-kumaresan](https://github.com/railsfactory-kumaresan))
- Integrate @jpmickinney's build\_formats improvements [\#112](https://github.com/theodi/csvlint.rb/pull/112) ([Floppy](https://github.com/Floppy))
- make limit\_lines into a non-dialect option [\#110](https://github.com/theodi/csvlint.rb/pull/110) ([Floppy](https://github.com/Floppy))
- fix coveralls stats [\#109](https://github.com/theodi/csvlint.rb/pull/109) ([Floppy](https://github.com/Floppy))
- Limit lines [\#101](https://github.com/theodi/csvlint.rb/pull/101) ([Hoedic](https://github.com/Hoedic))

## [0.1.0](https://github.com/theodi/csvlint.rb/tree/0.1.0) (2014-11-27)

**Implemented enhancements:**

- Blank values shouldn't count as inconsistencies [\#90](https://github.com/theodi/csvlint.rb/issues/90)
- Make sure we don't check schema column count and ragged row count together [\#66](https://github.com/theodi/csvlint.rb/issues/66)
- Include the failed constraints in error message when doing field validation [\#64](https://github.com/theodi/csvlint.rb/issues/64)
- Include the column value in error message when field validation fails [\#63](https://github.com/theodi/csvlint.rb/issues/63)
- Expose optional JSON table schema fields [\#55](https://github.com/theodi/csvlint.rb/issues/55)
- Ensure header rows are properly handled and validated [\#48](https://github.com/theodi/csvlint.rb/issues/48)
- Support zipped CSV? [\#30](https://github.com/theodi/csvlint.rb/issues/30)
- Improve feedback on inconsistent values [\#29](https://github.com/theodi/csvlint.rb/issues/29)
- Reported error positions are not massively useful [\#15](https://github.com/theodi/csvlint.rb/issues/15)

**Fixed bugs:**

- undefined method `\[\]' for nil:NilClass from fetch\_error [\#71](https://github.com/theodi/csvlint.rb/issues/71)
- Inconsistent column bases [\#69](https://github.com/theodi/csvlint.rb/issues/69)
- Improve error handling in Schema loading [\#42](https://github.com/theodi/csvlint.rb/issues/42)
- Recover from some line ending problems [\#41](https://github.com/theodi/csvlint.rb/issues/41)
- Inconsistent values due to number format differences [\#32](https://github.com/theodi/csvlint.rb/issues/32)
- New lines in quoted fields are valid [\#31](https://github.com/theodi/csvlint.rb/issues/31)
- Wrongly reporting incorrect file extension [\#23](https://github.com/theodi/csvlint.rb/issues/23)
- Incorrect extension reported when URL has query options at the end [\#14](https://github.com/theodi/csvlint.rb/issues/14)

**Closed issues:**

- Get gem continuously deploying [\#93](https://github.com/theodi/csvlint.rb/issues/93)
- Publish on rubygems.org [\#92](https://github.com/theodi/csvlint.rb/issues/92)
- Duplicate column names [\#87](https://github.com/theodi/csvlint.rb/issues/87)
- Return code is always 0 \(except when it isn't\) [\#85](https://github.com/theodi/csvlint.rb/issues/85)
- Can't pipe data to csvlint [\#84](https://github.com/theodi/csvlint.rb/issues/84)
- They have some validator running if someone wants to inspect it for "inspiration" [\#27](https://github.com/theodi/csvlint.rb/issues/27)
- Allow CSV parsing options to be configured as a parameter [\#6](https://github.com/theodi/csvlint.rb/issues/6)
- Use explicit CSV parsing options [\#5](https://github.com/theodi/csvlint.rb/issues/5)
- Improving encoding detection [\#2](https://github.com/theodi/csvlint.rb/issues/2)

**Merged pull requests:**

- Speed up \#build\_formats \(changes its API\) [\#103](https://github.com/theodi/csvlint.rb/pull/103) ([jpmckinney](https://github.com/jpmckinney))
- Continuously deploy gem [\#102](https://github.com/theodi/csvlint.rb/pull/102) ([pezholio](https://github.com/pezholio))
- Make csvlint way faster [\#99](https://github.com/theodi/csvlint.rb/pull/99) ([jpmckinney](https://github.com/jpmckinney))
- Update README.md [\#98](https://github.com/theodi/csvlint.rb/pull/98) ([rmalecky](https://github.com/rmalecky))
- Undeclared header error [\#95](https://github.com/theodi/csvlint.rb/pull/95) ([Floppy](https://github.com/Floppy))
- Blank values shouldn't count as inconsistencies [\#91](https://github.com/theodi/csvlint.rb/pull/91) ([pezholio](https://github.com/pezholio))
- Use `reject` instead of `delete\_if` [\#89](https://github.com/theodi/csvlint.rb/pull/89) ([pezholio](https://github.com/pezholio))
- Raise a warning if a title row is found [\#88](https://github.com/theodi/csvlint.rb/pull/88) ([pezholio](https://github.com/pezholio))
- Improve executable [\#86](https://github.com/theodi/csvlint.rb/pull/86) ([pezholio](https://github.com/pezholio))
- Feature undeclared header [\#83](https://github.com/theodi/csvlint.rb/pull/83) ([ldodds](https://github.com/ldodds))
- Support xsd:integer [\#82](https://github.com/theodi/csvlint.rb/pull/82) ([ldodds](https://github.com/ldodds))
- Downgrade header errors [\#81](https://github.com/theodi/csvlint.rb/pull/81) ([ldodds](https://github.com/ldodds))
- Go home, pry [\#78](https://github.com/theodi/csvlint.rb/pull/78) ([pikesley](https://github.com/pikesley))
- Use type validations to check consistency [\#77](https://github.com/theodi/csvlint.rb/pull/77) ([pezholio](https://github.com/pezholio))
- Add data accessor [\#76](https://github.com/theodi/csvlint.rb/pull/76) ([Floppy](https://github.com/Floppy))
- Add failed constraints to schema errors [\#75](https://github.com/theodi/csvlint.rb/pull/75) ([ldodds](https://github.com/ldodds))
- Only perform ragged row check if there's no schema [\#74](https://github.com/theodi/csvlint.rb/pull/74) ([ldodds](https://github.com/ldodds))
- Handle tempfiles [\#73](https://github.com/theodi/csvlint.rb/pull/73) ([pezholio](https://github.com/pezholio))
- Catch errors if regex doesn't match [\#72](https://github.com/theodi/csvlint.rb/pull/72) ([pezholio](https://github.com/pezholio))
- Inconsistent column base [\#70](https://github.com/theodi/csvlint.rb/pull/70) ([ldodds](https://github.com/ldodds))
- include column name in :header\_name message [\#68](https://github.com/theodi/csvlint.rb/pull/68) ([Floppy](https://github.com/Floppy))
- Record default dialect [\#67](https://github.com/theodi/csvlint.rb/pull/67) ([pezholio](https://github.com/pezholio))
- Schema validation message improvements [\#65](https://github.com/theodi/csvlint.rb/pull/65) ([Floppy](https://github.com/Floppy))
- Fix ignore empty fields [\#62](https://github.com/theodi/csvlint.rb/pull/62) ([ldodds](https://github.com/ldodds))
- Create stub schema from existing CSV file [\#61](https://github.com/theodi/csvlint.rb/pull/61) ([ldodds](https://github.com/ldodds))
- Validate dates [\#59](https://github.com/theodi/csvlint.rb/pull/59) ([ldodds](https://github.com/ldodds))
- add schema access from validator [\#58](https://github.com/theodi/csvlint.rb/pull/58) ([Floppy](https://github.com/Floppy))
- Allow schema and fields to have title and description [\#57](https://github.com/theodi/csvlint.rb/pull/57) ([ldodds](https://github.com/ldodds))
- Feature min max ranges [\#56](https://github.com/theodi/csvlint.rb/pull/56) ([ldodds](https://github.com/ldodds))
- Check header without schema [\#54](https://github.com/theodi/csvlint.rb/pull/54) ([ldodds](https://github.com/ldodds))
- Validate types [\#53](https://github.com/theodi/csvlint.rb/pull/53) ([pikesley](https://github.com/pikesley))
- Added open\_uri\_redirections to allow HTTP/HTTPS transfers [\#52](https://github.com/theodi/csvlint.rb/pull/52) ([ldodds](https://github.com/ldodds))
- Added docs on CSV options and header error/warning messages [\#51](https://github.com/theodi/csvlint.rb/pull/51) ([ldodds](https://github.com/ldodds))
- Feature header validation [\#50](https://github.com/theodi/csvlint.rb/pull/50) ([ldodds](https://github.com/ldodds))
- Handle unique columns [\#49](https://github.com/theodi/csvlint.rb/pull/49) ([pikesley](https://github.com/pikesley))
- Validate all the fields [\#47](https://github.com/theodi/csvlint.rb/pull/47) ([ldodds](https://github.com/ldodds))
- Tolerate incomplete schemas [\#46](https://github.com/theodi/csvlint.rb/pull/46) ([ldodds](https://github.com/ldodds))
- Add accessor for line breaks [\#45](https://github.com/theodi/csvlint.rb/pull/45) ([Floppy](https://github.com/Floppy))
- update README for info messages and new error types [\#44](https://github.com/theodi/csvlint.rb/pull/44) ([Floppy](https://github.com/Floppy))
- Info messages for line breaks [\#43](https://github.com/theodi/csvlint.rb/pull/43) ([Floppy](https://github.com/Floppy))
- Add category to messages [\#40](https://github.com/theodi/csvlint.rb/pull/40) ([ldodds](https://github.com/ldodds))
- Badges [\#39](https://github.com/theodi/csvlint.rb/pull/39) ([pikesley](https://github.com/pikesley))
- Generic field validation using JSON Table Schema [\#38](https://github.com/theodi/csvlint.rb/pull/38) ([ldodds](https://github.com/ldodds))
- Feature validate strings and files [\#37](https://github.com/theodi/csvlint.rb/pull/37) ([ldodds](https://github.com/ldodds))
- Support reporting of column number in errors [\#36](https://github.com/theodi/csvlint.rb/pull/36) ([ldodds](https://github.com/ldodds))
- Fix up casing of keys in CSV DDF options [\#35](https://github.com/theodi/csvlint.rb/pull/35) ([ldodds](https://github.com/ldodds))
- Add errors for incorrect newlines [\#34](https://github.com/theodi/csvlint.rb/pull/34) ([pezholio](https://github.com/pezholio))
- Change from parsing CSV line by line to using CSV.new and trapping errors [\#33](https://github.com/theodi/csvlint.rb/pull/33) ([ldodds](https://github.com/ldodds))
- Improved the README, tweaked LICENSE [\#28](https://github.com/theodi/csvlint.rb/pull/28) ([ldodds](https://github.com/ldodds))
- Handle 404s [\#26](https://github.com/theodi/csvlint.rb/pull/26) ([pezholio](https://github.com/pezholio))
- Create more fine-grained errors and warnings for content type issues [\#25](https://github.com/theodi/csvlint.rb/pull/25) ([ldodds](https://github.com/ldodds))
- Report trailing empty rows as an error. Previously threw exception [\#24](https://github.com/theodi/csvlint.rb/pull/24) ([ldodds](https://github.com/ldodds))
- Simplify the guessing of column types [\#22](https://github.com/theodi/csvlint.rb/pull/22) ([ldodds](https://github.com/ldodds))
- Class-ify error messages [\#21](https://github.com/theodi/csvlint.rb/pull/21) ([pezholio](https://github.com/pezholio))
- Error extracts [\#20](https://github.com/theodi/csvlint.rb/pull/20) ([Floppy](https://github.com/Floppy))
- Return headers [\#19](https://github.com/theodi/csvlint.rb/pull/19) ([pezholio](https://github.com/pezholio))
- Return a warning if no character set specified [\#18](https://github.com/theodi/csvlint.rb/pull/18) ([pezholio](https://github.com/pezholio))
- Ignore query params [\#17](https://github.com/theodi/csvlint.rb/pull/17) ([Floppy](https://github.com/Floppy))
- Add invalid\_encoding error for invalid byte sequences [\#16](https://github.com/theodi/csvlint.rb/pull/16) ([ldodds](https://github.com/ldodds))
- Check inconsistent values [\#13](https://github.com/theodi/csvlint.rb/pull/13) ([pezholio](https://github.com/pezholio))
- Add CSV dialect options [\#11](https://github.com/theodi/csvlint.rb/pull/11) ([pezholio](https://github.com/pezholio))
- Return warning if extension doesn't match content type [\#10](https://github.com/theodi/csvlint.rb/pull/10) ([pezholio](https://github.com/pezholio))
- Return warnings for file extension [\#8](https://github.com/theodi/csvlint.rb/pull/8) ([pezholio](https://github.com/pezholio))
- Detect blank rows [\#7](https://github.com/theodi/csvlint.rb/pull/7) ([pezholio](https://github.com/pezholio))
- Detect bad content type [\#3](https://github.com/theodi/csvlint.rb/pull/3) ([pezholio](https://github.com/pezholio))
- Return information about CSV [\#1](https://github.com/theodi/csvlint.rb/pull/1) ([pezholio](https://github.com/pezholio))

\* *This Change Log was automatically generated by [github_changelog_generator](https://github.com/skywinder/Github-Changelog-Generator)*


================================================
FILE: CODE_OF_CONDUCT.md
================================================
## Code of Conduct

### Our Pledge

In the interest of fostering an open and welcoming environment, we as
contributors and maintainers pledge to making participation in our project and
our community a harassment-free experience for everyone, regardless of age, body
size, disability, ethnicity, gender identity and expression, level of experience,
nationality, personal appearance, race, religion, or sexual identity and
orientation.

### Our Standards

Examples of behavior that contributes to creating a positive environment
include:

* Using welcoming and inclusive language
* Being respectful of differing viewpoints and experiences
* Gracefully accepting constructive criticism
* Focusing on what is best for the community
* Showing empathy towards other community members

Examples of unacceptable behavior by participants include:

* The use of sexualized language or imagery and unwelcome sexual attention or
advances
* Trolling, insulting/derogatory comments, and personal or political attacks
* Public or private harassment
* Publishing others' private information, such as a physical or electronic
  address, without explicit permission
* Other conduct which could reasonably be considered inappropriate in a
  professional setting

### Our Responsibilities

Project maintainers are responsible for clarifying the standards of acceptable
behavior and are expected to take appropriate and fair corrective action in
response to any instances of unacceptable behavior.

Project maintainers have the right and responsibility to remove, edit, or
reject comments, commits, code, wiki edits, issues, and other contributions
that are not aligned to this Code of Conduct, or to ban temporarily or
permanently any contributor for other behaviors that they deem inappropriate,
threatening, offensive, or harmful.

### Scope

This Code of Conduct applies both within project spaces and in public spaces
when an individual is representing the project or its community. Examples of
representing a project or community include using an official project e-mail
address, posting via an official social media account, or acting as an appointed
representative at an online or offline event. Representation of a project may be
further defined and clarified by project maintainers.

### Enforcement

Instances of abusive, harassing, or otherwise unacceptable behavior may be
reported by contacting the project team at [labs@theodi.org](mailto:labs@theodi.org). All
complaints will be reviewed and investigated and will result in a response that
is deemed necessary and appropriate to the circumstances. The project team is
obligated to maintain confidentiality with regard to the reporter of an incident.
Further details of specific enforcement policies may be posted separately.

Project maintainers who do not follow or enforce the Code of Conduct in good
faith may face temporary or permanent repercussions as determined by other
members of the project's leadership.

### Attribution

This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4,
available at [http://contributor-covenant.org/version/1/4][version]

[homepage]: http://contributor-covenant.org
[version]: http://contributor-covenant.org/version/1/4/


================================================
FILE: CONTRIBUTING.md
================================================
# Contributing to CSVlint.rb

The CSVlint library is open source, and contributions are gratefully accepted!
Details on how to contribute are below. By participating in this project, you agree to abide by our [Code of Conduct](https://github.com/theodi/csvlint.rb/blob/CODE_OF_CONDUCT.md).

Before you start coding, please reach out to us either on our [gitter channel](https://gitter.im/theodi/toolbox) or by tagging a repository administrator on the issue ticket you are interested in contributing towards to indicate your interest in helping.

If this is your first time contributing to the ODI’s codebase you will need to [create a fork of this repository](https://help.github.com/articles/fork-a-repo/).

Consult our [Getting Started Guide](https://github.com/theodi/toolbox/wiki/Developers-Guide:-Getting-Started) (if necessary) and then follow the [readme instructions](https://github.com/theodi/csvlint.rb/blob/master/README.md#development) to get your Development environment running locally

Ensure that the [tests](https://github.com/theodi/csvlint.rb/blob/master/README.md#tests) pass before working on your contribution

## Code Review Process

All contributions to the codebase - whether fork or pull request - will be reviewed per the below criteria.
To increase your chances of your push being accepted please be aware of the following
- Write [well formed commit messages](http://tbaggery.com/2008/04/19/a-note-about-git-commit-messages.html)
- Follow our [style guide recommendations](https://github.com/theodi/toolbox/blob/README.md#code-style-guide)
- Write tests for all changes (additions or refactors of existing code).
- Of the github integrations we use two will be utilised to check appraise your contribution. In order of priority these are
    - Travis ensures that all tests (existing and additions) pass
    - Travis/Coveralls ensures that overall test coverage for lines of code meets a certain threshold. If this metric dips below what it previously was for the repository you’re pushing to then your PR will be rejected
    - Gemnasium ensures dependencies are up to date
- Once your PR is published and passes the above checks a repository administrator will review your contribution. Where appropriate comments may be provided and amendments suggested before your PR is merged into Master.
- Once your PR is accepted you will be granted push access to the repository you have contributed to! Congratulations on joining our community, you’ll no longer need to work from forks.

If you make a contribution to another repository in the Toolbox you will be expected to repeat this process. Read more about that [here](https://github.com/theodi/toolbox/blob/master/README.md#push-access).

## Code Style Guide

We follow the same code style conventions as detailed in Github’s [Ruby Style Guide](https://github.com/github/rubocop-github/blob/master/STYLEGUIDE.md)


================================================
FILE: Dockerfile
================================================
FROM ruby:2.5.8-buster

# throw errors if Gemfile has been modified since Gemfile.lock
RUN bundle config --global frozen 1

WORKDIR /usr/src/app

ENV LANG C.UTF-8

COPY ./lib/csvlint/version.rb ./lib/csvlint/
COPY csvlint.gemspec Gemfile Gemfile.lock ./
RUN bundle install

COPY ./ ./

CMD ["./bin/csvlint"]


================================================
FILE: Gemfile
================================================
source "https://rubygems.org"

# Specify your gem's dependencies in csvlint.rb.gemspec
gemspec


================================================
FILE: LICENSE.md
================================================
##Copyright (c) 2014 The Open Data Institute

#MIT License

Permission is hereby granted, free of charge, to any person obtaining
a copy of this software and associated documentation files (the
"Software"), to deal in the Software without restriction, including
without limitation the rights to use, copy, modify, merge, publish,
distribute, sublicense, and/or sell copies of the Software, and to
permit persons to whom the Software is furnished to do so, subject to
the following conditions:

The above copyright notice and this permission notice shall be
included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

================================================
FILE: README.md
================================================
[![Build Status](https://img.shields.io/github/workflow/status/Data-Liberation-Front/csvlint.rb/CI/main)](https://travis-ci.org/theodi/csvlint.rb)
[![Dependency Status](https://img.shields.io/librariesio/github/Data-Liberation-Front/csvlint.rb)](https://libraries.io/github/Data-Liberation-Front/csvlint.rb)
[![Coverage Status](http://img.shields.io/coveralls/Data-Liberation-Front/csvlint.rb.svg)](https://coveralls.io/r/Data-Liberation-Front/csvlint.rb)
[![License](http://img.shields.io/:license-mit-blue.svg)](http://theodi.mit-license.org)
[![Badges](http://img.shields.io/:badges-5/5-ff6799.svg)](https://github.com/pikesley/badger)

# CSV Lint

A ruby gem to support validating CSV files to check their syntax and contents. You can either use this gem within your own Ruby code, or as a standalone command line application

## Summary of features

* Validation that checks the structural formatting of a CSV file
* Validation of a delimiter-separated values (dsv) file accesible via URL, File, or an IO-style object (e.g. StringIO)
* Validation against [CSV dialects](http://dataprotocols.org/csv-dialect/)
* Validation against multiple schema standards; [JSON Table Schema](https://github.com/theodi/csvlint.rb/blob/master/README.md#json-table-schema-support) and [CSV on the Web](https://github.com/theodi/csvlint.rb/blob/master/README.md#csv-on-the-web-validation-support)

## Development

`ruby version 4.0`

### Tests

The codebase includes both rspec and cucumber tests, which can be run together using:

    $ rake

or separately:

    $ rake spec
    $ rake features

When the cucumber tests are first run, a script will create tests based on the latest version of the [CSV on the Web test suite](http://w3c.github.io/csvw/tests/), including creating a local cache of the test files. This requires an internet connection and some patience. Following that download, the tests will run locally; there's also a batch script:

    $ bin/run-csvw-tests

which will run the tests from the command line.

If you need to refresh the CSV on the Web tests:

    $ rm bin/run-csvw-tests
    $ rm features/csvw_validation_tests.feature
    $ rm -r features/fixtures/csvw

and then run the cucumber tests again or:

    $ ruby features/support/load_tests.rb


## Installation

Add this line to your application's Gemfile:

    gem 'csvlint'

And then execute:

    $ bundle

Or install it yourself as:

    $ gem install csvlint

## Usage

You can either use this gem within your own Ruby code, or as a standalone command line application

## On the command line

After installing the gem, you can validate a CSV on the command line like so:

	csvlint myfile.csv

You may need to add the gem exectuable directory to your path, by adding '/usr/local/lib/ruby/gems/2.6.0/bin'
or whatever your version is, to your .bash_profile PATH entry. [like so](https://stackoverflow.com/questions/2392293/ruby-gems-returns-command-not-found)

You will then see the validation result, together with any warnings or errors e.g.

```
myfile.csv is INVALID
1. blank_rows. Row: 3
1. title_row.
2. inconsistent_values. Column: 14
```

You can also optionally pass a schema file like so:

	csvlint myfile.csv --schema=schema.json

## Via pre-commit

Add to your .pre-commit-config.yaml file :

```
repos: # `pre-commit autoupdate` to get latest available tags

  - repo: https://github.com/Data-Liberation-Front/csvlint.rb
    rev: v1.2.0
    hooks:
      - id: csvlint
```

`pre-commit install` to enable it on your repository.

To force a manual run of [pre-commit](https://pre-commit.com/) use the command :

```
pre-commit run -a
```

## In your own Ruby code

Currently the gem supports retrieving a CSV accessible from a URL, File, or an IO-style object (e.g. StringIO)

	require 'csvlint'

	validator = Csvlint::Validator.new( "http://example.org/data.csv" )
	validator = Csvlint::Validator.new( File.new("/path/to/my/data.csv" ))
	validator = Csvlint::Validator.new( StringIO.new( my_data_in_a_string ) )

When validating from a URL the range of errors and warnings is wider as the library will also check HTTP headers for
best practices

	#invoke the validation
	validator.validate

	#check validation status
	validator.valid?

	#access array of errors, each is an Csvlint::ErrorMessage object
	validator.errors

	#access array of warnings
	validator.warnings

	#access array of information messages
	validator.info_messages

	#get some information about the CSV file that was validated
	validator.encoding
	validator.content_type
	validator.extension
	validator.row_count

	#retrieve HTTP headers from request
	validator.headers

## Controlling CSV Parsing

The validator supports configuration of the [CSV Dialect](http://dataprotocols.org/csv-dialect/) used in a data file. This is specified by
passing a dialect hash to the constructor:

    dialect = {
    	"header" => true,
    	"delimiter" => ","
    }
	validator = Csvlint::Validator.new( "http://example.org/data.csv", dialect )

The options should be a Hash that conforms to the [CSV Dialect](http://dataprotocols.org/csv-dialect/) JSON structure.

While these options configure the parser to correctly process the file, the validator will still raise errors or warnings for CSV
structure that it considers to be invalid, e.g. a missing header or different delimiters.

Note that the parser will also check for a `header` parameter on the `Content-Type` header returned when fetching a remote CSV file. As
specified in [RFC 4180](http://www.ietf.org/rfc/rfc4180.txt) the values for this can be `present` and `absent`, e.g:

	Content-Type: text/csv; header=present

## Error Reporting

The validator provides feedback on a validation result using instances of `Csvlint::ErrorMessage`. Errors are divided into errors, warnings and information
messages. A validation attempt is successful if there are no errors.

Messages provide context including:

* `category` has a symbol that indicates the category or error/warning: `:structure` (well-formedness issues), `:schema` (schema validation), `:context` (publishing metadata, e.g. content type)
* `type` has a symbol that indicates the type of error or warning being reported
* `row` holds the line number of the problem
* `column` holds the column number of the issue
* `content` holds the contents of the row that generated the error or warning

## Errors

The following types of error can be reported:

* `:wrong_content_type` -- content type is not `text/csv`
* `:ragged_rows` -- row has a different number of columns (than the first row in the file)
* `:blank_rows` -- completely empty row, e.g. blank line or a line where all column values are empty
* `:invalid_encoding` -- encoding error when parsing row, e.g. because of invalid characters
* `:not_found` -- HTTP 404 error when retrieving the data
* `:stray_quote` -- missing or stray quote
* `:unclosed_quote` -- unclosed quoted field
* `:whitespace` -- a quoted column has leading or trailing whitespace
* `:line_breaks` -- line breaks were inconsistent or incorrectly specified

## Warnings

The following types of warning can be reported:

* `:no_encoding` -- the `Content-Type` header returned in the HTTP request does not have a `charset` parameter
* `:encoding` -- the character set is not UTF-8
* `:no_content_type` -- file is being served without a `Content-Type` header
* `:excel` -- no `Content-Type` header and the file extension is `.xls`
* `:check_options` -- CSV file appears to contain only a single column
* `:inconsistent_values` -- inconsistent values in the same column. Reported if <90% of values seem to have same data type (either numeric or alphanumeric including punctuation)
* `:empty_column_name` -- a column in the CSV header has an empty name
* `:duplicate_column_name` -- a column in the CSV header has a duplicate name
* `:title_row` -- if there appears to be a title field in the first row of the CSV

## Information Messages

There are also information messages available:

* `:nonrfc_line_breaks` -- uses non-CRLF line breaks, so doesn't conform to RFC4180.
* `:assumed_header` -- the validator has assumed that a header is present

## Schema Validation

The library supports validating data against a schema. A schema configuration can be provided as a Hash or parsed from JSON. The structure currently
follows JSON Table Schema with some extensions and rudinmentary [CSV on the Web Metadata](http://www.w3.org/TR/tabular-metadata/).

An example JSON Table Schema schema file is:

	{
		"fields": [
			{
				"name": "id",
				"constraints": {
					"required": true,
					"type": "http://www.w3.org/TR/xmlschema-2/#integer"
				}
			},
			{
				"name": "price",
				"constraints": {
					"required": true,
					"minLength": 1
				}
			},
			{
				"name": "postcode",
				"constraints": {
					"required": true,
					"pattern": "[A-Z]{1,2}[0-9][0-9A-Z]? ?[0-9][A-Z]{2}"
				}
			}
		]
	}

An equivalent CSV on the Web Metadata file is:

	{
		"@context": "http://www.w3.org/ns/csvw",
		"url": "http://example.com/example1.csv",
		"tableSchema": {
			"columns": [
				{
					"name": "id",
					"required": true,
					"datatype": { "base": "integer" }
				},
				{
					"name": "price",
					"required": true,
					"datatype": { "base": "string", "minLength": 1 }
				},
				{
					"name": "postcode",
					"required": true
				}
			]
		}
	}

Parsing and validating with a schema (of either kind):

	schema = Csvlint::Schema.load_from_json(uri)
	validator = Csvlint::Validator.new( "http://example.org/data.csv", nil, schema )

### CSV on the Web Validation Support

This gem passes all the validation tests in the [official CSV on the Web test suite](http://w3c.github.io/csvw/tests/) (though there might still be errors or parts of the [CSV on the Web standard](http://www.w3.org/TR/tabular-metadata/) that aren't tested by that test suite).

### JSON Table Schema Support

Supported constraints:

* `required` -- there must be a value for this field in every row
* `unique` -- the values in every row should be unique
* `minLength` -- minimum number of characters in the value
* `maxLength` -- maximum number of characters in the value
* `pattern` -- values must match the provided regular expression
* `type` -- specifies an XML Schema data type. Values of the column must be a valid value for that type
* `minimum` -- specify a minimum range for values, the value will be parsed as specified by `type`
* `maximum` -- specify a maximum range for values, the value will be parsed as specified by `type`
* `datePattern` -- specify a `strftime` compatible date pattern to be used when parsing date values and min/max constraints

Supported data types (this is still a work in progress):

* String -- `http://www.w3.org/2001/XMLSchema#string` (effectively a no-op)
* Integer -- `http://www.w3.org/2001/XMLSchema#integer` or `http://www.w3.org/2001/XMLSchema#int`
* Float -- `http://www.w3.org/2001/XMLSchema#float`
* Double -- `http://www.w3.org/2001/XMLSchema#double`
* URI -- `http://www.w3.org/2001/XMLSchema#anyURI`
* Boolean -- `http://www.w3.org/2001/XMLSchema#boolean`
* Non Positive Integer -- `http://www.w3.org/2001/XMLSchema#nonPositiveInteger`
* Positive Integer -- `http://www.w3.org/2001/XMLSchema#positiveInteger`
* Non Negative Integer -- `http://www.w3.org/2001/XMLSchema#nonNegativeInteger`
* Negative Integer -- `http://www.w3.org/2001/XMLSchema#negativeInteger`
* Date -- `http://www.w3.org/2001/XMLSchema#date`
* Date Time -- `http://www.w3.org/2001/XMLSchema#dateTime`
* Year -- `http://www.w3.org/2001/XMLSchema#gYear`
* Year Month -- `http://www.w3.org/2001/XMLSchema#gYearMonth`
* Time -- `http://www.w3.org/2001/XMLSchema#time`

Use of an unknown data type will result in the column failing to validate.

Schema validation provides some additional types of error and warning messages:

* `:missing_value` (error) -- a column marked as `required` in the schema has no value
* `:min_length` (error) -- a column with a `minLength` constraint has a value that is too short
* `:max_length` (error) -- a column with a `maxLength` constraint has a value that is too long
* `:pattern` (error) --  a column with a `pattern` constraint has a value that doesn't match the regular expression
* `:malformed_header` (warning) -- the header in the CSV doesn't match the schema
* `:missing_column` (warning) -- a row in the CSV file has a missing column, that is specified in the schema. This is a warning only, as it may be legitimate
* `:extra_column` (warning) -- a row in the CSV file has extra column.
* `:unique` (error) -- a column with a `unique` constraint contains non-unique values
* `:below_minimum` (error) -- a column with a `minimum` constraint contains a value that is below the minimum
* `:above_maximum` (error) -- a column with a `maximum` constraint contains a value that is above the maximum

### Other validation options

You can also provide an optional options hash as the fourth argument to Validator#new. Supported options are:

* :limit_lines -- only check this number of lines of the CSV file. Good for a quick check on huge files.

```
options = {
  limit_lines: 100
}
validator = Csvlint::Validator.new( "http://example.org/data.csv", nil, nil, options )
```

* :lambda -- Pass a block of code to be called when each line is validated, this will give you access to the `Validator` object. For example, this will return the current line number for every line validated:

```
    options = {
      lambda: ->(validator) { puts validator.current_line }
    }
    validator = Csvlint::Validator.new( "http://example.org/data.csv", nil, nil, options )
    => 1
    2
    3
    4
    .....
```


================================================
FILE: Rakefile
================================================
require "bundler/gem_tasks"

$:.unshift File.join(File.dirname(__FILE__), "lib")

require "rubygems"
require "cucumber"
require "cucumber/rake/task"
require "coveralls/rake/task"
require "rspec/core/rake_task"

RSpec::Core::RakeTask.new(:spec)
Coveralls::RakeTask.new
Cucumber::Rake::Task.new(:features) do |t|
  t.cucumber_opts = "features --format pretty"
end

task default: [:spec, :features, "coveralls:push"]


================================================
FILE: bin/create_schema
================================================
#!/usr/bin/env ruby
$:.unshift File.join( File.dirname(__FILE__), "..", "lib")

require 'csvlint'

begin
  puts ARGV[0]
  csv = CSV.new( URI.open(ARGV[0]) )
	headers = csv.shift
	
	name = File.basename( ARGV[0] )
	schema = {
	  "title" => name,
	  "description" => "Auto generated schema for #{name}",
	  "fields" => []
	}
	
	headers.each do |name|
	  schema["fields"] << {
	    "name" => name,
	    "title" => "",
	    "description" => "",
	    "constraints" => {}
	  }
	end
	
	$stdout.puts JSON.pretty_generate(schema)
rescue => e
  puts e
  puts e.backtrace
	puts "Unable to parse CSV file"
end


================================================
FILE: bin/csvlint
================================================
#!/usr/bin/env ruby
$:.unshift File.join( File.dirname(__FILE__), "..", "lib")

require 'csvlint/cli'

if ARGV == ["help"]
  Csvlint::Cli.start(["help"])
else
  Csvlint::Cli.start(ARGV.unshift("validate"))
end


================================================
FILE: csvlint.gemspec
================================================
lib = File.expand_path("../lib", __FILE__)
$LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
require "csvlint/version"

Gem::Specification.new do |spec|
  spec.name = "csvlint"
  spec.version = Csvlint::VERSION
  spec.authors = ["pezholio"]
  spec.email = ["pezholio@gmail.com"]
  spec.description = "CSV Validator"
  spec.summary = "CSV Validator"
  spec.homepage = "https://github.com/theodi/csvlint.rb"
  spec.license = "MIT"

  spec.files = `git ls-files`.split($/)
  spec.executables = spec.files.grep(%r{^bin/}) { |f| File.basename(f) }
  spec.require_paths = ["lib"]

  spec.required_ruby_version = [">= 2.5", "< 4.1"]

  spec.add_dependency "csv"
  spec.add_dependency "rainbow"
  spec.add_dependency "open_uri_redirections"
  spec.add_dependency "activesupport"
  spec.add_dependency "addressable"
  spec.add_dependency "typhoeus"
  spec.add_dependency "escape_utils"
  spec.add_dependency "uri_template"
  spec.add_dependency "thor"
  spec.add_dependency "rack"
  spec.add_dependency "net-http-persistent"
  spec.add_dependency "mutex_m" # For Ruby 3.4+

  spec.add_development_dependency "bundler", ">= 1.3"
  spec.add_development_dependency "rake"
  spec.add_development_dependency "cucumber"
  spec.add_development_dependency "simplecov"
  spec.add_development_dependency "simplecov-rcov"
  spec.add_development_dependency "spork"
  spec.add_development_dependency "webmock"
  spec.add_development_dependency "rspec"
  spec.add_development_dependency "rspec-pride"
  spec.add_development_dependency "rspec-expectations"
  spec.add_development_dependency "coveralls_reborn"
  spec.add_development_dependency "byebug"
  spec.add_development_dependency "github_changelog_generator"
  spec.add_development_dependency "aruba"
  spec.add_development_dependency "rdf", "< 4.0"
  spec.add_development_dependency "rdf-turtle"
  spec.add_development_dependency "standardrb"
  spec.add_development_dependency "appraisal"
  spec.add_development_dependency "benchmark"
end


================================================
FILE: docker_notes_for_windows.txt
================================================
# Note that these commands are specific for a docker environment on MS Windows.

# to generate Gemfile.lock file
docker run --rm -v %CD%:/usr/src/app -w /usr/src/app ruby:2.5 bundle install

# to build docker image from source (the ending dot is significant)
docker build -t csvlint .

# to run tests
docker run -it --rm csvlint rake

# to run csvlint command line with a CSV file.
# cd to the directory with the CSV file then
docker run -it --rm -v %CD%:/tmp csvlint ./bin/csvlint --dump-errors /tmp/file-to-lint.csv

# to enter the linux container
docker run -it --rm -v %CD%:/tmp csvlint bash

# to enter the ruby REPL
docker run -it --rm -v %CD%:/tmp csvlint irb


================================================
FILE: features/check_format.feature
================================================
Feature: Check inconsistent formatting

  Scenario: Inconsistent formatting for integers
    Given I have a CSV with the following content:
    """
"1","2","3"
"Foo","5","6"
"3","2","1"
"3","2","1"
    """
    And it is stored at the url "http://example.com/example1.csv"
    And I ask if there are warnings
    Then there should be 1 warnings
    And that warning should have the type "inconsistent_values"
    And that warning should have the column "1"
    
  Scenario: Inconsistent formatting for alpha fields
    Given I have a CSV with the following content:
    """
"Foo","Bar","Baz"
"Biz","1","Baff"
"Boff","Giff","Goff"
"Boff","Giff","Goff"
    """
    And it is stored at the url "http://example.com/example1.csv"
    And I ask if there are warnings
    Then there should be 1 warnings
    And that warning should have the type "inconsistent_values"
    And that warning should have the column "2"

  Scenario: Inconsistent formatting for alphanumeric fields
    Given I have a CSV with the following content:
    """
"Foo 123","Bar","Baz"
"1","Bar","Baff"
"Boff 432423","Giff","Goff"
"Boff444","Giff","Goff"
    """
    And it is stored at the url "http://example.com/example1.csv"
    And I ask if there are warnings
    Then there should be 1 warnings
    And that warning should have the type "inconsistent_values"
    And that warning should have the column "1"


    


================================================
FILE: features/cli.feature
================================================
Feature: CSVlint CLI

  Scenario: Valid CSV from url
    Given I have a CSV with the following content:
    """
"Foo","Bar","Baz"
"1","2","3"
"3","2","1"
    """
    And it is stored at the url "http://example.com/example1.csv"
    When I run `csvlint http://example.com/example1.csv`
    Then the output should contain "http://example.com/example1.csv is VALID"

  Scenario: Valid CSV from file
    When I run `csvlint ../../features/fixtures/valid.csv`
    Then the output should contain "valid.csv is VALID"

  # This is a hacky way of saying to run `cat features/fixtures/valid.csv | csvlint`
  Scenario: Valid CSV from pipe
    Given I have stubbed stdin to contain "features/fixtures/valid.csv"
    When I run `csvlint`
    Then the output should contain "CSV is VALID"

  Scenario: URL that 404s
    Given there is no file at the url "http://example.com/example1.csv"
    And there is no file at the url "http://example.com/.well-known/csvm"
    And there is no file at the url "http://example.com/example1.csv-metadata.json"
    And there is no file at the url "http://example.com/csv-metadata.json"
    When I run `csvlint http://example.com/example1.csv`
    Then the output should contain "http://example.com/example1.csv is INVALID"
    And the output should contain "not_found"

  Scenario: File doesn't exist
    When I run `csvlint ../../features/fixtures/non-existent-file.csv`
    Then the output should contain "non-existent-file.csv not found"

  Scenario: No file or URL specified
    Given I have stubbed stdin to contain nothing
    When I run `csvlint`
    Then the output should contain "No CSV data to validate"

  Scenario: No file or URL specified, but schema specified
    Given I have stubbed stdin to contain nothing
    And I have a schema with the following content:
    """
{
  "fields": [
          { "name": "Name", "constraints": { "required": true } },
          { "name": "Id", "constraints": { "required": true, "minLength": 1 } },
          { "name": "Email", "constraints": { "required": true } }
    ]
}
    """
    And the schema is stored at the url "http://example.com/schema.json"
    When I run `csvlint --schema http://example.com/schema.json`
    Then the output should contain "No CSV data to validate"

  Scenario: Invalid CSV from url
    Given I have a CSV with the following content:
    """
    "Foo",	"Bar"	,	"Baz"
    """
    And it is stored at the url "http://example.com/example1.csv"
    When I run `csvlint http://example.com/example1.csv`
    Then the output should contain "http://example.com/example1.csv is INVALID"
    And the output should contain "whitespace"

  Scenario: Invalid CSV from url with JSON
    Given I have a CSV with the following content:
    """
    "Foo",	"Bar"	,	"Baz"
    """
    And it is stored at the url "http://example.com/example1.csv"
    When I run `csvlint http://example.com/example1.csv --json`
    Then the output should contain JSON
    And the JSON should have a state of "invalid"
    And the JSON should have 1 error
    And that error should have the "type" "whitespace"
    And that error should have the "category" "structure"
    And that error should have the "row" "1"

  Scenario: Specify schema
    Given I have a CSV with the following content:
    """
"Bob","1234","bob@example.org"
"Alice","5","alice@example.com"
    """
    And it is stored at the url "http://example.com/example1.csv"
    And I have a schema with the following content:
    """
{
	"fields": [
          { "name": "Name", "constraints": { "required": true } },
          { "name": "Id", "constraints": { "required": true, "minLength": 1 } },
          { "name": "Email", "constraints": { "required": true } }
    ]
}
    """
    And the schema is stored at the url "http://example.com/schema.json"
    When I run `csvlint http://example.com/example1.csv --schema http://example.com/schema.json`
    Then the output should contain "http://example.com/example1.csv is VALID"

  Scenario: Schema errors
    Given I have a CSV with the following content:
    """
"Bob","1234","bob@example.org"
"Alice","5","alice@example.com"
    """
    And it is stored at the url "http://example.com/example1.csv"
    And I have a schema with the following content:
    """
{
  "fields": [
          { "name": "Name", "constraints": { "required": true } },
          { "name": "Id", "constraints": { "required": true, "minLength": 3 } },
          { "name": "Email", "constraints": { "required": true } }
    ]
}
    """
    And the schema is stored at the url "http://example.com/schema.json"
    When I run `csvlint http://example.com/example1.csv --schema http://example.com/schema.json`
    Then the output should contain "http://example.com/example1.csv is INVALID"
    And the output should contain "1. Id: min_length. Row: 2,2. 5"
    And the output should contain "1. malformed_header. Row: 1. Bob,1234,bob@example.org"

  Scenario: Schema errors with JSON
    Given I have a CSV with the following content:
    """
"Bob","1234","bob@example.org"
"Alice","5","alice@example.com"
    """
    And it is stored at the url "http://example.com/example1.csv"
    And I have a schema with the following content:
    """
{
  "fields": [
          { "name": "Name", "constraints": { "required": true } },
          { "name": "Id", "constraints": { "required": true, "minLength": 3 } },
          { "name": "Email", "constraints": { "required": true } }
    ]
}
    """
    And the schema is stored at the url "http://example.com/schema.json"
    When I run `csvlint http://example.com/example1.csv --schema http://example.com/schema.json --json`
    Then the output should contain JSON
    And the JSON should have a state of "invalid"
    And the JSON should have 1 error
    And error 1 should have the "type" "min_length"
    And error 1 should have the "header" "Id"
    And error 1 should have the constraint "min_length" "3"

  Scenario: Invalid schema
    Given I have a CSV with the following content:
    """
"Bob","1234","bob@example.org"
"Alice","5","alice@example.com"
    """
    And it is stored at the url "http://example.com/example1.csv"
    And I have a schema with the following content:
    """
NO JSON HERE SON
    """
    And the schema is stored at the url "http://example.com/schema.json"
    Then nothing should be outputted to STDERR
    When I run `csvlint http://example.com/example1.csv --schema http://example.com/schema.json`
    And the output should contain "invalid metadata: malformed JSON"

  Scenario: Schema that 404s
    Given I have a CSV with the following content:
    """
"Bob","1234","bob@example.org"
"Alice","5","alice@example.com"
    """
    And it is stored at the url "http://example.com/example1.csv"
    And there is no file at the url "http://example.com/schema404.json"
    When I run `csvlint http://example.com/example1.csv --schema http://example.com/schema404.json`
    Then the output should contain "http://example.com/schema404.json not found"

  Scenario: Schema that doesn't exist
    Given I have a CSV with the following content:
    """
"Bob","1234","bob@example.org"
"Alice","5","alice@example.com"
    """
    And it is stored at the url "http://example.com/example1.csv"
    When I run `csvlint http://example.com/example1.csv --schema /fake/file/path.json`
    Then the output should contain "/fake/file/path.json not found"

  Scenario: Valid CSVw schema
    Given I have a CSV with the following content:
    """
"Bob","1234","bob@example.org"
"Alice","5","alice@example.com"
    """
    And it is stored at the url "http://example.com/example1.csv"
    And I have metadata with the following content:
    """
{
  "@context": "http://www.w3.org/ns/csvw",
  "url": "http://example.com/example1.csv",
  "dialect": { "header": false },
  "tableSchema": {
    "columns": [
            { "name": "Name", "required": true },
            { "name": "Id", "required": true, "datatype": { "base": "string", "minLength": 1 } },
            { "name": "Email", "required": true }
      ]
  }
}
    """
    And the schema is stored at the url "http://example.com/schema.json"
    When I run `csvlint http://example.com/example1.csv --schema http://example.com/schema.json`
    Then the output should contain "http://example.com/example1.csv is VALID"

  Scenario: CSVw schema with invalid CSV
    Given I have a CSV with the following content:
    """
"Bob","1234","bob@example.org"
"Alice","5","alice@example.com"
    """
    And it is stored at the url "http://example.com/example1.csv"
    And I have metadata with the following content:
    """
{
  "@context": "http://www.w3.org/ns/csvw",
  "url": "http://example.com/example1.csv",
  "dialect": { "header": false },
  "tableSchema": {
    "columns": [
            { "name": "Name", "required": true },
            { "name": "Id", "required": true, "datatype": { "base": "string", "minLength": 3 } },
            { "name": "Email", "required": true }
      ]
  }
}
    """
    And the schema is stored at the url "http://example.com/schema.json"
    When I run `csvlint http://example.com/example1.csv --schema http://example.com/schema.json`
    Then the output should contain "http://example.com/example1.csv is INVALID"
    And the output should contain "1. min_length. Row: 2,2. 5"

  Scenario: CSVw table Schema
    Given I have stubbed stdin to contain nothing
    And I have a metadata file called "csvw/countries.json"
    And the metadata is stored at the url "http://w3c.github.io/csvw/tests/countries.json"
    And I have a file called "csvw/countries.csv" at the url "http://w3c.github.io/csvw/tests/countries.csv"
    And I have a file called "csvw/country_slice.csv" at the url "http://w3c.github.io/csvw/tests/country_slice.csv"
    When I run `csvlint --schema http://w3c.github.io/csvw/tests/countries.json`
    Then the output should contain "http://w3c.github.io/csvw/tests/countries.csv is VALID"
    And the output should contain "http://w3c.github.io/csvw/tests/country_slice.csv is VALID"


================================================
FILE: features/csv_options.feature
================================================
Feature: CSV options

  Scenario: Sucessfully parse a valid CSV
    Given I have a CSV with the following content:
    """
'Foo';'Bar';'Baz'
'1';'2';'3'
'3';'2';'1'
    """
    And I set the delimiter to ";"
    And I set quotechar to "'" 
    And it is stored at the url "http://example.com/example1.csv"
    When I ask if the CSV is valid
    Then I should get the value of true

  Scenario: Warn if options seem to return invalid data
    Given I have a CSV with the following content:
    """
'Foo';'Bar';'Baz'
'1';'2';'3'
'3';'2';'1'
    """
    And I set the delimiter to ","
    And I set quotechar to """ 
    And it is stored at the url "http://example.com/example1.csv"
    And I ask if there are warnings
    Then there should be 1 warnings
    And that warning should have the type "check_options"

  Scenario: Use esoteric line endings
    Given I have a CSV file called "windows-line-endings.csv"
    And it is stored at the url "http://example.com/example1.csv"
    When I ask if the CSV is valid
    Then I should get the value of true
    

================================================
FILE: features/csvupload.feature
================================================
Feature: Collect all the tests that should trigger dialect check related errors

  Scenario: Title rows, I wish to trigger a :title_row type message
    Given I have a CSV file called "title-row.csv"
    And it is stored at the url "http://example.com/example1.csv"
    And I ask if there are warnings
    Then there should be 1 warnings
    And that warning should have the type "title_row"

#    :nonrfc_line_breaks

  Scenario: LF line endings in file give an info message of type :nonrfc_line_breaks
    Given I have a CSV file called "lf-line-endings.csv"
    And it is stored at the url "http://example.com/example1.csv"
    And I set header to "true"
    And I ask if there are info messages
    Then there should be 1 info message
    And one of the messages should have the type "nonrfc_line_breaks"

  Scenario: CRLF line endings in file produces no info messages of type :nonrfc_line_breaks
    Given I have a CSV file called "crlf-line-endings.csv"
    And it is stored at the url "http://example.com/example1.csv"
    And I set header to "true"
    And I ask if there are info messages
    Then there should be 0 info messages

#  :line_breaks

  Scenario: Incorrect line endings specified in settings
    Given I have a CSV file called "lf-line-endings.csv"
    And I set the line endings to carriage return
    And it is stored at the url "http://example.com/example1.csv"
    And I ask if there are errors
    Then there should be 1 error
    And that error should have the type "line_breaks"

  Scenario: inconsistent line endings in file cause an error
    Given I have a CSV file called "inconsistent-line-endings.csv"
    And it is stored at the url "http://example.com/example1.csv"
    And I ask if there are errors
    Then there should be 1 error
    And that error should have the type "line_breaks"


  Scenario: inconsistent line endings with unquoted fields in file cause an error
    Given I have a CSV file called "inconsistent-line-endings-unquoted.csv"
    And it is stored at the url "http://example.com/example1.csv"
    And I ask if there are errors
    Then there should be 1 error
    And that error should have the type "line_breaks"

#:unclosed_quote

  Scenario: CSV with incorrect quoting
    Given I have a CSV with the following content:
    """
"col1","col2","col3"
"Foo","Bar","Baz
    """
    And it is stored at the url "http://example.com/example1.csv"
    When I ask if there are errors
    Then there should be 1 error
    And that error should have the type "unclosed_quote"
    And that error should have the row "2"
    And that error should have the content ""Foo","Bar","Baz"

#  :invalid_encoding

  Scenario: Report invalid Encoding
    Given I have a CSV file called "invalid-byte-sequence.csv"
    And I set an encoding header of "UTF-8"
    And it is stored at the url "http://example.com/example1.csv"
    When I ask if there are errors
    Then there should be 1 error
    And that error should have the type "invalid_encoding"

  Scenario: Report invalid file
#should this throw an excel error?
    Given I have a CSV file called "spreadsheet.xls"
    And it is stored at the url "http://example.com/example1.csv"
    When I ask if there are errors
    Then there should be 1 error
    And that error should have the type "invalid_encoding"

#  :blank_rows

  Scenario: Successfully report a CSV with blank rows
    Given I have a CSV with the following content:
    """
"col1","col2","col3"
"Foo","Bar","Baz"
"","",
"Baz","Bar","Foo"
    """
    And it is stored at the url "http://example.com/example1.csv"
    When I ask if there are errors
    Then there should be 1 error
    And that error should have the type "blank_rows"
    And that error should have the row "3"
    And that error should have the content ""","","

  Scenario: Successfully report a CSV with multiple trailing empty rows
    Given I have a CSV with the following content:
    """
"col1","col2","col3"
"Foo","Bar","Baz"
"Foo","Bar","Baz"


    """
    And it is stored at the url "http://example.com/example1.csv"
    When I ask if there are errors
    Then there should be 1 error
    And that error should have the type "blank_rows"
    And that error should have the row "4"

  Scenario: Successfully report a CSV with an empty row
    Given I have a CSV with the following content:
    """
"col1","col2","col3"
"Foo","Bar","Baz"

"Foo","Bar","Baz"
    """
    And it is stored at the url "http://example.com/example1.csv"
    When I ask if there are errors
    Then there should be 1 error
    And that error should have the type "blank_rows"
    And that error should have the row "3"

#:check_options

  Scenario: Warn if options seem to return invalid data
    Given I have a CSV with the following content:
    """
'Foo';'Bar';'Baz'
'1';'2';'3'
'3';'2';'1'
    """
    And I set the delimiter to ","
    And I set quotechar to """
    And it is stored at the url "http://example.com/example1.csv"
    And I ask if there are warnings
    Then there should be 1 warnings
    And that warning should have the type "check_options"


================================================
FILE: features/csvw_schema_validation.feature
================================================
Feature: CSVW Schema Validation

  Scenario: Valid CSV
    Given I have a CSV with the following content:
    """
"Bob","1234","bob@example.org"
"Alice","5","alice@example.com"
    """
    And it is stored at the url "http://example.com/example1.csv"
    And I have metadata with the following content:
    """
{
  "@context": "http://www.w3.org/ns/csvw",
  "url": "http://example.com/example1.csv",
  "dialect": { "header": false },
  "tableSchema": {
    "columns": [
            { "name": "Name", "required": true },
            { "name": "Id", "required": true, "datatype": { "base": "string", "minLength": 1 } },
            { "name": "Email", "required": true }
      ]
  }
}
    """
    When I ask if there are errors
    Then there should be 0 error

  Scenario: Schema invalid CSV
    Given I have a CSV with the following content:
    """
"Bob","1234","bob@example.org"
"Alice","5","alice@example.com"
    """
    And it is stored at the url "http://example.com/example1.csv"
    And I have metadata with the following content:
    """
{
  "@context": "http://www.w3.org/ns/csvw",
  "url": "http://example.com/example1.csv",
  "dialect": { "header": false },
  "tableSchema": {
    "columns": [
            { "name": "Name", "required": true },
            { "name": "Id", "required": true, "datatype": { "base": "string", "minLength": 3 } },
            { "name": "Email", "required": true }
      ]
  }
}
    """
    When I ask if there are errors
    Then there should be 1 error

  Scenario: CSV with incorrect header
    Given I have a CSV with the following content:
    """
"name","id","contact"
"Bob","1234","bob@example.org"
"Alice","5","alice@example.com"
    """
    And it is stored at the url "http://example.com/example1.csv"
    And I have metadata with the following content:
    """
{
  "@context": "http://www.w3.org/ns/csvw",
  "url": "http://example.com/example1.csv",
  "tableSchema": {
    "columns": [
            { "titles": "name", "required": true },
            { "titles": "id", "required": true, "datatype": { "base": "string", "minLength": 1 } },
            { "titles": "email", "required": true }
      ]
  }
}
    """
    When I ask if there are errors
    Then there should be 1 error

  Scenario: Schema with valid regex
    Given I have a CSV with the following content:
    """
  "firstname","id","email"
  "Bob","1234","bob@example.org"
  "Alice","5","alice@example.com"
    """
    And it is stored at the url "http://example.com/example1.csv"
    And I have metadata with the following content:
    """
{
  "@context": "http://www.w3.org/ns/csvw",
  "url": "http://example.com/example1.csv",
  "tableSchema": {
    "columns": [
            { "titles": "firstname", "required": true, "datatype": { "base": "string", "format": "^[A-Za-z0-9_]*$" } },
            { "titles": "id", "required": true, "datatype": { "base": "string", "minLength": 1 } },
            { "titles": "email", "required": true }
      ]
  }
}
    """
    When I ask if there are warnings
    Then there should be 0 warnings

  Scenario: Schema with invalid regex
    Given I have a CSV with the following content:
    """
  "firstname","id","email"
  "Bob","1234","bob@example.org"
  "Alice","5","alice@example.com"
    """
    And it is stored at the url "http://example.com/example1.csv"
    And I have metadata with the following content:
    """
{
  "@context": "http://www.w3.org/ns/csvw",
  "url": "http://example.com/example1.csv",
  "tableSchema": {
    "columns": [
            { "titles": "firstname", "required": true, "datatype": { "base": "string", "format": "((" } },
            { "titles": "id", "required": true, "datatype": { "base": "string", "minLength": 1 } },
            { "titles": "email", "required": true }
      ]
  }
}
    """
    When I ask if there are warnings
    Then there should be 1 warnings
    And that warning should have the type "invalid_regex"


================================================
FILE: features/fixtures/cr-line-endings.csv
================================================
"Foo","Bar","Baz"
"Biff","Baff","Boff"
"Qux","Teaspoon","Doge"

================================================
FILE: features/fixtures/crlf-line-endings.csv
================================================
"Foo","Bsr","Baz"
"Biff","Baff","Boff"
"Qux","Teaspoon","Doge"

================================================
FILE: features/fixtures/inconsistent-line-endings-unquoted.csv
================================================
Foo,Bsr,Baz
Biff,Baff,Boff
Qux,Teaspoon,Doge

================================================
FILE: features/fixtures/inconsistent-line-endings.csv
================================================
"Foo","Bsr","Baz"
"Biff","Baff","Boff"
"Qux","Teaspoon","Doge"

================================================
FILE: features/fixtures/invalid-byte-sequence.csv
================================================
"Data","Dependencia Origem","Histrico","Data do Balancete","Nmero do documento","Valor",
"10/31/2012","","Saldo Anterior","","0","100.00",
"11/01/2012","0000-9","Transferncia on line - 01/11 4885     256620-6 XXXXXXXXXXXXX","","224885000256620","100.00",
"11/01/2012","","Depsito COMPE - 033 0502    27588602104 XXXXXXXXXXXXXX","","101150","100.00",
"11/01/2012","","Proventos","","496774","1000.00",
"11/01/2012","","Benefcio","","496775","100.00",
"11/01/2012","0000-0","Compra com Carto - 01/11 09:45 XXXXXXXXXXX","","135102","-1.00",
"11/01/2012","0000-0","Compra com Carto - 01/11 09:48    XXXXXXXXXXX","","235338","-10.00",
"11/01/2012","0000-0","Compra com Carto - 01/11 12:35 XXXXXXXX","","345329","-10.00",
"11/01/2012","0000-0","Compra com Carto - 01/11 23:57    XXXXXXXXXXXXXXXX","","686249","-10.00",
"11/01/2012","0000-0","Saque com carto - 01/11 13:17 XXXXXXXXXXXXXXXX","","11317296267021","-10.00",
"11/01/2012","","Pagto conta telefone - VIVO DF","","110101","-100.00",
"11/01/2012","","Cobrana de I.O.F.","","391100701","-1.00",
"11/05/2012","0000-0","Compra com Carto - 02/11 16:57    XXXXXXXXXXXX","","161057","-10.00",
"11/05/2012","0000-0","Compra com Carto - 03/11 18:57    XXXXXXXXXXXXXXX","","168279","-10.00",
"11/05/2012","0000-0","Compra com Carto - 05/11 12:32    XXXXXXXXXXXXXXXXX","","245166","-10.00",
"11/05/2012","0000-0","Compra com Carto - 02/11 17:18    XXXXXXXXXXXXX","","262318","-1.00",
"11/05/2012","0000-0","Compra com Carto - 02/11 22:46    XXXXXXXXXXX","","382002","-100.00",
"11/05/2012","0000-0","Compra com Carto - 02/11 23:19    XXXXXXXXXXX","","683985","-1.00",
"11/05/2012","0000-0","Compra com Carto - 03/11 01:19    XXXXXXXXXXXXXXXX","","704772","-10.00",
"11/05/2012","0000-0","Compra com Carto - 03/11 11:08 XXXXXXXX","","840112","-1.00",
"11/05/2012","0000-0","Saque com carto - 05/11 19:24 XXXXXXXXXXXXXXXXX","","51924256267021","-10.00",
"11/05/2012","0000-0","Transferncia on line - 05/11 4885     256620-6 XXXXXXXXXXXXX","","224885000256620","-100.00",
"11/05/2012","","Pagamento de Ttulo - XXXXXXXXXXXXXXXXXXX","","110501","-100.00",


================================================
FILE: features/fixtures/invalid_many_rows.csv
================================================
"Foo","Bar","Baz"
"1","2","3"
"3","2","1"
"1","2","3" "
"3","two","1"
"1","2","3"
"3","2","1"

"3","2","1"
"3","2","1"
"","",""
"3","2","1"

================================================
FILE: features/fixtures/lf-line-endings.csv
================================================
"Foo","Bsr","Baz"
"Biff","Baff","Boff"
"Qux","Teaspoon","Doge"

================================================
FILE: features/fixtures/title-row.csv
================================================
"This is a title row",,
"Foo","Bsr","Baz"
"Biff","Baff","Boff"
"Qux","Teaspoon","Doge"

================================================
FILE: features/fixtures/valid.csv
================================================
"Foo","Bar","Baz"
"1","2","3"
"3","2","1"


================================================
FILE: features/fixtures/valid_many_rows.csv
================================================
"Foo","Bar","Baz"
"1","2","3"
"3","2","1"
"1","2","3"
"3","2","1"
"1","2","3"
"3","2","1"


================================================
FILE: features/fixtures/w3.org/.well-known/csvm
================================================
{+url}-metadata.json
csv-metadata.json
{+url}.json
csvm.json

================================================
FILE: features/fixtures/white space in filename.csv
================================================
"Foo","Bar","Baz"
"1","2","3"
"3","2","1"


================================================
FILE: features/fixtures/windows-line-endings.csv
================================================
a,b,c
d,e,f


================================================
FILE: features/information.feature
================================================
Feature: Return information

  Background:
    Given I have a CSV with the following content:
    """
"abc","2","3"
    """
    And it is encoded as "utf-8"
    And the content type is "text/csv"
    And it is stored at the url "http://example.com/example1.csv?query=true"

  Scenario: Return encoding
    Then the "encoding" should be "UTF-8"

  Scenario: Return content type
    Then the "content_type" should be "text/csv; charset=utf-8"

  Scenario: Return extension
    Then the "extension" should be ".csv"

  Scenario: Return meta
    Then the metadata content type should be "text/csv; charset=utf-8"


================================================
FILE: features/parse_csv.feature
================================================
Feature: Parse CSV

  Scenario: Successfully parse a valid CSV
    Given I have a CSV with the following content:
    """
"Foo","Bar","Baz"
"1","2","3"
"3","2","1"
    """
    And it is stored at the url "http://example.com/example1.csv"
    When I ask if the CSV is valid
    Then I should get the value of true

   Scenario: Successfully parse a CSV with newlines in quoted fields
    Given I have a CSV with the following content:
    """
"a","b","c"
"d","e","this is
valid"
"a","b","c"
"""
    And it is stored at the url "http://example.com/example1.csv"
    When I ask if the CSV is valid
    Then I should get the value of true

   Scenario: Successfully parse a CSV with multiple newlines in quoted fields
    Given I have a CSV with the following content:
    """
"a","b","c"
"d","this is
valid","as is this
too"
"""
    And it is stored at the url "http://example.com/example1.csv"
    When I ask if the CSV is valid
    Then I should get the value of true

  Scenario: Successfully report an invalid CSV
    Given I have a CSV with the following content:
    """
	"Foo",	"Bar"	,	"Baz
    """
    And it is stored at the url "http://example.com/example1.csv"
    When I ask if the CSV is valid
    Then I should get the value of false

   Scenario: Successfully report a CSV with incorrect quoting
    Given I have a CSV with the following content:
    """
"Foo","Bar","Baz
    """
    And it is stored at the url "http://example.com/example1.csv"
    When I ask if the CSV is valid
    Then I should get the value of false

   Scenario: Successfully report a CSV with incorrect whitespace
    Given I have a CSV with the following content:
    """
"Foo","Bar",   "Baz"
    """
    And it is stored at the url "http://example.com/example1.csv"
    When I ask if the CSV is valid
    Then I should get the value of false

   Scenario: Successfully report a CSV with ragged rows
    Given I have a CSV with the following content:
    """
"col1","col2","col2"
"1","2","3"
"4","5"
    """
    And it is stored at the url "http://example.com/example1.csv"
    When I ask if the CSV is valid
    Then I should get the value of false

    Scenario: Don't class blank values as inconsistencies
     Given I have a CSV with the following content:
     """
"col1","col2","col3"
"1","2","3"
"4","5","6"
"","7","8"
"9","10","11"
"","12","13"
"","14","15"
"16","17","18"
     """
     And it is stored at the url "http://example.com/example1.csv"
     When I ask if there are warnings
     Then there should be 0 warnings


================================================
FILE: features/schema_validation.feature
================================================
Feature: Schema Validation

  Scenario: Valid CSV
    Given I have a CSV with the following content:
    """
"Bob","1234","bob@example.org"
"Alice","5","alice@example.com"
    """
    And it is stored at the url "http://example.com/example1.csv"
    And I have a schema with the following content:
    """
{
	"fields": [
          { "name": "Name", "constraints": { "required": true } },
          { "name": "Id", "constraints": { "required": true, "minLength": 1 } },
          { "name": "Email", "constraints": { "required": true } }
    ]
}
    """
    When I ask if there are errors
    Then there should be 0 error

  Scenario: Schema invalid CSV
    Given I have a CSV with the following content:
    """
"Bob","1234","bob@example.org"
"Alice","5","alice@example.com"
    """
    And it is stored at the url "http://example.com/example1.csv"
    And I have a schema with the following content:
    """
{
	"fields": [
          { "name": "Name", "constraints": { "required": true } },
          { "name": "Id", "constraints": { "required": true, "minLength": 3 } },
          { "name": "Email", "constraints": { "required": true } }
    ]
}
    """
    When I ask if there are errors
    Then there should be 1 error

  Scenario: CSV with incorrect header
    Given I have a CSV with the following content:
    """
"name","id","contact"
"Bob","1234","bob@example.org"
"Alice","5","alice@example.com"
    """
    And it is stored at the url "http://example.com/example1.csv"
    And I have a schema with the following content:
    """
{
	"fields": [
          { "name": "name", "constraints": { "required": true } },
          { "name": "id", "constraints": { "required": true, "minLength": 3 } },
          { "name": "email", "constraints": { "required": true } }
    ]
}
    """
    When I ask if there are warnings
    Then there should be 1 warnings

  Scenario: Schema with valid regex
    Given I have a CSV with the following content:
    """
  "firstname","id","email"
  "Bob","1234","bob@example.org"
  "Alice","5","alice@example.com"
    """
    And it is stored at the url "http://example.com/example1.csv"
    And I have a schema with the following content:
    """
{
  "fields": [
          { "name": "Name", "constraints": { "required": true, "pattern": "^[A-Za-z0-9_]*$" } },
          { "name": "Id", "constraints": { "required": true, "minLength": 1 } },
          { "name": "Email", "constraints": { "required": true } }
    ]
}
    """
    When I ask if there are errors
    Then there should be 0 error

  Scenario: Schema with invalid regex
    Given I have a CSV with the following content:
    """
  "firstname","id","email"
  "Bob","1234","bob@example.org"
  "Alice","5","alice@example.com"
    """
    And it is stored at the url "http://example.com/example1.csv"
    And I have a schema with the following content:
    """
{
  "fields": [
          { "name": "Name", "constraints": { "required": true, "pattern": "((" } },
          { "name": "Id", "constraints": { "required": true, "minLength": 1 } },
          { "name": "Email", "constraints": { "required": true } }
    ]
}
    """
    When I ask if there are errors
    Then there should be 1 error
    And that error should have the type "invalid_regex"


================================================
FILE: features/sources.feature
================================================
Feature: Parse CSV from Different Sources

  Scenario: Successfully parse a valid CSV from a StringIO
      Given I have a CSV with the following content:
    """
"Foo","Bar","Baz"
"1","2","3"
"3","2","1"
    """
    And it is parsed as a StringIO
    When I ask if the CSV is valid
    Then I should get the value of true

  Scenario: Successfully parse a valid CSV from a File
    Given I parse a file called "valid.csv"
    When I ask if the CSV is valid
    Then I should get the value of true


================================================
FILE: features/step_definitions/cli_steps.rb
================================================
Given(/^I have stubbed $stdin to contain "(.*?)"$/) do |file|
  expect($stdin).to receive(:read).and_return(File.read(file))
end

Given(/^I have stubbed $stdin to contain nothing$/) do
  expect($stdin).to receive(:read).and_return(nil)
end

Then(/^nothing should be outputted to STDERR$/) do
  expect($stderr).to_not receive(:puts)
end

Then(/^the output should contain JSON$/) do
  @json = JSON.parse(all_stdout)
  expect(@json["validation"]).to be_present
end

Then(/^the JSON should have a state of "(.*?)"$/) do |state|
  expect(@json["validation"]["state"]).to eq(state)
end

Then(/^the JSON should have (\d+) errors?$/) do |count|
  @index = count.to_i - 1
  expect(@json["validation"]["errors"].count).to eq(count.to_i)
end

Then(/^that error should have the "(.*?)" "(.*?)"$/) do |k, v|
  expect(@json["validation"]["errors"][@index][k].to_s).to eq(v)
end

Then(/^error (\d+) should have the "(.*?)" "(.*?)"$/) do |index, k, v|
  expect(@json["validation"]["errors"][index.to_i - 1][k].to_s).to eq(v)
end

Then(/^error (\d+) should have the constraint "(.*?)" "(.*?)"$/) do |index, k, v|
  expect(@json["validation"]["errors"][index.to_i - 1]["constraints"][k].to_s).to eq(v)
end


================================================
FILE: features/step_definitions/csv_options_steps.rb
================================================
Given(/^I set the delimiter to "(.*?)"$/) do |delimiter|
  @csv_options ||= default_csv_options
  @csv_options["delimiter"] = delimiter
end

Given(/^I set quotechar to "(.*?)"$/) do |doublequote|
  @csv_options ||= default_csv_options
  @csv_options["quoteChar"] = doublequote
end

Given(/^I set the line endings to linefeed$/) do
  @csv_options ||= default_csv_options
  @csv_options["lineTerminator"] = "\n"
end

Given(/^I set the line endings to carriage return$/) do
  @csv_options ||= default_csv_options
  @csv_options["lineTerminator"] = "\r"
end

Given(/^I set header to "(.*?)"$/) do |boolean|
  @csv_options ||= default_csv_options
  @csv_options["header"] = boolean == "true"
end


================================================
FILE: features/step_definitions/information_steps.rb
================================================
Given(/^the content type is "(.*?)"$/) do |arg1|
  @content_type = "text/csv"
end

Then(/^the "(.*?)" should be "(.*?)"$/) do |type, encoding|
  validator = Csvlint::Validator.new(@url, default_csv_options)
  expect(validator.send(type.to_sym)).to eq(encoding)
end

Then(/^the metadata content type should be "(.*?)"$/) do |content_type|
  validator = Csvlint::Validator.new(@url, default_csv_options)
  expect(validator.headers["content-type"]).to eq(content_type)
end


================================================
FILE: features/step_definitions/parse_csv_steps.rb
================================================
Given(/^I have a CSV with the following content:$/) do |string|
  @csv = string.to_s
end

Given(/^it has a Link header holding "(.*?)"$/) do |link|
  @link = "#{link}; type=\"application/csvm+json\""
end

Given(/^it is stored at the url "(.*?)"$/) do |url|
  @url = url
  content_type = @content_type || "text/csv"
  charset = @encoding || "UTF-8"
  headers = {"Content-Type" => "#{content_type}; charset=#{charset}"}
  headers["Link"] = @link if @link
  stub_request(:get, url).to_return(status: 200, body: @csv, headers: headers)
  stub_request(:get, URI.join(url, "/.well-known/csvm")).to_return(status: 404)
  stub_request(:get, url + "-metadata.json").to_return(status: 404)
  stub_request(:get, URI.join(url, "csv-metadata.json")).to_return(status: 404)
end

Given(/^it is stored at the url "(.*?)" with no character set$/) do |url|
  @url = url
  content_type = @content_type || "text/csv"
  stub_request(:get, url).to_return(status: 200, body: @csv, headers: {"Content-Type" => content_type.to_s})
  stub_request(:get, URI.join(url, "/.well-known/csvm")).to_return(status: 404)
  stub_request(:get, url + "-metadata.json").to_return(status: 404)
  stub_request(:get, URI.join(url, "csv-metadata.json")).to_return(status: 404)
end

When(/^I ask if the CSV is valid$/) do
  @csv_options ||= default_csv_options
  @validator = Csvlint::Validator.new(@url, @csv_options)
  @valid = @validator.valid?
end

Then(/^I should get the value of true$/) do
  expect(@valid).to be(true)
end

Then(/^I should get the value of false$/) do
  expect(@valid).to be(false)
end


================================================
FILE: features/step_definitions/schema_validation_steps.rb
================================================
Given(/^I have a schema with the following content:$/) do |json|
  @schema_type = :json_table
  @schema_json = json
end

Given(/^I have metadata with the following content:$/) do |json|
  @schema_type = :csvw_metadata
  @schema_json = json
end

Given(/^I have a metadata file called "([^"]*)"$/) do |filename|
  @schema_type = :csvw_metadata
  @schema_json = File.read(File.join(File.dirname(__FILE__), "..", "fixtures", filename))
end

Given(/^the (schema|metadata) is stored at the url "(.*?)"$/) do |schema_type, schema_url|
  @schema_url = schema_url
  stub_request(:get, @schema_url).to_return(status: 200, body: @schema_json.to_str)
end

Given(/^there is a file at "(.*?)" with the content:$/) do |url, content|
  stub_request(:get, url).to_return(status: 200, body: content.to_str)
end

Given(/^I have a file called "(.*?)" at the url "(.*?)"$/) do |filename, url|
  content = File.read(File.join(File.dirname(__FILE__), "..", "fixtures", filename))
  content_type = /.csv$/.match?(filename) ? "text/csv" : "application/csvm+json"
  stub_request(:get, url).to_return(status: 200, body: content, headers: {"Content-Type" => "#{content_type}; charset=UTF-8"})
end

Given(/^there is no file at the url "(.*?)"$/) do |url|
  stub_request(:get, url).to_return(status: 404)
end


================================================
FILE: features/step_definitions/sources_steps.rb
================================================
Given(/^it is parsed as a StringIO$/) do
  @url = StringIO.new(@csv)
end

Given(/^I parse a file called "(.*?)"$/) do |filename|
  @url = File.new(File.join(File.dirname(__FILE__), "..", "fixtures", filename))
end


================================================
FILE: features/step_definitions/validation_errors_steps.rb
================================================
When(/^I ask if there are errors$/) do
  @csv_options ||= default_csv_options

  if @schema_json
    @schema = if @schema_type == :json_table
      Csvlint::Schema.from_json_table(@schema_url || "http://example.org ", JSON.parse(@schema_json))
    else
      Csvlint::Schema.from_csvw_metadata(@schema_url || "http://example.org ", JSON.parse(@schema_json))
    end
  end

  @validator = Csvlint::Validator.new(@url, @csv_options, @schema)
  @errors = @validator.errors
end

When(/^I carry out CSVW validation$/) do
  @csv_options ||= default_csv_options

  begin
    if @schema_json
      json = JSON.parse(@schema_json)
      @schema = if @schema_type == :json_table
        Csvlint::Schema.from_json_table(@schema_url || "http://example.org ", json)
      else
        Csvlint::Schema.from_csvw_metadata(@schema_url || "http://example.org ", json)
      end
    end

    if @url.nil?
      @errors = []
      @warnings = []
      @schema.tables.keys.each do |table_url|
        validator = Csvlint::Validator.new(table_url, @csv_options, @schema)
        @errors += validator.errors
        @warnings += validator.warnings
      end
    else
      validator = Csvlint::Validator.new(@url, @csv_options, @schema)
      @errors = validator.errors
      @warnings = validator.warnings
    end
  rescue JSON::ParserError => e
    @errors = [e]
  rescue Csvlint::Csvw::MetadataError => e
    @errors = [e]
  end
end

Then(/^there should be errors$/) do
  # this test is only used for CSVW testing; :invalid_encoding & :line_breaks mask lack of real errors
  @errors.delete_if { |e| e.instance_of?(Csvlint::ErrorMessage) && [:invalid_encoding, :line_breaks].include?(e.type) }
  expect(@errors.count).to be > 0
end

Then(/^there should not be errors$/) do
  expect(@errors.count).to eq(0)
end

Then(/^there should be (\d+) error$/) do |count|
  expect(@errors.count).to eq(count.to_i)
end

Then(/^that error should have the type "(.*?)"$/) do |type|
  expect(@errors.first.type).to eq(type.to_sym)
end

Then(/^that error should have the row "(.*?)"$/) do |row|
  expect(@errors.first.row).to eq(row.to_i)
end

Then(/^that error should have the column "(.*?)"$/) do |column|
  expect(@errors.first.column).to eq(column.to_i)
end

Then(/^that error should have the content "(.*)"$/) do |content|
  expect(@errors.first.content.chomp).to eq(content.chomp)
end

Then(/^that error should have no content$/) do
  expect(@errors.first.content).to eq(nil)
end

Given(/^I have a CSV that doesn't exist$/) do
  @url = "http//www.example.com/fake-csv.csv"
  stub_request(:get, @url).to_return(status: 404)
end

Then(/^there should be no "(.*?)" errors$/) do |type|
  @errors.each { |error| error.type.should_not == type.to_sym }
end


================================================
FILE: features/step_definitions/validation_info_steps.rb
================================================
Given(/^I ask if there are info messages$/) do
  @csv_options ||= default_csv_options

  if @schema_json
    @schema = if @schema_type == :json_table
      Csvlint::Schema.from_json_table(@schema_url || "http://example.org ", JSON.parse(@schema_json))
    else
      Csvlint::Schema.from_csvw_metadata(@schema_url || "http://example.org ", JSON.parse(@schema_json))
    end
  end

  @validator = Csvlint::Validator.new(@url, @csv_options, @schema)
  @info_messages = @validator.info_messages
end

Then(/^there should be (\d+) info messages?$/) do |num|
  expect(@info_messages.count).to eq(num.to_i)
end

Then(/^one of the messages should have the type "(.*?)"$/) do |msg_type|
  expect(@info_messages.find { |x| x.type == msg_type.to_sym }).to be_present
end


================================================
FILE: features/step_definitions/validation_warnings_steps.rb
================================================
Given(/^it is encoded as "(.*?)"$/) do |encoding|
  @csv = @csv.encode(encoding)
  @encoding = encoding
end

Given(/^I set an encoding header of "(.*?)"$/) do |encoding|
  @encoding = encoding
end

Given(/^I do not set an encoding header$/) do
  @encoding = nil
end

Given(/^I have a CSV file called "(.*?)"$/) do |filename|
  @csv = File.read(File.join(File.dirname(__FILE__), "..", "fixtures", filename))
end

When(/^I ask if there are warnings$/) do
  @csv_options ||= default_csv_options
  if @schema_json
    @schema = if @schema_type == :json_table
      Csvlint::Schema.from_json_table(@schema_url || "http://example.org ", JSON.parse(@schema_json))
    else
      Csvlint::Schema.from_csvw_metadata(@schema_url || "http://example.org ", JSON.parse(@schema_json))
    end
  end

  @validator = Csvlint::Validator.new(@url, @csv_options, @schema)
  @warnings = @validator.warnings
end

Then(/^there should be warnings$/) do
  expect(@warnings.count).to be > 0
end

Then(/^there should not be warnings$/) do
  # this test is only used for CSVW testing, and :inconsistent_values warnings don't count in CSVW
  @warnings.delete_if { |w| [:inconsistent_values, :check_options].include?(w.type) }
  expect(@warnings.count).to eq(0)
end

Then(/^there should be (\d+) warnings$/) do |count|
  expect(@warnings.count).to eq(count.to_i)
end

Given(/^the content type is set to "(.*?)"$/) do |type|
  @content_type = type
end

Then(/^that warning should have the row "(.*?)"$/) do |row|
  expect(@warnings.first.row).to eq(row.to_i)
end

Then(/^that warning should have the column "(.*?)"$/) do |column|
  expect(@warnings.first.column).to eq(column.to_i)
end

Then(/^that warning should have the type "(.*?)"$/) do |type|
  expect(@warnings.first.type).to eq(type.to_sym)
end


================================================
FILE: features/support/aruba.rb
================================================
require "aruba"
require "aruba/cucumber"

require "csvlint/cli"

module Csvlint
  class CliRunner
    # Allow everything fun to be injected from the outside while defaulting to normal implementations.
    def initialize(argv, stdin = $stdin, stdout = $stdout, stderr = $stderr, kernel = Kernel)
      @argv, @stdin, @stdout, @stderr, @kernel = argv, stdin, stdout, stderr, kernel
    end

    def execute!
      exit_code = begin
        # Thor accesses these streams directly rather than letting them be injected, so we replace them...
        $stderr = @stderr
        $stdin = @stdin
        $stdout = @stdout

        # Run our normal Thor app the way we know and love.
        Csvlint::Cli.start(@argv.dup.unshift("validate"))

        # Thor::Base#start does not have a return value, assume success if no exception is raised.
        0
      rescue => e
        # The ruby interpreter would pipe this to STDERR and exit 1 in the case of an unhandled exception
        b = e.backtrace
        @stderr.puts("#{b.shift}: #{e.message} (#{e.class})")
        @stderr.puts(b.map { |s| "\tfrom #{s}" }.join("\n"))
        1
      rescue SystemExit => e
        e.status
      ensure
        # TODO: reset your app here, free up resources, etc.
        # Examples:
        # MyApp.logger.flush
        # MyApp.logger.close
        # MyApp.logger = nil
        #
        # MyApp.reset_singleton_instance_variables

        # ...then we put the streams back.
        $stderr = STDERR
        $stdin = STDIN
        $stdout = STDOUT
      end

      # Proxy our exit code back to the injected kernel.
      @kernel.exit(exit_code)
    end
  end
end

Aruba.configure do |config|
  config.command_launcher = :in_process
  config.main_class = Csvlint::CliRunner
end


================================================
FILE: features/support/earl_formatter.rb
================================================
require "rdf"
require "rdf/turtle"

class EarlFormatter
  def initialize(step_mother, io, options)
    output = RDF::Resource.new("")
    @graph = RDF::Graph.new
    @graph << [CSVLINT, RDF.type, RDF::DOAP.Project]
    @graph << [CSVLINT, RDF.type, EARL.TestSubject]
    @graph << [CSVLINT, RDF.type, EARL.Software]
    @graph << [CSVLINT, RDF::DOAP.name, "csvlint"]
    @graph << [CSVLINT, RDF::DC.title, "csvlint"]
    @graph << [CSVLINT, RDF::DOAP.description, "CSV validator"]
    @graph << [CSVLINT, RDF::DOAP.homepage, RDF::Resource.new("https://github.com/theodi/csvlint.rb")]
    @graph << [CSVLINT, RDF::DOAP.license, RDF::Resource.new("https://raw.githubusercontent.com/theodi/csvlint.rb/master/LICENSE.md")]
    @graph << [CSVLINT, RDF::DOAP["programming-language"], "Ruby"]
    @graph << [CSVLINT, RDF::DOAP.implements, RDF::Resource.new("http://www.w3.org/TR/tabular-data-model/")]
    @graph << [CSVLINT, RDF::DOAP.implements, RDF::Resource.new("http://www.w3.org/TR/tabular-metadata/")]
    @graph << [CSVLINT, RDF::DOAP.developer, ODI]
    @graph << [CSVLINT, RDF::DOAP.maintainer, ODI]
    @graph << [CSVLINT, RDF::DOAP.documenter, ODI]
    @graph << [CSVLINT, RDF::FOAF.maker, ODI]
    @graph << [CSVLINT, RDF::DC.creator, ODI]
    @graph << [output, RDF::FOAF["primaryTopic"], CSVLINT]
    @graph << [output, RDF::DC.issued, DateTime.now]
    @graph << [output, RDF::FOAF.maker, ODI]
    @graph << [ODI, RDF.type, RDF::FOAF.Organization]
    @graph << [ODI, RDF.type, EARL.Assertor]
    @graph << [ODI, RDF::FOAF.name, "Open Data Institute"]
    @graph << [ODI, RDF::FOAF.homepage, "https://theodi.org/"]
  end

  def scenario_name(keyword, name, file_colon_line, source_indent)
    @test = RDF::Resource.new("http://www.w3.org/2013/csvw/tests/#{name.split(" ")[0]}")
  end

  def after_steps(steps)
    passed = true
    steps.each do |s|
      passed = false unless s.status == :passed
    end
    a = RDF::Node.new
    @graph << [a, RDF.type, EARL.Assertion]
    @graph << [a, EARL.assertedBy, ODI]
    @graph << [a, EARL.subject, CSVLINT]
    @graph << [a, EARL.test, @test]
    @graph << [a, EARL.mode, EARL.automatic]
    r = RDF::Node.new
    @graph << [a, EARL.result, r]
    @graph << [r, RDF.type, EARL.TestResult]
    @graph << [r, EARL.outcome, passed ? EARL.passed : EARL.failed]
    @graph << [r, RDF::DC.date, DateTime.now]
  end

  def after_features(features)
    RDF::Writer.for(:ttl).open("csvlint-earl.ttl", {prefixes: {"earl" => EARL}, standard_prefixes: true, canonicalize: true, literal_shorthand: true}) do |writer|
      writer << @graph
    end
  end

  private

  EARL = RDF::Vocabulary.new("http://www.w3.org/ns/earl#")
  ODI = RDF::Resource.new("https://theodi.org/")
  CSVLINT = RDF::Resource.new("https://github.com/theodi/csvlint.rb")
end


================================================
FILE: features/support/env.rb
================================================
require "coveralls"
Coveralls.wear_merged!("test_frameworks")

$:.unshift File.join(File.dirname(__FILE__), "..", "..", "lib")

require "rspec/expectations"
require "cucumber/rspec/doubles"
require "csvlint"
require "byebug"

require "spork"

Spork.each_run do
  require "csvlint"
end

class CustomWorld
  def default_csv_options
    {}
  end
end

World do
  CustomWorld.new
end


================================================
FILE: features/support/load_tests.rb
================================================
require "json"
require "open-uri"
require "uri"

BASE_URI = "https://w3c.github.io/csvw/tests/"
BASE_PATH = File.join(File.dirname(__FILE__), "..", "fixtures", "csvw")
FEATURE_BASE_PATH = File.join(File.dirname(__FILE__), "..")
VALIDATION_FEATURE_FILE_PATH = File.join(FEATURE_BASE_PATH, "csvw_validation_tests.feature")
SCRIPT_FILE_PATH = File.join(File.dirname(__FILE__), "..", "..", "bin", "run-csvw-tests")

Dir.mkdir(BASE_PATH) unless Dir.exist?(BASE_PATH)

def cache_file(filename)
  file = File.join(BASE_PATH, filename)
  uri = URI.join(BASE_URI, filename)
  unless File.exist?(file)
    if filename.include? "/"
      levels = filename.split("/")[0..-2]
      (0..levels.length).each do |i|
        dir = File.join(BASE_PATH, levels[0..i].join("/"))
        Dir.mkdir(dir) unless Dir.exist?(dir)
      end
    end
    warn("storing #{file} locally")
    File.open(file, "wb") do |f|
      f.puts URI.open(uri, "rb").read
    end
  end
  uri
end

unless File.exist? SCRIPT_FILE_PATH
  File.open(SCRIPT_FILE_PATH, "w") do |file|
    File.chmod(0o755, SCRIPT_FILE_PATH)
    manifest = JSON.parse(URI.open("#{BASE_URI}manifest-validation.jsonld").read)
    manifest["entries"].each do |entry|
      type = "valid"
      case entry["type"]
      when "csvt:WarningValidationTest"
        type = "warnings"
      when "csvt:NegativeValidationTest"
        type = "errors"
      end
      file.puts "echo \"#{entry["id"].split("#")[-1]}: #{entry["name"].tr("`", "'")}\""
      file.puts "echo \"#{type}: #{entry["comment"].gsub("\"", "\\\"").tr("`", "'")}\""
      if entry["action"].end_with?(".json")
        file.puts "csvlint --schema=features/fixtures/csvw/#{entry["action"]}"
      elsif entry["option"] && entry["option"]["metadata"]
        file.puts "csvlint features/fixtures/csvw/#{entry["action"]} --schema=features/fixtures/csvw/#{entry["option"]["metadata"]}"
      else
        file.puts "csvlint features/fixtures/csvw/#{entry["action"]}"
      end
      file.puts "echo"
    end
  end
end

unless File.exist? VALIDATION_FEATURE_FILE_PATH
  File.open(VALIDATION_FEATURE_FILE_PATH, "w") do |file|
    file.puts "# Auto-generated file based on standard validation CSVW tests from #{BASE_URI}manifest-validation.jsonld"
    file.puts ""

    manifest = JSON.parse(URI.open("#{BASE_URI}manifest-validation.jsonld").read)

    file.puts "Feature: #{manifest["label"]}"
    file.puts ""

    manifest["entries"].each do |entry|
      action_uri = cache_file(entry["action"])
      metadata = nil
      provided_files = []
      missing_files = []
      file.puts "\t# #{entry["id"]}"
      file.puts "\t# #{entry["comment"]}"
      file.puts "\tScenario: #{entry["id"]} #{entry["name"].gsub("<", "less than")}"
      if entry["action"].end_with?(".json")
        file.puts "\t\tGiven I have a metadata file called \"csvw/#{entry["action"]}\""
        file.puts "\t\tAnd the metadata is stored at the url \"#{action_uri}\""
      else
        file.puts "\t\tGiven I have a CSV file called \"csvw/#{entry["action"]}\""
        file.puts "\t\tAnd it has a Link header holding \"#{entry["httpLink"]}\"" if entry["httpLink"]
        file.puts "\t\tAnd it is stored at the url \"#{action_uri}\""
        if entry["option"] && entry["option"]["metadata"]
          # no need to store the file here, as it will be listed in the 'implicit' list, which all get stored
          metadata = URI.join(BASE_URI, entry["option"]["metadata"])
          file.puts "\t\tAnd I have a metadata file called \"csvw/#{entry["option"]["metadata"]}\""
          file.puts "\t\tAnd the metadata is stored at the url \"#{metadata}\""
        end
        provided_files << action_uri.to_s
        if entry["name"].include?("/.well-known/csvm")
          file.puts "\t\tAnd I have a file called \"w3.org/.well-known/csvm\" at the url \"https://www.w3.org/.well-known/csvm\""
          missing_files << "#{action_uri}.json"
          missing_files << URI.join(action_uri, "csvm.json").to_s
        else
          missing_files << URI.join(action_uri, "/.well-known/csvm").to_s
        end
        missing_files << "#{action_uri}-metadata.json"
        missing_files << URI.join(action_uri, "csv-metadata.json").to_s
      end
      entry["implicit"]&.each do |implicit|
        implicit_uri = cache_file(implicit)
        provided_files << implicit_uri.to_s
        unless implicit_uri == metadata
          file.puts "\t\tAnd I have a file called \"csvw/#{implicit}\" at the url \"#{implicit_uri}\""
        end
      end
      missing_files.each do |uri|
        file.puts "\t\tAnd there is no file at the url \"#{uri}\"" unless provided_files.include? uri
      end
      file.puts "\t\tWhen I carry out CSVW validation"
      if entry["type"] == "csvt:WarningValidationTest"
        file.puts "\t\tThen there should not be errors"
        file.puts "\t\tAnd there should be warnings"
      elsif entry["type"] == "csvt:NegativeValidationTest"
        file.puts "\t\tThen there should be errors"
      else
        file.puts "\t\tThen there should not be errors"
        file.puts "\t\tAnd there should not be warnings"
      end
      file.puts "\t"
    end
  end
end


================================================
FILE: features/support/webmock.rb
================================================
require "webmock/cucumber"

WebMock.disable_net_connect!(allow: %r{csvw/tests})


================================================
FILE: features/validation_errors.feature
================================================
Feature: Get validation errors

  Scenario: CSV with ragged rows
    Given I have a CSV with the following content:
    """
"col1","col2","col3"
"1","2","3"
"4","5"
    """
    And it is stored at the url "http://example.com/example1.csv"
    When I ask if there are errors
    Then there should be 1 error
    And that error should have the type "ragged_rows"
    And that error should have the row "3"
    And that error should have the content ""4","5""

  Scenario: CSV with incorrect quoting
    Given I have a CSV with the following content:
    """
"col1","col2","col3"
"Foo","Bar","Baz
    """
    And it is stored at the url "http://example.com/example1.csv"
    When I ask if there are errors
    Then there should be 1 error
    And that error should have the type "unclosed_quote"
    And that error should have the row "2"
    And that error should have the content ""Foo","Bar","Baz"

  Scenario: Successfully report a CSV with incorrect whitespace
    Given I have a CSV with the following content:
    """
"col1","col2","col3"
"Foo","Bar",   "Baz"
    """
    And it is stored at the url "http://example.com/example1.csv"
    When I ask if there are errors
    Then there should be 1 error
    And that error should have the type "whitespace"
    And that error should have the row "2"
    And that error should have the content ""Foo","Bar",   "Baz""

  Scenario: Successfully report a CSV with blank rows
    Given I have a CSV with the following content:
    """
"col1","col2","col3"
"Foo","Bar","Baz"
"","",
"Baz","Bar","Foo"
    """
    And it is stored at the url "http://example.com/example1.csv"
    When I ask if there are errors
    Then there should be 1 error
    And that error should have the type "blank_rows"
    And that error should have the row "3"
    And that error should have the content ""","","

  Scenario: Successfully report a CSV with multiple trailing empty rows
    Given I have a CSV with the following content:
    """
"col1","col2","col3"
"Foo","Bar","Baz"
"Foo","Bar","Baz"


    """
    And it is stored at the url "http://example.com/example1.csv"
    When I ask if there are errors
    Then there should be 1 error
    And that error should have the type "blank_rows"
    And that error should have the row "4"

  Scenario: Successfully report a CSV with an empty row
    Given I have a CSV with the following content:
    """
"col1","col2","col3"
"Foo","Bar","Baz"

"Foo","Bar","Baz"
    """
    And it is stored at the url "http://example.com/example1.csv"
    When I ask if there are errors
    Then there should be 1 error
    And that error should have the type "blank_rows"
    And that error should have the row "3"

  Scenario: Report invalid Encoding
    Given I have a CSV file called "invalid-byte-sequence.csv"
    And I set an encoding header of "UTF-8"
    And it is stored at the url "http://example.com/example1.csv"
    When I ask if there are errors
    Then there should be 1 error
    And that error should have the type "invalid_encoding"

  Scenario: Correctly handle different encodings
    Given I have a CSV file called "invalid-byte-sequence.csv"
    And I set an encoding header of "ISO-8859-1"
    And it is stored at the url "http://example.com/example1.csv"
    When I ask if there are errors
    Then there should be no "content_encoding" errors

  Scenario: Report invalid file
    Given I have a CSV file called "spreadsheet.xls"
    And it is stored at the url "http://example.com/example1.csv"
    When I ask if there are errors
    Then there should be 1 error
    And that error should have the type "invalid_encoding"

  Scenario: Incorrect extension
    Given I have a CSV with the following content:
    """
"abc","2","3"
    """
    And the content type is set to "application/excel"
    And it is stored at the url "http://example.com/example1.csv"
    And I ask if there are errors
    Then there should be 1 error
    And that error should have the type "wrong_content_type"

  Scenario: Handles urls that 404
    Given I have a CSV that doesn't exist
    When I ask if there are errors
    Then there should be 1 error
    And that error should have the type "not_found"

  Scenario: Incorrect line endings specified in settings
    Given I have a CSV file called "cr-line-endings.csv"
    And I set the line endings to linefeed
    And it is stored at the url "http://example.com/example1.csv"
    And I ask if there are errors
    Then there should be 1 error
    And that error should have the type "line_breaks"

  Scenario: inconsistent line endings in file cause an error
    Given I have a CSV file called "inconsistent-line-endings.csv"
    And it is stored at the url "http://example.com/example1.csv"
    And I ask if there are errors
    Then there should be 1 error
    And that error should have the type "line_breaks"


  Scenario: inconsistent line endings with unquoted fields in file cause an error
    Given I have a CSV file called "inconsistent-line-endings-unquoted.csv"
    And it is stored at the url "http://example.com/example1.csv"
    And I ask if there are errors
    Then there should be 1 error
    And that error should have the type "line_breaks"


================================================
FILE: features/validation_info.feature
================================================
Feature: Get validation information messages

  Scenario: LF line endings in file give an info message
    Given I have a CSV file called "lf-line-endings.csv"
    And it is stored at the url "http://example.com/example1.csv"
    And I set header to "true"
    And I ask if there are info messages
    Then there should be 1 info messages
    And one of the messages should have the type "nonrfc_line_breaks"

  Scenario: CRLF line endings in file produces no info messages
    Given I have a CSV file called "crlf-line-endings.csv"
    And it is stored at the url "http://example.com/example1.csv"
    And I set header to "true"
    And I ask if there are info messages
    Then there should be 0 info messages


================================================
FILE: features/validation_warnings.feature
================================================
Feature: Validation warnings

  Scenario: UTF-8 Encoding
    Given I have a CSV with the following content:
    """
"col1","col2","col3"
"abc","2","3"
    """
    And it is encoded as "utf-8"
    And it is stored at the url "http://example.com/example1.csv"
    When I ask if there are warnings
    Then there should be 0 warnings

   Scenario: ISO-8859-1 Encoding
    Given I have a CSV with the following content:
    """
"col1","col2","col3"
"1","2","3"
    """
     And it is encoded as "iso-8859-1"
    And it is stored at the url "http://example.com/example1.csv"
    When I ask if there are warnings
    Then there should be 1 warnings

  Scenario: Correct content type
    Given I have a CSV with the following content:
    """
"col1","col2","col3"
"abc","2","3"
    """
    And the content type is set to "text/csv"
    And it is stored at the url "http://example.com/example1.csv"
    And I ask if there are warnings
    Then there should be 0 warnings

  Scenario: No extension
    Given I have a CSV with the following content:
    """
"col1","col2","col3"
"abc","2","3"
    """
    And the content type is set to "text/csv"
    And it is stored at the url "http://example.com/example1"
    And I ask if there are warnings
    Then there should be 0 warnings

  Scenario: Allow query params after extension
    Given I have a CSV with the following content:
    """
"col1","col2","col3"
"abc","2","3"
    """
    And the content type is set to "text/csv"
    And it is stored at the url "http://example.com/example1.csv?query=param"
    And I ask if there are warnings
    Then there should be 0 warnings

  Scenario: User doesn't supply encoding
    Given I have a CSV with the following content:
    """
"col1","col2","col3"
"abc","2","3"
    """
    And it is stored at the url "http://example.com/example1.csv" with no character set
    When I ask if there are warnings
    Then there should be 1 warnings
    And that warning should have the type "no_encoding"

  Scenario: Title rows
    Given I have a CSV file called "title-row.csv"
    And it is stored at the url "http://example.com/example1.csv"
    And I ask if there are warnings
    Then there should be 1 warnings
    And that warning should have the type "title_row"

  Scenario: catch excel warnings
    Given I parse a file called "spreadsheet.xls"
    And I ask if there are warnings
    Then there should be 1 warnings
    And that warning should have the type "excel"

  Scenario: catch excel warnings
    Given I parse a file called "spreadsheet.xlsx"
    And I ask if there are warnings
    Then there should be 1 warnings
    And that warning should have the type "excel"


================================================
FILE: gemfiles/activesupport_5.2.gemfile
================================================
# This file was generated by Appraisal

source "https://rubygems.org"

gem "activesupport", "~> 5.2.0"

gemspec path: "../"


================================================
FILE: gemfiles/activesupport_6.0.gemfile
================================================
# This file was generated by Appraisal

source "https://rubygems.org"

gem "activesupport", "~> 6.0.0"

gemspec path: "../"


================================================
FILE: gemfiles/activesupport_6.1.gemfile
================================================
# This file was generated by Appraisal

source "https://rubygems.org"

gem "activesupport", "~> 6.1.0"

gemspec path: "../"


================================================
FILE: gemfiles/activesupport_7.0.gemfile
================================================
# This file was generated by Appraisal

source "https://rubygems.org"

gem "activesupport", "~> 7.0.0"

gemspec path: "../"


================================================
FILE: gemfiles/activesupport_7.1.gemfile
================================================
# This file was generated by Appraisal

source "https://rubygems.org"

gem "activesupport", "~> 7.1.0"

gemspec path: "../"


================================================
FILE: gemfiles/activesupport_7.2.gemfile
================================================
# This file was generated by Appraisal

source "https://rubygems.org"

gem "activesupport", "~> 7.2.0"

gemspec path: "../"


================================================
FILE: lib/csvlint/cli.rb
================================================
require "csvlint"
require "rainbow"
require "active_support/json"
require "json"
require "thor"

require "active_support/inflector"

module Csvlint
  class Cli < Thor
    desc "myfile.csv OR csvlint http://example.com/myfile.csv", "Supports validating CSV files to check their syntax and contents"

    option :dump_errors, desc: "Pretty print error and warning objects.", type: :boolean, aliases: :d
    option :schema, banner: "FILENAME OR URL", desc: "Schema file", aliases: :s
    option :json, desc: "Output errors as JSON", type: :boolean, aliases: :j
    option :werror, desc: "Make all warnings into errors", type: :boolean, aliases: :w

    def validate(source = nil)
      source = read_source(source)
      @schema = get_schema(options[:schema]) if options[:schema]
      fetch_schema_tables(@schema, options) if source.nil?

      Rainbow.enabled = $stdout.tty?

      valid = validate_csv(source, @schema, options[:dump_errors], options[:json], options[:werror])
      exit 1 unless valid
    end

    def help
      self.class.command_help(shell, :validate)
    end

    default_task :validate

    private

    def read_source(source)
      if source.nil?
        # If no source is present, try reading from stdin
        if !$stdin.tty?
          source = begin
            StringIO.new($stdin.read)
          rescue
            nil
          end
          return_error "No CSV data to validate" if !options[:schema] && source.nil?
        end
      else
        # If the source isn't a URL, it's a file
        unless /^http(s)?/.match?(source)
          begin
            source = File.new(source)
          rescue Errno::ENOENT
            return_error "#{source} not found"
          end
        end
      end

      source
    end

    def get_schema(schema)
      begin
        schema = Csvlint::Schema.load_from_uri(schema, false)
      rescue Csvlint::Csvw::MetadataError => e
        return_error "invalid metadata: #{e.message}#{" at " + e.path if e.path}"
      rescue OpenURI::HTTPError, Errno::ENOENT
        return_error "#{options[:schema]} not found"
      end

      if schema.instance_of?(Csvlint::Schema) && schema.description == "malformed"
        return_error "invalid metadata: malformed JSON"
      end

      schema
    end

    def fetch_schema_tables(schema, options)
      valid = true

      unless schema.instance_of? Csvlint::Csvw::TableGroup
        return_error "No CSV data to validate."
      end
      schema.tables.keys.each do |source|
        unless /^http(s)?/.match?(source)
          begin
            source = source.sub("file:", "")
            source = File.new(source)
          rescue Errno::ENOENT
            return_error "#{source} not found"
          end
        end
        valid &= validate_csv(source, schema, options[:dump_errors], nil, options[:werror])
      end

      exit 1 unless valid
    end

    def print_error(index, error, dump, color)
      location = ""
      location += error.row.to_s if error.row
      location += "#{error.row ? "," : ""}#{error.column}" if error.column
      if error.row || error.column
        location = "#{error.row ? "Row" : "Column"}: #{location}"
      end
      output_string = "#{index + 1}. "
      if error.column && @schema&.instance_of?(Csvlint::Schema)
        unless @schema.fields[error.column - 1].nil?
          output_string += "#{@schema.fields[error.column - 1].name}: "
        end
      end
      output_string += error.type.to_s
      output_string += ". #{location}" unless location.empty?
      output_string += ". #{error.content}" if error.content

      puts Rainbow(output_string).color(color)

      if dump
        pp error
      end
    end

    def print_errors(errors, dump)
      if errors.size > 0
        errors.each_with_index { |error, i| print_error(i, error, dump, :red) }
      end
    end

    def return_error(message)
      puts Rainbow(message).red
      exit 1
    end

    def validate_csv(source, schema, dump, json, werror)
      @error_count = 0

      validator = if json === true
        Csvlint::Validator.new(source, {}, schema)
      else
        Csvlint::Validator.new(source, {}, schema, {lambda: report_lines})
      end

      csv = if source.instance_of?(String)
        source
      elsif source.instance_of?(File)
        source.path
      else
        "CSV"
      end

      if json === true
        json = {
          validation: {
            state: validator.valid? ? "valid" : "invalid",
            errors: validator.errors.map { |v| hashify(v) },
            warnings: validator.warnings.map { |v| hashify(v) },
            info: validator.info_messages.map { |v| hashify(v) }
          }
        }.to_json
        print json
      else
        puts "\r\n#{csv} is #{validator.valid? ? Rainbow("VALID").green : Rainbow("INVALID").red}"
        print_errors(validator.errors, dump)
        print_errors(validator.warnings, dump)
      end

      return false if werror && validator.warnings.size > 0
      validator.valid?
    end

    def hashify(error)
      h = {
        type: error.type,
        category: error.category,
        row: error.row,
        col: error.column
      }

      if error.column && @schema&.instance_of?(Csvlint::Schema) && !@schema.fields[error.column - 1].nil?
        field = @schema.fields[error.column - 1]
        h[:header] = field.name
        h[:constraints] = field.constraints.map { |k, v| [k.underscore, v] }.to_h
      end

      h
    end

    def report_lines
      lambda do |row|
        new_errors = row.errors.count
        if new_errors > @error_count
          print Rainbow("!").red
        else
          print Rainbow(".").green
        end
        @error_count = new_errors
      end
    end
  end
end


================================================
FILE: lib/csvlint/csvw/column.rb
================================================
module Csvlint
  module Csvw
    class Column
      include Csvlint::ErrorCollector

      attr_reader :id, :about_url, :datatype, :default, :lang, :name, :null, :number, :ordered, :property_url, :required, :separator, :source_number, :suppress_output, :text_direction, :default_name, :titles, :value_url, :virtual, :annotations

      def initialize(number, name, id: nil, about_url: nil, datatype: {"@id" => "http://www.w3.org/2001/XMLSchema#string"}, default: "", lang: "und", null: [""], ordered: false, property_url: nil, required: false, separator: nil, source_number: nil, suppress_output: false, text_direction: :inherit, default_name: nil, titles: {}, value_url: nil, virtual: false, annotations: [], warnings: [])
        @number = number
        @name = name
        @id = id
        @about_url = about_url
        @datatype = datatype
        @default = default
        @lang = lang
        @null = null
        @ordered = ordered
        @property_url = property_url
        @required = required
        @separator = separator
        @source_number = source_number || number
        @suppress_output = suppress_output
        @text_direction = text_direction
        @default_name = default_name
        @titles = titles
        @value_url = value_url
        @virtual = virtual
        @annotations = annotations
        reset
        @warnings += warnings
      end

      def self.from_json(number, column_desc, base_url = nil, lang = "und", inherited_properties = {})
        annotations = {}
        warnings = []
        column_properties = {}
        inherited_properties = inherited_properties.clone

        column_desc.each do |property, value|
          if property == "@type"
            raise Csvlint::Csvw::MetadataError.new("columns[#{number}].@type"), "@type of column is not 'Column'" if value != "Column"
          else
            v, warning, type = Csvw::PropertyChecker.check_property(property, value, base_url, lang)
            warnings += Array(warning).map { |w| Csvlint::ErrorMessage.new(w, :metadata, nil, nil, "#{property}: #{value}", nil) } unless warning.nil? || warning.empty?
            if type == :annotation
              annotations[property] = v
            elsif type == :common || type == :column
              column_properties[property] = v
            elsif type == :inherited
              inherited_properties[property] = v
            else
              warnings << Csvlint::ErrorMessage.new(:invalid_property, :metadata, nil, nil, "column: #{property}", nil)
            end
          end
        end

        new(number, column_properties["name"],
          id: column_properties["@id"],
          datatype: inherited_properties["datatype"] || {"@id" => "http://www.w3.org/2001/XMLSchema#string"},
          lang: inherited_properties["lang"] || "und",
          null: inherited_properties["null"] || [""],
          default: inherited_properties["default"] || "",
          about_url: inherited_properties["aboutUrl"],
          property_url: inherited_properties["propertyUrl"],
          value_url: inherited_properties["valueUrl"],
          required: inherited_properties["required"] || false,
          separator: inherited_properties["separator"],
          ordered: inherited_properties["ordered"] || false,
          default_name: column_properties["titles"] && column_properties["titles"][lang] ? column_properties["titles"][lang][0] : nil,
          titles: column_properties["titles"],
          suppress_output: column_properties["suppressOutput"] || false,
          virtual: column_properties["virtual"] || false,
          annotations: annotations,
          warnings: warnings)
      end

      def validate_header(header, strict)
        reset
        if strict || @titles
          valid_headers = @titles ? @titles.map { |l, v| v if Column.languages_match(l, lang) }.flatten : []
          unless valid_headers.include? header
            if strict
              build_errors(:invalid_header, :schema, 1, @number, header, @titles)
            else
              build_warnings(:invalid_header, :schema, 1, @number, header, @titles)
            end
          end
        end
        valid?
      end

      def validate(string_value, row = nil)
        reset
        string_value ||= @default
        if null.include? string_value
          validate_required(nil, row)
          nil

        else
          string_values = @separator.nil? ? [string_value] : string_value.split(@separator)
          values = []
          string_values.each do |s|
            invalid = false
            value, warning = DATATYPE_PARSER[@datatype["base"] || @datatype["@id"]].call(s, @datatype["format"])
            if warning.nil?
              validate_required(value, row)
              invalid = !validate_format(value, row) || invalid
              invalid = !validate_length(value, row) || invalid
              invalid = !validate_value(value, row) || invalid
              values << (invalid ? {invalid: s} : value)
            else
              build_errors(warning, :schema, row, @number, s, @datatype)
              values << {invalid: s}
            end
          end
          values && @separator.nil? ? values[0] : values

        end
      end

      private

      class << self
        def create_date_parser(type, warning)
          lambda { |value, format|
            format = Csvlint::Csvw::DateFormat.new(nil, type) if format.nil?
            v = format.parse(value)
            return nil, warning if v.nil?
            [v, nil]
          }
        end

        def create_regexp_based_parser(regexp, warning)
          lambda { |value, format|
            return nil, warning unless value&.match?(regexp)
            [value, nil]
          }
        end

        def languages_match(l1, l2)
          return true if l1 == l2 || l1 == "und" || l2 == "und"
          return true if l1 =~ Regexp.new("^#{l2}-") || l2 =~ Regexp.new("^#{l1}-")
          false
        end
      end

      def validate_required(value, row)
        if @required && value.nil?
          build_errors(:required, :schema, row, number, value, {"required" => @required})
          return false
        end
        true
      end

      def validate_length(value, row)
        valid = true
        if datatype["length"] || datatype["minLength"] || datatype["maxLength"]
          length = value.length
          length = value.gsub(/==?$/, "").length * 3 / 4 if datatype["@id"] == "http://www.w3.org/2001/XMLSchema#base64Binary" || datatype["base"] == "http://www.w3.org/2001/XMLSchema#base64Binary"
          length = value.length / 2 if datatype["@id"] == "http://www.w3.org/2001/XMLSchema#hexBinary" || datatype["base"] == "http://www.w3.org/2001/XMLSchema#hexBinary"

          if datatype["minLength"] && length < datatype["minLength"]
            build_errors(:min_length, :schema, row, number, value, {"minLength" => datatype["minLength"]})
            valid = false
          end
          if datatype["maxLength"] && length > datatype["maxLength"]
            build_errors(:max_length, :schema, row, number, value, {"maxLength" => datatype["maxLength"]})
            valid = false
          end
          if datatype["length"] && length != datatype["length"]
            build_errors(:length, :schema, row, number, value, {"length" => datatype["length"]})
            valid = false
          end
        end
        valid
      end

      def validate_format(value, row)
        if datatype["format"]
          unless DATATYPE_FORMAT_VALIDATION[datatype["base"]].call(value, datatype["format"])
            build_errors(:format, :schema, row, number, value, {"format" => datatype["format"]})
            return false
          end
        end
        true
      end

      def validate_value(value, row)
        valid = true
        if datatype["minInclusive"] && ((value.is_a? Hash) ? (value[:dateTime] < datatype["minInclusive"][:dateTime]) : (value < datatype["minInclusive"]))
          build_errors(:min_inclusive, :schema, row, number, value, {"minInclusive" => datatype["minInclusive"]})
          valid = false
        end
        if datatype["maxInclusive"] && ((value.is_a? Hash) ? (value[:dateTime] > datatype["maxInclusive"][:dateTime]) : (value > datatype["maxInclusive"]))
          build_errors(:max_inclusive, :schema, row, number, value, {"maxInclusive" => datatype["maxInclusive"]})
          valid = false
        end
        if datatype["minExclusive"] && ((value.is_a? Hash) ? (value[:dateTime] <= datatype["minExclusive"][:dateTime]) : (value <= datatype["minExclusive"]))
          build_errors(:min_exclusive, :schema, row, number, value, {"minExclusive" => datatype["minExclusive"]})
          valid = false
        end
        if datatype["maxExclusive"] && ((value.is_a? Hash) ? (value[:dateTime] >= datatype["maxExclusive"][:dateTime]) : (value >= datatype["maxExclusive"]))
          build_errors(:max_exclusive, :schema, row, number, value, {"maxExclusive" => datatype["maxExclusive"]})
          valid = false
        end
        valid
      end

      REGEXP_VALIDATION = lambda { |value, format| value =~ format }

      NO_ADDITIONAL_VALIDATION = lambda { |value, format| true }

      DATATYPE_FORMAT_VALIDATION = {
        "http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral" => REGEXP_VALIDATION,
        "http://www.w3.org/1999/02/22-rdf-syntax-ns#HTML" => REGEXP_VALIDATION,
        "http://www.w3.org/ns/csvw#JSON" => REGEXP_VALIDATION,
        "http://www.w3.org/2001/XMLSchema#anyAtomicType" => REGEXP_VALIDATION,
        "http://www.w3.org/2001/XMLSchema#anyURI" => REGEXP_VALIDATION,
        "http://www.w3.org/2001/XMLSchema#base64Binary" => REGEXP_VALIDATION,
        "http://www.w3.org/2001/XMLSchema#boolean" => NO_ADDITIONAL_VALIDATION,
        "http://www.w3.org/2001/XMLSchema#date" => NO_ADDITIONAL_VALIDATION,
        "http://www.w3.org/2001/XMLSchema#dateTime" => NO_ADDITIONAL_VALIDATION,
        "http://www.w3.org/2001/XMLSchema#dateTimeStamp" => NO_ADDITIONAL_VALIDATION,
        "http://www.w3.org/2001/XMLSchema#decimal" => NO_ADDITIONAL_VALIDATION,
        "http://www.w3.org/2001/XMLSchema#integer" => NO_ADDITIONAL_VALIDATION,
        "http://www.w3.org/2001/XMLSchema#long" => NO_ADDITIONAL_VALIDATION,
        "http://www.w3.org/2001/XMLSchema#int" => NO_ADDITIONAL_VALIDATION,
        "http://www.w3.org/2001/XMLSchema#short" => NO_ADDITIONAL_VALIDATION,
        "http://www.w3.org/2001/XMLSchema#byte" => NO_ADDITIONAL_VALIDATION,
        "http://www.w3.org/2001/XMLSchema#nonNegativeInteger" => NO_ADDITIONAL_VALIDATION,
        "http://www.w3.org/2001/XMLSchema#positiveInteger" => NO_ADDITIONAL_VALIDATION,
        "http://www.w3.org/2001/XMLSchema#unsignedLong" => NO_ADDITIONAL_VALIDATION,
        "http://www.w3.org/2001/XMLSchema#unsignedInt" => NO_ADDITIONAL_VALIDATION,
        "http://www.w3.org/2001/XMLSchema#unsignedShort" => NO_ADDITIONAL_VALIDATION,
        "http://www.w3.org/2001/XMLSchema#unsignedByte" => NO_ADDITIONAL_VALIDATION,
        "http://www.w3.org/2001/XMLSchema#nonPositiveInteger" => NO_ADDITIONAL_VALIDATION,
        "http://www.w3.org/2001/XMLSchema#negativeInteger" => NO_ADDITIONAL_VALIDATION,
        "http://www.w3.org/2001/XMLSchema#double" => NO_ADDITIONAL_VALIDATION,
        "http://www.w3.org/2001/XMLSchema#duration" => REGEXP_VALIDATION,
        "http://www.w3.org/2001/XMLSchema#dayTimeDuration" => REGEXP_VALIDATION,
        "http://www.w3.org/2001/XMLSchema#yearMonthDuration" => REGEXP_VALIDATION,
        "http://www.w3.org/2001/XMLSchema#float" => NO_ADDITIONAL_VALIDATION,
        "http://www.w3.org/2001/XMLSchema#gDay" => NO_ADDITIONAL_VALIDATION,
        "http://www.w3.org/2001/XMLSchema#gMonth" => NO_ADDITIONAL_VALIDATION,
        "http://www.w3.org/2001/XMLSchema#gMonthDay" => NO_ADDITIONAL_VALIDATION,
        "http://www.w3.org/2001/XMLSchema#gYear" => NO_ADDITIONAL_VALIDATION,
        "http://www.w3.org/2001/XMLSchema#gYearMonth" => NO_ADDITIONAL_VALIDATION,
        "http://www.w3.org/2001/XMLSchema#hexBinary" => REGEXP_VALIDATION,
        "http://www.w3.org/2001/XMLSchema#QName" => REGEXP_VALIDATION,
        "http://www.w3.org/2001/XMLSchema#string" => REGEXP_VALIDATION,
        "http://www.w3.org/2001/XMLSchema#normalizedString" => REGEXP_VALIDATION,
        "http://www.w3.org/2001/XMLSchema#token" => REGEXP_VALIDATION,
        "http://www.w3.org/2001/XMLSchema#language" => REGEXP_VALIDATION,
        "http://www.w3.org/2001/XMLSchema#Name" => REGEXP_VALIDATION,
        "http://www.w3.org/2001/XMLSchema#NMTOKEN" => REGEXP_VALIDATION,
        "http://www.w3.org/2001/XMLSchema#time" => NO_ADDITIONAL_VALIDATION
      }

      TRIM_VALUE = lambda { |value, format| [value.strip, nil] }
      ALL_VALUES_VALID = lambda { |value, format| [value, nil] }

      NUMERIC_PARSER = lambda { |value, format, integer = false|
        format = Csvlint::Csvw::NumberFormat.new(nil, nil, ".", integer) if format.nil?
        v = format.parse(value)
        return nil, :invalid_number if v.nil?
        [v, nil]
      }

      DATATYPE_PARSER = {
        "http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral" => TRIM_VALUE,
        "http://www.w3.org/1999/02/22-rdf-syntax-ns#HTML" => TRIM_VALUE,
        "http://www.w3.org/ns/csvw#JSON" => TRIM_VALUE,
        "http://www.w3.org/2001/XMLSchema#anyAtomicType" => ALL_VALUES_VALID,
        "http://www.w3.org/2001/XMLSchema#anyURI" => TRIM_VALUE,
        "http://www.w3.org/2001/XMLSchema#base64Binary" => TRIM_VALUE,
        "http://www.w3.org/2001/XMLSchema#boolean" => lambda { |value, format|
          if format.nil?
            return true, nil if ["true", "1"].include? value
            return false, nil if ["false", "0"].include? value
          else
            return true, nil if value == format[0]
            return false, nil if value == format[1]
          end
          [value, :invalid_boolean]
        },
        "http://www.w3.org/2001/XMLSchema#date" =>
          create_date_parser("http://www.w3.org/2001/XMLSchema#date", :invalid_date),
        "http://www.w3.org/2001/XMLSchema#dateTime" =>
          create_date_parser("http://www.w3.org/2001/XMLSchema#dateTime", :invalid_date_time),
        "http://www.w3.org/2001/XMLSchema#dateTimeStamp" =>
          create_date_parser("http://www.w3.org/2001/XMLSchema#dateTimeStamp", :invalid_date_time_stamp),
        "http://www.w3.org/2001/XMLSchema#decimal" => lambda { |value, format|
          return nil, :invalid_decimal if /(E|e|^(NaN|INF|-INF)$)/.match?(value)
          NUMERIC_PARSER.call(value, format)
        },
        "http://www.w3.org/2001/XMLSchema#integer" => lambda { |value, format|
          v, w = NUMERIC_PARSER.call(value, format, true)
          return v, :invalid_integer unless w.nil?
          return nil, :invalid_integer unless v.is_a? Integer
          [v, w]
        },
        "http://www.w3.org/2001/XMLSchema#long" => lambda { |value, format|
          v, w = DATATYPE_PARSER["http://www.w3.org/2001/XMLSchema#integer"].call(value, format)
          return v, :invalid_long unless w.nil?
          return nil, :invalid_long unless v <= 9223372036854775807 && v >= -9223372036854775808
          [v, w]
        },
        "http://www.w3.org/2001/XMLSchema#int" => lambda { |value, format|
          v, w = DATATYPE_PARSER["http://www.w3.org/2001/XMLSchema#integer"].call(value, format)
          return v, :invalid_int unless w.nil?
          return nil, :invalid_int unless v <= 2147483647 && v >= -2147483648
          [v, w]
        },
        "http://www.w3.org/2001/XMLSchema#short" => lambda { |value, format|
          v, w = DATATYPE_PARSER["http://www.w3.org/2001/XMLSchema#integer"].call(value, format)
          return v, :invalid_short unless w.nil?
          return nil, :invalid_short unless v <= 32767 && v >= -32768
          [v, w]
        },
        "http://www.w3.org/2001/XMLSchema#byte" => lambda { |value, format|
          v, w = DATATYPE_PARSER["http://www.w3.org/2001/XMLSchema#integer"].call(value, format)
          return v, :invalid_byte unless w.nil?
          return nil, :invalid_byte unless v <= 127 && v >= -128
          [v, w]
        },
        "http://www.w3.org/2001/XMLSchema#nonNegativeInteger" => lambda { |value, format|
          v, w = DATATYPE_PARSER["http://www.w3.org/2001/XMLSchema#integer"].call(value, format)
          return v, :invalid_nonNegativeInteger unless w.nil?
          return nil, :invalid_nonNegativeInteger unless v >= 0
          [v, w]
        },
        "http://www.w3.org/2001/XMLSchema#positiveInteger" => lambda { |value, format|
          v, w = DATATYPE_PARSER["http://www.w3.org/2001/XMLSchema#integer"].call(value, format)
          return v, :invalid_positiveInteger unless w.nil?
          return nil, :invalid_positiveInteger unless v > 0
          [v, w]
        },
        "http://www.w3.org/2001/XMLSchema#unsignedLong" => lambda { |value, format|
          v, w = DATATYPE_PARSER["http://www.w3.org/2001/XMLSchema#nonNegativeInteger"].call(value, format)
          return v, :invalid_unsignedLong unless w.nil?
          return nil, :invalid_unsignedLong unless v <= 18446744073709551615
          [v, w]
        },
        "http://www.w3.org/2001/XMLSchema#unsignedInt" => lambda { |value, format|
          v, w = DATATYPE_PARSER["http://www.w3.org/2001/XMLSchema#nonNegativeInteger"].call(value, format)
          return v, :invalid_unsignedInt unless w.nil?
          return nil, :invalid_unsignedInt unless v <= 4294967295
          [v, w]
        },
        "http://www.w3.org/2001/XMLSchema#unsignedShort" => lambda { |value, format|
          v, w = DATATYPE_PARSER["http://www.w3.org/2001/XMLSchema#nonNegativeInteger"].call(value, format)
          return v, :invalid_unsignedShort unless w.nil?
          return nil, :invalid_unsignedShort unless v <= 65535
          [v, w]
        },
        "http://www.w3.org/2001/XMLSchema#unsignedByte" => lambda { |value, format|
          v, w = DATATYPE_PARSER["http://www.w3.org/2001/XMLSchema#nonNegativeInteger"].call(value, format)
          return v, :invalid_unsignedByte unless w.nil?
          return nil, :invalid_unsignedByte unless v <= 255
          [v, w]
        },
        "http://www.w3.org/2001/XMLSchema#nonPositiveInteger" => lambda { |value, format|
          v, w = DATATYPE_PARSER["http://www.w3.org/2001/XMLSchema#integer"].call(value, format)
          return v, :invalid_nonPositiveInteger unless w.nil?
          return nil, :invalid_nonPositiveInteger unless v <= 0
          [v, w]
        },
        "http://www.w3.org/2001/XMLSchema#negativeInteger" => lambda { |value, format|
          v, w = DATATYPE_PARSER["http://www.w3.org/2001/XMLSchema#integer"].call(value, format)
          return v, :invalid_negativeInteger unless w.nil?
          return nil, :invalid_negativeInteger unless v < 0
          [v, w]
        },
        "http://www.w3.org/2001/XMLSchema#double" => NUMERIC_PARSER,
        # regular expressions here taken from XML Schema datatypes spec
        "http://www.w3.org/2001/XMLSchema#duration" =>
          create_regexp_based_parser(/-?P((([0-9]+Y([0-9]+M)?([0-9]+D)?|([0-9]+M)([0-9]+D)?|([0-9]+D))(T(([0-9]+H)([0-9]+M)?([0-9]+(\.[0-9]+)?S)?|([0-9]+M)([0-9]+(\.[0-9]+)?S)?|([0-9]+(\.[0-9]+)?S)))?)|(T(([0-9]+H)([0-9]+M)?([0-9]+(\.[0-9]+)?S)?|([0-9]+M)([0-9]+(\.[0-9]+)?S)?|([0-9]+(\.[0-9]+)?S))))/, :invalid_duration),
        "http://www.w3.org/2001/XMLSchema#dayTimeDuration" =>
          create_regexp_based_parser(/-?P(([0-9]+D(T(([0-9]+H)([0-9]+M)?([0-9]+(\.[0-9]+)?S)?|([0-9]+M)([0-9]+(\.[0-9]+)?S)?|([0-9]+(\.[0-9]+)?S)))?)|(T(([0-9]+H)([0-9]+M)?([0-9]+(\.[0-9]+)?S)?|([0-9]+M)([0-9]+(\.[0-9]+)?S)?|([0-9]+(\.[0-9]+)?S))))/, :invalid_dayTimeDuration),
        "http://www.w3.org/2001/XMLSchema#yearMonthDuration" =>
          create_regexp_based_parser(/-?P([0-9]+Y([0-9]+M)?|([0-9]+M))/, :invalid_duration),
        "http://www.w3.org/2001/XMLSchema#float" => NUMERIC_PARSER,
        "http://www.w3.org/2001/XMLSchema#gDay" =>
          create_date_parser("http://www.w3.org/2001/XMLSchema#gDay", :invalid_gDay),
        "http://www.w3.org/2001/XMLSchema#gMonth" =>
          create_date_parser("http://www.w3.org/2001/XMLSchema#gMonth", :invalid_gMonth),
        "http://www.w3.org/2001/XMLSchema#gMonthDay" =>
          create_date_parser("http://www.w3.org/2001/XMLSchema#gMonthDay", :invalid_gMonthDay),
        "http://www.w3.org/2001/XMLSchema#gYear" =>
          create_date_parser("http://www.w3.org/2001/XMLSchema#gYear", :invalid_gYear),
        "http://www.w3.org/2001/XMLSchema#gYearMonth" =>
          create_date_parser("http://www.w3.org/2001/XMLSchema#gYearMonth", :invalid_gYearMonth),
        "http://www.w3.org/2001/XMLSchema#hexBinary" => TRIM_VALUE,
        "http://www.w3.org/2001/XMLSchema#QName" => TRIM_VALUE,
        "http://www.w3.org/2001/XMLSchema#string" => ALL_VALUES_VALID,
        "http://www.w3.org/2001/XMLSchema#normalizedString" => TRIM_VALUE,
        "http://www.w3.org/2001/XMLSchema#token" => TRIM_VALUE,
        "http://www.w3.org/2001/XMLSchema#language" => TRIM_VALUE,
        "http://www.w3.org/2001/XMLSchema#Name" => TRIM_VALUE,
        "http://www.w3.org/2001/XMLSchema#NMTOKEN" => TRIM_VALUE,
        "http://www.w3.org/2001/XMLSchema#time" =>
          create_date_parser("http://www.w3.org/2001/XMLSchema#time", :invalid_time)
      }
    end
  end
end


================================================
FILE: lib/csvlint/csvw/date_format.rb
================================================
module Csvlint
  module Csvw
    class DateFormat
      attr_reader :pattern

      def initialize(pattern, datatype = nil)
        @pattern = pattern

        if @pattern.nil?
          @regexp = DEFAULT_REGEXP[datatype]
          @type = datatype
        else
          test_pattern = pattern.clone
          test_pattern.gsub!(/S+/, "")
          FIELDS.keys.sort_by { |f| -f.length }.each do |field|
            test_pattern.gsub!(field, "")
          end
          raise Csvw::DateFormatError, "unrecognised date field symbols in date format" if /[GyYuUrQqMLlwWdDFgEecahHKkjJmsSAzZOvVXx]/.match?(test_pattern)

          @regexp = DATE_PATTERN_REGEXP[@pattern]
          @type = @regexp.nil? ? "http://www.w3.org/2001/XMLSchema#time" : "http://www.w3.org/2001/XMLSchema#date"
          @regexp ||= TIME_PATTERN_REGEXP[@pattern]
          @type = @regexp.nil? ? "http://www.w3.org/2001/XMLSchema#dateTime" : @type
          @regexp ||= DATE_TIME_PATTERN_REGEXP[@pattern]

          if @regexp.nil?
            regexp = @pattern

            @type = "http://www.w3.org/2001/XMLSchema#date" if !(regexp =~ /HH/) && regexp =~ /yyyy/
            @type = "http://www.w3.org/2001/XMLSchema#time" if regexp =~ /HH/ && !(regexp =~ /yyyy/)
            @type = "http://www.w3.org/2001/XMLSchema#dateTime" if regexp =~ /HH/ && regexp =~ /yyyy/

            regexp = regexp.sub("HH", FIELDS["HH"].to_s)
            regexp = regexp.sub("mm", FIELDS["mm"].to_s)
            if /ss\.S+/.match?(@pattern)
              max_fractional_seconds = @pattern.split(".")[-1].length
              regexp = regexp.sub(/ss\.S+$/, "(?<second>#{FIELDS["ss"]}(.[0-9]{1,#{max_fractional_seconds}})?)")
            else
              regexp = regexp.sub("ss", "(?<second>#{FIELDS["ss"]})")
            end

            if /yyyy/.match?(regexp)
              regexp = regexp.sub("yyyy", FIELDS["yyyy"].to_s)
              regexp = regexp.sub("MM", FIELDS["MM"].to_s)
              regexp = regexp.sub("M", FIELDS["M"].to_s)
              regexp = regexp.sub("dd", FIELDS["dd"].to_s)
              regexp = regexp.sub(/d(?=[-T \/.])/, FIELDS["d"].to_s)
            end

            regexp = regexp.sub("XXX", FIELDS["XXX"].to_s)
            regexp = regexp.sub("XX", FIELDS["XX"].to_s)
            regexp = regexp.sub("X", FIELDS["X"].to_s)
            regexp = regexp.sub("xxx", FIELDS["xxx"].to_s)
            regexp = regexp.sub("xx", FIELDS["xx"].to_s)
            regexp = regexp.sub(/x(?!:)/, FIELDS["x"].to_s)

            @regexp = Regexp.new("^#{regexp}$")
          end
        end
      end

      def match(value)
        value&.match?(@regexp) ? true : false
      end

      def parse(value)
        match = @regexp.match(value)
        return nil if match.nil?
        # STDERR.puts(@regexp)
        # STDERR.puts(value)
        # STDERR.puts(match.inspect)
        value = {}
        match.names.each do |field|
          unless match[field].nil?
            case field
            when "timezone"
              tz = match["timezone"]
              tz = "+00:00" if tz == "Z"
              tz += ":00" if tz.length == 3
              tz = "#{tz[0..2]}:#{tz[3..4]}" unless /:/.match?(tz)
              value[:timezone] = tz
            when "second"
              value[:second] = match["second"].to_f
            else
              value[field.to_sym] = match[field].to_i
            end
          end
        end
        case @type
        when "http://www.w3.org/2001/XMLSchema#date"
          begin
            value[:dateTime] = Date.new(match["year"].to_i, match["month"].to_i, match["day"].to_i)
          rescue ArgumentError
            return nil
          end
        when "http://www.w3.org/2001/XMLSchema#dateTime"
          begin
            value[:dateTime] = DateTime.new(match["year"].to_i, match["month"].to_i, match["day"].to_i, match["hour"].to_i, match["minute"].to_i, (match.names.include?("second") ? match["second"].to_f : 0), (match.names.include?("timezone") && match["timezone"]) ? match["timezone"] : "")
          rescue ArgumentError
            return nil
          end
        else
          value[:dateTime] = DateTime.new(value[:year] || 0, value[:month] || 1, value[:day] || 1, value[:hour] || 0, value[:minute] || 0, value[:second] || 0, value[:timezone] || "+00:00")
        end
        value[:string] = if value[:year]
          if value[:month]
            if value[:day]
              if value[:hour]
                # dateTime
                "#{format("%04d", value[:year])}-#{format("%02d", value[:month])}-#{format("%02d", value[:day])}T#{format("%02d", value[:hour])}:#{format("%02d", value[:minute] || 0)}:#{format("%02g", value[:second] || 0)}#{value[:timezone] ? value[:timezone].sub("+00:00", "Z") : ""}"
              else
                # date
                "#{format("%04d", value[:year])}-#{format("%02d", value[:month])}-#{format("%02d", value[:day])}#{value[:timezone] ? value[:timezone].sub("+00:00", "Z") : ""}"
              end
            else
              # gYearMonth
              "#{format("%04d", value[:year])}-#{format("%02d", value[:month])}#{value[:timezone] ? value[:timezone].sub("+00:00", "Z") : ""}"
            end
          else
            # gYear
            "#{format("%04d", value[:year])}#{value[:timezone] ? value[:timezone].sub("+00:00", "Z") : ""}"
          end
        elsif value[:month]
          if value[:day]
            # gMonthDay
            "--#{format("%02d", value[:month])}-#{format("%02d", value[:day])}#{value[:timezone] ? value[:timezone].sub("+00:00", "Z") : ""}"
          else
            # gMonth
            "--#{format("%02d", value[:month])}#{value[:timezone] ? value[:timezone].sub("+00:00", "Z") : ""}"
          end
        elsif value[:day]
          # gDay
          "---#{format("%02d", value[:day])}#{value[:timezone] ? value[:timezone].sub("+00:00", "Z") : ""}"
        else
          "#{format("%02d", value[:hour])}:#{format("%02d", value[:minute])}:#{format("%02g", value[:second] || 0)}#{value[:timezone] ? value[:timezone].sub("+00:00", "Z") : ""}"
        end
        value
      end

      private

      FIELDS = {
        "yyyy" => /(?<year>-?([1-9][0-9]{3,}|0[0-9]{3}))/,
        "MM" => /(?<month>0[1-9]|1[0-2])/,
        "M" => /(?<month>[1-9]|1[0-2])/,
        "dd" => /(?<day>0[1-9]|[12][0-9]|3[01])/,
        "d" => /(?<day>[1-9]|[12][0-9]|3[01])/,
        "HH" => /(?<hour>[01][0-9]|2[0-3])/,
        "mm" => /(?<minute>[0-5][0-9])/,
        "ss" => /([0-6][0-9])/,
        "X" => /(?<timezone>Z|[-+]((0[0-9]|1[0-3])([0-5][0-9])?|14(00)?))/,
        "XX" => /(?<timezone>Z|[-+]((0[0-9]|1[0-3])[0-5][0-9]|1400))/,
        "XXX" => /(?<timezone>Z|[-+]((0[0-9]|1[0-3]):[0-5][0-9]|14:00))/,
        "x" => /(?<timezone>[-+]((0[0-9]|1[0-3])([0-5][0-9])?|14(00)?))/,
        "xx" => /(?<timezone>[-+]((0[0-9]|1[0-3])[0-5][0-9]|1400))/,
        "xxx" => /(?<timezone>[-+]((0[0-9]|1[0-3]):[0-5][0-9]|14:00))/
      }

      DATE_PATTERN_REGEXP = {
        "yyyy-MM-dd" => Regexp.new("^#{FIELDS["yyyy"]}-#{FIELDS["MM"]}-#{FIELDS["dd"]}$"),
        "yyyyMMdd" => Regexp.new("^#{FIELDS["yyyy"]}#{FIELDS["MM"]}#{FIELDS["dd"]}$"),
        "dd-MM-yyyy" => Regexp.new("^#{FIELDS["dd"]}-#{FIELDS["MM"]}-#{FIELDS["yyyy"]}$"),
        "d-M-yyyy" => Regexp.new("^#{FIELDS["d"]}-#{FIELDS["M"]}-#{FIELDS["yyyy"]}$"),
        "MM-dd-yyyy" => Regexp.new("^#{FIELDS["MM"]}-#{FIELDS["dd"]}-#{FIELDS["yyyy"]}$"),
        "M-d-yyyy" => Regexp.new("^#{FIELDS["M"]}-#{FIELDS["d"]}-#{FIELDS["yyyy"]}$"),
        "dd/MM/yyyy" => Regexp.new("^#{FIELDS["dd"]}/#{FIELDS["MM"]}/#{FIELDS["yyyy"]}$"),
        "d/M/yyyy" => Regexp.new("^#{FIELDS["d"]}/#{FIELDS["M"]}/#{FIELDS["yyyy"]}$"),
        "MM/dd/yyyy" => Regexp.new("^#{FIELDS["MM"]}/#{FIELDS["dd"]}/#{FIELDS["yyyy"]}$"),
        "M/d/yyyy" => Regexp.new("^#{FIELDS["M"]}/#{FIELDS["d"]}/#{FIELDS["yyyy"]}$"),
        "dd.MM.yyyy" => Regexp.new("^#{FIELDS["dd"]}.#{FIELDS["MM"]}.#{FIELDS["yyyy"]}$"),
        "d.M.yyyy" => Regexp.new("^#{FIELDS["d"]}.#{FIELDS["M"]}.#{FIELDS["yyyy"]}$"),
        "MM.dd.yyyy" => Regexp.new("^#{FIELDS["MM"]}.#{FIELDS["dd"]}.#{FIELDS["yyyy"]}$"),
        "M.d.yyyy" => Regexp.new("^#{FIELDS["M"]}.#{FIELDS["d"]}.#{FIELDS["yyyy"]}$")
      }

      TIME_PATTERN_REGEXP = {
        "HH:mm:ss" => Regexp.new("^#{FIELDS["HH"]}:#{FIELDS["mm"]}:(?<second>#{FIELDS["ss"]})$"),
        "HHmmss" => Regexp.new("^#{FIELDS["HH"]}#{FIELDS["mm"]}(?<second>#{FIELDS["ss"]})$"),
        "HH:mm" => Regexp.new("^#{FIELDS["HH"]}:#{FIELDS["mm"]}$"),
        "HHmm" => Regexp.new("^#{FIELDS["HH"]}#{FIELDS["mm"]}$")
      }

      DATE_TIME_PATTERN_REGEXP = {
        "yyyy-MM-ddTHH:mm:ss" => Regexp.new("^#{FIELDS["yyyy"]}-#{FIELDS["MM"]}-#{FIELDS["dd"]}T#{FIELDS["HH"]}:#{FIELDS["mm"]}:(?<second>#{FIELDS["ss"]})$"),
        "yyyy-MM-ddTHH:mm" => Regexp.new("^#{FIELDS["yyyy"]}-#{FIELDS["MM"]}-#{FIELDS["dd"]}T#{FIELDS["HH"]}:#{FIELDS["mm"]}$")
      }

      DEFAULT_REGEXP = {
        "http://www.w3.org/2001/XMLSchema#date" =>
          Regexp.new("^#{FIELDS["yyyy"]}-#{FIELDS["MM"]}-#{FIELDS["dd"]}#{FIELDS["XXX"]}?$"),
        "http://www.w3.org/2001/XMLSchema#dateTime" =>
          Regexp.new("^#{FIELDS["yyyy"]}-#{FIELDS["MM"]}-#{FIELDS["dd"]}T#{FIELDS["HH"]}:#{FIELDS["mm"]}:(?<second>#{FIELDS["ss"]}(.[0-9]+)?)#{FIELDS["XXX"]}?$"),
        "http://www.w3.org/2001/XMLSchema#dateTimeStamp" =>
          Regexp.new("^#{FIELDS["yyyy"]}-#{FIELDS["MM"]}-#{FIELDS["dd"]}T#{FIELDS["HH"]}:#{FIELDS["mm"]}:(?<second>#{FIELDS["ss"]}(.[0-9]+)?)#{FIELDS["XXX"]}$"),
        "http://www.w3.org/2001/XMLSchema#gDay" =>
          Regexp.new("^---#{FIELDS["dd"]}#{FIELDS["XXX"]}?$"),
        "http://www.w3.org/2001/XMLSchema#gMonth" =>
          Regexp.new("^--#{FIELDS["MM"]}#{FIELDS["XXX"]}?$"),
        "http://www.w3.org/2001/XMLSchema#gMonthDay" =>
          Regexp.new("^--#{FIELDS["MM"]}-#{FIELDS["dd"]}#{FIELDS["XXX"]}?$"),
        "http://www.w3.org/2001/XMLSchema#gYear" =>
          Regexp.new("^#{FIELDS["yyyy"]}#{FIELDS["XXX"]}?$"),
        "http://www.w3.org/2001/XMLSchema#gYearMonth" =>
          Regexp.new("^#{FIELDS["yyyy"]}-#{FIELDS["MM"]}#{FIELDS["XXX"]}?$"),
        "http://www.w3.org/2001/XMLSchema#time" =>
          Regexp.new("^#{FIELDS["HH"]}:#{FIELDS["mm"]}:(?<second>#{FIELDS["ss"]}(.[0-9]+)?)#{FIELDS["XXX"]}?$")
      }
    end

    class DateFormatError < StandardError
    end
  end
end


================================================
FILE: lib/csvlint/csvw/metadata_error.rb
================================================
module Csvlint
  module Csvw
    class MetadataError < StandardError
      attr_reader :path

      def initialize(path = nil)
        @path = path
      end
    end
  end
end


================================================
FILE: lib/csvlint/csvw/number_format.rb
================================================
module Csvlint
  module Csvw
    class NumberFormat
      attr_reader :integer, :pattern, :prefix, :numeric_part, :suffix, :grouping_separator, :decimal_separator, :primary_grouping_size, :secondary_grouping_size, :fractional_grouping_size

      def initialize(pattern = nil, grouping_separator = nil, decimal_separator = ".", integer = nil)
        @pattern = pattern
        @integer = integer
        if @integer.nil?
          @integer = if @pattern.nil?
            nil
          else
            !@pattern.include?(decimal_separator)
          end
        end
        @grouping_separator = grouping_separator || (@pattern.nil? ? nil : ",")
        @decimal_separator = decimal_separator || "."
        if pattern.nil?
          @regexp = if integer
            INTEGER_REGEXP
          else
            Regexp.new("^(([-+]?[0-9]+(\\.[0-9]+)?([Ee][-+]?[0-9]+)?[%‰]?)|NaN|INF|-INF)$")
          end
        else
          numeric_part_regexp = Regexp.new("(?<numeric_part>[-+]?([0#Ee]|#{Regexp.escape(@grouping_separator)}|#{Regexp.escape(@decimal_separator)})+)")
          number_format_regexp = Regexp.new("^(?<prefix>.*?)#{numeric_part_regexp}(?<suffix>.*?)$")
          match = number_format_regexp.match(pattern)
          raise Csvw::NumberFormatError, "invalid number format" if match.nil?

          @prefix = match["prefix"]
          @numeric_part = match["numeric_part"]
          @suffix = match["suffix"]

          parts = @numeric_part.split("E")
          mantissa_part = parts[0]
          exponent_part = parts[1] || ""
          mantissa_parts = mantissa_part.split(@decimal_separator)
          # raise Csvw::NumberFormatError, "more than two decimal separators in number format" if parts.length > 2
          integer_part = mantissa_parts[0]
          fractional_part = mantissa_parts[1] || ""

          if ["+", "-"].include?(integer_part[0])
            numeric_part_regexp = "\\#{integer_part[0]}"
            integer_part = integer_part[1..-1]
          else
            numeric_part_regexp = "[-+]?"
          end

          min_integer_digits = integer_part.gsub(@grouping_separator, "").delete("#").length
          min_fraction_digits = fractional_part.gsub(@grouping_separator, "").delete("#").length
          max_fraction_digits = fractional_part.gsub(@grouping_separator, "").length
          min_exponent_digits = exponent_part.delete("#").length
          max_exponent_digits = exponent_part.length

          integer_parts = integer_part.split(@grouping_separator)[1..-1]
          @primary_grouping_size = begin
            integer_parts[-1].length
          rescue
            0
          end
          @secondary_grouping_size = begin
            integer_parts[-2].length
          rescue
            @primary_grouping_size
          end

          fractional_parts = fractional_part.split(@grouping_separator)[0..-2]
          @fractional_grouping_size = begin
            fractional_parts[0].length
          rescue
            0
          end

          if @primary_grouping_size == 0
            integer_regexp = "[0-9]*[0-9]{#{min_integer_digits}}"
          else
            leading_regexp = "([0-9]{0,#{@secondary_grouping_size - 1}}#{Regexp.escape(@grouping_separator)})?"
            secondary_groups = "([0-9]{#{@secondary_grouping_size}}#{Regexp.escape(@grouping_separator)})*"
            if min_integer_digits > @primary_grouping_size
              remaining_req_digits = min_integer_digits - @primary_grouping_size
              req_secondary_groups = (remaining_req_digits / @secondary_grouping_size > 0) ? "([0-9]{#{@secondary_grouping_size}}#{Regexp.escape(@grouping_separator)}){#{remaining_req_digits / @secondary_grouping_size}}" : ""
              if remaining_req_digits % @secondary_grouping_size > 0
                final_req_digits = "[0-9]{#{@secondary_grouping_size - (remaining_req_digits % @secondary_grouping_size)}}"
                final_opt_digits = "[0-9]{0,#{@secondary_grouping_size - (remaining_req_digits % @secondary_grouping_size)}}"
                integer_regexp = "((#{leading_regexp}#{secondary_groups}#{final_req_digits})|#{final_opt_digits})[0-9]{#{remaining_req_digits % @secondary_grouping_size}}#{Regexp.escape(@grouping_separator)}#{req_secondary_groups}[0-9]{#{@primary_grouping_size}}"
              else
                integer_regexp = "(#{leading_regexp}#{secondary_groups})?#{req_secondary_groups}[0-9]{#{@primary_grouping_size}}"
              end
            else
              final_req_digits = (@primary_grouping_size > min_integer_digits) ? "[0-9]{#{@primary_grouping_size - min_integer_digits}}" : ""
              final_opt_digits = (@primary_grouping_size > min_integer_digits) ? "[0-9]{0,#{@primary_grouping_size - min_integer_digits}}" : ""
              integer_regexp = "((#{leading_regexp}#{secondary_groups}#{final_req_digits})|#{final_opt_digits})[0-9]{#{min_integer_digits}}"
            end
          end

          numeric_part_regexp += integer_regexp

          if max_fraction_digits > 0
            if @fractional_grouping_size == 0
              fractional_regexp = ""
              fractional_regexp += "[0-9]{#{min_fraction_digits}}" if min_fraction_digits > 0
              fractional_regexp += "[0-9]{0,#{max_fraction_digits - min_fraction_digits}}" unless min_fraction_digits == max_fraction_digits
              fractional_regexp = "#{Regexp.escape(@decimal_separator)}#{fractional_regexp}"
              fractional_regexp = "(#{fractional_regexp})?" if min_fraction_digits == 0
              numeric_part_regexp += fractional_regexp
            else
              fractional_regexp = ""

              if min_fraction_digits > 0
                if min_fraction_digits >= @fractional_grouping_size
                  # first group of required digits - something like "[0-9]{3}"
                  fractional_regexp += "[0-9]{#{@fractional_grouping_size}}"
                  # additional groups of required digits - something like "(,[0-9]{3}){1}"
                  fractional_regexp += "(#{Regexp.escape(@grouping_separator)}[0-9]{#{@fractional_grouping_size}}){#{min_fraction_digits / @fractional_grouping_size - 1}}" if min_fraction_digits / @fractional_grouping_size > 1
                  fractional_regexp += Regexp.escape(@grouping_separator).to_s if min_fraction_digits % @fractional_grouping_size > 0
                end
                # additional required digits - something like ",[0-9]{1}"
                fractional_regexp += "[0-9]{#{min_fraction_digits % @fractional_grouping_size}}" if min_fraction_digits % @fractional_grouping_size > 0

                opt_fractional_digits = max_fraction_digits - min_fraction_digits
                if opt_fractional_digits > 0
                  fractional_regexp += "("

                  if min_fraction_digits % @fractional_grouping_size > 0
                    # optional fractional digits to complete the group
                    fractional_regexp += "[0-9]{0,#{[opt_fractional_digits, @fractional_grouping_size - (min_fraction_digits % @fractional_grouping_size)].min}}"
                    fractional_regexp += "|"
                    fractional_regexp += "[0-9]{#{[opt_fractional_digits, @fractional_grouping_size - (min_fraction_digits % @fractional_grouping_size)].min}}"
                  else
                    fractional_regexp += "(#{Regexp.escape(@grouping_separator)}[0-9]{1,#{@fractional_grouping_size}})?"
                    fractional_regexp += "|"
                    fractional_regexp += "#{Regexp.escape(@grouping_separator)}[0-9]{#{@fractional_grouping_size}}"
                  end

                  remaining_opt_fractional_digits = opt_fractional_digits - (@fractional_grouping_size - (min_fraction_digits % @fractional_grouping_size))
                  if remaining_opt_fractional_digits > 0
                    if remaining_opt_fractional_digits % @fractional_grouping_size > 0
                      # optional fraction digits in groups
                      fractional_regexp += "(#{Regexp.escape(@grouping_separator)}[0-9]{#{@fractional_grouping_size}}){0,#{remaining_opt_fractional_digits / @fractional_grouping_size}}" if remaining_opt_fractional_digits > @fractional_grouping_size
                      # remaining optional fraction digits
                      fractional_regexp += "(#{Regexp.escape(@grouping_separator)}[0-9]{1,#{remaining_opt_fractional_digits % @fractional_grouping_size}})?"
                    else
                      # optional fraction digits in groups
                      fractional_regexp += "(#{Regexp.escape(@grouping_separator)}[0-9]{#{@fractional_grouping_size}}){0,#{(remaining_opt_fractional_digits / @fractional_grouping_size) - 1}}" if remaining_opt_fractional_digits > @fractional_grouping_size
                      # remaining optional fraction digits
                      fractional_regexp += "(#{Regexp.escape(@grouping_separator)}[0-9]{1,#{@fractional_grouping_size}})?"
                    end

                    # optional fraction digits in groups
                    fractional_regexp += "(#{Regexp.escape(@grouping_separator)}[0-9]{#{@fractional_grouping_size}}){0,#{(remaining_opt_fractional_digits / @fractional_grouping_size) - 1}}" if remaining_opt_fractional_digits > @fractional_grouping_size
                    # remaining optional fraction digits
                    fractional_regexp += "(#{Regexp.escape(@grouping_separator)}[0-9]{1,#{remaining_opt_fractional_digits % @fractional_grouping_size}})?" if remaining_opt_fractional_digits % @fractional_grouping_size > 0
                  end
                  fractional_regexp += ")"
                end
              elsif max_fraction_digits % @fractional_grouping_size > 0
                # optional fractional digits in groups
                fractional_regexp += "([0-9]{#{@fractional_grouping_size}}#{Regexp.escape(@grouping_separator)}){0,#{max_fraction_digits / @fractional_grouping_size}}"
                # remaining optional fraction digits
                fractional_regexp += "(#{Regexp.escape(@grouping_separator)}[0-9]{1,#{max_fraction_digits % @fractional_grouping_size}})?" if max_fraction_digits % @fractional_grouping_size > 0
              else
                fractional_regexp += "([0-9]{#{@fractional_grouping_size}}#{Regexp.escape(@grouping_separator)}){0,#{(max_fraction_digits / @fractional_grouping_size) - 1}}" if max_fraction_digits > @fractional_grouping_size
                fractional_regexp += "[0-9]{1,#{@fractional_grouping_size}}"
              end
              fractional_regexp = "#{Regexp.escape(@decimal_separator)}#{fractional_regexp}"
              fractional_regexp = "(#{fractional_regexp})?" if min_fraction_digits == 0
              numeric_part_regexp += fractional_regexp
            end
          end

          if max_exponent_digits > 0
            numeric_part_regexp += "E"
            numeric_part_regexp += "[0-9]{0,#{max_exponent_digits - min_exponent_digits}}" unless max_exponent_digits == min_exponent_digits
            numeric_part_regexp += "[0-9]{#{min_exponent_digits}}" unless min_exponent_digits == 0
          end

          @regexp = Regexp.new("^(?<prefix>#{Regexp.escape(@prefix)})(?<numeric_part>#{numeric_part_regexp})(?<suffix>#{suffix})$")
        end
      end

      def match(value)
        value&.match?(@regexp) ? true : false
      end

      def parse(value)
        if @pattern.nil?
          return nil if !@grouping_separator.nil? && value =~ Regexp.new("((^#{Regexp.escape(@grouping_separator)})|#{Regexp.escape(@grouping_separator)}{2})")
          value.gsub!(@grouping_separator, "") unless @grouping_separator.nil?
          value.gsub!(@decimal_separator, ".") unless @decimal_separator.nil?
          if value&.match?(@regexp)
            case value
            when "NaN"
              Float::NAN
            when "INF"
              Float::INFINITY
            when "-INF"
              -Float::INFINITY
            else
              case value[-1]
              when "%"
                value.to_f / 100
              when "‰"
                value.to_f / 1000
              else
                if @integer.nil?
                  value.include?(".") ? value.to_f : value.to_i
                else
                  @integer ? value.to_i : value.to_f
                end
              end
            end
          end
        else
          match = @regexp.match(value)
          return nil if match.nil?
          number = match["numeric_part"]
          number.gsub!(@grouping_separator, "") unless @grouping_separator.nil?
          number.gsub!(@decimal_separator, ".") unless @decimal_separator.nil?
          number = @integer ? number.to_i : number.to_f
          number = number.to_f / 100 if match["prefix"].include?("%") || match["suffix"].include?("%")
          number = number.to_f / 1000 if match["prefix"].include?("‰") || match["suffix"].include?("‰")
          number
        end
      end

      private

      INTEGER_REGEXP = /^[-+]?[0-9]+[%‰]?$/
    end

    class NumberFormatError < StandardError
    end
  end
end


================================================
FILE: lib/csvlint/csvw/property_checker.rb
================================================
module Csvlint
  module Csvw
    class PropertyChecker
      class << self
        def check_property(property, value, base_url, lang)
          if PROPERTIES.include? property
            PROPERTIES[property].call(value, base_url, lang)
          elsif property =~ /^([a-z]+):/ && NAMESPACES.include?(property.split(":")[0])
            value, warnings = check_common_property_value(value, base_url, lang)
            [value, warnings, :annotation]
          else
            # property name must be an absolute URI
            begin
              return value, :invalid_property, nil if URI(property).scheme.nil?
              value, warnings = check_common_property_value(value, base_url, lang)
              [value, warnings, :annotation]
            rescue
              [value, :invalid_property, nil]
            end
          end
        end

        private

        def check_common_property_value(value, base_url, lang)
          case value
          when Hash
            value = value.clone
            warnings = []
            value.each do |p, v|
              case p
              when "@context"
                raise Csvlint::Csvw::MetadataError.new(p), "common property has @context property"
              when "@list"
                raise Csvlint::Csvw::MetadataError.new(p), "common property has @list property"
              when "@set"
                raise Csvlint::Csvw::MetadataError.new(p), "common property has @set property"
              when "@type"
                if value["@value"] && BUILT_IN_DATATYPES.include?(v)
                elsif !value["@value"] && BUILT_IN_TYPES.include?(v)
                elsif ((v.is_a? String) && (v =~ /^([a-z]+):/)) && NAMESPACES.include?(v.split(":")[0])
                else
                  # must be an absolute URI
                  begin
                    raise Csvlint::Csvw::MetadataError.new, "common property has invalid @type (#{v})" if URI(v).scheme.nil?
                  rescue
                    raise Csvlint::Csvw::MetadataError.new, "common property has invalid @type (#{v})"
                  end
                end
              when "@id"
                unless base_url.nil?
                  begin
                    v = URI.join(base_url, v)
                  rescue
                    raise Csvlint::Csvw::MetadataError.new, "common property has invalid @id (#{v})"
                  end
                end
              when "@value"
                raise Csvlint::Csvw::MetadataError.new, "common property with @value has both @language and @type" if value["@type"] && value["@language"]
                raise Csvlint::Csvw::MetadataError.new, "common property with @value has properties other than @language or @type" unless value.except("@type").except("@language").except("@value").empty?
              when "@language"
                raise Csvlint::Csvw::MetadataError.new, "common property with @language lacks a @value" unless value["@value"]
                raise Csvlint::Csvw::MetadataError.new, "common property has invalid @language (#{v})" if !((v.is_a? String) && (v =~ BCP47_LANGUAGE_REGEXP)) || !v.nil?
              else
                if p[0] == "@"
                  raise Csvlint::Csvw::MetadataError.new, "common property has property other than @id, @type, @value or @language beginning with @ (#{p})"
                else
                  v, w = check_common_property_value(v, base_url, lang)
                  warnings += Array(w)
                end
              end
              value[p] = v
            end
            [value, warnings]
          when String
            if lang == "und"
              [value, nil]
            else
              [{"@value" => value, "@language" => lang}, nil]
            end
          when Array
            values = []
            warnings = []
            value.each do |v|
              v, w = check_common_property_value(v, base_url, lang)
              warnings += Array(w)
              values << v
            end
            [values, warnings]
          else
            [value, nil]
          end
        end

        def convert_value_facet(value, property, datatype)
          if value[property]
            if DATE_FORMAT_DATATYPES.include?(datatype)
              format = Csvlint::Csvw::DateFormat.new(nil, datatype)
              v = format.parse(value[property])
              if v.nil?
                value.delete(property)
                return [:":invalid_#{property}"]
              else
                value[property] = v
                return []
              end
            elsif NUMERIC_FORMAT_DATATYPES.include?(datatype)
              return []
            else
              raise Csvlint::Csvw::MetadataError.new("datatype.#{property}"), "#{property} is only allowed for numeric, date/time and duration types"
            end
          end
          []
        end

        def array_property(type)
          lambda { |value, base_url, lang|
            return value, nil, type if value.instance_of? Array
            [false, :invalid_value, type]
          }
        end

        def boolean_property(type)
          lambda { |value, base_url, lang|
            return value, nil, type if value == true || value == false
            [false, :invalid_value, type]
          }
        end

        def string_property(type)
          lambda { |value, base_url, lang|
            return value, nil, type if value.instance_of? String
            ["", :invalid_value, type]
          }
        end

        def uri_template_property(type)
          lambda { |value, base_url, lang|
            return URITemplate.new(value), nil, type if value.instance_of? String
            [URITemplate.new(""), :invalid_value, type]
          }
        end

        def numeric_property(type)
          lambda { |value, base_url, lang|
            return value, nil, type if value.is_a?(Integer) && value >= 0
            [nil, :invalid_value, type]
          }
        end

        def link_property(type)
          lambda { |value, base_url, lang|
            raise Csvlint::Csvw::MetadataError.new, "URL #{value} starts with _:" if /^_:/.match?(value.to_s)
            return (base_url.nil? ? URI(value) : URI.join(base_url, value)), nil, type if value.instance_of? String
            [base_url, :invalid_value, type]
          }
        end

        def language_property(type)
          lambda { |value, base_url, lang|
            return value, nil, type if BCP47_REGEXP.match?(value)
            [nil, :invalid_value, type]
          }
        end

        def natural_language_property(type)
          lambda { |value, base_url, lang|
            warnings = []
            if value.instance_of? String
              [{lang => [value]}, nil, type]
            elsif value.instance_of? Array
              valid_titles = []
              value.each do |title|
                if title.instance_of? String
                  valid_titles << title
                else
                  warnings << :invalid_value
                end
              end
              [{lang => valid_titles}, warnings, type]
            elsif value.instance_of? Hash
              value = value.clone
              value.each do |l, v|
                if BCP47_REGEXP.match?(l)
                  valid_titles = []
                  Array(v).each do |title|
                    if title.instance_of? String
                      valid_titles << title
                    else
                      warnings << :invalid_value
                    end
                  end
                  value[l] = valid_titles
                else
                  value.delete(l)
                  warnings << :invalid_language
                end
              end
              warnings << :invalid_value if value.empty?
              [value, warnings, type]
            else
              [{}, :invalid_value, type]
            end
          }
        end

        def column_reference_property(type)
          lambda { |value, base_url, lang|
            [Array(value), nil, type]
          }
        end
      end

      PROPERTIES = {
        # context properties
        "@language" => language_property(:context),
        "@base" => link_property(:context),
        # common properties
        "@id" => link_property(:common),
        "notes" => lambda { |value, base_url, lang|
          return false, :invalid_value, :common unless value.instance_of? Array
          values = []
          warnings = []
          value.each do |v|
            v, w = check_common_property_value(v, base_url, lang)
            values << v
            warnings += w
          end
          [values, warnings, :common]
        },
        "suppressOutput" => boolean_property(:common),
        "dialect" => lambda { |value, base_url, lang|
          if value.instance_of? Hash
            value = value.clone
            warnings = []
            value.each do |p, v|
              if p == "@id"
                raise Csvlint::Csvw::MetadataError.new("dialect.@id"), "@id starts with _:" if /^_:/.match?(v)
              elsif p == "@type"
                raise Csvlint::Csvw::MetadataError.new("dialect.@type"), "@type of dialect is not 'Dialect'" if v != "Dialect"
              else
                v, warning, type = check_property(p, v, base_url, lang)
                if type == :dialect && (warning.nil? || warning.empty?)
                  value[p] = v
                else
                  value.delete(p)
                  warnings << :invalid_property unless type == :dialect
                  warnings += Array(warning)
                end
              end
            end
            [value, warnings, :common]
          else
            [{}, :invalid_value, :common]
          end
        },
        # inherited properties
        "null" => lambda { |value, base_url, lang|
          case value
          when String
            [[value], nil, :inherited]
          when Array
            values = []
            warnings = []
            value.each do |v|
              if v.instance_of? String
                values << v
              else
                warnings << :invalid_value
              end
            end
            [values, warnings, :inherited]
          else
            [[""], :invalid_value, :inherited]
          end
        },
        "default" => string_property(:inherited),
        "separator" => lambda { |value, base_url, lang|
          return value, nil, :inherited if value.instance_of?(String) || value.nil?
          [nil, :invalid_value, :inherited]
        },
        "lang" => language_property(:inherited),
        "datatype" => lambda { |value, base_url, lang|
          value = value.clone
          warnings = []
          if value.instance_of? Hash
            if value["@id"]
              raise Csvlint::Csvw::MetadataError.new("datatype.@id"), "datatype @id must not be the id of a built-in datatype (#{value["@id"]})" if BUILT_IN_DATATYPES.value?(value["@id"])
              _, w, _ = PROPERTIES["@id"].call(value["@id"], base_url, lang)
              unless w.nil?
                warnings << w
                value.delete("@id")
              end
            end

            if value["base"]
              if BUILT_IN_DATATYPES.include? value["base"]
                value["base"] = BUILT_IN_DATATYPES[value["base"]]
              else
                value["base"] = BUILT_IN_DATATYPES["string"]
                warnings << :invalid_datatype_base
              end
            else
              value["base"] = BUILT_IN_DATATYPES["string"]
            end
          elsif BUILT_IN_DATATYPES.include? value
            value = {"@id" => BUILT_IN_DATATYPES[value]}
          else
            value = {"@id" => BUILT_IN_DATATYPES["string"]}
            warnings << :invalid_value
          end

          unless STRING_DATATYPES.include?(value["base"]) || BINARY_DATATYPES.include?(value["base"])
            raise Csvlint::Csvw::MetadataError.new("datatype.length"), "datatypes based on #{value["base"]} cannot have a length facet" if value["length"]
            raise Csvlint::Csvw::MetadataError.new("datatype.minLength"), "datatypes based on #{value["base"]} cannot have a minLength facet" if value["minLength"]
            raise Csvlint::Csvw::MetadataError.new("datatype.maxLength"), "datatypes based on #{value["base"]} cannot have a maxLength facet" if value["maxLength"]
          end

          if value["minimum"]
            value["minInclusive"] = value["minimum"]
            value.delete("minimum")
          end
          if value["maximum"]
            value["maxInclusive"] = value["maximum"]
            value.delete("maximum")
          end

          warnings += convert_value_facet(value, "minInclusive", value["base"])
          warnings += convert_value_facet(value, "minExclusive", value["base"])
          warnings += convert_value_facet(value, "maxInclusive", value["base"])
          warnings += convert_value_facet(value, "maxExclusive", value["base"])

          minInclusive = value["minInclusive"].is_a?(Hash) ? value["minInclusive"][:dateTime] : value["minInclusive"]
          maxInclusive = value["maxInclusive"].is_a?(Hash) ? value["maxInclusive"][:dateTime] : value["maxInclusive"]
          minExclusive = value["minExclusive"].is_a?(Hash) ? value["minExclusive"][:dateTime] : value["minExclusive"]
          maxExclusive = value["maxExclusive"].is_a?(Hash) ? value["maxExclusive"][:dateTime] : value["maxExclusive"]

          raise Csvlint::Csvw::MetadataError.new(""), "datatype cannot specify both minimum/minInclusive (#{minInclusive}) and minExclusive (#{minExclusive}" if minInclusive && minExclusive
          raise Csvlint::Csvw::MetadataError.new(""), "datatype cannot specify both maximum/maxInclusive (#{maxInclusive}) and maxExclusive (#{maxExclusive}" if maxInclusive && maxExclusive
          raise Csvlint::Csvw::MetadataError.new(""), "datatype minInclusive (#{minInclusive}) cannot be more than maxInclusive (#{maxInclusive}" if minInclusive && maxInclusive && minInclusive > maxInclusive
          raise Csvlint::Csvw::MetadataError.new(""), "datatype minInclusive (#{minInclusive}) cannot be more than or equal to maxExclusive (#{maxExclusive}" if minInclusive && maxExclusive && minInclusive >= maxExclusive
          raise Csvlint::Csvw::MetadataError.new(""), "datatype minExclusive (#{minExclusive}) cannot be more than or equal to maxExclusive (#{maxExclusive}" if minExclusive && maxExclusive && minExclusive > maxExclusive
          raise Csvlint::Csvw::MetadataError.new(""), "datatype minExclusive (#{minExclusive}) cannot be more than maxInclusive (#{maxInclusive}" if minExclusive && maxInclusive && minExclusive >= maxInclusive

          raise Csvlint::Csvw::MetadataError.new(""), "datatype length (#{value["length"]}) cannot be less than minLength (#{value["minLength"]}" if value["length"] && value["minLength"] && value["length"] < value["minLength"]
          raise Csvlint::Csvw::MetadataError.new(""), "datatype length (#{value["length"]}) cannot be more than maxLength (#{value["maxLength"]}" if value["length"] && value["maxLength"] && value["length"] > value["maxLength"]
          raise Csvlint::Csvw::MetadataError.new(""), "datatype minLength (#{value["minLength"]}) cannot be more than maxLength (#{value["maxLength"]}" if value["minLength"] && value["maxLength"] && value["minLength"] > value["maxLength"]

          if value["format"]
            if REGEXP_FORMAT_DATATYPES.include?(value["base"])
              begin
                value["format"] = Regexp.new(value["format"])
              rescue RegexpError
                value.delete("format")
                warnings << :invalid_regex
              end
            elsif NUMERIC_FORMAT_DATATYPES.include?(value["base"])
              value["format"] = {"pattern" => value["format"]} if value["format"].instance_of? String
              begin
                value["format"] = Csvlint::Csvw::NumberFormat.new(value["format"]["pattern"], value["format"]["groupChar"], value["format"]["decimalChar"] || ".", INTEGER_FORMAT_DATATYPES.include?(value["base"]))
              rescue Csvlint::Csvw::NumberFormatError
                value["format"] = Csvlint::Csvw::NumberFormat.new(nil, value["format"]["groupChar"], value["format"]["decimalChar"] || ".", INTEGER_FORMAT_DATATYPES.include?(value["base"]))
                warnings << :invalid_number_format
              end
            elsif value["base"] == "http://www.w3.org/2001/XMLSchema#boolean"
              if value["format"].instance_of? String
                value["format"] = value["format"].split("|")
                unless value["format"].length == 2
                  value.delete("format")
                  warnings << :invalid_boolean_format
                end
              else
                value.delete("format")
                warnings << :invalid_boolean_format
              end
            elsif DATE_FORMAT_DATATYPES.include?(value["base"])
              if value["format"].instance_of? String
                begin
                  value["format"] = Csvlint::Csvw::DateFormat.new(value["format"])
                rescue Csvlint::CsvDateFormatError
                  value.delete("format")
                  warnings << :invalid_date_format
                end
              else
                value.delete("format")
                warnings << :invalid_date_format
              end
            end
          end
          [value, warnings, :inherited]
        },
        "required" => boolean_property(:inherited),
        "ordered" => boolean_property(:inherited),
        "aboutUrl" => uri_template_property(:inherited),
        "propertyUrl" => uri_template_property(:inherited),
        "valueUrl" => uri_template_property(:inherited),
        "textDirection" => lambda { |value, base_url, lang|
          value = value.to_sym
          return value, nil, :inherited if [:ltr, :rtl, :auto, :inherit].include? value
          [:inherit, :invalid_value, :inherited]
        },
        # column level properties
        "virtual" => boolean_property(:column),
        "titles" => natural_language_property(:column),
        "name" => lambda { |value, base_url, lang|
          return value, nil, :column if value.instance_of?(String) && value =~ NAME_REGEXP
          [nil, :invalid_value, :column]
        },
        # table level properties
        "transformations" => lambda { |value, base_url, lang|
          transformations = []
          warnings = []
          if value.instance_of? Array
            value.each_with_index do |transformation, i|
              if transformation.instance_of? Hash
                transformation = transformation.clone
                transformation.each do |p, v|
                  if p == "@id"
                    raise Csvlint::Csvw::MetadataError.new("transformations[#{i}].@id"), "@id starts with _:" if /^_:/.match?(v)
                  elsif p == "@type"
                    raise Csvlint::Csvw::MetadataError.new("transformations[#{i}].@type"), "@type of transformation is not 'Template'" if v != "Template"
                  elsif p == "url"
                  elsif p == "titles"
                  else
                    _, warning, type = check_property(p, v, base_url, lang)
                    if type != :transformation && !(warning.nil? || warning.empty?)
                      value.delete(p)
                      warnings << :invalid_property unless type == :transformation
                      warnings += Array(warning)
                    end
                  end
                end
                transformations << transformation
              else

Download .txt

gitextract_4wguoljj/

├── .coveralls.yml
├── .gitattributes
├── .github/
│   ├── ISSUE_TEMPLATE.md
│   ├── PULL_REQUEST_TEMPLATE.md
│   ├── dependabot.yml
│   └── workflows/
│       └── push.yml
├── .gitignore
├── .pre-commit-hooks.yaml
├── .ruby-version
├── .standard_todo.yml
├── Appraisals
├── CHANGELOG.md
├── CODE_OF_CONDUCT.md
├── CONTRIBUTING.md
├── Dockerfile
├── Gemfile
├── LICENSE.md
├── README.md
├── Rakefile
├── bin/
│   ├── create_schema
│   └── csvlint
├── csvlint.gemspec
├── docker_notes_for_windows.txt
├── features/
│   ├── check_format.feature
│   ├── cli.feature
│   ├── csv_options.feature
│   ├── csvupload.feature
│   ├── csvw_schema_validation.feature
│   ├── fixtures/
│   │   ├── cr-line-endings.csv
│   │   ├── crlf-line-endings.csv
│   │   ├── inconsistent-line-endings-unquoted.csv
│   │   ├── inconsistent-line-endings.csv
│   │   ├── invalid-byte-sequence.csv
│   │   ├── invalid_many_rows.csv
│   │   ├── lf-line-endings.csv
│   │   ├── spreadsheet.xls
│   │   ├── spreadsheet.xlsx
│   │   ├── title-row.csv
│   │   ├── valid.csv
│   │   ├── valid_many_rows.csv
│   │   ├── w3.org/
│   │   │   └── .well-known/
│   │   │       └── csvm
│   │   ├── white space in filename.csv
│   │   └── windows-line-endings.csv
│   ├── information.feature
│   ├── parse_csv.feature
│   ├── schema_validation.feature
│   ├── sources.feature
│   ├── step_definitions/
│   │   ├── cli_steps.rb
│   │   ├── csv_options_steps.rb
│   │   ├── information_steps.rb
│   │   ├── parse_csv_steps.rb
│   │   ├── schema_validation_steps.rb
│   │   ├── sources_steps.rb
│   │   ├── validation_errors_steps.rb
│   │   ├── validation_info_steps.rb
│   │   └── validation_warnings_steps.rb
│   ├── support/
│   │   ├── aruba.rb
│   │   ├── earl_formatter.rb
│   │   ├── env.rb
│   │   ├── load_tests.rb
│   │   └── webmock.rb
│   ├── validation_errors.feature
│   ├── validation_info.feature
│   └── validation_warnings.feature
├── gemfiles/
│   ├── activesupport_5.2.gemfile
│   ├── activesupport_6.0.gemfile
│   ├── activesupport_6.1.gemfile
│   ├── activesupport_7.0.gemfile
│   ├── activesupport_7.1.gemfile
│   └── activesupport_7.2.gemfile
├── lib/
│   ├── csvlint/
│   │   ├── cli.rb
│   │   ├── csvw/
│   │   │   ├── column.rb
│   │   │   ├── date_format.rb
│   │   │   ├── metadata_error.rb
│   │   │   ├── number_format.rb
│   │   │   ├── property_checker.rb
│   │   │   ├── table.rb
│   │   │   └── table_group.rb
│   │   ├── error_collector.rb
│   │   ├── error_message.rb
│   │   ├── field.rb
│   │   ├── schema.rb
│   │   ├── validate.rb
│   │   └── version.rb
│   └── csvlint.rb
└── spec/
    ├── csvw/
    │   ├── column_spec.rb
    │   ├── date_format_spec.rb
    │   ├── number_format_spec.rb
    │   ├── table_group_spec.rb
    │   └── table_spec.rb
    ├── field_spec.rb
    ├── schema_spec.rb
    ├── spec_helper.rb
    └── validator_spec.rb

Download .txt

SYMBOL INDEX (161 symbols across 18 files)

FILE: features/support/aruba.rb
  type Csvlint (line 6) | module Csvlint
    class CliRunner (line 7) | class CliRunner
      method initialize (line 9) | def initialize(argv, stdin = $stdin, stdout = $stdout, stderr = $std...
      method execute! (line 13) | def execute!

FILE: features/support/earl_formatter.rb
  class EarlFormatter (line 4) | class EarlFormatter
    method initialize (line 5) | def initialize(step_mother, io, options)
    method scenario_name (line 33) | def scenario_name(keyword, name, file_colon_line, source_indent)
    method after_steps (line 37) | def after_steps(steps)
    method after_features (line 55) | def after_features(features)

FILE: features/support/env.rb
  class CustomWorld (line 17) | class CustomWorld
    method default_csv_options (line 18) | def default_csv_options

FILE: features/support/load_tests.rb
  function cache_file (line 13) | def cache_file(filename)

FILE: lib/csvlint/cli.rb
  type Csvlint (line 9) | module Csvlint
    class Cli (line 10) | class Cli < Thor
      method validate (line 18) | def validate(source = nil)
      method help (line 29) | def help
      method read_source (line 37) | def read_source(source)
      method get_schema (line 62) | def get_schema(schema)
      method fetch_schema_tables (line 78) | def fetch_schema_tables(schema, options)
      method print_error (line 99) | def print_error(index, error, dump, color)
      method print_errors (line 123) | def print_errors(errors, dump)
      method return_error (line 129) | def return_error(message)
      method validate_csv (line 134) | def validate_csv(source, schema, dump, json, werror)
      method hashify (line 171) | def hashify(error)
      method report_lines (line 188) | def report_lines

FILE: lib/csvlint/csvw/column.rb
  type Csvlint (line 1) | module Csvlint
    type Csvw (line 2) | module Csvw
      class Column (line 3) | class Column
        method initialize (line 8) | def initialize(number, name, id: nil, about_url: nil, datatype: {"...
        method from_json (line 33) | def self.from_json(number, column_desc, base_url = nil, lang = "un...
        method validate_header (line 77) | def validate_header(header, strict)
        method validate (line 92) | def validate(string_value, row = nil)
        method create_date_parser (line 124) | def create_date_parser(type, warning)
        method create_regexp_based_parser (line 133) | def create_regexp_based_parser(regexp, warning)
        method languages_match (line 140) | def languages_match(l1, l2)
        method validate_required (line 147) | def validate_required(value, row)
        method validate_length (line 155) | def validate_length(value, row)
        method validate_format (line 178) | def validate_format(value, row)
        method validate_value (line 188) | def validate_value(value, row)

FILE: lib/csvlint/csvw/date_format.rb
  type Csvlint (line 1) | module Csvlint
    type Csvw (line 2) | module Csvw
      class DateFormat (line 3) | class DateFormat
        method initialize (line 6) | def initialize(pattern, datatype = nil)
        method match (line 62) | def match(value)
        method parse (line 66) | def parse(value)
      class DateFormatError (line 210) | class DateFormatError < StandardError

FILE: lib/csvlint/csvw/metadata_error.rb
  type Csvlint (line 1) | module Csvlint
    type Csvw (line 2) | module Csvw
      class MetadataError (line 3) | class MetadataError < StandardError
        method initialize (line 6) | def initialize(path = nil)

FILE: lib/csvlint/csvw/number_format.rb
  type Csvlint (line 1) | module Csvlint
    type Csvw (line 2) | module Csvw
      class NumberFormat (line 3) | class NumberFormat
        method initialize (line 6) | def initialize(pattern = nil, grouping_separator = nil, decimal_se...
        method match (line 181) | def match(value)
        method parse (line 185) | def parse(value)
      class NumberFormatError (line 231) | class NumberFormatError < StandardError

FILE: lib/csvlint/csvw/property_checker.rb
  type Csvlint (line 1) | module Csvlint
    type Csvw (line 2) | module Csvw
      class PropertyChecker (line 3) | class PropertyChecker
        method check_property (line 5) | def check_property(property, value, base_url, lang)
        method check_common_property_value (line 25) | def check_common_property_value(value, base_url, lang)
        method convert_value_facet (line 95) | def convert_value_facet(value, property, datatype)
        method array_property (line 116) | def array_property(type)
        method boolean_property (line 123) | def boolean_property(type)
        method string_property (line 130) | def string_property(type)
        method uri_template_property (line 137) | def uri_template_property(type)
        method numeric_property (line 144) | def numeric_property(type)
        method link_property (line 151) | def link_property(type)
        method language_property (line 159) | def language_property(type)
        method natural_language_property (line 166) | def natural_language_property(type)
        method column_reference_property (line 207) | def column_reference_property(type)

FILE: lib/csvlint/csvw/table.rb
  type Csvlint (line 1) | module Csvlint
    type Csvw (line 2) | module Csvw
      class Table (line 3) | class Table
        method initialize (line 8) | def initialize(url, columns: [], dialect: {}, table_direction: :au...
        method validate_header (line 32) | def validate_header(headers, strict)
        method validate_row (line 48) | def validate_row(values, row = nil, validate = false)
        method validate_foreign_keys (line 92) | def validate_foreign_keys
        method validate_foreign_key_references (line 104) | def validate_foreign_key_references(foreign_key, remote_url, remote)
        method from_json (line 119) | def self.from_json(table_desc, base_url = nil, lang = "und", commo...

FILE: lib/csvlint/csvw/table_group.rb
  type Csvlint (line 1) | module Csvlint
    type Csvw (line 2) | module Csvw
      class TableGroup (line 3) | class TableGroup
        method initialize (line 8) | def initialize(url, id: nil, tables: {}, notes: [], annotations: {...
        method validate_header (line 22) | def validate_header(header, table_url, strict)
        method validate_row (line 32) | def validate_row(values, row = nil, all_errors = [], table_url, va...
        method validate_foreign_keys (line 43) | def validate_foreign_keys
        method from_json (line 55) | def self.from_json(url, json)

FILE: lib/csvlint/error_collector.rb
  type Csvlint (line 1) | module Csvlint
    type ErrorCollector (line 2) | module ErrorCollector
      function build_errors (line 5) | def build_errors(type, category = nil, row = nil, column = nil, cont...
      function build_warnings (line 10) | def build_warnings(type, category = nil, row = nil, column = nil, co...
      function build_info_messages (line 15) | def build_info_messages(type, category = nil, row = nil, column = ni...
      function valid? (line 19) | def valid?
      function reset (line 23) | def reset

FILE: lib/csvlint/error_message.rb
  type Csvlint (line 1) | module Csvlint
    class ErrorMessage (line 2) | class ErrorMessage
      method initialize (line 5) | def initialize(type, category, row, column, content, constraints)

FILE: lib/csvlint/field.rb
  type Csvlint (line 1) | module Csvlint
    class Field (line 2) | class Field
      method initialize (line 7) | def initialize(name, constraints = {}, title = nil, description = nil)
      method validate_column (line 17) | def validate_column(value, row = nil, column = nil, all_errors = [])
      method validate_length (line 31) | def validate_length(value, row, column)
      method validate_regex (line 52) | def validate_regex(value, row, column, all_errors)
      method build_regex_error (line 66) | def build_regex_error(value, row, column, pattern, all_errors)
      method validate_values (line 73) | def validate_values(value, row, column)
      method validate_type (line 93) | def validate_type(value, row, column)
      method validate_range (line 107) | def validate_range(value, row, column)
      method convert_to_type (line 130) | def convert_to_type(value)

FILE: lib/csvlint/schema.rb
  type Csvlint (line 1) | module Csvlint
    class Schema (line 2) | class Schema
      method initialize (line 7) | def initialize(uri, fields = [], title = nil, description = nil)
      method from_json_table (line 18) | def from_json_table(uri, json)
      method from_csvw_metadata (line 27) | def from_csvw_metadata(uri, json)
      method load_from_json (line 32) | def load_from_json(uri, output_errors = true)
      method load_from_uri (line 37) | def load_from_uri(uri, output_errors = true)
      method load_from_string (line 43) | def load_from_string(uri, string, output_errors = true)
      method validate_header (line 65) | def validate_header(header, source_url = nil, validate = true)
      method validate_row (line 76) | def validate_row(values, row = nil, all_errors = [], source_url = ni...

FILE: lib/csvlint/validate.rb
  type Csvlint (line 1) | module Csvlint
    class Validator (line 2) | class Validator
      class LineCSV (line 3) | class LineCSV < CSV
        method encode_re (line 24) | def encode_re(*chunks)
        method encode_str (line 30) | def encode_str(*chunks)
        method escape_re (line 36) | def escape_re(str)
        method init_converters (line 43) | def init_converters(options, field_name = :converters)
      method initialize (line 64) | def initialize(source, dialect = {}, schema = nil, options = {})
      method validate (line 91) | def validate
      method validate_stream (line 108) | def validate_stream
      method validate_url (line 117) | def validate_url
      method parse_line (line 144) | def parse_line(line)
      method validate_line (line 166) | def validate_line(input = nil, index = nil)
      method parse_contents (line 179) | def parse_contents(stream, line = nil)
      method finish (line 217) | def finish
      method validate_metadata (line 230) | def validate_metadata
      method header? (line 288) | def header?
      method report_line_breaks (line 292) | def report_line_breaks(line_no = nil)
      method line_breaks_reported? (line 304) | def line_breaks_reported?
      method set_dialect (line 308) | def set_dialect
      method validate_encoding (line 331) | def validate_encoding
      method check_mixed_linebreaks (line 342) | def check_mixed_linebreaks
      method line_breaks (line 346) | def line_breaks
      method row_count (line 354) | def row_count
      method build_exception_messages (line 358) | def build_exception_messages(csvException, errChars, lineNo)
      method build_linebreak_error (line 369) | def build_linebreak_error
      method validate_header (line 373) | def validate_header(header)
      method fetch_error (line 392) | def fetch_error(error)
      method dialect_to_csv_options (line 402) | def dialect_to_csv_options(dialect)
      method build_formats (line 414) | def build_formats(row)
      method check_consistency (line 434) | def check_consistency
      method check_foreign_keys (line 445) | def check_foreign_keys
      method locate_schema (line 453) | def locate_schema
      method parse_extension (line 507) | def parse_extension(source)
      method uri? (line 528) | def uri?(value)
      method possible_date? (line 537) | def possible_date?(col)
      method date_formats (line 541) | def date_formats(col)
      method date_format? (line 567) | def date_format?(klass, value, format)
      method line_limit_reached? (line 573) | def line_limit_reached?
      method get_line_break (line 577) | def get_line_break(line)

FILE: lib/csvlint/version.rb
  type Csvlint (line 1) | module Csvlint

Download .json

Condensed preview — 94 files, each showing path, character count, and a content snippet. Download the .json file or copy for the full structured content (364K chars).

[
  {
    "path": ".coveralls.yml",
    "chars": 23,
    "preview": "service_name: travis-ci"
  },
  {
    "path": ".gitattributes",
    "chars": 43,
    "preview": "# Don't mess with my CSV files\n*.csv binary"
  },
  {
    "path": ".github/ISSUE_TEMPLATE.md",
    "chars": 1003,
    "preview": "> Please provide a general summary of the issue in the Issue Title above\n> fill out the headings below as applicable to "
  },
  {
    "path": ".github/PULL_REQUEST_TEMPLATE.md",
    "chars": 63,
    "preview": "This PR fixes #\n\nChanges proposed in this pull request:\n\n-\n-\n-\n"
  },
  {
    "path": ".github/dependabot.yml",
    "chars": 215,
    "preview": "version: 2\nupdates:\n- package-ecosystem: bundler\n  directory: \"/\"\n  schedule:\n    interval: daily\n  open-pull-requests-l"
  },
  {
    "path": ".github/workflows/push.yml",
    "chars": 1955,
    "preview": "name: CI\non:\n  push:\n    branches: [ main ]\n  pull_request:\n    branches: [ main ]\njobs:\n  appraisal:\n    name: Ruby ${{"
  },
  {
    "path": ".gitignore",
    "chars": 322,
    "preview": "*.gem\n*.rbc\n.bundle\n.config\n.yardoc\nGemfile.lock\nInstalledFiles\n_yardoc\ncoverage\ndoc/\nlib/bundler/man\npkg\nrdoc\nspec/repo"
  },
  {
    "path": ".pre-commit-hooks.yaml",
    "chars": 80,
    "preview": "- id: csvlint\n  name: csvlint\n  entry: csvlint\n  language: ruby\n  files: \\.csv$\n"
  },
  {
    "path": ".ruby-version",
    "chars": 6,
    "preview": "4.0.1\n"
  },
  {
    "path": ".standard_todo.yml",
    "chars": 1032,
    "preview": "# Auto generated files with errors to ignore.\n# Remove from this list as you refactor files.\n---\nignore:\n- features/supp"
  },
  {
    "path": "Appraisals",
    "chars": 533,
    "preview": "# After a new entry: `bundle exec appraisal install`\n# Add an entry in `.github/workflows/push.yml`'s file\n\nappraise \"ac"
  },
  {
    "path": "CHANGELOG.md",
    "chars": 27803,
    "preview": "# Change Log\n\n## [v1.2.0](https://github.com/data-liberation-front/csvlint.rb/tree/v1.2.0) (2023-02-27)\n\n[Full Changelog"
  },
  {
    "path": "CODE_OF_CONDUCT.md",
    "chars": 3235,
    "preview": "## Code of Conduct\n\n### Our Pledge\n\nIn the interest of fostering an open and welcoming environment, we as\ncontributors a"
  },
  {
    "path": "CONTRIBUTING.md",
    "chars": 2894,
    "preview": "# Contributing to CSVlint.rb\n\nThe CSVlint library is open source, and contributions are gratefully accepted!\nDetails on "
  },
  {
    "path": "Dockerfile",
    "chars": 324,
    "preview": "FROM ruby:2.5.8-buster\r\n\r\n# throw errors if Gemfile has been modified since Gemfile.lock\r\nRUN bundle config --global fro"
  },
  {
    "path": "Gemfile",
    "chars": 95,
    "preview": "source \"https://rubygems.org\"\n\n# Specify your gem's dependencies in csvlint.rb.gemspec\ngemspec\n"
  },
  {
    "path": "LICENSE.md",
    "chars": 1082,
    "preview": "##Copyright (c) 2014 The Open Data Institute\n\n#MIT License\n\nPermission is hereby granted, free of charge, to any person "
  },
  {
    "path": "README.md",
    "chars": 13614,
    "preview": "[![Build Status](https://img.shields.io/github/workflow/status/Data-Liberation-Front/csvlint.rb/CI/main)](https://travis"
  },
  {
    "path": "Rakefile",
    "chars": 414,
    "preview": "require \"bundler/gem_tasks\"\n\n$:.unshift File.join(File.dirname(__FILE__), \"lib\")\n\nrequire \"rubygems\"\nrequire \"cucumber\"\n"
  },
  {
    "path": "bin/create_schema",
    "chars": 598,
    "preview": "#!/usr/bin/env ruby\n$:.unshift File.join( File.dirname(__FILE__), \"..\", \"lib\")\n\nrequire 'csvlint'\n\nbegin\n  puts ARGV[0]\n"
  },
  {
    "path": "bin/csvlint",
    "chars": 210,
    "preview": "#!/usr/bin/env ruby\n$:.unshift File.join( File.dirname(__FILE__), \"..\", \"lib\")\n\nrequire 'csvlint/cli'\n\nif ARGV == [\"help"
  },
  {
    "path": "csvlint.gemspec",
    "chars": 1979,
    "preview": "lib = File.expand_path(\"../lib\", __FILE__)\n$LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)\nrequire \"csvlint/vers"
  },
  {
    "path": "docker_notes_for_windows.txt",
    "chars": 687,
    "preview": "# Note that these commands are specific for a docker environment on MS Windows.\r\n\r\n# to generate Gemfile.lock file\r\ndock"
  },
  {
    "path": "features/check_format.feature",
    "chars": 1384,
    "preview": "Feature: Check inconsistent formatting\n\n  Scenario: Inconsistent formatting for integers\n    Given I have a CSV with the"
  },
  {
    "path": "features/cli.feature",
    "chars": 9991,
    "preview": "Feature: CSVlint CLI\n\n  Scenario: Valid CSV from url\n    Given I have a CSV with the following content:\n    \"\"\"\n\"Foo\",\"B"
  },
  {
    "path": "features/csv_options.feature",
    "chars": 1056,
    "preview": "Feature: CSV options\n\n  Scenario: Sucessfully parse a valid CSV\n    Given I have a CSV with the following content:\n    \""
  },
  {
    "path": "features/csvupload.feature",
    "chars": 5074,
    "preview": "Feature: Collect all the tests that should trigger dialect check related errors\n\n  Scenario: Title rows, I wish to trigg"
  },
  {
    "path": "features/csvw_schema_validation.feature",
    "chars": 3908,
    "preview": "Feature: CSVW Schema Validation\n\n  Scenario: Valid CSV\n    Given I have a CSV with the following content:\n    \"\"\"\n\"Bob\","
  },
  {
    "path": "features/fixtures/cr-line-endings.csv",
    "chars": 62,
    "preview": "\"Foo\",\"Bar\",\"Baz\"\r\"Biff\",\"Baff\",\"Boff\"\r\"Qux\",\"Teaspoon\",\"Doge\""
  },
  {
    "path": "features/fixtures/crlf-line-endings.csv",
    "chars": 64,
    "preview": "\"Foo\",\"Bsr\",\"Baz\"\r\n\"Biff\",\"Baff\",\"Boff\"\r\n\"Qux\",\"Teaspoon\",\"Doge\""
  },
  {
    "path": "features/fixtures/inconsistent-line-endings-unquoted.csv",
    "chars": 45,
    "preview": "Foo,Bsr,Baz\r\nBiff,Baff,Boff\nQux,Teaspoon,Doge"
  },
  {
    "path": "features/fixtures/inconsistent-line-endings.csv",
    "chars": 63,
    "preview": "\"Foo\",\"Bsr\",\"Baz\"\r\n\"Biff\",\"Baff\",\"Boff\"\n\"Qux\",\"Teaspoon\",\"Doge\""
  },
  {
    "path": "features/fixtures/invalid-byte-sequence.csv",
    "chars": 2117,
    "preview": "\"Data\",\"Dependencia Origem\",\"Histrico\",\"Data do Balancete\",\"Nmero do documento\",\"Valor\",\r\n\"10/31/2012\",\"\",\"Saldo Anterio"
  },
  {
    "path": "features/fixtures/invalid_many_rows.csv",
    "chars": 150,
    "preview": "\"Foo\",\"Bar\",\"Baz\"\r\n\"1\",\"2\",\"3\"\r\n\"3\",\"2\",\"1\"\r\n\"1\",\"2\",\"3\" \"\r\n\"3\",\"two\",\"1\"\r\n\"1\",\"2\",\"3\"\r\n\"3\",\"2\",\"1\"\r\n\r\n\"3\",\"2\",\"1\"\r\n\"3\","
  },
  {
    "path": "features/fixtures/lf-line-endings.csv",
    "chars": 62,
    "preview": "\"Foo\",\"Bsr\",\"Baz\"\n\"Biff\",\"Baff\",\"Boff\"\n\"Qux\",\"Teaspoon\",\"Doge\""
  },
  {
    "path": "features/fixtures/title-row.csv",
    "chars": 86,
    "preview": "\"This is a title row\",,\n\"Foo\",\"Bsr\",\"Baz\"\n\"Biff\",\"Baff\",\"Boff\"\n\"Qux\",\"Teaspoon\",\"Doge\""
  },
  {
    "path": "features/fixtures/valid.csv",
    "chars": 45,
    "preview": "\"Foo\",\"Bar\",\"Baz\"\r\n\"1\",\"2\",\"3\"\r\n\"3\",\"2\",\"1\"\r\n"
  },
  {
    "path": "features/fixtures/valid_many_rows.csv",
    "chars": 97,
    "preview": "\"Foo\",\"Bar\",\"Baz\"\r\n\"1\",\"2\",\"3\"\r\n\"3\",\"2\",\"1\"\r\n\"1\",\"2\",\"3\"\r\n\"3\",\"2\",\"1\"\r\n\"1\",\"2\",\"3\"\r\n\"3\",\"2\",\"1\"\r\n"
  },
  {
    "path": "features/fixtures/w3.org/.well-known/csvm",
    "chars": 60,
    "preview": "{+url}-metadata.json\ncsv-metadata.json\n{+url}.json\ncsvm.json"
  },
  {
    "path": "features/fixtures/white space in filename.csv",
    "chars": 45,
    "preview": "\"Foo\",\"Bar\",\"Baz\"\r\n\"1\",\"2\",\"3\"\r\n\"3\",\"2\",\"1\"\r\n"
  },
  {
    "path": "features/fixtures/windows-line-endings.csv",
    "chars": 14,
    "preview": "a,b,c\r\nd,e,f\r\n"
  },
  {
    "path": "features/information.feature",
    "chars": 609,
    "preview": "Feature: Return information\n\n  Background:\n    Given I have a CSV with the following content:\n    \"\"\"\n\"abc\",\"2\",\"3\"\n    "
  },
  {
    "path": "features/parse_csv.feature",
    "chars": 2519,
    "preview": "Feature: Parse CSV\n\n  Scenario: Successfully parse a valid CSV\n    Given I have a CSV with the following content:\n    \"\""
  },
  {
    "path": "features/schema_validation.feature",
    "chars": 3241,
    "preview": "Feature: Schema Validation\n\n  Scenario: Valid CSV\n    Given I have a CSV with the following content:\n    \"\"\"\n\"Bob\",\"1234"
  },
  {
    "path": "features/sources.feature",
    "chars": 498,
    "preview": "Feature: Parse CSV from Different Sources\n\n  Scenario: Successfully parse a valid CSV from a StringIO\n      Given I have"
  },
  {
    "path": "features/step_definitions/cli_steps.rb",
    "chars": 1188,
    "preview": "Given(/^I have stubbed $stdin to contain \"(.*?)\"$/) do |file|\n  expect($stdin).to receive(:read).and_return(File.read(fi"
  },
  {
    "path": "features/step_definitions/csv_options_steps.rb",
    "chars": 691,
    "preview": "Given(/^I set the delimiter to \"(.*?)\"$/) do |delimiter|\n  @csv_options ||= default_csv_options\n  @csv_options[\"delimite"
  },
  {
    "path": "features/step_definitions/information_steps.rb",
    "chars": 470,
    "preview": "Given(/^the content type is \"(.*?)\"$/) do |arg1|\n  @content_type = \"text/csv\"\nend\n\nThen(/^the \"(.*?)\" should be \"(.*?)\"$"
  },
  {
    "path": "features/step_definitions/parse_csv_steps.rb",
    "chars": 1566,
    "preview": "Given(/^I have a CSV with the following content:$/) do |string|\n  @csv = string.to_s\nend\n\nGiven(/^it has a Link header h"
  },
  {
    "path": "features/step_definitions/schema_validation_steps.rb",
    "chars": 1279,
    "preview": "Given(/^I have a schema with the following content:$/) do |json|\n  @schema_type = :json_table\n  @schema_json = json\nend\n"
  },
  {
    "path": "features/step_definitions/sources_steps.rb",
    "chars": 214,
    "preview": "Given(/^it is parsed as a StringIO$/) do\n  @url = StringIO.new(@csv)\nend\n\nGiven(/^I parse a file called \"(.*?)\"$/) do |f"
  },
  {
    "path": "features/step_definitions/validation_errors_steps.rb",
    "chars": 2719,
    "preview": "When(/^I ask if there are errors$/) do\n  @csv_options ||= default_csv_options\n\n  if @schema_json\n    @schema = if @schem"
  },
  {
    "path": "features/step_definitions/validation_info_steps.rb",
    "chars": 760,
    "preview": "Given(/^I ask if there are info messages$/) do\n  @csv_options ||= default_csv_options\n\n  if @schema_json\n    @schema = i"
  },
  {
    "path": "features/step_definitions/validation_warnings_steps.rb",
    "chars": 1773,
    "preview": "Given(/^it is encoded as \"(.*?)\"$/) do |encoding|\n  @csv = @csv.encode(encoding)\n  @encoding = encoding\nend\n\nGiven(/^I s"
  },
  {
    "path": "features/support/aruba.rb",
    "chars": 1758,
    "preview": "require \"aruba\"\nrequire \"aruba/cucumber\"\n\nrequire \"csvlint/cli\"\n\nmodule Csvlint\n  class CliRunner\n    # Allow everything"
  },
  {
    "path": "features/support/earl_formatter.rb",
    "chars": 2790,
    "preview": "require \"rdf\"\nrequire \"rdf/turtle\"\n\nclass EarlFormatter\n  def initialize(step_mother, io, options)\n    output = RDF::Res"
  },
  {
    "path": "features/support/env.rb",
    "chars": 379,
    "preview": "require \"coveralls\"\nCoveralls.wear_merged!(\"test_frameworks\")\n\n$:.unshift File.join(File.dirname(__FILE__), \"..\", \"..\", "
  },
  {
    "path": "features/support/load_tests.rb",
    "chars": 5154,
    "preview": "require \"json\"\nrequire \"open-uri\"\nrequire \"uri\"\n\nBASE_URI = \"https://w3c.github.io/csvw/tests/\"\nBASE_PATH = File.join(Fi"
  },
  {
    "path": "features/support/webmock.rb",
    "chars": 80,
    "preview": "require \"webmock/cucumber\"\n\nWebMock.disable_net_connect!(allow: %r{csvw/tests})\n"
  },
  {
    "path": "features/validation_errors.feature",
    "chars": 5178,
    "preview": "Feature: Get validation errors\n\n  Scenario: CSV with ragged rows\n    Given I have a CSV with the following content:\n    "
  },
  {
    "path": "features/validation_info.feature",
    "chars": 712,
    "preview": "Feature: Get validation information messages\n\n  Scenario: LF line endings in file give an info message\n    Given I have "
  },
  {
    "path": "features/validation_warnings.feature",
    "chars": 2658,
    "preview": "Feature: Validation warnings\n\n  Scenario: UTF-8 Encoding\n    Given I have a CSV with the following content:\n    \"\"\"\n\"col"
  },
  {
    "path": "gemfiles/activesupport_5.2.gemfile",
    "chars": 124,
    "preview": "# This file was generated by Appraisal\n\nsource \"https://rubygems.org\"\n\ngem \"activesupport\", \"~> 5.2.0\"\n\ngemspec path: \"."
  },
  {
    "path": "gemfiles/activesupport_6.0.gemfile",
    "chars": 124,
    "preview": "# This file was generated by Appraisal\n\nsource \"https://rubygems.org\"\n\ngem \"activesupport\", \"~> 6.0.0\"\n\ngemspec path: \"."
  },
  {
    "path": "gemfiles/activesupport_6.1.gemfile",
    "chars": 124,
    "preview": "# This file was generated by Appraisal\n\nsource \"https://rubygems.org\"\n\ngem \"activesupport\", \"~> 6.1.0\"\n\ngemspec path: \"."
  },
  {
    "path": "gemfiles/activesupport_7.0.gemfile",
    "chars": 124,
    "preview": "# This file was generated by Appraisal\n\nsource \"https://rubygems.org\"\n\ngem \"activesupport\", \"~> 7.0.0\"\n\ngemspec path: \"."
  },
  {
    "path": "gemfiles/activesupport_7.1.gemfile",
    "chars": 124,
    "preview": "# This file was generated by Appraisal\n\nsource \"https://rubygems.org\"\n\ngem \"activesupport\", \"~> 7.1.0\"\n\ngemspec path: \"."
  },
  {
    "path": "gemfiles/activesupport_7.2.gemfile",
    "chars": 124,
    "preview": "# This file was generated by Appraisal\n\nsource \"https://rubygems.org\"\n\ngem \"activesupport\", \"~> 7.2.0\"\n\ngemspec path: \"."
  },
  {
    "path": "lib/csvlint/cli.rb",
    "chars": 5734,
    "preview": "require \"csvlint\"\nrequire \"rainbow\"\nrequire \"active_support/json\"\nrequire \"json\"\nrequire \"thor\"\n\nrequire \"active_support"
  },
  {
    "path": "lib/csvlint/csvw/column.rb",
    "chars": 21357,
    "preview": "module Csvlint\n  module Csvw\n    class Column\n      include Csvlint::ErrorCollector\n\n      attr_reader :id, :about_url, "
  },
  {
    "path": "lib/csvlint/csvw/date_format.rb",
    "chars": 10372,
    "preview": "module Csvlint\n  module Csvw\n    class DateFormat\n      attr_reader :pattern\n\n      def initialize(pattern, datatype = n"
  },
  {
    "path": "lib/csvlint/csvw/metadata_error.rb",
    "chars": 176,
    "preview": "module Csvlint\n  module Csvw\n    class MetadataError < StandardError\n      attr_reader :path\n\n      def initialize(path "
  },
  {
    "path": "lib/csvlint/csvw/number_format.rb",
    "chars": 13070,
    "preview": "module Csvlint\n  module Csvw\n    class NumberFormat\n      attr_reader :integer, :pattern, :prefix, :numeric_part, :suffi"
  },
  {
    "path": "lib/csvlint/csvw/property_checker.rb",
    "chars": 37829,
    "preview": "module Csvlint\n  module Csvw\n    class PropertyChecker\n      class << self\n        def check_property(property, value, b"
  },
  {
    "path": "lib/csvlint/csvw/table.rb",
    "chars": 10260,
    "preview": "module Csvlint\n  module Csvw\n    class Table\n      include Csvlint::ErrorCollector\n\n      attr_reader :columns, :dialect"
  },
  {
    "path": "lib/csvlint/csvw/table_group.rb",
    "chars": 7265,
    "preview": "module Csvlint\n  module Csvw\n    class TableGroup\n      include Csvlint::ErrorCollector\n\n      attr_reader :url, :id, :t"
  },
  {
    "path": "lib/csvlint/error_collector.rb",
    "chars": 977,
    "preview": "module Csvlint\n  module ErrorCollector\n    attr_reader :errors, :warnings, :info_messages\n    # Creates a validation err"
  },
  {
    "path": "lib/csvlint/error_message.rb",
    "chars": 341,
    "preview": "module Csvlint\n  class ErrorMessage\n    attr_reader :type, :category, :row, :column, :content, :constraints\n\n    def ini"
  },
  {
    "path": "lib/csvlint/field.rb",
    "chars": 10352,
    "preview": "module Csvlint\n  class Field\n    include Csvlint::ErrorCollector\n\n    attr_reader :name, :constraints, :title, :descript"
  },
  {
    "path": "lib/csvlint/schema.rb",
    "chars": 3020,
    "preview": "module Csvlint\n  class Schema\n    include Csvlint::ErrorCollector\n\n    attr_reader :uri, :fields, :title, :description\n\n"
  },
  {
    "path": "lib/csvlint/validate.rb",
    "chars": 20209,
    "preview": "module Csvlint\n  class Validator\n    class LineCSV < CSV\n      ENCODE_RE = Hash.new do |h, str|\n        h[str] = Regexp."
  },
  {
    "path": "lib/csvlint/version.rb",
    "chars": 39,
    "preview": "module Csvlint\n  VERSION = \"1.5.0\"\nend\n"
  },
  {
    "path": "lib/csvlint.rb",
    "chars": 560,
    "preview": "require \"csv\"\nrequire \"date\"\nrequire \"open-uri\"\nrequire \"tempfile\"\nrequire \"typhoeus\"\n\nrequire \"active_support/all\"\nrequ"
  },
  {
    "path": "spec/csvw/column_spec.rb",
    "chars": 4117,
    "preview": "require \"spec_helper\"\n\ndescribe Csvlint::Csvw::Column do\n  it \"shouldn't generate errors for string values\" do\n    colum"
  },
  {
    "path": "spec/csvw/date_format_spec.rb",
    "chars": 2600,
    "preview": "require \"spec_helper\"\n\ndescribe Csvlint::Csvw::DateFormat do\n  it \"should parse dates that match yyyy-MM-dd correctly\" d"
  },
  {
    "path": "spec/csvw/number_format_spec.rb",
    "chars": 18327,
    "preview": "require \"spec_helper\"\n\ndescribe Csvlint::Csvw::NumberFormat do\n  it \"should correctly parse #,##0.##\" do\n    format = Cs"
  },
  {
    "path": "spec/csvw/table_group_spec.rb",
    "chars": 4925,
    "preview": "require \"spec_helper\"\n\ndescribe Csvlint::Csvw::TableGroup do\n  it \"should inherit null to all columns\" do\n    @metadata "
  },
  {
    "path": "spec/csvw/table_spec.rb",
    "chars": 3138,
    "preview": "require \"spec_helper\"\n\ndescribe Csvlint::Csvw::Table do\n  context \"when parsing CSVW table metadata\" do\n    before(:each"
  },
  {
    "path": "spec/field_spec.rb",
    "chars": 10787,
    "preview": "require \"spec_helper\"\n\ndescribe Csvlint::Field do\n  it \"should validate required fields\" do\n    field = Csvlint::Field.n"
  },
  {
    "path": "spec/schema_spec.rb",
    "chars": 8744,
    "preview": "require \"spec_helper\"\n\ndescribe Csvlint::Schema do\n  it \"should tolerate missing fields\" do\n    schema = Csvlint::Schema"
  },
  {
    "path": "spec/spec_helper.rb",
    "chars": 479,
    "preview": "require \"coveralls\"\nCoveralls.wear_merged!(\"test_frameworks\")\n\nrequire \"csvlint\"\nrequire \"byebug\"\nrequire \"webmock/rspec"
  },
  {
    "path": "spec/validator_spec.rb",
    "chars": 25360,
    "preview": "require \"spec_helper\"\n\ndescribe Csvlint::Validator do\n  before do\n    stub_request(:get, \"http://example.com/example.csv"
  }
]

// ... and 2 more files (download for full content)

About this extraction

This page contains the full source code of the theodi/csvlint.rb GitHub repository, extracted and formatted as plain text for AI agents and large language models (LLMs). The extraction includes 94 files (331.6 KB), approximately 92.5k tokens, and a symbol index with 161 extracted functions, classes, methods, constants, and types. Use this with OpenClaw, Claude, ChatGPT, Cursor, Windsurf, or any other AI tool that accepts text input. You can copy the full output to your clipboard or download it as a .txt file.

Extracted by GitExtract — free GitHub repo to text converter for AI. Built by Nikandr Surkov.

Extract another repo