Repository: clojure/core.rrb-vector Branch: master Commit: d1cc27f1698b Files: 74 Total size: 759.9 KB Directory structure: gitextract_vrqmmogg/ ├── .github/ │ └── workflows/ │ ├── doc-build.yml │ ├── release.yml │ ├── snapshot.yml │ └── test.yml ├── .gitignore ├── CHANGES.md ├── CONTRIBUTING.md ├── LICENSE ├── README.md ├── deps.edn ├── doc/ │ ├── benchmarks/ │ │ ├── benchmarks.md │ │ └── data/ │ │ ├── benchmarks.edn │ │ ├── concat.csv │ │ ├── list_construct.csv │ │ ├── list_iterate.csv │ │ └── list_lookup.csv │ ├── crrbv-27/ │ │ ├── description.md │ │ ├── proposed-fix-needs-thought-and-testing-plus-debug-prints.patch │ │ ├── proposed-fix-needs-thought-and-testing.patch │ │ └── use-shift-increment-2.patch │ ├── hash-details.md │ ├── rrb-tree-notes.md │ └── use-transducers/ │ ├── README.md │ └── use-transducers.patch ├── epl-v10.html ├── pom.xml ├── project.clj ├── script/ │ ├── jdo │ ├── mvn-run-tests │ ├── replace-params │ ├── sdo │ └── test └── src/ ├── main/ │ ├── cljs/ │ │ └── clojure/ │ │ └── core/ │ │ ├── rrb_vector/ │ │ │ ├── debug.cljs │ │ │ ├── debug_platform_dependent.cljs │ │ │ ├── interop.cljs │ │ │ ├── macros.clj │ │ │ ├── nodes.cljs │ │ │ ├── protocols.cljs │ │ │ ├── rrbt.cljs │ │ │ ├── transients.cljs │ │ │ └── trees.cljs │ │ └── rrb_vector.cljs │ └── clojure/ │ └── clojure/ │ └── core/ │ ├── rrb_vector/ │ │ ├── debug.clj │ │ ├── debug_platform_dependent.clj │ │ ├── fork_join.clj │ │ ├── interop.clj │ │ ├── nodes.clj │ │ ├── parameters.clj │ │ ├── protocols.clj │ │ ├── rrbt.clj │ │ └── transients.clj │ └── rrb_vector.clj ├── parameterized/ │ └── clojure/ │ └── clojure/ │ └── core/ │ ├── rrb_vector/ │ │ ├── debug.clj │ │ ├── debug_platform_dependent.clj │ │ ├── fork_join.clj │ │ ├── interop.clj │ │ ├── nodes.clj │ │ ├── parameters.clj │ │ ├── protocols.clj │ │ ├── rrbt.clj │ │ └── transients.clj │ └── rrb_vector.clj ├── test/ │ ├── cljs/ │ │ └── clojure/ │ │ └── core/ │ │ └── rrb_vector/ │ │ ├── long_test.cljs │ │ ├── test_cljs.cljs │ │ ├── test_cljs_only.cljs │ │ ├── test_common.cljs │ │ └── test_utils.cljs │ ├── clojure/ │ │ └── clojure/ │ │ └── core/ │ │ └── rrb_vector/ │ │ ├── long_test.clj │ │ ├── test_clj_only.clj │ │ ├── test_cljs.clj │ │ ├── test_common.clj │ │ └── test_utils.clj │ └── resources/ │ └── clojure/ │ └── core/ │ └── rrb_vector/ │ └── cljs_testsuite.clj └── test_local/ └── clojure/ └── clojure/ └── core/ └── rrb_vector_check.clj ================================================ FILE CONTENTS ================================================ ================================================ FILE: .github/workflows/doc-build.yml ================================================ name: Build API Docs permissions: contents: write on: workflow_dispatch: jobs: call-doc-build-workflow: uses: clojure/build.ci/.github/workflows/doc-build.yml@master with: project: clojure/core.rrb-vector ================================================ FILE: .github/workflows/release.yml ================================================ name: Release on demand permissions: contents: write on: workflow_dispatch: inputs: releaseVersion: description: "Version to release" required: true snapshotVersion: description: "Snapshot version after release" required: true jobs: call-release: uses: clojure/build.ci/.github/workflows/release.yml@master with: releaseVersion: ${{ github.event.inputs.releaseVersion }} snapshotVersion: ${{ github.event.inputs.snapshotVersion }} secrets: inherit ================================================ FILE: .github/workflows/snapshot.yml ================================================ name: Snapshot on demand permissions: contents: read on: [workflow_dispatch] jobs: call-snapshot: uses: clojure/build.ci/.github/workflows/snapshot.yml@master secrets: inherit ================================================ FILE: .github/workflows/test.yml ================================================ name: Test permissions: contents: read on: [push] jobs: test: strategy: matrix: os: [ubuntu-latest] # macOS-latest, windows-latest] java-version: ["8", "11"] # NOTE: tests fail on Java 17 as they depend on Nashorn clojure-version: ["1.9.0", "1.10.3", "1.11.1"] runs-on: ${{ matrix.os }} steps: - uses: actions/checkout@v3 - name: Set up Java uses: actions/setup-java@v3 with: java-version: ${{ matrix.java-version }} distribution: 'temurin' cache: 'maven' - name: Build with Maven run: mvn -ntp -B -Dclojure.version=${{ matrix.clojure-version }} clean test ================================================ FILE: .gitignore ================================================ /target /lib /classes /checkouts *.jar *.class .lein-deps-sum .lein-failures .lein-plugins .lein-repl-history /.repl /out /repl .\#* /.nrepl-port .idea *.iml /.cpcache ================================================ FILE: CHANGES.md ================================================ # Changes in 0.2.1 * Update parent pom and default Clojure version to 1.11.4 # Changes in 0.2.0 * Update dependencies and versions # Changes in 0.1.2 Bug fixes: * Correct handling of reduce-kv for empty vectors [CRRBV-29](https://clojure.atlassian.net/browse/CRRBV-29) # Changes in 0.1.1 ## Changes visible to users of the library Bug fixes: * Eliminate warning issued by ClojureScript compiler caused by use of a not-yet-defined type name [CRRBV-24](https://clojure.atlassian.net/browse/CRRBV-24) Minor code cleanup: * Eliminate redundant require of a namespace [commit](https://github.com/clojure/core.rrb-vector/commit/8c3bdc03f4d4c73326ac0146310bf472cda4d035) Documentation: * Created this change log. * Added new introductory text at beginning of README explaining briefly why someone might want to use this library. * Added doc/benchmarks/benchmarks.md document, with link from README, showing some benchmark results of this library versus several other implementations of vector data structures, some of which are based on RRB trees, some of which have only a linear time vector concatenation operation. * Added doc/rrb-tree-notes.md with links to other implementations of RRB trees, and papers and theses that have been written about them. ## Changes relevant to those who test and develop the library itself * Added doc/use-transducers/ directory with README and proposed patch for speeding up some of the code by using transducers, which in the cases used provide a speedup by avoiding allocating multiple intermediate sequences. # Changes in 0.1.0 ## Changes visible to users of the library Bug fixes: * Test case added that was failing before the fixes for other bugs listed below, but now works with those fixes, so likely its root cause has also been corrected [CRRBV-12](https://clojure.atlassian.net/browse/CRRBV-12) * Fixed bug that caused assoc and assoc! to fail when used on vectors of primitives [CRRBV-13](https://clojure.atlassian.net/browse/CRRBV-13) * Fixed bug where internal tree data structure gets too "tall and skinny", eventually exceeding the limits supported by the library's implementation [CRRBV-14](https://clojure.atlassian.net/browse/CRRBV-14) * Bug with similar root cause, and the same fix as, CRRBV-14 [CRRBV-17](https://clojure.atlassian.net/browse/CRRBV-17) * Fix incorrect condition check for when a subtree was full or not [CRRBV-20](https://clojure.atlassian.net/browse/CRRBV-20) * Fix of several bugs found during implementation and testing of other issues [CRRBV-21](https://clojure.atlassian.net/browse/CRRBV-21) * Fix off by one bug that caused incorrect results for pop and pop! operations on vectors of certain sizes [CRRBV-22](https://clojure.atlassian.net/browse/CRRBV-22) * Fix incorrect hash calculation for empty RRB vectors in ClojureScript implementation [CRRBV-25](https://clojure.atlassian.net/browse/CRRBV-25) * Fix potential data race for multi-threaded Clojure programs using this library where the hash value could be returned incorrectly as -1 instead of the correct value [CRRBV-26](https://clojure.atlassian.net/browse/CRRBV-26) Enhancements: * Support bootstrapped ClojureScript [CRRBV-16](https://clojure.atlassian.net/browse/CRRBV-16) ## Changes relevant to those who test and develop the library itself In order to continue to support Clojure 1.6.0, core.rrb-vector uses neither `.cljc` files nor transducers, for which Clojure added support in Clojure 1.7.0. However, in preparation for a future core.rrb-vector release that requires Clojure 1.7.0 or later, several test namespaces have been made nearly identical between their Clojure and ClojureScript versions, so that they can be replaced with a single `.cljc` file in the future, with only a few small uses of reader conditionals. The implementation of core.rrb-vector still contains similar, but independent, implementations in Clojure and ClojureScript, and no attempt has been made to make their implementations so similar that merging them into a combined .cljc file would be reasonable. Replacing some or all of the Clojure implementation with Java source code may help improve the constant factors of the execution time enough to warrant such a change in a future version of this library. * Updated Maven pom.xml file, Leiningen project.clj file, and Clojure deps.edn files, so that any of them may be used for the purposes of developing and testing this library further. Added examples of commands for all of these tools for running this library's tests, and commands for Ubuntu Linux and macOS for installing two different JavaScript run time engines that can be used to test the ClojureScript implementation. * The ClojureScript tests are now run, using JDK's Nashorn JavaScript run time environment, on build.clojure.org, in addition to the previous behavior of running the Clojure tests. * Rearranged tests in the test namespaces extensively, including some of their namespace names. There is now a `test-common` namespace that contains most of the tests, which can thus be run on both the Clojure and ClojureScript implementations. There are `test-clj-only` and `test-cljs-only` namespaces for tests unique to one of the implementations. * The `clojure.core.rrb-vector.debug/dbg-vec` function has been enhanced to support showing internal details of both the built in Clojure vectors, persistent and transient, as well as core.rrb-vector's data structures. The `debug` namespace also has several `checking-*` functions, e.g. `checking-catvec`, `checking-subvec`, etc. that behave the same as their non-checking counterparts, but perform significant sanity checking of their return values before they return, and while they have some configuration options, by default they throw exceptions if they find errors in the returned values. Any such exceptions are likely to be due to bugs in core.rrb-vector. These checking functions are significantly slower than the non-checking variants, and only intended for testing the core.rrb-vector library. The details of configuring their options are not intended to be stable, and thus likely to change in future releases of this library. ================================================ FILE: CONTRIBUTING.md ================================================ This is a [Clojure contrib] project. Under the Clojure contrib [guidelines], this project cannot accept pull requests. All patches must be submitted via [JIRA]. See [Contributing] on the Clojure website for more information on how to contribute. [Clojure contrib]: https://clojure.org/community/contrib_libs [Contributing]: https://clojure.org/community/contributing [JIRA]: https://clojure.atlassian.net/browse/CRRBV [guidelines]: https://clojure.org/community/contrib_howto ================================================ FILE: LICENSE ================================================ Eclipse Public License - v 1.0 THE ACCOMPANYING PROGRAM IS PROVIDED UNDER THE TERMS OF THIS ECLIPSE PUBLIC LICENSE ("AGREEMENT"). ANY USE, REPRODUCTION OR DISTRIBUTION OF THE PROGRAM CONSTITUTES RECIPIENT'S ACCEPTANCE OF THIS AGREEMENT. 1. DEFINITIONS "Contribution" means: a) in the case of the initial Contributor, the initial code and documentation distributed under this Agreement, and b) in the case of each subsequent Contributor: i) changes to the Program, and ii) additions to the Program; where such changes and/or additions to the Program originate from and are distributed by that particular Contributor. A Contribution 'originates' from a Contributor if it was added to the Program by such Contributor itself or anyone acting on such Contributor's behalf. Contributions do not include additions to the Program which: (i) are separate modules of software distributed in conjunction with the Program under their own license agreement, and (ii) are not derivative works of the Program. "Contributor" means any person or entity that distributes the Program. "Licensed Patents" mean patent claims licensable by a Contributor which are necessarily infringed by the use or sale of its Contribution alone or when combined with the Program. "Program" means the Contributions distributed in accordance with this Agreement. "Recipient" means anyone who receives the Program under this Agreement, including all Contributors. 2. GRANT OF RIGHTS a) Subject to the terms of this Agreement, each Contributor hereby grants Recipient a non-exclusive, worldwide, royalty-free copyright license to reproduce, prepare derivative works of, publicly display, publicly perform, distribute and sublicense the Contribution of such Contributor, if any, and such derivative works, in source code and object code form. b) Subject to the terms of this Agreement, each Contributor hereby grants Recipient a non-exclusive, worldwide, royalty-free patent license under Licensed Patents to make, use, sell, offer to sell, import and otherwise transfer the Contribution of such Contributor, if any, in source code and object code form. This patent license shall apply to the combination of the Contribution and the Program if, at the time the Contribution is added by the Contributor, such addition of the Contribution causes such combination to be covered by the Licensed Patents. The patent license shall not apply to any other combinations which include the Contribution. No hardware per se is licensed hereunder. c) Recipient understands that although each Contributor grants the licenses to its Contributions set forth herein, no assurances are provided by any Contributor that the Program does not infringe the patent or other intellectual property rights of any other entity. Each Contributor disclaims any liability to Recipient for claims brought by any other entity based on infringement of intellectual property rights or otherwise. As a condition to exercising the rights and licenses granted hereunder, each Recipient hereby assumes sole responsibility to secure any other intellectual property rights needed, if any. For example, if a third party patent license is required to allow Recipient to distribute the Program, it is Recipient's responsibility to acquire that license before distributing the Program. d) Each Contributor represents that to its knowledge it has sufficient copyright rights in its Contribution, if any, to grant the copyright license set forth in this Agreement. 3. REQUIREMENTS A Contributor may choose to distribute the Program in object code form under its own license agreement, provided that: a) it complies with the terms and conditions of this Agreement; and b) its license agreement: i) effectively disclaims on behalf of all Contributors all warranties and conditions, express and implied, including warranties or conditions of title and non-infringement, and implied warranties or conditions of merchantability and fitness for a particular purpose; ii) effectively excludes on behalf of all Contributors all liability for damages, including direct, indirect, special, incidental and consequential damages, such as lost profits; iii) states that any provisions which differ from this Agreement are offered by that Contributor alone and not by any other party; and iv) states that source code for the Program is available from such Contributor, and informs licensees how to obtain it in a reasonable manner on or through a medium customarily used for software exchange. When the Program is made available in source code form: a) it must be made available under this Agreement; and b) a copy of this Agreement must be included with each copy of the Program. Contributors may not remove or alter any copyright notices contained within the Program. Each Contributor must identify itself as the originator of its Contribution, if any, in a manner that reasonably allows subsequent Recipients to identify the originator of the Contribution. 4. COMMERCIAL DISTRIBUTION Commercial distributors of software may accept certain responsibilities with respect to end users, business partners and the like. While this license is intended to facilitate the commercial use of the Program, the Contributor who includes the Program in a commercial product offering should do so in a manner which does not create potential liability for other Contributors. Therefore, if a Contributor includes the Program in a commercial product offering, such Contributor ("Commercial Contributor") hereby agrees to defend and indemnify every other Contributor ("Indemnified Contributor") against any losses, damages and costs (collectively "Losses") arising from claims, lawsuits and other legal actions brought by a third party against the Indemnified Contributor to the extent caused by the acts or omissions of such Commercial Contributor in connection with its distribution of the Program in a commercial product offering. The obligations in this section do not apply to any claims or Losses relating to any actual or alleged intellectual property infringement. In order to qualify, an Indemnified Contributor must: a) promptly notify the Commercial Contributor in writing of such claim, and b) allow the Commercial Contributor to control, and cooperate with the Commercial Contributor in, the defense and any related settlement negotiations. The Indemnified Contributor may participate in any such claim at its own expense. For example, a Contributor might include the Program in a commercial product offering, Product X. That Contributor is then a Commercial Contributor. If that Commercial Contributor then makes performance claims, or offers warranties related to Product X, those performance claims and warranties are such Commercial Contributor's responsibility alone. Under this section, the Commercial Contributor would have to defend claims against the other Contributors related to those performance claims and warranties, and if a court requires any other Contributor to pay any damages as a result, the Commercial Contributor must pay those damages. 5. NO WARRANTY EXCEPT AS EXPRESSLY SET FORTH IN THIS AGREEMENT, THE PROGRAM IS PROVIDED ON AN "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, EITHER EXPRESS OR IMPLIED INCLUDING, WITHOUT LIMITATION, ANY WARRANTIES OR CONDITIONS OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Each Recipient is solely responsible for determining the appropriateness of using and distributing the Program and assumes all risks associated with its exercise of rights under this Agreement , including but not limited to the risks and costs of program errors, compliance with applicable laws, damage to or loss of data, programs or equipment, and unavailability or interruption of operations. 6. DISCLAIMER OF LIABILITY EXCEPT AS EXPRESSLY SET FORTH IN THIS AGREEMENT, NEITHER RECIPIENT NOR ANY CONTRIBUTORS SHALL HAVE ANY LIABILITY FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING WITHOUT LIMITATION LOST PROFITS), HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OR DISTRIBUTION OF THE PROGRAM OR THE EXERCISE OF ANY RIGHTS GRANTED HEREUNDER, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. 7. GENERAL If any provision of this Agreement is invalid or unenforceable under applicable law, it shall not affect the validity or enforceability of the remainder of the terms of this Agreement, and without further action by the parties hereto, such provision shall be reformed to the minimum extent necessary to make such provision valid and enforceable. If Recipient institutes patent litigation against any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Program itself (excluding combinations of the Program with other software or hardware) infringes such Recipient's patent(s), then such Recipient's rights granted under Section 2(b) shall terminate as of the date such litigation is filed. All Recipient's rights under this Agreement shall terminate if it fails to comply with any of the material terms or conditions of this Agreement and does not cure such failure in a reasonable period of time after becoming aware of such noncompliance. If all Recipient's rights under this Agreement terminate, Recipient agrees to cease use and distribution of the Program as soon as reasonably practicable. However, Recipient's obligations under this Agreement and any licenses granted by Recipient relating to the Program shall continue and survive. Everyone is permitted to copy and distribute copies of this Agreement, but in order to avoid inconsistency the Agreement is copyrighted and may only be modified in the following manner. The Agreement Steward reserves the right to publish new versions (including revisions) of this Agreement from time to time. No one other than the Agreement Steward has the right to modify this Agreement. The Eclipse Foundation is the initial Agreement Steward. The Eclipse Foundation may assign the responsibility to serve as the Agreement Steward to a suitable separate entity. Each new version of the Agreement will be given a distinguishing version number. The Program (including Contributions) may always be distributed subject to the version of the Agreement under which it was received. In addition, after a new version of the Agreement is published, Contributor may elect to distribute the Program (including its Contributions) under the new version. Except as expressly stated in Sections 2(a) and 2(b) above, Recipient receives no rights or licenses to the intellectual property of any Contributor under this Agreement, whether expressly, by implication, estoppel or otherwise. All rights in the Program not expressly granted under this Agreement are reserved. This Agreement is governed by the laws of the State of New York and the intellectual property laws of the United States of America. No party to this Agreement will bring a legal action under this Agreement more than one year after the cause of action arose. Each party waives its rights to a jury trial in any resulting litigation. ================================================ FILE: README.md ================================================ # core.rrb-vector Why would anyone want to use this library? The two primary answers are: + You want faster concatenation of vectors, which core.rrb-vector's `catvec` function provides for both Clojure and ClojureScript. + You use vectors of Java primitive types like long, double, etc., returned by Clojure's `vector-of` function, e.g. to reduce memory usage to about 1/3 of the memory required by vectors of arbitrary objects, and + You want the speed enabled by using the transient versions of such vectors. Clojure does not implement transients for primitive vectors created via `vector-of` -- core.rrb-vector does. Vectors are one of the most commonly used data structures within Clojure. Likely you already know that creating a vector equal to `v` plus a new element `e` appended to the end using the expression `(conj v e)` has a run time that is "effectively constant", i.e. it takes O(log N) time in the size N of `v`, where the base of the logarithm is 32, so it is a constant at most 4 for all vector sizes up to a million, and at most 7 for all vector sizes that Clojure supports. The fastest way to concatenate two vectors `v1` and `v2` into a single new vector is using an expression like `(into v1 v2)`. This is implemented by repeatedly appending a single element from the second vector to the first, so it takes linear time in the size of `v2` (multiplied by the effectively constant time mentioned above). Aside: There might be another expression that has a better _constant factor_ for its run time than `(into v1 v2)` does, and is thus faster. However, any other such expression will still take at least linear time in the size of the second vector. The core.rrb-vector library uses a tree structure similar to the one that Clojure uses internally for vectors, but generalizes it in such a way that producing a new tree that represents the concatenation of two input vectors using the `catvec` function can be done in O(log N) time, where N is the size of the result. You can give `catvec` vectors created in all of the ways you already normally do, and while it will return a new type of object, this new type behaves in all of the ways you expect a Clojure vector to behave. This new type of vector is indistinguishable from a normal Clojure vector unless you examine the value of `(type v)` or `(class v)`. In particular, `(vector? v)` is true for this new type, you can call all of the usual sequence-based functions on it to examine or process its elements, you can call `conj` on it, `nth`, etc. Thus if you have a program where frequently concatenating large vectors to produce new vectors is useful, core.rrb-vector may help you write a much faster program in a more natural way. This library is an implementation of the confluently persistent vector data structure introduced in the paper "RRB-Trees: Efficient Immutable Vectors", EPFL-REPORT-169879, September, 2011, by Phil Bagwell and Tiark Rompf. RRB-Trees build upon Clojure's internal `PersistentVector` class used to implement its built in vectors, adding logarithmic time concatenation and slicing (i.e. create sub-vectors from input vectors). ClojureScript is supported with the same API, except for the absence of the `vector-of` function. The main functions provided are `clojure.core.rrb-vector/catvec`, performing vector concatenation, and `clojure.core.rrb-vector/subvec`, which produces a new vector containing the appropriate subrange of the input vector (in contrast to `clojure.core/subvec`, which returns a view on the input vector). Like Clojure vectors, core.rrb-vector vectors can store arbitrary values, or using `vector-of` you can create vectors restricted to one primitive type, e.g. long, double, etc. The core.rrb-vector implementation provides seamless interoperability with the built in Clojure vectors of class `clojure.lang.PersistentVector`, `clojure.core.Vec` (vectors of primitive values) and `clojure.lang.APersistentVector$SubVector` instances: `clojure.core.rrb-vector/catvec` and `clojure.core.rrb-vector/subvec` convert their inputs to `clojure.core.rrb_vector.rrbt.Vector` instances whenever necessary (this is a very fast constant time operation for PersistentVector and primitive vectors; for SubVector it is O(log N), where N is the size of the underlying vector). `clojure.core.rrb-vector` also provides its own versions of `vector`, `vector-of`, and `vec` that always produce `clojure.core.rrb_vector.rrbt.Vector` instances. Note that `vector-of` accepts `:object` as one of the possible type arguments, in addition to keywords naming primitive types. ## Usage core.rrb-vector exports one public namespace: (require '[clojure.core.rrb-vector :as fv]) Note that the ClojureScript version uses the same namespace name (it *does not* use the alternative `cljs.*` prefix!). This is because the API is precisely the same (except `clojure.core.rrb-vector/vector-of` only makes sense on the JVM and is therefore not available in ClojureScript). The docstring attached to the namespace provides an overview of the available functionality (as found at the top of this README): (doc clojure.core.rrb-vector) The new functionality is accessible through two functions: `clojure.core.rrb-vector/subvec`, which provides logarithmic-time non-view slicing (in contrast to `clojure.core/subvec`, which is a constant-time operation producing view vectors that prevent the underlying vector from becoming eligible for garbage collection), and `clojure.core.rrb-vector/catvec`, which provides logarithmic-time concatenation. Crucially, these can be applied to regular Clojure(Script) vectors. (doc fv/subvec) (doc fv/catvec) ;; apply catvec and subvec to regular Clojure(Script) vectors (fv/catvec (vec (range 1234)) (vec (range 8765))) (fv/subvec (vec (range 1024)) 123 456) Additionally, several functions for constructing RRB vectors are provided. There is rarely any reason to use them, since, as mentioned above, the interesting functions exported by core.rrb-vector work with regular vectors. Note that `clojure.core.rrb-vector/vec`, in contrast to `clojure.core/vec`, reuses the internal tree of its input if it already is a vector (of any type) and does not alias short arrays. When passed a non-vector argument, it returns an RRB vector. (doc fv/vector) (doc fv/vector-of) (doc fv/vec) The debug namespace bundled with core.rrb-vector provides several utilities used by the test suite, as well as a function for visualizing the internal structure of vectors that works with regular Clojure(Script) vectors and RRB vectors. ;; for peeking under the hood (require '[clojure.core.rrb-vector.debug :as dv]) (dv/dbg-vec (fv/catvec (vec (range 1234)) (vec (range 8765)))) ## Releases and dependency information core.rrb-vector requires Clojure >= 1.5.0. View vectors created by `clojure.core/subvec` are correctly handled for Clojure >= 1.6.0. The ClojureScript version is regularly tested against the most recent ClojureScript release. core.rrb-vector releases are available from Maven Central. Development snapshots are available from the Sonatype OSS repository. * [Released versions](http://search.maven.org/#search%7Cga%7C1%7Cg%3A%22org.clojure%22%20AND%20a%3A%22core.rrb-vector%22) * [Development snapshots](https://oss.sonatype.org/index.html#nexus-search;gav~org.clojure~core.rrb-vector~~~) * [Change log](CHANGES.md) of changes made in this library. * Some [benchmark results](doc/benchmarks/benchmarks.md) comparing the run time of core.rrb-vector's JVM implementation against other vector/list implementations on the JVM. Follow the first link above to discover the current release number. [CLI/`deps.edn`](https://clojure.org/reference/deps_and_cli) dependency information: ```clojure org.clojure/core.rrb-vector {:mvn/version "${version}"} ``` [Leiningen](http://leiningen.org/) dependency information: [org.clojure/core.rrb-vector "${version}"] [Maven](http://maven.apache.org/) dependency information: org.clojure core.rrb-vector ${version} [Gradle](http://www.gradle.org/) dependency information: compile "org.clojure:core.rrb-vector:${version}" ## TODO 1. more tests; 2. performance: general perf tuning, more efficient `catvec` implementation (to replace current seq-ops-based impl). ## Developer information core.rrb-vector is being developed as a Clojure Contrib project, see the [What is Clojure Contrib](https://clojure.org/dev/contrib_libs) page for details. Patches will only be accepted from developers who have signed the Clojure Contributor Agreement. * [GitHub project](https://github.com/clojure/core.rrb-vector) * [Bug Tracker](https://clojure.atlassian.net/browse/CRRBV) * [Continuous Integration](https://github.com/clojure/core.rrb-vector/actions/workflows/test.yml) ### Useful Maven commands To run Clojure and ClojureScript tests: ```bash $ mvn -DCLOJURE_VERSION=1.10.1 -Dclojure.version=1.10.1 clean test ``` Clojure versions as old as 1.5.1 can be tested with such a command, but the ClojureScript tests only work when using Clojure 1.8.0 or later. To run tests and, if successful, create a JAR file in the targets directory: ```bash $ mvn -DCLOJURE_VERSION=1.10.1 -Dclojure.version=1.10.1 clean package ``` Prerequisites: Only Java and Maven need to be installed. Maven will download whatever versions of Clojure are needed for the command you use. Both Clojure and ClojureScript tests are run with the commands given here. They use the Nashorn JavaScript run time environment included with Java -- no other JavaScript run time is needed. ### Useful clj CLI commands To run relatively short Clojure tests, but no ClojureScript tests: ```bash $ ./script/jdo test ``` To run relatively short ClojureScript tests, but no Clojure tests: ```bash $ ./script/sdo test ``` Warning: Currently the command above for running ClojureScript tests does _not_ show warnings from the ClojureScript compiler. I have seen some ClojureScript compiler warnings appear when running the Maven command above, and the Leiningen command given below for running ClojureScript tests, that unfortunately do not appear using `./script/sdo test`. Suggestions welcome on how to make that command also show similar warnings. Replace `test` in the commands above with one of the following for other useful things: * `sock` (or no argument at all) - start a REPL, and listen for a socket REPL connection on TCP port 50505 * `long` - run a longer set of tests * `coll` - run generative tests from [`collection-check`](https://github.com/ztellman/collection-check) library * `east` - run Eastwood lint tool (clj version only, not cljs) ### Useful Leiningen commands To run Clojure tests, but no ClojureScript tests: ```bash $ lein with-profile +1.10 test ``` You can test with Clojure versions 1.5 through 1.10 by specifying that version number after the `+`. Prerequisites: Only Java and Leiningen. Leiningen will download whatever versions of Clojure and other libraries are needed. To run ClojureScript tests with Node.js and SpiderMonkey JavaScript runtimes, but no Clojure tests: ```bash $ lein with-profile +cljs cljsbuild test ``` Add `node` or `spidermonkey` as a separate argument after `test` to restrict the JavaScript runtime used to only the one you specify. You may need to adjust the command names in the `:test-commands` section of the `project.clj` file if the command for running those JavaScript runtimes have a different name on your system than what is used there. Prerequisites: Java, Leiningen, and either or both of Node.js and SpiderMonkey JavaScript run time environments. To run normal Clojure tests, plus the [`collection-check`](https://github.com/ztellman/collection-check) tests, but no ClojureScript tests: ```bash $ lein with-profile +coll,+1.7 test ``` The `collection-check` tests require Clojure 1.7.0 or later, I believe because collection-check and/or its dependencies require that. There is no existing command configured to run `collection-check` tests with ClojureScript. To start a REPL from Leiningen with Clojure versions 1.6.0 and older, you must use Leiningen 2.8.0 (likely some other versions work, too). ### Installing other software you will need For all of the development commands you must have Java installed. This includes the ClojureScript compile and test commands, since the ClojureScript compiler is at least partially written in the Java version of Clojure. #### Java Install one or more of the pre-built binaries from [AdoptOpenJDK](https://adoptopenjdk.net), or several other providers of Java binaries. Additional methods: * Ubuntu 18.04 Linux: `sudo apt-get install default-jre` #### Maven For any `mvn` command you must install [Maven](https://maven.apache.org). * Ubuntu 18.04 Linux: `sudo apt-get install maven` * macOS * plus Homebrew: `brew install maven` * plus MacPorts: `sudo port install maven3`, then either use the command `mvn3`, or to use `mvn` also run the command `sudo port select --set maven maven3`. #### Leiningen An install script and instructions are available on the [Leiningen](https://leiningen.org) site. #### Node.js JavaScript run time environment Installation instructions for many different versions of Node.js are available on the [Node.js web site](https://nodejs.org). You can also install it using the commands below. * Ubuntu 18.04 Linux: `sudo apt-get install nodejs` * macOS * plus Homebrew: `brew install node` * plus MacPorts: `sudo port install nodejs10`. You can see other versions available via the command `port list | grep nodejs`. #### SpiderMonkey JavaScript run time environment Installation instructions for many different versions of SpiderMonkey are available on the [SpiderMonkey web site](https://developer.mozilla.org/en-US/docs/Mozilla/Projects/SpiderMonkey). You may also install it using the commands below. * Ubuntu 18.04 Linux: `sudo apt-get install libmozjs-52-dev` * macOS * plus Homebrew: As of 2019-Sep-24, `brew install spidermonkey` installs version 1.8.5 of SpiderMonkey, which according to the Wikipedia page on SpiderMonkey was first released in 2011, with at least one release per year after that. The ClojureScript tests fail to run using this version of SpiderMonkey. It seems worth avoiding this version of SpiderMonkey for the purposes of testing `core.rrb-vector`. * plus MacPorts: `sudo port install mozjs52` ## Clojure(Script) code reuse core.rrb-vector's vectors support the same basic functionality regular Clojure's vectors do (with the omissions listed above). Where possible, this is achieved by reusing code from Clojure's gvec and ClojureScript's PersistentVector implementations. The Clojure(Script) source files containing the relevant code carry the following copyright notice: Copyright (c) Rich Hickey. All rights reserved. The use and distribution terms for this software are covered by the Eclipse Public License 1.0 (https://opensource.org/license/epl-1-0/) which can be found in the file epl-v10.html at the root of this distribution. By using this software in any fashion, you are agreeing to be bound by the terms of this license. You must not remove this notice, or any other, from this software. ## Licence Copyright © Michał Marczyk, Andy Fingerhut, Rich Hickey and contributors Distributed under the [Eclipse Public License 1.0](https://opensource.org/license/epl-1-0/), the same as Clojure. The licence text can be found in the `epl-v10.html` file at the root of this distribution. ================================================ FILE: deps.edn ================================================ ;; See shell scripts './script/sdo' and './script/jdo' for sample ;; useful combinations of aliases to use to acommplish common tasks. ;; One way to specify a local version of ClojureScript, in case you ;; want to test with modifications to it: ;; org.clojure/clojurescript {:local/root "/Users/jafinger/clj/clojurescript"} {:paths ["src/main/clojure" "src/main/cljs" "src/main/cljc"] ;;:paths ["src/parameterized/clojure" "src/main/cljs" "src/main/cljc"] :aliases {;; Common alias to use for all Clojure/Java commands :clj {:jvm-opts ["-XX:-OmitStackTraceInFastThrow"]} ;; Common alias to use for all ClojureScript commands :cljs {:extra-deps {org.clojure/clojurescript {:mvn/version "1.10.520"}} :jvm-opts ["-XX:-OmitStackTraceInFastThrow"]} ;; - start a Clojure/Java Socket REPL on port 50505 :clj-socket {:jvm-opts ["-Dclojure.server.repl={:port,50505,:accept,clojure.core.server/repl}"]} ;; start a Node-based ClojureScript socket REPL on port 50505 :cljs-socket {:jvm-opts ["-Dclojure.server.repl={:port,50505,:accept,cljs.server.node/repl}"]} ;; Common alias to use for all Clojure/Java commands that run tests :clj-test {:extra-paths ["src/test/clojure" "src/test/cljc"] :extra-deps {org.clojure/test.check {:mvn/version "1.1.3"}}} ;; Common alias to use for all ClojureScript commands that run tests :cljs-test {:extra-paths ["src/test/cljs" "src/test/cljc"] :extra-deps {org.clojure/test.check {:mvn/version "1.1.3"}}} ;; Run 'short' tests :clj-runt {:main-opts ["-e" "(require,'[clojure.test,:as,t],'clojure.core.rrb-vector.test-clj-only,'clojure.core.rrb-vector.test-common),(t/run-tests,'clojure.core.rrb-vector.test-common),(t/run-tests,'clojure.core.rrb-vector.test-clj-only)"]} :cljs-runt {:main-opts ["-m" "cljs.main" "-re" "node" "-e" "(require,'[clojure.test,:as,t],'clojure.core.rrb-vector.test-cljs-only,'clojure.core.rrb-vector.test-common),(t/run-tests,'clojure.core.rrb-vector.test-common),(t/run-tests,'clojure.core.rrb-vector.test-cljs-only)"]} ;; Run 'short' tests with extra-checks? enabled :clj-extrachecks-runt {:main-opts ["-e" "(require,'[clojure.test,:as,t],'clojure.core.rrb-vector.test-clj-only,'clojure.core.rrb-vector.test-common),(alter-var-root,#'clojure.core.rrb-vector.test-utils/extra-checks?,(constantly,true)),(t/run-tests,'clojure.core.rrb-vector.test-common),(t/run-tests,'clojure.core.rrb-vector.test-clj-only)"]} :cljs-extrachecks-runt {:main-opts ["-m" "cljs.main" "-re" "node" "-e" "(require,'[clojure.test,:as,t],'clojure.core.rrb-vector.test-cljs-only,'clojure.core.rrb-vector.test-common),(set!,clojure.core.rrb-vector.test-utils/extra-checks?,true),(t/run-tests,'clojure.core.rrb-vector.test-common),(t/run-tests,'clojure.core.rrb-vector.test-cljs-only)"]} ;; Run generative and/or 'long' tests :clj-runlongtests {:main-opts ["-e" "(require,'[clojure.test,:as,t],'clojure.core.rrb-vector.long-test),(t/run-tests,'clojure.core.rrb-vector.long-test)"]} :cljs-runlongtests {:main-opts ["-m" "cljs.main" "-re" "node" "-e" "(require,'[clojure.test,:as,t],'clojure.core.rrb-vector.long-test),(t/run-tests,'clojure.core.rrb-vector.long-test)"]} ;; Using collections-check requires this minimum version of ;; test.check, and at least Clojure 1.7.0 :clj-check {:extra-paths ["src/test_local/clojure"] :extra-deps {collection-check/collection-check {:mvn/version "0.1.7"} com.gfredericks/test.chuck {:mvn/version "0.2.10"} org.clojure/test.check {:mvn/version "1.1.3"}}} :clj-runcheck {:main-opts ["-e" "(require,'[clojure.test,:as,t],'clojure.core.rrb-vector-check),(t/run-tests,'clojure.core.rrb-vector-check)"]} :cljs-check {:extra-paths ["src/test_local/clojure"] :extra-deps {collection-check/collection-check {:mvn/version "0.1.7"} com.gfredericks/test.chuck {:mvn/version "0.2.10"} org.clojure/test.check {:mvn/version "1.1.3"}}} :cljs-runcheck {:main-opts ["-m" "cljs.main" "-re" "node" "-e" "(require,'[clojure.test,:as,t],'clojure.core.rrb-vector-check),(t/run-tests,'clojure.core.rrb-vector-check)"]} ;; Run performance tests :clj-runperf {:main-opts ["-e" "(require,'[clojure.test,:as,t],'clojure.core.rrb-vector-performance-test),(t/run-tests,'clojure.core.rrb-vector-performance-test)"]} :cljs-runperf {:main-opts ["-m" "cljs.main" "-re" "node" "-e" "(require,'[clojure.test,:as,t],'clojure.core.rrb-vector-performance-test),(t/run-tests,'clojure.core.rrb-vector-performance-test)"]} ;; Run whatever the current 'focus' tests are :clj-runfocus {:main-opts ["-e" "(require,'[clojure.test,:as,t],'[clojure.core.rrb-vector.test-common,:as,ct]),(ct/test-reduce-subvec-catvec2)"]} :cljs-runfocus {:main-opts ["-m" "cljs.main" "-re" "node" "-e" "(require,'[clojure.test,:as,t],'[clojure.core.rrb-vector.test-common,:as,ct]),(ct/test-reduce-subvec-catvec2)"]} ;; I have tried using cljs-test-runner for running clojure.test ;; tests in a modified version of core.rrb-vector, but my guess is ;; that since an older version of core.rrb-vector (version 0.0.11) ;; is in the transitive dependencies of the cljs-test-runner project ;; itself, that version conflicts with the version I am attempting ;; to test. See ;; https://github.com/Olical/cljs-test-runner/issues/34 ;; :cljs-runner {:extra-deps {olical/cljs-test-runner {:mvn/version "3.7.0"}} ;; :main-opts ["-m" "cljs-test-runner.main" ;; "-d" "src/test/cljs"]} ;; :cljol {:extra-deps {cljol {:local/root "/Users/andy/clj/cljol"} ;; org.clojure/clojure {:mvn/version "1.7.0"}}} :cljol {:extra-deps {cljol/cljol {:git/url "https://github.com/jafingerhut/cljol" :sha "bb5549e9832e73e4a9fc5dfdf695c48e797729fa"}}} :cap {;; recommended options from README of ;; https://github.com/clojure-goes-fast/clj-async-profiler :jvm-opts ["-Djdk.attach.allowAttachSelf" ;; I have trouble entering password for this from ;; clj REPL. Maybe clojure command instead of clj ;; is better for this? "-Djol.tryWithSudo=true" "-XX:+UnlockDiagnosticVMOptions" "-XX:+DebugNonSafepoints"] :extra-deps {com.clojure-goes-fast/clj-async-profiler {:mvn/version "1.6.2"}}} :nodis {:extra-deps {com.clojure-goes-fast/clj-java-decompiler {:mvn/version "0.3.7"}}} :eastwood {:extra-deps {jonase/eastwood {:mvn/version "0.3.5"}} :main-opts ["-m" "eastwood.lint" "{:source-paths,[\"src/main/clojure\"],:test-paths,[\"src/test/clojure\",\"src/test/cljs\",\"src/test_local/clojure\"],:add-linters,[:unused-fn-args,:unused-locals,:unused-namespaces,:unused-private-vars],:exclude-linters,[:implicit-dependencies],:exclude-namespaces,[]}"]} :clj-kondo {:extra-deps {clj-kondo/clj-kondo {:mvn/version "RELEASE"}} :main-opts ["-m" "clj-kondo.main"]} ;; pull in specific versions of Clojure: :1.5 {:override-deps {org.clojure/clojure {:mvn/version "1.5.1"}}} :1.6 {:override-deps {org.clojure/clojure {:mvn/version "1.6.0"}}} :1.7 {:override-deps {org.clojure/clojure {:mvn/version "1.7.0"}}} :1.8 {:override-deps {org.clojure/clojure {:mvn/version "1.8.0"}}} :1.9 {:override-deps {org.clojure/clojure {:mvn/version "1.9.0"}}} :1.10.0 {:override-deps {org.clojure/clojure {:mvn/version "1.10.0"}}} :1.10 {:override-deps {org.clojure/clojure {:mvn/version "1.10.1"}}} :master {:override-deps {org.clojure/clojure {:mvn/version "1.11.0-master-SNAPSHOT"}}}}} ================================================ FILE: doc/benchmarks/benchmarks.md ================================================ # core.rrb-vector benchmarks See the section "How the benchmarks were run" below for how the results were created. ## Benchmark results These benchmark results are based upon benchmark code written for the [`bifurcan`](https://github.com/lacuna/bifurcan) library. The [benchmarks published originally here](https://github.com/lacuna/bifurcan/blob/master/doc/comparison.md) include comparisons of data structures other than vectors, e.g. hash maps and sorted sets. Note that for the data structures we care most about here, those results call them "lists" rather than "vectors". I will use "lists" here to be consistent with the `bifurcan` benchmarks. The portion of the `bifurcan` results relevant to lists can be found [here](https://github.com/lacuna/bifurcan/blob/master/doc/comparison.md#lists). The time measurements here were made on a different machine than the `bifurcan` results, so you should not infer anything from the absolute time measurements here vs. there. Only the relative time measurements between the libraries in one set of benchmark results. --- ![](images/list_construct.png) As in the original benchmark results, construction times are quite consistent (at the scale shown in this graph) across different libraries, except for vavr. The graph below shows the results for all libraries except vavr, so that any differences between them become more apparent. Even at that finer scale the maximum differences above 100 element vectors appears to be at most 3x. ![](images/list_construct_all_but_vavr.png) --- ![](images/list_iterate.png) Without core.rrb-vector included, Zach Tellman's original comment on these results was "The mutable collections, which are stored contiguously, are only moderately faster than their immutable counterparts". Note that the benchmark here creates a Java [`Iterator`](https://docs.oracle.com/javase/8/docs/api/java/util/Iterator.html) object from the collection, and then in a loop uses that interface's `hasNext` and `next` methods to traverse the entire collection until `hasNext` returns false. The more typical way to traverse collections in Clojure is to call `seq` on them (or use one of many functions that call `seq` for you under the covers), and then use `rest` or `next` to advance one step. core.rrb-vector version 0.1.0 is significantly slower, primarily because it has less optimized code for its Java `Iterator` implementation than it has for its `seq`/`rest`/`next` implementation. The reason for the increases near 1K elements and 32K elements are most likely because those are the sizes where the 32-way tree data structure gets one level deeper. The graph below shows the results for all except the core.rrb-vector library, so any differences between those libraries can be seen. Several of them show at least some change in run time when crossing the 1K and 32K element points. ![](images/list_iterate_all_but_core_rrb_vector.png) --- ![](images/list_lookup.png) Zach's comment on the original results, which are nearly identical to those shown above: "The mutable collections are O(1) while their immutable counterparts are unmistakably O(log N)." --- ![](images/concat.png) Zach's comment on the original results: "Concatenation is O(N) for every library except Paguro and Bifurcan, which are O(log N) due to their use of RRB trees." I know for a fact that clojure.PersistentVector does not use RRB trees, and concatenation takes at least linear time (multiplied by some small log factor) in the length of the second vector. This appears to be the case for vavr.Vector and java.ArrayList as well. I would have guessed that scala.Vector used RRB trees as well, but have not checked the implementation yet to verify. If it does use RRB trees, it is by far the slowest of the ones that do, at least for vectors 100K in size and larger -- perhaps the scala.Vector authors chose to implement some more extensive tree rebalancing than other RRB implementers did, in order to preserve faster run times for other operations? The next graph shows the results with only the 4 libraries that are the fastest for concatenation. Unlike the previous graphs, the vertical axis is the elapsed time, not "elapsed time per vector element". RRB trees should enable O(log N) run time for concatenation. core.rrb-vector is the slowest of these by a large factor, probably because of a concatenation implementation that has not been scrutinized for optimization opportunities yet. ![](images/concat_time_all_rrb.png) The next graph shows only the 3 libraries that are the fastest for concatenation, to see any detail that might be of interest there. Like the previous one, the vertical axis is elapsed time, not elapsed time per vector element. TBD: How can bifurcan.LinearList have pretty much a constant run time for all vector sizes? ![](images/concat_time_all_rrb_but_core_rrb_vector.png) --- ## How the benchmarks were run To run benchmarks from the Bifurcan project, with small modifications that add core.rrb-vector to the list of libraries that are measured, follow these steps. Note that the version of the `bifurcan.List` data structure code used in these results has a few proposed bug fixes from the version of the code in the original `bifurcan` repository, but I do not believe they affect the performance in any noticeable way. ```bash $ git clone https://github.com/jafingerhut/bifurcan $ cd bifurcan $ git checkout 457fd0346b78392f39e4c0e79f1e43b7847ea93b $ ./benchmarks/run-vectors-only.sh ``` To copy the data and images produced as a result of the above, to where I copied them in this repository: ```bash $ DST=/path/to/my/clone/of/core.rrb-vector/doc/benchmarks $ mkdir $DST/images $DST/data $ cp -p benchmarks/images/list*.png benchmarks/images/concat*.png $DST/images $ cp -p benchmarks/data/benchmarks.edn benchmarks/data/concat.csv benchmarks/data/list*.csv $DST/data ``` The benchmark results here were measured on a system with these properties: * MacBook Pro model 11,2, 2.5 GHz Intel Core i7 with peak clock speed 3.6 GHz, 16 GB RAM * macOS 10.14.6 * AdoptOpenJDK 11.0.4, 64-bit server build 11.0.4+11 * Leiningen 2.9.1 * To see the versions of the list libraries that were measured, look in the project.clj file of the bifurcan project at the commit mentioned above. For core.rrb-vector, the only measured library that is written in Clojure, that project.clj file currently specifies Clojure version 1.8.0. The other libraries are written in Java. ================================================ FILE: doc/benchmarks/data/benchmarks.edn ================================================ {"bifurcan.List" {100000 {:intersection nil, :remove nil, :insert nil, :lookup 1746262, :concat 297, :equals nil, :iteration 167543, :difference nil, :clone nil, :union nil, :construct 683919}, 177827 {:intersection nil, :remove nil, :insert nil, :lookup 3062911, :concat 333, :equals nil, :iteration 297203, :difference nil, :clone nil, :union nil, :construct 1292069}, 1000000 {:intersection nil, :remove nil, :insert nil, :lookup 33788032, :concat 481, :equals nil, :iteration 1745738, :difference nil, :clone nil, :union nil, :construct 7310803}, 316227 {:intersection nil, :remove nil, :insert nil, :lookup 7754499, :concat 364, :equals nil, :iteration 526319, :difference nil, :clone nil, :union nil, :construct 2348309}, 1778 {:intersection nil, :remove nil, :insert nil, :lookup 13720, :concat 1466, :equals nil, :iteration 2558, :difference nil, :clone nil, :union nil, :construct 12905}, 31 {:intersection nil, :remove nil, :insert nil, :lookup 46, :concat 139, :equals nil, :iteration 28, :difference nil, :clone nil, :union nil, :construct 188}, 10000 {:intersection nil, :remove nil, :insert nil, :lookup 84389, :concat 447, :equals nil, :iteration 14238, :difference nil, :clone nil, :union nil, :construct 72806}, 1000 {:intersection nil, :remove nil, :insert nil, :lookup 3614, :concat 639, :equals nil, :iteration 2204, :difference nil, :clone nil, :union nil, :construct 7284}, 56 {:intersection nil, :remove nil, :insert nil, :lookup 171, :concat 144, :equals nil, :iteration 85, :difference nil, :clone nil, :union nil, :construct 450}, 562341 {:intersection nil, :remove nil, :insert nil, :lookup 16662099, :concat 480, :equals nil, :iteration 941730, :difference nil, :clone nil, :union nil, :construct 4357028}, 316 {:intersection nil, :remove nil, :insert nil, :lookup 1108, :concat 395, :equals nil, :iteration 697, :difference nil, :clone nil, :union nil, :construct 2356}, 100 {:intersection nil, :remove nil, :insert nil, :lookup 363, :concat 241, :equals nil, :iteration 218, :difference nil, :clone nil, :union nil, :construct 791}, 3162 {:intersection nil, :remove nil, :insert nil, :lookup 24404, :concat 347, :equals nil, :iteration 4462, :difference nil, :clone nil, :union nil, :construct 22781}, 56234 {:intersection nil, :remove nil, :insert nil, :lookup 903863, :concat 1475, :equals nil, :iteration 93320, :difference nil, :clone nil, :union nil, :construct 416354}, 17 {:intersection nil, :remove nil, :insert nil, :lookup 27, :concat 139, :equals nil, :iteration 19, :difference nil, :clone nil, :union nil, :construct 154}, 17782 {:intersection nil, :remove nil, :insert nil, :lookup 181189, :concat 625, :equals nil, :iteration 25257, :difference nil, :clone nil, :union nil, :construct 130015}, 562 {:intersection nil, :remove nil, :insert nil, :lookup 2028, :concat 546, :equals nil, :iteration 1241, :difference nil, :clone nil, :union nil, :construct 4229}, 5623 {:intersection nil, :remove nil, :insert nil, :lookup 43274, :concat 411, :equals nil, :iteration 7946, :difference nil, :clone nil, :union nil, :construct 41536}, 10 {:intersection nil, :remove nil, :insert nil, :lookup 17, :concat 136, :equals nil, :iteration 14, :difference nil, :clone nil, :union nil, :construct 94}, 31622 {:intersection nil, :remove nil, :insert nil, :lookup 356039, :concat 626, :equals nil, :iteration 44706, :difference nil, :clone nil, :union nil, :construct 229882}, 177 {:intersection nil, :remove nil, :insert nil, :lookup 615, :concat 314, :equals nil, :iteration 393, :difference nil, :clone nil, :union nil, :construct 1403}}, "java.ArrayList" {100000 {:intersection nil, :remove nil, :insert nil, :lookup 232778, :concat 224121, :equals nil, :iteration 56148, :difference nil, :clone nil, :union nil, :construct 666432}, 177827 {:intersection nil, :remove nil, :insert nil, :lookup 269608, :concat 412139, :equals nil, :iteration 99187, :difference nil, :clone nil, :union nil, :construct 1331189}, 1000000 {:intersection nil, :remove nil, :insert nil, :lookup 1873249, :concat 5719794, :equals nil, :iteration 562946, :difference nil, :clone nil, :union nil, :construct 14379088}, 316227 {:intersection nil, :remove nil, :insert nil, :lookup 455997, :concat 1924960, :equals nil, :iteration 176352, :difference nil, :clone nil, :union nil, :construct 3804737}, 1778 {:intersection nil, :remove nil, :insert nil, :lookup 2220, :concat 2900, :equals nil, :iteration 1001, :difference nil, :clone nil, :union nil, :construct 11630}, 31 {:intersection nil, :remove nil, :insert nil, :lookup 39, :concat 108, :equals nil, :iteration 31, :difference nil, :clone nil, :union nil, :construct 206}, 10000 {:intersection nil, :remove nil, :insert nil, :lookup 15085, :concat 18509, :equals nil, :iteration 5572, :difference nil, :clone nil, :union nil, :construct 84652}, 1000 {:intersection nil, :remove nil, :insert nil, :lookup 1236, :concat 1767, :equals nil, :iteration 582, :difference nil, :clone nil, :union nil, :construct 7784}, 56 {:intersection nil, :remove nil, :insert nil, :lookup 74, :concat 157, :equals nil, :iteration 49, :difference nil, :clone nil, :union nil, :construct 396}, 562341 {:intersection nil, :remove nil, :insert nil, :lookup 3617188, :concat 4255442, :equals nil, :iteration 312946, :difference nil, :clone nil, :union nil, :construct 9850749}, 316 {:intersection nil, :remove nil, :insert nil, :lookup 394, :concat 685, :equals nil, :iteration 206, :difference nil, :clone nil, :union nil, :construct 2525}, 100 {:intersection nil, :remove nil, :insert nil, :lookup 127, :concat 231, :equals nil, :iteration 79, :difference nil, :clone nil, :union nil, :construct 662}, 3162 {:intersection nil, :remove nil, :insert nil, :lookup 3937, :concat 5415, :equals nil, :iteration 1756, :difference nil, :clone nil, :union nil, :construct 21032}, 56234 {:intersection nil, :remove nil, :insert nil, :lookup 129834, :concat 122657, :equals nil, :iteration 31247, :difference nil, :clone nil, :union nil, :construct 382130}, 17 {:intersection nil, :remove nil, :insert nil, :lookup 22, :concat 137, :equals nil, :iteration 21, :difference nil, :clone nil, :union nil, :construct 132}, 17782 {:intersection nil, :remove nil, :insert nil, :lookup 38404, :concat 34667, :equals nil, :iteration 9918, :difference nil, :clone nil, :union nil, :construct 115303}, 562 {:intersection nil, :remove nil, :insert nil, :lookup 698, :concat 1040, :equals nil, :iteration 340, :difference nil, :clone nil, :union nil, :construct 5249}, 5623 {:intersection nil, :remove nil, :insert nil, :lookup 7047, :concat 9791, :equals nil, :iteration 3131, :difference nil, :clone nil, :union nil, :construct 36137}, 10 {:intersection nil, :remove nil, :insert nil, :lookup 14, :concat 114, :equals nil, :iteration 16, :difference nil, :clone nil, :union nil, :construct 60}, 31622 {:intersection nil, :remove nil, :insert nil, :lookup 69391, :concat 72656, :equals nil, :iteration 17507, :difference nil, :clone nil, :union nil, :construct 281678}, 177 {:intersection nil, :remove nil, :insert nil, :lookup 222, :concat 384, :equals nil, :iteration 129, :difference nil, :clone nil, :union nil, :construct 1567}}, "clojure.PersistentVector" {100000 {:intersection nil, :remove nil, :insert nil, :lookup 2228634, :concat 1779133, :equals nil, :iteration 191083, :difference nil, :clone nil, :union nil, :construct 1561274}, 177827 {:intersection nil, :remove nil, :insert nil, :lookup 4423742, :concat 3174520, :equals nil, :iteration 339029, :difference nil, :clone nil, :union nil, :construct 2788184}, 1000000 {:intersection nil, :remove nil, :insert nil, :lookup 65538555, :concat 18577365, :equals nil, :iteration 1935245, :difference nil, :clone nil, :union nil, :construct 15673957}, 316227 {:intersection nil, :remove nil, :insert nil, :lookup 15758293, :concat 5668410, :equals nil, :iteration 602441, :difference nil, :clone nil, :union nil, :construct 4990200}, 1778 {:intersection nil, :remove nil, :insert nil, :lookup 16304, :concat 31391, :equals nil, :iteration 3260, :difference nil, :clone nil, :union nil, :construct 25188}, 31 {:intersection nil, :remove nil, :insert nil, :lookup 146, :concat 978, :equals nil, :iteration 35, :difference nil, :clone nil, :union nil, :construct 561}, 10000 {:intersection nil, :remove nil, :insert nil, :lookup 114617, :concat 175732, :equals nil, :iteration 18315, :difference nil, :clone nil, :union nil, :construct 154981}, 1000 {:intersection nil, :remove nil, :insert nil, :lookup 7500, :concat 17906, :equals nil, :iteration 1779, :difference nil, :clone nil, :union nil, :construct 14172}, 56 {:intersection nil, :remove nil, :insert nil, :lookup 357, :concat 1536, :equals nil, :iteration 99, :difference nil, :clone nil, :union nil, :construct 900}, 562341 {:intersection nil, :remove nil, :insert nil, :lookup 22863239, :concat 10219007, :equals nil, :iteration 1071436, :difference nil, :clone nil, :union nil, :construct 8827463}, 316 {:intersection nil, :remove nil, :insert nil, :lookup 2479, :concat 6085, :equals nil, :iteration 568, :difference nil, :clone nil, :union nil, :construct 4536}, 100 {:intersection nil, :remove nil, :insert nil, :lookup 768, :concat 2335, :equals nil, :iteration 181, :difference nil, :clone nil, :union nil, :construct 1530}, 3162 {:intersection nil, :remove nil, :insert nil, :lookup 29530, :concat 56268, :equals nil, :iteration 5805, :difference nil, :clone nil, :union nil, :construct 49063}, 56234 {:intersection nil, :remove nil, :insert nil, :lookup 1663501, :concat 990805, :equals nil, :iteration 106731, :difference nil, :clone nil, :union nil, :construct 872348}, 17 {:intersection nil, :remove nil, :insert nil, :lookup 81, :concat 752, :equals nil, :iteration 22, :difference nil, :clone nil, :union nil, :construct 384}, 17782 {:intersection nil, :remove nil, :insert nil, :lookup 236158, :concat 310298, :equals nil, :iteration 32369, :difference nil, :clone nil, :union nil, :construct 275176}, 562 {:intersection nil, :remove nil, :insert nil, :lookup 4294, :concat 10456, :equals nil, :iteration 1002, :difference nil, :clone nil, :union nil, :construct 7998}, 5623 {:intersection nil, :remove nil, :insert nil, :lookup 54315, :concat 98658, :equals nil, :iteration 10284, :difference nil, :clone nil, :union nil, :construct 87357}, 10 {:intersection nil, :remove nil, :insert nil, :lookup 48, :concat 625, :equals nil, :iteration 17, :difference nil, :clone nil, :union nil, :construct 249}, 31622 {:intersection nil, :remove nil, :insert nil, :lookup 483401, :concat 549372, :equals nil, :iteration 57762, :difference nil, :clone nil, :union nil, :construct 490170}, 177 {:intersection nil, :remove nil, :insert nil, :lookup 1301, :concat 3748, :equals nil, :iteration 327, :difference nil, :clone nil, :union nil, :construct 2608}}, "vavr.Vector" {100000 {:intersection nil, :remove nil, :insert nil, :lookup 1032512, :concat 442338, :equals nil, :iteration 183461, :difference nil, :clone nil, :union nil, :construct 19086024}, 177827 {:intersection nil, :remove nil, :insert nil, :lookup 2412248, :concat 788452, :equals nil, :iteration 324402, :difference nil, :clone nil, :union nil, :construct 35034903}, 1000000 {:intersection nil, :remove nil, :insert nil, :lookup 35926792, :concat 4775235, :equals nil, :iteration 1891016, :difference nil, :clone nil, :union nil, :construct 203992824}, 316227 {:intersection nil, :remove nil, :insert nil, :lookup 4897287, :concat 1423772, :equals nil, :iteration 579257, :difference nil, :clone nil, :union nil, :construct 64204535}, 1778 {:intersection nil, :remove nil, :insert nil, :lookup 10737, :concat 6417, :equals nil, :iteration 3165, :difference nil, :clone nil, :union nil, :construct 324766}, 31 {:intersection nil, :remove nil, :insert nil, :lookup 46, :concat 210, :equals nil, :iteration 36, :difference nil, :clone nil, :union nil, :construct 4205}, 10000 {:intersection nil, :remove nil, :insert nil, :lookup 69566, :concat 45135, :equals nil, :iteration 17793, :difference nil, :clone nil, :union nil, :construct 1795069}, 1000 {:intersection nil, :remove nil, :insert nil, :lookup 3024, :concat 3542, :equals nil, :iteration 1909, :difference nil, :clone nil, :union nil, :construct 180283}, 56 {:intersection nil, :remove nil, :insert nil, :lookup 175, :concat 306, :equals nil, :iteration 103, :difference nil, :clone nil, :union nil, :construct 8508}, 562341 {:intersection nil, :remove nil, :insert nil, :lookup 13748592, :concat 2536907, :equals nil, :iteration 1038763, :difference nil, :clone nil, :union nil, :construct 109838874}, 316 {:intersection nil, :remove nil, :insert nil, :lookup 930, :concat 1218, :equals nil, :iteration 613, :difference nil, :clone nil, :union nil, :construct 53085}, 100 {:intersection nil, :remove nil, :insert nil, :lookup 305, :concat 572, :equals nil, :iteration 192, :difference nil, :clone nil, :union nil, :construct 17031}, 3162 {:intersection nil, :remove nil, :insert nil, :lookup 19182, :concat 14152, :equals nil, :iteration 5600, :difference nil, :clone nil, :union nil, :construct 604280}, 56234 {:intersection nil, :remove nil, :insert nil, :lookup 491455, :concat 244803, :equals nil, :iteration 103257, :difference nil, :clone nil, :union nil, :construct 11700143}, 17 {:intersection nil, :remove nil, :insert nil, :lookup 27, :concat 170, :equals nil, :iteration 24, :difference nil, :clone nil, :union nil, :construct 2139}, 17782 {:intersection nil, :remove nil, :insert nil, :lookup 152997, :concat 73002, :equals nil, :iteration 31425, :difference nil, :clone nil, :union nil, :construct 3724367}, 562 {:intersection nil, :remove nil, :insert nil, :lookup 1613, :concat 2223, :equals nil, :iteration 1056, :difference nil, :clone nil, :union nil, :construct 97549}, 5623 {:intersection nil, :remove nil, :insert nil, :lookup 34617, :concat 23687, :equals nil, :iteration 9969, :difference nil, :clone nil, :union nil, :construct 1102982}, 10 {:intersection nil, :remove nil, :insert nil, :lookup 18, :concat 150, :equals nil, :iteration 18, :difference nil, :clone nil, :union nil, :construct 1228}, 31622 {:intersection nil, :remove nil, :insert nil, :lookup 300171, :concat 138379, :equals nil, :iteration 55955, :difference nil, :clone nil, :union nil, :construct 6935015}, 177 {:intersection nil, :remove nil, :insert nil, :lookup 515, :concat 854, :equals nil, :iteration 345, :difference nil, :clone nil, :union nil, :construct 28413}}, "scala.Vector" {100000 {:intersection nil, :remove nil, :insert nil, :lookup 1443229, :concat 128174, :equals nil, :iteration 207209, :difference nil, :clone nil, :union nil, :construct 446145}, 177827 {:intersection nil, :remove nil, :insert nil, :lookup 1859958, :concat 227317, :equals nil, :iteration 366050, :difference nil, :clone nil, :union nil, :construct 797333}, 1000000 {:intersection nil, :remove nil, :insert nil, :lookup 18855904, :concat 936365, :equals nil, :iteration 2107386, :difference nil, :clone nil, :union nil, :construct 4582003}, 316227 {:intersection nil, :remove nil, :insert nil, :lookup 4385075, :concat 406149, :equals nil, :iteration 654705, :difference nil, :clone nil, :union nil, :construct 1400461}, 1778 {:intersection nil, :remove nil, :insert nil, :lookup 8606, :concat 2516, :equals nil, :iteration 3256, :difference nil, :clone nil, :union nil, :construct 9005}, 31 {:intersection nil, :remove nil, :insert nil, :lookup 54, :concat 168, :equals nil, :iteration 71, :difference nil, :clone nil, :union nil, :construct 154}, 10000 {:intersection nil, :remove nil, :insert nil, :lookup 69497, :concat 12487, :equals nil, :iteration 17861, :difference nil, :clone nil, :union nil, :construct 49662}, 1000 {:intersection nil, :remove nil, :insert nil, :lookup 2936, :concat 1546, :equals nil, :iteration 1794, :difference nil, :clone nil, :union nil, :construct 5072}, 56 {:intersection nil, :remove nil, :insert nil, :lookup 132, :concat 281, :equals nil, :iteration 115, :difference nil, :clone nil, :union nil, :construct 325}, 562341 {:intersection nil, :remove nil, :insert nil, :lookup 11807347, :concat 713960, :equals nil, :iteration 1159375, :difference nil, :clone nil, :union nil, :construct 2554273}, 316 {:intersection nil, :remove nil, :insert nil, :lookup 861, :concat 738, :equals nil, :iteration 593, :difference nil, :clone nil, :union nil, :construct 1679}, 100 {:intersection nil, :remove nil, :insert nil, :lookup 256, :concat 518, :equals nil, :iteration 213, :difference nil, :clone nil, :union nil, :construct 565}, 3162 {:intersection nil, :remove nil, :insert nil, :lookup 18950, :concat 4940, :equals nil, :iteration 5670, :difference nil, :clone nil, :union nil, :construct 15888}, 56234 {:intersection nil, :remove nil, :insert nil, :lookup 763569, :concat 70672, :equals nil, :iteration 116749, :difference nil, :clone nil, :union nil, :construct 249868}, 17 {:intersection nil, :remove nil, :insert nil, :lookup 31, :concat 170, :equals nil, :iteration 53, :difference nil, :clone nil, :union nil, :construct 118}, 17782 {:intersection nil, :remove nil, :insert nil, :lookup 156912, :concat 25782, :equals nil, :iteration 31478, :difference nil, :clone nil, :union nil, :construct 88109}, 562 {:intersection nil, :remove nil, :insert nil, :lookup 1618, :concat 1085, :equals nil, :iteration 1028, :difference nil, :clone nil, :union nil, :construct 2918}, 5623 {:intersection nil, :remove nil, :insert nil, :lookup 35728, :concat 8230, :equals nil, :iteration 10070, :difference nil, :clone nil, :union nil, :construct 27946}, 10 {:intersection nil, :remove nil, :insert nil, :lookup 19, :concat 166, :equals nil, :iteration 36, :difference nil, :clone nil, :union nil, :construct 98}, 31622 {:intersection nil, :remove nil, :insert nil, :lookup 304412, :concat 45674, :equals nil, :iteration 55700, :difference nil, :clone nil, :union nil, :construct 150112}, 177 {:intersection nil, :remove nil, :insert nil, :lookup 465, :concat 586, :equals nil, :iteration 355, :difference nil, :clone nil, :union nil, :construct 934}}, "paguro.RrbTree" {100000 {:intersection nil, :remove nil, :insert nil, :lookup 2279200, :concat 898, :equals nil, :iteration 214485, :difference nil, :clone nil, :union nil, :construct 1222418}, 177827 {:intersection nil, :remove nil, :insert nil, :lookup 3902249, :concat 922, :equals nil, :iteration 387376, :difference nil, :clone nil, :union nil, :construct 2193137}, 1000000 {:intersection nil, :remove nil, :insert nil, :lookup 45184233, :concat 507, :equals nil, :iteration 2320294, :difference nil, :clone nil, :union nil, :construct 12813206}, 316227 {:intersection nil, :remove nil, :insert nil, :lookup 10253465, :concat 803, :equals nil, :iteration 701969, :difference nil, :clone nil, :union nil, :construct 4102486}, 1778 {:intersection nil, :remove nil, :insert nil, :lookup 10448, :concat 420, :equals nil, :iteration 3737, :difference nil, :clone nil, :union nil, :construct 18476}, 31 {:intersection nil, :remove nil, :insert nil, :lookup 47, :concat 353, :equals nil, :iteration 31, :difference nil, :clone nil, :union nil, :construct 217}, 10000 {:intersection nil, :remove nil, :insert nil, :lookup 73155, :concat 655, :equals nil, :iteration 21406, :difference nil, :clone nil, :union nil, :construct 112118}, 1000 {:intersection nil, :remove nil, :insert nil, :lookup 3876, :concat 727, :equals nil, :iteration 2241, :difference nil, :clone nil, :union nil, :construct 11107}, 56 {:intersection nil, :remove nil, :insert nil, :lookup 85, :concat 788, :equals nil, :iteration 294, :difference nil, :clone nil, :union nil, :construct 558}, 562341 {:intersection nil, :remove nil, :insert nil, :lookup 46099574, :concat 1053, :equals nil, :iteration 1259165, :difference nil, :clone nil, :union nil, :construct 7316685}, 316 {:intersection nil, :remove nil, :insert nil, :lookup 1341, :concat 271, :equals nil, :iteration 763, :difference nil, :clone nil, :union nil, :construct 3330}, 100 {:intersection nil, :remove nil, :insert nil, :lookup 414, :concat 398, :equals nil, :iteration 367, :difference nil, :clone nil, :union nil, :construct 922}, 3162 {:intersection nil, :remove nil, :insert nil, :lookup 19158, :concat 902, :equals nil, :iteration 6550, :difference nil, :clone nil, :union nil, :construct 34028}, 56234 {:intersection nil, :remove nil, :insert nil, :lookup 1173421, :concat 679, :equals nil, :iteration 119708, :difference nil, :clone nil, :union nil, :construct 668771}, 17 {:intersection nil, :remove nil, :insert nil, :lookup 28, :concat 288, :equals nil, :iteration 22, :difference nil, :clone nil, :union nil, :construct 132}, 17782 {:intersection nil, :remove nil, :insert nil, :lookup 164131, :concat 649, :equals nil, :iteration 37101, :difference nil, :clone nil, :union nil, :construct 201296}, 562 {:intersection nil, :remove nil, :insert nil, :lookup 2269, :concat 342, :equals nil, :iteration 1358, :difference nil, :clone nil, :union nil, :construct 5538}, 5623 {:intersection nil, :remove nil, :insert nil, :lookup 34842, :concat 653, :equals nil, :iteration 11716, :difference nil, :clone nil, :union nil, :construct 62541}, 10 {:intersection nil, :remove nil, :insert nil, :lookup 19, :concat 265, :equals nil, :iteration 18, :difference nil, :clone nil, :union nil, :construct 118}, 31622 {:intersection nil, :remove nil, :insert nil, :lookup 328633, :concat 816, :equals nil, :iteration 66726, :difference nil, :clone nil, :union nil, :construct 363202}, 177 {:intersection nil, :remove nil, :insert nil, :lookup 747, :concat 241, :equals nil, :iteration 548, :difference nil, :clone nil, :union nil, :construct 1677}}, "bifurcan.LinearList" {100000 {:intersection nil, :remove nil, :insert nil, :lookup 253203, :concat 132, :equals nil, :iteration 69761, :difference nil, :clone nil, :union nil, :construct 463192}, 177827 {:intersection nil, :remove nil, :insert nil, :lookup 284300, :concat 129, :equals nil, :iteration 123582, :difference nil, :clone nil, :union nil, :construct 1670219}, 1000000 {:intersection nil, :remove nil, :insert nil, :lookup 1975906, :concat 135, :equals nil, :iteration 709329, :difference nil, :clone nil, :union nil, :construct 9882679}, 316227 {:intersection nil, :remove nil, :insert nil, :lookup 1869626, :concat 130, :equals nil, :iteration 219949, :difference nil, :clone nil, :union nil, :construct 4434546}, 1778 {:intersection nil, :remove nil, :insert nil, :lookup 2415, :concat 190, :equals nil, :iteration 1254, :difference nil, :clone nil, :union nil, :construct 9076}, 31 {:intersection nil, :remove nil, :insert nil, :lookup 45, :concat 187, :equals nil, :iteration 35, :difference nil, :clone nil, :union nil, :construct 189}, 10000 {:intersection nil, :remove nil, :insert nil, :lookup 16400, :concat 189, :equals nil, :iteration 6930, :difference nil, :clone nil, :union nil, :construct 59852}, 1000 {:intersection nil, :remove nil, :insert nil, :lookup 1257, :concat 187, :equals nil, :iteration 729, :difference nil, :clone nil, :union nil, :construct 4923}, 56 {:intersection nil, :remove nil, :insert nil, :lookup 81, :concat 189, :equals nil, :iteration 58, :difference nil, :clone nil, :union nil, :construct 348}, 562341 {:intersection nil, :remove nil, :insert nil, :lookup 888722, :concat 130, :equals nil, :iteration 415113, :difference nil, :clone nil, :union nil, :construct 9194265}, 316 {:intersection nil, :remove nil, :insert nil, :lookup 402, :concat 188, :equals nil, :iteration 257, :difference nil, :clone nil, :union nil, :construct 1938}, 100 {:intersection nil, :remove nil, :insert nil, :lookup 135, :concat 189, :equals nil, :iteration 97, :difference nil, :clone nil, :union nil, :construct 603}, 3162 {:intersection nil, :remove nil, :insert nil, :lookup 4462, :concat 189, :equals nil, :iteration 2201, :difference nil, :clone nil, :union nil, :construct 15648}, 56234 {:intersection nil, :remove nil, :insert nil, :lookup 140044, :concat 148, :equals nil, :iteration 39371, :difference nil, :clone nil, :union nil, :construct 279259}, 17 {:intersection nil, :remove nil, :insert nil, :lookup 26, :concat 163, :equals nil, :iteration 23, :difference nil, :clone nil, :union nil, :construct 150}, 17782 {:intersection nil, :remove nil, :insert nil, :lookup 40452, :concat 187, :equals nil, :iteration 12368, :difference nil, :clone nil, :union nil, :construct 112391}, 562 {:intersection nil, :remove nil, :insert nil, :lookup 706, :concat 189, :equals nil, :iteration 423, :difference nil, :clone nil, :union nil, :construct 3746}, 5623 {:intersection nil, :remove nil, :insert nil, :lookup 8145, :concat 189, :equals nil, :iteration 3914, :difference nil, :clone nil, :union nil, :construct 32593}, 10 {:intersection nil, :remove nil, :insert nil, :lookup 16, :concat 161, :equals nil, :iteration 17, :difference nil, :clone nil, :union nil, :construct 86}, 31622 {:intersection nil, :remove nil, :insert nil, :lookup 77626, :concat 190, :equals nil, :iteration 21930, :difference nil, :clone nil, :union nil, :construct 146753}, 177 {:intersection nil, :remove nil, :insert nil, :lookup 231, :concat 188, :equals nil, :iteration 161, :difference nil, :clone nil, :union nil, :construct 1031}}, "clojure.core.rrb-vector" {100000 {:intersection nil, :remove nil, :insert nil, :lookup 2813811, :concat 29900, :equals nil, :iteration 1835797, :difference nil, :clone nil, :union nil, :construct 1100608}, 177827 {:intersection nil, :remove nil, :insert nil, :lookup 5832712, :concat 33432, :equals nil, :iteration 3267179, :difference nil, :clone nil, :union nil, :construct 1964552}, 1000000 {:intersection nil, :remove nil, :insert nil, :lookup 63871894, :concat 42625, :equals nil, :iteration 18548726, :difference nil, :clone nil, :union nil, :construct 11052688}, 316227 {:intersection nil, :remove nil, :insert nil, :lookup 22443835, :concat 35833, :equals nil, :iteration 5833114, :difference nil, :clone nil, :union nil, :construct 3503652}, 1778 {:intersection nil, :remove nil, :insert nil, :lookup 23919, :concat 10418, :equals nil, :iteration 26883, :difference nil, :clone nil, :union nil, :construct 19724}, 31 {:intersection nil, :remove nil, :insert nil, :lookup 190, :concat 659, :equals nil, :iteration 303, :difference nil, :clone nil, :union nil, :construct 495}, 10000 {:intersection nil, :remove nil, :insert nil, :lookup 143060, :concat 7720, :equals nil, :iteration 153119, :difference nil, :clone nil, :union nil, :construct 109576}, 1000 {:intersection nil, :remove nil, :insert nil, :lookup 10760, :concat 15333, :equals nil, :iteration 11649, :difference nil, :clone nil, :union nil, :construct 11064}, 56 {:intersection nil, :remove nil, :insert nil, :lookup 556, :concat 838, :equals nil, :iteration 623, :difference nil, :clone nil, :union nil, :construct 752}, 562341 {:intersection nil, :remove nil, :insert nil, :lookup 49298836, :concat 37877, :equals nil, :iteration 10344969, :difference nil, :clone nil, :union nil, :construct 6150388}, 316 {:intersection nil, :remove nil, :insert nil, :lookup 3437, :concat 5847, :equals nil, :iteration 3622, :difference nil, :clone nil, :union nil, :construct 3584}, 100 {:intersection nil, :remove nil, :insert nil, :lookup 1083, :concat 2958, :equals nil, :iteration 1161, :difference nil, :clone nil, :union nil, :construct 1190}, 3162 {:intersection nil, :remove nil, :insert nil, :lookup 43169, :concat 4663, :equals nil, :iteration 48516, :difference nil, :clone nil, :union nil, :construct 34629}, 56234 {:intersection nil, :remove nil, :insert nil, :lookup 1426155, :concat 21259, :equals nil, :iteration 1033616, :difference nil, :clone nil, :union nil, :construct 619399}, 17 {:intersection nil, :remove nil, :insert nil, :lookup 106, :concat 481, :equals nil, :iteration 173, :difference nil, :clone nil, :union nil, :construct 330}, 17782 {:intersection nil, :remove nil, :insert nil, :lookup 291313, :concat 10993, :equals nil, :iteration 271247, :difference nil, :clone nil, :union nil, :construct 194009}, 562 {:intersection nil, :remove nil, :insert nil, :lookup 6053, :concat 9207, :equals nil, :iteration 6590, :difference nil, :clone nil, :union nil, :construct 6288}, 5623 {:intersection nil, :remove nil, :insert nil, :lookup 75496, :concat 5903, :equals nil, :iteration 85194, :difference nil, :clone nil, :union nil, :construct 61628}, 10 {:intersection nil, :remove nil, :insert nil, :lookup 62, :concat 378, :equals nil, :iteration 108, :difference nil, :clone nil, :union nil, :construct 236}, 31622 {:intersection nil, :remove nil, :insert nil, :lookup 629648, :concat 16872, :equals nil, :iteration 488417, :difference nil, :clone nil, :union nil, :construct 347757}, 177 {:intersection nil, :remove nil, :insert nil, :lookup 1932, :concat 3828, :equals nil, :iteration 2042, :difference nil, :clone nil, :union nil, :construct 2066}}} ================================================ FILE: doc/benchmarks/data/concat.csv ================================================ size,bifurcan.List,java.ArrayList,clojure.PersistentVector,vavr.Vector,scala.Vector,paguro.RrbTree,bifurcan.LinearList,clojure.core.rrb-vector 10,13.6,11.4,62.5,15.0,16.6,26.5,16.1,37.8 17,8.176471,8.058824,44.235294,10.0,10.0,16.941177,9.588235,28.294117 31,4.483871,3.483871,31.548388,6.774194,5.419355,11.387096,6.032258,21.258064 56,2.5714285,2.8035715,27.428572,5.464286,5.017857,14.071428,3.375,14.964286 100,2.41,2.31,23.35,5.72,5.18,3.98,1.89,29.58 177,1.7740113,2.1694915,21.17514,4.8248587,3.3107345,1.3615819,1.0621469,21.62712 316,1.25,2.1677215,19.256329,3.8544304,2.335443,0.8575949,0.5949367,18.503164 562,0.97153026,1.8505338,18.604982,3.955516,1.9306049,0.60854095,0.33629894,16.382563 1000,0.639,1.767,17.906,3.542,1.546,0.727,0.187,15.333 1778,0.82452196,1.6310462,17.655231,3.6091113,1.4150732,0.23622048,0.10686164,5.8593926 3162,0.10974067,1.7125237,17.795067,4.4756484,1.5623024,0.2852625,0.059772298,1.4746996 5623,0.073092654,1.7412413,17.545439,4.21252,1.4636315,0.11613018,0.03361195,1.0497955 10000,0.0447,1.8509,17.5732,4.5135,1.2487,0.0655,0.0189,0.772 17782,0.0351479,1.9495558,17.450119,4.1053877,1.4498931,0.03649758,0.010516252,0.6182094 31622,0.019796344,2.2976408,17.373095,4.3760357,1.4443742,0.02580482,0.006008475,0.5335526 56234,0.026229683,2.1811893,17.619324,4.3532915,1.2567486,0.012074546,0.0026318596,0.37804532 100000,0.00297,2.24121,17.79133,4.42338,1.28174,0.00898,0.00132,0.299 177827,0.0018726066,2.31764,17.851732,4.433815,1.2783042,0.0051848143,7.2542415E-4,0.18800294 316227,0.0011510719,6.0872726,17.925129,4.502373,1.284359,0.002539315,4.110971E-4,0.11331417 562341,8.535746E-4,7.5673695,18.17226,4.511332,1.2696211,0.0018725293,2.3117645E-4,0.06735593 1000000,4.81E-4,5.719794,18.577366,4.775235,0.936365,5.07E-4,1.35E-4,0.042625 ================================================ FILE: doc/benchmarks/data/list_construct.csv ================================================ size,bifurcan.List,java.ArrayList,clojure.PersistentVector,vavr.Vector,scala.Vector,paguro.RrbTree,bifurcan.LinearList,clojure.core.rrb-vector 10,9.4,6.0,24.9,122.8,9.8,11.8,8.6,23.6 17,9.058824,7.7647057,22.588236,125.82353,6.9411764,7.7647057,8.823529,19.411764 31,6.064516,6.645161,18.096775,135.64516,4.967742,7.0,6.096774,15.967742 56,8.035714,7.071429,16.071428,151.92857,5.803571,9.964286,6.214286,13.428572 100,7.91,6.62,15.3,170.31,5.65,9.22,6.03,11.9 177,7.9265537,8.853107,14.734464,160.52542,5.2768364,9.474576,5.8248587,11.672317 316,7.455696,7.990506,14.35443,167.99051,5.313291,10.537974,6.132911,11.341772 562,7.524911,9.339858,14.231317,173.57474,5.1921706,9.854093,6.6654806,11.188612 1000,7.284,7.784,14.172,180.283,5.072,11.107,4.923,11.064 1778,7.2581553,6.5410576,14.166479,182.65804,5.0646796,10.391451,5.104612,11.093364 3162,7.2046175,6.6514864,15.516445,191.10689,5.0246677,10.761543,4.9487667,10.951612 5623,7.386804,6.4266405,15.535657,196.15543,4.969945,11.1223545,5.796372,10.959986 10000,7.2806,8.4652,15.4981,179.5069,4.9662,11.2118,5.9852,10.9576 17782,7.3116074,6.484254,15.474975,209.4459,4.9549546,11.320211,6.3204927,10.910415 31622,7.2696857,8.90766,15.500917,219.30981,4.7470746,11.485738,4.6408515,10.997312 56234,7.403955,6.7953553,15.512821,208.06172,4.4433618,11.892645,4.966017,11.01467 100000,6.83919,6.66432,15.61274,190.86024,4.46145,12.22418,4.63192,11.00608 177827,7.2658763,7.4858656,15.6791935,197.01678,4.4837565,12.332981,9.392382,11.047546 316227,7.426023,12.031664,15.7804365,203.03307,4.428657,12.973231,14.0233,11.079547 562341,7.748018,17.517395,15.697704,195.32433,4.542214,13.011118,16.349981,10.937115 1000000,7.310803,14.379088,15.673957,203.99283,4.582003,12.813206,9.882679,11.052688 ================================================ FILE: doc/benchmarks/data/list_iterate.csv ================================================ size,bifurcan.List,java.ArrayList,clojure.PersistentVector,vavr.Vector,scala.Vector,paguro.RrbTree,bifurcan.LinearList,clojure.core.rrb-vector 10,1.4,1.6,1.7,1.8,3.6,1.8,1.7,10.8 17,1.117647,1.2352941,1.2941177,1.4117647,3.1176472,1.2941177,1.3529412,10.176471 31,0.9032258,1.0,1.1290323,1.1612903,2.2903225,1.0,1.1290323,9.774194 56,1.5178572,0.875,1.7678572,1.8392857,2.0535715,5.25,1.0357143,11.125 100,2.18,0.79,1.81,1.92,2.13,3.67,0.97,11.61 177,2.220339,0.7288136,1.8474576,1.9491526,2.0056498,3.0960453,0.90960455,11.536723 316,2.205696,0.65189874,1.7974683,1.9398735,1.8765823,2.414557,0.81329113,11.462026 562,2.208185,0.6049822,1.7829181,1.8790035,1.8291816,2.4163702,0.75266904,11.725979 1000,2.204,0.582,1.779,1.909,1.794,2.241,0.729,11.649 1778,1.4386952,0.56299216,1.8335208,1.78009,1.831271,2.1017997,0.70528686,15.119798 3162,1.4111322,0.5553447,1.8358634,1.771031,1.7931689,2.0714738,0.6960784,15.343453 5623,1.4131247,0.5568202,1.8289169,1.772897,1.790859,2.0835853,0.6960697,15.150987 10000,1.4238,0.5572,1.8315,1.7793,1.7861,2.1406,0.693,15.3119 17782,1.4203689,0.55775505,1.820324,1.7672366,1.7702171,2.0864358,0.6955348,15.254021 31622,1.4137626,0.5536336,1.8266397,1.769496,1.7614319,2.1101131,0.6935045,15.445481 56234,1.6594943,0.55566025,1.8979799,1.8362023,2.0761282,2.1287477,0.700128,18.380625 100000,1.67543,0.56148,1.91083,1.83461,2.07209,2.14485,0.69761,18.35797 177827,1.6713041,0.55777246,1.9065102,1.8242562,2.0584614,2.178387,0.69495636,18.372795 316227,1.6643709,0.55767536,1.9050903,1.8317759,2.070364,2.2198262,0.6955415,18.44597 562341,1.6746601,0.55650574,1.9053137,1.847212,2.061694,2.2391484,0.7381873,18.396255 1000000,1.745738,0.562946,1.935245,1.891016,2.107386,2.320294,0.709329,18.548725 ================================================ FILE: doc/benchmarks/data/list_lookup.csv ================================================ size,bifurcan.List,java.ArrayList,clojure.PersistentVector,vavr.Vector,scala.Vector,paguro.RrbTree,bifurcan.LinearList,clojure.core.rrb-vector 10,1.7,1.4,4.8,1.8,1.9,1.9,1.6,6.2 17,1.5882353,1.2941177,4.7647057,1.5882353,1.8235294,1.6470588,1.5294118,6.2352943 31,1.483871,1.2580645,4.709677,1.483871,1.7419355,1.516129,1.451613,6.129032 56,3.0535715,1.3214285,6.375,3.125,2.357143,1.5178572,1.4464285,9.928572 100,3.63,1.27,7.68,3.05,2.56,4.14,1.35,10.83 177,3.4745762,1.2542373,7.3502827,2.9096045,2.6271186,4.220339,1.3050847,10.915255 316,3.506329,1.2468355,7.844937,2.943038,2.7246835,4.243671,1.272152,10.876582 562,3.608541,1.2419928,7.640569,2.8701067,2.8790035,4.0373664,1.2562277,10.770463 1000,3.614,1.236,7.5,3.024,2.936,3.876,1.257,10.76 1778,7.7165356,1.2485939,9.169854,6.038808,4.84027,5.8762655,1.3582677,13.452756 3162,7.7179003,1.245098,9.3390255,6.066414,5.9930425,6.0588236,1.4111322,13.652435 5623,7.695892,1.2532456,9.659434,6.1563225,6.353904,6.1963363,1.448515,13.426285 10000,8.4389,1.5085,11.4617,6.9566,6.9497,7.3155,1.64,14.306 17782,10.189462,2.159712,13.280733,8.604038,8.824204,9.230177,2.2748847,16.382465 31622,11.259218,2.19439,15.286858,9.492474,9.626589,10.392543,2.45481,19.911707 56234,16.073248,2.3088167,29.581766,8.739464,13.578423,20.866753,2.4903796,25.36108 100000,17.46262,2.32778,22.28634,10.32512,14.43229,22.792,2.53203,28.13811 177827,17.224106,1.5161252,24.876661,13.565139,10.459368,21.944075,1.5987449,32.799923 316227,24.521938,1.4419926,49.83222,15.486619,13.8668585,32.42438,5.912291,70.97381 562341,29.629885,6.4323745,40.65725,24.448853,20.996775,81.977974,1.5803969,87.66715 1000000,33.788033,1.873249,65.53855,35.926792,18.855904,45.184235,1.975906,63.871895 ================================================ FILE: doc/crrbv-27/description.md ================================================ The production version of the Clojure/Java core.rrb-vector library is in the directory `src/main/clojure`. It uses a maximum tree branching factor of 32, i.e. all tree nodes have at most 32 children, the same as Clojure's built in persistent vectors. This is good for look-up efficiency, but when testing the code, it requires a large number of vector elements and/or operations on the vectors in order to reach "interesting" tree structures that exercise all parts of the code, and find bugs that may be there. The `src/parameterized/clojure` directory contains source code for a modified version of the core.rrb-vector library, which uses parameters defined in file `src/parameterized/clojure/clojure/core/rrb_vector/parameters.clj` to control the maximum tree branching factor. The existing code allows you to change the value of `shift-increment` to any value that is 2 or larger, and the maximum tree branching factor will then be 2 to the power of `shift-increment`. I have found a bug in the core.rrb-vector library that I do not yet fully understand. I have a patch to the code that causes the problem not to occur, but I do not understand the original or modified code well enough to be confident that it is a correct fix. I do have an easily reproducible test case that causes the problem to occur with the parameterized version of the code when `shift-increment` is 2, so a maximum tree branch factor of 4. I believe it is likely that a different test sequence could be found that exhibits the same bug with the production code's branch factor of 32, but it would likely be a far longer sequence of operations, e.g. perhaps as long as millions of operations or more. Here is a way to reproduce the problem with the parameterized version of the code. ```bash git clone https://github.com/clojure/core.rrb-vector cd core.rrb-vector git checkout f69df0f95e450bb4ff8e3294f3265d3d25f4e5db patch -p1 < use-shift-increment-2.patch ``` You can see these tests fail: ```bash ./script/jdo check ``` To reproduce the same problem in the REPL, it takes a few more steps, shown below. ```bash # Start a JVM with a socket REPL listening on TCP port 50505 ./script/jdo ``` ```clojure ;; Connect to the socket REPL from your favorite dev environment, via ;; either a socket REPL, nREPL, whatever you prefer and know how to ;; set up. (require '[clojure.core.rrb-vector :as fv] '[clojure.core.rrb-vector.debug :as dv] '[clojure.core.rrb-vector.rrbt :as rrbt]) ;; Enable full debug options, except trace printing is off, for all ;; checking-* functions in the debug namespace. (dv/set-debug-opts! dv/full-debug-opts) ;; The shortest sequence of operations I currently know that reaches ;; the point where the bug occurs only after about 2000 calls to ;; insert-by-sub-vcatvec, with a particular sequence of arguments. ;; It takes a couple of minutes to get there when using ;; checking-catvec and checking-subvec, but we can capture the vectors ;; that cause the problem once we get there, and as long as we keep ;; the same JVM running we can examine their contents all we want. (def my-catvec dv/checking-catvec) (def my-subvec dv/checking-subvec) ;; This code is slightly modified from that in the deftest named ;; test-reduce-subvec-catvec in the test-common namespace. (defn insert-at-index-n [v n] (my-catvec (my-subvec v 0 n) (dv/cvec ['x]) (my-subvec v n))) (defn insert-by-sub-catvec [v [n sz]] (let [ret (insert-at-index-n v n)] (when (or (< n 20) (zero? (mod n 10))) (println "n=" n)) (let [ret-nums (filter number? ret)] (if (not= (filter number? ret) (range sz)) (throw (ex-info (str "Failure for sz=" sz " n=" n) {:v v :sz sz :n n})))) ret)) (defn repeated-subvec-catvec [sz] (reduce insert-by-sub-catvec (dv/cvec (range sz)) (map (fn [x] [x sz]) (range sz 0 -1)))) (count @dv/failure-data) ;; should be 0 initially (def sz 2061) (def x (repeated-subvec-catvec sz)) ;; That caused an exception to be thrown just after "n= 15" was printed ;; Record the exception in e1 (def e1 *e) (count @dv/failure-data) ;; should be 1 now, since data about 1 error has been appended to the vector ;; @dv/failure-data. ;; extract the data from the error (def ed (nth @dv/failure-data 0)) (keys ed) (:err-desc-str ed) ;; "splice-rrbts-main" (def vret (:ret ed)) (count (:args ed)) ;; should be 4. We only care about the last two args, which are ;; the two vectors that we were trying to concatenate. (def v1 (nth (:args ed) 2)) (def v2 (nth (:args ed) 3)) (def errinf (:error-info ed)) errinf ;; {:error true, :description "One or more errors found", :data ({:error true, :kind :internal, :description "Found internal non-regular node with 1 non-nil, 3 nil children, and # children prefix sums: (39) - expected that to match stored ranges: (33 0 0 0 1)"})} (dv/dbg-vec v1) (dv/dbg-vec v2) (dv/dbg-vec vret) ``` In the long output of the dbg-vec on vret, there is only one node that has a ranges array printed as (33 0 0 0 1), so that must be the one that the error message above is referring to. Here are the few lines before and after that node, which are near the beginning of the output: ``` Vector (4109 elements): 16:00 PersistentVector$Node: (33 4108 0 0 2) 14:00 PersistentVector$Node: (33 0 0 0 1) 12:00 PersistentVector$Node: (37 39 0 0 2) 10:00 PersistentVector$Node: (21 29 35 37 4) ``` The line starting with 14:00 is the node that has the reported problem. Its first child is the node printed on the line starting with 12:00. Note that its ranges array contains (37 39 0 0 2), so that node has 2 children, the first with 37 vector elements as leaves beneath it, the second with 39-37=2 vector elements with leaves beneath it (if those values are correct -- I believe they are, else ranges-errors would have complained about those before its parent node). The node 14:00 should have a ranges array (39 0 0 0 1), but it is (33 0 0 0 1), so it is under-counting the number of vector elements beneath it by 6. Its parent node, the one printed on line 16:00, is consistent with node 14:00, but also too small by 6. ================================================ FILE: doc/crrbv-27/proposed-fix-needs-thought-and-testing-plus-debug-prints.patch ================================================ diff --git a/src/parameterized/clojure/clojure/core/rrb_vector/debug.clj b/src/parameterized/clojure/clojure/core/rrb_vector/debug.clj index 27156d2..8208cae 100644 --- a/src/parameterized/clojure/clojure/core/rrb_vector/debug.clj +++ b/src/parameterized/clojure/clojure/core/rrb_vector/debug.clj @@ -248,28 +248,31 @@ (defn slow-into [to from] (reduce conj to from)) +(defn all-nodes-in-subtree [node node-shift get-array regular?] + (letfn [(go [depth shift node] + (if node + (if (not= shift 0) + (cons + {:depth depth :shift shift :kind :internal :node node} + (apply concat + (map (partial go (inc depth) (- shift p/shift-increment)) + (let [arr (get-array node)] + (if (regular? node) + arr + (butlast arr)))))) + (cons {:depth depth :shift shift :kind :internal :node node} + (map (fn [x] + {:depth (inc depth) :kind :leaf :value x}) + (get-array node))))))] + (go 1 node-shift node))) + (defn all-vector-tree-nodes [v] (let [{:keys [v get-root get-shift get-array regular?]} (pd/unwrap-subvec-accessors-for v) root (get-root v) shift (get-shift v)] - (letfn [(go [depth shift node] - (if node - (if (not= shift 0) - (cons - {:depth depth :shift shift :kind :internal :node node} - (apply concat - (map (partial go (inc depth) (- shift p/shift-increment)) - (let [arr (get-array node)] - (if (regular? node) - arr - (butlast arr)))))) - (cons {:depth depth :shift shift :kind :internal :node node} - (map (fn [x] - {:depth (inc depth) :kind :leaf :value x}) - (get-array node))))))] - (cons {:depth 0 :kind :base :shift shift :value v} - (go 1 shift root))))) + (cons {:depth 0 :kind :base :shift shift :value v} + (all-nodes-in-subtree root shift get-array regular?)))) ;; All nodes that should be internal nodes are one of the internal ;; node types satisfying internal-node? All nodes that are less @@ -297,6 +300,11 @@ ;; a straightforward sanity check to make, to return an error if a ;; non-regular node is found with a regular ancestor in the tree. +(defn basic-node-errors-subtree [node node-shift] + + ) + + (defn basic-node-errors [v] (let [{:keys [v get-shift]} (pd/unwrap-subvec-accessors-for v) shift (get-shift v) @@ -500,6 +508,48 @@ (/ (* 1.0 tail-off) max-tree-cap))) +(defn ranges-errors-subtree + "Whereas ranges-errors performs checks on the contents of the entire + tree that represents a vector, ranges-errors-subtree performs these + checks on any subtree of such a tree, rooted at the given node. You + must also supply a shift value, since this is not stored in the data + itself." + ([node node-shift get-array regular? get-ranges] + (ranges-errors-subtree node node-shift get-array regular? get-ranges + false nil)) + ([node node-shift get-array regular? get-ranges do-root-checks? root-node-cnt] + (letfn [ + (go [shift node] + (cond + (nil? node) {:error false :kind :nil} + (zero? shift) (let [n (count (get-array node))] + (merge {:error (zero? n), :kind :leaves, + :full? (= n p/max-branches), :count n} + (if (zero? n) + {:description + (str "Leaf array has 0 elements." + " Expected > 0.")}))) + :else ;; non-0 shift + (let [children (map (partial go (- shift p/shift-increment)) + (let [arr (get-array node)] + (if (regular? node) + arr + (butlast arr)))) + errs (filter :error children)] + (cond + (seq errs) {:error true, :description "One or more errors found", + :data errs} + (not= p/max-branches (count children)) + {:error true, :kind :internal, + :description (str "Found internal node that has " + (count children) " children - expected p/max-branches.")} + (regular? node) (regular-node-errors (and do-root-checks? + (= shift node-shift)) + root-node-cnt children) + :else (non-regular-node-errors node get-ranges children)))))] + (go node-shift node)))) + + (defn ranges-errors [v] (let [{:keys [v get-root get-shift get-tail get-cnt get-array get-ranges regular? tail-len]} @@ -508,70 +558,48 @@ root-node-cnt (count v) root-shift (get-shift v) tail-off (pd/dbg-tailoff v) - tail (get-tail v)] - (letfn [ - (go [shift node] - (cond - (nil? node) {:error false :kind :nil} - (zero? shift) (let [n (count (get-array node))] - (merge {:error (zero? n), :kind :leaves, - :full? (= n p/max-branches), :count n} - (if (zero? n) - {:description - (str "Leaf array has 0 elements." - " Expected > 0.")}))) - :else ;; non-0 shift - (let [children (map (partial go (- shift p/shift-increment)) - (let [arr (get-array node)] - (if (regular? node) - arr - (butlast arr)))) - errs (filter :error children)] - (cond - (seq errs) {:error true, :description "One or more errors found", - :data errs} - (not= p/max-branches (count children)) - {:error true, :kind :internal, - :description (str "Found internal node that has " - (count children) " children - expected p/max-branches.")} - (regular? node) (regular-node-errors (= shift root-shift) - root-node-cnt children) - :else (non-regular-node-errors node get-ranges children)))))] - (let [x (go root-shift root)] - (cond - (:error x) x - (not= tail-off (:count x)) - {:error true, :kind :root, - :description (str "Found tail-off=" tail-off " != " (:count x) - "=count of values beneath internal nodes") - :internal-node-leaf-count (:count x) :tail-off tail-off - :cnt (get-cnt v)} - (and (pd/transient-vector? v) - (not= (tail-len tail) p/max-branches)) - {:error true, :kind :root, - :description (str "Found transient vector with tail length " - (tail-len tail) " - expecting p/max-branches")} - ;; It is always a bad thing if shift becomes more than 32, - ;; because the bit-shift-left and bit-shift-right operations - ;; on 32-bit ints actually behave like (bit-shift-left - ;; x (mod shift-amount 32)) for shift-amount over 32. It is - ;; also likely a bug in the implementation if that happens. - (>= root-shift 32) - {:error true, :kind :root, - :description (str "shift of root is " root-shift " >= 32," - " which is not supported.")} - ;; This is not necessarily a bug, but it seems likely to be - ;; a bug if a tree is less than 1/max-branches-squared full compared to its - ;; max capacity. 1/(p/max-branches) full is normal when a tree becomes 1 - ;; deeper than it was before. - (< 0 (:count x) (max-capacity-divided-by-max-branches-squared root-shift)) - {:error false, :warning true, :kind :root-too-deep, - :description (str "For root shift=" root-shift " the maximum " - "capacity divided by p/max-branches-squared is " - (max-capacity-divided-by-max-branches-squared root-shift) - " but the tree contains only " - (:count x) " vector elements outside of the tail")} - :else x))))) + tail (get-tail v) + x (ranges-errors-subtree root root-shift get-array regular? get-ranges + true root-node-cnt)] + (cond + (:error x) x + + (not= tail-off (:count x)) + {:error true, :kind :root, + :description (str "Found tail-off=" tail-off " != " (:count x) + "=count of values beneath internal nodes") + :internal-node-leaf-count (:count x) :tail-off tail-off + :cnt (get-cnt v)} + + (and (pd/transient-vector? v) + (not= (tail-len tail) p/max-branches)) + {:error true, :kind :root, + :description (str "Found transient vector with tail length " + (tail-len tail) " - expecting p/max-branches")} + + ;; It is always a bad thing if shift becomes more than 32, + ;; because the bit-shift-left and bit-shift-right operations + ;; on 32-bit ints actually behave like (bit-shift-left + ;; x (mod shift-amount 32)) for shift-amount over 32. It is + ;; also likely a bug in the implementation if that happens. + (>= root-shift 32) + {:error true, :kind :root, + :description (str "shift of root is " root-shift " >= 32," + " which is not supported.")} + + ;; This is not necessarily a bug, but it seems likely to be + ;; a bug if a tree is less than 1/max-branches-squared full compared to its + ;; max capacity. 1/(p/max-branches) full is normal when a tree becomes 1 + ;; deeper than it was before. + (< 0 (:count x) (max-capacity-divided-by-max-branches-squared root-shift)) + {:error false, :warning true, :kind :root-too-deep, + :description (str "For root shift=" root-shift " the maximum " + "capacity divided by p/max-branches-squared is " + (max-capacity-divided-by-max-branches-squared root-shift) + " but the tree contains only " + (:count x) " vector elements outside of the tail")} + + :else x))) #_(defn add-return-value-checks [f err-desc-str return-value-check-fn] (fn [& args] @@ -1119,7 +1147,7 @@ "(type v1)=" (type v1) "(type v2)=" (type v2))) (let [r1 (checking-splice-rrbts-main nm am v1 v2) - r2 (rrbt/peephole-optimize-root r1)] + r2 (rrbt/peephole-optimize-root v1 v2 r1)] ;; Optimize a bit by only doing all of the sanity checks on r2 ;; if it is not the same identical data structure r1 that ;; checking-splice-rrbts-main already checked. diff --git a/src/parameterized/clojure/clojure/core/rrb_vector/rrbt.clj b/src/parameterized/clojure/clojure/core/rrb_vector/rrbt.clj index 9548b40..ee61ef4 100644 --- a/src/parameterized/clojure/clojure/core/rrb_vector/rrbt.clj +++ b/src/parameterized/clojure/clojure/core/rrb_vector/rrbt.clj @@ -1387,6 +1387,8 @@ (map list (take gcs arr) (take gcs (map - rngs (cons 0 rngs))))))] (mapcat cseq (take cs arr) (take cs (map - rngs (cons 0 rngs)))))) +(def extra-trace (atom false)) + (defn rebalance [^NodeManager nm ^ArrayManager am shift n1 cnt1 n2 cnt2 ^Box transferred-leaves] (if (nil? n2) @@ -1398,14 +1400,24 @@ sbc2 (subtree-branch-count nm am n2 shift) p (+ sbc1 sbc2) e (- a (inc (quot (dec p) p/max-branches)))] + (when @extra-trace + (println "dbg rebalance #1 shift=" shift + "cnt1=" cnt1 "cnt2=" cnt2) + (println "dbg rebalance #2 slc1=" slc1 " slc2=" slc2 + "a=" a "sbc1=" sbc1 "sbc2=" sbc2 "p=" p "e=" e)) (cond (<= e max-extra-search-steps) - (pair n1 n2) + (do + (when @extra-trace + (println "dbg rebalance #3a")) + (pair n1 n2)) (<= (+ sbc1 sbc2) p/max-branches-squared) (let [new-arr (object-array p/non-regular-array-len) new-rngs (int-array p/non-regular-array-len) new-n1 (.node nm nil new-arr)] + (when @extra-trace + (println "dbg rebalance #3b")) (loop [i 0 bs (partition-all p/max-branches (concat (child-seq nm n1 shift cnt1) @@ -1427,7 +1439,21 @@ (aset new-rngs p/max-branches (inc i)) (recur (inc i) (next bs))))) (aset new-arr p/max-branches new-rngs) - (set! (.-val transferred-leaves) cnt2) + +;; (when @extra-trace +;; (println "dbg rebalance #3 cnt2=" cnt2 "-> transferred-leaves")) +;; (set! (.-val transferred-leaves) cnt2) + + (when-not (zero? (.-val transferred-leaves)) + (println "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX") + (println "dbg CHANGED rebalance #3" + "shift=" shift + "cnt1=" cnt1 + "cnt2=" cnt2 + "cnt2 added to transferred-leaves=" (.-val transferred-leaves))) + (set! (.-val transferred-leaves) + (+ (.-val transferred-leaves) cnt2)) + (pair new-n1 nil)) :else @@ -1437,6 +1463,8 @@ new-rngs2 (int-array p/non-regular-array-len) new-n1 (.node nm nil new-arr1) new-n2 (.node nm nil new-arr2)] + (when @extra-trace + (println "dbg rebalance #3c")) (loop [i 0 bs (partition-all p/max-branches (concat (child-seq nm n1 shift cnt1) @@ -1457,6 +1485,10 @@ d (if (>= tbs p/max-branches) (aget r li) (- (aget r li) (aget r (- li tbs))))] + (when @extra-trace + (println "dbg rebalance #4 transferred-leaves was" + (.-val transferred-leaves) + "adding d=" d "to it")) (set! (.-val transferred-leaves) (+ (.-val transferred-leaves) d)))) (let [new-arr (if (< i p/max-branches) new-arr1 new-arr2) @@ -1472,7 +1504,21 @@ (aset new-arr2 p/max-branches new-rngs2) (pair new-n1 new-n2)))))) -(defn zippath +(declare zippath) + +(defn show-ranges [msg node nm desc] + (cond + (nil? node) + (println (format "%s %s=nil" msg desc)) + + (= p/non-regular-array-len (alength (.array node))) + (println (format "%s (ranges %s)=" msg desc) + (seq (ranges nm node))) + + :else + (println (format "%s %s has no ranges" msg desc)))) + +(defn zippath-main [^NodeManager nm ^ArrayManager am shift n1 cnt1 n2 cnt2 ^Box transferred-leaves] (if (== shift p/shift-increment) (rebalance-leaves nm am n1 cnt1 n2 cnt2 transferred-leaves) @@ -1494,6 +1540,22 @@ [new-c1 new-c2] (zippath nm am (- shift p/shift-increment) c1 ccnt1 c2 ccnt2 next-transferred-leaves) d (.-val next-transferred-leaves)] + (when @extra-trace + (println "dbg zippath #1 shift=" shift + "cnt1=" cnt1 "(reg? n1)=" (.regular nm n1) + "ccnt1=" ccnt1 + "cnt2=" cnt2 "(reg? n2)=" (.regular nm n2) + "ccnt2=" ccnt2) + (println "dbg zippath #2 d=" d + "(.-val transferred-leaves)=" (.-val transferred-leaves)) + (show-ranges "dbg zippath #3" c1 nm "c1") + (show-ranges "dbg zippath #3" c2 nm "c2") + (show-ranges "dbg zippath #3" new-c1 nm "new-c1") + (show-ranges "dbg zippath #3" new-c2 nm "new-c2") + (println "dbg zippath #4 (identical? c1 new-c1)=" + (identical? c1 new-c1)) + (println "dbg zippath #4 (identical? c2 new-c2)=" + (identical? c2 new-c2))) (set! (.-val transferred-leaves) (+ (.-val transferred-leaves) d)) (rebalance nm am shift (if (identical? c1 new-c1) @@ -1508,6 +1570,18 @@ (- cnt2 d) transferred-leaves)))) +(def zippath-extra-check-fn (atom nil)) + +(defn zippath + [^NodeManager nm ^ArrayManager am shift n1 cnt1 n2 cnt2 ^Box transferred-leaves] + (let [ret (zippath-main nm am shift n1 cnt1 n2 cnt2 transferred-leaves) + [n1 n2] ret] + (let [f @zippath-extra-check-fn] + (when f + (f n1 shift nm) + (f n2 shift nm))) + ret)) + (defn squash-nodes [^NodeManager nm shift n1 cnt1 n2 cnt2] (let [arr1 (.array nm n1) arr2 (.array nm n2) @@ -1615,7 +1689,12 @@ (let [rngs (ranges nm node)] (aget rngs (dec (aget rngs p/max-branches)))))) -(defn peephole-optimize-root [^Vector v] +(def peephole-extra-check (atom nil)) + +(defn peephole-optimize-root [arg1 arg2 ^Vector v] + (let [f @peephole-extra-check] + (if f + (f v arg1 arg2))) (let [config @peephole-optimization-config] (if (<= (.-shift v) (* 2 p/shift-increment)) ;; Tree depth cannot be reduced if shift <= p/shift-increment. @@ -1766,6 +1845,8 @@ s2 (.-shift v2) r1 (.-root v1) o? (overflow? nm r1 s1 (+ (count v1) (- p/max-branches (.alength am (.-tail v1))))) + _ (when @extra-trace + (println "dbg #1 s1=" s1 "s2=" s2 "o?=" o?)) r1 (if o? (let [tail (.-tail v1) tail-node (.node nm nil tail) @@ -1795,9 +1876,24 @@ d (.-val transferred-leaves) ncnt1 (+ (count v1) d) ncnt2 (- (count v2) (.alength am (.-tail v2)) d) + _ (when @extra-trace + (show-ranges "dbg #1b" n1 nm "n1") + (show-ranges "dbg #1b" n2 nm "n2") + (println "dbg #2 s=" s "(class d)=" (class d) "d=" d) + (println "dbg #3 (count v1)=" (count v1) + "(count v2)=" (count v2) + "ncnt1=" ncnt1 "ncnt2=" ncnt2) + (println "dbg #4 (identical? n2 r2)=" + (identical? n2 r2))) [n1 n2] (if (identical? n2 r2) (squash-nodes nm s n1 ncnt1 n2 ncnt2) (object-array (list n1 n2))) + _ (when @extra-trace + (println "dbg #5 (boolean n2)=" (boolean n2)) + (let [al1 (alength (.array n1))] + (if (== al1 p/non-regular-array-len) + (println "dbg #6 n1 ranges=" (seq (aget (.array n1) + p/max-branches)))))) ncnt1 (if n2 (int ncnt1) (unchecked-add-int (int ncnt1) (int ncnt2))) @@ -1826,7 +1922,7 @@ (defn splice-rrbts [^NodeManager nm ^ArrayManager am ^Vector v1 ^Vector v2] (let [r1 (splice-rrbts-main nm am v1 v2) - r2 (peephole-optimize-root r1)] + r2 (peephole-optimize-root v1 v2 r1)] (fallback-to-slow-splice-if-needed v1 v2 r2))) (defn array-copy [^ArrayManager am from i to j len] ================================================ FILE: doc/crrbv-27/proposed-fix-needs-thought-and-testing.patch ================================================ diff --git a/src/main/clojure/clojure/core/rrb_vector/rrbt.clj b/src/main/clojure/clojure/core/rrb_vector/rrbt.clj index 1d231a6..45bfb6b 100644 --- a/src/main/clojure/clojure/core/rrb_vector/rrbt.clj +++ b/src/main/clojure/clojure/core/rrb_vector/rrbt.clj @@ -1426,7 +1426,16 @@ (aset new-rngs 32 (inc i)) (recur (inc i) (next bs))))) (aset new-arr 32 new-rngs) - (set! (.-val transferred-leaves) cnt2) +;; (set! (.-val transferred-leaves) cnt2) + (when-not (zero? (.-val transferred-leaves)) + (println "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX") + (println "dbg CHANGED rebalance #3" + "shift=" shift + "cnt1=" cnt1 + "cnt2=" cnt2 + "cnt2 added to transferred-leaves=" (.-val transferred-leaves))) + (set! (.-val transferred-leaves) + (+ (.-val transferred-leaves) cnt2)) (pair new-n1 nil)) :else diff --git a/src/test/clojure/clojure/core/rrb_vector/long_test.clj b/src/test/clojure/clojure/core/rrb_vector/long_test.clj index b68cf5a..ea2540f 100644 --- a/src/test/clojure/clojure/core/rrb_vector/long_test.clj +++ b/src/test/clojure/clojure/core/rrb_vector/long_test.clj @@ -84,7 +84,7 @@ (defn vector-push-f [v my-catvec extra-checks-catvec] (loop [v v i 0] - (let [check? (or (zero? (mod i 10000)) + (let [check? (or (zero? (mod i 100)) (and (> i 99000) (zero? (mod i 100))) (and (> i 99900) (zero? (mod i 10))))] (when check? ================================================ FILE: doc/crrbv-27/use-shift-increment-2.patch ================================================ diff --git a/deps.edn b/deps.edn index 775cc3e..f03df7a 100644 --- a/deps.edn +++ b/deps.edn @@ -5,8 +5,8 @@ ;; want to test with modifications to it: ;; org.clojure/clojurescript {:local/root "/Users/jafinger/clj/clojurescript"} -{:paths ["src/main/clojure" "src/main/cljs" "src/main/cljc"] - ;;:paths ["src/parameterized/clojure" "src/main/cljs" "src/main/cljc"] +{;;:paths ["src/main/clojure" "src/main/cljs" "src/main/cljc"] + :paths ["src/parameterized/clojure" "src/main/cljs" "src/main/cljc"] :aliases {;; Common alias to use for all Clojure/Java commands :clj {:jvm-opts ["-XX:-OmitStackTraceInFastThrow"]} diff --git a/src/parameterized/clojure/clojure/core/rrb_vector/parameters.clj b/src/parameterized/clojure/clojure/core/rrb_vector/parameters.clj index 2cd5004..c8a58ea 100644 --- a/src/parameterized/clojure/clojure/core/rrb_vector/parameters.clj +++ b/src/parameterized/clojure/clojure/core/rrb_vector/parameters.clj @@ -8,7 +8,7 @@ ;; * when the shift-increment is 2 ;; 5 3 2 -(def shift-increment 5) +(def shift-increment 2) ;; 10 6 4 (def shift-increment-times-2 (* 2 shift-increment)) ================================================ FILE: doc/hash-details.md ================================================ # Background on Clojure collection hash calculation The persistent collections included with Clojure are immutable when accessed via Clojure's published methods, e.g. `conj`, `assoc`, `peek`, `seq`, etc. Their implementation actually uses mutable fields to store cached versions of their Java `.hashCode` and `clojure.core/hash` values (those values are different from each other for most collection values starting with Clojure 1.6.0). On the JVM, all fields of a newly constructed object are first initialized to their default JVM initial values, e.g. 0 for a primitive `int` field, `null` for all references, etc. Then any values assigned in the constructor are assigned. If the field is declared `final`, then as long as the reference to the object is not made visible to any other object before the constructor finishes executing, any thread that later sees the object should see only the value assigned while the constructor executed, not the default JVM initial value. This is promised by the Java Memory Model and the specialness of the `final` field modifier. However, the cached hash fields in Clojure, and some other Java objects, is intentionally stored in non-final fields, so that if no other code ever needs to know the hash of the value, no time is ever spent calculating it. This is a performance optimization. If some other code does later want the hash value, then it is calculated on demand at that time, and the calculated value is written into the field, so that later calls to get the hash value by the same thread are guaranteed to avoid calculating it again. If another thread calls the function/method to get the hash value, because the hash field has no special Java modifiers like `final` or `volatile`, the Java Memory Model says that it might get the updated value, or it might get the value from an older write to that field, e.g. the initial value of 0 from the JVM default initialization of all fields -- _even if_ the constructor assigned a non-0 value to the field. Thus for thread safety of getting hash values of immutable values using this "initialize a default value, and calculate on demand later", if the field where the cached value is stored has no special modifiers like `final` or `volatile`, and the function to get the hash value is not declared `synchronized` (none of which are true for the Clojure implementation), the only safe value to assign in the constructor is none at all (leaving the field as the default JVM initial value of 0), or to assign a value of 0 explicitly. If any other value is assigned during the constructor, e.g. -1, other threads calling the hash function might read a 0 from that field, or -1, depending upon all kinds of factors that are impossible to control or observe from a Java program. Before Clojure 1.9.0, every collection did assign a value of -1 to these fields during the constructor call, which was unsafe. This was fixed in the Clojure 1.9.0 release. See this JIRA ticket for more details: https://clojure.atlassian.net/browse/CLJ-2091 It links to this article on this pattern of writing Java code that has intentional data races, but is still correct according to the Java Memory Model: http://jeremymanson.blogspot.com/2008/12/benign-data-races-in-java.html Note that JavaScript run time environments are single threaded (not counting WebWorkers, but as far as I know, no ClojureScript objects are shared between the main thread and any WebWorker threads via shared memory), so these issues do not arise, and any initial value can be stored in the hash field without a problem. # Java `.hashCode` vs. `clojure.core/hash` Some background and history on why `.hashCode` and `clojure.core/hash` return different values from each other can be found in the Clojure equality guide, especially the ["Equality and hash" section](https://clojure.org/guides/equality#equality_and_hash). # Details in core.rrb-vector Clojure implementation `core.rrb-vector`'s Clojure implementation should use an initial value of the mutable hash fields of 0, for the same reasons described above that any such fields should be initialized to 0 on the JVM. In `core.rrb-vector` release 0.0.14 and earlier, these fields were incorrectly initialized to -1. Again, this does not cause any problems in a single-threaded program, and would only cause problems in some timing-dependent cases (perhaps rarely, but there is no guarantee of this) in a multi-threaded program. This is a list of classes in `core.rrb-vector`'s Clojure implementation that were changed to correct this problem, after the release of version 0.0.14: * `VecSeq` - last field is Clojure _hasheq, second to last is Java _hash * `Vector` - same as `VecSeq` * `Transient` - no hash fields. I believe that calling `clojure.core/hash` on a transient collection falls through to some default hash implementation that is based on the identity of the mutable object, and is not the same for all transients with "the same" contents, the way it is for immutable collections. ## `VecSeq` methods and constructor calls The constructor to `VecSeq` in method `withMeta` actually appears safe to me to initialize the returned object with the same values as the original collection. The hash of the returned collection is guaranteed to be the same as that of the collection on which `withMeta` was called, so the initial values will either be 0, or the correct final hash value. All occurrences of the string VecSeq in the implementation are in the one source file rrbt.clj, so I feel safe in saying I have corrected all constructor calls to VecSeq. I also corrected an unsafe data race in the implementation of method `hasheq` for class `VecSeq`. It was reading the field `_hasheq` twice per call, instead of only once. My fixed version reads that field at most once. See this article for the details of the changes I made and why: http://jeremymanson.blogspot.com/2008/12/benign-data-races-in-java.html ## `Vector` methods and constructor calls The constructor to `Vector` in method `withMeta` actually appears safe to me, for the same reason as it is for the same method in class `VecSeq` described in the previous section. Files containing constructor calls for class `Vector`: * rrbt.clj - many, fixed * rrb_vector.clj - fixed Files checked for occurrences of `-1`: * rrb_vector.clj - none remaining after constructor calls to `Vector` were fixed. * debug.clj - only one, used to return a "not found" value from `first-diff` * fork_join.clj - none, and no constructor calls * interop.clj.clj - none, and no constructor calls * nodes.clj - none, and no constructor calls to `Vector` or `VecSeq`. Many to `VecNode`, but it has no hash fields. * protocols.clj - none, and no constructor calls * rrbt.clj * Many occurrences of `Vector.` constructor calls were updated to use initial hash values of 0 instead of -1. * Fixed racy hash function/method bugs in these methods, for the same reason as described in previous section. * Vector/hashCode * Vector/hasheq * VecSeq/hasheq * transients.clj - none, and no mentions of `Vector` or `Vecseq` or hash # Details in core.rrb-vector ClojureScript implementation Since there are no thread-safety issues here, the only thing to double check is that the same value (e.g. 0, -1, or whatever constant value) is used consistently in all places where a new object is created that contains a cached hash field, and wherever the hash function is calculated. Classes in the ClojureScript `core.rrb-vector` implementation with a hash field: * `RRBChunkedSeq` - the last field of the constructor call, named `__hash` * `Vector` - the last field of the constructor call, named `__hash` All constructor calls for `RRBChunkedSeq` are in file rrbt.cljs, and all use `nil` as the initial value for the field `__hash`. Its `-hash` method calls `caching-hash` provided by the ClojureScript core code, which uses `nil` as the "hash not calculated yet" value. All constructor calls for `Vector` are in a few files, and all use `nil` as the initial value for the field `__hash`. Its `-hash` method also uses `caching-hash`, as described in previous paragraph. ================================================ FILE: doc/rrb-tree-notes.md ================================================ # Other implementations and descriptions of RRB trees Note that most implementations have an associated paper. If they have an author in common, then typically the paper or talk describes the implementation they published. Implementations: * TBD: Tiark Rompf and Phil Bagwell's original implementation code? * If it is available somewhere, likely it is instrumented for experimentation on multiple variations of the algorithms, and counting significant events that they were trying to optimize. Thus more of a research implementation than one intended for use in production. * [`scala-rrb-vector`](https://github.com/nicolasstucki/scala-rrb-vector) Scala library by Nicolas Stucki * As of Oct 2019, this library has a few unfixed bugs that were reported as Github issues in 2017. I have not examined them to see how easy they might be to fix. * [`bifurcan`](https://github.com/lacuna/bifurcan) Java library by Zach Tellman, Java class `io.lacuna.bifurcan.List` * [`Paguro`](https://github.com/GlenKPeterson/Paguro) Java library by Glen Peterson, Java class `org.organicdesign.fp.collections.RrbTree`. * From some comments in the code, it appears that perhaps this implementation is more precisely a data structure based upon B-trees, because those comments imply that nodes in the tree can have a number of children varying between some branching factor B, and be as low as B/2. * [`c-rrb`](https://github.com/hypirion/c-rrb) C library by Jean Niklas L'orange * [`Array`](https://github.com/xash/Array) RRB trees implemented in JavaScript for use in the Elm programming language. * [`immer`](https://sinusoid.es/immer) C++ library by Juan Pedro Bolivar Puente * [`Vector`](https://docs.rs/im/12.3.3/im/vector/enum.Vector.html) Rust implementation of RRB trees by Bodil Stokke * [Source code](https://docs.rs/crate/im/12.3.3/source/) * [`RRBVector`](https://github.com/rmunn/FSharpx.Collections/blob/rrb-vector/src/FSharpx.Collections.Experimental/RRBVector.fs) F# library by Robin Munn. * According to discussion on [this Github issue](https://github.com/fsprojects/FSharpx.Collections/issues/72) it appears to have known bugs the author would still like to find and fix as of 2019-Oct-10. Implementation of immutable vectors that are not RRB trees: * [`Clojure`](https://github.com/clojure/clojure) collections library, Java class `clojure.lang.PersistentVector` implemented in Java * Also class `clojure.core.Vector` implemented in Clojure, with memory/time optimizations achieved by restricting vector elements to all be the same type of Java primitive, e.g. all `long` vector elements, or all `double`. * [`Scala`](https://github.com/scala/scala) collection library, Java class `scala.collection.immutable.Vector` * In source file `src/library/scala/collection/immutable/Vector.scala` * As far as I can tell, as of 2019-Oct-10, it appears that this class does _not_ use RRB trees, and thus implements concatenation of vectors in linear time in the length of the second vector. [This Github issue](https://github.com/nicolasstucki/scala-rrb-vector/issues/9) from April 2019 implies that Scala has not yet had an RRB tree implementation incorporated into its standard library. Published papers and theses: * Phil Bagwell, Tiark Rompf, "RRB-Trees: Efficient Immutable Vectors", EPFL-REPORT-169879, September, 2011 [[PDF]](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.592.5377&rep=rep1&type=pdf) [[SemanticScholar page]](https://www.semanticscholar.org/paper/RRB-Trees-%3A-Efficient-Immutable-Vectors-Phil-Tiark-Bagwell-Rompf/30c8c562f6421ab6b00d0b7faebd897c407de69c) * Phil Bagwell talk [[video]](https://www.youtube.com/watch?v=K2NYwP90bNs), "Striving to Make Things Simple and Fast", January 2013, given at Clojure conj conference * Jean Niklas L'orange, "Improving RRB-Tree Performance through Transience", Master Thesis, 2014, [[PDF]](https://hypirion.com/thesis.pdf) * "RRB Vector: A Practical General Purpose Immutable Sequence", Nicolas Stucki, Tiark, Rompf, Vlad Ureche, Phil Bagwell, Proc. of the 20th ACM SIGPLAN International Conference on Functional Programming, 2015 [[ACM digital library link]](http://dx.doi.org/10.1145/2784731.2784739) [[PDF]](https://github.com/nicolasstucki/scala-rrb-vector/blob/master/documents/RRB%20Vector%20-%20A%20Practical%20General%20Purpose%20Immutable%20Sequence.pdf) * Nicolas Stucki, "Turning Relaxed Radix Balanced Vector from Theory into Practice for Scala Collections", Master Thesis, 2015 [[PDF]](https://github.com/nicolasstucki/scala-rrb-vector/blob/master/documents/Master%20Thesis%20-%20Nicolas%20Stucki%20-%20Turning%20Relaxed%20Radix%20Balanced%20Vector%20from%20Theory%20into%20Practice%20for%20Scala%20Collections.pdf?raw=true) * Juan Pedro Bolivar Puente, "Persistence for the Masses: RRB-Vectors in a Systems Language", Proc. ACM Program. Lang. 1, ICFP, Article 16 (September 2017), https://doi.org/10.1145/3110260 [[PDF]](https://public.sinusoid.es/misc/immer/immer-icfp17.pdf) * Juan's talk [[video]](https://www.youtube.com/watch?v=sPhpelUfu8Q) "Postmodern immutable data structures" given at CppCon 2017 * Bodil Stokke talk [[video]](https://www.youtube.com/watch?v=cUx2b_FO8EQ) "Meetings With Remarkable Trees" given at ClojuTRE 2018 Related things: * Jean Niklas L'orange's series of articles on Clojure's persistent vector data structure and how it works inside. These are good tutorial style articles. I have not found any similar articles like these on RRB trees. * ["Understanding Clojure's Persistent Vectors, Part 1"](https://hypirion.com/musings/understanding-persistent-vector-pt-1), September 2013 * ["Understanding Clojure's Persistent Vectors, Part 2](https://hypirion.com/musings/understanding-persistent-vector-pt-2), October 2013 * ["Understanding Clojure's Persistent Vectors, Part 3](https://hypirion.com/musings/understanding-persistent-vector-pt-3) April 2014 * ["Understanding Clojure's Transients"](https://hypirion.com/musings/understanding-clojure-transients), October 2014 * ["Persistent Vector Performance"](https://hypirion.com/musings/persistent-vector-performance), January 2015 * ["Persistent Vector Performance Summarised"](https://hypirion.com/musings/persistent-vector-performance-summarised), February 2015 * StackOverflow [question](https://stackoverflow.com/questions/14007153/what-invariant-do-rrb-trees-maintain) "What invariant do RRB-trees maintain?" Not RRB trees, but somewhat related ideas: * ["Theory and practice of chunked sequences"](http://www.andrew.cmu.edu/user/mrainey//chunkedseq/chunkedseq.html) web page has links to papers, talks, and Github repository containing C++ of their ideas. ================================================ FILE: doc/use-transducers/README.md ================================================ The patch use-transducers.md was developed around July or August 2019, but not yet included in the production core.rrb-vector code, because of a desire to continue to make core.rrb-vector compatible with Clojure 1.5.1 and later, whereas transducers were not implemented in Clojure until version 1.7.0. We should re-examine this patch when we are ready to require Clojure 1.7.0 or later as a minimum supported version for the core.rrb-vector library. I may include some performance measurements with and without these changes, to show how much they can improve the performance of some operations. ================================================ FILE: doc/use-transducers/use-transducers.patch ================================================ diff -cr core.rrb-vector/src/main/cljs/clojure/core/rrb_vector/rrbt.cljs core.rrb-vector-pff4i/src/main/cljs/clojure/core/rrb_vector/rrbt.cljs *** core.rrb-vector/src/main/cljs/clojure/core/rrb_vector/rrbt.cljs 2019-09-27 07:56:52.000000000 -0700 --- core.rrb-vector-pff4i/src/main/cljs/clojure/core/rrb_vector/rrbt.cljs 2019-09-30 23:36:02.000000000 -0700 *************** *** 629,635 **** (recur (inc i) (+ sbc (slot-count child cs)))))))))) (defn leaf-seq [arr] ! (mapcat #(.-arr %) (take (index-of-nil arr) arr))) (defn rebalance-leaves [n1 cnt1 n2 cnt2 transferred-leaves] --- 629,639 ---- (recur (inc i) (+ sbc (slot-count child cs)))))))))) (defn leaf-seq [arr] ! (into [] (comp (take-while (complement nil?)) ! (take 32) ! (map #(.-arr %)) ! cat) ! arr)) (defn rebalance-leaves [n1 cnt1 n2 cnt2 transferred-leaves] *************** *** 649,657 **** new-arr (make-array (if reg? 32 33)) new-n1 (->VectorNode nil new-arr)] (loop [i 0 ! bs (partition-all 32 ! (concat (leaf-seq (.-arr n1)) ! (leaf-seq (.-arr n2))))] (when-first [block bs] (let [a (make-array (count block))] (loop [i 0 xs (seq block)] --- 653,662 ---- new-arr (make-array (if reg? 32 33)) new-n1 (->VectorNode nil new-arr)] (loop [i 0 ! bs (into [] (comp (map #(leaf-seq (.-arr %))) ! cat ! (partition-all 32)) ! [n1 n2])] (when-first [block bs] (let [a (make-array (count block))] (loop [i 0 xs (seq block)] *************** *** 672,680 **** new-n1 (->VectorNode nil new-arr1) new-n2 (->VectorNode nil new-arr2)] (loop [i 0 ! bs (partition-all 32 ! (concat (leaf-seq (.-arr n1)) ! (leaf-seq (.-arr n2))))] (when-first [block bs] (let [a (make-array (count block))] (loop [i 0 xs (seq block)] --- 677,686 ---- new-n1 (->VectorNode nil new-arr1) new-n2 (->VectorNode nil new-arr2)] (loop [i 0 ! bs (into [] (comp (map #(leaf-seq (.-arr %))) ! cat ! (partition-all 32)) ! [n1 n2])] (when-first [block bs] (let [a (make-array (count block))] (loop [i 0 xs (seq block)] *************** *** 695,711 **** rngs (if (regular? node) (regular-ranges shift cnt) (node-ranges node)) ! cs (if rngs (aget rngs 32) (index-of-nil arr)) ! cseq (fn cseq [c r] (let [arr (.-arr c) rngs (if (regular? c) (regular-ranges (- shift 5) r) (node-ranges c)) ! gcs (if rngs (aget rngs 32) (index-of-nil arr))] ! (map list ! (take gcs arr) ! (take gcs (map - rngs (cons 0 rngs))))))] ! (mapcat cseq (take cs arr) (take cs (map - rngs (cons 0 rngs)))))) (defn rebalance [shift n1 cnt1 n2 cnt2 transferred-leaves] --- 701,726 ---- rngs (if (regular? node) (regular-ranges shift cnt) (node-ranges node)) ! cs (if rngs (aget rngs 32) 32) ! cseq (fn cseq [[c r]] (let [arr (.-arr c) rngs (if (regular? c) (regular-ranges (- shift 5) r) (node-ranges c)) ! gcs (if rngs (aget rngs 32) 32) ! rng-deltas (mapv - rngs (cons 0 rngs))] ! (into [] (comp (take-while (complement nil?)) ! (take gcs) ! (map-indexed (fn [idx node] ! [node (rng-deltas idx)]))) ! arr))) ! rng-deltas (mapv - rngs (cons 0 rngs))] ! (into [] (comp (take-while (complement nil?)) ! (take cs) ! (map-indexed (fn [idx node] [node (rng-deltas idx)])) ! (map cseq) ! cat) ! arr))) (defn rebalance [shift n1 cnt1 n2 cnt2 transferred-leaves] *************** *** 727,735 **** new-rngs (make-array 33) new-n1 (->VectorNode nil new-arr)] (loop [i 0 ! bs (partition-all 32 ! (concat (child-seq n1 shift cnt1) ! (child-seq n2 shift cnt2)))] (when-first [block bs] (let [a (make-array 33) r (make-array 33)] --- 742,752 ---- new-rngs (make-array 33) new-n1 (->VectorNode nil new-arr)] (loop [i 0 ! bs (into [] (comp (map (fn [[node cnt]] ! (child-seq node shift cnt))) ! cat ! (partition-all 32)) ! [[n1 cnt1] [n2 cnt2]])] (when-first [block bs] (let [a (make-array 33) r (make-array 33)] *************** *** 758,766 **** new-n1 (->VectorNode nil new-arr1) new-n2 (->VectorNode nil new-arr2)] (loop [i 0 ! bs (partition-all 32 ! (concat (child-seq n1 shift cnt1) ! (child-seq n2 shift cnt2)))] (when-first [block bs] (let [a (make-array 33) r (make-array 33)] --- 775,785 ---- new-n1 (->VectorNode nil new-arr1) new-n2 (->VectorNode nil new-arr2)] (loop [i 0 ! bs (into [] (comp (map (fn [[node cnt]] ! (child-seq node shift cnt))) ! cat ! (partition-all 32)) ! [[n1 cnt1] [n2 cnt2]])] (when-first [block bs] (let [a (make-array 33) r (make-array 33)] *************** *** 862,896 **** (def peephole-optimization-config (atom {:debug-fn nil})) (def peephole-optimization-count (atom 0)) ! ;; TBD: Transducer versions of child-nodes and bounded-grandchildren ! ;; are included here for when we are willing to rely upon Clojure ! ;; 1.7.0 as the minimum version supported by the core.rrb-vector ! ;; library. They are faster. ! ! #_(defn child-nodes [node] (into [] (comp (take-while (complement nil?)) (take 32)) (.-arr node))) - (defn child-nodes [node] - (->> (.-arr node) - (take-while (complement nil?)) - (take 32))) - ;; (take 33) is just a technique to avoid generating more ;; grandchildren than necessary. If there are at least 33, we do not ;; care how many there are. ! #_(defn bounded-grandchildren [children] (into [] (comp (map child-nodes) cat (take 33)) children)) - (defn bounded-grandchildren [children] - (->> children - (mapcat child-nodes) - (take 33))) - ;; TBD: Do functions like last-non-nil-idx and ;; count-vector-elements-beneath already exist elsewhere in this ;; library? It seems like they might. --- 881,900 ---- (def peephole-optimization-config (atom {:debug-fn nil})) (def peephole-optimization-count (atom 0)) ! (defn child-nodes [node] (into [] (comp (take-while (complement nil?)) (take 32)) (.-arr node))) ;; (take 33) is just a technique to avoid generating more ;; grandchildren than necessary. If there are at least 33, we do not ;; care how many there are. ! (defn bounded-grandchildren [children] (into [] (comp (map child-nodes) cat (take 33)) children)) ;; TBD: Do functions like last-non-nil-idx and ;; count-vector-elements-beneath already exist elsewhere in this ;; library? It seems like they might. diff -cr core.rrb-vector/src/main/clojure/clojure/core/rrb_vector/rrbt.clj core.rrb-vector-pff4i/src/main/clojure/clojure/core/rrb_vector/rrbt.clj *** core.rrb-vector/src/main/clojure/clojure/core/rrb_vector/rrbt.clj 2019-09-29 17:29:58.000000000 -0700 --- core.rrb-vector-pff4i/src/main/clojure/clojure/core/rrb_vector/rrbt.clj 2019-09-30 23:32:01.000000000 -0700 *************** *** 1311,1317 **** (recur (inc i) (+ sbc (long (slot-count nm am child cs))))))))))) (defn leaf-seq [^NodeManager nm arr] ! (mapcat #(.array nm %) (take (index-of-nil arr) arr))) (defn rebalance-leaves [^NodeManager nm ^ArrayManager am n1 cnt1 n2 cnt2 ^Box transferred-leaves] --- 1346,1356 ---- (recur (inc i) (+ sbc (long (slot-count nm am child cs))))))))))) (defn leaf-seq [^NodeManager nm arr] ! (into [] (comp (take-while (complement nil?)) ! (take 32) ! (map #(.array nm %)) ! cat) ! arr)) (defn rebalance-leaves [^NodeManager nm ^ArrayManager am n1 cnt1 n2 cnt2 ^Box transferred-leaves] *************** *** 1331,1339 **** new-arr (object-array (if reg? 32 33)) new-n1 (.node nm nil new-arr)] (loop [i 0 ! bs (partition-all 32 ! (concat (leaf-seq nm (.array nm n1)) ! (leaf-seq nm (.array nm n2))))] (when-first [block bs] (let [a (.array am (count block))] (loop [i 0 xs (seq block)] --- 1370,1379 ---- new-arr (object-array (if reg? 32 33)) new-n1 (.node nm nil new-arr)] (loop [i 0 ! bs (into [] (comp (map #(leaf-seq nm (.array nm %))) ! cat ! (partition-all 32)) ! [n1 n2])] (when-first [block bs] (let [a (.array am (count block))] (loop [i 0 xs (seq block)] *************** *** 1354,1362 **** new-n1 (.node nm nil new-arr1) new-n2 (.node nm nil new-arr2)] (loop [i 0 ! bs (partition-all 32 ! (concat (leaf-seq nm (.array nm n1)) ! (leaf-seq nm (.array nm n2))))] (when-first [block bs] (let [a (.array am (count block))] (loop [i 0 xs (seq block)] --- 1394,1403 ---- new-n1 (.node nm nil new-arr1) new-n2 (.node nm nil new-arr2)] (loop [i 0 ! bs (into [] (comp (map #(leaf-seq nm (.array nm %))) ! cat ! (partition-all 32)) ! [n1 n2])] (when-first [block bs] (let [a (.array am (count block))] (loop [i 0 xs (seq block)] *************** *** 1377,1391 **** rngs (if (.regular nm node) (ints (regular-ranges shift cnt)) (ranges nm node)) ! cs (if rngs (aget rngs 32) (index-of-nil arr)) ! cseq (fn cseq [c r] (let [arr (.array nm c) rngs (if (.regular nm c) (ints (regular-ranges (- shift 5) r)) (ranges nm c)) ! gcs (if rngs (aget rngs 32) (index-of-nil arr))] ! (map list (take gcs arr) (take gcs (map - rngs (cons 0 rngs))))))] ! (mapcat cseq (take cs arr) (take cs (map - rngs (cons 0 rngs)))))) (defn rebalance [^NodeManager nm ^ArrayManager am shift n1 cnt1 n2 cnt2 ^Box transferred-leaves] --- 1418,1443 ---- rngs (if (.regular nm node) (ints (regular-ranges shift cnt)) (ranges nm node)) ! cs (if rngs (aget rngs 32) 32) ! cseq (fn cseq [[c r]] (let [arr (.array nm c) rngs (if (.regular nm c) (ints (regular-ranges (- shift 5) r)) (ranges nm c)) ! gcs (if rngs (aget rngs 32) 32) ! rng-deltas (mapv - rngs (cons 0 rngs))] ! (into [] (comp (take-while (complement nil?)) ! (take gcs) ! (map-indexed (fn [idx node] ! [node (rng-deltas idx)]))) ! arr))) ! rng-deltas (mapv - rngs (cons 0 rngs))] ! (into [] (comp (take-while (complement nil?)) ! (take cs) ! (map-indexed (fn [idx node] [node (rng-deltas idx)])) ! (map cseq) ! cat) ! arr))) (defn rebalance [^NodeManager nm ^ArrayManager am shift n1 cnt1 n2 cnt2 ^Box transferred-leaves] *************** *** 1407,1415 **** new-rngs (int-array 33) new-n1 (.node nm nil new-arr)] (loop [i 0 ! bs (partition-all 32 ! (concat (child-seq nm n1 shift cnt1) ! (child-seq nm n2 shift cnt2)))] (when-first [block bs] (let [a (object-array 33) r (int-array 33)] --- 1459,1469 ---- new-rngs (int-array 33) new-n1 (.node nm nil new-arr)] (loop [i 0 ! bs (into [] (comp (map (fn [[node cnt]] ! (child-seq nm node shift cnt))) ! cat ! (partition-all 32)) ! [[n1 cnt1] [n2 cnt2]])] (when-first [block bs] (let [a (object-array 33) r (int-array 33)] *************** *** 1438,1446 **** new-n1 (.node nm nil new-arr1) new-n2 (.node nm nil new-arr2)] (loop [i 0 ! bs (partition-all 32 ! (concat (child-seq nm n1 shift cnt1) ! (child-seq nm n2 shift cnt2)))] (when-first [block bs] (let [a (object-array 33) r (int-array 33)] --- 1492,1502 ---- new-n1 (.node nm nil new-arr1) new-n2 (.node nm nil new-arr2)] (loop [i 0 ! bs (into [] (comp (map (fn [[node cnt]] ! (child-seq nm node shift cnt))) ! cat ! (partition-all 32)) ! [[n1 cnt1] [n2 cnt2]])] (when-first [block bs] (let [a (object-array 33) r (int-array 33)] *************** *** 1542,1576 **** (def peephole-optimization-config (atom {:debug-fn nil})) (def peephole-optimization-count (atom 0)) ! ;; TBD: Transducer versions of child-nodes and bounded-grandchildren ! ;; are included here for when we are willing to rely upon Clojure ! ;; 1.7.0 as the minimum version supported by the core.rrb-vector ! ;; library. They are faster. ! ! #_(defn child-nodes [node ^NodeManager nm] (into [] (comp (take-while (complement nil?)) (take 32)) (.array nm node))) - (defn child-nodes [node ^NodeManager nm] - (->> (.array nm node) - (take-while (complement nil?)) - (take 32))) - ;; (take 33) is just a technique to avoid generating more ;; grandchildren than necessary. If there are at least 33, we do not ;; care how many there are. ! #_(defn bounded-grandchildren [nm children] (into [] (comp (map #(child-nodes % nm)) cat (take 33)) children)) - (defn bounded-grandchildren [nm children] - (->> children - (mapcat #(child-nodes % nm)) - (take 33))) - ;; TBD: Do functions like last-non-nil-idx and ;; count-vector-elements-beneath already exist elsewhere in this ;; library? It seems like they might. --- 1598,1617 ---- (def peephole-optimization-config (atom {:debug-fn nil})) (def peephole-optimization-count (atom 0)) ! (defn child-nodes [node ^NodeManager nm] (into [] (comp (take-while (complement nil?)) (take 32)) (.array nm node))) ;; (take 33) is just a technique to avoid generating more ;; grandchildren than necessary. If there are at least 33, we do not ;; care how many there are. ! (defn bounded-grandchildren [nm children] (into [] (comp (map #(child-nodes % nm)) cat (take 33)) children)) ;; TBD: Do functions like last-non-nil-idx and ;; count-vector-elements-beneath already exist elsewhere in this ;; library? It seems like they might. ================================================ FILE: epl-v10.html ================================================ Eclipse Public License - Version 1.0

Eclipse Public License - v 1.0

THE ACCOMPANYING PROGRAM IS PROVIDED UNDER THE TERMS OF THIS ECLIPSE PUBLIC LICENSE ("AGREEMENT"). ANY USE, REPRODUCTION OR DISTRIBUTION OF THE PROGRAM CONSTITUTES RECIPIENT'S ACCEPTANCE OF THIS AGREEMENT.

1. DEFINITIONS

"Contribution" means:

a) in the case of the initial Contributor, the initial code and documentation distributed under this Agreement, and

b) in the case of each subsequent Contributor:

i) changes to the Program, and

ii) additions to the Program;

where such changes and/or additions to the Program originate from and are distributed by that particular Contributor. A Contribution 'originates' from a Contributor if it was added to the Program by such Contributor itself or anyone acting on such Contributor's behalf. Contributions do not include additions to the Program which: (i) are separate modules of software distributed in conjunction with the Program under their own license agreement, and (ii) are not derivative works of the Program.

"Contributor" means any person or entity that distributes the Program.

"Licensed Patents" mean patent claims licensable by a Contributor which are necessarily infringed by the use or sale of its Contribution alone or when combined with the Program.

"Program" means the Contributions distributed in accordance with this Agreement.

"Recipient" means anyone who receives the Program under this Agreement, including all Contributors.

2. GRANT OF RIGHTS

a) Subject to the terms of this Agreement, each Contributor hereby grants Recipient a non-exclusive, worldwide, royalty-free copyright license to reproduce, prepare derivative works of, publicly display, publicly perform, distribute and sublicense the Contribution of such Contributor, if any, and such derivative works, in source code and object code form.

b) Subject to the terms of this Agreement, each Contributor hereby grants Recipient a non-exclusive, worldwide, royalty-free patent license under Licensed Patents to make, use, sell, offer to sell, import and otherwise transfer the Contribution of such Contributor, if any, in source code and object code form. This patent license shall apply to the combination of the Contribution and the Program if, at the time the Contribution is added by the Contributor, such addition of the Contribution causes such combination to be covered by the Licensed Patents. The patent license shall not apply to any other combinations which include the Contribution. No hardware per se is licensed hereunder.

c) Recipient understands that although each Contributor grants the licenses to its Contributions set forth herein, no assurances are provided by any Contributor that the Program does not infringe the patent or other intellectual property rights of any other entity. Each Contributor disclaims any liability to Recipient for claims brought by any other entity based on infringement of intellectual property rights or otherwise. As a condition to exercising the rights and licenses granted hereunder, each Recipient hereby assumes sole responsibility to secure any other intellectual property rights needed, if any. For example, if a third party patent license is required to allow Recipient to distribute the Program, it is Recipient's responsibility to acquire that license before distributing the Program.

d) Each Contributor represents that to its knowledge it has sufficient copyright rights in its Contribution, if any, to grant the copyright license set forth in this Agreement.

3. REQUIREMENTS

A Contributor may choose to distribute the Program in object code form under its own license agreement, provided that:

a) it complies with the terms and conditions of this Agreement; and

b) its license agreement:

i) effectively disclaims on behalf of all Contributors all warranties and conditions, express and implied, including warranties or conditions of title and non-infringement, and implied warranties or conditions of merchantability and fitness for a particular purpose;

ii) effectively excludes on behalf of all Contributors all liability for damages, including direct, indirect, special, incidental and consequential damages, such as lost profits;

iii) states that any provisions which differ from this Agreement are offered by that Contributor alone and not by any other party; and

iv) states that source code for the Program is available from such Contributor, and informs licensees how to obtain it in a reasonable manner on or through a medium customarily used for software exchange.

When the Program is made available in source code form:

a) it must be made available under this Agreement; and

b) a copy of this Agreement must be included with each copy of the Program.

Contributors may not remove or alter any copyright notices contained within the Program.

Each Contributor must identify itself as the originator of its Contribution, if any, in a manner that reasonably allows subsequent Recipients to identify the originator of the Contribution.

4. COMMERCIAL DISTRIBUTION

Commercial distributors of software may accept certain responsibilities with respect to end users, business partners and the like. While this license is intended to facilitate the commercial use of the Program, the Contributor who includes the Program in a commercial product offering should do so in a manner which does not create potential liability for other Contributors. Therefore, if a Contributor includes the Program in a commercial product offering, such Contributor ("Commercial Contributor") hereby agrees to defend and indemnify every other Contributor ("Indemnified Contributor") against any losses, damages and costs (collectively "Losses") arising from claims, lawsuits and other legal actions brought by a third party against the Indemnified Contributor to the extent caused by the acts or omissions of such Commercial Contributor in connection with its distribution of the Program in a commercial product offering. The obligations in this section do not apply to any claims or Losses relating to any actual or alleged intellectual property infringement. In order to qualify, an Indemnified Contributor must: a) promptly notify the Commercial Contributor in writing of such claim, and b) allow the Commercial Contributor to control, and cooperate with the Commercial Contributor in, the defense and any related settlement negotiations. The Indemnified Contributor may participate in any such claim at its own expense.

For example, a Contributor might include the Program in a commercial product offering, Product X. That Contributor is then a Commercial Contributor. If that Commercial Contributor then makes performance claims, or offers warranties related to Product X, those performance claims and warranties are such Commercial Contributor's responsibility alone. Under this section, the Commercial Contributor would have to defend claims against the other Contributors related to those performance claims and warranties, and if a court requires any other Contributor to pay any damages as a result, the Commercial Contributor must pay those damages.

5. NO WARRANTY

EXCEPT AS EXPRESSLY SET FORTH IN THIS AGREEMENT, THE PROGRAM IS PROVIDED ON AN "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, EITHER EXPRESS OR IMPLIED INCLUDING, WITHOUT LIMITATION, ANY WARRANTIES OR CONDITIONS OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Each Recipient is solely responsible for determining the appropriateness of using and distributing the Program and assumes all risks associated with its exercise of rights under this Agreement , including but not limited to the risks and costs of program errors, compliance with applicable laws, damage to or loss of data, programs or equipment, and unavailability or interruption of operations.

6. DISCLAIMER OF LIABILITY

EXCEPT AS EXPRESSLY SET FORTH IN THIS AGREEMENT, NEITHER RECIPIENT NOR ANY CONTRIBUTORS SHALL HAVE ANY LIABILITY FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING WITHOUT LIMITATION LOST PROFITS), HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OR DISTRIBUTION OF THE PROGRAM OR THE EXERCISE OF ANY RIGHTS GRANTED HEREUNDER, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.

7. GENERAL

If any provision of this Agreement is invalid or unenforceable under applicable law, it shall not affect the validity or enforceability of the remainder of the terms of this Agreement, and without further action by the parties hereto, such provision shall be reformed to the minimum extent necessary to make such provision valid and enforceable.

If Recipient institutes patent litigation against any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Program itself (excluding combinations of the Program with other software or hardware) infringes such Recipient's patent(s), then such Recipient's rights granted under Section 2(b) shall terminate as of the date such litigation is filed.

All Recipient's rights under this Agreement shall terminate if it fails to comply with any of the material terms or conditions of this Agreement and does not cure such failure in a reasonable period of time after becoming aware of such noncompliance. If all Recipient's rights under this Agreement terminate, Recipient agrees to cease use and distribution of the Program as soon as reasonably practicable. However, Recipient's obligations under this Agreement and any licenses granted by Recipient relating to the Program shall continue and survive.

Everyone is permitted to copy and distribute copies of this Agreement, but in order to avoid inconsistency the Agreement is copyrighted and may only be modified in the following manner. The Agreement Steward reserves the right to publish new versions (including revisions) of this Agreement from time to time. No one other than the Agreement Steward has the right to modify this Agreement. The Eclipse Foundation is the initial Agreement Steward. The Eclipse Foundation may assign the responsibility to serve as the Agreement Steward to a suitable separate entity. Each new version of the Agreement will be given a distinguishing version number. The Program (including Contributions) may always be distributed subject to the version of the Agreement under which it was received. In addition, after a new version of the Agreement is published, Contributor may elect to distribute the Program (including its Contributions) under the new version. Except as expressly stated in Sections 2(a) and 2(b) above, Recipient receives no rights or licenses to the intellectual property of any Contributor under this Agreement, whether expressly, by implication, estoppel or otherwise. All rights in the Program not expressly granted under this Agreement are reserved.

This Agreement is governed by the laws of the State of New York and the intellectual property laws of the United States of America. No party to this Agreement will bring a legal action under this Agreement more than one year after the cause of action arose. Each party waives its rights to a jury trial in any resulting litigation.

================================================ FILE: pom.xml ================================================ 4.0.0 core.rrb-vector 0.2.2-SNAPSHOT core.rrb-vector RRB-Trees for Clojure(Script) -- see Bagwell & Rompf Eclipse Public License 1.0 https://opensource.org/license/epl-1-0/ repo org.clojure pom.contrib 1.4.0 Michał Marczyk https://github.com/michalmarczyk scm:git:git://github.com/clojure/core.rrb-vector.git scm:git:git://github.com/clojure/core.rrb-vector.git https://github.com/clojure/core.rrb-vector HEAD 1.11.4 true org.codehaus.mojo build-helper-maven-plugin add-clojurescript-source-dirs generate-sources add-resource src/main/cljs ${project.basedir}/src/test/resources ${project.basedir}/src/test/cljs org.clojure test.check 1.1.3 test org.clojure clojurescript 1.10.439 test ================================================ FILE: project.clj ================================================ (defproject org.clojure/core.rrb-vector "0.1.1-SNAPSHOT" :description "RRB-Trees for Clojure(Script) -- see Bagwell & Rompf" :url "https://github.com/clojure/core.rrb-vector" :license {:name "Eclipse Public License" :url "http://www.eclipse.org/legal/epl-v10.html"} :min-lein-version "2.2.0" :parent [org.clojure/pom.contrib "1.4.0"] :dependencies [[org.clojure/clojure "1.11.4"]] :source-paths ["src/main/clojure" "src/main/cljs"] ;;:source-paths ["src/parameterized/clojure" "src/main/cljs"] :test-paths ["src/test/clojure"] :test-selectors {:default (complement :cljs-nashorn)} :jvm-opts ^:replace ["-XX:+UseG1GC" "-XX:-OmitStackTraceInFastThrow"] :profiles {:dev {:dependencies [[org.clojure/test.check "1.1.3"]] :plugins [[lein-cljsbuild "1.1.7"]]} :coll {:test-paths ["src/test_local/clojure"] :dependencies [[org.clojure/test.check "1.1.3"] [collection-check "0.1.7"]]} :cljs {:dependencies [[org.clojure/clojure "1.11.4"] [org.clojure/clojurescript "1.10.238"]]} :1.9 {:dependencies [[org.clojure/clojure "1.11.4"]]} :1.10 {:dependencies [[org.clojure/clojure "1.11.4"]]} :master {:dependencies [[org.clojure/clojure "1.13.0-master-SNAPSHOT"]]}} :cljsbuild {:builds {:test {:source-paths ["src/main/cljs" "src/test/cljs"] :compiler {;;:optimizations :none ;;:optimizations :whitespace ;;:optimizations :simple :optimizations :advanced :output-to "out/test.js"}}} :test-commands {"node" ["node" "-e" "require(\"./out/test\"); clojure.core.rrb_vector.test_cljs.run()"] "spidermonkey" ["js52" "-f" "out/test.js" "--execute=clojure.core.rrb_vector.test_cljs.run()"]}}) ================================================ FILE: script/jdo ================================================ #! /bin/bash # Run some task using Clojure/Java if [ $# -eq 0 ] then # Default if nothing else is specified is a REPL plus listening # for Socket REPL TASK="socket" elif [ $# -eq 1 ] then TASK="$1" else 1>&2 echo "usage: `basename $0` arg1" exit 1 fi set -x case ${TASK} in sock*) # Run REPL, with option to listen for Socket REPL connection, and # test paths in classpath. exec clj -A:clj:clj-test:clj-socket ;; test*) # Run 'short' tests exec clojure -A:clj:clj-test:clj-runt ;; chec*) # Run 'short' tests with extra checks enabled exec clojure -A:clj:clj-test:clj-extrachecks-runt ;; long*) # Run long/generative tests exec clojure -A:clj:clj-test:clj-runlongtests ;; coll*) # Run collection-check generative tests exec clojure -A:clj:clj-test:clj-check:clj-runcheck ;; perf*) # Run performance tests exec clojure -A:clj:clj-test:clj-runperf ;; focu*) # Run whatever the current 'focus' tests are exec clojure -A:clj:clj-test:clj-check:clj-runfocus ;; east*|lint*) # Run Eastwood exec clojure -A:clj:clj-test:clj-check:eastwood ;; *) 1>&2 echo "unknown task name: ${TASK}" exit 1 ;; esac ================================================ FILE: script/mvn-run-tests ================================================ #! /bin/bash # Example of a command run by the build.clojure.org Jenkins machine to # run Clojure/Java and ClojureScript tests for data.xml library: # /var/lib/jenkins/tools/hudson.tasks.Maven_MavenInstallation/Maven_3.2.5/bin/mvn "-Djdk=Oracle 11 EA" -DCLOJURE_VERSION=1.7.0 -Dclojure.version=1.7.0 clean test prog_name=`basename $0` usage() { 1>&2 echo "usage: $prog_name " 1>&2 echo "" 1>&2 echo "Examples:" 1>&2 echo "" 1>&2 echo " $prog_name 1.7.0" 1>&2 echo " $prog_name 1.10.1" } if [ $# -ne 1 ] then usage exit 1 fi CLOJURE_VERSION="$1" set -x mvn -DCLOJURE_VERSION=${CLOJURE_VERSION} -Dclojure.version=${CLOJURE_VERSION} clean test ================================================ FILE: script/replace-params ================================================ #!/bin/bash #_( #_DEPS is same format as deps.edn. Multiline is okay. DEPS='{:deps { org.clojure/clojure {:mvn/version "1.10.1"} }}' #_You can put other options here OPTS='-J-XX:-OmitStackTraceInFastThrow' exec clojure $OPTS -Sdeps "$DEPS" "$0" "$@" ) ;; For every file in this directory and its subdirectories: ;; src/parameterized/clojure/clojure/core/rrb_vector/ ;; create a corresponding file: ;; src/hardcoded/clojure/clojure/core/rrb_vector/ ;; that has the same contents, except for the string substitutions ;; specified in the var named 'substitutions' throughout the file, ;; wherever they occur. (require '[clojure.java.io :as io] '[clojure.java.shell :as sh] '[clojure.edn :as edn] '[clojure.string :as str] '[clojure.pprint :as pp]) (def substitutions [ ;; some comments have special cases like this, ;; which I want to replace before the other more ;; normal cases below. ["1/(p/max-branches)" "1/32"] ;;["by max-branches-squared" "by 1024"] ["1/max-branches-squared" "1/1024"] ["1/(p/max-branches-squared)" "1/1024"] ;; special case expression that appears in a few ;; places in the code. ["(inc p/max-branches)" "33"] ;; Note that these must be replaced before ;; p/max-branches is, because the substitution ;; code below does not know anything about ;; symbols, just raw sequences of characters. All ;; of these are suffixes of p/max-branches. ["p/max-branches-squared" "1024"] ["p/max-branches-minus-1" "31"] ["p/max-branches-minus-2" "30"] ;; Similarly, shift-increment-times-2 must be ;; replaced before shift-increment. ["p/shift-increment-times-2" "10"] ["p/shift-increment" "5"] ["p/max-branches" "32"] ["p/branch-mask" "0x1f"] ["p/non-regular-array-len" "33"] ["max-capacity-divided-by-max-branches-squared" "max-capacity-divided-by-1024"] ]) (def source-dir-prefix "src/parameterized/") (def target-dir-prefix "src/hardcoded/") (def common-intermediate-path "clojure/clojure/core") (def source-dir (str source-dir-prefix common-intermediate-path)) (def target-dir (str target-dir-prefix common-intermediate-path)) (defn source-to-target-name [source-fname] (let [expected-path? (str/starts-with? source-fname source-dir)] (if expected-path? (str target-dir (subs source-fname (count source-dir))) (throw (ex-info (format "Unexpected source path '%s' does not begin with '%s'" source-fname source-dir) {:source-fname source-fname :source-dir source-dir}))))) (defn make-all-substitutions [content substitution-pairs] (reduce (fn [content [to-replace-str replace-with-str]] (str/replace content to-replace-str replace-with-str)) content substitution-pairs)) (let [source-dir (str source-dir-prefix common-intermediate-path)] (doseq [source-f (file-seq (io/file source-dir))] (let [source-fname (str source-f) target-fname (source-to-target-name source-fname) target-f (io/file target-fname)] (println) (println "source file :" source-fname) (println "target file1:" target-fname) (println "target file2:" (str target-f)) (if (. source-f (isDirectory)) (println "skipping directory") (do (io/make-parents target-f) (let [contents (slurp source-f) new-contents (make-all-substitutions contents substitutions)] (spit target-f new-contents))))))) ================================================ FILE: script/sdo ================================================ #! /bin/bash # Run some task using ClojureScript if [ $# -eq 0 ] then # Default if nothing else is specified is a REPL plus listening # for Socket REPL TASK="socket" elif [ $# -eq 1 ] then TASK="$1" else 1>&2 echo "usage: `basename $0` arg1" exit 1 fi set -x case ${TASK} in sock*) # Run REPL, with option to listen for Socket REPL connection, and # test paths in classpath. exec clj -A:cljs:cljs-test:cljs-socket ;; test*) # Run 'short' tests exec clojure -A:cljs:cljs-test:cljs-runt ;; chec*) # Run 'short' tests with extra checks enabled exec clojure -A:cljs:cljs-test:cljs-extrachecks-runt ;; long*) # Run long/generative tests exec clojure -A:cljs:cljs-test:cljs-runlongtests ;; coll*) # Run collection-check generative tests exec clojure -A:cljs:cljs-test:cljs-check:cljs-runcheck ;; perf*) # Run performance tests exec clojure -A:cljs:cljs-test:cljs-runperf ;; focu*) # Run whatever the current 'focus' tests are exec clojure -A:cljs:cljs-test:cljs-check:cljs-runfocus ;; east*|lint*) 1>&2 echo "Eastwood not supported for ClojureScript" ;; *) 1>&2 echo "unknown task name: ${TASK}" exit 1 ;; esac ================================================ FILE: script/test ================================================ #!/bin/bash # See README.md for some sample install instructions for Ubuntu 18.04 # Linux and macOS. If you use those instructions, then the following environment variable settings should work: # export NODEJS_CMD="node" # export SPIDERMONKEY_CMD="js52" if [ "${NODEJS_CMD}" = "" -a "${SPIDERMONKEY_CMD}" = "" ]; then echo "Neither NODEJS_CMD nor SPIDERMONKEY_CMD is set, cannot run tests" exit 1 fi rm -rf out mkdir -p out lein with-profile +cljs cljsbuild once test echo "Launching test runner..." if [ "${NODEJS_CMD}" != "" ]; then echo "Testing with Node.js:" "${NODEJS_CMD}" -e 'require("./out/test"); clojure.core.rrb_vector.test_cljs.run()' fi if [ "${SPIDERMONKEY_CMD}" != "" ]; then echo "Testing with SpiderMonkey:" "${SPIDERMONKEY_CMD}" -f out/test.js "--execute=clojure.core.rrb_vector.test_cljs.run()" fi ================================================ FILE: src/main/cljs/clojure/core/rrb_vector/debug.cljs ================================================ ; Copyright (c) Rich Hickey and contributors. All rights reserved. ; The use and distribution terms for this software are covered by the ; Eclipse Public License 1.0 (http://opensource.org/licenses/eclipse-1.0.php) ; which can be found in the file epl-v10.html at the root of this distribution. ; By using this software in any fashion, you are agreeing to be bound by ; the terms of this license. ; You must not remove this notice, or any other, from this software. (ns clojure.core.rrb-vector.debug (:require [clojure.core.rrb-vector :as fv] [clojure.core.rrb-vector.rrbt :as rrbt] ;; This page: ;; https://clojure.org/guides/reader_conditionals refers ;; to code that can go into common cljc files as platform ;; independent, and the code in the clj or cljs files as ;; platform dependent, so I will use that terminology ;; here, too. [clojure.core.rrb-vector.debug-platform-dependent :as pd])) ;; The intent is to keep this file as close to ;; src/main/clojure/clojure/core/rrb_vector/debug.clj as possible, so ;; that when we start requiring Clojure 1.7.0 and later for this ;; library, this file and that one can be replaced with a common file ;; with the suffix .cljc ;; Functions expected to be defined in the appropriate ;; clojure.core.rrb-vector.debug-platform-dependent namespace: ;; pd/internal-node? ;; pd/persistent-vector? ;; pd/transient-vector? ;; pd/is-vector? ;; pd/dbg-tailoff (formerly debug-tailoff) ;; pd/dbg-tidx (formerly debug-tailoff for clj, debug-tidx for cljs) ;; pd/format ;; pd/printf ;; pd/unwrap-subvec-accessors-for ;; pd/abbrev-for-type-of [vec-or-node] (formerly abbrev-type-name, but move type/class call inside) ;; pd/same-coll? (written already for clj, TBD for cljs) ;; Functions returned from unwrap-subvec-accessors-for that have ;; platform-dependent definitions, but the same general 'kind' ;; arguments and return values, where 'kind' could be: any vector, ;; persistent or transient, or a vector tree node object: ;; get-root - All get-* fns formerly called extract-* in the Java ;; platform dependent version of the debug namespace. ;; get-shift ;; get-tail ;; get-cnt ;; get-array [node] - clj (.array nm node) cljs (.-arr node) ;; get-ranges [node] - clj (ranges nm node) cljs (node-ranges node) ;; regular? [node] - clj (.regular nm node) cljs (regular? node) ;; tail-len [tail] - clj (.alength am tail) cljs (alength tail) ;; NO: nm am - cljs doesn't need them, and clj only uses them for the ;; last few functions above. (defn children-summary [node shift get-array get-ranges regular? opts] (let [children (get-array node) reg? (regular? node) rngs (if-not reg? (get-ranges node)) array-len (count children) children-seq (if reg? children (butlast children)) non-nils (remove nil? children-seq) regular-children (filter regular? non-nils) num-non-nils (count non-nils) num-regular-children (count regular-children) num-irregular-children (- num-non-nils num-regular-children) num-nils (- (count children-seq) num-non-nils) exp-array-len (if reg? 32 33) bad-array-len? (not= array-len exp-array-len)] ;; 'r' for regular, 'i' for irregular ;; For either type of node, its first 32 array elements are broken ;; down into: ;; # regular children ;; # irregular children ;; # of nil 'children' not shown, since it will always be 32 minus ;; # the total of # regular plus irregular children, unless the ;; # array is the wrong size, and in that case a BAD-ARRAY-LEN ;; # message will be included in the string. (pd/format "%s%d+%d%s" (if reg? "r" "i") num-regular-children num-irregular-children (if bad-array-len? (pd/format " BAD-ARRAY-LEN %d != %d" array-len exp-array-len) "")))) (defn filter-indexes "Return a sequence of all indexes of elements e of coll for which (pred e) returns logical true. 0 is the index of the first element." [pred coll] (filter (complement nil?) (map-indexed (fn [idx e] (if (pred e) idx)) coll))) (defn dbg-vec ([v] (dbg-vec v {:max-depth nil ;; integer to limit depth, nil for unlimited ;; force showing tree "fringes" beyond max-depth :always-show-fringes false ;; show vector elements. false for only count :show-elements true ;; show summary of number of children of each node, as ;; returned by function children-summary :show-children-summary false ;; default false means show ranges arrays with their raw ;; unprocessed contents. Use true to show only the ;; first n elements, where n=(aget (get-ranges node) ;; 32), and to show the 'deltas' between consecutive ;; pairs, e.g. if the original is (32 64 96 0 ... 0 3), ;; then instead show (32 32 32), which, if the data ;; structure is correct, is the number of vector ;; elements reachable through each of the node's 3 ;; children. :show-ranges-as-deltas false})) ([v opts] (let [{:keys [v subvector? subvec-start subvec-end get-root get-shift get-tail get-cnt get-array get-ranges regular? tail-len]} (pd/unwrap-subvec-accessors-for v) root (get-root v) shift (get-shift v) tail (get-tail v) cnt (get-cnt v)] (when subvector? (pd/printf "SubVector from start %d to end %d of vector:\n" subvec-start subvec-end)) (letfn [(go [indent shift i node on-left-fringe? on-right-fringe?] (when node (dotimes [_ indent] (print " ")) (pd/printf "%02d:%02d %s" shift i (pd/abbrev-for-type-of node)) (if (zero? shift) ;; this node has only vector elements as its children (if (:show-elements opts) (print ":" (vec (get-array node))) (print ":" (count (get-array node)) "vector elements elided")) ;; else this node has only other nodes as its children (do (when (:show-children-summary opts) (print " ") (print (children-summary node shift get-array get-ranges regular? opts))) (if (not (regular? node)) (if (:show-ranges-as-deltas opts) (let [rngs (get-ranges node) r (aget rngs 32) tmp (map - (take r rngs) (take r (cons 0 rngs)))] (print ":" (seq tmp))) (print ":" (seq (get-ranges node))))))) (println) (let [no-children? (zero? shift) visit-all-children? (and (not no-children?) (or (nil? (:max-depth opts)) (< (inc indent) (:max-depth opts)))) visit-some-children? (or visit-all-children? (and (not no-children?) (:always-show-fringes opts) (or on-left-fringe? on-right-fringe?)))] (if visit-some-children? (dorun (let [arr (get-array node) a (if (regular? node) arr (butlast arr)) non-nil-idxs (filter-indexes (complement nil?) a) first-non-nil-idx (first non-nil-idxs) last-non-nil-idx (last non-nil-idxs)] (map-indexed (fn [i node] (let [child-on-left-fringe? (and on-left-fringe? (= i first-non-nil-idx)) child-on-right-fringe? (and on-right-fringe? (= i last-non-nil-idx)) visit-this-child? (or visit-all-children? (and (:always-show-fringes opts) (or child-on-left-fringe? child-on-right-fringe?)))] (if visit-this-child? (go (inc indent) (- shift 5) i node child-on-left-fringe? child-on-right-fringe?)))) a)))))))] (pd/printf "%s (%d elements):\n" (pd/abbrev-for-type-of v) (count v)) (go 0 shift 0 root true true) (println (if (pd/transient-vector? v) (pd/format "tail (tidx %d):" (pd/dbg-tidx v)) "tail:") (vec tail)))))) (defn first-diff "Compare two sequences to see if they have = elements in the same order, and both sequences have the same number of elements. If all of those conditions are true, and no exceptions occur while calling seq, first, and next on the seqs of xs and ys, then return -1. If two elements at the same index in each sequence are found not = to each other, or the sequences differ in their number of elements, return the index, 0 or larger, at which the first difference occurs. If an exception occurs while calling seq, first, or next, throw an exception that contains the index at which this exception occurred." [xs ys] (loop [i 0 xs (seq xs) ys (seq ys)] (if (try (and xs ys (= (first xs) (first ys))) (catch js/Error e (.printStackTrace e) i)) (let [xs (try (next xs) (catch js/Error e (prn :xs i) (throw e))) ys (try (next ys) (catch js/Error e (prn :ys i) (throw e)))] (recur (inc i) xs ys)) (if (or xs ys) i -1)))) ;; When using non-default parameters for the tree data structure, ;; e.g. shift-increment not 5, then in test code with calls to ;; checking-* functions, they will be expecting those same non-default ;; parameter values, and will give errors if they are ever given a ;; vector returned by clojure.core/vec, because without changes to ;; Clojure itself, they always have shift-increment 5 and max-branches ;; 32. ;; ;; If we use (fv/vec coll) consistently in the test code, that in many ;; cases returns a core.rrb-vector data structure, but if given a ;; Clojure vector, it still returns that Clojure vector unmodified, ;; which has the same issues for checking-* functions. By ;; calling (fv/vec (seq coll)) when not using default parameters, we ;; force the return value of cvec to always be a core.rrb-vector data ;; structure. ;; ;; The name 'cvec' is intended to mean "construct a vector", and only ;; intended for use in test code that constructs vectors used as ;; parameters to other functions operating on vectors. (defn cvec [coll] (clojure.core/vec coll)) (defn slow-into [to from] (reduce conj to from)) (defn all-vector-tree-nodes [v] (let [{:keys [v get-root get-shift get-array regular?]} (pd/unwrap-subvec-accessors-for v) root (get-root v) shift (get-shift v)] (letfn [(go [depth shift node] (if node (if (not= shift 0) (cons {:depth depth :shift shift :kind :internal :node node} (apply concat (map (partial go (inc depth) (- shift 5)) (let [arr (get-array node)] (if (regular? node) arr (butlast arr)))))) (cons {:depth depth :shift shift :kind :internal :node node} (map (fn [x] {:depth (inc depth) :kind :leaf :value x}) (get-array node))))))] (cons {:depth 0 :kind :base :shift shift :value v} (go 1 shift root))))) ;; All nodes that should be internal nodes are one of the internal ;; node types satisfying internal-node? All nodes that are less ;; than "leaf depth" must be internal nodes, and none of the ones ;; at "leaf depth" should be. Probably the most general restriction ;; checking for leaf values should be simply that they are any type ;; that is _not_ an internal node type. They could be objects that ;; return true for is-vector? for example, if a vector is an element ;; of another vector. (defn leaves-with-internal-node-type [node-infos] (filter (fn [node-info] (and (= :leaf (:kind node-info)) (pd/internal-node? (:node node-info)))) node-infos)) (defn non-leaves-not-internal-node-type [node-infos] (filter (fn [node-info] (and (= :internal (:kind node-info)) (not (pd/internal-node? (:node node-info))))) node-infos)) ;; The definition of nth in deftype Vector implies that every ;; descendant of a 'regular' node must also be regular. That would be ;; a straightforward sanity check to make, to return an error if a ;; non-regular node is found with a regular ancestor in the tree. (defn basic-node-errors [v] (let [{:keys [v get-shift]} (pd/unwrap-subvec-accessors-for v) shift (get-shift v) nodes (all-vector-tree-nodes v) by-kind (group-by :kind nodes) leaf-depths (set (map :depth (:leaf by-kind))) expected-leaf-depth (+ (quot shift 5) 2) max-internal-node-depth (->> (:internal by-kind) (map :depth) (apply max)) ;; Be a little loose in checking here. If we want to narrow ;; it down to one expected answer, we would need to look at ;; the tail to see how many elements it has, then use the ;; different between (count v) and that to determine how many ;; nodes are in the rest of the tree, whether it is 0 or ;; non-0. expected-internal-max-depths (cond (= (count v) 0) #{(- expected-leaf-depth 2)} (> (count v) 33) #{(dec expected-leaf-depth)} :else #{(dec expected-leaf-depth) (- expected-leaf-depth 2)})] (cond (not= (mod shift 5) 0) {:error true :description (str "shift value in root must be a multiple of 5. Found " shift) :data shift} ;; It is OK for this set size to be 0 if no leaves, but if there ;; are leaves, they should all be at the same depth. (> (count leaf-depths) 1) {:error true :description (str "There are leaf nodes at multiple different depths: " leaf-depths) :data leaf-depths} (and (= (count leaf-depths) 1) (not= (first leaf-depths) expected-leaf-depth)) {:error true :description (str "Expecting all leaves to be at depth " expected-leaf-depth " because root has shift=" shift " but found leaves at depth " (first leaf-depths)) :data leaf-depths} (not (contains? expected-internal-max-depths max-internal-node-depth)) {:error true :description (str "Expecting there to be some internal nodes at one of" " these depths: " expected-internal-max-depths " because count=" (count v) " and root has shift=" shift " but max depth among all internal nodes found was " max-internal-node-depth)} (seq (leaves-with-internal-node-type nodes)) {:error true :description "A leaf (at max depth) has one of the internal node types, returning true for internal-node?" :data (first (leaves-with-internal-node-type nodes))} (seq (non-leaves-not-internal-node-type nodes)) {:error true :description "A non-leaf node has a type that returns false for internal-node?" :data (first (non-leaves-not-internal-node-type nodes))} :else {:error false}))) ;; I believe that objects-in-slot-32-of-obj-arrays and ;; ranges-not-int-array are only called directly from one test ;; namespace right now. Consider making a combined invariant checking ;; function in this debug namespace that can be used from any test ;; namespace (or other debug-time code) that a developer wants to. (defn objects-in-slot-32-of-obj-arrays "Function to look for errors of the form where a node's node.array object, which is often an array of 32 or 33 java.lang.Object's, has an element at index 32 that is not nil, and refers to an object that is of any type _except_ an array of ints. There appears to be some situation in which this can occur, but it seems to almost certainly be a bug if that happens, and we should be able to detect it whenever it occurs." [v] (let [{:keys [v get-array]} (pd/unwrap-subvec-accessors-for v) node-maps (all-vector-tree-nodes v) internal (filter #(= :internal (:kind %)) node-maps)] (keep (fn [node-info] ;; TBD: Is there a way to do ^objects type hint for clj, ;; but none for cljs? Is it harmful for cljs to have such ;; a type hint? ;;(let [^objects arr (get-array (:node node-info)) (let [arr (get-array (:node node-info)) n (count arr)] (if (== n 33) (aget arr 32)))) internal))) ;; TBD: Should this function be defined in platform-specific file? ;;(defn ranges-not-int-array [x] ;; (seq (remove int-array? (objects-in-slot-32-of-obj-arrays x)))) ;; edit-nodes-errors is completely defined in platform-specific source ;; files. It is simply quite different between clj/cljs. (defn edit-nodes-errors [v] (pd/edit-nodes-errors v all-vector-tree-nodes)) (defn regular-node-errors [root-node? root-node-cnt children] ;; For regular nodes, there should be zero or more 'full' children, ;; followed optionally by one 'partial' child, followed by nils. (let [[full-children others] (split-with :full? children) [partial-children others] (split-with #(and (not (:full? %)) (not= :nil (:kind %))) others) [nil-children others] (split-with #(= :nil (:kind %)) others) num-full (count full-children) num-partial (count partial-children) num-non-nil (+ num-full num-partial)] (cond (not= 0 (count others)) {:error true, :kind :internal, :description (str "Found internal regular node with " num-full " full, " num-partial " partial, " (count nil-children) " nil, " (count others) " 'other' children." " - expected 0 children after nils.")} (> num-partial 1) {:error true, :kind :internal, :description (str "Found internal regular node with " num-full " full, " num-partial " partial, " (count nil-children) " nil children" " - expected 0 or 1 partial.")} (not (or (and root-node? (<= root-node-cnt 32) ;; all elements in tail (= 0 num-non-nil)) (<= 1 num-non-nil 32))) {:error true, :kind :internal :description (str "Found internal regular node with # full + # partial=" num-non-nil " children outside of range [1, 32]." " root-node?=" root-node? " root-node-cnt=" root-node-cnt) :data children} :else {:error false, :kind :internal, :full? (= 32 (count full-children)) :count (reduce + (map #(or (:count %) 0) children))}))) (defn non-regular-node-errors [node get-ranges children] (let [rng (get-ranges node) [non-nil-children others] (split-with #(not= :nil (:kind %)) children) [nil-children others] (split-with #(= :nil (:kind %)) others) num-non-nil (count non-nil-children) num-nil (count nil-children) expected-ranges (reductions + (map :count non-nil-children))] (cond (not= 0 (count others)) {:error true, :kind :internal, :description (str "Found internal non-regular node with " num-non-nil " non-nil, " num-nil " nil, " (count others) " 'other' children." " - expected 0 children after nils.")} (not= num-non-nil (aget rng 32)) {:error true, :kind :internal, :description (str "Found internal non-regular node with " num-non-nil " non-nil, " num-nil " nil children, and" " last elem of ranges=" (aget rng 32) " - expected it to match # non-nil children.")} (not= expected-ranges (take (count expected-ranges) (seq rng))) {:error true, :kind :internal, :description (str "Found internal non-regular node with " num-non-nil " non-nil, " num-nil " nil children, and" " # children prefix sums: " (seq expected-ranges) " - expected that to match stored ranges: " (seq rng))} ;; I believe that there must always be at least one ;; non-nil-child. By checking for this condition, we will ;; definitely find out if it is ever violated. ;; TBD: What if we have a tree with ranges, and then remove all ;; elements? Does the resulting tree triger this error? (not (<= 1 (aget rng 32) 32)) {:error true, :kind :internal :description (str "Found internal non-regular node with (aget rng 32)" "=" (aget rng 32) " outside of range [1, 32].")} :else {:error false, :kind :internal, :full? false, :count (last expected-ranges)}))) (defn max-capacity-divided-by-1024 [root-shift] (let [shift-amount (max 0 (- root-shift 5))] (bit-shift-left 1 shift-amount))) (defn fraction-full [v] (let [{:keys [v get-shift]} (pd/unwrap-subvec-accessors-for v) root-shift (get-shift v) tail-off (pd/dbg-tailoff v) max-tree-cap (bit-shift-left 1 (+ root-shift 5))] (/ (* 1.0 tail-off) max-tree-cap))) (defn ranges-errors [v] (let [{:keys [v get-root get-shift get-tail get-cnt get-array get-ranges regular? tail-len]} (pd/unwrap-subvec-accessors-for v) root (get-root v) root-node-cnt (count v) root-shift (get-shift v) tail-off (pd/dbg-tailoff v) tail (get-tail v)] (letfn [ (go [shift node] (cond (nil? node) {:error false :kind :nil} (zero? shift) (let [n (count (get-array node))] (merge {:error (zero? n), :kind :leaves, :full? (= n 32), :count n} (if (zero? n) {:description (str "Leaf array has 0 elements." " Expected > 0.")}))) :else ;; non-0 shift (let [children (map (partial go (- shift 5)) (let [arr (get-array node)] (if (regular? node) arr (butlast arr)))) errs (filter :error children)] (cond (seq errs) {:error true, :description "One or more errors found", :data errs} (not= 32 (count children)) {:error true, :kind :internal, :description (str "Found internal node that has " (count children) " children - expected 32.")} (regular? node) (regular-node-errors (= shift root-shift) root-node-cnt children) :else (non-regular-node-errors node get-ranges children)))))] (let [x (go root-shift root)] (cond (:error x) x (not= tail-off (:count x)) {:error true, :kind :root, :description (str "Found tail-off=" tail-off " != " (:count x) "=count of values beneath internal nodes") :internal-node-leaf-count (:count x) :tail-off tail-off :cnt (get-cnt v)} (and (pd/transient-vector? v) (not= (tail-len tail) 32)) {:error true, :kind :root, :description (str "Found transient vector with tail length " (tail-len tail) " - expecting 32")} ;; It is always a bad thing if shift becomes more than 32, ;; because the bit-shift-left and bit-shift-right operations ;; on 32-bit ints actually behave like (bit-shift-left ;; x (mod shift-amount 32)) for shift-amount over 32. It is ;; also likely a bug in the implementation if that happens. (>= root-shift 32) {:error true, :kind :root, :description (str "shift of root is " root-shift " >= 32," " which is not supported.")} ;; This is not necessarily a bug, but it seems likely to be ;; a bug if a tree is less than 1/1024 full compared to its ;; max capacity. 1/32 full is normal when a tree becomes 1 ;; deeper than it was before. (< 0 (:count x) (max-capacity-divided-by-1024 root-shift)) {:error false, :warning true, :kind :root-too-deep, :description (str "For root shift=" root-shift " the maximum " "capacity divided by 1024 is " (max-capacity-divided-by-1024 root-shift) " but the tree contains only " (:count x) " vector elements outside of the tail")} :else x))))) #_(defn add-return-value-checks [f err-desc-str return-value-check-fn] (fn [& args] (let [ret (apply f args)] (apply return-value-check-fn err-desc-str ret args) ret))) (defn copying-seq [v] (let [{:keys [v subvector? subvec-start subvec-end get-root get-shift get-tail get-array regular?]} (pd/unwrap-subvec-accessors-for v) root (get-root v) shift (get-shift v)] (letfn [(go [shift node] (if node (if (not= shift 0) (apply concat (map (partial go (- shift 5)) (let [arr (get-array node)] (if (regular? node) arr (butlast arr))))) (seq (get-array node)))))] (doall ;; always return a fully realized sequence. (let [all-elems (concat (go shift root) (if (pd/transient-vector? v) (take (pd/dbg-tidx v) (get-tail v)) (seq (get-tail v))))] (if subvector? (take (- subvec-end subvec-start) (drop subvec-start all-elems)) all-elems)))))) (def failure-data (atom [])) (def warning-data (atom [])) (defn clear-failure-data! [] (reset! failure-data [])) (let [orig-conj clojure.core/conj] (defn record-failure-data [d] (swap! failure-data orig-conj d)) (defn record-warning-data [d] (swap! warning-data orig-conj d))) ;; I would like to achieve a goal of providing an easy-to-use way that ;; a Clojure or ClojureScript developer could call a function, or ;; invoke their own code in a macro, and then within the run-time ;; scope of that, a selected set of calls to functions like conj, ;; conj!, pop, pop!, transient, subvec, slicev, catvec, splicev, and ;; perhaps others, would have extra checks enabled, such that if they ;; detected a bug, they would stop the execution immediately with a ;; lot of debug information recorded as near to the point of the ;; failure as can be achieved by checking the return values of such ;; function calls. ;; It would also be good if this goal could be achieved without having ;; a separate implementation of all of those functions, and/or custom ;; versions of Clojure, ClojureScript, or the core.rrb-vector library ;; to use. Actually a separate implementation of core.rrb-vector ;; might be acceptable and reasonable to implement and maintain, but ;; separate versions of Clojure and ClojureScript seems like too much ;; effort for the benefits achieved. ;; I have investigated approaches that attempt to use with-redefs on ;; the 'original Vars' in Clojure, and also in a ClojureScript ;; Node-based REPL. ;; There are differences between with-redefs behavior on functions in ;; clojure.core between Clojure and ClojureScript, because ;; direct-linking seems to also include user code calling to ;; clojure.core functions with ClojureScript: ;; https://clojure.atlassian.net/projects/CLJS/issues/CLJS-3154 ;; At least in Clojure, and perhaps also in ClojureScript, there is ;; sometimes an effect similar to direct linking involved when calling ;; protocol methods on objects defined via deftype. That prevents ;; with-redefs, and any technique that changes the definition of a Var ;; with alter-var-root! or set!, from causing the alternate function ;; to be called. ;; Here are the code paths that I think are most useful for debug ;; checks of operations on vectors. ;; Functions in clojure.core: ;; Lower value, because they are simpler functions, and in particular ;; do not operate on RRB vector trees with ranges inside: ;; vec vector vector-of ;; Similarly the RRB vector variants of those functions create regular ;; RRB vectors, so not as likely to have bugs. ;; peek can operate on trees with ranges inside, but always accesses ;; the tail, so not nearly as likely to have bugs. ;; Higher value, because they can operate on RRB vectors with ranges ;; inside the tree: ;; conj pop assoc ;; conj! pop! assoc! ;; transient persistent! ;; seq rseq ;; Functions in clojure.core.rrb-vector namespace, and internal ;; implementation functions/protocol-methods that they use: ;; defn fv/catvec ;; calls itself recursively for many args (clj and cljs versions) ;; -splicev protocol function (splicev for clj) ;; When -splicev is called on PersistentVector or Subvec, -as-rrbt ;; converts it to Vector, then method below is called. ;; deftype Vector -splicev / splicev method ;; -as-rrbt (cljs) / as-rrbt (clj) ;; -slicev (cljs) / slicev (clj) if used on a subvector object ;; defn splice-rrbts ;; defn splice-rrbts-main ;; Calls many internal implementation detail functions. ;; peephole-optimize-root ;; fallback-to-slow-splice-if-needed ;; defn fv/subvec ;; -slicev (cljs) / slicev (clj) protocol function ;; deftype Vector -slicev method ;; Calls many internal implementation detail functions, ;; e.g. slice-left slice-right make-array array-copy etc. ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ;; See the documentation of the several checking-* functions for the ;; keys supported inside of the @debug-opts map. (def debug-opts (atom {})) (def full-debug-opts {:trace false :validate true :return-value-checks [edit-nodes-errors basic-node-errors ranges-errors] ;; false -> throw an exception when error detected :continue-on-error false ;; true -> do not throw an exception when warning found :continue-on-warning true}) (defn set-debug-opts! "set-debug-opts! modified the debug-opts atom of the core.rrb-vector library, which configures what kinds of extra checks are performed when calling the checking-* versions of functions defined in the library's debug namespace. Example call: (require '[clojure.core.rrb-vector.debug :as d]) (d/set-debug-opts! d/full-debug-opts) This call enables as thorough of extra verification checks as is supported by existing code, when you call any of the checking-* variants of the functions in this namespace, e.g. checking-catvec, checking-subvec. It will also slow down your code to do so. checking-* functions return the same values as their non checking-* original functions they are based upon, so you can write application code that mixes calls to both, calling the checking-* versions only occasionally, if you have a long sequence of operations that you want to look for bugs within core.rrb-vector's implementation of." [opts] (reset! debug-opts {:catvec opts ;; affects checking-catvec behavior, ;; via calling checking-splicev and ;; checking-splice-rrbts and enabling ;; their extra checks. :subvec opts ;; affects checking-subvec behavior, ;; via calling checking-slicev and ;; enabling its extra checks :pop opts ;; affects checking-pop :pop! opts ;; affects checking-pop! :transient opts})) ;; affects checking-transient (defn validation-failure [err-msg-str failure-data opts] (println "ERROR:" err-msg-str) (record-failure-data failure-data) (when-not (:continue-on-error opts) (throw (ex-info err-msg-str failure-data)))) (defn sanity-check-vector-internals "This function is called by all of the checking-* variants of functions in the debug namespace. It calls all of the functions in (:return-value-checks opts) in the order given, passing each of those functions a return value 'ret'. Each function performs sanity checks on the 'ret' data structure used to represent the vector. Those functions should return a map with key :error having a logical true value if any errors were found, or a key :warning having a logical true value if any warnings were found, otherwise both of those values must be logical false in the returned map (or no such key is present in the returned map at all). Three examples of such functions are included in core.rrb-vector's debug namespace. * edit-nodes-errors * basic-node-errors * ranges-errors They each look for different problems in the vector data structure internals. They were developed as separate functions in case there was ever a significant performance advantage to configuring only some of them to be called, not all of them, for long tests. If any errors are found, this function calls record-failure-data, to record the details in a global atom. It prints a message to *out*, and if (:continue-on-error opts) is logical false, it throws a data conveying exception using ex-info containing the same message, and the same error details map passed to record-failure-data. If no exception is thrown due to an error, then repeat the same checks for a warning message, recording details via calling record-warning-data, and throwing an exception if (:continue-on-warning opts) is logical false." [err-desc-str ret args opts] (doseq [check-fn (:return-value-checks opts)] (let [i (check-fn ret)] (when (:error i) (let [msg (str "found error in ret value from " err-desc-str ": " (:description i)) failure-data {:err-desc-str err-desc-str, :ret ret, :args args, :error-info i}] (println "ERROR:" msg) (record-failure-data failure-data) (when-not (:continue-on-error opts) (throw (ex-info msg failure-data))))) (when (:warning i) ;; It is perfectly normal for fv/subvec and slicev to return a ;; vector that causes this warning. (when-not (and (= err-desc-str "slicev") (= :root-too-deep (:kind i))) (let [msg (str "possible issue with ret value from " err-desc-str ": " (:description i)) failure-data {:err-desc-str err-desc-str, :ret ret, :args args, :error-info i}] (println "WARNING:" msg) (record-warning-data failure-data) (when-not (:continue-on-warning opts) (throw (ex-info msg failure-data))))))))) (defn validating-pop "validating-pop is not really designed to be called from user programs. checking-pop can do everything that validating-pop can, and more. See its documentation. A typical way of calling validating-pop is: (require '[clojure.core.rrb-vector.debug :as d]) (d/validating-pop clojure.core/pop \"pop\" coll) Most of the validating-* functions behave similarly. This one contains the most complete documentation, and the others refer to this one. They all differ in the function that they are intended to validate, and a few other details, which will be collected in one place here for function validating-pop so one can quickly see the differences between validating-pop and the other validating-* functions. good example f: clojure.core/pop opts map: (get @d/debug-opts :pop) The first argument can be any function f. f is expected to take arguments and return a value equal to what clojure.core/pop would, given the argument coll. validating-pop will first make a copy of the seq of items in coll, as a safety precaution, because some kinds of incorrect implementations of pop could mutate their input argument. That would be a bug, of course, but aiding a developer in detecting bugs is the reason validating-pop exists. It uses the function copying-seq to do this, which takes at least linear time in the size of coll. It will then calculate a sequence that is = to the expected return value, e.g. for pop, all items in coll except the last one. Then validating-pop will call (f coll), then call copying-seq on the return value. If the expected and returned sequences are not =, then a map containing details about the arguments and actual return value is created and passed to d/record-failure-data, which appends the map to the end of a vector that is the value of an atom named d/failure-data. An exception is thrown if (:continue-on-error opts) is logical false, with ex-data equal to this same map of error data. If the expected and actual sequences are the same, no state is modified and no exception is thrown. If validating-pop does not throw an exception, the return value is (f coll)." [f err-desc-str coll] (let [coll-seq (copying-seq coll) exp-ret-seq (butlast coll-seq) ret (f coll) ret-seq (copying-seq ret)] (when (not= ret-seq exp-ret-seq) (validation-failure "(pop coll) returned incorrect value" {:err-desc-str err-desc-str, :ret ret, :args (list coll), :coll-seq coll-seq, :ret-seq ret-seq, :exp-ret-seq exp-ret-seq} (get @debug-opts :pop))) ret)) (defn checking-pop "These two namespace aliases will be used later in this documentation: (require '[clojure.core.rrb-vector.debug :as d]) (require '[clojure.core.rrb-vector.debug-platform-dependent :as pd]) checking-pop passes its argument to clojure.core/pop, and if it returns, it returns whatever clojure.core/pop does. If checking-pop detects any problems, it will record information about the problems found in one or both of the global atoms 'd/failure-data' and 'd/warning-data', and optionally throw an exception. If coll is not a vector type according to pd/is-vector?, then checking-pop simply behaves exactly like clojure.core/pop, with no additional checks performed. All of checking-pop's extra checks are specific to vectors. If coll is a vector, then checking-pop looks up the key :pop in a global atom 'd/debug-opts'. The result of that lookup is a map we will call 'opts' below. opts map: (get @d/debug-opts :pop) function called if (:validating opts) is logical true: validating-pop If (:trace opts) is true, then a debug trace message is printed to *out*. If (:validate opts) is true, then validating-pop is called, using clojure.core/pop to do the real work, but validating-pop will check whether the return value looks correct relative to the input parameter value, i.e. it is equal to a sequence of values containing all but the last element of the input coll's sequence of values. See validating-pop documentation for additional details. This step records details of problems found in the atoms d/failure-data. (:return-value-checks opts) should be a sequence of functions that each take the vector returned from calling clojure.core/pop, and return data about any errors or warnings they find in the internals of the vector data structure. Errors or warnings are appended to atoms d/failure-data and/or d/warning-data. If either the validate or return value checks steps find an error, they throw an exception if (:continue-on-error opts) is logical false. If the return value checks step finds no error, but does find a warning, it throws an exception if (:continue-on-warning opts) is logical false." [coll] (if-not (pd/is-vector? coll) (clojure.core/pop coll) (let [opts (get @debug-opts :pop) err-desc-str "pop"] (when (:trace opts) (println "checking-pop called with #v=" (count coll) "(type v)=" (type coll))) (let [ret (if (:validate opts) (validating-pop clojure.core/pop err-desc-str coll) (clojure.core/pop coll))] (sanity-check-vector-internals err-desc-str ret [coll] opts) ret)))) (defn validating-pop! "validating-pop! behaves the same as validating-pop, with the differences described here. See validating-pop for details. good example f: clojure.core/pop! opts map: (get @d/debug-opts :pop!) If no exception is thrown, the return value is (f coll)." [f err-desc-str coll] (let [coll-seq (copying-seq coll) exp-ret-seq (butlast coll-seq) ret (f coll) ret-seq (copying-seq ret)] (when (not= ret-seq exp-ret-seq) (validation-failure "(pop! coll) returned incorrect value" {:err-desc-str err-desc-str, :ret ret, :args (list coll), :coll-seq coll-seq, :ret-seq ret-seq, :exp-ret-seq exp-ret-seq} (get @debug-opts :pop!))) ret)) (defn checking-pop! "checking-pop! is similar to checking-pop, with the differences summarized below. See checking-pop documentation for details. opts map: (get @d/debug-opts :pop!) function called if (:validating opts) is logical true: validating-pop!" [coll] (if-not (pd/is-vector? coll) (clojure.core/pop! coll) (let [opts (get @debug-opts :pop!) err-desc-str "pop!"] (when (:trace opts) (println "checking-pop! called with #v=" (count coll) "(type v)=" (type coll))) (let [ret (if (:validate opts) (validating-pop! clojure.core/pop! err-desc-str coll) (clojure.core/pop! coll))] (sanity-check-vector-internals err-desc-str ret [coll] opts) ret)))) (defn validating-transient "validating-transient behaves the same as validating-pop, with the differences described here. See validating-pop for details. good example f: clojure.core/transient opts map: (get @d/debug-opts :transient) If no exception is thrown, the return value is (f coll)." [f err-desc-str coll] (let [coll-seq (copying-seq coll) exp-ret-seq coll-seq ret (f coll) ret-seq (copying-seq ret)] (when (not= ret-seq exp-ret-seq) (validation-failure "(transient coll) returned incorrect value" {:err-desc-str err-desc-str, :ret ret, :args (list coll), :coll-seq coll-seq, :ret-seq ret-seq, :exp-ret-seq exp-ret-seq} (get @debug-opts :transient))) ret)) (defn checking-transient "checking-transient is similar to checking-pop, with the differences summarized below. See checking-pop documentation for details. opts map: (get @d/debug-opts :transient) function called if (:validating opts) is logical true: validating-transient" [coll] (if-not (pd/is-vector? coll) (clojure.core/transient coll) (let [opts (get @debug-opts :transient) err-desc-str "transient"] (when (:trace opts) (println "checking-transient called with #v=" (count coll) "(type v)=" (type coll))) (let [ret (if (:validate opts) (validating-transient clojure.core/transient err-desc-str coll) (clojure.core/transient coll))] (sanity-check-vector-internals err-desc-str ret [coll] opts) ret)))) (defn validating-splice-rrbts-main "validating-splice-rrbts-main behaves the same as validating-pop, with the differences described here. See validating-pop for details. good example f: clojure.core.rrb-vector.rrbt/splice-rrbts-main opts map: (get @d/debug-opts :catvec) ;; _not_ :splice-rrbts-main Given that splice-rrbts-main is an internal implementation detail of the core.rrb-vector library, it is expected that it is more likely you would call validating-catvec instead of this function. If no exception is thrown, the return value is (f v1 v2)." [err-desc-str v1 v2] (let [orig-fn rrbt/splice-rrbts-main v1-seq (copying-seq v1) v2-seq (copying-seq v2) exp-ret-seq (concat v1-seq v2-seq) ret (orig-fn v1 v2) ret-seq (copying-seq ret)] (when (not= ret-seq exp-ret-seq) (validation-failure "splice-rrbts-main returned incorrect value" {:err-desc-str err-desc-str, :ret ret, :args (list v1 v2) :v1-seq v1-seq, :v2-seq v2-seq, :ret-seq ret-seq, :exp-ret-seq exp-ret-seq} (get @debug-opts :catvec))) ret)) (defn checking-splice-rrbts-main "checking-splice-rrbts-main is similar to checking-pop, with the differences summarized below. See checking-pop documentation for details. Unlike checking-pop, it seems unlikely that a user of core.rrb-vector would want to call this function directly. See checking-catvec. checking-splice-rrbts-main is part of the implementation of checking-catvec. opts map: (get @d/debug-opts :catvec) ;; _not_ :splice-rrbts-main function called if (:validating opts) is logical true: validating-splice-rrbts-main" [& args] (let [opts (get @debug-opts :catvec) err-desc-str "splice-rrbts-main"] (when (:trace opts) (let [[v1 v2] args] (println "checking-splice-rrbts-main called with #v1=" (count v1) "#v2=" (count v2) "(type v1)=" (type v1) "(type v2)=" (type v2)))) (let [ret (if (:validate opts) (apply validating-splice-rrbts-main err-desc-str args) (apply rrbt/splice-rrbts-main args))] (sanity-check-vector-internals err-desc-str ret args opts) ret))) (defn checking-splice-rrbts "checking-splice-rrbts is similar to checking-pop, with the differences summarized below. See checking-pop documentation for details. Unlike checking-pop, it seems unlikely that a user of core.rrb-vector would want to call this function directly. See checking-catvec. checking-splice-rrbts is part of the implementation of checking-catvec. opts map: (get @d/debug-opts :catvec) ;; _not_ :splice-rrbts function called if (:validating opts) is logical true: validating-splice-rrbts" [& args] (let [opts (get @debug-opts :catvec) err-desc-str1 "splice-rrbts checking peephole-optimize-root result" err-desc-str2 "splice-rrbts checking fallback-to-slow-splice-if-needed result" [v1 v2] args] (when (:trace opts) (println "checking-splice-rrbts called with #v1=" (count v1) "#v2=" (count v2) "(type v1)=" (type v1) "(type v2)=" (type v2))) (let [r1 (checking-splice-rrbts-main v1 v2) r2 (rrbt/peephole-optimize-root r1)] ;; Optimize a bit by only doing all of the sanity checks on r2 ;; if it is not the same identical data structure r1 that ;; checking-splice-rrbts-main already checked. (when-not (identical? r2 r1) (sanity-check-vector-internals err-desc-str1 r2 args opts)) (let [r3 (rrbt/fallback-to-slow-splice-if-needed v1 v2 r2)] (when-not (identical? r3 r2) (sanity-check-vector-internals err-desc-str2 r3 args opts)) r3)))) (defn checking-splicev "checking-splicev is identical to splicev, except that it calls checking-splice-rrbts instead of splice-rrbts, for configurable additional checking on each call to checking-splice-rrbts. It is more likely that a core.rrb-vector library user will want to call checking-catvec rather than this one. checking-splicev is part of the implementation of checking-catvec." [v1 v2] (let [rv1 (rrbt/-as-rrbt v1)] (checking-splice-rrbts rv1 (rrbt/-as-rrbt v2)))) (defn checking-catvec-impl "checking-catvec-impl is identical to catvec, except that it calls checking-splicev instead of splicev, for configurable additional checking on each call to checking-splicev." ([] []) ([v1] v1) ([v1 v2] (checking-splicev v1 v2)) ([v1 v2 v3] (checking-splicev (checking-splicev v1 v2) v3)) ([v1 v2 v3 v4] (checking-splicev (checking-splicev v1 v2) (checking-splicev v3 v4))) ([v1 v2 v3 v4 & vn] (checking-splicev (checking-splicev (checking-splicev v1 v2) (checking-splicev v3 v4)) (apply checking-catvec-impl vn)))) (defn validating-catvec "validating-catvec behaves similarly to validating-pop, but note that it does not allow you to pass in a function f on which to concatenate its arguments. It hardcodes d/checking-catvec-impl for that purpose. See validating-pop for more details. opts map: (get @d/debug-opts :catvec) If no exception is thrown, the return value is (apply checking-catvec-impl vs)." [err-desc-str & vs] (let [orig-fn checking-catvec-impl ;; clojure.core.rrb-vector/catvec vs-seqs (doall (map copying-seq vs)) exp-ret-seq (apply concat vs-seqs) ret (apply orig-fn vs) ret-seq (copying-seq ret)] (when (not= ret-seq exp-ret-seq) (validation-failure "catvec returned incorrect value" {:err-desc-str err-desc-str, :ret ret, :args vs, :vs-seqs vs-seqs, :ret-seq ret-seq, :exp-ret-seq exp-ret-seq} (get @debug-opts :catvec))) ret)) (defn checking-catvec "checking-catvec is similar to checking-pop, with the differences summarized below. See checking-pop documentation for details. Note that (get @d/debug-otps :catvec) is used to control tracing, validating, and return value sanity checks for checking-catvec as a whole. This includes controlling those options for the function checking-splice-rrbts, which is used to concatenate pairs of vectors when you call checking-catvec with 3 or more vectors. This takes a bit longer to do the checking on every concatenation, but catches problems closer to the time they are introduced. opts map: (get @d/debug-opts :catvec) function called if (:validating opts) is logical true: validating-catvec" [& args] (let [opts (get @debug-opts :catvec) err-desc-str "catvec"] (when (:trace opts) (println "checking-catvec called with" (count args) "args:") (dorun (map-indexed (fn [idx v] (println " arg" (inc idx) " count=" (count v) "type=" (type v))) args))) (let [ret (if (:validate opts) (apply validating-catvec err-desc-str args) (apply checking-catvec-impl ;; clojure.core.rrb-vector/catvec args))] (sanity-check-vector-internals err-desc-str ret args opts) ret))) (defn validating-slicev "validating-slicev behaves similarly to validating-pop, but note that it does not allow you to pass in a function f to call. It hardcodes slicev for that purpose. See validating-pop for more details. opts map: (get @d/debug-opts :subvec) ;; _not_ :slicev" ([err-desc-str coll start] (validating-slicev err-desc-str coll start (count coll))) ([err-desc-str coll start end] (let [coll-seq (copying-seq coll) exp-ret-seq (take (- end start) (drop start coll-seq)) ret (clojure.core.rrb-vector.protocols/-slicev coll start end) ret-seq (copying-seq ret)] (when (not= ret-seq exp-ret-seq) (validation-failure "(slicev coll start end) returned incorrect value" {:err-desc-str err-desc-str, :ret ret, :args (list coll start end), :coll-seq coll-seq, :ret-seq ret-seq, :exp-ret-seq exp-ret-seq} (get @debug-opts :subvec))) ret))) (defn checking-slicev "checking-slicev is similar to checking-pop, with the differences summarized below. See checking-pop documentation for details. Unlike checking-pop, it seems unlikely that a user of core.rrb-vector would want to call this function directly. See checking-subvec. checking-slicev is part of the implementation of checking-subvec. opts map: (get @d/debug-opts :subvec) ;; _not_ :slicev function called if (:validating opts) is logical true: validating-slicev" [& args] (let [opts (get @debug-opts :subvec) err-desc-str "slicev"] (when (:trace opts) (let [[v start end] args] (println "checking-slicev #v=" (count v) "start=" start "end=" end "type=" (type v)))) (let [ret (if (:validate opts) (apply validating-slicev err-desc-str args) (apply clojure.core.rrb-vector.protocols/-slicev args))] (sanity-check-vector-internals err-desc-str ret args opts) ret))) (defn checking-subvec "checking-subvec is similar to checking-pop, with the differences summarized below. See checking-pop documentation for details. opts map: (get @d/debug-opts :subvec) function called if (:validating opts) is logical true: validating-slicev" ([v start] (checking-slicev v start (count v))) ([v start end] (checking-slicev v start end))) (defn check-subvec "Perform a sequence of calls to subvec an a core.rrb-vector vector, as well as a normal Clojure vector, returning true if they give the same results, otherwise false. Intended for use in tests of this library." [extra-checks? init & starts-and-ends] (let [v1 (loop [v (vec (range init)) ses (seq starts-and-ends)] (if ses (let [[s e] ses] (recur (clojure.core/subvec v s e) (nnext ses))) v)) my-subvec (if extra-checks? checking-subvec fv/subvec) v2 (loop [v (fv/vec (range init)) ses (seq starts-and-ends)] (if ses (let [[s e] ses] (recur (my-subvec v s e) (nnext ses))) v))] (pd/same-coll? v1 v2))) (defn check-catvec "Perform a sequence of calls to catvec or checking-catvec on one or more core.rrb-vector vectors. Return true if Clojure's built-in concat function give the same results, otherwise false. Intended for use in tests of this library." [extra-checks? & counts] (let [prefix-sums (reductions + counts) ranges (map range (cons 0 prefix-sums) prefix-sums) v1 (apply concat ranges) my-catvec (if extra-checks? checking-catvec fv/catvec) v2 (apply my-catvec (map fv/vec ranges))] (pd/same-coll? v1 v2))) (defn generative-check-subvec "Perform many calls to check-subvec with randomly generated inputs. Intended for use in tests of this library. Returns true if all tests pass, otherwise throws an exception containing data about the inputs that caused the failing test." [extra-checks? iterations max-init-cnt slices] (dotimes [_ iterations] (let [init-cnt (rand-int (inc max-init-cnt)) s1 (rand-int init-cnt) e1 (+ s1 (rand-int (- init-cnt s1)))] (loop [s&es [s1 e1] cnt (- e1 s1) slices slices] (if (or (zero? cnt) (zero? slices)) (if-not (try (apply check-subvec extra-checks? init-cnt s&es) (catch js/Error e (throw (ex-info "check-subvec failure w/ Exception" {:init-cnt init-cnt :s&es s&es} e)))) (throw (ex-info "check-subvec failure w/o Exception" {:init-cnt init-cnt :s&es s&es}))) (let [s (rand-int cnt) e (+ s (rand-int (- cnt s))) c (- e s)] (recur (conj s&es s e) c (dec slices))))))) true) (defn generative-check-catvec "Perform many calls to check-catvec with randomly generated inputs. Intended for use in tests of this library. Returns true if all tests pass, otherwise throws an exception containing data about the inputs that caused the failing test." [extra-checks? iterations max-vcnt min-cnt max-cnt] (dotimes [_ iterations] (let [vcnt (inc (rand-int (dec max-vcnt))) cnts (vec (repeatedly vcnt #(+ min-cnt (rand-int (- (inc max-cnt) min-cnt)))))] (if-not (try (apply check-catvec extra-checks? cnts) (catch js/Error e (throw (ex-info "check-catvec failure w/ Exception" {:cnts cnts} e)))) (throw (ex-info "check-catvec failure w/o Exception" {:cnts cnts}))))) true) ================================================ FILE: src/main/cljs/clojure/core/rrb_vector/debug_platform_dependent.cljs ================================================ ; Copyright (c) Rich Hickey and contributors. All rights reserved. ; The use and distribution terms for this software are covered by the ; Eclipse Public License 1.0 (http://opensource.org/licenses/eclipse-1.0.php) ; which can be found in the file epl-v10.html at the root of this distribution. ; By using this software in any fashion, you are agreeing to be bound by ; the terms of this license. ; You must not remove this notice, or any other, from this software. (ns clojure.core.rrb-vector.debug-platform-dependent (:require [clojure.core.rrb-vector.rrbt :as rrbt] [clojure.core.rrb-vector.nodes :refer [regular? node-ranges]] [clojure.core.rrb-vector :as fv] [goog.string :as gstring] goog.string.format)) (defn format [& args] (apply gstring/format args)) (defn printf [& args] (print (apply gstring/format args))) (defn internal-node? [x] ;; TBD: Is there another type that should be included here? Clojure ;; only has this one, plus a VecNode that is distinct to the ;; primitive vectors, which ClojureScript does not have. (instance? cljs.core/VectorNode x)) (defn persistent-vector? [x] (or (instance? cljs.core/PersistentVector x) (instance? cljs.core/Subvec x) (instance? clojure.core.rrb-vector.rrbt/Vector x))) (defn transient-vector? [x] (or (instance? cljs.core/TransientVector x) (instance? clojure.core.rrb-vector.rrbt/Transient x))) (defn is-vector? [x] (or (persistent-vector? x) (transient-vector? x))) (defn dbg-tailoff [v] (cond (or (instance? cljs.core/PersistentVector v) (instance? cljs.core/TransientVector v)) (#'cljs.core/tail-off v) (or (instance? clojure.core.rrb-vector.rrbt/Vector v) (instance? clojure.core.rrb-vector.rrbt/Transient v)) (rrbt/-tail-offset v) :else (throw (ex-info (str "Called debug-tailoff on value with unsupported type " (pr-str (type v))) {:value v})))) (defn subvector-data [v] (if (instance? cljs.core/Subvec v) {:orig-v v :subvector? true :v (.-v v) :subvec-start (.-start v) :subvec-end (.-end v)} {:orig-v v :subvector? false :v v})) (defn unwrap-subvec-accessors-for [v] (let [{:keys [v] :as m} (subvector-data v)] (merge m {:get-root #(.-root %) :get-shift #(.-shift %) :get-tail #(.-tail %) :get-cnt #(.-cnt %) :get-array #(.-arr %) :get-ranges node-ranges :regular? regular? :tail-len alength}))) (defn dbg-tidx [tv] (if (transient-vector? tv) (if (instance? cljs.core/TransientVector tv) (let [c (.-cnt tv)] (if (== c 32) 32 (bit-and c 0x01f))) (.-tidx tv)))) (defn abbrev-for-type-of [obj] (let [tn (pr-str (type obj)) d (.lastIndexOf tn ".")] (subs tn (inc d)))) (defn same-coll? [a b] (and (= (count a) (count b)) (= a b) (= b a) (= (hash a) (hash b)) ;; TBD: Is there anything JavaScript specific that corresponds ;; to Java's .hashCode method? ;;(= (.hashCode ^Object a) (.hashCode ^Object b)) )) ;; In ClojureScript, as in JavaScript, there is only 1 thread, so all ;; updates to transients are automatically thread confined, to the ;; only thread. There is still an edit field/property on the vector ;; tree nodes, but it is used only for the purpose of distinguishing ;; which nodes are uniquely "owned" by this transient vector, ;; vs. those tree nodes that might be shared with other persistent ;; vectors. For that purpose, when a vector is made transient, its ;; root tree node is assigned a new unique JavaScript object, as ;; returned by (js-obj). Any tree node whose edit field is identical ;; to that one is owned by this transient, any not identical are not. ;; Only the root tree node has its edit field changed to nil when the ;; transient vector is converted to persistent, because any other tree ;; nodes that are identical to it will never be identical to another ;; new (js-obj) return value. ;; Thus for the JavaScript implementation, it is not the case that in ;; persistent vectors that all edit fields must be nil, or an ;; AtomicReference that contains nil. Some of them can be return ;; values of (js-obj) from times when they were part of a transient ;; vector, but are no longer. ;; About the only thing we can check here that I believe must always ;; be true, is that persistent vectors have a root tree node with edit ;; field equal to nil, and transient vectors must have a root tree ;; node with edit field not equal to nil. (defn edit-nodes-errors [v _] (let [{:keys [v get-root]} (unwrap-subvec-accessors-for v) root-edit (.-edit (get-root v)) root-edit-is-nil? (nil? root-edit)] (cond (and (transient-vector? v) root-edit-is-nil?) {:error true :description (str "A transient vector with type" (pr-str (type v)) " has a root edit property with value nil" " - expecting a non-nil JavaScript object") :data v} (and (persistent-vector? v) (not root-edit-is-nil?)) {:error false, :warning true, :description (str "A persistent vector with type " (pr-str (type v)) " has a root edit property with value " root-edit " - often this is nil instead." " It requires more thought to be certain" " whether this could lead to problems," " hence why this is only a warning") :data v} :else {:error false}))) ================================================ FILE: src/main/cljs/clojure/core/rrb_vector/interop.cljs ================================================ ; Copyright (c) Rich Hickey and contributors. All rights reserved. ; The use and distribution terms for this software are covered by the ; Eclipse Public License 1.0 (http://opensource.org/licenses/eclipse-1.0.php) ; which can be found in the file epl-v10.html at the root of this distribution. ; By using this software in any fashion, you are agreeing to be bound by ; the terms of this license. ; You must not remove this notice, or any other, from this software. (ns clojure.core.rrb-vector.interop (:require [clojure.core.rrb-vector.protocols :refer [PSliceableVector -slicev PSpliceableVector -splicev]] [clojure.core.rrb-vector.rrbt :refer [-as-rrbt]])) (extend-protocol PSliceableVector cljs.core/PersistentVector (-slicev [v start end] (-slicev (-as-rrbt v) start end)) cljs.core/Subvec (-slicev [v start end] (-slicev (-as-rrbt v) start end))) (extend-protocol PSpliceableVector cljs.core/PersistentVector (-splicev [v1 v2] (-splicev (-as-rrbt v1) v2)) cljs.core/Subvec (-splicev [v1 v2] (-splicev (-as-rrbt v1) v2))) ================================================ FILE: src/main/cljs/clojure/core/rrb_vector/macros.clj ================================================ ; Copyright (c) Rich Hickey and contributors. All rights reserved. ; The use and distribution terms for this software are covered by the ; Eclipse Public License 1.0 (http://opensource.org/licenses/eclipse-1.0.php) ; which can be found in the file epl-v10.html at the root of this distribution. ; By using this software in any fashion, you are agreeing to be bound by ; the terms of this license. ; You must not remove this notice, or any other, from this software. (ns clojure.core.rrb-vector.macros (:refer-clojure :exclude [assert])) (def ^:const elide-assertions? true) (def ^:const elide-debug-printouts? true) (defmacro assert [& args] (if-not elide-assertions? `(clojure.core/assert ~@args))) (defmacro dbg [& args] (if-not elide-debug-printouts? `(prn ~@args))) (defmacro dbg- [& args]) (defmacro ^:private gen-vector-method [& params] (let [arr (gensym "arr__")] `(let [~arr (cljs.core/make-array ~(count params))] ~@(map-indexed (fn [i param] `(cljs.core/aset ~arr ~i ~param)) params) (clojure.core.rrb-vector.rrbt/Vector. ~(count params) 5 cljs.core/PersistentVector.EMPTY_NODE ~arr nil nil)))) ================================================ FILE: src/main/cljs/clojure/core/rrb_vector/nodes.cljs ================================================ ; Copyright (c) Rich Hickey and contributors. All rights reserved. ; The use and distribution terms for this software are covered by the ; Eclipse Public License 1.0 (http://opensource.org/licenses/eclipse-1.0.php) ; which can be found in the file epl-v10.html at the root of this distribution. ; By using this software in any fashion, you are agreeing to be bound by ; the terms of this license. ; You must not remove this notice, or any other, from this software. (ns clojure.core.rrb-vector.nodes (:refer-clojure :exclude [clone])) ;;; node ops (def empty-node cljs.core.PersistentVector.EMPTY_NODE) (defn clone [shift node] (VectorNode. (.-edit node) (aclone (.-arr node)))) (defn regular? [node] (not (== (alength (.-arr node)) 33))) ;;; ranges (defn node-ranges [node] (aget (.-arr node) 32)) (defn last-range [node] (let [rngs (node-ranges node) i (dec (aget rngs 32))] (aget rngs i))) (defn regular-ranges [shift cnt] (let [step (bit-shift-left 1 shift) rngs (make-array 33)] (loop [i 0 r step] (if (< r cnt) (do (aset rngs i r) (recur (inc i) (+ r step))) (do (aset rngs i cnt) (aset rngs 32 (inc i)) rngs))))) ;;; root overflow (defn overflow? [root shift cnt] (if (regular? root) (> (bit-shift-right cnt 5) (bit-shift-left 1 shift)) (let [rngs (node-ranges root) slc (aget rngs 32)] (and (== slc 32) (or (== shift 5) (recur (aget (.-arr root) (dec slc)) (- shift 5) (+ (- (aget rngs 31) (aget rngs 30)) 32))))))) ;;; find nil / 0 (defn index-of-0 [arr] (loop [l 0 h 31] (if (>= l (dec h)) (if (zero? (int (aget arr l))) l (if (zero? (int (aget arr h))) h 32)) (let [mid (+ l (bit-shift-right (- h l) 1))] (if (zero? (int (aget arr mid))) (recur l mid) (recur (inc mid) h)))))) (defn index-of-nil ^long [arr] (loop [l 0 h 31] (if (>= l (dec h)) (if (nil? (aget arr l)) l (if (nil? (aget arr h)) h 32)) (let [mid (+ l (bit-shift-right (- h l) 1))] (if (nil? (aget arr mid)) (recur l mid) (recur (inc mid) h)))))) ;;; children (defn first-child [node] (aget (.-arr node) 0)) (defn last-child [node] (let [arr (.-arr node)] (if (regular? node) (aget arr (dec (index-of-nil arr))) (aget arr (dec (aget (node-ranges node) 32)))))) (defn remove-leftmost-child [shift parent] (let [arr (.-arr parent)] (if (nil? (aget arr 1)) nil (let [r? (regular? parent) new-arr (make-array (if r? 32 33))] (array-copy arr 1 new-arr 0 31) (if-not r? (let [rngs (node-ranges parent) rng0 (aget rngs 0) new-rngs (make-array 33) lim (aget rngs 32)] (array-copy rngs 1 new-rngs 0 (dec lim)) (loop [i 0] (when (< i lim) (aset new-rngs i (- (aget new-rngs i) rng0)) (recur (inc i)))) (aset new-rngs 32 (dec (aget rngs 32))) (aset new-rngs (dec (aget rngs 32)) 0) (aset new-arr 32 new-rngs))) (->VectorNode (.-edit parent) new-arr))))) (defn replace-leftmost-child [shift parent pcnt child d] (if (regular? parent) (let [step (bit-shift-left 1 shift) rng0 (- step d) ncnt (- pcnt d) li (bit-and (bit-shift-right shift (dec pcnt)) 0x1f) arr (.-arr parent) new-arr (make-array 33) new-rngs (make-array 33)] (aset new-arr 0 child) (array-copy arr 1 new-arr 1 li) (aset new-arr 32 new-rngs) (aset new-rngs 0 rng0) (aset new-rngs li ncnt) (aset new-rngs 32 (inc li)) (loop [i 1] (when (<= i li) (aset new-rngs i (+ (aget new-rngs (dec i)) step)) (recur (inc i)))) (->VectorNode nil new-arr)) (let [new-arr (aclone (.-arr parent)) rngs (node-ranges parent) new-rngs (make-array 33) li (dec (aget rngs 32))] (aset new-rngs 32 (aget rngs 32)) (aset new-arr 32 new-rngs) (aset new-arr 0 child) (loop [i 0] (when (<= i li) (aset new-rngs i (- (aget rngs i) d)) (recur (inc i)))) (->VectorNode nil new-arr)))) (defn replace-rightmost-child [shift parent child d] (if (regular? parent) (let [arr (.-arr parent) i (dec (index-of-nil arr))] (if (regular? child) (let [new-arr (aclone arr)] (aset new-arr i child) (->VectorNode nil new-arr)) (let [arr (.-arr parent) new-arr (make-array 33) step (bit-shift-left 1 shift) rngs (make-array 33)] (aset rngs 32 (inc i)) (aset new-arr 32 rngs) (array-copy arr 0 new-arr 0 i) (aset new-arr i child) (loop [j 0 r step] (when (<= j i) (aset rngs j r) (recur (inc j) (+ r step)))) (aset rngs i (last-range child)) (->VectorNode nil new-arr)))) (let [rngs (node-ranges parent) new-rngs (aclone rngs) i (dec (aget rngs 32)) new-arr (aclone (.-arr parent))] (aset new-arr i child) (aset new-arr 32 new-rngs) (aset new-rngs i (+ (aget rngs i) d)) (->VectorNode nil new-arr)))) ;;; fold-tail (defn new-path* [shift node] (let [reg? (== 32 (alength (.-arr node))) len (if reg? 32 33) arr (make-array len) rngs (if-not reg? (doto (make-array 33) (aset 0 (alength (.-arr node))) (aset 32 1))) ret (->VectorNode nil arr)] (loop [arr arr shift shift] (if (== shift 5) (do (if-not reg? (aset arr 32 rngs)) (aset arr 0 node)) (let [a (make-array len) e (->VectorNode nil a)] (aset arr 0 e) (if-not reg? (aset arr 32 rngs)) (recur a (- shift 5))))) ret)) (defn fold-tail [node shift cnt tail] (let [tlen (alength tail) reg? (and (regular? node) (== tlen 32)) arr (.-arr node) li (index-of-nil arr) new-arr (make-array (if reg? 32 33)) rngs (if-not (regular? node) (node-ranges node)) cret (if (== shift 5) (->VectorNode nil tail) (fold-tail (aget arr (dec li)) (- shift 5) (if (regular? node) (mod cnt (bit-shift-left 1 shift)) (let [li (dec (aget rngs 32))] (if (pos? li) (- (aget rngs li) (aget rngs (dec li))) (aget rngs 0)))) tail)) new-rngs (if-not reg? (if rngs (aclone rngs) (regular-ranges shift cnt)))] (when-not (and (or (nil? cret) (== shift 5)) (== li 32)) (array-copy arr 0 new-arr 0 li) (when-not reg? (if (or (nil? cret) (== shift 5)) (do (aset new-rngs li (+ (if (pos? li) (aget new-rngs (dec li)) (int 0)) tlen)) (aset new-rngs 32 (inc li))) (do (when (pos? li) (aset new-rngs (dec li) (+ (aget new-rngs (dec li)) tlen))) (aset new-rngs 32 li)))) (if-not reg? (aset new-arr 32 new-rngs)) (if (nil? cret) (aset new-arr li (new-path* (- shift 5) (->VectorNode nil tail))) (aset new-arr (if (== shift 5) li (dec li)) cret)) (->VectorNode nil new-arr)))) ================================================ FILE: src/main/cljs/clojure/core/rrb_vector/protocols.cljs ================================================ ; Copyright (c) Rich Hickey and contributors. All rights reserved. ; The use and distribution terms for this software are covered by the ; Eclipse Public License 1.0 (http://opensource.org/licenses/eclipse-1.0.php) ; which can be found in the file epl-v10.html at the root of this distribution. ; By using this software in any fashion, you are agreeing to be bound by ; the terms of this license. ; You must not remove this notice, or any other, from this software. (ns clojure.core.rrb-vector.protocols) (defprotocol PSpliceableVector (-splicev [v1 v2])) (defprotocol PSliceableVector (-slicev [v start end])) ================================================ FILE: src/main/cljs/clojure/core/rrb_vector/rrbt.cljs ================================================ ; Copyright (c) Rich Hickey and contributors. All rights reserved. ; The use and distribution terms for this software are covered by the ; Eclipse Public License 1.0 (http://opensource.org/licenses/eclipse-1.0.php) ; which can be found in the file epl-v10.html at the root of this distribution. ; By using this software in any fashion, you are agreeing to be bound by ; the terms of this license. ; You must not remove this notice, or any other, from this software. (ns clojure.core.rrb-vector.rrbt (:refer-clojure :exclude [array-for push-tail pop-tail new-path do-assoc]) (:require [clojure.core.rrb-vector.protocols :refer [PSliceableVector -slicev PSpliceableVector -splicev]] [clojure.core.rrb-vector.nodes :refer [regular? empty-node node-ranges overflow? last-range regular-ranges first-child last-child remove-leftmost-child replace-leftmost-child replace-rightmost-child fold-tail new-path* index-of-nil]] [clojure.core.rrb-vector.trees :refer [push-tail pop-tail new-path do-assoc]] [clojure.core.rrb-vector.transients :refer [ensure-editable editable-root editable-tail push-tail! pop-tail! do-assoc!]]) (:require-macros [clojure.core.rrb-vector.macros :refer [dbg]])) (def ^:const rrbt-concat-threshold 33) (def ^:const max-extra-search-steps 2) (defprotocol IVecImpl (-tail-offset [v]) (-array-for [v i])) (defprotocol AsRRBT (-as-rrbt [v])) ;;; chunked seqs: can't reuse cljs.core's without tweaks, since rrb ;;; vectors have a different array-for (declare rrb-chunked-seq) (deftype RRBChunkedSeq [vec node i off meta ^:mutable __hash] Object (toString [coll] (pr-str* coll)) IPrintWithWriter (-pr-writer [this writer opts] (pr-sequential-writer writer pr-writer "(" " " ")" opts this)) IWithMeta (-with-meta [coll m] (rrb-chunked-seq vec node i off m)) IMeta (-meta [coll] meta) ISeqable (-seq [coll] coll) ISequential IEquiv (-equiv [coll other] (equiv-sequential coll other)) ASeq ISeq (-first [coll] (aget node off)) (-rest [coll] (if (< (inc off) (alength node)) (let [s (rrb-chunked-seq vec node i (inc off))] (if (nil? s) () s)) (-chunked-rest coll))) INext (-next [coll] (if (< (inc off) (alength node)) (let [s (rrb-chunked-seq vec node i (inc off))] (if (nil? s) nil s)) (-chunked-next coll))) ICollection (-conj [coll o] (cons o coll)) IEmptyableCollection (-empty [coll] (with-meta cljs.core.List.EMPTY meta)) IChunkedSeq (-chunked-first [coll] (array-chunk node off)) (-chunked-rest [coll] (let [l (alength node) s (when (< (+ i l) (-count vec)) (rrb-chunked-seq vec (+ i l) 0))] (if (nil? s) () s))) IChunkedNext (-chunked-next [coll] (let [l (alength node) s (when (< (+ i l) (-count vec)) (rrb-chunked-seq vec (+ i l) 0))] (if (nil? s) nil s))) IHash (-hash [coll] (caching-hash coll hash-ordered-coll __hash)) IReduce (-reduce [coll f] (ci-reduce (cljs.core/subvec vec (+ i off) (count vec)) f)) (-reduce [coll f start] (ci-reduce (cljs.core/subvec vec (+ i off) (count vec)) f start))) (defn rrb-chunked-seq ([vec i off] (RRBChunkedSeq. vec (-array-for vec i) i off nil nil)) ([vec node i off] (RRBChunkedSeq. vec node i off nil nil)) ([vec node i off meta] (RRBChunkedSeq. vec node i off meta nil))) (defn slice-right [node shift end] (if (zero? shift) ;; potentially return a short node, although it would be better to ;; make sure a regular leaf is always left at the right, with any ;; items over the final 32 moved into tail (and then potentially ;; back into the tree should the tail become too long...) (let [arr (.-arr node) new-arr (make-array end)] (array-copy arr 0 new-arr 0 end) (->VectorNode nil new-arr)) (let [reg? (regular? node) rngs (if-not reg? (node-ranges node)) i (bit-and (bit-shift-right (dec end) shift) 0x1f) i (if reg? i (loop [j i] (if (<= end (aget rngs j)) j (recur (inc j))))) child-end (if reg? (let [ce (mod end (bit-shift-left 1 shift))] (if (zero? ce) (bit-shift-left 1 shift) ce)) (if (pos? i) (- end (aget rngs (dec i))) end)) arr (.-arr node) new-child (slice-right (aget arr i) (- shift 5) child-end) regular-child? (if (== shift 5) (== 32 (alength (.-arr new-child))) (regular? new-child)) new-arr (make-array (if (and reg? regular-child?) 32 33)) new-child-rng (if regular-child? (let [m (mod child-end (bit-shift-left 1 shift))] (if (zero? m) (bit-shift-left 1 shift) m)) (if (== shift 5) (alength (.-arr new-child)) (last-range new-child)))] (array-copy arr 0 new-arr 0 i) (aset new-arr i new-child) (if-not (and reg? regular-child?) (let [new-rngs (make-array 33) step (bit-shift-left 1 shift)] (if reg? (dotimes [j i] (aset new-rngs j (* (inc j) step))) (dotimes [j i] (aset new-rngs j (aget rngs j)))) (aset new-rngs i (+ (if (pos? i) (aget new-rngs (dec i)) 0) new-child-rng)) (aset new-rngs 32 (inc i)) (aset new-arr 32 new-rngs))) (->VectorNode nil new-arr)))) (defn slice-left [node shift start end] (if (zero? shift) ;; potentially return a short node (let [arr (.-arr node) new-len (- (alength arr) start) new-arr (make-array new-len)] (array-copy arr start new-arr 0 new-len) (->VectorNode nil new-arr)) (let [reg? (regular? node) arr (.-arr node) rngs (if-not reg? (node-ranges node)) i (bit-and (bit-shift-right start shift) 0x1f) i (if reg? i (loop [j i] (if (< start (aget rngs j)) j (recur (inc j))))) len (if reg? (loop [i i] (if (or (== i 32) (nil? (aget arr i))) i (recur (inc i)))) (aget rngs 32)) child-start (if (pos? i) (- start (if reg? (* i (bit-shift-left 1 shift)) (aget rngs (dec i)))) start) child-end (if reg? (min (bit-shift-left 1 shift) (if (pos? i) (- end (* i (bit-shift-left 1 shift))) end)) (let [capped-end (min (aget rngs i) end)] (if (pos? i) (- capped-end (aget rngs (dec i))) capped-end))) new-child (slice-left (aget arr i) (- shift 5) child-start child-end) new-len (- len i) new-len (if (nil? new-child) (dec new-len) new-len)] (cond (zero? new-len) nil reg? (let [new-arr (make-array 33) rngs (make-array 33) rng0 (if (or (nil? new-child) (== shift 5) (regular? new-child)) (- (bit-shift-left 1 shift) (bit-and (bit-shift-right start (- shift 5)) 0x1f)) (last-range new-child)) step (bit-shift-left 1 shift)] (loop [j 0 r rng0] (when (< j new-len) (aset rngs j r) (recur (inc j) (+ r step)))) (when (> new-len 1) (aset rngs (dec new-len) (- end start))) (aset rngs 32 new-len) (array-copy arr (if (nil? new-child) (inc i) i) new-arr 0 new-len) (if-not (nil? new-child) (aset new-arr 0 new-child)) (aset new-arr 32 rngs) (->VectorNode (.-edit node) new-arr)) :else (let [new-arr (make-array 33) new-rngs (make-array 33)] (loop [j 0 i i] (when (< j new-len) (aset new-rngs j (- (aget rngs i) start)) (recur (inc j) (inc i)))) (aset new-rngs 32 new-len) (array-copy arr (if (nil? new-child) (inc i) i) new-arr 0 new-len) (if-not (nil? new-child) (aset new-arr 0 new-child)) (aset new-arr 32 new-rngs) (->VectorNode (.-edit node) new-arr)))))) (declare splice-rrbts ->Transient) (deftype Vector [cnt shift root tail meta ^:mutable __hash] Object (toString [this] (pr-str* this)) IPrintWithWriter (-pr-writer [this writer opts] (pr-sequential-writer writer pr-writer "[" " " "]" opts this)) IWithMeta (-with-meta [this meta] (Vector. cnt shift root tail meta __hash)) IMeta (-meta [this] meta) ISequential IEquiv (-equiv [this that] (equiv-sequential this that)) IHash (-hash [this] (caching-hash this hash-ordered-coll __hash)) ISeqable (-seq [this] (cond (zero? cnt) nil (zero? (-tail-offset this)) (array-seq tail) :else (rrb-chunked-seq this 0 0))) ICounted (-count [_] cnt) IIndexed (-nth [this i] (if (and (<= 0 i) (< i cnt)) (let [tail-off (-tail-offset this)] (if (<= tail-off i) (aget tail (- i tail-off)) (loop [i i node root shift shift] (if (zero? shift) (let [arr (.-arr node)] (aget arr (bit-and (bit-shift-right i shift) 0x1f))) (if (regular? node) (let [arr (.-arr node) idx (bit-and (bit-shift-right i shift) 0x1f)] (loop [i i node (aget arr idx) shift (- shift 5)] (let [arr (.-arr node) idx (bit-and (bit-shift-right i shift) 0x1f)] (if (zero? shift) (aget arr idx) (recur i (aget arr idx) (- shift 5)))))) (let [arr (.-arr node) rngs (node-ranges node) idx (loop [j (bit-and (bit-shift-right i shift) 0x1f)] (if (< i (aget rngs j)) j (recur (inc j)))) i (if (zero? idx) i (- i (aget rngs (dec idx))))] (recur i (aget arr idx) (- shift 5)))))))) (vector-index-out-of-bounds i cnt))) (-nth [this i not-found] (if (and (>= i 0) (< i cnt)) (-nth this i) not-found)) IMapEntry (-key [this] (-nth this 0)) (-val [this] (-nth this 1)) ICollection (-conj [this val] (if (< (alength tail) 32) (let [tail-len (alength tail) new-tail (make-array (inc tail-len))] (array-copy tail 0 new-tail 0 tail-len) (aset new-tail tail-len val) (Vector. (inc cnt) shift root new-tail meta nil)) (let [tail-node (->VectorNode (.-edit root) tail) new-tail (let [new-arr (make-array 1)] (aset new-arr 0 val) new-arr)] (if (overflow? root shift cnt) (if (regular? root) (let [new-arr (make-array 32) new-root (->VectorNode (.-edit root) new-arr)] (doto new-arr (aset 0 root) (aset 1 (new-path tail (.-edit root) shift tail-node))) (Vector. (inc cnt) (+ shift 5) new-root new-tail meta nil)) (let [new-arr (make-array 33) new-rngs (make-array 33) new-root (->VectorNode (.-edit root) new-arr) root-total-range (aget (node-ranges root) 31)] (doto new-arr (aset 0 root) (aset 1 (new-path tail (.-edit root) shift tail-node)) (aset 32 new-rngs)) (doto new-rngs (aset 0 root-total-range) (aset 1 (+ root-total-range 32)) (aset 32 2)) (Vector. (inc cnt) (+ shift 5) new-root new-tail meta nil))) (Vector. (inc cnt) shift (push-tail shift cnt (.-edit root) root tail-node) new-tail meta nil))))) IEmptyableCollection (-empty [_] (with-meta cljs.core.PersistentVector.EMPTY meta)) IStack (-peek [this] (when (pos? cnt) (-nth this (dec cnt)))) (-pop [this] (cond (zero? cnt) (throw (js/Error. "Can't pop empty vector")) (== 1 cnt) (-with-meta cljs.core.PersistentVector.EMPTY meta) (> (alength tail) 1) (let [new-tail (make-array (dec (alength tail)))] (array-copy tail 0 new-tail 0 (alength new-tail)) (Vector. (dec cnt) shift root new-tail meta nil)) :else (let [new-tail (-array-for this (- cnt 2)) root-cnt (-tail-offset this) new-root (pop-tail shift root-cnt (.-edit root) root)] (cond (nil? new-root) (Vector. (dec cnt) shift empty-node new-tail meta nil) (and (> shift 5) (nil? (aget (.-arr new-root) 1))) (Vector. (dec cnt) (- shift 5) (aget (.-arr new-root) 0) new-tail meta nil) :else (Vector. (dec cnt) shift new-root new-tail meta nil))))) IVector (-assoc-n [this i val] (cond (and (<= 0 i) (< i cnt)) (let [tail-off (-tail-offset this)] (if (>= i tail-off) (let [new-tail (make-array (alength tail)) idx (- i tail-off)] (array-copy tail 0 new-tail 0 (alength tail)) (aset new-tail idx val) (Vector. cnt shift root new-tail meta nil)) (Vector. cnt shift (do-assoc shift root i val) tail meta nil))) (== i cnt) (-conj this val) :else (vector-index-out-of-bounds i cnt))) IReversible (-rseq [this] (if (pos? cnt) (RSeq. this (dec cnt) nil) nil)) IAssociative (-assoc [this k v] (-assoc-n this k v)) ILookup (-lookup [this k] (-nth this k nil)) (-lookup [this k not-found] (-nth this k not-found)) IFn (-invoke [this k] (-nth this k)) (-invoke [this k not-found] (-nth this k not-found)) IReduce (-reduce [this f] (ci-reduce this f)) (-reduce [this f start] (ci-reduce this f start)) IKVReduce (-kv-reduce [this f init] (if (zero? cnt) init (loop [i 0 j 0 init init arr (-array-for this i) lim (dec (alength arr)) step (inc lim)] (let [init (f init (+ i j) (aget arr j))] (if (reduced? init) @init (if (< j lim) (recur i (inc j) init arr lim step) (let [i (+ i step)] (if (< i cnt) (let [arr (-array-for this i) len (alength arr) lim (dec len)] (recur i 0 init arr lim len)) init)))))))) IComparable (-compare [this that] (compare-indexed this that)) IEditableCollection (-as-transient [this] (->Transient cnt shift (editable-root root) (editable-tail tail) (alength tail))) PSliceableVector (-slicev [this start end] (let [new-cnt (- end start)] (cond (or (neg? start) (> end cnt)) (throw (js/Error. "vector index out of bounds")) (== start end) ;; NB. preserves metadata (empty this) (> start end) (throw (js/Error. "start index greater than end index")) :else (let [tail-off (-tail-offset this)] (if (>= start tail-off) (let [new-tail (make-array new-cnt)] (array-copy tail (- start tail-off) new-tail 0 new-cnt) (Vector. new-cnt 5 empty-node new-tail meta nil)) (let [tail-cut? (> end tail-off) new-root (if tail-cut? root (slice-right root shift end)) new-root (if (zero? start) new-root (slice-left new-root shift start (min end tail-off))) new-tail (if tail-cut? (let [new-len (- end tail-off) new-tail (make-array new-len)] (array-copy tail 0 new-tail 0 new-len) new-tail) (-array-for (Vector. new-cnt shift new-root (array) meta nil) (dec new-cnt))) new-root (if tail-cut? new-root (pop-tail shift new-cnt (.-edit new-root) new-root))] (if (nil? new-root) (Vector. new-cnt 5 empty-node new-tail meta nil) (loop [r new-root s shift] (if (and (> s 5) (nil? (aget (.-arr r) 1))) (recur (aget (.-arr r) 0) (- s 5)) (Vector. new-cnt s r new-tail meta nil)))))))))) PSpliceableVector (-splicev [this that] (splice-rrbts this (-as-rrbt that))) IVecImpl (-tail-offset [this] (- cnt (alength tail))) (-array-for [this i] (if (and (<= 0 i) (< i cnt)) (if (>= i (-tail-offset this)) tail (loop [i i node root shift shift] (if (zero? shift) (.-arr node) (if (regular? node) (loop [node (aget (.-arr node) (bit-and (bit-shift-right i shift) 0x1f)) shift (- shift 5)] (if (zero? shift) (.-arr node) (recur (aget (.-arr node) (bit-and (bit-shift-right i shift) 0x1f)) (- shift 5)))) (let [rngs (node-ranges node) j (loop [j (bit-and (bit-shift-right i shift) 0x1f)] (if (< i (aget rngs j)) j (recur (inc j)))) i (if (pos? j) (- i (aget rngs (dec j))) i)] (recur i (aget (.-arr node) j) (- shift 5))))))) (vector-index-out-of-bounds i cnt))) AsRRBT (-as-rrbt [this] this)) (extend-protocol AsRRBT cljs.core.PersistentVector (-as-rrbt [this] (Vector. (count this) (.-shift this) (.-root this) (.-tail this) (meta this) nil)) Subvec (-as-rrbt [this] (let [v (.-v this) start (.-start this) end (.-end this)] (-slicev (-as-rrbt v) start end)))) (defn shift-from-to [node from to] (cond (== from to) node (regular? node) (recur (->VectorNode (.-edit node) (doto (make-array 32) (aset 0 node))) (+ 5 from) to) :else (recur (->VectorNode (.-edit node) (doto (make-array 33) (aset 0 node) (aset 32 (doto (make-array 33) (aset 0 (last-range node)) (aset 32 1))))) (+ 5 from) to))) (defn slot-count [node shift] (let [arr (.-arr node)] (if (zero? shift) (alength arr) (if (regular? node) (index-of-nil arr) (let [rngs (node-ranges node)] (aget rngs 32)))))) (defn subtree-branch-count [node shift] ;; NB. positive shifts only (let [arr (.-arr node) cs (- shift 5)] (if (regular? node) (loop [i 0 sbc 0] (if (== i 32) sbc (if-let [child (aget arr i)] (recur (inc i) (+ sbc (slot-count child cs))) sbc))) (let [lim (aget (node-ranges node) 32)] (loop [i 0 sbc 0] (if (== i lim) sbc (let [child (aget arr i)] (recur (inc i) (+ sbc (slot-count child cs)))))))))) (defn leaf-seq [arr] (mapcat #(.-arr %) (take (index-of-nil arr) arr))) (defn rebalance-leaves [n1 cnt1 n2 cnt2 transferred-leaves] (let [slc1 (slot-count n1 5) slc2 (slot-count n2 5) a (+ slc1 slc2) sbc1 (subtree-branch-count n1 5) sbc2 (subtree-branch-count n2 5) p (+ sbc1 sbc2) e (- a (inc (quot (dec p) 32)))] (cond (<= e max-extra-search-steps) (array n1 n2) (<= (+ sbc1 sbc2) 1024) (let [reg? (zero? (mod p 32)) new-arr (make-array (if reg? 32 33)) new-n1 (->VectorNode nil new-arr)] (loop [i 0 bs (partition-all 32 (concat (leaf-seq (.-arr n1)) (leaf-seq (.-arr n2))))] (when-first [block bs] (let [a (make-array (count block))] (loop [i 0 xs (seq block)] (when xs (aset a i (first xs)) (recur (inc i) (next xs)))) (aset new-arr i (->VectorNode nil a)) (recur (inc i) (next bs))))) (if-not reg? (aset new-arr 32 (regular-ranges 5 p))) (set! (.-val transferred-leaves) sbc2) (array new-n1 nil)) :else (let [reg? (zero? (mod p 32)) new-arr1 (make-array 32) new-arr2 (make-array (if reg? 32 33)) new-n1 (->VectorNode nil new-arr1) new-n2 (->VectorNode nil new-arr2)] (loop [i 0 bs (partition-all 32 (concat (leaf-seq (.-arr n1)) (leaf-seq (.-arr n2))))] (when-first [block bs] (let [a (make-array (count block))] (loop [i 0 xs (seq block)] (when xs (aset a i (first xs)) (recur (inc i) (next xs)))) (if (< i 32) (aset new-arr1 i (->VectorNode nil a)) (aset new-arr2 (- i 32) (->VectorNode nil a))) (recur (inc i) (next bs))))) (if-not reg? (aset new-arr2 32 (regular-ranges 5 (- p 1024)))) (set! (.-val transferred-leaves) (- 1024 sbc1)) (array new-n1 new-n2))))) (defn child-seq [node shift cnt] (let [arr (.-arr node) rngs (if (regular? node) (regular-ranges shift cnt) (node-ranges node)) cs (if rngs (aget rngs 32) (index-of-nil arr)) cseq (fn cseq [c r] (let [arr (.-arr c) rngs (if (regular? c) (regular-ranges (- shift 5) r) (node-ranges c)) gcs (if rngs (aget rngs 32) (index-of-nil arr))] (map list (take gcs arr) (take gcs (map - rngs (cons 0 rngs))))))] (mapcat cseq (take cs arr) (take cs (map - rngs (cons 0 rngs)))))) (defn rebalance [shift n1 cnt1 n2 cnt2 transferred-leaves] (if (nil? n2) (array n1 nil) (let [slc1 (slot-count n1 shift) slc2 (slot-count n2 shift) a (+ slc1 slc2) sbc1 (subtree-branch-count n1 shift) sbc2 (subtree-branch-count n2 shift) p (+ sbc1 sbc2) e (- a (inc (quot (dec p) 32)))] (cond (<= e max-extra-search-steps) (array n1 n2) (<= (+ sbc1 sbc2) 1024) (let [new-arr (make-array 33) new-rngs (make-array 33) new-n1 (->VectorNode nil new-arr)] (loop [i 0 bs (partition-all 32 (concat (child-seq n1 shift cnt1) (child-seq n2 shift cnt2)))] (when-first [block bs] (let [a (make-array 33) r (make-array 33)] (aset a 32 r) (aset r 32 (count block)) (loop [i 0 o 0 gcs (seq block)] (when-first [[gc gcr] gcs] (aset a i gc) (aset r i (+ o gcr)) (recur (inc i) (+ o gcr) (next gcs)))) (aset new-arr i (->VectorNode nil a)) (aset new-rngs i (+ (aget r (dec (aget r 32))) (if (pos? i) (aget new-rngs (dec i)) 0))) (aset new-rngs 32 (inc i)) (recur (inc i) (next bs))))) (aset new-arr 32 new-rngs) (set! (.-val transferred-leaves) cnt2) (array new-n1 nil)) :else (let [new-arr1 (make-array 33) new-arr2 (make-array 33) new-rngs1 (make-array 33) new-rngs2 (make-array 33) new-n1 (->VectorNode nil new-arr1) new-n2 (->VectorNode nil new-arr2)] (loop [i 0 bs (partition-all 32 (concat (child-seq n1 shift cnt1) (child-seq n2 shift cnt2)))] (when-first [block bs] (let [a (make-array 33) r (make-array 33)] (aset a 32 r) (aset r 32 (count block)) (loop [i 0 o 0 gcs (seq block)] (when-first [[gc gcr] gcs] (aset a i gc) (aset r i (+ o gcr)) (recur (inc i) (+ o gcr) (next gcs)))) (if (and (< i 32) (> (+ (* i 32) (count block)) sbc1)) (let [tbs (- (+ (* i 32) (count block)) sbc1) li (dec (aget r 32)) d (if (>= tbs 32) (aget r li) (- (aget r li) (aget r (- li tbs))))] (set! (.-val transferred-leaves) (+ (.-val transferred-leaves) d)))) (let [new-arr (if (< i 32) new-arr1 new-arr2) new-rngs (if (< i 32) new-rngs1 new-rngs2) i (mod i 32)] (aset new-arr i (->VectorNode nil a)) (aset new-rngs i (+ (aget r (dec (aget r 32))) (if (pos? i) (aget new-rngs (dec i)) 0))) (aset new-rngs 32 (inc i))) (recur (inc i) (next bs))))) (aset new-arr1 32 new-rngs1) (aset new-arr2 32 new-rngs2) (array new-n1 new-n2)))))) (defn zippath [shift n1 cnt1 n2 cnt2 transferred-leaves] (if (== shift 5) (rebalance-leaves n1 cnt1 n2 cnt2 transferred-leaves) (let [c1 (last-child n1) c2 (first-child n2) ccnt1 (if (regular? n1) (let [m (mod cnt1 (bit-shift-left 1 shift))] (if (zero? m) (bit-shift-left 1 shift) m)) (let [rngs (node-ranges n1) i (dec (aget rngs 32))] (if (zero? i) (aget rngs 0) (- (aget rngs i) (aget rngs (dec i)))))) ccnt2 (if (regular? n2) (let [m (mod cnt2 (bit-shift-left 1 shift))] (if (zero? m) (bit-shift-left 1 shift) m)) (aget (node-ranges n2) 0)) next-transferred-leaves (Box. 0) [new-c1 new-c2] (zippath (- shift 5) c1 ccnt1 c2 ccnt2 next-transferred-leaves) d (.-val next-transferred-leaves)] (set! (.-val transferred-leaves) (+ (.-val transferred-leaves) d)) (rebalance shift (if (identical? c1 new-c1) n1 (replace-rightmost-child shift n1 new-c1 d)) (+ cnt1 d) (if new-c2 (if (identical? c2 new-c2) n2 (replace-leftmost-child shift n2 cnt2 new-c2 d)) (remove-leftmost-child shift n2)) (- cnt2 d) transferred-leaves)))) (defn squash-nodes [shift n1 cnt1 n2 cnt2] (let [arr1 (.-arr n1) arr2 (.-arr n2) li1 (index-of-nil arr1) li2 (index-of-nil arr2) slots (concat (take li1 arr1) (take li2 arr2))] (if (> (count slots) 32) (array n1 n2) (let [new-rngs (make-array 33) new-arr (make-array 33) rngs1 (take li1 (if (regular? n1) (regular-ranges shift cnt1) (node-ranges n1))) rngs2 (take li2 (if (regular? n2) (regular-ranges shift cnt2) (node-ranges n2))) rngs2 (let [r (last rngs1)] (map #(+ % r) rngs2)) rngs (concat rngs1 rngs2)] (aset new-arr 32 new-rngs) (loop [i 0 cs (seq slots)] (when cs (aset new-arr i (first cs)) (recur (inc i) (next cs)))) (loop [i 0 rngs (seq rngs)] (if rngs (do (aset new-rngs i (first rngs)) (recur (inc i) (next rngs))) (aset new-rngs 32 i))) (array (->VectorNode nil new-arr) nil))))) (def peephole-optimization-config (atom {:debug-fn nil})) (def peephole-optimization-count (atom 0)) ;; TBD: Transducer versions of child-nodes and bounded-grandchildren ;; are included here for when we are willing to rely upon Clojure ;; 1.7.0 as the minimum version supported by the core.rrb-vector ;; library. They are faster. #_(defn child-nodes [node] (into [] (comp (take-while (complement nil?)) (take 32)) (.-arr node))) (defn child-nodes [node] (->> (.-arr node) (take-while (complement nil?)) (take 32))) ;; (take 33) is just a technique to avoid generating more ;; grandchildren than necessary. If there are at least 33, we do not ;; care how many there are. #_(defn bounded-grandchildren [children] (into [] (comp (map child-nodes) cat (take 33)) children)) (defn bounded-grandchildren [children] (->> children (mapcat child-nodes) (take 33))) ;; TBD: Do functions like last-non-nil-idx and ;; count-vector-elements-beneath already exist elsewhere in this ;; library? It seems like they might. ;; A regular tree node is guaranteed to have only 32-way branching at ;; all nodes, except perhaps along the right spine, where it can be ;; partial. From a regular tree node down, all leaf arrays ;; (containing vector elements directly) are restricted to contain a ;; full 32 vector elements. This code relies on these invariants to ;; quickly calculate the number of vector elements beneath a regular ;; node in O(log N) time. (defn last-non-nil-idx [arr] (loop [i (dec (alength arr))] (if (neg? i) i (if (nil? (aget arr i)) (recur (dec i)) i)))) (defn count-vector-elements-beneath [node shift] (if (regular? node) (loop [node node shift shift acc 0] (if (zero? shift) (if (nil? node) acc ;; The +32 is for the regular leaf node reached at shift 0 (+ acc 32)) (let [arr (.-arr node) max-child-idx (last-non-nil-idx arr) num-elems-in-full-child (bit-shift-left 1 shift)] (if (< max-child-idx 0) acc (recur (aget arr max-child-idx) (- shift 5) (+ acc (* max-child-idx num-elems-in-full-child))))))) ;; irregular case (let [rngs (node-ranges node)] (aget rngs (dec (aget rngs 32)))))) (defn peephole-optimize-root [v] (let [config @peephole-optimization-config] (if (<= (.-shift v) 10) ;; Tree depth cannot be reduced if shift <= 5. ;; TBD: If shift=10, the grandchildren nodes need to be handled ;; by an am array manager for primitive vectors, which I haven't ;; written code for yet below, but so far this peephole ;; optimizer seems to be working sufficiently well without ;; handling that case. v (let [root (.-root v) children (child-nodes root) grandchildren (bounded-grandchildren children) num-granchildren-bounded (count grandchildren) many-grandchildren? (> num-granchildren-bounded 32)] (if many-grandchildren? ;; If it is possible to reduce tree depth, it requires going ;; deeper than just to the grandchildren, which is beyond ;; what this peephole optimizer is intended to do. v ;; Create a new root node that points directly at the ;; grandchildren, since there are few enough of them. (let [new-arr (make-array 33) new-rngs (make-array 33) new-root (->VectorNode (.-edit root) new-arr) shift (.-shift v) grandchild-shift (- shift (* 2 5))] (swap! peephole-optimization-count inc) (loop [idx 0 remaining-gc grandchildren elem-sum 0] (if-let [remaining-gc (seq remaining-gc)] (let [grandchild (first remaining-gc) num-elems-this-grandchild (count-vector-elements-beneath grandchild grandchild-shift) next-elem-sum (+ elem-sum num-elems-this-grandchild)] (aset new-arr idx grandchild) (aset new-rngs idx next-elem-sum) (recur (inc idx) (rest remaining-gc) next-elem-sum)))) (aset new-rngs 32 num-granchildren-bounded) (aset new-arr 32 new-rngs) (let [new-v (Vector. (.-cnt v) (- shift 5) new-root (.-tail v) (.-meta v) nil)] (when (:debug-fn config) ((:debug-fn config) v new-v)) new-v))))))) ;; TBD: I do not know if this implementation actually supports this ;; many elements in one vector. What is the limit? I picked this ;; number simply to match what I believe is the upper limit for the ;; Clojure implementation. (def max-vector-elements 2147483647) ;; Larger shift values than 64 definitely break assumptions all over ;; the RRB vector implementation, e.g. (bit-shift-right 255 65) ;; returns the same result as (bit-shift-right 255 1), I believe ;; because the shift amount argument is effectively modulo'd by 64. ;; Larger shift values than 30 are unlikely to make sense, given that ;; the maximum number of vector elements supported is somewhere near ;; 2^31-1. (defn shift-too-large? [v] (> (.-shift v) 30)) ;; The maximum number of vector elements in a tree, not counting any ;; elements in the tail, with a given shift value is: ;; ;; (bit-shift-left 1 (+ shift 5)) ;; ;; It is perfectly normal to have vectors with a root tree node with ;; only 1 non-nil child, so at a fraction 1/32 of maximum capacity. I ;; do not know the exact minimum fraction that RRB vectors as ;; implemented here should allow, but I suspect it is well over ;; 1/1024. (defn poor-branching? [v] (let [tail-off (-tail-offset v)] (if (zero? tail-off) false (let [shift-amount (- (.-shift v) 5) max-capacity-divided-by-1024 (bit-shift-left 1 shift-amount)] (< tail-off max-capacity-divided-by-1024))))) ;; Note 3: ;; Consider measuring several ways in ClojureScript to create a ;; regular persistent vector from another one, to see which is ;; fastest, and use it here. ;; TBD: Is there any promise about what metadata catvec returns? ;; Always the same as on the first argument? (def fallback-config (atom {:debug-fn nil})) (def fallback-to-slow-splice-count1 (atom 0)) (def fallback-to-slow-splice-count2 (atom 0)) (defn fallback-to-slow-splice-if-needed [v1 v2 splice-result] (let [config @fallback-config] (if (or (shift-too-large? splice-result) (poor-branching? splice-result)) (do (dbg (str "splice-rrbts result had shift " (.-shift splice-result) " and " (-tail-offset splice-result) " elements not counting" " the tail. Falling back to slower method of concatenation.")) (if (poor-branching? v1) ;; The v1 we started with was not good, either. (do (swap! fallback-to-slow-splice-count1 inc) (dbg (str "splice-rrbts first arg had shift " (.-shift v1) " and " (-tail-offset v1) " elements not counting" " the tail. Building the result from scratch.")) ;: See Note 3 (let [new-splice-result (-> (empty v1) (into v1) (into v2))] (when (:debug-fn config) ((:debug-fn config) splice-result new-splice-result)) new-splice-result)) ;; Assume that v1 is balanced enough that we can use into to ;; add all elements of v2 to it, without problems. TBD: ;; That assumption might be incorrect. Consider checking ;; the result of this, too, and fall back again to the true ;; case above? (let [new-splice-result (into v1 v2)] (swap! fallback-to-slow-splice-count2 inc) (when (:debug-fn config) ((:debug-fn config) splice-result new-splice-result)) new-splice-result))) ;; else the fast result is good splice-result))) (defn splice-rrbts-main [v1 v2] (cond (zero? (count v1)) v2 (> (+ (count v1) (count v2)) max-vector-elements) (let [c1 (count v1), c2 (count v2)] (throw (js/Error. (str "Attempted to concatenate two vectors whose total" " number of elements is " (+ c1 c2) ", which is" " larger than the maximum number of elements " max-vector-elements " supported in a vector ")))) (< (count v2) rrbt-concat-threshold) (into v1 v2) :else (let [s1 (.-shift v1) s2 (.-shift v2) r1 (.-root v1) o? (overflow? r1 s1 (+ (count v1) (- 32 (alength (.-tail v1))))) r1 (if o? (let [tail (.-tail v1) tail-node (->VectorNode nil tail) reg? (and (regular? r1) (== (alength tail) 32)) arr (make-array (if reg? 32 33))] (aset arr 0 r1) (aset arr 1 (new-path* s1 tail-node)) (if-not reg? (let [rngs (make-array 33)] (aset rngs 32 2) (aset rngs 0 (- (count v1) (alength tail))) (aset rngs 1 (count v1)) (aset arr 32 rngs))) (->VectorNode nil arr)) (fold-tail r1 s1 (-tail-offset v1) (.-tail v1))) s1 (if o? (+ s1 5) s1) r2 (.-root v2) s (max s1 s2) r1 (shift-from-to r1 s1 s) r2 (shift-from-to r2 s2 s) transferred-leaves (Box. 0) [n1 n2] (zippath s r1 (count v1) r2 (- (count v2) (alength (.-tail v2))) transferred-leaves) d (.-val transferred-leaves) ncnt1 (+ (count v1) d) ncnt2 (- (count v2) (alength (.-tail v2)) d) [n1 n2] (if (identical? n2 r2) (squash-nodes s n1 ncnt1 n2 ncnt2) (array n1 n2)) ncnt1 (if n2 ncnt1 (+ ncnt1 ncnt2)) ncnt2 (if n2 ncnt2 0)] (if n2 (let [arr (make-array 33) new-root (->VectorNode nil arr)] (aset arr 0 n1) (aset arr 1 n2) (aset arr 32 (doto (make-array 33) (aset 0 ncnt1) (aset 1 (+ ncnt1 ncnt2)) (aset 32 2))) (Vector. (+ (count v1) (count v2)) (+ s 5) new-root (.-tail v2) nil nil)) (loop [r n1 s s] (if (and (> s 5) (nil? (aget (.-arr r) 1))) (recur (aget (.-arr r) 0) (- s 5)) (Vector. (+ (count v1) (count v2)) s r (.-tail v2) nil nil))))))) (defn splice-rrbts [v1 v2] (let [r1 (splice-rrbts-main v1 v2) r2 (peephole-optimize-root r1)] (fallback-to-slow-splice-if-needed v1 v2 r2))) (deftype Transient [^:mutable cnt ^:mutable shift ^:mutable root ^:mutable tail ^:mutable tidx] ;; The Clojure/Java deftype Transient implements the Java interface ;; clojure.lang.ILookup. The corresponding ClojureScript protocol ;; named ILookup is implemented by the deftype Vector in this file. ;; TBD: Should ILookup be implemented by type Transient, too? IIndexed (-nth [this i] (if (and (<= 0 i) (< i cnt)) (let [tail-off (-tail-offset this)] (if (<= tail-off i) (aget tail (- i tail-off)) (loop [i i node root shift shift] (if (zero? shift) (let [arr (.-arr node)] (aget arr (bit-and (bit-shift-right i shift) 0x1f))) (if (regular? node) (let [arr (.-arr node) idx (bit-and (bit-shift-right i shift) 0x1f)] (loop [i i node (aget arr idx) shift (- shift 5)] (let [arr (.-arr node) idx (bit-and (bit-shift-right i shift) 0x1f)] (if (zero? shift) (aget arr idx) (recur i (aget arr idx) (- shift 5)))))) (let [arr (.-arr node) rngs (node-ranges node) idx (loop [j (bit-and (bit-shift-right i shift) 0x1f)] (if (< i (aget rngs j)) j (recur (inc j)))) i (if (zero? idx) i (- i (aget rngs (dec idx))))] (recur i (aget arr idx) (- shift 5)))))))) (vector-index-out-of-bounds i cnt))) (-nth [this i not-found] (if (and (>= i 0) (< i cnt)) (-nth this i) not-found)) IFn (-invoke [this k] (-nth this k)) (-invoke [this k not-found] (-nth this k not-found)) ITransientCollection (-conj! [this o] (if ^boolean (.-edit root) (if (< tidx 32) (do (aset tail tidx o) (set! cnt (inc cnt)) (set! tidx (inc tidx)) this) (let [tail-node (->VectorNode (.-edit root) tail) new-tail (make-array 32)] (aset new-tail 0 o) (set! tail new-tail) (set! tidx 1) (if (overflow? root shift cnt) (if (regular? root) (let [new-arr (make-array 32)] (doto new-arr (aset 0 root) (aset 1 (new-path tail (.-edit root) shift tail-node))) (set! root (->VectorNode (.-edit root) new-arr)) (set! shift (+ shift 5)) (set! cnt (inc cnt)) this) (let [new-arr (make-array 33) new-rngs (make-array 33) new-root (->VectorNode (.-edit root) new-arr) root-total-range (aget (node-ranges root) 31)] (doto new-arr (aset 0 root) (aset 1 (new-path tail (.-edit root) shift tail-node)) (aset 32 new-rngs)) (doto new-rngs (aset 0 root-total-range) (aset 1 (+ root-total-range 32)) (aset 32 2)) (set! root new-root) (set! shift (+ shift 5)) (set! cnt (inc cnt)) this)) (let [new-root (push-tail! shift cnt (.-edit root) root tail-node)] (set! root new-root) (set! cnt (inc cnt)) this)))) (throw (js/Error. "conj! after persistent!")))) (-persistent! [this] (if ^boolean (.-edit root) (do (set! (.-edit root) nil) (let [trimmed-tail (make-array tidx)] (array-copy tail 0 trimmed-tail 0 tidx) (Vector. cnt shift root trimmed-tail nil nil))) (throw (js/Error. "persistent! called twice")))) ITransientAssociative (-assoc! [this key val] (-assoc-n! this key val)) ITransientVector (-assoc-n! [this i val] (if ^boolean (.-edit root) (cond (and (<= 0 i) (< i cnt)) (let [tail-off (-tail-offset this)] (if (<= tail-off i) (aset tail (- i tail-off) val) (set! root (do-assoc! shift (.-edit root) root i val))) this) (== i cnt) (-conj! this val) :else (vector-index-out-of-bounds i cnt)) (throw (js/Error. "assoc! after persistent!")))) (-pop! [this] (if ^boolean (.-edit root) (cond (zero? cnt) (throw (js/Error. "Can't pop empty vector")) (== 1 cnt) (do (set! cnt 0) (set! tidx 0) (aset tail 0 nil) this) (> tidx 1) (do (set! cnt (dec cnt)) (set! tidx (dec tidx)) (aset tail tidx nil) this) :else (let [new-tail-base (-array-for this (- cnt 2)) new-tail (editable-tail new-tail-base) new-tidx (alength new-tail-base) new-root (pop-tail! shift cnt (.-edit root) root)] (cond (nil? new-root) (do (set! cnt (dec cnt)) (set! root (ensure-editable (.-edit root) empty-node)) (set! tail new-tail) (set! tidx new-tidx) this) (and (> shift 5) (nil? (aget (.-arr new-root) 1))) (do (set! cnt (dec cnt)) (set! shift (- shift 5)) (set! root (ensure-editable (.-edit root) (aget (.-arr new-root) 0))) (set! tail new-tail) (set! tidx new-tidx) this) :else (do (set! cnt (dec cnt)) (set! root new-root) (set! tail new-tail) (set! tidx new-tidx) this)))) (throw (js/Error. "count after persistent!")))) ICounted (-count [this] (if ^boolean (.-edit root) cnt (throw (js/Error. "count after persistent!")))) IVecImpl (-tail-offset [this] (- cnt tidx)) (-array-for [this i] (if (and (<= 0 i) (< i cnt)) (if (>= i (-tail-offset this)) tail (loop [i i node root shift shift] (if (zero? shift) (.-arr node) (if (regular? node) (loop [node (aget (.-arr node) (bit-and (bit-shift-right i shift) 0x1f)) shift (- shift 5)] (if (zero? shift) (.-arr node) (recur (aget (.-arr node) (bit-and (bit-shift-right i shift) 0x1f)) (- shift 5)))) (let [rngs (node-ranges node) j (loop [j (bit-and (bit-shift-right i shift) 0x1f)] (if (< i (aget rngs j)) j (recur (inc j)))) i (if (pos? j) (- i (aget rngs (dec j))) i)] (recur i (aget (.-arr node) j) (- shift 5))))))) (vector-index-out-of-bounds i cnt)))) ================================================ FILE: src/main/cljs/clojure/core/rrb_vector/transients.cljs ================================================ ; Copyright (c) Rich Hickey and contributors. All rights reserved. ; The use and distribution terms for this software are covered by the ; Eclipse Public License 1.0 (http://opensource.org/licenses/eclipse-1.0.php) ; which can be found in the file epl-v10.html at the root of this distribution. ; By using this software in any fashion, you are agreeing to be bound by ; the terms of this license. ; You must not remove this notice, or any other, from this software. (ns clojure.core.rrb-vector.transients (:refer-clojure :exclude [new-path]) (:require [clojure.core.rrb-vector.nodes :refer [regular? clone node-ranges last-range overflow?]] [clojure.core.rrb-vector.trees :refer [new-path]])) (defn ensure-editable [edit node] (if (identical? (.-edit node) edit) node (let [new-arr (aclone (.-arr node))] (if (== 33 (alength new-arr)) (aset new-arr 32 (aclone (aget new-arr 32)))) (VectorNode. edit new-arr)))) (defn editable-root [root] (let [new-arr (aclone (.-arr root))] (if (== 33 (alength new-arr)) (aset new-arr 32 (aclone (aget new-arr 32)))) (VectorNode. (js-obj) new-arr))) (defn editable-tail [tail] (let [ret (make-array 32)] (array-copy tail 0 ret 0 (alength tail)) ret)) ;; Note 1: This condition check and exception are a little bit closer ;; to the source of the cause for what was issue CRRBV-20, added in ;; case there is still some remaining way to cause this condition to ;; occur. ;; Note 2: In the worst case, when the tree is nearly full, calling ;; overflow? here takes run time O(tree_depth^2) here. That could be ;; made O(tree_depth). One way would be to call push-tail! in hopes ;; that it succeeds, but return some distinctive value indicating a ;; failure on the full condition, and create the node via a new-path ;; call at most recent recursive push-tail! call that has an empty ;; slot available. (defn push-tail! [shift cnt root-edit current-node tail-node] (let [ret (ensure-editable root-edit current-node)] (if (regular? ret) (do (loop [n ret shift shift] (let [arr (.-arr n) subidx (bit-and (bit-shift-right (dec cnt) shift) 0x1f)] (if (== shift 5) (aset arr subidx tail-node) (let [child (aget arr subidx)] (if (nil? child) (aset arr subidx (new-path (.-arr tail-node) root-edit (- shift 5) tail-node)) (let [editable-child (ensure-editable root-edit child)] (aset arr subidx editable-child) (recur editable-child (- shift 5)))))))) ret) (let [arr (.-arr ret) rngs (node-ranges ret) li (dec (aget rngs 32)) cret (if (== shift 5) nil (let [child (ensure-editable root-edit (aget arr li)) ccnt (+ (if (pos? li) (- (aget rngs li) (aget rngs (dec li))) (aget rngs 0)) ;; add 32 elems to account for the ;; new full tail we plan to add to ;; the subtree. 32)] ;; See Note 2 (if-not (overflow? child (- shift 5) ccnt) (push-tail! (- shift 5) ccnt root-edit child tail-node))))] (if cret (do (aset arr li cret) (aset rngs li (+ (aget rngs li) 32)) ret) (do (when (>= li 31) ;; See Note 1 (let [msg (str "Assigning index " (inc li) " of vector" " object array to become a node, when that" " index should only be used for storing" " range arrays.") data {:shift shift, :cnd cnt, :current-node current-node, :tail-node tail-node, :rngs rngs, :li li, :cret cret}] (throw (ex-info msg data)))) (aset arr (inc li) (new-path (.-arr tail-node) root-edit (- shift 5) tail-node)) (aset rngs (inc li) (+ (aget rngs li) 32)) (aset rngs 32 (inc (aget rngs 32))) ret)))))) (defn pop-tail! [shift cnt root-edit current-node] (let [ret (ensure-editable root-edit current-node)] (if (regular? ret) (let [subidx (bit-and (bit-shift-right (- cnt 2) shift) 0x1f)] (cond (> shift 5) (let [child (pop-tail! (- shift 5) cnt root-edit (aget (.-arr ret) subidx))] (if (and (nil? child) (zero? subidx)) nil (let [arr (.-arr ret)] (aset arr subidx child) ret))) (zero? subidx) nil :else (let [arr (.-arr ret)] (aset arr subidx nil) ret))) (let [rngs (node-ranges ret) subidx (dec (aget rngs 32))] (cond (> shift 5) (let [child (aget (.-arr ret) subidx) child-cnt (if (zero? subidx) (aget rngs 0) (- (aget rngs subidx) (aget rngs (dec subidx)))) new-child (pop-tail! (- shift 5) child-cnt root-edit child)] (cond (and (nil? new-child) (zero? subidx)) nil (regular? child) (let [arr (.-arr ret)] (aset rngs subidx (- (aget rngs subidx) 32)) (aset arr subidx new-child) (if (nil? new-child) (aset rngs 32 (dec (aget rngs 32)))) ret) :else (let [rng (last-range child) diff (- rng (if new-child (last-range new-child) 0)) arr (.-arr ret)] (aset rngs subidx (- (aget rngs subidx) diff)) (aset arr subidx new-child) (if (nil? new-child) (aset rngs 32 (dec (aget rngs 32)))) ret))) (zero? subidx) nil :else (let [arr (.-arr ret) child (aget arr subidx)] (aset arr subidx nil) (aset rngs subidx 0) (aset rngs 32 (dec (aget rngs 32))) ret)))))) (defn do-assoc! [shift root-edit current-node i val] (let [ret (ensure-editable root-edit current-node)] (if (regular? ret) (loop [shift shift node ret] (if (zero? shift) (let [arr (.-arr node)] (aset arr (bit-and i 0x1f) val)) (let [arr (.-arr node) subidx (bit-and (bit-shift-right i shift) 0x1f) child (ensure-editable root-edit (aget arr subidx))] (aset arr subidx child) (recur (- shift 5) child)))) (let [arr (.-arr ret) rngs (node-ranges ret) subidx (bit-and (bit-shift-right i shift) 0x1f) subidx (loop [subidx subidx] (if (< i (int (aget rngs subidx))) subidx (recur (inc subidx)))) i (if (zero? subidx) i (- i (aget rngs (dec subidx))))] (aset arr subidx (do-assoc! (- shift 5) root-edit (aget arr subidx) i val)))) ret)) ================================================ FILE: src/main/cljs/clojure/core/rrb_vector/trees.cljs ================================================ ; Copyright (c) Rich Hickey and contributors. All rights reserved. ; The use and distribution terms for this software are covered by the ; Eclipse Public License 1.0 (http://opensource.org/licenses/eclipse-1.0.php) ; which can be found in the file epl-v10.html at the root of this distribution. ; By using this software in any fashion, you are agreeing to be bound by ; the terms of this license. ; You must not remove this notice, or any other, from this software. (ns clojure.core.rrb-vector.trees (:refer-clojure :exclude [array-for push-tail pop-tail new-path do-assoc]) (:require [clojure.core.rrb-vector.nodes :refer [regular? clone node-ranges last-range overflow?]])) (defn new-path [tail edit shift current-node] (if (== (alength tail) 32) (loop [s 0 n current-node] (if (== s shift) n (let [arr (make-array 32) ret (->VectorNode edit arr)] (aset arr 0 n) (recur (+ s 5) ret)))) (loop [s 0 n current-node] (if (== s shift) n (let [arr (make-array 33) rngs (make-array 33) ret (->VectorNode edit arr)] (aset arr 0 n) (aset arr 32 rngs) (aset rngs 32 1) (aset rngs 0 (alength tail)) (recur (+ s 5) ret)))))) (defn push-tail [shift cnt root-edit current-node tail-node] (if (regular? current-node) (let [arr (aclone (.-arr current-node)) ret (->VectorNode (.-edit current-node) arr)] (loop [n ret shift shift] (let [arr (.-arr n) subidx (bit-and (bit-shift-right (dec cnt) shift) 0x1f)] (if (== shift 5) (aset arr subidx tail-node) (if-let [child (aget arr subidx)] (let [new-carr (aclone (.-arr child)) new-child (->VectorNode root-edit new-carr)] (aset arr subidx new-child) (recur new-child (- shift 5))) (aset arr subidx (new-path (.-arr tail-node) root-edit (- shift 5) tail-node)))))) ret) (let [arr (aclone (.-arr current-node)) rngs (node-ranges current-node) li (dec (aget rngs 32)) ret (->VectorNode (.-edit current-node) arr) cret (if (== shift 5) nil (let [child (aget arr li) ccnt (+ (if (pos? li) (- (aget rngs li) (aget rngs (dec li))) (aget rngs 0)) ;; add 32 elems to account for the new ;; 32-elem tail we plan to add to the ;; subtree. 32)] ;; See Note 2 in file transients.cljs (if-not (overflow? child (- shift 5) ccnt) (push-tail (- shift 5) ccnt root-edit child tail-node))))] (if cret (do (aset arr li cret) (aset rngs li (+ (aget rngs li) 32)) ret) (do (when (>= li 31) ;; See Note 1 in file transients.cljs (let [msg (str "Assigning index " (inc li) " of vector" " object array to become a node, when that" " index should only be used for storing" " range arrays.") data {:shift shift, :cnt cnt, :current-node current-node, :tail-node tail-node, :rngs rngs, :li li, :cret cret}] (throw (ex-info msg data)))) (aset arr (inc li) (new-path (.-arr tail-node) root-edit (- shift 5) tail-node)) (aset rngs (inc li) (+ (aget rngs li) 32)) (aset rngs 32 (inc (aget rngs 32))) ret))))) (defn pop-tail [shift cnt root-edit current-node] (if (regular? current-node) (let [subidx (bit-and (bit-shift-right (- cnt 2) shift) 0x1f)] (cond (> shift 5) (let [new-child (pop-tail (- shift 5) cnt root-edit (aget (.-arr current-node) subidx))] (if (and (nil? new-child) (zero? subidx)) nil (let [arr (aclone (.-arr current-node))] (aset arr subidx new-child) (->VectorNode root-edit arr)))) (zero? subidx) nil :else (let [arr (aclone (.-arr current-node))] (aset arr subidx nil) (->VectorNode root-edit arr)))) (let [rngs (node-ranges current-node) subidx (dec (aget rngs 32)) new-rngs (aclone rngs)] (cond (> shift 5) (let [child (aget (.-arr current-node) subidx) child-cnt (if (zero? subidx) (aget rngs 0) (- (aget rngs subidx) (aget rngs (dec subidx)))) new-child (pop-tail (- shift 5) child-cnt root-edit child)] (cond (and (nil? new-child) (zero? subidx)) nil (regular? child) (let [arr (aclone (.-arr current-node))] (aset new-rngs subidx (- (aget new-rngs subidx) 32)) (aset arr subidx new-child) (aset arr 32 new-rngs) (if (nil? new-child) (aset new-rngs 32 (dec (aget new-rngs 32)))) (->VectorNode root-edit arr)) :else (let [rng (last-range child) diff (- rng (if new-child (last-range new-child) 0)) arr (aclone (.-arr current-node))] (aset new-rngs subidx (- (aget new-rngs subidx) diff)) (aset arr subidx new-child) (aset arr 32 new-rngs) (if (nil? new-child) (aset new-rngs 32 (dec (aget new-rngs 32)))) (->VectorNode root-edit arr)))) (zero? subidx) nil :else (let [arr (aclone (.-arr current-node)) child (aget arr subidx) new-rngs (aclone rngs)] (aset arr subidx nil) (aset arr 32 new-rngs) (aset new-rngs subidx 0) (aset new-rngs 32 (dec (aget new-rngs 32))) (->VectorNode root-edit arr)))))) (defn do-assoc [shift current-node i val] (if (regular? current-node) (let [node (clone shift current-node)] (loop [shift shift node node] (if (zero? shift) (let [arr (.-arr node)] (aset arr (bit-and i 0x1f) val)) (let [arr (.-arr node) subidx (bit-and (bit-shift-right i shift) 0x1f) child (clone shift (aget arr subidx))] (aset arr subidx child) (recur (- shift 5) child)))) node) (let [arr (aclone (.-arr current-node)) rngs (node-ranges current-node) subidx (bit-and (bit-shift-right i shift) 0x1f) subidx (loop [subidx subidx] (if (< i (int (aget rngs subidx))) subidx (recur (inc subidx)))) i (if (zero? subidx) i (- i (aget rngs (dec subidx))))] (aset arr subidx (do-assoc (- shift 5) (aget arr subidx) i val)) (->VectorNode (.-edit current-node) arr)))) ================================================ FILE: src/main/cljs/clojure/core/rrb_vector.cljs ================================================ ; Copyright (c) Rich Hickey and contributors. All rights reserved. ; The use and distribution terms for this software are covered by the ; Eclipse Public License 1.0 (http://opensource.org/licenses/eclipse-1.0.php) ; which can be found in the file epl-v10.html at the root of this distribution. ; By using this software in any fashion, you are agreeing to be bound by ; the terms of this license. ; You must not remove this notice, or any other, from this software. (ns clojure.core.rrb-vector "An implementation of the confluently persistent vector data structure introduced in Bagwell, Rompf, \"RRB-Trees: Efficient Immutable Vectors\", EPFL-REPORT-169879, September, 2011. RRB-Trees build upon Clojure's PersistentVectors, adding logarithmic time concatenation and slicing. The main API entry points are clojure.core.rrb-vector/catvec, performing vector concatenation, and clojure.core.rrb-vector/subvec, which produces a new vector containing the appropriate subrange of the input vector (in contrast to cljs.core/subvec, which returns a view on the input vector). The implementation allows for seamless interoperability with cljs.core/PersistentVector and cljs.core.Subvec instances: clojure.core.rrb-vector/catvec and clojure.core.rrb-vector/subvec convert their inputs to clojure.core.rrb-vector.rrbt/Vector instances whenever necessary (this is a very fast constant time operation for PersistentVector; for Subvec it is O(log n), where n is the size of the underlying vector). clojure.core.rrb-vector also exports its own versions of vector and vec which always produce clojure.core.rrb-vector.rrbt.Vector instances." {:author "Michał Marczyk"} (:refer-clojure :exclude [vector vec subvec]) (:require [clojure.core.rrb-vector.protocols :refer [-slicev -splicev]] [clojure.core.rrb-vector.rrbt :refer [-as-rrbt]] clojure.core.rrb-vector.interop) (:require-macros [clojure.core.rrb-vector.macros :refer [gen-vector-method]])) (defn catvec "Concatenates the given vectors in logarithmic time." ([] []) ([v1] v1) ([v1 v2] (-splicev v1 v2)) ([v1 v2 v3] (-splicev (-splicev v1 v2) v3)) ([v1 v2 v3 v4] (-splicev (-splicev v1 v2) (-splicev v3 v4))) ([v1 v2 v3 v4 & vn] (-splicev (-splicev (-splicev v1 v2) (-splicev v3 v4)) (apply catvec vn)))) (defn subvec "Returns a new vector containing the elements of the given vector v lying between the start (inclusive) and end (exclusive) indices in logarithmic time. end defaults to end of vector. The resulting vector shares structure with the original, but does not hold on to any elements of the original vector lying outside the given index range." ([v start] (-slicev v start (count v))) ([v start end] (-slicev v start end))) (defn vector "Creates a new vector containing the args." ([] (gen-vector-method)) ([x1] (gen-vector-method x1)) ([x1 x2] (gen-vector-method x1 x2)) ([x1 x2 x3] (gen-vector-method x1 x2 x3)) ([x1 x2 x3 x4] (gen-vector-method x1 x2 x3 x4)) ([x1 x2 x3 x4 & xn] (into (vector x1 x2 x3 x4) xn) #_ (loop [v (vector x1 x2 x3 x4) xn xn] (if xn (recur (-conj ^not-native v (first xn)) (next xn)) v)))) (defn vec "Returns a vector containing the contents of coll. If coll is a vector, returns an RRB vector using the internal tree of coll." [coll] (if (vector? coll) (-as-rrbt coll) (apply vector coll))) ================================================ FILE: src/main/clojure/clojure/core/rrb_vector/debug.clj ================================================ ; Copyright (c) Rich Hickey and contributors. All rights reserved. ; The use and distribution terms for this software are covered by the ; Eclipse Public License 1.0 (http://opensource.org/licenses/eclipse-1.0.php) ; which can be found in the file epl-v10.html at the root of this distribution. ; By using this software in any fashion, you are agreeing to be bound by ; the terms of this license. ; You must not remove this notice, or any other, from this software. (ns clojure.core.rrb-vector.debug (:require [clojure.core.rrb-vector.parameters :as p] [clojure.core.rrb-vector :as fv] [clojure.core.rrb-vector.rrbt :as rrbt] ;; This page: ;; https://clojure.org/guides/reader_conditionals refers ;; to code that can go into common cljc files as platform ;; independent, and the code in the clj or cljs files as ;; platform dependent, so I will use that terminology ;; here, too. [clojure.core.rrb-vector.debug-platform-dependent :as pd])) ;; The intent is to keep this file as close to ;; src/main/cljs/clojure/core/rrb_vector/debug.cljs as possible, so ;; that when we start requiring Clojure 1.7.0 and later for this ;; library, this file and that one can be replaced with a common file ;; with the suffix .cljc ;; Functions expected to be defined in the appropriate ;; clojure.core.rrb-vector.debug-platform-dependent namespace: ;; pd/internal-node? ;; pd/persistent-vector? ;; pd/transient-vector? ;; pd/is-vector? ;; pd/dbg-tailoff (formerly debug-tailoff) ;; pd/dbg-tidx (formerly debug-tailoff for clj, debug-tidx for cljs) ;; pd/format ;; pd/printf ;; pd/unwrap-subvec-accessors-for ;; pd/abbrev-for-type-of [vec-or-node] (formerly abbrev-type-name, but move type/class call inside) ;; pd/same-coll? (written already for clj, TBD for cljs) ;; Functions returned from unwrap-subvec-accessors-for that have ;; platform-dependent definitions, but the same general 'kind' ;; arguments and return values, where 'kind' could be: any vector, ;; persistent or transient, or a vector tree node object: ;; get-root - All get-* fns formerly called extract-* in the Java ;; platform dependent version of the debug namespace. ;; get-shift ;; get-tail ;; get-cnt ;; get-array [node] - clj (.array nm node) cljs (.-arr node) ;; get-ranges [node] - clj (ranges nm node) cljs (node-ranges node) ;; regular? [node] - clj (.regular nm node) cljs (regular? node) ;; tail-len [tail] - clj (.alength am tail) cljs (alength tail) ;; NO: nm am - cljs doesn't need them, and clj only uses them for the ;; last few functions above. (defn children-summary [node shift get-array get-ranges regular? opts] (let [children (get-array node) reg? (regular? node) rngs (if-not reg? (get-ranges node)) array-len (count children) children-seq (if reg? children (butlast children)) non-nils (remove nil? children-seq) regular-children (filter regular? non-nils) num-non-nils (count non-nils) num-regular-children (count regular-children) num-irregular-children (- num-non-nils num-regular-children) num-nils (- (count children-seq) num-non-nils) exp-array-len (if reg? 32 33) bad-array-len? (not= array-len exp-array-len)] ;; 'r' for regular, 'i' for irregular ;; For either type of node, its first 32 array elements are broken ;; down into: ;; # regular children ;; # irregular children ;; # of nil 'children' not shown, since it will always be 32 minus ;; # the total of # regular plus irregular children, unless the ;; # array is the wrong size, and in that case a BAD-ARRAY-LEN ;; # message will be included in the string. (pd/format "%s%d+%d%s" (if reg? "r" "i") num-regular-children num-irregular-children (if bad-array-len? (pd/format " BAD-ARRAY-LEN %d != %d" array-len exp-array-len) "")))) (defn filter-indexes "Return a sequence of all indexes of elements e of coll for which (pred e) returns logical true. 0 is the index of the first element." [pred coll] (filter (complement nil?) (map-indexed (fn [idx e] (if (pred e) idx)) coll))) (defn dbg-vec ([v] (dbg-vec v {:max-depth nil ;; integer to limit depth, nil for unlimited ;; force showing tree "fringes" beyond max-depth :always-show-fringes false ;; show vector elements. false for only count :show-elements true ;; show summary of number of children of each node, as ;; returned by function children-summary :show-children-summary false ;; default false means show ranges arrays with their raw ;; unprocessed contents. Use true to show only the ;; first n elements, where n=(aget (get-ranges node) ;; 32), and to show the 'deltas' between consecutive ;; pairs, e.g. if the original is (32 64 96 0 ... 0 3), ;; then instead show (32 32 32), which, if the data ;; structure is correct, is the number of vector ;; elements reachable through each of the node's 3 ;; children. :show-ranges-as-deltas false})) ([v opts] (let [{:keys [v subvector? subvec-start subvec-end get-root get-shift get-tail get-cnt get-array get-ranges regular? tail-len]} (pd/unwrap-subvec-accessors-for v) root (get-root v) shift (get-shift v) tail (get-tail v) cnt (get-cnt v)] (when subvector? (pd/printf "SubVector from start %d to end %d of vector:\n" subvec-start subvec-end)) (letfn [(go [indent shift i node on-left-fringe? on-right-fringe?] (when node (dotimes [_ indent] (print " ")) (pd/printf "%02d:%02d %s" shift i (pd/abbrev-for-type-of node)) (if (zero? shift) ;; this node has only vector elements as its children (if (:show-elements opts) (print ":" (vec (get-array node))) (print ":" (count (get-array node)) "vector elements elided")) ;; else this node has only other nodes as its children (do (when (:show-children-summary opts) (print " ") (print (children-summary node shift get-array get-ranges regular? opts))) (if (not (regular? node)) (if (:show-ranges-as-deltas opts) (let [rngs (get-ranges node) r (aget rngs 32) tmp (map - (take r rngs) (take r (cons 0 rngs)))] (print ":" (seq tmp))) (print ":" (seq (get-ranges node))))))) (println) (let [no-children? (zero? shift) visit-all-children? (and (not no-children?) (or (nil? (:max-depth opts)) (< (inc indent) (:max-depth opts)))) visit-some-children? (or visit-all-children? (and (not no-children?) (:always-show-fringes opts) (or on-left-fringe? on-right-fringe?)))] (if visit-some-children? (dorun (let [arr (get-array node) a (if (regular? node) arr (butlast arr)) non-nil-idxs (filter-indexes (complement nil?) a) first-non-nil-idx (first non-nil-idxs) last-non-nil-idx (last non-nil-idxs)] (map-indexed (fn [i node] (let [child-on-left-fringe? (and on-left-fringe? (= i first-non-nil-idx)) child-on-right-fringe? (and on-right-fringe? (= i last-non-nil-idx)) visit-this-child? (or visit-all-children? (and (:always-show-fringes opts) (or child-on-left-fringe? child-on-right-fringe?)))] (if visit-this-child? (go (inc indent) (- shift 5) i node child-on-left-fringe? child-on-right-fringe?)))) a)))))))] (pd/printf "%s (%d elements):\n" (pd/abbrev-for-type-of v) (count v)) (go 0 shift 0 root true true) (println (if (pd/transient-vector? v) (pd/format "tail (tidx %d):" (pd/dbg-tidx v)) "tail:") (vec tail)))))) (defn first-diff "Compare two sequences to see if they have = elements in the same order, and both sequences have the same number of elements. If all of those conditions are true, and no exceptions occur while calling seq, first, and next on the seqs of xs and ys, then return -1. If two elements at the same index in each sequence are found not = to each other, or the sequences differ in their number of elements, return the index, 0 or larger, at which the first difference occurs. If an exception occurs while calling seq, first, or next, throw an exception that contains the index at which this exception occurred." [xs ys] (loop [i 0 xs (seq xs) ys (seq ys)] (if (try (and xs ys (= (first xs) (first ys))) (catch Exception e (.printStackTrace e) i)) (let [xs (try (next xs) (catch Exception e (prn :xs i) (throw e))) ys (try (next ys) (catch Exception e (prn :ys i) (throw e)))] (recur (inc i) xs ys)) (if (or xs ys) i -1)))) ;; When using non-default parameters for the tree data structure, ;; e.g. shift-increment not 5, then in test code with calls to ;; checking-* functions, they will be expecting those same non-default ;; parameter values, and will give errors if they are ever given a ;; vector returned by clojure.core/vec, because without changes to ;; Clojure itself, they always have shift-increment 5 and max-branches ;; 32. ;; ;; If we use (fv/vec coll) consistently in the test code, that in many ;; cases returns a core.rrb-vector data structure, but if given a ;; Clojure vector, it still returns that Clojure vector unmodified, ;; which has the same issues for checking-* functions. By ;; calling (fv/vec (seq coll)) when not using default parameters, we ;; force the return value of cvec to always be a core.rrb-vector data ;; structure. ;; ;; The name 'cvec' is intended to mean "construct a vector", and only ;; intended for use in test code that constructs vectors used as ;; parameters to other functions operating on vectors. (defn cvec [coll] (if (= p/shift-increment 5) (clojure.core/vec coll) (fv/vec (seq coll)))) (defn slow-into [to from] (reduce conj to from)) (defn all-vector-tree-nodes [v] (let [{:keys [v get-root get-shift get-array regular?]} (pd/unwrap-subvec-accessors-for v) root (get-root v) shift (get-shift v)] (letfn [(go [depth shift node] (if node (if (not= shift 0) (cons {:depth depth :shift shift :kind :internal :node node} (apply concat (map (partial go (inc depth) (- shift 5)) (let [arr (get-array node)] (if (regular? node) arr (butlast arr)))))) (cons {:depth depth :shift shift :kind :internal :node node} (map (fn [x] {:depth (inc depth) :kind :leaf :value x}) (get-array node))))))] (cons {:depth 0 :kind :base :shift shift :value v} (go 1 shift root))))) ;; All nodes that should be internal nodes are one of the internal ;; node types satisfying internal-node? All nodes that are less ;; than "leaf depth" must be internal nodes, and none of the ones ;; at "leaf depth" should be. Probably the most general restriction ;; checking for leaf values should be simply that they are any type ;; that is _not_ an internal node type. They could be objects that ;; return true for is-vector? for example, if a vector is an element ;; of another vector. (defn leaves-with-internal-node-type [node-infos] (filter (fn [node-info] (and (= :leaf (:kind node-info)) (pd/internal-node? (:node node-info)))) node-infos)) (defn non-leaves-not-internal-node-type [node-infos] (filter (fn [node-info] (and (= :internal (:kind node-info)) (not (pd/internal-node? (:node node-info))))) node-infos)) ;; The definition of nth in deftype Vector implies that every ;; descendant of a 'regular' node must also be regular. That would be ;; a straightforward sanity check to make, to return an error if a ;; non-regular node is found with a regular ancestor in the tree. (defn basic-node-errors [v] (let [{:keys [v get-shift]} (pd/unwrap-subvec-accessors-for v) shift (get-shift v) nodes (all-vector-tree-nodes v) by-kind (group-by :kind nodes) leaf-depths (set (map :depth (:leaf by-kind))) expected-leaf-depth (+ (quot shift 5) 2) max-internal-node-depth (->> (:internal by-kind) (map :depth) (apply max)) ;; Be a little loose in checking here. If we want to narrow ;; it down to one expected answer, we would need to look at ;; the tail to see how many elements it has, then use the ;; different between (count v) and that to determine how many ;; nodes are in the rest of the tree, whether it is 0 or ;; non-0. expected-internal-max-depths (cond (= (count v) 0) #{(- expected-leaf-depth 2)} (> (count v) 33) #{(dec expected-leaf-depth)} :else #{(dec expected-leaf-depth) (- expected-leaf-depth 2)})] (cond (not= (mod shift 5) 0) {:error true :description (str "shift value in root must be a multiple of 5. Found " shift) :data shift} ;; It is OK for this set size to be 0 if no leaves, but if there ;; are leaves, they should all be at the same depth. (> (count leaf-depths) 1) {:error true :description (str "There are leaf nodes at multiple different depths: " leaf-depths) :data leaf-depths} (and (= (count leaf-depths) 1) (not= (first leaf-depths) expected-leaf-depth)) {:error true :description (str "Expecting all leaves to be at depth " expected-leaf-depth " because root has shift=" shift " but found leaves at depth " (first leaf-depths)) :data leaf-depths} (not (contains? expected-internal-max-depths max-internal-node-depth)) {:error true :description (str "Expecting there to be some internal nodes at one of" " these depths: " expected-internal-max-depths " because count=" (count v) " and root has shift=" shift " but max depth among all internal nodes found was " max-internal-node-depth)} (seq (leaves-with-internal-node-type nodes)) {:error true :description "A leaf (at max depth) has one of the internal node types, returning true for internal-node?" :data (first (leaves-with-internal-node-type nodes))} (seq (non-leaves-not-internal-node-type nodes)) {:error true :description "A non-leaf node has a type that returns false for internal-node?" :data (first (non-leaves-not-internal-node-type nodes))} :else {:error false}))) ;; I believe that objects-in-slot-32-of-obj-arrays and ;; ranges-not-int-array are only called directly from one test ;; namespace right now. Consider making a combined invariant checking ;; function in this debug namespace that can be used from any test ;; namespace (or other debug-time code) that a developer wants to. (defn objects-in-slot-32-of-obj-arrays "Function to look for errors of the form where a node's node.array object, which is often an array of 32 or 33 java.lang.Object's, has an element at index 32 that is not nil, and refers to an object that is of any type _except_ an array of ints. There appears to be some situation in which this can occur, but it seems to almost certainly be a bug if that happens, and we should be able to detect it whenever it occurs." [v] (let [{:keys [v get-array]} (pd/unwrap-subvec-accessors-for v) node-maps (all-vector-tree-nodes v) internal (filter #(= :internal (:kind %)) node-maps)] (keep (fn [node-info] ;; TBD: Is there a way to do ^objects type hint for clj, ;; but none for cljs? Is it harmful for cljs to have such ;; a type hint? ;;(let [^objects arr (get-array (:node node-info)) (let [arr (get-array (:node node-info)) n (count arr)] (if (== n 33) (aget arr 32)))) internal))) ;; TBD: Should this function be defined in platform-specific file? ;;(defn ranges-not-int-array [x] ;; (seq (remove int-array? (objects-in-slot-32-of-obj-arrays x)))) ;; edit-nodes-errors is completely defined in platform-specific source ;; files. It is simply quite different between clj/cljs. (defn edit-nodes-errors [v] (pd/edit-nodes-errors v all-vector-tree-nodes)) (defn regular-node-errors [root-node? root-node-cnt children] ;; For regular nodes, there should be zero or more 'full' children, ;; followed optionally by one 'partial' child, followed by nils. (let [[full-children others] (split-with :full? children) [partial-children others] (split-with #(and (not (:full? %)) (not= :nil (:kind %))) others) [nil-children others] (split-with #(= :nil (:kind %)) others) num-full (count full-children) num-partial (count partial-children) num-non-nil (+ num-full num-partial)] (cond (not= 0 (count others)) {:error true, :kind :internal, :description (str "Found internal regular node with " num-full " full, " num-partial " partial, " (count nil-children) " nil, " (count others) " 'other' children." " - expected 0 children after nils.")} (> num-partial 1) {:error true, :kind :internal, :description (str "Found internal regular node with " num-full " full, " num-partial " partial, " (count nil-children) " nil children" " - expected 0 or 1 partial.")} (not (or (and root-node? (<= root-node-cnt 32) ;; all elements in tail (= 0 num-non-nil)) (<= 1 num-non-nil 32))) {:error true, :kind :internal :description (str "Found internal regular node with # full + # partial=" num-non-nil " children outside of range [1, " 32 "]." " root-node?=" root-node? " root-node-cnt=" root-node-cnt) :data children} :else {:error false, :kind :internal, :full? (= 32 (count full-children)) :count (reduce + (map #(or (:count %) 0) children))}))) (defn non-regular-node-errors [node get-ranges children] (let [rng (get-ranges node) [non-nil-children others] (split-with #(not= :nil (:kind %)) children) [nil-children others] (split-with #(= :nil (:kind %)) others) num-non-nil (count non-nil-children) num-nil (count nil-children) expected-ranges (reductions + (map :count non-nil-children))] (cond (not= 0 (count others)) {:error true, :kind :internal, :description (str "Found internal non-regular node with " num-non-nil " non-nil, " num-nil " nil, " (count others) " 'other' children." " - expected 0 children after nils.")} (not= num-non-nil (aget rng 32)) {:error true, :kind :internal, :description (str "Found internal non-regular node with " num-non-nil " non-nil, " num-nil " nil children, and" " last elem of ranges=" (aget rng 32) " - expected it to match # non-nil children.")} (not= expected-ranges (take (count expected-ranges) (seq rng))) {:error true, :kind :internal, :description (str "Found internal non-regular node with " num-non-nil " non-nil, " num-nil " nil children, and" " # children prefix sums: " (seq expected-ranges) " - expected that to match stored ranges: " (seq rng))} ;; I believe that there must always be at least one ;; non-nil-child. By checking for this condition, we will ;; definitely find out if it is ever violated. ;; TBD: What if we have a tree with ranges, and then remove all ;; elements? Does the resulting tree triger this error? (not (<= 1 (aget rng 32) 32)) {:error true, :kind :internal :description (str "Found internal non-regular node with (aget rng 32)" "=" (aget rng 32) " outside of range [1, 32].")} :else {:error false, :kind :internal, :full? false, :count (last expected-ranges)}))) (defn max-capacity-divided-by-1024 [root-shift] (let [shift-amount (max 0 (- root-shift 5))] (bit-shift-left 1 shift-amount))) (defn fraction-full [v] (let [{:keys [v get-shift]} (pd/unwrap-subvec-accessors-for v) root-shift (get-shift v) tail-off (pd/dbg-tailoff v) max-tree-cap (bit-shift-left 1 (+ root-shift 5))] (/ (* 1.0 tail-off) max-tree-cap))) (defn ranges-errors [v] (let [{:keys [v get-root get-shift get-tail get-cnt get-array get-ranges regular? tail-len]} (pd/unwrap-subvec-accessors-for v) root (get-root v) root-node-cnt (count v) root-shift (get-shift v) tail-off (pd/dbg-tailoff v) tail (get-tail v)] (letfn [ (go [shift node] (cond (nil? node) {:error false :kind :nil} (zero? shift) (let [n (count (get-array node))] (merge {:error (zero? n), :kind :leaves, :full? (= n 32), :count n} (if (zero? n) {:description (str "Leaf array has 0 elements." " Expected > 0.")}))) :else ;; non-0 shift (let [children (map (partial go (- shift 5)) (let [arr (get-array node)] (if (regular? node) arr (butlast arr)))) errs (filter :error children)] (cond (seq errs) {:error true, :description "One or more errors found", :data errs} (not= 32 (count children)) {:error true, :kind :internal, :description (str "Found internal node that has " (count children) " children - expected 32.")} (regular? node) (regular-node-errors (= shift root-shift) root-node-cnt children) :else (non-regular-node-errors node get-ranges children)))))] (let [x (go root-shift root)] (cond (:error x) x (not= tail-off (:count x)) {:error true, :kind :root, :description (str "Found tail-off=" tail-off " != " (:count x) "=count of values beneath internal nodes") :internal-node-leaf-count (:count x) :tail-off tail-off :cnt (get-cnt v)} (and (pd/transient-vector? v) (not= (tail-len tail) 32)) {:error true, :kind :root, :description (str "Found transient vector with tail length " (tail-len tail) " - expecting 32")} ;; It is always a bad thing if shift becomes more than 32, ;; because the bit-shift-left and bit-shift-right operations ;; on 32-bit ints actually behave like (bit-shift-left ;; x (mod shift-amount 32)) for shift-amount over 32. It is ;; also likely a bug in the implementation if that happens. (>= root-shift 32) {:error true, :kind :root, :description (str "shift of root is " root-shift " >= 32," " which is not supported.")} ;; This is not necessarily a bug, but it seems likely to be ;; a bug if a tree is less than 1/1024 full compared to its ;; max capacity. 1/32 full is normal when a tree becomes 1 ;; deeper than it was before. (< 0 (:count x) (max-capacity-divided-by-1024 root-shift)) {:error false, :warning true, :kind :root-too-deep, :description (str "For root shift=" root-shift " the maximum " "capacity divided by 1024 is " (max-capacity-divided-by-1024 root-shift) " but the tree contains only " (:count x) " vector elements outside of the tail")} :else x))))) #_(defn add-return-value-checks [f err-desc-str return-value-check-fn] (fn [& args] (let [ret (apply f args)] (apply return-value-check-fn err-desc-str ret args) ret))) (defn copying-seq [v] (let [{:keys [v subvector? subvec-start subvec-end get-root get-shift get-tail get-array regular?]} (pd/unwrap-subvec-accessors-for v) root (get-root v) shift (get-shift v)] (letfn [(go [shift node] (if node (if (not= shift 0) (apply concat (map (partial go (- shift 5)) (let [arr (get-array node)] (if (regular? node) arr (butlast arr))))) (seq (get-array node)))))] (doall ;; always return a fully realized sequence. (let [all-elems (concat (go shift root) (if (pd/transient-vector? v) (take (pd/dbg-tidx v) (get-tail v)) (seq (get-tail v))))] (if subvector? (take (- subvec-end subvec-start) (drop subvec-start all-elems)) all-elems)))))) (def failure-data (atom [])) (def warning-data (atom [])) (defn clear-failure-data! [] (reset! failure-data [])) (let [orig-conj clojure.core/conj] (defn record-failure-data [d] (swap! failure-data orig-conj d)) (defn record-warning-data [d] (swap! warning-data orig-conj d))) ;; I would like to achieve a goal of providing an easy-to-use way that ;; a Clojure or ClojureScript developer could call a function, or ;; invoke their own code in a macro, and then within the run-time ;; scope of that, a selected set of calls to functions like conj, ;; conj!, pop, pop!, transient, subvec, slicev, catvec, splicev, and ;; perhaps others, would have extra checks enabled, such that if they ;; detected a bug, they would stop the execution immediately with a ;; lot of debug information recorded as near to the point of the ;; failure as can be achieved by checking the return values of such ;; function calls. ;; It would also be good if this goal could be achieved without having ;; a separate implementation of all of those functions, and/or custom ;; versions of Clojure, ClojureScript, or the core.rrb-vector library ;; to use. Actually a separate implementation of core.rrb-vector ;; might be acceptable and reasonable to implement and maintain, but ;; separate versions of Clojure and ClojureScript seems like too much ;; effort for the benefits achieved. ;; I have investigated approaches that attempt to use with-redefs on ;; the 'original Vars' in Clojure, and also in a ClojureScript ;; Node-based REPL. ;; There are differences between with-redefs behavior on functions in ;; clojure.core between Clojure and ClojureScript, because ;; direct-linking seems to also include user code calling to ;; clojure.core functions with ClojureScript: ;; https://clojure.atlassian.net/projects/CLJS/issues/CLJS-3154 ;; At least in Clojure, and perhaps also in ClojureScript, there is ;; sometimes an effect similar to direct linking involved when calling ;; protocol methods on objects defined via deftype. That prevents ;; with-redefs, and any technique that changes the definition of a Var ;; with alter-var-root! or set!, from causing the alternate function ;; to be called. ;; Here are the code paths that I think are most useful for debug ;; checks of operations on vectors. ;; Functions in clojure.core: ;; Lower value, because they are simpler functions, and in particular ;; do not operate on RRB vector trees with ranges inside: ;; vec vector vector-of ;; Similarly the RRB vector variants of those functions create regular ;; RRB vectors, so not as likely to have bugs. ;; peek can operate on trees with ranges inside, but always accesses ;; the tail, so not nearly as likely to have bugs. ;; Higher value, because they can operate on RRB vectors with ranges ;; inside the tree: ;; conj pop assoc ;; conj! pop! assoc! ;; transient persistent! ;; seq rseq ;; Functions in clojure.core.rrb-vector namespace, and internal ;; implementation functions/protocol-methods that they use: ;; defn fv/catvec ;; calls itself recursively for many args (clj and cljs versions) ;; -splicev protocol function (splicev for clj) ;; When -splicev is called on PersistentVector or Subvec, -as-rrbt ;; converts it to Vector, then method below is called. ;; deftype Vector -splicev / splicev method ;; -as-rrbt (cljs) / as-rrbt (clj) ;; -slicev (cljs) / slicev (clj) if used on a subvector object ;; defn splice-rrbts ;; defn splice-rrbts-main ;; Calls many internal implementation detail functions. ;; peephole-optimize-root ;; fallback-to-slow-splice-if-needed ;; defn fv/subvec ;; -slicev (cljs) / slicev (clj) protocol function ;; deftype Vector -slicev method ;; Calls many internal implementation detail functions, ;; e.g. slice-left slice-right make-array array-copy etc. ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ;; See the documentation of the several checking-* functions for the ;; keys supported inside of the @debug-opts map. (def debug-opts (atom {})) (def full-debug-opts {:trace false :validate true :return-value-checks [edit-nodes-errors basic-node-errors ranges-errors] ;; false -> throw an exception when error detected :continue-on-error false ;; true -> do not throw an exception when warning found :continue-on-warning true}) (defn set-debug-opts! "set-debug-opts! modified the debug-opts atom of the core.rrb-vector library, which configures what kinds of extra checks are performed when calling the checking-* versions of functions defined in the library's debug namespace. Example call: (require '[clojure.core.rrb-vector.debug :as d]) (d/set-debug-opts! d/full-debug-opts) This call enables as thorough of extra verification checks as is supported by existing code, when you call any of the checking-* variants of the functions in this namespace, e.g. checking-catvec, checking-subvec. It will also slow down your code to do so. checking-* functions return the same values as their non checking-* original functions they are based upon, so you can write application code that mixes calls to both, calling the checking-* versions only occasionally, if you have a long sequence of operations that you want to look for bugs within core.rrb-vector's implementation of." [opts] (reset! debug-opts {:catvec opts ;; affects checking-catvec behavior, ;; via calling checking-splicev and ;; checking-splice-rrbts and enabling ;; their extra checks. :subvec opts ;; affects checking-subvec behavior, ;; via calling checking-slicev and ;; enabling its extra checks :pop opts ;; affects checking-pop :pop! opts ;; affects checking-pop! :transient opts})) ;; affects checking-transient (defn validation-failure [err-msg-str failure-data opts] (println "ERROR:" err-msg-str) (record-failure-data failure-data) (when-not (:continue-on-error opts) (throw (ex-info err-msg-str failure-data)))) (defn sanity-check-vector-internals "This function is called by all of the checking-* variants of functions in the debug namespace. It calls all of the functions in (:return-value-checks opts) in the order given, passing each of those functions a return value 'ret'. Each function performs sanity checks on the 'ret' data structure used to represent the vector. Those functions should return a map with key :error having a logical true value if any errors were found, or a key :warning having a logical true value if any warnings were found, otherwise both of those values must be logical false in the returned map (or no such key is present in the returned map at all). Three examples of such functions are included in core.rrb-vector's debug namespace. * edit-nodes-errors * basic-node-errors * ranges-errors They each look for different problems in the vector data structure internals. They were developed as separate functions in case there was ever a significant performance advantage to configuring only some of them to be called, not all of them, for long tests. If any errors are found, this function calls record-failure-data, to record the details in a global atom. It prints a message to *out*, and if (:continue-on-error opts) is logical false, it throws a data conveying exception using ex-info containing the same message, and the same error details map passed to record-failure-data. If no exception is thrown due to an error, then repeat the same checks for a warning message, recording details via calling record-warning-data, and throwing an exception if (:continue-on-warning opts) is logical false." [err-desc-str ret args opts] (doseq [check-fn (:return-value-checks opts)] (let [i (check-fn ret)] (when (:error i) (let [msg (str "found error in ret value from " err-desc-str ": " (:description i)) failure-data {:err-desc-str err-desc-str, :ret ret, :args args, :error-info i}] (println "ERROR:" msg) (record-failure-data failure-data) (when-not (:continue-on-error opts) (throw (ex-info msg failure-data))))) (when (:warning i) ;; It is perfectly normal for fv/subvec and slicev to return a ;; vector that causes this warning. (when-not (and (= err-desc-str "slicev") (= :root-too-deep (:kind i))) (let [msg (str "possible issue with ret value from " err-desc-str ": " (:description i)) failure-data {:err-desc-str err-desc-str, :ret ret, :args args, :error-info i}] (println "WARNING:" msg) (record-warning-data failure-data) (when-not (:continue-on-warning opts) (throw (ex-info msg failure-data))))))))) (defn validating-pop "validating-pop is not really designed to be called from user programs. checking-pop can do everything that validating-pop can, and more. See its documentation. A typical way of calling validating-pop is: (require '[clojure.core.rrb-vector.debug :as d]) (d/validating-pop clojure.core/pop \"pop\" coll) Most of the validating-* functions behave similarly. This one contains the most complete documentation, and the others refer to this one. They all differ in the function that they are intended to validate, and a few other details, which will be collected in one place here for function validating-pop so one can quickly see the differences between validating-pop and the other validating-* functions. good example f: clojure.core/pop opts map: (get @d/debug-opts :pop) The first argument can be any function f. f is expected to take arguments and return a value equal to what clojure.core/pop would, given the argument coll. validating-pop will first make a copy of the seq of items in coll, as a safety precaution, because some kinds of incorrect implementations of pop could mutate their input argument. That would be a bug, of course, but aiding a developer in detecting bugs is the reason validating-pop exists. It uses the function copying-seq to do this, which takes at least linear time in the size of coll. It will then calculate a sequence that is = to the expected return value, e.g. for pop, all items in coll except the last one. Then validating-pop will call (f coll), then call copying-seq on the return value. If the expected and returned sequences are not =, then a map containing details about the arguments and actual return value is created and passed to d/record-failure-data, which appends the map to the end of a vector that is the value of an atom named d/failure-data. An exception is thrown if (:continue-on-error opts) is logical false, with ex-data equal to this same map of error data. If the expected and actual sequences are the same, no state is modified and no exception is thrown. If validating-pop does not throw an exception, the return value is (f coll)." [f err-desc-str coll] (let [coll-seq (copying-seq coll) exp-ret-seq (butlast coll-seq) ret (f coll) ret-seq (copying-seq ret)] (when (not= ret-seq exp-ret-seq) (validation-failure "(pop coll) returned incorrect value" {:err-desc-str err-desc-str, :ret ret, :args (list coll), :coll-seq coll-seq, :ret-seq ret-seq, :exp-ret-seq exp-ret-seq} (get @debug-opts :pop))) ret)) (defn checking-pop "These two namespace aliases will be used later in this documentation: (require '[clojure.core.rrb-vector.debug :as d]) (require '[clojure.core.rrb-vector.debug-platform-dependent :as pd]) checking-pop passes its argument to clojure.core/pop, and if it returns, it returns whatever clojure.core/pop does. If checking-pop detects any problems, it will record information about the problems found in one or both of the global atoms 'd/failure-data' and 'd/warning-data', and optionally throw an exception. If coll is not a vector type according to pd/is-vector?, then checking-pop simply behaves exactly like clojure.core/pop, with no additional checks performed. All of checking-pop's extra checks are specific to vectors. If coll is a vector, then checking-pop looks up the key :pop in a global atom 'd/debug-opts'. The result of that lookup is a map we will call 'opts' below. opts map: (get @d/debug-opts :pop) function called if (:validating opts) is logical true: validating-pop If (:trace opts) is true, then a debug trace message is printed to *out*. If (:validate opts) is true, then validating-pop is called, using clojure.core/pop to do the real work, but validating-pop will check whether the return value looks correct relative to the input parameter value, i.e. it is equal to a sequence of values containing all but the last element of the input coll's sequence of values. See validating-pop documentation for additional details. This step records details of problems found in the atoms d/failure-data. (:return-value-checks opts) should be a sequence of functions that each take the vector returned from calling clojure.core/pop, and return data about any errors or warnings they find in the internals of the vector data structure. Errors or warnings are appended to atoms d/failure-data and/or d/warning-data. If either the validate or return value checks steps find an error, they throw an exception if (:continue-on-error opts) is logical false. If the return value checks step finds no error, but does find a warning, it throws an exception if (:continue-on-warning opts) is logical false." [coll] (if-not (pd/is-vector? coll) (clojure.core/pop coll) (let [opts (get @debug-opts :pop) err-desc-str "pop"] (when (:trace opts) (println "checking-pop called with #v=" (count coll) "(type v)=" (type coll))) (let [ret (if (:validate opts) (validating-pop clojure.core/pop err-desc-str coll) (clojure.core/pop coll))] (sanity-check-vector-internals err-desc-str ret [coll] opts) ret)))) (defn validating-pop! "validating-pop! behaves the same as validating-pop, with the differences described here. See validating-pop for details. good example f: clojure.core/pop! opts map: (get @d/debug-opts :pop!) If no exception is thrown, the return value is (f coll)." [f err-desc-str coll] (let [coll-seq (copying-seq coll) exp-ret-seq (butlast coll-seq) ret (f coll) ret-seq (copying-seq ret)] (when (not= ret-seq exp-ret-seq) (validation-failure "(pop! coll) returned incorrect value" {:err-desc-str err-desc-str, :ret ret, :args (list coll), :coll-seq coll-seq, :ret-seq ret-seq, :exp-ret-seq exp-ret-seq} (get @debug-opts :pop!))) ret)) (defn checking-pop! "checking-pop! is similar to checking-pop, with the differences summarized below. See checking-pop documentation for details. opts map: (get @d/debug-opts :pop!) function called if (:validating opts) is logical true: validating-pop!" [coll] (if-not (pd/is-vector? coll) (clojure.core/pop! coll) (let [opts (get @debug-opts :pop!) err-desc-str "pop!"] (when (:trace opts) (println "checking-pop! called with #v=" (count coll) "(type v)=" (type coll))) (let [ret (if (:validate opts) (validating-pop! clojure.core/pop! err-desc-str coll) (clojure.core/pop! coll))] (sanity-check-vector-internals err-desc-str ret [coll] opts) ret)))) (defn validating-transient "validating-transient behaves the same as validating-pop, with the differences described here. See validating-pop for details. good example f: clojure.core/transient opts map: (get @d/debug-opts :transient) If no exception is thrown, the return value is (f coll)." [f err-desc-str coll] (let [coll-seq (copying-seq coll) exp-ret-seq coll-seq ret (f coll) ret-seq (copying-seq ret)] (when (not= ret-seq exp-ret-seq) (validation-failure "(transient coll) returned incorrect value" {:err-desc-str err-desc-str, :ret ret, :args (list coll), :coll-seq coll-seq, :ret-seq ret-seq, :exp-ret-seq exp-ret-seq} (get @debug-opts :transient))) ret)) (defn checking-transient "checking-transient is similar to checking-pop, with the differences summarized below. See checking-pop documentation for details. opts map: (get @d/debug-opts :transient) function called if (:validating opts) is logical true: validating-transient" [coll] (if-not (pd/is-vector? coll) (clojure.core/transient coll) (let [opts (get @debug-opts :transient) err-desc-str "transient"] (when (:trace opts) (println "checking-transient called with #v=" (count coll) "(type v)=" (type coll))) (let [ret (if (:validate opts) (validating-transient clojure.core/transient err-desc-str coll) (clojure.core/transient coll))] (sanity-check-vector-internals err-desc-str ret [coll] opts) ret)))) (defn validating-splice-rrbts-main "validating-splice-rrbts-main behaves the same as validating-pop, with the differences described here. See validating-pop for details. good example f: clojure.core.rrb-vector.rrbt/splice-rrbts-main opts map: (get @d/debug-opts :catvec) ;; _not_ :splice-rrbts-main Given that splice-rrbts-main is an internal implementation detail of the core.rrb-vector library, it is expected that it is more likely you would call validating-catvec instead of this function. If no exception is thrown, the return value is (f v1 v2)." [err-desc-str nm am v1 v2] (let [orig-fn rrbt/splice-rrbts-main v1-seq (copying-seq v1) v2-seq (copying-seq v2) exp-ret-seq (concat v1-seq v2-seq) ret (orig-fn nm am v1 v2) ret-seq (copying-seq ret)] (when (not= ret-seq exp-ret-seq) (validation-failure "splice-rrbts-main returned incorrect value" {:err-desc-str err-desc-str, :ret ret, :args (list nm am v1 v2) :v1-seq v1-seq, :v2-seq v2-seq, :ret-seq ret-seq, :exp-ret-seq exp-ret-seq} (get @debug-opts :catvec))) ret)) (defn checking-splice-rrbts-main "checking-splice-rrbts-main is similar to checking-pop, with the differences summarized below. See checking-pop documentation for details. Unlike checking-pop, it seems unlikely that a user of core.rrb-vector would want to call this function directly. See checking-catvec. checking-splice-rrbts-main is part of the implementation of checking-catvec. opts map: (get @d/debug-opts :catvec) ;; _not_ :splice-rrbts-main function called if (:validating opts) is logical true: validating-splice-rrbts-main" [& args] (let [opts (get @debug-opts :catvec) err-desc-str "splice-rrbts-main"] (when (:trace opts) (let [[_ _ v1 v2] args] (println "checking-splice-rrbts-main called with #v1=" (count v1) "#v2=" (count v2) "(type v1)=" (type v1) "(type v2)=" (type v2)))) (let [ret (if (:validate opts) (apply validating-splice-rrbts-main err-desc-str args) (apply rrbt/splice-rrbts-main args))] (sanity-check-vector-internals err-desc-str ret args opts) ret))) (defn checking-splice-rrbts "checking-splice-rrbts is similar to checking-pop, with the differences summarized below. See checking-pop documentation for details. Unlike checking-pop, it seems unlikely that a user of core.rrb-vector would want to call this function directly. See checking-catvec. checking-splice-rrbts is part of the implementation of checking-catvec. opts map: (get @d/debug-opts :catvec) ;; _not_ :splice-rrbts function called if (:validating opts) is logical true: validating-splice-rrbts" [& args] (let [opts (get @debug-opts :catvec) err-desc-str1 "splice-rrbts checking peephole-optimize-root result" err-desc-str2 "splice-rrbts checking fallback-to-slow-splice-if-needed result" [nm am v1 v2] args] (when (:trace opts) (println "checking-splice-rrbts called with #v1=" (count v1) "#v2=" (count v2) "(type v1)=" (type v1) "(type v2)=" (type v2))) (let [r1 (checking-splice-rrbts-main nm am v1 v2) r2 (rrbt/peephole-optimize-root r1)] ;; Optimize a bit by only doing all of the sanity checks on r2 ;; if it is not the same identical data structure r1 that ;; checking-splice-rrbts-main already checked. (when-not (identical? r2 r1) (sanity-check-vector-internals err-desc-str1 r2 args opts)) (let [r3 (rrbt/fallback-to-slow-splice-if-needed v1 v2 r2)] (when-not (identical? r3 r2) (sanity-check-vector-internals err-desc-str2 r3 args opts)) r3)))) (defn checking-splicev "checking-splicev is identical to splicev, except that it calls checking-splice-rrbts instead of splice-rrbts, for configurable additional checking on each call to checking-splice-rrbts. It is more likely that a core.rrb-vector library user will want to call checking-catvec rather than this one. checking-splicev is part of the implementation of checking-catvec." [v1 v2] (let [rv1 (rrbt/as-rrbt v1)] (checking-splice-rrbts (.-nm rv1) (.-am rv1) rv1 (rrbt/as-rrbt v2)))) (defn checking-catvec-impl "checking-catvec-impl is identical to catvec, except that it calls checking-splicev instead of splicev, for configurable additional checking on each call to checking-splicev." ([] []) ([v1] v1) ([v1 v2] (checking-splicev v1 v2)) ([v1 v2 v3] (checking-splicev (checking-splicev v1 v2) v3)) ([v1 v2 v3 v4] (checking-splicev (checking-splicev v1 v2) (checking-splicev v3 v4))) ([v1 v2 v3 v4 & vn] (checking-splicev (checking-splicev (checking-splicev v1 v2) (checking-splicev v3 v4)) (apply checking-catvec-impl vn)))) (defn validating-catvec "validating-catvec behaves similarly to validating-pop, but note that it does not allow you to pass in a function f on which to concatenate its arguments. It hardcodes d/checking-catvec-impl for that purpose. See validating-pop for more details. opts map: (get @d/debug-opts :catvec) If no exception is thrown, the return value is (apply checking-catvec-impl vs)." [err-desc-str & vs] (let [orig-fn checking-catvec-impl ;; clojure.core.rrb-vector/catvec vs-seqs (doall (map copying-seq vs)) exp-ret-seq (apply concat vs-seqs) ret (apply orig-fn vs) ret-seq (copying-seq ret)] (when (not= ret-seq exp-ret-seq) (validation-failure "catvec returned incorrect value" {:err-desc-str err-desc-str, :ret ret, :args vs, :vs-seqs vs-seqs, :ret-seq ret-seq, :exp-ret-seq exp-ret-seq} (get @debug-opts :catvec))) ret)) (defn checking-catvec "checking-catvec is similar to checking-pop, with the differences summarized below. See checking-pop documentation for details. Note that (get @d/debug-otps :catvec) is used to control tracing, validating, and return value sanity checks for checking-catvec as a whole. This includes controlling those options for the function checking-splice-rrbts, which is used to concatenate pairs of vectors when you call checking-catvec with 3 or more vectors. This takes a bit longer to do the checking on every concatenation, but catches problems closer to the time they are introduced. opts map: (get @d/debug-opts :catvec) function called if (:validating opts) is logical true: validating-catvec" [& args] (let [opts (get @debug-opts :catvec) err-desc-str "catvec"] (when (:trace opts) (println "checking-catvec called with" (count args) "args:") (dorun (map-indexed (fn [idx v] (println " arg" (inc idx) " count=" (count v) "type=" (type v))) args))) (let [ret (if (:validate opts) (apply validating-catvec err-desc-str args) (apply checking-catvec-impl ;; clojure.core.rrb-vector/catvec args))] (sanity-check-vector-internals err-desc-str ret args opts) ret))) (defn validating-slicev "validating-slicev behaves similarly to validating-pop, but note that it does not allow you to pass in a function f to call. It hardcodes slicev for that purpose. See validating-pop for more details. opts map: (get @d/debug-opts :subvec) ;; _not_ :slicev" ([err-desc-str coll start] (validating-slicev err-desc-str coll start (count coll))) ([err-desc-str coll start end] (let [coll-seq (copying-seq coll) exp-ret-seq (take (- end start) (drop start coll-seq)) ret (clojure.core.rrb-vector.protocols/slicev coll start end) ret-seq (copying-seq ret)] (when (not= ret-seq exp-ret-seq) (validation-failure "(slicev coll start end) returned incorrect value" {:err-desc-str err-desc-str, :ret ret, :args (list coll start end), :coll-seq coll-seq, :ret-seq ret-seq, :exp-ret-seq exp-ret-seq} (get @debug-opts :subvec))) ret))) (defn checking-slicev "checking-slicev is similar to checking-pop, with the differences summarized below. See checking-pop documentation for details. Unlike checking-pop, it seems unlikely that a user of core.rrb-vector would want to call this function directly. See checking-subvec. checking-slicev is part of the implementation of checking-subvec. opts map: (get @d/debug-opts :subvec) ;; _not_ :slicev function called if (:validating opts) is logical true: validating-slicev" [& args] (let [opts (get @debug-opts :subvec) err-desc-str "slicev"] (when (:trace opts) (let [[v start end] args] (println "checking-slicev #v=" (count v) "start=" start "end=" end "type=" (type v)))) (let [ret (if (:validate opts) (apply validating-slicev err-desc-str args) (apply clojure.core.rrb-vector.protocols/slicev args))] (sanity-check-vector-internals err-desc-str ret args opts) ret))) (defn checking-subvec "checking-subvec is similar to checking-pop, with the differences summarized below. See checking-pop documentation for details. opts map: (get @d/debug-opts :subvec) function called if (:validating opts) is logical true: validating-slicev" ([v start] (checking-slicev v start (count v))) ([v start end] (checking-slicev v start end))) (defn check-subvec "Perform a sequence of calls to subvec an a core.rrb-vector vector, as well as a normal Clojure vector, returning true if they give the same results, otherwise false. Intended for use in tests of this library." [extra-checks? init & starts-and-ends] (let [v1 (loop [v (vec (range init)) ses (seq starts-and-ends)] (if ses (let [[s e] ses] (recur (clojure.core/subvec v s e) (nnext ses))) v)) my-subvec (if extra-checks? checking-subvec fv/subvec) v2 (loop [v (fv/vec (range init)) ses (seq starts-and-ends)] (if ses (let [[s e] ses] (recur (my-subvec v s e) (nnext ses))) v))] (pd/same-coll? v1 v2))) (defn check-catvec "Perform a sequence of calls to catvec or checking-catvec on one or more core.rrb-vector vectors. Return true if Clojure's built-in concat function give the same results, otherwise false. Intended for use in tests of this library." [extra-checks? & counts] (let [prefix-sums (reductions + counts) ranges (map range (cons 0 prefix-sums) prefix-sums) v1 (apply concat ranges) my-catvec (if extra-checks? checking-catvec fv/catvec) v2 (apply my-catvec (map fv/vec ranges))] (pd/same-coll? v1 v2))) (defn generative-check-subvec "Perform many calls to check-subvec with randomly generated inputs. Intended for use in tests of this library. Returns true if all tests pass, otherwise throws an exception containing data about the inputs that caused the failing test." [extra-checks? iterations max-init-cnt slices] (dotimes [_ iterations] (let [init-cnt (rand-int (inc max-init-cnt)) s1 (rand-int init-cnt) e1 (+ s1 (rand-int (- init-cnt s1)))] (loop [s&es [s1 e1] cnt (- e1 s1) slices slices] (if (or (zero? cnt) (zero? slices)) (if-not (try (apply check-subvec extra-checks? init-cnt s&es) (catch Exception e (throw (ex-info "check-subvec failure w/ Exception" {:init-cnt init-cnt :s&es s&es} e)))) (throw (ex-info "check-subvec failure w/o Exception" {:init-cnt init-cnt :s&es s&es}))) (let [s (rand-int cnt) e (+ s (rand-int (- cnt s))) c (- e s)] (recur (conj s&es s e) c (dec slices))))))) true) (defn generative-check-catvec "Perform many calls to check-catvec with randomly generated inputs. Intended for use in tests of this library. Returns true if all tests pass, otherwise throws an exception containing data about the inputs that caused the failing test." [extra-checks? iterations max-vcnt min-cnt max-cnt] (dotimes [_ iterations] (let [vcnt (inc (rand-int (dec max-vcnt))) cnts (vec (repeatedly vcnt #(+ min-cnt (rand-int (- (inc max-cnt) min-cnt)))))] (if-not (try (apply check-catvec extra-checks? cnts) (catch Exception e (throw (ex-info "check-catvec failure w/ Exception" {:cnts cnts} e)))) (throw (ex-info "check-catvec failure w/o Exception" {:cnts cnts}))))) true) ================================================ FILE: src/main/clojure/clojure/core/rrb_vector/debug_platform_dependent.clj ================================================ ; Copyright (c) Rich Hickey and contributors. All rights reserved. ; The use and distribution terms for this software are covered by the ; Eclipse Public License 1.0 (http://opensource.org/licenses/eclipse-1.0.php) ; which can be found in the file epl-v10.html at the root of this distribution. ; By using this software in any fashion, you are agreeing to be bound by ; the terms of this license. ; You must not remove this notice, or any other, from this software. (ns clojure.core.rrb-vector.debug-platform-dependent (:refer-clojure :exclude [format printf]) (:require [clojure.core.rrb-vector.parameters :as p] clojure.core.rrb-vector.rrbt [clojure.core.rrb-vector.nodes :refer [ranges object-nm primitive-nm object-am]] [clojure.core.rrb-vector :as fv]) (:import (clojure.lang PersistentVector PersistentVector$TransientVector PersistentVector$Node APersistentVector$SubVector) (java.util.concurrent.atomic AtomicReference) (java.lang.reflect Field Method) (clojure.core Vec VecNode ArrayManager) (clojure.core.rrb_vector.rrbt Vector Transient) (clojure.core.rrb_vector.nodes NodeManager))) ;; Work around the fact that several fields of type ;; PersistentVector$TransientVector are private, but note that this is ;; only intended for debug use. (def ^Class transient-core-vec-class (class (transient (vector)))) (def ^Field transient-core-root-field (.getDeclaredField transient-core-vec-class "root")) (.setAccessible transient-core-root-field true) (def ^Field transient-core-shift-field (.getDeclaredField transient-core-vec-class "shift")) (.setAccessible transient-core-shift-field true) (def ^Field transient-core-tail-field (.getDeclaredField transient-core-vec-class "tail")) (.setAccessible transient-core-tail-field true) (def ^Field transient-core-cnt-field (.getDeclaredField transient-core-vec-class "cnt")) (.setAccessible transient-core-cnt-field true) (def transient-core-vec-tailoff-methods (filter #(= "tailoff" (.getName ^Method %)) (.getDeclaredMethods transient-core-vec-class))) (assert (= (count transient-core-vec-tailoff-methods) 1)) (def ^Method transient-core-vec-tailoff-method (first transient-core-vec-tailoff-methods)) (.setAccessible transient-core-vec-tailoff-method true) (def ^Class persistent-core-vec-class (class (vector))) (def persistent-core-vec-tailoff-methods (filter #(= "tailoff" (.getName ^Method %)) (.getDeclaredMethods persistent-core-vec-class))) (assert (= (count persistent-core-vec-tailoff-methods) 1)) (def ^Method persistent-core-vec-tailoff-method (first persistent-core-vec-tailoff-methods)) (.setAccessible persistent-core-vec-tailoff-method true) (def format clojure.core/format) (def printf clojure.core/printf) (defn internal-node? [obj] (contains? #{PersistentVector$Node VecNode} (class obj))) (defn persistent-vector? [obj] (contains? #{PersistentVector Vec Vector} (class obj))) (defn transient-vector? [obj] (contains? #{PersistentVector$TransientVector Transient} (class obj))) (defn is-vector? [obj] (contains? #{PersistentVector Vec Vector PersistentVector$TransientVector Transient} (class obj))) (defn dbg-tailoff [v] (cond (instance? PersistentVector v) (.invoke persistent-core-vec-tailoff-method v (object-array 0)) (= PersistentVector$TransientVector (class v)) (.invoke transient-core-vec-tailoff-method v (object-array 0)) :else (.tailoff v))) (defn dbg-tidx [v] (- (count v) (dbg-tailoff v))) (defn subvector-data [v] (if (instance? APersistentVector$SubVector v) (let [^APersistentVector$SubVector v v] {:orig-v v :subvector? true :v (.v v) :subvec-start (.start v) :subvec-end (.end v)}) {:orig-v v :subvector? false :v v})) ;; All of the classes below have a .tailoff method implementation that ;; works correctly for that class. You can use the debug-tailoff ;; function to work around the fact that this method is not public for ;; some of the vector classes. (defn accessors-for [v] (condp identical? (class v) PersistentVector (let [nm object-nm, am object-am] {:get-root #(.-root ^PersistentVector %) :get-shift #(.-shift ^PersistentVector %) :get-tail #(.-tail ^PersistentVector %) :get-cnt #(.-cnt ^PersistentVector %) :get-array #(.array ^NodeManager nm %) :get-ranges #(ranges ^NodeManager nm %) :regular? #(.regular ^NodeManager nm %) :tail-len #(.alength ^ArrayManager am %) }) PersistentVector$TransientVector (let [nm object-nm, am object-am] {:get-root #(.get transient-core-root-field ^PersistentVector$TransientVector %) :get-shift #(.get transient-core-shift-field ^PersistentVector$TransientVector %) :get-tail #(.get transient-core-tail-field ^PersistentVector$TransientVector %) :get-cnt #(.get transient-core-cnt-field ^PersistentVector$TransientVector %) :get-array #(.array ^NodeManager nm %) :get-ranges #(ranges ^NodeManager nm %) :regular? #(.regular ^NodeManager nm %) :tail-len #(.alength ^ArrayManager am %) }) Vec (let [nm primitive-nm, am #(.-am ^Vec %)] {:get-root #(.-root ^Vec %) :get-shift #(.-shift ^Vec %) :get-tail #(.-tail ^Vec %) :get-cnt #(.-cnt ^Vec %) :get-array #(.array ^NodeManager nm %) :get-ranges #(ranges ^NodeManager nm %) :regular? #(.regular ^NodeManager nm %) :tail-len #(.alength ^ArrayManager am %) }) Vector (let [nm (.-nm ^Vector v), am #(.-am ^Vector %)] {:get-root #(.-root ^Vector %) :get-shift #(.-shift ^Vector %) :get-tail #(.-tail ^Vector %) :get-cnt #(.-cnt ^Vector %) :get-array #(.array ^NodeManager nm %) :get-ranges #(ranges ^NodeManager nm %) :regular? #(.regular ^NodeManager nm %) :tail-len #(.alength ^ArrayManager am %) }) Transient (let [nm (.-nm ^Transient v), am (.-am ^Transient v)] {:get-root #(.debugGetRoot ^Transient %) :get-shift #(.debugGetShift ^Transient %) :get-tail #(.debugGetTail ^Transient %) :get-cnt #(.debugGetCnt ^Transient %) :get-array #(.array ^NodeManager nm %) :get-ranges #(ranges ^NodeManager nm %) :regular? #(.regular ^NodeManager nm %) :tail-len #(.alength ^ArrayManager am %) }))) (defn unwrap-subvec-accessors-for [v] (let [{:keys [v] :as m} (subvector-data v) accessors (accessors-for v)] (merge m accessors))) (defn abbrev-for-type-of [obj] (let [cn (.getName (class obj)) d (.lastIndexOf cn ".")] (subs cn (inc d)))) (defn same-coll? [a b] (and (= (count a) (count b) (.size ^java.util.Collection a) (.size ^java.util.Collection b)) (= a b) (= b a) (= (hash a) (hash b)) (= (.hashCode ^Object a) (.hashCode ^Object b)))) ;; TBD: No cljs specific version yet (defn count-nodes [& vs] (let [m (java.util.IdentityHashMap.)] (doseq [v vs] (let [{:keys [v get-root get-shift get-array]} (unwrap-subvec-accessors-for v)] (letfn [(go [n shift] (when n (.put m n n) (if-not (zero? shift) (let [arr (get-array n) ns (take 32 arr)] (doseq [n ns] (go n (- shift 5)))))))] (go (get-root v) (get-shift v))))) (.size m))) (defn int-array? [x] (and (not (nil? x)) (.isArray (class x)) (= Integer/TYPE (. (class x) getComponentType)))) ;; TBD: No cljs-specific version of this function yet #_(defn ranges-not-int-array [x] (seq (remove int-array? (objects-in-slot-32-of-obj-arrays x)))) (defn atomicref? [x] (instance? AtomicReference x)) (defn thread? [x] (instance? java.lang.Thread x)) (defn non-identical-edit-nodes [v all-vector-tree-nodes] (let [{:keys [v]} (unwrap-subvec-accessors-for v) node-maps (all-vector-tree-nodes v) ^java.util.IdentityHashMap ihm (java.util.IdentityHashMap.)] (doseq [i node-maps] (when (= :internal (:kind i)) (.put ihm (.edit (:node i)) true))) ihm)) (defn edit-nodes-errors [v all-vector-tree-nodes] (let [{:keys [v get-root]} (unwrap-subvec-accessors-for v) klass (class v) ^java.util.IdentityHashMap ihm (non-identical-edit-nodes v all-vector-tree-nodes) objs-maybe-some-nils (.keySet ihm) ;; I do not believe that Clojure's built-in vector types can ;; ever have edit fields equal to nil, but there are some ;; cases where I have seen core.rrb-vector edit fields equal ;; to nil. As far as I can tell this seems harmless, as long ;; as it is in a persistent vector, not a transient one. objs (remove nil? objs-maybe-some-nils) neither-nil-nor-atomicref (remove atomicref? objs)] (if (seq neither-nil-nor-atomicref) {:error true :description (str "Found edit object with class " (class (first neither-nil-nor-atomicref)) " - expecting nil or AtomicReference") :data ihm :not-atomic-refs neither-nil-nor-atomicref} (let [refd-objs (map #(.get ^AtomicReference %) objs) non-nils (remove nil? refd-objs) not-threads (remove thread? non-nils) root-edit (.edit (get-root v))] (cond (seq not-threads) {:error true :description (str "Found edit AtomicReference ref'ing neither nil" " nor a Thread object") :data ihm} (persistent-vector? v) (if (= (count non-nils) 0) {:error false} {:error true :description (str "Within a persistent (i.e. not transient)" " vector, found at least one edit" " AtomicReference object that ref's a Thread" " object. Expected all of them to be nil.") :data ihm :val1 (count non-nils) :val2 non-nils}) (transient-vector? v) (cond (not= (count non-nils) 1) {:error true :description (str "Within a transient vector, found " (count non-nils) " edit AtomicReference" " object(s) that ref's a Thread object." " Expected exactly 1.") :data ihm :val1 (count non-nils) :val2 non-nils} (not (atomicref? root-edit)) {:error true :description (str "Within a transient vector, found root edit" " field that was ref'ing an object with class " (class root-edit) " - expected AtomicReference.") :data root-edit} (not (thread? (.get ^AtomicReference root-edit))) (let [obj (.get ^AtomicReference root-edit)] {:error true :description (str "Within a transient vector, found root edit" " field ref'ing an AtomicReference object," " but that in turn ref'd something with class " (class obj) " - expected java.lang.Thread.") :data obj}) :else {:error false}) :else {:error true :description (str "Unknown class " klass " for object checked" " by edit-nodes-wrong-number-of-threads") :data v}))))) ================================================ FILE: src/main/clojure/clojure/core/rrb_vector/fork_join.clj ================================================ ; Copyright (c) Rich Hickey and contributors. All rights reserved. ; The use and distribution terms for this software are covered by the ; Eclipse Public License 1.0 (http://opensource.org/licenses/eclipse-1.0.php) ; which can be found in the file epl-v10.html at the root of this distribution. ; By using this software in any fashion, you are agreeing to be bound by ; the terms of this license. ; You must not remove this notice, or any other, from this software. (ns clojure.core.rrb-vector.fork-join (:require [clojure.core.reducers :as r])) (def pool @#'r/pool) (def task @#'r/fjtask) (def invoke @#'r/fjinvoke) (def fork @#'r/fjfork) (def join @#'r/fjjoin) ================================================ FILE: src/main/clojure/clojure/core/rrb_vector/interop.clj ================================================ ; Copyright (c) Rich Hickey and contributors. All rights reserved. ; The use and distribution terms for this software are covered by the ; Eclipse Public License 1.0 (http://opensource.org/licenses/eclipse-1.0.php) ; which can be found in the file epl-v10.html at the root of this distribution. ; By using this software in any fashion, you are agreeing to be bound by ; the terms of this license. ; You must not remove this notice, or any other, from this software. (ns clojure.core.rrb-vector.interop (:require [clojure.core.rrb-vector.protocols :refer [PSliceableVector slicev PSpliceableVector splicev]] [clojure.core.rrb-vector.rrbt :refer [as-rrbt]]) (:import (clojure.core Vec) (clojure.lang PersistentVector APersistentVector$SubVector) (clojure.core.rrb_vector.rrbt Vector))) (extend-protocol PSliceableVector Vec (slicev [v start end] (slicev (as-rrbt v) start end)) PersistentVector (slicev [v start end] (slicev (as-rrbt v) start end)) APersistentVector$SubVector (slicev [v start end] (slicev (as-rrbt v) start end))) (extend-protocol PSpliceableVector Vec (splicev [v1 v2] (splicev (as-rrbt v1) v2)) PersistentVector (splicev [v1 v2] (splicev (as-rrbt v1) v2)) APersistentVector$SubVector (splicev [v1 v2] (splicev (as-rrbt v1) v2))) ================================================ FILE: src/main/clojure/clojure/core/rrb_vector/nodes.clj ================================================ ; Copyright (c) Rich Hickey and contributors. All rights reserved. ; The use and distribution terms for this software are covered by the ; Eclipse Public License 1.0 (http://opensource.org/licenses/eclipse-1.0.php) ; which can be found in the file epl-v10.html at the root of this distribution. ; By using this software in any fashion, you are agreeing to be bound by ; the terms of this license. ; You must not remove this notice, or any other, from this software. (ns clojure.core.rrb-vector.nodes (:require [clojure.core.rrb-vector.parameters :as p]) (:import (clojure.core VecNode ArrayManager) (clojure.lang PersistentVector PersistentVector$Node) (java.util.concurrent.atomic AtomicReference))) (set! *warn-on-reflection* true) (set! *unchecked-math* true) ;; :warn-on-boxed ;;; array managers (defmacro mk-am [t] (#'clojure.core/mk-am &env &form t)) (definline object [x] x) (def ams (assoc @#'clojure.core/ams :object (mk-am object))) (def object-am (ams :object)) ;;; empty nodes (def empty-pv-node PersistentVector/EMPTY_NODE) (def empty-gvec-node clojure.core/EMPTY-NODE) ;;; node managers (definterface NodeManager (node [^java.util.concurrent.atomic.AtomicReference edit arr]) (empty []) (array [node]) (^java.util.concurrent.atomic.AtomicReference edit [node]) (^boolean regular [node]) (clone [^clojure.core.ArrayManager am ^int shift node])) (def object-nm (reify NodeManager (node [_ edit arr] (PersistentVector$Node. edit arr)) (empty [_] empty-pv-node) (array [_ node] (.-array ^PersistentVector$Node node)) (edit [_ node] (.-edit ^PersistentVector$Node node)) (regular [_ node] (not (== (alength ^objects (.-array ^PersistentVector$Node node)) (int 33)))) (clone [_ am shift node] (PersistentVector$Node. (.-edit ^PersistentVector$Node node) (aclone ^objects (.-array ^PersistentVector$Node node)))))) (def primitive-nm (reify NodeManager (node [_ edit arr] (VecNode. edit arr)) (empty [_] empty-gvec-node) (array [_ node] (.-arr ^VecNode node)) (edit [_ node] (.-edit ^VecNode node)) (regular [_ node] (not (== (alength ^objects (.-arr ^VecNode node)) (int 33)))) (clone [_ am shift node] (if (zero? shift) (VecNode. (.-edit ^VecNode node) (.aclone am (.-arr ^VecNode node))) (VecNode. (.-edit ^VecNode node) (aclone ^objects (.-arr ^VecNode node))))))) ;;; ranges (defmacro ranges [nm node] `(ints (aget ~(with-meta `(.array ~nm ~node) {:tag 'objects}) 32))) (defn last-range [^NodeManager nm node] (let [rngs (ranges nm node) i (unchecked-dec-int (aget rngs 32))] (aget rngs i))) (defn regular-ranges [shift cnt] (let [step (bit-shift-left (int 1) (int shift)) rngs (int-array 33)] (loop [i (int 0) r step] (if (< r cnt) (do (aset rngs i r) (recur (unchecked-inc-int i) (unchecked-add-int r step))) (do (aset rngs i (int cnt)) (aset rngs 32 (unchecked-inc-int i)) rngs))))) ;;; root overflow (defn overflow? [^NodeManager nm root shift cnt] (if (.regular nm root) (> (bit-shift-right (unchecked-inc-int (int cnt)) (int 5)) (bit-shift-left (int 1) (int shift))) (let [rngs (ranges nm root) slc (aget rngs 32)] (and (== slc (int 32)) (or (== (int shift) (int 5)) (recur nm (aget ^objects (.array nm root) (unchecked-dec-int slc)) (unchecked-subtract-int (int shift) (int 5)) (unchecked-add-int (unchecked-subtract-int (aget rngs 31) (aget rngs 30)) (int 32)))))))) ;;; find nil / 0 (defn index-of-0 ^long [arr] (let [arr (ints arr)] (loop [l 0 h 31] (if (>= l (unchecked-dec h)) (if (zero? (aget arr l)) l (if (zero? (aget arr h)) h 32)) (let [mid (unchecked-add l (bit-shift-right (unchecked-subtract h l) 1))] (if (zero? (aget arr mid)) (recur l mid) (recur (unchecked-inc-int mid) h))))))) (defn index-of-nil ^long [arr] (loop [l 0 h 31] (if (>= l (unchecked-dec h)) (if (nil? (aget ^objects arr l)) l (if (nil? (aget ^objects arr h)) h 32)) (let [mid (unchecked-add l (bit-shift-right (unchecked-subtract h l) 1))] (if (nil? (aget ^objects arr mid)) (recur l mid) (recur (unchecked-inc-int mid) h)))))) ;;; children (defn first-child [^NodeManager nm node] (aget ^objects (.array nm node) 0)) (defn last-child [^NodeManager nm node] (let [arr (.array nm node)] (if (.regular nm node) (aget ^objects arr (dec (index-of-nil arr))) (aget ^objects arr (unchecked-dec-int (aget (ranges nm node) 32)))))) (defn remove-leftmost-child [^NodeManager nm shift parent] (let [arr (.array nm parent)] (if (nil? (aget ^objects arr 1)) nil (let [regular? (.regular nm parent) new-arr (object-array (if regular? 32 33))] (System/arraycopy arr 1 new-arr 0 31) (if-not regular? (let [rngs (ranges nm parent) rng0 (aget rngs 0) new-rngs (int-array 33) lim (aget rngs 32)] (System/arraycopy rngs 1 new-rngs 0 (dec lim)) (loop [i 0] (when (< i lim) (aset new-rngs i (- (aget new-rngs i) rng0)) (recur (inc i)))) (aset new-rngs 32 (dec (aget rngs 32))) (aset new-rngs (dec (aget rngs 32)) (int 0)) (aset ^objects new-arr 32 new-rngs))) (.node nm (.edit nm parent) new-arr))))) (defn replace-leftmost-child [^NodeManager nm shift parent pcnt child d] (if (.regular nm parent) (let [step (bit-shift-left 1 shift) rng0 (- step d) ncnt (- pcnt d) li (bit-and (bit-shift-right shift (dec pcnt)) 0x1f) arr (.array nm parent) new-arr (object-array 33) new-rngs (int-array 33)] (aset ^objects new-arr 0 child) (System/arraycopy arr 1 new-arr 1 li) (aset ^objects new-arr 32 new-rngs) (aset new-rngs 0 (int rng0)) (aset new-rngs li (int ncnt)) (aset new-rngs 32 (int (inc li))) (loop [i 1] (when (<= i li) (aset new-rngs i (+ (aget new-rngs (dec i)) step)) (recur (inc i)))) (.node nm nil new-arr)) (let [new-arr (aclone ^objects (.array nm parent)) rngs (ranges nm parent) new-rngs (int-array 33) li (dec (aget rngs 32))] (aset new-rngs 32 (aget rngs 32)) (aset ^objects new-arr 32 new-rngs) (aset ^objects new-arr 0 child) (loop [i 0] (when (<= i li) (aset new-rngs i (- (aget rngs i) (int d))) (recur (inc i)))) (.node nm nil new-arr)))) (defn replace-rightmost-child [^NodeManager nm shift parent child d] (if (.regular nm parent) (let [arr (.array nm parent) i (unchecked-dec (index-of-nil arr))] (if (.regular nm child) (let [new-arr (aclone ^objects arr)] (aset ^objects new-arr i child) (.node nm nil new-arr)) (let [arr (.array nm parent) new-arr (object-array 33) step (bit-shift-left 1 shift) rngs (int-array 33)] (aset rngs 32 (inc i)) (aset ^objects new-arr 32 rngs) (System/arraycopy arr 0 new-arr 0 i) (aset ^objects new-arr i child) (loop [j 0 r step] (when (<= j i) (aset rngs j r) (recur (inc j) (+ r step)))) (aset rngs i (int (last-range nm child))) (.node nm nil new-arr)))) (let [rngs (ranges nm parent) new-rngs (aclone rngs) i (dec (aget rngs 32)) new-arr (aclone ^objects (.array nm parent))] (aset ^objects new-arr i child) (aset ^objects new-arr 32 new-rngs) (aset new-rngs i (int (+ (aget rngs i) d))) (.node nm nil new-arr)))) ;;; fold-tail (defn new-path [^NodeManager nm ^ArrayManager am shift node] (let [reg? (== 32 (.alength am (.array nm node))) len (if reg? 32 33) arr (object-array len) rngs (if-not reg? (doto (int-array 33) (aset 0 (.alength am (.array nm node))) (aset 32 1))) ret (.node nm nil arr)] (loop [arr arr shift shift] (if (== shift 5) (do (if-not reg? (aset arr 32 rngs)) (aset arr 0 node)) (let [a (object-array len) e (.node nm nil a)] (aset arr 0 e) (if-not reg? (aset arr 32 rngs)) (recur a (- shift 5))))) ret)) (defn fold-tail [^NodeManager nm ^ArrayManager am node shift cnt tail] (let [tlen (.alength am tail) reg? (and (.regular nm node) (== tlen 32)) arr (.array nm node) li (index-of-nil arr) new-arr (object-array (if reg? 32 33)) rngs (if-not (.regular nm node) (ranges nm node)) cret (if (== shift 5) (.node nm nil tail) (fold-tail nm am (aget ^objects arr (dec li)) (- shift 5) (if (.regular nm node) (mod cnt (bit-shift-left 1 shift)) (let [li (unchecked-dec-int (aget rngs 32))] (if (pos? li) (unchecked-subtract-int (aget rngs li) (aget rngs (unchecked-dec-int li))) (aget rngs 0)))) tail)) new-rngs (ints (if-not reg? (if rngs (aclone rngs) (regular-ranges shift cnt))))] (when-not (and (or (nil? cret) (== shift 5)) (== li 32)) (System/arraycopy arr 0 new-arr 0 li) (when-not reg? (if (or (nil? cret) (== shift 5)) (do (aset new-rngs li (+ (if (pos? li) (aget new-rngs (dec li)) (int 0)) tlen)) (aset new-rngs 32 (inc li))) (do (when (pos? li) (aset new-rngs (dec li) (+ (aget new-rngs (dec li)) tlen))) (aset new-rngs 32 li)))) (if-not reg? (aset new-arr 32 new-rngs)) (if (nil? cret) (aset new-arr li (new-path nm am (unchecked-subtract-int shift 5) (.node nm nil tail))) (aset new-arr (if (== shift 5) li (dec li)) cret)) (.node nm nil new-arr)))) ================================================ FILE: src/main/clojure/clojure/core/rrb_vector/parameters.clj ================================================ ; Copyright (c) Rich Hickey and contributors. All rights reserved. ; The use and distribution terms for this software are covered by the ; Eclipse Public License 1.0 (http://opensource.org/licenses/eclipse-1.0.php) ; which can be found in the file epl-v10.html at the root of this distribution. ; By using this software in any fashion, you are agreeing to be bound by ; the terms of this license. ; You must not remove this notice, or any other, from this software. (ns clojure.core.rrb-vector.parameters) ;; This namespace exists primarily so that the parameterized version ;; of this code, and the 'production' version of this code, can be ;; more similar to each other, by requiring this namespace from most ;; of the other namespaces. ;; Even though the values below are not used in most of the production ;; code, they can serve a little bit as documentation of these ;; parameter values. (def shift-increment 5) (def shift-increment-times-2 (* 2 shift-increment)) (def max-branches (bit-shift-left 1 shift-increment)) (def branch-mask (dec max-branches)) (def max-branches-minus-1 (dec max-branches)) (def max-branches-minus-2 (- max-branches 2)) (def non-regular-array-len (inc max-branches)) (def max-branches-squared (* max-branches max-branches)) ================================================ FILE: src/main/clojure/clojure/core/rrb_vector/protocols.clj ================================================ ; Copyright (c) Rich Hickey and contributors. All rights reserved. ; The use and distribution terms for this software are covered by the ; Eclipse Public License 1.0 (http://opensource.org/licenses/eclipse-1.0.php) ; which can be found in the file epl-v10.html at the root of this distribution. ; By using this software in any fashion, you are agreeing to be bound by ; the terms of this license. ; You must not remove this notice, or any other, from this software. (ns clojure.core.rrb-vector.protocols) (defprotocol PSpliceableVector (splicev [v1 v2])) (defprotocol PSliceableVector (slicev [v start end])) (defprotocol PTransientDebugAccess (debugGetRoot [v]) (debugGetShift [v]) (debugGetTail [v]) (debugGetCnt [v])) ================================================ FILE: src/main/clojure/clojure/core/rrb_vector/rrbt.clj ================================================ ; Copyright (c) Rich Hickey and contributors. All rights reserved. ; The use and distribution terms for this software are covered by the ; Eclipse Public License 1.0 (http://opensource.org/licenses/eclipse-1.0.php) ; which can be found in the file epl-v10.html at the root of this distribution. ; By using this software in any fashion, you are agreeing to be bound by ; the terms of this license. ; You must not remove this notice, or any other, from this software. (ns clojure.core.rrb-vector.rrbt (:refer-clojure :exclude [assert ->VecSeq]) (:require [clojure.core.rrb-vector.parameters :as p] [clojure.core.rrb-vector.protocols :refer [PSliceableVector slicev PSpliceableVector splicev PTransientDebugAccess]] [clojure.core.rrb-vector.nodes :refer [ranges overflow? last-range regular-ranges first-child last-child remove-leftmost-child replace-leftmost-child replace-rightmost-child fold-tail new-path index-of-nil object-am object-nm primitive-nm]] [clojure.core.rrb-vector.transients :refer [transient-helper]] [clojure.core.rrb-vector.fork-join :as fj] [clojure.core.protocols :refer [IKVReduce]] [clojure.core.reducers :as r :refer [CollFold coll-fold]]) (:import (clojure.core ArrayManager Vec ArrayChunk) (clojure.lang RT Util Box PersistentVector APersistentVector$SubVector) (clojure.core.rrb_vector.nodes NodeManager) (java.util.concurrent.atomic AtomicReference))) (set! *warn-on-reflection* true) (set! *unchecked-math* true) ;; :warn-on-boxed (def ^:const rrbt-concat-threshold 33) (def ^:const max-extra-search-steps 2) (def ^:const elide-assertions? true) (def ^:const elide-debug-printouts? true) (defmacro assert [& args] (if-not elide-assertions? (apply #'clojure.core/assert &form &env args))) (defmacro dbg [& args] (if-not elide-debug-printouts? `(prn ~@args))) (defmacro dbg- [& args]) (defn throw-unsupported [] (throw (UnsupportedOperationException.))) (defmacro compile-if [test then else] (if (eval test) then else)) (defmacro ^:private caching-hash [coll hash-fn hash-key] `(let [h# ~hash-key] (if-not (== h# 0) h# (let [h# (~hash-fn ~coll)] (set! ~hash-key (int h#)) h#)))) (defn ^:private hash-gvec-seq [xs] (let [cnt (count xs)] (loop [h (int 1) xs (seq xs)] (if xs (let [x (first xs)] (recur (unchecked-add-int (unchecked-multiply-int 31 h) (clojure.lang.Util/hash x)) (next xs))) h)))) (definterface IVecImpl (^int tailoff []) (arrayFor [^int i]) (pushTail [^int shift ^int cnt parent tailnode]) (popTail [^int shift ^int cnt node]) (newPath [^java.util.concurrent.atomic.AtomicReference edit ^int shift node]) (doAssoc [^int shift node ^int i val])) (deftype VecSeq [^ArrayManager am ^IVecImpl vec anode ^int i ^int offset ^clojure.lang.IPersistentMap _meta ^:unsynchronized-mutable ^int _hash ^:unsynchronized-mutable ^int _hasheq] clojure.core.protocols/InternalReduce (internal-reduce [_ f val] (loop [result val aidx i off offset] (if (< aidx (count vec)) (let [node (.arrayFor vec aidx) alen (.alength am node) result (loop [result result node-idx off] (if (< node-idx alen) (let [result (f result (.aget am node node-idx))] (if (reduced? result) result (recur result (inc node-idx)))) result))] (if (reduced? result) @result (recur result (+ aidx alen) 0))) result))) Object (toString [this] (pr-str this)) (hashCode [this] (caching-hash this hash-gvec-seq _hash)) (equals [this that] (cond (identical? this that) true (not (or (sequential? that) (instance? java.util.List that))) false :else (loop [xs this ys (seq that)] (if xs (if ys (if (clojure.lang.Util/equals (first xs) (first ys)) (recur (next xs) (next ys)) false) false) (nil? ys))))) clojure.lang.IHashEq (hasheq [this] (let [h _hasheq] (if (== h 0) (compile-if (resolve 'clojure.core/hash-ordered-coll) (let [h (hash-ordered-coll this)] (do (set! _hasheq (int h)) h)) (loop [h (int 1) xs (seq this)] (if xs (recur (unchecked-add-int (unchecked-multiply-int (int 31) h) (Util/hasheq (first xs))) (next xs)) (do (set! _hasheq (int h)) h)))) h))) clojure.lang.IMeta (meta [this] _meta) clojure.lang.IObj (withMeta [this m] (VecSeq. am vec anode i offset m _hash _hasheq)) clojure.lang.Counted (count [this] (unchecked-subtract-int (unchecked-subtract-int (count vec) i) offset)) clojure.lang.ISeq (first [_] (.aget am anode offset)) (next [this] (if (< (inc offset) (.alength am anode)) (VecSeq. am vec anode i (inc offset) nil 0 0) (.chunkedNext this))) (more [this] (let [s (.next this)] (or s (clojure.lang.PersistentList/EMPTY)))) (cons [this o] (clojure.lang.Cons. o this)) (equiv [this o] (cond (identical? this o) true (or (instance? clojure.lang.Sequential o) (instance? java.util.List o)) (loop [me this you (seq o)] (if (nil? me) (nil? you) (and (clojure.lang.Util/equiv (first me) (first you)) (recur (next me) (next you))))) :else false)) (empty [_] clojure.lang.PersistentList/EMPTY) clojure.lang.Seqable (seq [this] this) clojure.lang.IChunkedSeq (chunkedFirst [_] (ArrayChunk. am anode offset (.alength am anode))) (chunkedNext [_] (let [nexti (+ i (.alength am anode))] (when (< nexti (count vec)) (VecSeq. am vec (.arrayFor vec nexti) nexti 0 nil 0 0)))) (chunkedMore [this] (let [s (.chunkedNext this)] (or s (clojure.lang.PersistentList/EMPTY)))) java.lang.Iterable (iterator [this] (let [xs (clojure.lang.Box. (seq this))] (reify java.util.Iterator (next [this] (locking xs (if-let [v (.-val xs)] (let [x (first v)] (set! (.-val xs) (next v)) x) (throw (java.util.NoSuchElementException. "no more elements in VecSeq iterator"))))) (hasNext [this] (locking xs (not (nil? (.-val xs))))) (remove [this] (throw-unsupported))))) java.io.Serializable java.util.Collection (contains [this o] (boolean (some #(= % o) this))) (containsAll [this c] (every? #(.contains this %) c)) (isEmpty [this] (zero? (count this))) (toArray [this] (into-array Object this)) (^"[Ljava.lang.Object;" toArray [this ^"[Ljava.lang.Object;" arr] (let [cnt (count this)] (if (>= (count arr) cnt) (do (dotimes [i cnt] (aset arr i (nth vec i))) arr) (into-array Object this)))) (size [this] (count this)) (add [_ o] (throw-unsupported)) (addAll [_ c] (throw-unsupported)) (clear [_] (throw-unsupported)) (^boolean remove [_ o] (throw-unsupported)) (removeAll [_ c] (throw-unsupported)) (retainAll [_ c] (throw-unsupported)) java.util.List (get [this i] (nth this i)) (indexOf [this o] (loop [xs (seq this) i 0] (if xs (let [x (first xs)] (if (= o x) i (recur (next xs) (unchecked-inc-int i)))) -1))) (lastIndexOf [this o] (loop [xs (rseq vec) l (unchecked-dec-int (- (count vec) i))] (cond (neg? l) -1 (= o (first xs)) l :else (recur (next xs) (unchecked-dec-int l))))) (listIterator [this] (.listIterator this 0)) (listIterator [this n] (let [n (java.util.concurrent.atomic.AtomicInteger. n)] (reify java.util.ListIterator (hasNext [_] (< (.get n) (count this))) (hasPrevious [_] (pos? n)) (next [_] (try (nth vec (unchecked-add-int i (unchecked-add-int offset (unchecked-dec-int (.incrementAndGet n))))) (catch IndexOutOfBoundsException e (throw (java.util.NoSuchElementException. "no more elements in VecSeq list iterator"))))) (nextIndex [_] (.get n)) (previous [_] (nth vec (unchecked-add i (unchecked-add offset (.decrementAndGet n))))) (previousIndex [_] (unchecked-dec-int (.get n))) (add [_ e] (throw-unsupported)) (remove [_] (throw-unsupported)) (set [_ e] (throw-unsupported))))) (subList [this a z] (seq (slicev vec (unchecked-add (unchecked-add i offset) a) (unchecked-add (unchecked-add i offset) z)))) (add [_ i o] (throw-unsupported)) (addAll [_ i c] (throw-unsupported)) (^Object remove [_ ^int i] (throw-unsupported)) (set [_ i e] (throw-unsupported))) (defprotocol AsRRBT (as-rrbt [v])) (defn slice-right [^NodeManager nm ^ArrayManager am node shift end] (let [shift (int shift) end (int end)] (if (zero? shift) ;; potentially return a short node, although it would be better to ;; make sure a regular leaf is always left at the right, with any ;; items over the final 32 moved into tail (and then potentially ;; back into the tree should the tail become too long...) (let [arr (.array nm node) new-arr (.array am end)] (System/arraycopy arr 0 new-arr 0 end) (.node nm nil new-arr)) (let [regular? (.regular nm node) rngs (if-not regular? (ranges nm node)) i (bit-and (bit-shift-right (unchecked-dec-int end) shift) (int 0x1f)) i (if regular? i (loop [j i] (if (<= end (aget rngs j)) j (recur (unchecked-inc-int j))))) child-end (if regular? (let [ce (unchecked-remainder-int end (bit-shift-left (int 1) shift))] (if (zero? ce) (bit-shift-left (int 1) shift) ce)) (if (pos? i) (unchecked-subtract-int end (aget rngs (unchecked-dec-int i))) end)) arr (.array nm node) new-child (slice-right nm am (aget ^objects arr i) (unchecked-subtract-int shift (int 5)) child-end) regular-child? (if (== shift (int 5)) (== (int 32) (.alength am (.array nm new-child))) (.regular nm new-child)) new-arr (object-array (if (and regular? regular-child?) 32 33)) new-child-rng (if regular-child? (let [m (mod child-end (bit-shift-left 1 shift))] (if (zero? m) (bit-shift-left 1 shift) m)) (if (== shift (int 5)) (.alength am (.array nm new-child)) (last-range nm new-child)))] (System/arraycopy arr 0 new-arr 0 i) (aset ^objects new-arr i new-child) (if-not (and regular? regular-child?) (let [new-rngs (int-array 33) step (bit-shift-left (int 1) shift)] (if regular? (dotimes [j i] (aset new-rngs j (unchecked-multiply-int (inc j) step))) (dotimes [j i] (aset new-rngs j (aget rngs j)))) (aset new-rngs i (unchecked-add-int (if (pos? i) (aget new-rngs (unchecked-dec-int i)) (int 0)) new-child-rng)) (aset new-rngs 32 (unchecked-inc-int i)) (aset new-arr 32 new-rngs))) (.node nm nil new-arr))))) (defn slice-left [^NodeManager nm ^ArrayManager am node shift start end] (let [shift (int shift) start (int start) end (int end)] (if (zero? shift) ;; potentially return a short node (let [arr (.array nm node) new-len (unchecked-subtract-int (.alength am arr) start) new-arr (.array am new-len)] (System/arraycopy arr start new-arr 0 new-len) (.node nm nil new-arr)) (let [regular? (.regular nm node) arr (.array nm node) rngs (if-not regular? (ranges nm node)) i (bit-and (bit-shift-right start shift) (int 0x1f)) i (if regular? i (loop [j i] (if (< start (aget rngs j)) j (recur (unchecked-inc-int j))))) len (if regular? (loop [i i] (if (or (== i (int 32)) (nil? (aget ^objects arr i))) i (recur (unchecked-inc-int i)))) (aget rngs 32)) child-start (if (pos? i) (unchecked-subtract-int start (if regular? (unchecked-multiply-int i (bit-shift-left (int 1) shift)) (aget rngs (unchecked-dec-int i)))) start) child-end (int (if regular? (min (bit-shift-left (int 1) shift) (if (pos? i) (unchecked-subtract-int end (unchecked-multiply-int i (bit-shift-left (int 1) shift))) end)) (let [capped-end (min (aget rngs i) end)] (if (pos? i) (unchecked-subtract-int capped-end (aget rngs (unchecked-dec-int i))) capped-end)))) new-child (slice-left nm am (aget ^objects arr i) (unchecked-subtract-int shift (int 5)) child-start child-end) new-len (unchecked-subtract-int len i) new-len (if (nil? new-child) (unchecked-dec-int new-len) new-len)] (cond (zero? new-len) nil regular? (let [new-arr (object-array 33) rngs (int-array 33) rng0 (if (or (nil? new-child) (== shift (int 5)) (.regular nm new-child)) (unchecked-subtract-int (bit-shift-left (int 1) shift) (bit-and (bit-shift-right start (unchecked-subtract-int shift (int 5))) (int 0x1f))) (int (last-range nm new-child))) step (bit-shift-left (int 1) shift)] (loop [j (int 0) r rng0] (when (< j new-len) (aset rngs j r) (recur (unchecked-inc-int j) (unchecked-add-int r step)))) (when (> new-len 1) (aset rngs (dec new-len) (- end start))) (aset rngs 32 new-len) (System/arraycopy arr (if (nil? new-child) (unchecked-inc-int i) i) new-arr 0 new-len) (if-not (nil? new-child) (aset new-arr 0 new-child)) (aset new-arr 32 rngs) (.node nm (.edit nm node) new-arr)) :else (let [new-arr (object-array 33) new-rngs (int-array 33)] (loop [j (int 0) i i] (when (< j new-len) (aset new-rngs j (unchecked-subtract-int (aget rngs i) start)) (recur (unchecked-inc-int j) (unchecked-inc-int i)))) (aset new-rngs 32 new-len) (System/arraycopy arr (if (nil? new-child) (unchecked-inc-int i) i) new-arr 0 new-len) (if-not (nil? new-child) (aset new-arr 0 new-child)) (aset new-arr 32 new-rngs) (.node nm (.edit nm node) new-arr))))))) (declare splice-rrbts ->Transient) (deftype Vector [^NodeManager nm ^ArrayManager am ^int cnt ^int shift root tail ^clojure.lang.IPersistentMap _meta ^:unsynchronized-mutable ^int _hash ^:unsynchronized-mutable ^int _hasheq] Object (equals [this that] (cond (identical? this that) true (or (instance? clojure.lang.IPersistentVector that) (instance? java.util.RandomAccess that)) (and (== cnt (count that)) (loop [i (int 0)] (cond (== i cnt) true (.equals (.nth this i) (nth that i)) (recur (unchecked-inc-int i)) :else false))) (or (instance? clojure.lang.Sequential that) (instance? java.util.List that)) (.equals (seq this) (seq that)) :else false)) (hashCode [this] (let [h _hash] (if (== h 0) (loop [h (int 1) i (int 0)] (if (== i cnt) (do (set! _hash (int h)) h) (let [val (.nth this i)] (recur (unchecked-add-int (unchecked-multiply-int (int 31) h) (Util/hash val)) (unchecked-inc-int i))))) h))) (toString [this] (pr-str this)) clojure.lang.IHashEq (hasheq [this] (let [h _hasheq] (if (== h 0) (compile-if (resolve 'clojure.core/hash-ordered-coll) (let [h (hash-ordered-coll this)] (do (set! _hasheq (int h)) h)) (loop [h (int 1) xs (seq this)] (if xs (recur (unchecked-add-int (unchecked-multiply-int (int 31) h) (Util/hasheq (first xs))) (next xs)) (do (set! _hasheq (int h)) h)))) h))) clojure.lang.Counted (count [_] cnt) clojure.lang.IMeta (meta [_] _meta) clojure.lang.IObj (withMeta [_ m] (Vector. nm am cnt shift root tail m _hash _hasheq)) clojure.lang.Indexed (nth [this i] (if (and (<= (int 0) i) (< i cnt)) (let [tail-off (.tailoff this)] (if (<= tail-off i) (.aget am tail (unchecked-subtract-int i tail-off)) (loop [i i node root shift shift] (if (zero? shift) (let [arr (.array nm node)] (.aget am arr (bit-and (bit-shift-right i shift) (int 0x1f)))) (if (.regular nm node) (let [arr (.array nm node) idx (bit-and (bit-shift-right i shift) (int 0x1f))] (loop [i i node (aget ^objects arr idx) shift (unchecked-subtract-int shift (int 5))] (let [arr (.array nm node) idx (bit-and (bit-shift-right i shift) (int 0x1f))] (if (zero? shift) (.aget am arr idx) (recur i (aget ^objects arr idx) (unchecked-subtract-int shift (int 5))))))) (let [arr (.array nm node) rngs (ranges nm node) idx (loop [j (bit-and (bit-shift-right i shift) (int 0x1f))] (if (< i (aget rngs j)) j (recur (unchecked-inc-int j)))) i (if (zero? idx) (int i) (unchecked-subtract-int (int i) (aget rngs (unchecked-dec-int idx))))] (recur i (aget ^objects arr idx) (unchecked-subtract-int shift (int 5))))))))) (throw (IndexOutOfBoundsException.)))) (nth [this i not-found] (if (and (>= i (int 0)) (< i cnt)) (.nth this i) not-found)) clojure.lang.IPersistentCollection (cons [this val] (if (< (.alength am tail) (int 32)) (let [tail-len (.alength am tail) new-tail (.array am (unchecked-inc-int tail-len))] (System/arraycopy tail 0 new-tail 0 tail-len) (.aset am new-tail tail-len val) (Vector. nm am (unchecked-inc-int cnt) shift root new-tail _meta 0 0)) (let [tail-node (.node nm (.edit nm root) tail) new-tail (let [new-arr (.array am 1)] (.aset am new-arr 0 val) new-arr)] (if (overflow? nm root shift cnt) (if (.regular nm root) (let [new-arr (object-array 32) new-root (.node nm (.edit nm root) new-arr)] (doto new-arr (aset (int 0) root) (aset (int 1) (.newPath this (.edit nm root) shift tail-node))) (Vector. nm am (unchecked-inc-int cnt) (unchecked-add-int shift (int 5)) new-root new-tail _meta 0 0)) (let [new-arr (object-array 33) new-rngs (ints (int-array 33)) new-root (.node nm (.edit nm root) new-arr) root-total-range (aget (ranges nm root) (int 31))] (doto new-arr (aset (int 0) root) (aset (int 1) (.newPath this (.edit nm root) shift tail-node)) (aset (int 32) new-rngs)) (doto new-rngs (aset (int 0) root-total-range) (aset (int 1) (unchecked-add-int root-total-range (int 32))) (aset (int 32) (int 2))) (Vector. nm am (unchecked-inc-int cnt) (unchecked-add-int shift (int 5)) new-root new-tail _meta 0 0))) (Vector. nm am (unchecked-inc-int cnt) shift (.pushTail this shift cnt root tail-node) new-tail _meta 0 0))))) (empty [_] (Vector. nm am 0 5 (.empty nm) (.array am 0) _meta 0 0)) (equiv [this that] (cond (or (instance? clojure.lang.IPersistentVector that) (instance? java.util.RandomAccess that)) (and (== cnt (count that)) (loop [i (int 0)] (cond (== i cnt) true (= (.nth this i) (nth that i)) (recur (unchecked-inc-int i)) :else false))) (or (instance? clojure.lang.Sequential that) (instance? java.util.List that)) (Util/equiv (seq this) (seq that)) :else false)) clojure.lang.IPersistentStack (peek [this] (when (pos? cnt) (.nth this (unchecked-dec-int cnt)))) (pop [this] (cond (zero? cnt) (throw (IllegalStateException. "Can't pop empty vector")) (== 1 cnt) (Vector. nm am 0 5 (.empty nm) (.array am 0) _meta 0 0) (> (.alength am tail) (int 1)) (let [new-tail (.array am (unchecked-dec-int (.alength am tail)))] (System/arraycopy tail 0 new-tail 0 (.alength am new-tail)) (Vector. nm am (unchecked-dec-int cnt) shift root new-tail _meta 0 0)) :else (let [new-tail (.arrayFor this (unchecked-subtract-int cnt (int 2))) root-cnt (.tailoff this) new-root (.popTail this shift root-cnt root)] (cond (nil? new-root) (Vector. nm am (unchecked-dec-int cnt) shift (.empty nm) new-tail _meta 0 0) (and (> shift (int 5)) (nil? (aget ^objects (.array nm new-root) 1))) (Vector. nm am (unchecked-dec-int cnt) (unchecked-subtract-int shift (int 5)) (aget ^objects (.array nm new-root) 0) new-tail _meta 0 0) :else (Vector. nm am (unchecked-dec-int cnt) shift new-root new-tail _meta 0 0))))) clojure.lang.IPersistentVector (assocN [this i val] (cond (and (<= (int 0) i) (< i cnt)) (let [tail-off (.tailoff this)] (if (>= i tail-off) (let [new-tail (.array am (.alength am tail)) idx (unchecked-subtract-int i tail-off)] (System/arraycopy tail 0 new-tail 0 (.alength am tail)) (.aset am new-tail idx val) (Vector. nm am cnt shift root new-tail _meta 0 0)) (Vector. nm am cnt shift (.doAssoc this shift root i val) tail _meta 0 0))) (== i cnt) (.cons this val) :else (throw (IndexOutOfBoundsException.)))) (length [this] (.count this)) clojure.lang.Reversible (rseq [this] (if (pos? cnt) (clojure.lang.APersistentVector$RSeq. this (unchecked-dec-int cnt)) nil)) clojure.lang.Associative (assoc [this k v] (if (Util/isInteger k) (.assocN this k v) (throw (IllegalArgumentException. "Key must be integer")))) (containsKey [this k] (and (Util/isInteger k) (<= (int 0) (int k)) (< (int k) cnt))) (entryAt [this k] (if (.containsKey this k) (clojure.lang.MapEntry. k (.nth this (int k))) nil)) clojure.lang.ILookup (valAt [this k not-found] (if (Util/isInteger k) (let [i (int k)] (if (and (>= i (int 0)) (< i cnt)) (.nth this i) not-found)) not-found)) (valAt [this k] (.valAt this k nil)) clojure.lang.IFn (invoke [this k] (if (Util/isInteger k) (let [i (int k)] (if (and (>= i (int 0)) (< i cnt)) (.nth this i) (throw (IndexOutOfBoundsException.)))) (throw (IllegalArgumentException. "Key must be integer")))) (applyTo [this args] (let [n (RT/boundedLength args 1)] (case n 0 (throw (clojure.lang.ArityException. n (.. this (getClass) (getSimpleName)))) 1 (.invoke this (first args)) 2 (throw (clojure.lang.ArityException. n (.. this (getClass) (getSimpleName))))))) clojure.lang.Seqable (seq [this] (if (zero? cnt) nil (VecSeq. am this (.arrayFor this 0) 0 0 nil 0 0))) clojure.lang.Sequential clojure.lang.IEditableCollection (asTransient [this] (->Transient nm am (identical? am object-am) cnt shift (.editableRoot transient-helper nm am root) (.editableTail transient-helper am tail) (.alength am tail))) IVecImpl (tailoff [_] (unchecked-subtract-int cnt (.alength am tail))) (arrayFor [this i] (if (and (<= (int 0) i) (< i cnt)) (if (>= i (.tailoff this)) tail (loop [i (int i) node root shift shift] (if (zero? shift) (.array nm node) (if (.regular nm node) (loop [node (aget ^objects (.array nm node) (bit-and (bit-shift-right i shift) (int 0x1f))) shift (unchecked-subtract-int shift (int 5))] (if (zero? shift) (.array nm node) (recur (aget ^objects (.array nm node) (bit-and (bit-shift-right i shift) (int 0x1f))) (unchecked-subtract-int shift (int 5))))) (let [rngs (ranges nm node) j (loop [j (bit-and (bit-shift-right i shift) (int 0x1f))] (if (< i (aget rngs j)) j (recur (unchecked-inc-int j)))) i (if (pos? j) (unchecked-subtract-int i (aget rngs (unchecked-dec-int j))) i)] (recur (int i) (aget ^objects (.array nm node) j) (unchecked-subtract-int shift (int 5)))))))) (throw (IndexOutOfBoundsException.)))) (pushTail [this shift cnt node tail-node] (if (.regular nm node) (let [arr (aclone ^objects (.array nm node)) ret (.node nm (.edit nm node) arr)] (loop [node ret shift (int shift)] (let [arr (.array nm node) subidx (bit-and (bit-shift-right (unchecked-dec-int cnt) shift) (int 0x1f))] (if (== shift (int 5)) (aset ^objects arr subidx tail-node) (if-let [child (aget ^objects arr subidx)] (let [new-carr (aclone ^objects (.array nm child)) new-child (.node nm (.edit nm root) new-carr)] (aset ^objects arr subidx new-child) (recur new-child (unchecked-subtract-int shift (int 5)))) (aset ^objects arr subidx (.newPath this (.edit nm root) (unchecked-subtract-int shift (int 5)) tail-node)))))) ret) (let [arr (aclone ^objects (.array nm node)) rngs (ranges nm node) li (unchecked-dec-int (aget rngs 32)) ret (.node nm (.edit nm node) arr) cret (if (== shift (int 5)) nil (let [child (aget ^objects arr li) ccnt (unchecked-add-int (int (if (pos? li) (unchecked-subtract-int (aget rngs li) (aget rngs (unchecked-dec-int li))) (aget rngs 0))) ;; add 32 elems to account for the new ;; full tail we plan to add to the ;; subtree. (int 32))] ;; See Note 2 in file transients.clj (if-not (overflow? nm child (unchecked-subtract-int shift (int 5)) ccnt) (.pushTail this (unchecked-subtract-int shift (int 5)) ccnt (aget ^objects arr li) tail-node))))] (if cret (do (aset ^objects arr li cret) (aset rngs li (unchecked-add-int (aget rngs li) (int 32))) ret) (do (when (>= li 31) ;; See Note 1 in file transients.clj (let [msg (str "Assigning index " (inc li) " of vector" " object array to become a node, when that" " index should only be used for storing" " range arrays.") data {:shift shift, :cnt cnt, :node node, :tail-node tail-node, :rngs rngs, :li li, :cret cret}] (throw (ex-info msg data)))) (aset ^objects arr (unchecked-inc-int li) (.newPath this (.edit nm root) (unchecked-subtract-int shift (int 5)) tail-node)) (aset rngs (unchecked-inc-int li) (unchecked-add-int (aget rngs li) (int 32))) (aset rngs 32 (unchecked-inc-int (aget rngs 32))) ret))))) (popTail [this shift cnt node] (if (.regular nm node) (let [subidx (bit-and (bit-shift-right (unchecked-subtract-int cnt (int 2)) (int shift)) (int 0x1f))] (cond (> (int shift) (int 5)) (let [new-child (.popTail this (unchecked-subtract-int (int shift) (int 5)) cnt (aget ^objects (.array nm node) subidx))] (if (and (nil? new-child) (zero? subidx)) nil (let [arr (aclone ^objects (.array nm node))] (aset arr subidx new-child) (.node nm (.edit nm root) arr)))) (zero? subidx) nil :else (let [arr (aclone ^objects (.array nm node))] (aset arr subidx nil) (.node nm (.edit nm root) arr)))) (let [rngs (ranges nm node) subidx (unchecked-dec-int (aget rngs 32)) new-rngs (aclone rngs)] (cond (> (int shift) (int 5)) (let [child (aget ^objects (.array nm node) subidx) child-cnt (if (zero? subidx) (aget rngs 0) (unchecked-subtract-int (aget rngs subidx) (aget rngs (unchecked-dec-int subidx)))) new-child (.popTail this (unchecked-subtract-int (int shift) (int 5)) child-cnt child)] (cond (and (nil? new-child) (zero? subidx)) nil (.regular nm child) (let [arr (aclone ^objects (.array nm node))] (aset new-rngs subidx (unchecked-subtract-int (aget new-rngs subidx) (int 32))) (aset arr subidx new-child) (aset arr (int 32) new-rngs) (if (nil? new-child) (aset new-rngs 32 (unchecked-dec-int (aget new-rngs 32)))) (.node nm (.edit nm root) arr)) :else (let [rng (int (last-range nm child)) diff (unchecked-subtract-int rng (if new-child (last-range nm new-child) 0)) arr (aclone ^objects (.array nm node))] (aset new-rngs subidx (unchecked-subtract-int (aget new-rngs subidx) diff)) (aset arr subidx new-child) (aset arr (int 32) new-rngs) (if (nil? new-child) (aset new-rngs 32 (unchecked-dec-int (aget new-rngs 32)))) (.node nm (.edit nm root) arr)))) (zero? subidx) nil :else (let [arr (aclone ^objects (.array nm node)) child (aget arr subidx) new-rngs (aclone rngs)] (aset arr subidx nil) (aset arr (int 32) new-rngs) (aset new-rngs subidx 0) (aset new-rngs 32 (unchecked-dec-int (aget new-rngs (int 32)))) (.node nm (.edit nm root) arr)))))) (newPath [this ^AtomicReference edit ^int shift node] (if (== (.alength am tail) (int 32)) (let [shift (int shift)] (loop [s (int 0) node node] (if (== s shift) node (let [arr (object-array 32) ret (.node nm edit arr)] (aset arr 0 node) (recur (unchecked-add-int s (int 5)) ret))))) (let [shift (int shift)] (loop [s (int 0) node node] (if (== s shift) node (let [arr (object-array 33) rngs (int-array 33) ret (.node nm edit arr)] (aset arr 0 node) (aset arr 32 rngs) (aset rngs 32 1) (aset rngs 0 (.alength am tail)) (recur (unchecked-add-int s (int 5)) ret))))))) (doAssoc [this shift node i val] (if (.regular nm node) (let [node (.clone nm am shift node)] (loop [shift (int shift) node node] (if (zero? shift) (let [arr (.array nm node)] (.aset am arr (bit-and i (int 0x1f)) val)) (let [arr (.array nm node) subidx (bit-and (bit-shift-right i shift) (int 0x1f)) next-shift (int (unchecked-subtract-int shift (int 5))) child (.clone nm am next-shift (aget ^objects arr subidx))] (aset ^objects arr subidx child) (recur next-shift child)))) node) (let [arr (aclone ^objects (.array nm node)) rngs (ranges nm node) subidx (bit-and (bit-shift-right i shift) (int 0x1f)) subidx (loop [subidx subidx] (if (< i (aget rngs subidx)) subidx (recur (unchecked-inc-int subidx)))) i (if (zero? subidx) i (unchecked-subtract-int i (aget rngs (unchecked-dec-int subidx))))] (aset arr subidx (.doAssoc this (unchecked-subtract-int (int shift) (int 5)) (aget arr subidx) i val)) (.node nm (.edit nm node) arr)))) IKVReduce (kv-reduce [this f init] (if (zero? cnt) init (loop [i (int 0) j (int 0) init init arr (.arrayFor this i) lim (unchecked-dec-int (.alength am arr)) step (unchecked-inc-int lim)] (let [init (f init (unchecked-add-int i j) (.aget am arr j))] (if (reduced? init) @init (if (< j lim) (recur i (unchecked-inc-int j) init arr lim step) (let [i (unchecked-add-int i step)] (if (< i cnt) (let [arr (.arrayFor this i) len (.alength am arr) lim (unchecked-dec-int len)] (recur i (int 0) init arr lim len)) init)))))))) CollFold ;; adapted from #'clojure.core.reducers/foldvec (coll-fold [this n combinef reducef] (let [n (int n)] (cond (zero? cnt) (combinef) (<= cnt n) (r/reduce reducef (combinef) this) :else (let [split (quot cnt 2) v1 (slicev this 0 split) v2 (slicev this split cnt) fc (fn [child] #(coll-fold child n combinef reducef))] (fj/invoke #(let [f1 (fc v1) t2 (fj/task (fc v2))] (fj/fork t2) (combinef (f1) (fj/join t2)))))))) PSliceableVector (slicev [this start end] (let [start (int start) end (int end) new-cnt (unchecked-subtract-int end start)] (cond (or (neg? start) (> end cnt)) (throw (IndexOutOfBoundsException.)) (== start end) ;; NB. preserves metadata (empty this) (> start end) (throw (IllegalStateException. "start index greater than end index")) :else (let [tail-off (.tailoff this)] (if (>= start tail-off) (let [new-tail (.array am new-cnt)] (System/arraycopy tail (unchecked-subtract-int start tail-off) new-tail 0 new-cnt) (Vector. nm am new-cnt (int 5) (.empty nm) new-tail _meta 0 0)) (let [tail-cut? (> end tail-off) new-root (if tail-cut? root (slice-right nm am root shift end)) new-root (if (zero? start) new-root (slice-left nm am new-root shift start (min end tail-off))) new-tail (if tail-cut? (let [new-len (unchecked-subtract-int end tail-off) new-tail (.array am new-len)] (System/arraycopy tail 0 new-tail 0 new-len) new-tail) (.arrayFor (Vector. nm am new-cnt shift new-root (.array am 0) nil 0 0) (unchecked-dec-int new-cnt))) new-root (if tail-cut? new-root (.popTail (Vector. nm am new-cnt shift new-root (.array am 0) nil 0 0) shift new-cnt new-root))] (if (nil? new-root) (Vector. nm am new-cnt 5 (.empty nm) new-tail _meta 0 0) (loop [r new-root s (int shift)] (if (and (> s (int 5)) (nil? (aget ^objects (.array nm r) 1))) (recur (aget ^objects (.array nm r) 0) (unchecked-subtract-int s (int 5))) (Vector. nm am new-cnt s r new-tail _meta 0 0)))))))))) PSpliceableVector (splicev [this that] (splice-rrbts nm am this (as-rrbt that))) AsRRBT (as-rrbt [this] this) java.io.Serializable java.lang.Comparable (compareTo [this that] (if (identical? this that) 0 (let [^clojure.lang.IPersistentVector v (cast clojure.lang.IPersistentVector that) vcnt (.count v)] (cond (< cnt vcnt) -1 (> cnt vcnt) 1 :else (loop [i (int 0)] (if (== i cnt) 0 (let [comp (Util/compare (.nth this i) (.nth v i))] (if (zero? comp) (recur (unchecked-inc-int i)) comp)))))))) java.lang.Iterable (iterator [this] (let [i (java.util.concurrent.atomic.AtomicInteger. 0)] (reify java.util.Iterator (hasNext [_] (< (.get i) cnt)) (next [_] (try (.nth this (unchecked-dec-int (.incrementAndGet i))) (catch IndexOutOfBoundsException e (throw (java.util.NoSuchElementException. "no more elements in RRB vector iterator"))))) (remove [_] (throw-unsupported))))) java.util.Collection (contains [this o] (boolean (some #(= % o) this))) (containsAll [this c] (every? #(.contains this %) c)) (isEmpty [_] (zero? cnt)) (toArray [this] (into-array Object this)) (^"[Ljava.lang.Object;" toArray [this ^"[Ljava.lang.Object;" arr] (if (>= (count arr) cnt) (do (dotimes [i cnt] (aset arr i (.nth this i))) arr) (into-array Object this))) (size [_] cnt) (add [_ o] (throw-unsupported)) (addAll [_ c] (throw-unsupported)) (clear [_] (throw-unsupported)) (^boolean remove [_ o] (throw-unsupported)) (removeAll [_ c] (throw-unsupported)) (retainAll [_ c] (throw-unsupported)) java.util.RandomAccess java.util.List (get [this i] (.nth this i)) (indexOf [this o] (loop [i (int 0)] (cond (== i cnt) -1 (= o (.nth this i)) i :else (recur (unchecked-inc-int i))))) (lastIndexOf [this o] (loop [i (unchecked-dec-int cnt)] (cond (neg? i) -1 (= o (.nth this i)) i :else (recur (unchecked-dec-int i))))) (listIterator [this] (.listIterator this 0)) (listIterator [this i] (let [i (java.util.concurrent.atomic.AtomicInteger. i)] (reify java.util.ListIterator (hasNext [_] (< (.get i) cnt)) (hasPrevious [_] (pos? i)) (next [_] (try (.nth this (unchecked-dec-int (.incrementAndGet i))) (catch IndexOutOfBoundsException e (throw (java.util.NoSuchElementException. "no more elements in RRB vector list iterator"))))) (nextIndex [_] (.get i)) (previous [_] (.nth this (.decrementAndGet i))) (previousIndex [_] (unchecked-dec-int (.get i))) (add [_ e] (throw-unsupported)) (remove [_] (throw-unsupported)) (set [_ e] (throw-unsupported))))) (subList [this a z] (slicev this a z)) (add [_ i o] (throw-unsupported)) (addAll [_ i c] (throw-unsupported)) (^Object remove [_ ^int i] (throw-unsupported)) (set [_ i e] (throw-unsupported))) (extend-protocol AsRRBT Vec (as-rrbt [^Vec this] (Vector. primitive-nm (.-am this) (.-cnt this) (.-shift this) (.-root this) (.-tail this) (.-_meta this) 0 0)) PersistentVector (as-rrbt [^PersistentVector this] (Vector. object-nm object-am (count this) (.-shift this) (.-root this) (.-tail this) (meta this) 0 0)) APersistentVector$SubVector (as-rrbt [^APersistentVector$SubVector this] (let [v (.-v this) start (.-start this) end (.-end this)] (slicev (as-rrbt v) start end))) java.util.Map$Entry (as-rrbt [^java.util.Map$Entry this] (as-rrbt [(.getKey this) (.getValue this)]))) (defn shift-from-to [^NodeManager nm node from to] (cond (== from to) node (.regular nm node) (recur nm (.node nm (.edit nm node) (doto (object-array 32) (aset 0 node))) (unchecked-add-int (int 5) (int from)) to) :else (recur nm (.node nm (.edit nm node) (doto (object-array 33) (aset 0 node) (aset 32 (ints (doto (int-array 33) (aset 0 (int (last-range nm node))) (aset 32 (int 1))))))) (unchecked-add-int (int 5) (int from)) to))) (defn pair ^"[Ljava.lang.Object;" [x y] (doto (object-array 2) (aset 0 x) (aset 1 y))) (defn slot-count [^NodeManager nm ^ArrayManager am node shift] (let [arr (.array nm node)] (if (zero? shift) (.alength am arr) (if (.regular nm node) (index-of-nil arr) (let [rngs (ranges nm node)] (aget rngs 32)))))) (defn subtree-branch-count [^NodeManager nm ^ArrayManager am node shift] ;; NB. positive shifts only (let [arr (.array nm node) cs (- shift 5)] (if (.regular nm node) (loop [i 0 sbc 0] (if (== i 32) sbc (if-let [child (aget ^objects arr i)] (recur (inc i) (+ sbc (long (slot-count nm am child cs)))) sbc))) (let [lim (aget (ranges nm node) 32)] (loop [i 0 sbc 0] (if (== i lim) sbc (let [child (aget ^objects arr i)] (recur (inc i) (+ sbc (long (slot-count nm am child cs))))))))))) (defn leaf-seq [^NodeManager nm arr] (mapcat #(.array nm %) (take (index-of-nil arr) arr))) (defn rebalance-leaves [^NodeManager nm ^ArrayManager am n1 cnt1 n2 cnt2 ^Box transferred-leaves] (let [slc1 (slot-count nm am n1 5) slc2 (slot-count nm am n2 5) a (+ slc1 slc2) sbc1 (subtree-branch-count nm am n1 5) sbc2 (subtree-branch-count nm am n2 5) p (+ sbc1 sbc2) e (- a (inc (quot (dec p) 32)))] (cond (<= e max-extra-search-steps) (pair n1 n2) (<= (+ sbc1 sbc2) 1024) (let [reg? (zero? (mod p 32)) new-arr (object-array (if reg? 32 33)) new-n1 (.node nm nil new-arr)] (loop [i 0 bs (partition-all 32 (concat (leaf-seq nm (.array nm n1)) (leaf-seq nm (.array nm n2))))] (when-first [block bs] (let [a (.array am (count block))] (loop [i 0 xs (seq block)] (when xs (.aset am a i (first xs)) (recur (inc i) (next xs)))) (aset new-arr i (.node nm nil a)) (recur (inc i) (next bs))))) (if-not reg? (aset new-arr 32 (regular-ranges 5 p))) (set! (.-val transferred-leaves) sbc2) (pair new-n1 nil)) :else (let [reg? (zero? (mod p 32)) new-arr1 (object-array 32) new-arr2 (object-array (if reg? 32 33)) new-n1 (.node nm nil new-arr1) new-n2 (.node nm nil new-arr2)] (loop [i 0 bs (partition-all 32 (concat (leaf-seq nm (.array nm n1)) (leaf-seq nm (.array nm n2))))] (when-first [block bs] (let [a (.array am (count block))] (loop [i 0 xs (seq block)] (when xs (.aset am a i (first xs)) (recur (inc i) (next xs)))) (if (< i 32) (aset new-arr1 i (.node nm nil a)) (aset new-arr2 (- i 32) (.node nm nil a))) (recur (inc i) (next bs))))) (if-not reg? (aset new-arr2 32 (regular-ranges 5 (- p 1024)))) (set! (.-val transferred-leaves) (- 1024 sbc1)) (pair new-n1 new-n2))))) (defn child-seq [^NodeManager nm node shift cnt] (let [arr (.array nm node) rngs (if (.regular nm node) (ints (regular-ranges shift cnt)) (ranges nm node)) cs (if rngs (aget rngs 32) (index-of-nil arr)) cseq (fn cseq [c r] (let [arr (.array nm c) rngs (if (.regular nm c) (ints (regular-ranges (- shift 5) r)) (ranges nm c)) gcs (if rngs (aget rngs 32) (index-of-nil arr))] (map list (take gcs arr) (take gcs (map - rngs (cons 0 rngs))))))] (mapcat cseq (take cs arr) (take cs (map - rngs (cons 0 rngs)))))) (defn rebalance [^NodeManager nm ^ArrayManager am shift n1 cnt1 n2 cnt2 ^Box transferred-leaves] (if (nil? n2) (pair n1 nil) (let [slc1 (slot-count nm am n1 shift) slc2 (slot-count nm am n2 shift) a (+ slc1 slc2) sbc1 (subtree-branch-count nm am n1 shift) sbc2 (subtree-branch-count nm am n2 shift) p (+ sbc1 sbc2) e (- a (inc (quot (dec p) 32)))] (cond (<= e max-extra-search-steps) (pair n1 n2) (<= (+ sbc1 sbc2) 1024) (let [new-arr (object-array 33) new-rngs (int-array 33) new-n1 (.node nm nil new-arr)] (loop [i 0 bs (partition-all 32 (concat (child-seq nm n1 shift cnt1) (child-seq nm n2 shift cnt2)))] (when-first [block bs] (let [a (object-array 33) r (int-array 33)] (aset a 32 r) (aset r 32 (count block)) (loop [i 0 o (int 0) gcs (seq block)] (when-first [[gc gcr] gcs] (aset ^objects a i gc) (aset r i (unchecked-add-int o (int gcr))) (recur (inc i) (unchecked-add-int o (int gcr)) (next gcs)))) (aset ^objects new-arr i (.node nm nil a)) (aset new-rngs i (+ (aget r (dec (aget r 32))) (if (pos? i) (aget new-rngs (dec i)) (int 0)))) (aset new-rngs 32 (inc i)) (recur (inc i) (next bs))))) (aset new-arr 32 new-rngs) (set! (.-val transferred-leaves) cnt2) (pair new-n1 nil)) :else (let [new-arr1 (object-array 33) new-arr2 (object-array 33) new-rngs1 (int-array 33) new-rngs2 (int-array 33) new-n1 (.node nm nil new-arr1) new-n2 (.node nm nil new-arr2)] (loop [i 0 bs (partition-all 32 (concat (child-seq nm n1 shift cnt1) (child-seq nm n2 shift cnt2)))] (when-first [block bs] (let [a (object-array 33) r (int-array 33)] (aset a 32 r) (aset r 32 (count block)) (loop [i 0 o (int 0) gcs (seq block)] (when-first [[gc gcr] gcs] (aset a i gc) (aset r i (unchecked-add-int o (int gcr))) (recur (inc i) (unchecked-add-int o (int gcr)) (next gcs)))) (if (and (< i 32) (> (+ (* i 32) (count block)) sbc1)) (let [tbs (- (+ (* i 32) (count block)) sbc1) li (dec (aget r 32)) d (if (>= tbs 32) (aget r li) (- (aget r li) (aget r (- li tbs))))] (set! (.-val transferred-leaves) (+ (.-val transferred-leaves) d)))) (let [new-arr (if (< i 32) new-arr1 new-arr2) new-rngs (if (< i 32) new-rngs1 new-rngs2) i (mod i 32)] (aset ^objects new-arr i (.node nm nil a)) (aset new-rngs i (+ (aget r (dec (aget r 32))) (if (pos? i) (aget new-rngs (dec i)) (int 0)))) (aset new-rngs 32 (int (inc i)))) (recur (inc i) (next bs))))) (aset new-arr1 32 new-rngs1) (aset new-arr2 32 new-rngs2) (pair new-n1 new-n2)))))) (defn zippath [^NodeManager nm ^ArrayManager am shift n1 cnt1 n2 cnt2 ^Box transferred-leaves] (if (== shift 5) (rebalance-leaves nm am n1 cnt1 n2 cnt2 transferred-leaves) (let [c1 (last-child nm n1) c2 (first-child nm n2) ccnt1 (if (.regular nm n1) (let [m (mod cnt1 (bit-shift-left 1 shift))] (if (zero? m) (bit-shift-left 1 shift) m)) (let [rngs (ranges nm n1) i (dec (aget rngs 32))] (if (zero? i) (aget rngs 0) (- (aget rngs i) (aget rngs (dec i)))))) ccnt2 (if (.regular nm n2) (let [m (mod cnt2 (bit-shift-left 1 shift))] (if (zero? m) (bit-shift-left 1 shift) m)) (aget (ranges nm n2) 0)) next-transferred-leaves (Box. 0) [new-c1 new-c2] (zippath nm am (- shift 5) c1 ccnt1 c2 ccnt2 next-transferred-leaves) d (.-val next-transferred-leaves)] (set! (.-val transferred-leaves) (+ (.-val transferred-leaves) d)) (rebalance nm am shift (if (identical? c1 new-c1) n1 (replace-rightmost-child nm shift n1 new-c1 d)) (+ cnt1 d) (if new-c2 (if (identical? c2 new-c2) n2 (replace-leftmost-child nm shift n2 cnt2 new-c2 d)) (remove-leftmost-child nm shift n2)) (- cnt2 d) transferred-leaves)))) (defn squash-nodes [^NodeManager nm shift n1 cnt1 n2 cnt2] (let [arr1 (.array nm n1) arr2 (.array nm n2) li1 (index-of-nil arr1) li2 (index-of-nil arr2) slots (concat (take li1 arr1) (take li2 arr2))] (if (> (count slots) 32) (pair n1 n2) (let [new-rngs (int-array 33) new-arr (object-array 33) rngs1 (take li1 (if (.regular nm n1) (regular-ranges shift cnt1) (ranges nm n1))) rngs2 (take li2 (if (.regular nm n2) (regular-ranges shift cnt2) (ranges nm n2))) rngs2 (let [r (last rngs1)] (map #(+ % r) rngs2)) rngs (concat rngs1 rngs2)] (aset new-arr 32 new-rngs) (loop [i 0 cs (seq slots)] (when cs (aset new-arr i (first cs)) (recur (inc i) (next cs)))) (loop [i 0 rngs (seq rngs)] (if rngs (do (aset new-rngs i (int (first rngs))) (recur (inc i) (next rngs))) (aset new-rngs 32 i))) (pair (.node nm nil new-arr) nil))))) (def peephole-optimization-config (atom {:debug-fn nil})) (def peephole-optimization-count (atom 0)) ;; TBD: Transducer versions of child-nodes and bounded-grandchildren ;; are included here for when we are willing to rely upon Clojure ;; 1.7.0 as the minimum version supported by the core.rrb-vector ;; library. They are faster. #_(defn child-nodes [node ^NodeManager nm] (into [] (comp (take-while (complement nil?)) (take 32)) (.array nm node))) (defn child-nodes [node ^NodeManager nm] (->> (.array nm node) (take-while (complement nil?)) (take 32))) ;; (take 33) is just a technique to avoid generating more ;; grandchildren than necessary. If there are at least 33, we do not ;; care how many there are. #_(defn bounded-grandchildren [nm children] (into [] (comp (map #(child-nodes % nm)) cat (take 33)) children)) (defn bounded-grandchildren [nm children] (->> children (mapcat #(child-nodes % nm)) (take 33))) ;; TBD: Do functions like last-non-nil-idx and ;; count-vector-elements-beneath already exist elsewhere in this ;; library? It seems like they might. ;; A regular tree node is guaranteed to have only 32-way branching at ;; all nodes, except perhaps along the right spine, where it can be ;; partial. From a regular tree node down, all leaf arrays ;; (containing vector elements directly) are restricted to contain a ;; full 32 vector elements. This code relies on these invariants to ;; quickly calculate the number of vector elements beneath a regular ;; node in O(log N) time. (defn last-non-nil-idx [^objects arr] (loop [i (int (dec (alength arr)))] (if (neg? i) i (if (nil? (aget arr (int i))) (recur (unchecked-dec-int i)) i)))) (defn count-vector-elements-beneath [node shift ^NodeManager nm] (if (.regular nm node) (loop [node node shift shift acc 0] (if (zero? shift) (if (nil? node) acc ;; The +32 is for the regular leaf node reached at shift 0 (+ acc 32)) (let [^objects arr (.array nm node) max-child-idx (int (last-non-nil-idx arr)) num-elems-in-full-child (int (bit-shift-left 1 shift))] (if (< max-child-idx 0) acc (recur (aget ^objects arr max-child-idx) (unchecked-subtract-int shift (int 5)) (unchecked-add-int acc (unchecked-multiply-int max-child-idx num-elems-in-full-child))))))) ;; irregular case (let [rngs (ranges nm node)] (aget rngs (dec (aget rngs 32)))))) (defn peephole-optimize-root [^Vector v] (let [config @peephole-optimization-config] (if (<= (.-shift v) 10) ;; Tree depth cannot be reduced if shift <= 5. ;; TBD: If shift=10, the grandchildren nodes need to be handled ;; by an am array manager for primitive vectors, which I haven't ;; written code for yet below, but so far this peephole ;; optimizer seems to be working sufficiently well without ;; handling that case. v (let [root (.-root v) ^NodeManager nm (.-nm v) children (child-nodes root nm) grandchildren (bounded-grandchildren nm children) num-granchildren-bounded (count grandchildren) many-grandchildren? (> num-granchildren-bounded 32)] (if many-grandchildren? ;; If it is possible to reduce tree depth, it requires going ;; deeper than just to the grandchildren, which is beyond ;; what this peephole optimizer is intended to do. v ;; Create a new root node that points directly at the ;; grandchildren, since there are few enough of them. (let [^objects new-arr (object-array 33) ^ints new-rngs (int-array 33) new-root (.node nm (.edit nm root) new-arr) shift (.-shift v) grandchild-shift (- shift (* 2 5))] (swap! peephole-optimization-count inc) (loop [idx 0 remaining-gc grandchildren elem-sum (int 0)] (if-let [remaining-gc (seq remaining-gc)] (let [grandchild (first remaining-gc) num-elems-this-grandchild (count-vector-elements-beneath grandchild grandchild-shift nm) next-elem-sum (int (+ elem-sum num-elems-this-grandchild))] (aset new-arr idx grandchild) (aset new-rngs idx next-elem-sum) (recur (inc idx) (rest remaining-gc) next-elem-sum)))) (aset new-rngs 32 num-granchildren-bounded) (aset new-arr 32 new-rngs) (let [new-v (Vector. nm (.-am v) (.-cnt v) (- shift 5) new-root (.-tail v) (.-_meta v) 0 0)] (when (:debug-fn config) ((:debug-fn config) v new-v)) new-v))))))) (def max-vector-elements Integer/MAX_VALUE) ;; Larger shift values than 64 definitely break assumptions all over ;; the RRB vector implementation, e.g. (bit-shift-right 255 65) ;; returns the same result as (bit-shift-right 255 1), I believe ;; because the shift amount argument is effectively modulo'd by 64. ;; Larger shift values than 30 are unlikely to make sense, given that ;; the maximum number of vector elements supported is somewhere near ;; Integer/MAX_VALUE=2^31-1. (defn shift-too-large? [^Vector v] (> (.-shift v) 30)) ;; The maximum number of vector elements in a tree, not counting any ;; elements in the tail, with a given shift value is: ;; ;; (bit-shift-left 1 (+ shift 5)) ;; ;; It is perfectly normal to have vectors with a root tree node with ;; only 1 non-nil child, so at a fraction 1/32 of maximum capacity. I ;; do not know the exact minimum fraction that RRB vectors as ;; implemented here should allow, but I suspect it is well over ;; 1/1024. (defn poor-branching? [^Vector v] (let [tail-off (.tailoff v)] (if (zero? tail-off) false (let [shift-amount (unchecked-subtract-int (.-shift v) (int 5)) max-capacity-divided-by-1024 (bit-shift-left 1 shift-amount)] (< tail-off max-capacity-divided-by-1024))))) ;; Note 3: ;; Consider checking the performance of an expression like the one ;; used now against the one below: ;;(into (clojure.lang.LazilyPersistentVector/create v1) v2)) ;; If the LazilyPersistentVector/create version is faster, it would be ;; good to create a version of the create method that returns ;; primitive vectors when given a primitive vector, and an Object ;; vector when given an Object vector. The existing method in Clojure ;; core always returns an Object vector, even when given a primitive ;; vector. ;; TBD: Is there any promise about what metadata catvec returns? ;; Always the same as on the first argument? (def fallback-config (atom {:debug-fn nil})) (def fallback-to-slow-splice-count1 (atom 0)) (def fallback-to-slow-splice-count2 (atom 0)) (defn fallback-to-slow-splice-if-needed [^Vector v1 ^Vector v2 ^Vector splice-result] (let [config @fallback-config] (if (or (shift-too-large? splice-result) (poor-branching? splice-result)) (do (dbg (str "splice-rrbts result had shift " (.-shift splice-result) " and " (.tailoff splice-result) " elements not counting" " the tail. Falling back to slower method of concatenation.")) (if (poor-branching? v1) ;; The v1 we started with was not good, either. (do (swap! fallback-to-slow-splice-count1 inc) (dbg (str "splice-rrbts first arg had shift " (.-shift v1) " and " (.tailoff v1) " elements not counting" " the tail. Building the result from scratch.")) ;: See Note 3 (let [new-splice-result (-> (empty v1) (into v1) (into v2))] (when (:debug-fn config) ((:debug-fn config) splice-result new-splice-result)) new-splice-result)) ;; Assume that v1 is balanced enough that we can use into to ;; add all elements of v2 to it, without problems. ;; TBD: That assumption might be incorrect. Consider ;; checking the result of this, too, and fall back again to ;; the true case above? (let [new-splice-result (into v1 v2)] (swap! fallback-to-slow-splice-count2 inc) (when (:debug-fn config) ((:debug-fn config) splice-result new-splice-result)) new-splice-result))) ;; else the fast result is good splice-result))) (defn splice-rrbts-main [^NodeManager nm ^ArrayManager am ^Vector v1 ^Vector v2] (cond (zero? (count v1)) v2 (> (+ (long (count v1)) (long (count v2))) max-vector-elements) (let [c1 (long (count v1)), c2 (long (count v2))] (throw (IllegalArgumentException. (str "Attempted to concatenate two vectors whose total" " number of elements is " (+ c1 c2) ", which is" " larger than the maximum number of elements " max-vector-elements " supported in a vector ")))) (< (count v2) rrbt-concat-threshold) (into v1 v2) :else (let [s1 (.-shift v1) s2 (.-shift v2) r1 (.-root v1) o? (overflow? nm r1 s1 (+ (count v1) (- 32 (.alength am (.-tail v1))))) r1 (if o? (let [tail (.-tail v1) tail-node (.node nm nil tail) reg? (and (.regular nm r1) (== (.alength am tail) 32)) arr (object-array (if reg? 32 33))] (aset arr 0 r1) (aset arr 1 (new-path nm am s1 tail-node)) (if-not reg? (let [rngs (int-array 33)] (aset rngs 32 2) (aset rngs 0 (- (count v1) (.alength am tail))) (aset rngs 1 (count v1)) (aset arr 32 rngs))) (.node nm nil arr)) (fold-tail nm am r1 s1 (.tailoff v1) (.-tail v1))) s1 (if o? (+ s1 5) s1) r2 (.-root v2) s (max s1 s2) r1 (shift-from-to nm r1 s1 s) r2 (shift-from-to nm r2 s2 s) transferred-leaves (Box. 0) [n1 n2] (zippath nm am s r1 (count v1) r2 (- (count v2) (.alength am (.-tail v2))) transferred-leaves) d (.-val transferred-leaves) ncnt1 (+ (count v1) d) ncnt2 (- (count v2) (.alength am (.-tail v2)) d) [n1 n2] (if (identical? n2 r2) (squash-nodes nm s n1 ncnt1 n2 ncnt2) (object-array (list n1 n2))) ncnt1 (if n2 (int ncnt1) (unchecked-add-int (int ncnt1) (int ncnt2))) ncnt2 (if n2 (int ncnt2) (int 0))] (if n2 (let [arr (object-array 33) new-root (.node nm nil arr)] (aset arr 0 n1) (aset arr 1 n2) (aset arr 32 (doto (int-array 33) (aset 0 ncnt1) (aset 1 (+ ncnt1 ncnt2)) (aset 32 2))) (Vector. nm am (+ (count v1) (count v2)) (+ s 5) new-root (.-tail v2) nil 0 0)) (loop [r n1 s (int s)] (if (and (> s (int 5)) (nil? (aget ^objects (.array nm r) 1))) (recur (aget ^objects (.array nm r) 0) (unchecked-subtract-int s (int 5))) (Vector. nm am (+ (count v1) (count v2)) s r (.-tail v2) nil 0 0))))))) (defn splice-rrbts [^NodeManager nm ^ArrayManager am ^Vector v1 ^Vector v2] (let [r1 (splice-rrbts-main nm am v1 v2) r2 (peephole-optimize-root r1)] (fallback-to-slow-splice-if-needed v1 v2 r2))) (defn array-copy [^ArrayManager am from i to j len] (loop [i (int i) j (int j) len (int len)] (when (pos? len) (.aset am to j (.aget am from i)) (recur (unchecked-inc-int i) (unchecked-inc-int j) (unchecked-dec-int len))))) (deftype Transient [^NodeManager nm ^ArrayManager am ^boolean objects? ^:unsynchronized-mutable ^int cnt ^:unsynchronized-mutable ^int shift ^:unsynchronized-mutable root ^:unsynchronized-mutable tail ^:unsynchronized-mutable ^int tidx] clojure.lang.Counted (count [this] (.ensureEditable transient-helper nm root) cnt) clojure.lang.Indexed (nth [this i] (.ensureEditable transient-helper nm root) (if (and (<= (int 0) i) (< i cnt)) (let [tail-off (.tailoff this)] (if (<= tail-off i) (.aget am tail (unchecked-subtract-int i tail-off)) (loop [i i node root shift shift] (if (zero? shift) (let [arr (.array nm node)] (.aget am arr (bit-and (bit-shift-right i shift) (int 0x1f)))) (if (.regular nm node) (let [arr (.array nm node) idx (bit-and (bit-shift-right i shift) (int 0x1f))] (loop [i i node (aget ^objects arr idx) shift (unchecked-subtract-int shift (int 5))] (let [arr (.array nm node) idx (bit-and (bit-shift-right i shift) (int 0x1f))] (if (zero? shift) (.aget am arr idx) (recur i (aget ^objects arr idx) (unchecked-subtract-int shift (int 5))))))) (let [arr (.array nm node) rngs (ranges nm node) idx (loop [j (bit-and (bit-shift-right i shift) (int 0x1f))] (if (< i (aget rngs j)) j (recur (unchecked-inc-int j)))) i (if (zero? idx) (int i) (unchecked-subtract-int (int i) (aget rngs (unchecked-dec-int idx))))] (recur i (aget ^objects arr idx) (unchecked-subtract-int shift (int 5))))))))) (throw (IndexOutOfBoundsException.)))) (nth [this i not-found] (.ensureEditable transient-helper nm root) (if (and (>= i (int 0)) (< i cnt)) (.nth this i) not-found)) clojure.lang.ILookup (valAt [this k not-found] (.ensureEditable transient-helper nm root) (if (Util/isInteger k) (let [i (int k)] (if (and (>= i (int 0)) (< i cnt)) (.nth this i) not-found)) not-found)) (valAt [this k] (.valAt this k nil)) clojure.lang.IFn (invoke [this k] (.ensureEditable transient-helper nm root) (if (Util/isInteger k) (let [i (int k)] (if (and (>= i (int 0)) (< i cnt)) (.nth this i) (throw (IndexOutOfBoundsException.)))) (throw (IllegalArgumentException. "Key must be integer")))) (applyTo [this args] (.ensureEditable transient-helper nm root) (let [n (RT/boundedLength args 1)] (case n 0 (throw (clojure.lang.ArityException. n (.. this (getClass) (getSimpleName)))) 1 (.invoke this (first args)) 2 (throw (clojure.lang.ArityException. n (.. this (getClass) (getSimpleName))))))) clojure.lang.ITransientCollection (conj [this val] (.ensureEditable transient-helper nm root) (if (< tidx 32) (do (.aset am tail tidx val) (set! cnt (unchecked-inc-int cnt)) (set! tidx (unchecked-inc-int tidx)) this) (let [tail-node (.node nm (.edit nm root) tail) new-tail (.array am 32)] (.aset am new-tail 0 val) (set! tail new-tail) (set! tidx (int 1)) (if (overflow? nm root shift cnt) (if (.regular nm root) (let [new-arr (object-array 32)] (doto new-arr (aset 0 root) (aset 1 (.newPath transient-helper nm am tail (.edit nm root) shift tail-node))) (set! root (.node nm (.edit nm root) new-arr)) (set! shift (unchecked-add-int shift (int 5))) (set! cnt (unchecked-inc-int cnt)) this) (let [new-arr (object-array 33) new-rngs (int-array 33) new-root (.node nm (.edit nm root) new-arr) root-total-range (aget (ranges nm root) 31)] (doto new-arr (aset 0 root) (aset 1 (.newPath transient-helper nm am tail (.edit nm root) shift tail-node)) (aset 32 new-rngs)) (doto new-rngs (aset 0 root-total-range) (aset 1 (unchecked-add-int root-total-range (int 32))) (aset 32 2)) (set! root new-root) (set! shift (unchecked-add-int shift (int 5))) (set! cnt (unchecked-inc-int cnt)) this)) (let [new-root (.pushTail transient-helper nm am shift cnt (.edit nm root) root tail-node)] (set! root new-root) (set! cnt (unchecked-inc-int cnt)) this))))) (persistent [this] (.ensureEditable transient-helper nm root) (.set (.edit nm root) nil) (let [trimmed-tail (.array am tidx)] (array-copy am tail 0 trimmed-tail 0 tidx) (Vector. nm am cnt shift root trimmed-tail nil 0 0))) clojure.lang.ITransientVector (assocN [this i val] (.ensureEditable transient-helper nm root) (cond (and (<= 0 i) (< i cnt)) (let [tail-off (.tailoff this)] (if (<= tail-off i) (.aset am tail (unchecked-subtract-int i tail-off) val) (set! root (.doAssoc transient-helper nm am shift (.edit nm root) root i val))) this) (== i cnt) (.conj this val) :else (throw (IndexOutOfBoundsException.)))) (pop [this] (.ensureEditable transient-helper nm root) (cond (zero? cnt) (throw (IllegalStateException. "Can't pop empty vector")) (== 1 cnt) (do (set! cnt (int 0)) (set! tidx (int 0)) (if objects? (.aset am tail 0 nil)) this) (> tidx 1) (do (set! cnt (unchecked-dec-int cnt)) (set! tidx (unchecked-dec-int tidx)) (if objects? (.aset am tail tidx nil)) this) :else (let [new-tail-base (.arrayFor this (unchecked-subtract-int cnt (int 2))) new-tail (.editableTail transient-helper am new-tail-base) new-tidx (.alength am new-tail-base) new-root (.popTail transient-helper nm am shift cnt (.edit nm root) root)] (cond (nil? new-root) (do (set! cnt (unchecked-dec-int cnt)) (set! root (.ensureEditable transient-helper nm am (.edit nm root) (.empty nm) 5)) (set! tail new-tail) (set! tidx new-tidx) this) (and (> shift 5) (nil? (aget ^objects (.array nm new-root) 1))) (do (set! cnt (unchecked-dec-int cnt)) (set! shift (unchecked-subtract-int shift (int 5))) (set! root (.ensureEditable transient-helper nm am (.edit nm root) (aget ^objects (.array nm new-root) 0) (unchecked-subtract-int shift (int 5)))) (set! tail new-tail) (set! tidx new-tidx) this) :else (do (set! cnt (unchecked-dec-int cnt)) (set! root new-root) (set! tail new-tail) (set! tidx new-tidx) this))))) clojure.lang.ITransientAssociative (assoc [this k v] (.assocN this k v)) ;; temporary kludge IVecImpl (tailoff [_] (unchecked-subtract-int cnt tidx)) (arrayFor [this i] (if (and (<= (int 0) i) (< i cnt)) (if (>= i (.tailoff this)) tail (loop [i (int i) node root shift shift] (if (zero? shift) (.array nm node) (if (.regular nm node) (loop [node (aget ^objects (.array nm node) (bit-and (bit-shift-right i shift) (int 0x1f))) shift (unchecked-subtract-int shift (int 5))] (if (zero? shift) (.array nm node) (recur (aget ^objects (.array nm node) (bit-and (bit-shift-right i shift) (int 0x1f))) (unchecked-subtract-int shift (int 5))))) (let [rngs (ranges nm node) j (loop [j (bit-and (bit-shift-right i shift) (int 0x1f))] (if (< i (aget rngs j)) j (recur (unchecked-inc-int j)))) i (if (pos? j) (unchecked-subtract-int i (aget rngs (unchecked-dec-int j))) i)] (recur (int i) (aget ^objects (.array nm node) j) (unchecked-subtract-int shift (int 5)))))))) (throw (IndexOutOfBoundsException.)))) PTransientDebugAccess (debugGetRoot [_] root) (debugGetShift [_] shift) (debugGetTail [_] tail) (debugGetCnt [_] cnt)) ================================================ FILE: src/main/clojure/clojure/core/rrb_vector/transients.clj ================================================ ; Copyright (c) Rich Hickey and contributors. All rights reserved. ; The use and distribution terms for this software are covered by the ; Eclipse Public License 1.0 (http://opensource.org/licenses/eclipse-1.0.php) ; which can be found in the file epl-v10.html at the root of this distribution. ; By using this software in any fashion, you are agreeing to be bound by ; the terms of this license. ; You must not remove this notice, or any other, from this software. (ns clojure.core.rrb-vector.transients (:require [clojure.core.rrb-vector.parameters :as p] [clojure.core.rrb-vector.nodes :refer [ranges last-range overflow?]]) (:import (clojure.core.rrb_vector.nodes NodeManager) (clojure.core ArrayManager) (java.util.concurrent.atomic AtomicReference))) (set! *warn-on-reflection* true) (set! *unchecked-math* true) ;; :warn-on-boxed (definterface ITransientHelper (editableRoot [^clojure.core.rrb_vector.nodes.NodeManager nm ^clojure.core.ArrayManager am root]) (editableTail [^clojure.core.ArrayManager am tail]) (ensureEditable [^clojure.core.rrb_vector.nodes.NodeManager nm root]) (ensureEditable [^clojure.core.rrb_vector.nodes.NodeManager nm ^clojure.core.ArrayManager am ^java.util.concurrent.atomic.AtomicReference root-edit current-node ^int shift]) (pushTail [^clojure.core.rrb_vector.nodes.NodeManager nm ^clojure.core.ArrayManager am ^int shift ^int cnt ^java.util.concurrent.atomic.AtomicReference root-edit current-node tail-node]) (popTail [^clojure.core.rrb_vector.nodes.NodeManager nm ^clojure.core.ArrayManager am ^int shift ^int cnt ^java.util.concurrent.atomic.AtomicReference root-edit current-node]) (doAssoc [^clojure.core.rrb_vector.nodes.NodeManager nm ^clojure.core.ArrayManager am ^int shift ^java.util.concurrent.atomic.AtomicReference root-edit current-node ^int i val]) (newPath [^clojure.core.rrb_vector.nodes.NodeManager nm ^clojure.core.ArrayManager am tail ^java.util.concurrent.atomic.AtomicReference edit ^int shift current-node])) (def ^ITransientHelper transient-helper (reify ITransientHelper (editableRoot [this nm am root] (let [new-arr (clojure.core/aclone ^objects (.array nm root))] (if (== 33 (alength ^objects new-arr)) (aset new-arr 32 (aclone (ints (aget ^objects new-arr 32))))) (.node nm (AtomicReference. (Thread/currentThread)) new-arr))) (editableTail [this am tail] (let [ret (.array am 32)] (System/arraycopy tail 0 ret 0 (.alength am tail)) ret)) (ensureEditable [this nm root] (let [owner (->> root (.edit nm) (.get))] (cond (identical? owner (Thread/currentThread)) nil (not (nil? owner)) (throw (IllegalAccessError. "Transient used by non-owner thread")) :else (throw (IllegalAccessError. "Transient used after persistent! call"))))) (ensureEditable [this nm am root-edit current-node shift] (if (identical? root-edit (.edit nm current-node)) current-node (if (zero? shift) (let [new-arr (.aclone am (.array nm current-node))] (.node nm root-edit new-arr)) (let [new-arr (aclone ^objects (.array nm current-node))] (if (== 33 (alength ^objects new-arr)) (aset new-arr 32 (aclone (ints (aget ^objects new-arr 32))))) (.node nm root-edit new-arr))))) ;; Note 1: This condition check and exception are a little bit ;; closer to the source of the cause for what was issue CRRBV-20, ;; added in case there is still some remaining way to cause this ;; condition to occur. ;; Note 2: In the worst case, when the tree is nearly full, ;; calling overflow? here takes run time O(tree_depth^2) here. ;; That could be made O(tree_depth). One way would be to call ;; pushTail in hopes that it succeeds, but return some distinctive ;; value indicating a failure on the full condition, and create ;; the node via a .newPath call at most recent recursive pushTail ;; call that has an empty slot available. (pushTail [this nm am shift cnt root-edit current-node tail-node] (let [ret (.ensureEditable this nm am root-edit current-node shift)] (if (.regular nm ret) (do (loop [n ret shift shift] (let [arr (.array nm n) subidx (bit-and (bit-shift-right (dec cnt) shift) 0x1f)] (if (== shift 5) (aset ^objects arr subidx tail-node) (let [child (aget ^objects arr subidx)] (if (nil? child) (aset ^objects arr subidx (.newPath this nm am (.array nm tail-node) root-edit (unchecked-subtract-int shift 5) tail-node)) (let [editable-child (.ensureEditable this nm am root-edit child (unchecked-subtract-int shift 5))] (aset ^objects arr subidx editable-child) (recur editable-child (- shift (int 5))))))))) ret) (let [arr (.array nm ret) rngs (ranges nm ret) li (unchecked-dec-int (aget rngs 32)) cret (if (== shift 5) nil (let [child (.ensureEditable this nm am root-edit (aget ^objects arr li) (unchecked-subtract-int shift 5)) ccnt (unchecked-add-int (int (if (pos? li) (unchecked-subtract-int (aget rngs li) (aget rngs (unchecked-dec-int li))) (aget rngs 0))) ;; add 32 elems to account for the ;; new full tail we plan to add to ;; the subtree. (int 32))] ;; See Note 2 (if-not (overflow? nm child (unchecked-subtract-int shift 5) ccnt) (.pushTail this nm am (unchecked-subtract-int shift 5) ccnt root-edit child tail-node))))] (if cret (do (aset ^objects arr li cret) (aset rngs li (unchecked-add-int (aget rngs li) 32)) ret) (do (when (>= li 31) ;; See Note 1 (let [msg (str "Assigning index " (inc li) " of vector" " object array to become a node, when that" " index should only be used for storing" " range arrays.") data {:shift shift, :cnd cnt, :current-node current-node, :tail-node tail-node, :rngs rngs, :li li, :cret cret}] (throw (ex-info msg data)))) (aset ^objects arr (inc li) (.newPath this nm am (.array nm tail-node) root-edit (unchecked-subtract-int shift 5) tail-node)) (aset rngs (unchecked-inc-int li) (unchecked-add-int (aget rngs li) 32)) (aset rngs 32 (unchecked-inc-int (aget rngs 32))) ret)))))) (popTail [this nm am shift cnt root-edit current-node] (let [ret (.ensureEditable this nm am root-edit current-node shift)] (if (.regular nm ret) (let [subidx (bit-and (bit-shift-right (unchecked-subtract-int cnt (int 2)) (int shift)) (int 0x1f))] (cond (> shift 5) (let [child (.popTail this nm am (unchecked-subtract-int shift 5) cnt ;; TBD: Should this be smaller than cnt? root-edit (aget ^objects (.array nm ret) subidx))] (if (and (nil? child) (zero? subidx)) nil (let [arr (.array nm ret)] (aset ^objects arr subidx child) ret))) (zero? subidx) nil :else (let [arr (.array nm ret)] (aset ^objects arr subidx nil) ret))) (let [rngs (ranges nm ret) subidx (unchecked-dec-int (aget rngs 32))] (cond (> shift 5) (let [child (aget ^objects (.array nm ret) subidx) child-cnt (if (zero? subidx) (aget rngs 0) (unchecked-subtract-int (aget rngs subidx) (aget rngs (unchecked-dec-int subidx)))) new-child (.popTail this nm am (unchecked-subtract-int shift 5) child-cnt root-edit child)] (cond (and (nil? new-child) (zero? subidx)) nil (.regular nm child) (let [arr (.array nm ret)] (aset rngs subidx (unchecked-subtract-int (aget rngs subidx) 32)) (aset ^objects arr subidx new-child) (if (nil? new-child) (aset rngs 32 (unchecked-dec-int (aget rngs 32)))) ret) :else (let [rng (last-range nm child) diff (unchecked-subtract-int rng (if new-child (last-range nm new-child) 0)) arr (.array nm ret)] (aset rngs subidx (unchecked-subtract-int (aget rngs subidx) diff)) (aset ^objects arr subidx new-child) (if (nil? new-child) (aset rngs 32 (unchecked-dec-int (aget rngs 32)))) ret))) (zero? subidx) nil :else (let [arr (.array nm ret) child (aget ^objects arr subidx)] (aset ^objects arr subidx nil) (aset rngs subidx 0) (aset rngs 32 (unchecked-dec-int (aget rngs 32))) ret)))))) (doAssoc [this nm am shift root-edit current-node i val] (let [ret (.ensureEditable this nm am root-edit current-node shift)] (if (.regular nm ret) (loop [shift shift node ret] (if (zero? shift) (let [arr (.array nm node)] (.aset am arr (bit-and i 0x1f) val)) (let [arr (.array nm node) subidx (bit-and (bit-shift-right i shift) 0x1f) next-shift (int (unchecked-subtract-int shift 5)) child (.ensureEditable this nm am root-edit (aget ^objects arr subidx) next-shift)] (aset ^objects arr subidx child) (recur next-shift child)))) (let [arr (.array nm ret) rngs (ranges nm ret) subidx (bit-and (bit-shift-right i shift) 0x1f) subidx (loop [subidx subidx] (if (< i (aget rngs subidx)) subidx (recur (unchecked-inc-int subidx)))) i (if (zero? subidx) i (unchecked-subtract-int i (aget rngs (unchecked-dec-int subidx))))] (aset ^objects arr subidx (.doAssoc this nm am (unchecked-subtract-int shift 5) root-edit (aget ^objects arr subidx) i val)))) ret)) (newPath [this nm am tail edit shift current-node] (if (== (.alength am tail) 32) (loop [s 0 n current-node] (if (== s shift) n (let [arr (object-array 32) ret (.node nm edit arr)] (aset ^objects arr 0 n) (recur (unchecked-add s (int 5)) ret)))) (loop [s 0 n current-node] (if (== s shift) n (let [arr (object-array 33) rngs (int-array 33) ret (.node nm edit arr)] (aset ^objects arr 0 n) (aset ^objects arr 32 rngs) (aset rngs 32 1) (aset rngs 0 (.alength am tail)) (recur (unchecked-add s (int 5)) ret)))))))) ================================================ FILE: src/main/clojure/clojure/core/rrb_vector.clj ================================================ ; Copyright (c) Rich Hickey and contributors. All rights reserved. ; The use and distribution terms for this software are covered by the ; Eclipse Public License 1.0 (http://opensource.org/licenses/eclipse-1.0.php) ; which can be found in the file epl-v10.html at the root of this distribution. ; By using this software in any fashion, you are agreeing to be bound by ; the terms of this license. ; You must not remove this notice, or any other, from this software. (ns clojure.core.rrb-vector "An implementation of the confluently persistent vector data structure introduced in Bagwell, Rompf, \"RRB-Trees: Efficient Immutable Vectors\", EPFL-REPORT-169879, September, 2011. RRB-Trees build upon Clojure's PersistentVectors, adding logarithmic time concatenation and slicing. The main API entry points are clojure.core.rrb-vector/catvec, performing vector concatenation, and clojure.core.rrb-vector/subvec, which produces a new vector containing the appropriate subrange of the input vector (in contrast to clojure.core/subvec, which returns a view on the input vector). core.rrb-vector's vectors can store objects or unboxed primitives. The implementation allows for seamless interoperability with clojure.lang.PersistentVector, clojure.core.Vec (more commonly known as gvec) and clojure.lang.APersistentVector$SubVector instances: clojure.core.rrb-vector/catvec and clojure.core.rrb-vector/subvec convert their inputs to clojure.core.rrb-vector.rrbt.Vector instances whenever necessary (this is a very fast constant time operation for PersistentVector and gvec; for SubVector it is O(log n), where n is the size of the underlying vector). clojure.core.rrb-vector also exports its own versions of vector and vector-of and vec which always produce clojure.core.rrb-vector.rrbt.Vector instances. Note that vector-of accepts :object as one of the possible type arguments, in addition to keywords naming primitive types." {:author "Michał Marczyk"} (:refer-clojure :exclude [vector vector-of vec subvec]) (:require [clojure.core.rrb-vector.parameters :as p] [clojure.core.rrb-vector.protocols :refer [slicev splicev]] [clojure.core.rrb-vector.nodes :refer [ams object-am object-nm primitive-nm empty-pv-node empty-gvec-node]] [clojure.core.rrb-vector.rrbt :refer [as-rrbt]] clojure.core.rrb-vector.interop) (:import (clojure.core.rrb_vector.rrbt Vector) (clojure.core.rrb_vector.nodes NodeManager) (clojure.core ArrayManager))) (set! *warn-on-reflection* true) (set! *unchecked-math* true) ;; :warn-on-boxed (defn catvec "Concatenates the given vectors in logarithmic time." ([] []) ([v1] v1) ([v1 v2] (splicev v1 v2)) ([v1 v2 v3] (splicev (splicev v1 v2) v3)) ([v1 v2 v3 v4] (splicev (splicev v1 v2) (splicev v3 v4))) ([v1 v2 v3 v4 & vn] (splicev (splicev (splicev v1 v2) (splicev v3 v4)) (apply catvec vn)))) (defn subvec "Returns a new vector containing the elements of the given vector v lying between the start (inclusive) and end (exclusive) indices in logarithmic time. end defaults to end of vector. The resulting vector shares structure with the original, but does not hold on to any elements of the original vector lying outside the given index range." ([v start] (slicev v start (count v))) ([v start end] (slicev v start end))) (defmacro ^:private gen-vector-method [& params] (let [arr (with-meta (gensym "arr__") {:tag 'objects})] `(let [~arr (object-array ~(count params))] ~@(map-indexed (fn [i param] `(aset ~arr ~i ~param)) params) (Vector. ^NodeManager object-nm ^ArrayManager object-am ~(count params) 5 empty-pv-node ~arr nil 0 0)))) (defn vector "Creates a new vector containing the args." ([] (gen-vector-method)) ([x1] (gen-vector-method x1)) ([x1 x2] (gen-vector-method x1 x2)) ([x1 x2 x3] (gen-vector-method x1 x2 x3)) ([x1 x2 x3 x4] (gen-vector-method x1 x2 x3 x4)) ([x1 x2 x3 x4 & xn] (loop [v (transient (vector x1 x2 x3 x4)) xn xn] (if xn (recur (.conj ^clojure.lang.ITransientCollection v (first xn)) (next xn)) (persistent! v))))) (defn vec "Returns a vector containing the contents of coll. If coll is a vector, returns an RRB vector using the internal tree of coll." [coll] (if (vector? coll) (as-rrbt coll) (apply vector coll))) (defmacro ^:private gen-vector-of-method [t & params] (let [am (gensym "am__") nm (gensym "nm__") arr (gensym "arr__")] `(let [~am ^ArrayManager (ams ~t) ~nm ^NodeManager (if (identical? ~t :object) object-nm primitive-nm) ~arr (.array ~am ~(count params))] ~@(map-indexed (fn [i param] `(.aset ~am ~arr ~i ~param)) params) (Vector. ~nm ~am ~(count params) 5 (if (identical? ~t :object) empty-pv-node empty-gvec-node) ~arr nil 0 0)))) (defn vector-of "Creates a new vector capable of storing homogenous items of type t, which should be one of :object, :int, :long, :float, :double, :byte, :short, :char, :boolean. Primitives are stored unboxed. Optionally takes one or more elements to populate the vector." ([t] (gen-vector-of-method t)) ([t x1] (gen-vector-of-method t x1)) ([t x1 x2] (gen-vector-of-method t x1 x2)) ([t x1 x2 x3] (gen-vector-of-method t x1 x2 x3)) ([t x1 x2 x3 x4] (gen-vector-of-method t x1 x2 x3 x4)) ([t x1 x2 x3 x4 & xn] (loop [v (transient (vector-of t x1 x2 x3 x4)) xn xn] (if xn (recur (.conj ^clojure.lang.ITransientCollection v (first xn)) (next xn)) (persistent! v))))) ================================================ FILE: src/parameterized/clojure/clojure/core/rrb_vector/debug.clj ================================================ (ns clojure.core.rrb-vector.debug (:require [clojure.core.rrb-vector.parameters :as p] [clojure.core.rrb-vector.rrbt :refer [as-rrbt]] [clojure.core.rrb-vector :as fv] [clojure.core.rrb-vector.rrbt :as rrbt] ;; This page: ;; https://clojure.org/guides/reader_conditionals refers ;; to code that can go into common cljc files as platform ;; independent, and the code in the clj or cljs files as ;; platform dependent, so I will use that terminology ;; here, too. [clojure.core.rrb-vector.debug-platform-dependent :as pd])) ;; The intent is to keep this file as close to ;; src/main/cljs/clojure/core/rrb_vector/debug.cljs as possible, so ;; that when we start requiring Clojure 1.7.0 and later for this ;; library, this file and that one can be replaced with a common file ;; with the suffix .cljc ;; Functions expected to be defined in the appropriate ;; clojure.core.rrb-vector.debug-platform-dependent namespace: ;; pd/internal-node? ;; pd/persistent-vector? ;; pd/transient-vector? ;; pd/is-vector? ;; pd/dbg-tailoff (formerly debug-tailoff) ;; pd/dbg-tidx (formerly debug-tailoff for clj, debug-tidx for cljs) ;; pd/format ;; pd/printf ;; pd/unwrap-subvec-accessors-for ;; pd/abbrev-for-type-of [vec-or-node] (formerly abbrev-type-name, but move type/class call inside) ;; pd/same-coll? (written already for clj, TBD for cljs) ;; Functions returned from unwrap-subvec-accessors-for that have ;; platform-dependent definitions, but the same general 'kind' ;; arguments and return values, where 'kind' could be: any vector, ;; persistent or transient, or a vector tree node object: ;; get-root - All get-* fns formerly called extract-* in the Java ;; platform dependent version of the debug namespace. ;; get-shift ;; get-tail ;; get-cnt ;; get-array [node] - clj (.array nm node) cljs (.-arr node) ;; get-ranges [node] - clj (ranges nm node) cljs (node-ranges node) ;; regular? [node] - clj (.regular nm node) cljs (regular? node) ;; tail-len [tail] - clj (.alength am tail) cljs (alength tail) ;; NO: nm am - cljs doesn't need them, and clj only uses them for the ;; last few functions above. (defn children-summary [node shift get-array get-ranges regular? opts] (let [children (get-array node) reg? (regular? node) rngs (if-not reg? (get-ranges node)) array-len (count children) children-seq (if reg? children (butlast children)) non-nils (remove nil? children-seq) regular-children (filter regular? non-nils) num-non-nils (count non-nils) num-regular-children (count regular-children) num-irregular-children (- num-non-nils num-regular-children) num-nils (- (count children-seq) num-non-nils) exp-array-len (if reg? p/max-branches p/non-regular-array-len) bad-array-len? (not= array-len exp-array-len)] ;; 'r' for regular, 'i' for irregular ;; For either type of node, its first p/max-branches array elements are broken ;; down into: ;; # regular children ;; # irregular children ;; # of nil 'children' not shown, since it will always be p/max-branches minus ;; # the total of # regular plus irregular children, unless the ;; # array is the wrong size, and in that case a BAD-ARRAY-LEN ;; # message will be included in the string. (pd/format "%s%d+%d%s" (if reg? "r" "i") num-regular-children num-irregular-children (if bad-array-len? (pd/format " BAD-ARRAY-LEN %d != %d" array-len exp-array-len) "")))) (defn filter-indexes "Return a sequence of all indexes of elements e of coll for which (pred e) returns logical true. 0 is the index of the first element." [pred coll] (filter (complement nil?) (map-indexed (fn [idx e] (if (pred e) idx)) coll))) (defn dbg-vec ([v] (dbg-vec v {:max-depth nil ;; integer to limit depth, nil for unlimited ;; force showing tree "fringes" beyond max-depth :always-show-fringes false ;; show vector elements. false for only count :show-elements true ;; show summary of number of children of each node, as ;; returned by function children-summary :show-children-summary false ;; default false means show ranges arrays with their raw ;; unprocessed contents. Use true to show only the ;; first n elements, where n=(aget (get-ranges node) ;; p/max-branches), and to show the 'deltas' between consecutive ;; pairs, e.g. if the original is (32 64 96 0 ... 0 3), ;; then instead show (32 32 32), which, if the data ;; structure is correct, is the number of vector ;; elements reachable through each of the node's 3 ;; children. :show-ranges-as-deltas false})) ([v opts] (let [{:keys [v subvector? subvec-start subvec-end get-root get-shift get-tail get-cnt get-array get-ranges regular? tail-len]} (pd/unwrap-subvec-accessors-for v) root (get-root v) shift (get-shift v) tail (get-tail v) cnt (get-cnt v)] (when subvector? (pd/printf "SubVector from start %d to end %d of vector:\n" subvec-start subvec-end)) (letfn [(go [indent shift i node on-left-fringe? on-right-fringe?] (when node (dotimes [_ indent] (print " ")) (pd/printf "%02d:%02d %s" shift i (pd/abbrev-for-type-of node)) (if (zero? shift) ;; this node has only vector elements as its children (if (:show-elements opts) (print ":" (vec (get-array node))) (print ":" (count (get-array node)) "vector elements elided")) ;; else this node has only other nodes as its children (do (when (:show-children-summary opts) (print " ") (print (children-summary node shift get-array get-ranges regular? opts))) (if (not (regular? node)) (if (:show-ranges-as-deltas opts) (let [rngs (get-ranges node) r (aget rngs p/max-branches) tmp (map - (take r rngs) (take r (cons 0 rngs)))] (print ":" (seq tmp))) (print ":" (seq (get-ranges node))))))) (println) (let [no-children? (zero? shift) visit-all-children? (and (not no-children?) (or (nil? (:max-depth opts)) (< (inc indent) (:max-depth opts)))) visit-some-children? (or visit-all-children? (and (not no-children?) (:always-show-fringes opts) (or on-left-fringe? on-right-fringe?)))] (if visit-some-children? (dorun (let [arr (get-array node) a (if (regular? node) arr (butlast arr)) non-nil-idxs (filter-indexes (complement nil?) a) first-non-nil-idx (first non-nil-idxs) last-non-nil-idx (last non-nil-idxs)] (map-indexed (fn [i node] (let [child-on-left-fringe? (and on-left-fringe? (= i first-non-nil-idx)) child-on-right-fringe? (and on-right-fringe? (= i last-non-nil-idx)) visit-this-child? (or visit-all-children? (and (:always-show-fringes opts) (or child-on-left-fringe? child-on-right-fringe?)))] (if visit-this-child? (go (inc indent) (- shift p/shift-increment) i node child-on-left-fringe? child-on-right-fringe?)))) a)))))))] (pd/printf "%s (%d elements):\n" (pd/abbrev-for-type-of v) (count v)) (go 0 shift 0 root true true) (println (if (pd/transient-vector? v) (pd/format "tail (tidx %d):" (pd/dbg-tidx v)) "tail:") (vec tail)))))) (defn first-diff "Compare two sequences to see if they have = elements in the same order, and both sequences have the same number of elements. If all of those conditions are true, and no exceptions occur while calling seq, first, and next on the seqs of xs and ys, then return -1. If two elements at the same index in each sequence are found not = to each other, or the sequences differ in their number of elements, return the index, 0 or larger, at which the first difference occurs. If an exception occurs while calling seq, first, or next, throw an exception that contains the index at which this exception occurred." [xs ys] (loop [i 0 xs (seq xs) ys (seq ys)] (if (try (and xs ys (= (first xs) (first ys))) (catch Exception e (.printStackTrace e) i)) (let [xs (try (next xs) (catch Exception e (prn :xs i) (throw e))) ys (try (next ys) (catch Exception e (prn :ys i) (throw e)))] (recur (inc i) xs ys)) (if (or xs ys) i -1)))) ;; When using non-default parameters for the tree data structure, ;; e.g. shift-increment not 5, then in test code with calls to ;; checking-* functions, they will be expecting those same non-default ;; parameter values, and will give errors if they are ever given a ;; vector returned by clojure.core/vec, because without changes to ;; Clojure itself, they always have shift-increment 5 and max-branches ;; 32. ;; ;; If we use (fv/vec coll) consistently in the test code, that in many ;; cases returns a core.rrb-vector data structure, but if given a ;; Clojure vector, it still returns that Clojure vector unmodified, ;; which has the same issues for checking-* functions. By ;; calling (fv/vec (seq coll)) when not using default parameters, we ;; force the return value of cvec to always be a core.rrb-vector data ;; structure. ;; ;; The name 'cvec' is intended to mean "construct a vector", and only ;; intended for use in test code that constructs vectors used as ;; parameters to other functions operating on vectors. (defn cvec [coll] (if (= p/shift-increment 5) (clojure.core/vec coll) (fv/vec (seq coll)))) (defn slow-into [to from] (reduce conj to from)) (defn all-vector-tree-nodes [v] (let [{:keys [v get-root get-shift get-array regular?]} (pd/unwrap-subvec-accessors-for v) root (get-root v) shift (get-shift v)] (letfn [(go [depth shift node] (if node (if (not= shift 0) (cons {:depth depth :shift shift :kind :internal :node node} (apply concat (map (partial go (inc depth) (- shift p/shift-increment)) (let [arr (get-array node)] (if (regular? node) arr (butlast arr)))))) (cons {:depth depth :shift shift :kind :internal :node node} (map (fn [x] {:depth (inc depth) :kind :leaf :value x}) (get-array node))))))] (cons {:depth 0 :kind :base :shift shift :value v} (go 1 shift root))))) ;; All nodes that should be internal nodes are one of the internal ;; node types satisfying internal-node? All nodes that are less ;; than "leaf depth" must be internal nodes, and none of the ones ;; at "leaf depth" should be. Probably the most general restriction ;; checking for leaf values should be simply that they are any type ;; that is _not_ an internal node type. They could be objects that ;; return true for is-vector? for example, if a vector is an element ;; of another vector. (defn leaves-with-internal-node-type [node-infos] (filter (fn [node-info] (and (= :leaf (:kind node-info)) (pd/internal-node? (:node node-info)))) node-infos)) (defn non-leaves-not-internal-node-type [node-infos] (filter (fn [node-info] (and (= :internal (:kind node-info)) (not (pd/internal-node? (:node node-info))))) node-infos)) ;; The definition of nth in deftype Vector implies that every ;; descendant of a 'regular' node must also be regular. That would be ;; a straightforward sanity check to make, to return an error if a ;; non-regular node is found with a regular ancestor in the tree. (defn basic-node-errors [v] (let [{:keys [v get-shift]} (pd/unwrap-subvec-accessors-for v) shift (get-shift v) nodes (all-vector-tree-nodes v) by-kind (group-by :kind nodes) leaf-depths (set (map :depth (:leaf by-kind))) expected-leaf-depth (+ (quot shift p/shift-increment) 2) max-internal-node-depth (->> (:internal by-kind) (map :depth) (apply max)) ;; Be a little loose in checking here. If we want to narrow ;; it down to one expected answer, we would need to look at ;; the tail to see how many elements it has, then use the ;; different between (count v) and that to determine how many ;; nodes are in the rest of the tree, whether it is 0 or ;; non-0. expected-internal-max-depths (cond (= (count v) 0) #{(- expected-leaf-depth 2)} (> (count v) p/non-regular-array-len) #{(dec expected-leaf-depth)} :else #{(dec expected-leaf-depth) (- expected-leaf-depth 2)})] (cond (not= (mod shift p/shift-increment) 0) {:error true :description (str "shift value in root must be a multiple of p/shift-increment. Found " shift) :data shift} ;; It is OK for this set size to be 0 if no leaves, but if there ;; are leaves, they should all be at the same depth. (> (count leaf-depths) 1) {:error true :description (str "There are leaf nodes at multiple different depths: " leaf-depths) :data leaf-depths} (and (= (count leaf-depths) 1) (not= (first leaf-depths) expected-leaf-depth)) {:error true :description (str "Expecting all leaves to be at depth " expected-leaf-depth " because root has shift=" shift " but found leaves at depth " (first leaf-depths)) :data leaf-depths} (not (contains? expected-internal-max-depths max-internal-node-depth)) {:error true :description (str "Expecting there to be some internal nodes at one of" " these depths: " expected-internal-max-depths " because count=" (count v) " and root has shift=" shift " but max depth among all internal nodes found was " max-internal-node-depth)} (seq (leaves-with-internal-node-type nodes)) {:error true :description "A leaf (at max depth) has one of the internal node types, returning true for internal-node?" :data (first (leaves-with-internal-node-type nodes))} (seq (non-leaves-not-internal-node-type nodes)) {:error true :description "A non-leaf node has a type that returns false for internal-node?" :data (first (non-leaves-not-internal-node-type nodes))} :else {:error false}))) ;; I believe that objects-in-slot-32-of-obj-arrays and ;; ranges-not-int-array are only called directly from one test ;; namespace right now. Consider making a combined invariant checking ;; function in this debug namespace that can be used from any test ;; namespace (or other debug-time code) that a developer wants to. (defn objects-in-slot-32-of-obj-arrays "Function to look for errors of the form where a node's node.array object, which is often an array of p/max-branches or p/non-regular-array-len java.lang.Object's, has an element at index p/max-branches that is not nil, and refers to an object that is of any type _except_ an array of ints. There appears to be some situation in which this can occur, but it seems to almost certainly be a bug if that happens, and we should be able to detect it whenever it occurs." [v] (let [{:keys [v get-array]} (pd/unwrap-subvec-accessors-for v) node-maps (all-vector-tree-nodes v) internal (filter #(= :internal (:kind %)) node-maps)] (keep (fn [node-info] ;; TBD: Is there a way to do ^objects type hint for clj, ;; but none for cljs? Is it harmful for cljs to have such ;; a type hint? ;;(let [^objects arr (get-array (:node node-info)) (let [arr (get-array (:node node-info)) n (count arr)] (if (== n p/non-regular-array-len) (aget arr p/max-branches)))) internal))) ;; TBD: Should this function be defined in platform-specific file? ;;(defn ranges-not-int-array [x] ;; (seq (remove int-array? (objects-in-slot-32-of-obj-arrays x)))) ;; edit-nodes-errors is completely defined in platform-specific source ;; files. It is simply quite different between clj/cljs. (defn edit-nodes-errors [v] (pd/edit-nodes-errors v all-vector-tree-nodes)) (defn regular-node-errors [root-node? root-node-cnt children] ;; For regular nodes, there should be zero or more 'full' children, ;; followed optionally by one 'partial' child, followed by nils. (let [[full-children others] (split-with :full? children) [partial-children others] (split-with #(and (not (:full? %)) (not= :nil (:kind %))) others) [nil-children others] (split-with #(= :nil (:kind %)) others) num-full (count full-children) num-partial (count partial-children) num-non-nil (+ num-full num-partial)] (cond (not= 0 (count others)) {:error true, :kind :internal, :description (str "Found internal regular node with " num-full " full, " num-partial " partial, " (count nil-children) " nil, " (count others) " 'other' children." " - expected 0 children after nils.")} (> num-partial 1) {:error true, :kind :internal, :description (str "Found internal regular node with " num-full " full, " num-partial " partial, " (count nil-children) " nil children" " - expected 0 or 1 partial.")} (not (or (and root-node? (<= root-node-cnt p/max-branches) ;; all elements in tail (= 0 num-non-nil)) (<= 1 num-non-nil p/max-branches))) {:error true, :kind :internal :description (str "Found internal regular node with # full + # partial=" num-non-nil " children outside of range [1, " p/max-branches "]." " root-node?=" root-node? " root-node-cnt=" root-node-cnt) :data children} :else {:error false, :kind :internal, :full? (= p/max-branches (count full-children)) :count (reduce + (map #(or (:count %) 0) children))}))) (defn non-regular-node-errors [node get-ranges children] (let [rng (get-ranges node) [non-nil-children others] (split-with #(not= :nil (:kind %)) children) [nil-children others] (split-with #(= :nil (:kind %)) others) num-non-nil (count non-nil-children) num-nil (count nil-children) expected-ranges (reductions + (map :count non-nil-children))] (cond (not= 0 (count others)) {:error true, :kind :internal, :description (str "Found internal non-regular node with " num-non-nil " non-nil, " num-nil " nil, " (count others) " 'other' children." " - expected 0 children after nils.")} (not= num-non-nil (aget rng p/max-branches)) {:error true, :kind :internal, :description (str "Found internal non-regular node with " num-non-nil " non-nil, " num-nil " nil children, and" " last elem of ranges=" (aget rng p/max-branches) " - expected it to match # non-nil children.")} (not= expected-ranges (take (count expected-ranges) (seq rng))) {:error true, :kind :internal, :description (str "Found internal non-regular node with " num-non-nil " non-nil, " num-nil " nil children, and" " # children prefix sums: " (seq expected-ranges) " - expected that to match stored ranges: " (seq rng))} ;; I believe that there must always be at least one ;; non-nil-child. By checking for this condition, we will ;; definitely find out if it is ever violated. ;; TBD: What if we have a tree with ranges, and then remove all ;; elements? Does the resulting tree triger this error? (not (<= 1 (aget rng p/max-branches) p/max-branches)) {:error true, :kind :internal :description (str "Found internal non-regular node with (aget rng p/max-branches)" "=" (aget rng p/max-branches) " outside of range [1, p/max-branches].")} :else {:error false, :kind :internal, :full? false, :count (last expected-ranges)}))) (defn max-capacity-divided-by-max-branches-squared [root-shift] (let [shift-amount (max 0 (- root-shift p/shift-increment))] (bit-shift-left 1 shift-amount))) (defn fraction-full [v] (let [{:keys [v get-shift]} (pd/unwrap-subvec-accessors-for v) root-shift (get-shift v) tail-off (pd/dbg-tailoff v) max-tree-cap (bit-shift-left 1 (+ root-shift p/shift-increment))] (/ (* 1.0 tail-off) max-tree-cap))) (defn ranges-errors [v] (let [{:keys [v get-root get-shift get-tail get-cnt get-array get-ranges regular? tail-len]} (pd/unwrap-subvec-accessors-for v) root (get-root v) root-node-cnt (count v) root-shift (get-shift v) tail-off (pd/dbg-tailoff v) tail (get-tail v)] (letfn [ (go [shift node] (cond (nil? node) {:error false :kind :nil} (zero? shift) (let [n (count (get-array node))] (merge {:error (zero? n), :kind :leaves, :full? (= n p/max-branches), :count n} (if (zero? n) {:description (str "Leaf array has 0 elements." " Expected > 0.")}))) :else ;; non-0 shift (let [children (map (partial go (- shift p/shift-increment)) (let [arr (get-array node)] (if (regular? node) arr (butlast arr)))) errs (filter :error children)] (cond (seq errs) {:error true, :description "One or more errors found", :data errs} (not= p/max-branches (count children)) {:error true, :kind :internal, :description (str "Found internal node that has " (count children) " children - expected p/max-branches.")} (regular? node) (regular-node-errors (= shift root-shift) root-node-cnt children) :else (non-regular-node-errors node get-ranges children)))))] (let [x (go root-shift root)] (cond (:error x) x (not= tail-off (:count x)) {:error true, :kind :root, :description (str "Found tail-off=" tail-off " != " (:count x) "=count of values beneath internal nodes") :internal-node-leaf-count (:count x) :tail-off tail-off :cnt (get-cnt v)} (and (pd/transient-vector? v) (not= (tail-len tail) p/max-branches)) {:error true, :kind :root, :description (str "Found transient vector with tail length " (tail-len tail) " - expecting p/max-branches")} ;; It is always a bad thing if shift becomes more than 32, ;; because the bit-shift-left and bit-shift-right operations ;; on 32-bit ints actually behave like (bit-shift-left ;; x (mod shift-amount 32)) for shift-amount over 32. It is ;; also likely a bug in the implementation if that happens. (>= root-shift 32) {:error true, :kind :root, :description (str "shift of root is " root-shift " >= 32," " which is not supported.")} ;; This is not necessarily a bug, but it seems likely to be ;; a bug if a tree is less than 1/max-branches-squared full compared to its ;; max capacity. 1/(p/max-branches) full is normal when a tree becomes 1 ;; deeper than it was before. (< 0 (:count x) (max-capacity-divided-by-max-branches-squared root-shift)) {:error false, :warning true, :kind :root-too-deep, :description (str "For root shift=" root-shift " the maximum " "capacity divided by p/max-branches-squared is " (max-capacity-divided-by-max-branches-squared root-shift) " but the tree contains only " (:count x) " vector elements outside of the tail")} :else x))))) #_(defn add-return-value-checks [f err-desc-str return-value-check-fn] (fn [& args] (let [ret (apply f args)] (apply return-value-check-fn err-desc-str ret args) ret))) (defn copying-seq [v] (let [{:keys [v subvector? subvec-start subvec-end get-root get-shift get-tail get-array regular?]} (pd/unwrap-subvec-accessors-for v) root (get-root v) shift (get-shift v)] (letfn [(go [shift node] (if node (if (not= shift 0) (apply concat (map (partial go (- shift p/shift-increment)) (let [arr (get-array node)] (if (regular? node) arr (butlast arr))))) (seq (get-array node)))))] (doall ;; always return a fully realized sequence. (let [all-elems (concat (go shift root) (if (pd/transient-vector? v) (take (pd/dbg-tidx v) (get-tail v)) (seq (get-tail v))))] (if subvector? (take (- subvec-end subvec-start) (drop subvec-start all-elems)) all-elems)))))) (def failure-data (atom [])) (def warning-data (atom [])) (defn clear-failure-data! [] (reset! failure-data [])) (let [orig-conj clojure.core/conj] (defn record-failure-data [d] (swap! failure-data orig-conj d)) (defn record-warning-data [d] (swap! warning-data orig-conj d))) ;; I would like to achieve a goal of providing an easy-to-use way that ;; a Clojure or ClojureScript developer could call a function, or ;; invoke their own code in a macro, and then within the run-time ;; scope of that, a selected set of calls to functions like conj, ;; conj!, pop, pop!, transient, subvec, slicev, catvec, splicev, and ;; perhaps others, would have extra checks enabled, such that if they ;; detected a bug, they would stop the execution immediately with a ;; lot of debug information recorded as near to the point of the ;; failure as can be achieved by checking the return values of such ;; function calls. ;; It would also be good if this goal could be achieved without having ;; a separate implementation of all of those functions, and/or custom ;; versions of Clojure, ClojureScript, or the core.rrb-vector library ;; to use. Actually a separate implementation of core.rrb-vector ;; might be acceptable and reasonable to implement and maintain, but ;; separate versions of Clojure and ClojureScript seems like too much ;; effort for the benefits achieved. ;; I have investigated approaches that attempt to use with-redefs on ;; the 'original Vars' in Clojure, and also in a ClojureScript ;; Node-based REPL. ;; There are differences between with-redefs behavior on functions in ;; clojure.core between Clojure and ClojureScript, because ;; direct-linking seems to also include user code calling to ;; clojure.core functions with ClojureScript: ;; https://clojure.atlassian.net/projects/CLJS/issues/CLJS-3154 ;; At least in Clojure, and perhaps also in ClojureScript, there is ;; sometimes an effect similar to direct linking involved when calling ;; protocol methods on objects defined via deftype. That prevents ;; with-redefs, and any technique that changes the definition of a Var ;; with alter-var-root! or set!, from causing the alternate function ;; to be called. ;; Here are the code paths that I think are most useful for debug ;; checks of operations on vectors. ;; Functions in clojure.core: ;; Lower value, because they are simpler functions, and in particular ;; do not operate on RRB vector trees with ranges inside: ;; vec vector vector-of ;; Similarly the RRB vector variants of those functions create regular ;; RRB vectors, so not as likely to have bugs. ;; peek can operate on trees with ranges inside, but always accesses ;; the tail, so not nearly as likely to have bugs. ;; Higher value, because they can operate on RRB vectors with ranges ;; inside the tree: ;; conj pop assoc ;; conj! pop! assoc! ;; transient persistent! ;; seq rseq ;; Functions in clojure.core.rrb-vector namespace, and internal ;; implementation functions/protocol-methods that they use: ;; defn fv/catvec ;; calls itself recursively for many args (clj and cljs versions) ;; -splicev protocol function (splicev for clj) ;; When -splicev is called on PersistentVector or Subvec, -as-rrbt ;; converts it to Vector, then method below is called. ;; deftype Vector -splicev / splicev method ;; -as-rrbt (cljs) / as-rrbt (clj) ;; -slicev (cljs) / slicev (clj) if used on a subvector object ;; defn splice-rrbts ;; defn splice-rrbts-main ;; Calls many internal implementation detail functions. ;; peephole-optimize-root ;; fallback-to-slow-splice-if-needed ;; defn fv/subvec ;; -slicev (cljs) / slicev (clj) protocol function ;; deftype Vector -slicev method ;; Calls many internal implementation detail functions, ;; e.g. slice-left slice-right make-array array-copy etc. ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ;; See the documentation of the several checking-* functions for the ;; keys supported inside of the @debug-opts map. (def debug-opts (atom {})) (def full-debug-opts {:trace false :validate true :return-value-checks [edit-nodes-errors basic-node-errors ranges-errors] ;; false -> throw an exception when error detected :continue-on-error false ;; true -> do not throw an exception when warning found :continue-on-warning true}) (defn set-debug-opts! "set-debug-opts! modified the debug-opts atom of the core.rrb-vector library, which configures what kinds of extra checks are performed when calling the checking-* versions of functions defined in the library's debug namespace. Example call: (require '[clojure.core.rrb-vector.debug :as d]) (d/set-debug-opts! d/full-debug-opts) This call enables as thorough of extra verification checks as is supported by existing code, when you call any of the checking-* variants of the functions in this namespace, e.g. checking-catvec, checking-subvec. It will also slow down your code to do so. checking-* functions return the same values as their non checking-* original functions they are based upon, so you can write application code that mixes calls to both, calling the checking-* versions only occasionally, if you have a long sequence of operations that you want to look for bugs within core.rrb-vector's implementation of." [opts] (reset! debug-opts {:catvec opts ;; affects checking-catvec behavior, ;; via calling checking-splicev and ;; checking-splice-rrbts and enabling ;; their extra checks. :subvec opts ;; affects checking-subvec behavior, ;; via calling checking-slicev and ;; enabling its extra checks :pop opts ;; affects checking-pop :pop! opts ;; affects checking-pop! :transient opts})) ;; affects checking-transient (defn validation-failure [err-msg-str failure-data opts] (println "ERROR:" err-msg-str) (record-failure-data failure-data) (when-not (:continue-on-error opts) (throw (ex-info err-msg-str failure-data)))) (defn sanity-check-vector-internals "This function is called by all of the checking-* variants of functions in the debug namespace. It calls all of the functions in (:return-value-checks opts) in the order given, passing each of those functions a return value 'ret'. Each function performs sanity checks on the 'ret' data structure used to represent the vector. Those functions should return a map with key :error having a logical true value if any errors were found, or a key :warning having a logical true value if any warnings were found, otherwise both of those values must be logical false in the returned map (or no such key is present in the returned map at all). Three examples of such functions are included in core.rrb-vector's debug namespace. * edit-nodes-errors * basic-node-errors * ranges-errors They each look for different problems in the vector data structure internals. They were developed as separate functions in case there was ever a significant performance advantage to configuring only some of them to be called, not all of them, for long tests. If any errors are found, this function calls record-failure-data, to record the details in a global atom. It prints a message to *out*, and if (:continue-on-error opts) is logical false, it throws a data conveying exception using ex-info containing the same message, and the same error details map passed to record-failure-data. If no exception is thrown due to an error, then repeat the same checks for a warning message, recording details via calling record-warning-data, and throwing an exception if (:continue-on-warning opts) is logical false." [err-desc-str ret args opts] (doseq [check-fn (:return-value-checks opts)] (let [i (check-fn ret)] (when (:error i) (let [msg (str "found error in ret value from " err-desc-str ": " (:description i)) failure-data {:err-desc-str err-desc-str, :ret ret, :args args, :error-info i}] (println "ERROR:" msg) (record-failure-data failure-data) (when-not (:continue-on-error opts) (throw (ex-info msg failure-data))))) (when (:warning i) ;; It is perfectly normal for fv/subvec and slicev to return a ;; vector that causes this warning. (when-not (and (= err-desc-str "slicev") (= :root-too-deep (:kind i))) (let [msg (str "possible issue with ret value from " err-desc-str ": " (:description i)) failure-data {:err-desc-str err-desc-str, :ret ret, :args args, :error-info i}] (println "WARNING:" msg) (record-warning-data failure-data) (when-not (:continue-on-warning opts) (throw (ex-info msg failure-data))))))))) (defn validating-pop "validating-pop is not really designed to be called from user programs. checking-pop can do everything that validating-pop can, and more. See its documentation. A typical way of calling validating-pop is: (require '[clojure.core.rrb-vector.debug :as d]) (d/validating-pop clojure.core/pop \"pop\" coll) Most of the validating-* functions behave similarly. This one contains the most complete documentation, and the others refer to this one. They all differ in the function that they are intended to validate, and a few other details, which will be collected in one place here for function validating-pop so one can quickly see the differences between validating-pop and the other validating-* functions. good example f: clojure.core/pop opts map: (get @d/debug-opts :pop) The first argument can be any function f. f is expected to take arguments and return a value equal to what clojure.core/pop would, given the argument coll. validating-pop will first make a copy of the seq of items in coll, as a safety precaution, because some kinds of incorrect implementations of pop could mutate their input argument. That would be a bug, of course, but aiding a developer in detecting bugs is the reason validating-pop exists. It uses the function copying-seq to do this, which takes at least linear time in the size of coll. It will then calculate a sequence that is = to the expected return value, e.g. for pop, all items in coll except the last one. Then validating-pop will call (f coll), then call copying-seq on the return value. If the expected and returned sequences are not =, then a map containing details about the arguments and actual return value is created and passed to d/record-failure-data, which appends the map to the end of a vector that is the value of an atom named d/failure-data. An exception is thrown if (:continue-on-error opts) is logical false, with ex-data equal to this same map of error data. If the expected and actual sequences are the same, no state is modified and no exception is thrown. If validating-pop does not throw an exception, the return value is (f coll)." [f err-desc-str coll] (let [coll-seq (copying-seq coll) exp-ret-seq (butlast coll-seq) ret (f coll) ret-seq (copying-seq ret)] (when (not= ret-seq exp-ret-seq) (validation-failure "(pop coll) returned incorrect value" {:err-desc-str err-desc-str, :ret ret, :args (list coll), :coll-seq coll-seq, :ret-seq ret-seq, :exp-ret-seq exp-ret-seq} (get @debug-opts :pop))) ret)) (defn checking-pop "These two namespace aliases will be used later in this documentation: (require '[clojure.core.rrb-vector.debug :as d]) (require '[clojure.core.rrb-vector.debug-platform-dependent :as pd]) checking-pop passes its argument to clojure.core/pop, and if it returns, it returns whatever clojure.core/pop does. If checking-pop detects any problems, it will record information about the problems found in one or both of the global atoms 'd/failure-data' and 'd/warning-data', and optionally throw an exception. If coll is not a vector type according to pd/is-vector?, then checking-pop simply behaves exactly like clojure.core/pop, with no additional checks performed. All of checking-pop's extra checks are specific to vectors. If coll is a vector, then checking-pop looks up the key :pop in a global atom 'd/debug-opts'. The result of that lookup is a map we will call 'opts' below. opts map: (get @d/debug-opts :pop) function called if (:validating opts) is logical true: validating-pop If (:trace opts) is true, then a debug trace message is printed to *out*. If (:validate opts) is true, then validating-pop is called, using clojure.core/pop to do the real work, but validating-pop will check whether the return value looks correct relative to the input parameter value, i.e. it is equal to a sequence of values containing all but the last element of the input coll's sequence of values. See validating-pop documentation for additional details. This step records details of problems found in the atoms d/failure-data. (:return-value-checks opts) should be a sequence of functions that each take the vector returned from calling clojure.core/pop, and return data about any errors or warnings they find in the internals of the vector data structure. Errors or warnings are appended to atoms d/failure-data and/or d/warning-data. If either the validate or return value checks steps find an error, they throw an exception if (:continue-on-error opts) is logical false. If the return value checks step finds no error, but does find a warning, it throws an exception if (:continue-on-warning opts) is logical false." [coll] (if-not (pd/is-vector? coll) (clojure.core/pop coll) (let [opts (get @debug-opts :pop) err-desc-str "pop"] (when (:trace opts) (println "checking-pop called with #v=" (count coll) "(type v)=" (type coll))) (let [ret (if (:validate opts) (validating-pop clojure.core/pop err-desc-str coll) (clojure.core/pop coll))] (sanity-check-vector-internals err-desc-str ret [coll] opts) ret)))) (defn validating-pop! "validating-pop! behaves the same as validating-pop, with the differences described here. See validating-pop for details. good example f: clojure.core/pop! opts map: (get @d/debug-opts :pop!) If no exception is thrown, the return value is (f coll)." [f err-desc-str coll] (let [coll-seq (copying-seq coll) exp-ret-seq (butlast coll-seq) ret (f coll) ret-seq (copying-seq ret)] (when (not= ret-seq exp-ret-seq) (validation-failure "(pop! coll) returned incorrect value" {:err-desc-str err-desc-str, :ret ret, :args (list coll), :coll-seq coll-seq, :ret-seq ret-seq, :exp-ret-seq exp-ret-seq} (get @debug-opts :pop!))) ret)) (defn checking-pop! "checking-pop! is similar to checking-pop, with the differences summarized below. See checking-pop documentation for details. opts map: (get @d/debug-opts :pop!) function called if (:validating opts) is logical true: validating-pop!" [coll] (if-not (pd/is-vector? coll) (clojure.core/pop! coll) (let [opts (get @debug-opts :pop!) err-desc-str "pop!"] (when (:trace opts) (println "checking-pop! called with #v=" (count coll) "(type v)=" (type coll))) (let [ret (if (:validate opts) (validating-pop! clojure.core/pop! err-desc-str coll) (clojure.core/pop! coll))] (sanity-check-vector-internals err-desc-str ret [coll] opts) ret)))) (defn validating-transient "validating-transient behaves the same as validating-pop, with the differences described here. See validating-pop for details. good example f: clojure.core/transient opts map: (get @d/debug-opts :transient) If no exception is thrown, the return value is (f coll)." [f err-desc-str coll] (let [coll-seq (copying-seq coll) exp-ret-seq coll-seq ret (f coll) ret-seq (copying-seq ret)] (when (not= ret-seq exp-ret-seq) (validation-failure "(transient coll) returned incorrect value" {:err-desc-str err-desc-str, :ret ret, :args (list coll), :coll-seq coll-seq, :ret-seq ret-seq, :exp-ret-seq exp-ret-seq} (get @debug-opts :transient))) ret)) (defn checking-transient "checking-transient is similar to checking-pop, with the differences summarized below. See checking-pop documentation for details. opts map: (get @d/debug-opts :transient) function called if (:validating opts) is logical true: validating-transient" [coll] (if-not (pd/is-vector? coll) (clojure.core/transient coll) (let [opts (get @debug-opts :transient) err-desc-str "transient"] (when (:trace opts) (println "checking-transient called with #v=" (count coll) "(type v)=" (type coll))) (let [ret (if (:validate opts) (validating-transient clojure.core/transient err-desc-str coll) (clojure.core/transient coll))] (sanity-check-vector-internals err-desc-str ret [coll] opts) ret)))) (defn validating-splice-rrbts-main "validating-splice-rrbts-main behaves the same as validating-pop, with the differences described here. See validating-pop for details. good example f: clojure.core.rrb-vector.rrbt/splice-rrbts-main opts map: (get @d/debug-opts :catvec) ;; _not_ :splice-rrbts-main Given that splice-rrbts-main is an internal implementation detail of the core.rrb-vector library, it is expected that it is more likely you would call validating-catvec instead of this function. If no exception is thrown, the return value is (f v1 v2)." [err-desc-str nm am v1 v2] (let [orig-fn clojure.core.rrb-vector.rrbt/splice-rrbts-main v1-seq (copying-seq v1) v2-seq (copying-seq v2) exp-ret-seq (concat v1-seq v2-seq) ret (orig-fn nm am v1 v2) ret-seq (copying-seq ret)] (when (not= ret-seq exp-ret-seq) (validation-failure "splice-rrbts-main returned incorrect value" {:err-desc-str err-desc-str, :ret ret, :args (list nm am v1 v2) :v1-seq v1-seq, :v2-seq v2-seq, :ret-seq ret-seq, :exp-ret-seq exp-ret-seq} (get @debug-opts :catvec))) ret)) (defn checking-splice-rrbts-main "checking-splice-rrbts-main is similar to checking-pop, with the differences summarized below. See checking-pop documentation for details. Unlike checking-pop, it seems unlikely that a user of core.rrb-vector would want to call this function directly. See checking-catvec. checking-splice-rrbts-main is part of the implementation of checking-catvec. opts map: (get @d/debug-opts :catvec) ;; _not_ :splice-rrbts-main function called if (:validating opts) is logical true: validating-splice-rrbts-main" [& args] (let [opts (get @debug-opts :catvec) err-desc-str "splice-rrbts-main"] (when (:trace opts) (let [[_ _ v1 v2] args] (println "checking-splice-rrbts-main called with #v1=" (count v1) "#v2=" (count v2) "(type v1)=" (type v1) "(type v2)=" (type v2)))) (let [ret (if (:validate opts) (apply validating-splice-rrbts-main err-desc-str args) (apply clojure.core.rrb-vector.rrbt/splice-rrbts-main args))] (sanity-check-vector-internals err-desc-str ret args opts) ret))) (defn checking-splice-rrbts "checking-splice-rrbts is similar to checking-pop, with the differences summarized below. See checking-pop documentation for details. Unlike checking-pop, it seems unlikely that a user of core.rrb-vector would want to call this function directly. See checking-catvec. checking-splice-rrbts is part of the implementation of checking-catvec. opts map: (get @d/debug-opts :catvec) ;; _not_ :splice-rrbts function called if (:validating opts) is logical true: validating-splice-rrbts" [& args] (let [opts (get @debug-opts :catvec) err-desc-str1 "splice-rrbts checking peephole-optimize-root result" err-desc-str2 "splice-rrbts checking fallback-to-slow-splice-if-needed result" [nm am v1 v2] args] (when (:trace opts) (println "checking-splice-rrbts called with #v1=" (count v1) "#v2=" (count v2) "(type v1)=" (type v1) "(type v2)=" (type v2))) (let [r1 (checking-splice-rrbts-main nm am v1 v2) r2 (rrbt/peephole-optimize-root r1)] ;; Optimize a bit by only doing all of the sanity checks on r2 ;; if it is not the same identical data structure r1 that ;; checking-splice-rrbts-main already checked. (when-not (identical? r2 r1) (sanity-check-vector-internals err-desc-str1 r2 args opts)) (let [r3 (rrbt/fallback-to-slow-splice-if-needed v1 v2 r2)] (when-not (identical? r3 r2) (sanity-check-vector-internals err-desc-str2 r3 args opts)) r3)))) (defn checking-splicev "checking-splicev is identical to splicev, except that it calls checking-splice-rrbts instead of splice-rrbts, for configurable additional checking on each call to checking-splice-rrbts. It is more likely that a core.rrb-vector library user will want to call checking-catvec rather than this one. checking-splicev is part of the implementation of checking-catvec." [v1 v2] (let [rv1 (as-rrbt v1)] (checking-splice-rrbts (.-nm rv1) (.-am rv1) rv1 (as-rrbt v2)))) (defn checking-catvec-impl "checking-catvec-impl is identical to catvec, except that it calls checking-splicev instead of splicev, for configurable additional checking on each call to checking-splicev." ([] []) ([v1] v1) ([v1 v2] (checking-splicev v1 v2)) ([v1 v2 v3] (checking-splicev (checking-splicev v1 v2) v3)) ([v1 v2 v3 v4] (checking-splicev (checking-splicev v1 v2) (checking-splicev v3 v4))) ([v1 v2 v3 v4 & vn] (checking-splicev (checking-splicev (checking-splicev v1 v2) (checking-splicev v3 v4)) (apply checking-catvec-impl vn)))) (defn validating-catvec "validating-catvec behaves similarly to validating-pop, but note that it does not allow you to pass in a function f on which to concatenate its arguments. It hardcodes d/checking-catvec-impl for that purpose. See validating-pop for more details. opts map: (get @d/debug-opts :catvec) If no exception is thrown, the return value is (apply checking-catvec-impl vs)." [err-desc-str & vs] (let [orig-fn checking-catvec-impl ;; clojure.core.rrb-vector/catvec vs-seqs (doall (map copying-seq vs)) exp-ret-seq (apply concat vs-seqs) ret (apply orig-fn vs) ret-seq (copying-seq ret)] (when (not= ret-seq exp-ret-seq) (validation-failure "catvec returned incorrect value" {:err-desc-str err-desc-str, :ret ret, :args vs, :vs-seqs vs-seqs, :ret-seq ret-seq, :exp-ret-seq exp-ret-seq} (get @debug-opts :catvec))) ret)) (defn checking-catvec "checking-catvec is similar to checking-pop, with the differences summarized below. See checking-pop documentation for details. Note that (get @d/debug-otps :catvec) is used to control tracing, validating, and return value sanity checks for checking-catvec as a whole. This includes controlling those options for the function checking-splice-rrbts, which is used to concatenate pairs of vectors when you call checking-catvec with 3 or more vectors. This takes a bit longer to do the checking on every concatenation, but catches problems closer to the time they are introduced. opts map: (get @d/debug-opts :catvec) function called if (:validating opts) is logical true: validating-catvec" [& args] (let [opts (get @debug-opts :catvec) err-desc-str "catvec"] (when (:trace opts) (println "checking-catvec called with" (count args) "args:") (dorun (map-indexed (fn [idx v] (println " arg" (inc idx) " count=" (count v) "type=" (type v))) args))) (let [ret (if (:validate opts) (apply validating-catvec err-desc-str args) (apply checking-catvec-impl ;; clojure.core.rrb-vector/catvec args))] (sanity-check-vector-internals err-desc-str ret args opts) ret))) (defn validating-slicev "validating-slicev behaves similarly to validating-pop, but note that it does not allow you to pass in a function f to call. It hardcodes slicev for that purpose. See validating-pop for more details. opts map: (get @d/debug-opts :subvec) ;; _not_ :slicev" ([err-desc-str coll start] (validating-slicev err-desc-str coll start (count coll))) ([err-desc-str coll start end] (let [coll-seq (copying-seq coll) exp-ret-seq (take (- end start) (drop start coll-seq)) ret (clojure.core.rrb-vector.protocols/slicev coll start end) ret-seq (copying-seq ret)] (when (not= ret-seq exp-ret-seq) (validation-failure "(slicev coll start end) returned incorrect value" {:err-desc-str err-desc-str, :ret ret, :args (list coll start end), :coll-seq coll-seq, :ret-seq ret-seq, :exp-ret-seq exp-ret-seq} (get @debug-opts :subvec))) ret))) (defn checking-slicev "checking-slicev is similar to checking-pop, with the differences summarized below. See checking-pop documentation for details. Unlike checking-pop, it seems unlikely that a user of core.rrb-vector would want to call this function directly. See checking-subvec. checking-slicev is part of the implementation of checking-subvec. opts map: (get @d/debug-opts :subvec) ;; _not_ :slicev function called if (:validating opts) is logical true: validating-slicev" [& args] (let [opts (get @debug-opts :subvec) err-desc-str "slicev"] (when (:trace opts) (let [[v start end] args] (println "checking-slicev #v=" (count v) "start=" start "end=" end "type=" (type v)))) (let [ret (if (:validate opts) (apply validating-slicev err-desc-str args) (apply clojure.core.rrb-vector.protocols/slicev args))] (sanity-check-vector-internals err-desc-str ret args opts) ret))) (defn checking-subvec "checking-subvec is similar to checking-pop, with the differences summarized below. See checking-pop documentation for details. opts map: (get @d/debug-opts :subvec) function called if (:validating opts) is logical true: validating-slicev" ([v start] (checking-slicev v start (count v))) ([v start end] (checking-slicev v start end))) (defn check-subvec "Perform a sequence of calls to subvec an a core.rrb-vector vector, as well as a normal Clojure vector, returning true if they give the same results, otherwise false. Intended for use in tests of this library." [extra-checks? init & starts-and-ends] (let [v1 (loop [v (vec (range init)) ses (seq starts-and-ends)] (if ses (let [[s e] ses] (recur (clojure.core/subvec v s e) (nnext ses))) v)) my-subvec (if extra-checks? checking-subvec fv/subvec) v2 (loop [v (fv/vec (range init)) ses (seq starts-and-ends)] (if ses (let [[s e] ses] (recur (my-subvec v s e) (nnext ses))) v))] (pd/same-coll? v1 v2))) (defn check-catvec "Perform a sequence of calls to catvec or checking-catvec on one or more core.rrb-vector vectors. Return true if Clojure's built-in concat function give the same results, otherwise false. Intended for use in tests of this library." [extra-checks? & counts] (let [prefix-sums (reductions + counts) ranges (map range (cons 0 prefix-sums) prefix-sums) v1 (apply concat ranges) my-catvec (if extra-checks? checking-catvec fv/catvec) v2 (apply my-catvec (map fv/vec ranges))] (pd/same-coll? v1 v2))) (defn generative-check-subvec "Perform many calls to check-subvec with randomly generated inputs. Intended for use in tests of this library. Returns true if all tests pass, otherwise throws an exception containing data about the inputs that caused the failing test." [extra-checks? iterations max-init-cnt slices] (dotimes [_ iterations] (let [init-cnt (rand-int (inc max-init-cnt)) s1 (rand-int init-cnt) e1 (+ s1 (rand-int (- init-cnt s1)))] (loop [s&es [s1 e1] cnt (- e1 s1) slices slices] (if (or (zero? cnt) (zero? slices)) (if-not (try (apply check-subvec extra-checks? init-cnt s&es) (catch Exception e (throw (ex-info "check-subvec failure w/ Exception" {:init-cnt init-cnt :s&es s&es} e)))) (throw (ex-info "check-subvec failure w/o Exception" {:init-cnt init-cnt :s&es s&es}))) (let [s (rand-int cnt) e (+ s (rand-int (- cnt s))) c (- e s)] (recur (conj s&es s e) c (dec slices))))))) true) (defn generative-check-catvec "Perform many calls to check-catvec with randomly generated inputs. Intended for use in tests of this library. Returns true if all tests pass, otherwise throws an exception containing data about the inputs that caused the failing test." [extra-checks? iterations max-vcnt min-cnt max-cnt] (dotimes [_ iterations] (let [vcnt (inc (rand-int (dec max-vcnt))) cnts (vec (repeatedly vcnt #(+ min-cnt (rand-int (- (inc max-cnt) min-cnt)))))] (if-not (try (apply check-catvec extra-checks? cnts) (catch Exception e (throw (ex-info "check-catvec failure w/ Exception" {:cnts cnts} e)))) (throw (ex-info "check-catvec failure w/o Exception" {:cnts cnts}))))) true) ================================================ FILE: src/parameterized/clojure/clojure/core/rrb_vector/debug_platform_dependent.clj ================================================ (ns clojure.core.rrb-vector.debug-platform-dependent (:refer-clojure :exclude [format printf]) (:require [clojure.core.rrb-vector.parameters :as p] clojure.core.rrb-vector.rrbt [clojure.core.rrb-vector.nodes :refer [ranges object-nm primitive-nm object-am]] [clojure.core.rrb-vector :as fv]) (:import (clojure.lang PersistentVector PersistentVector$TransientVector PersistentVector$Node APersistentVector$SubVector) (java.util.concurrent.atomic AtomicReference) (java.lang.reflect Field Method) (clojure.core Vec VecNode ArrayManager) (clojure.core.rrb_vector.rrbt Vector Transient) (clojure.core.rrb_vector.nodes NodeManager))) ;; Work around the fact that several fields of type ;; PersistentVector$TransientVector are private, but note that this is ;; only intended for debug use. (def ^Class transient-core-vec-class (class (transient (vector)))) (def ^Field transient-core-root-field (.getDeclaredField transient-core-vec-class "root")) (.setAccessible transient-core-root-field true) (def ^Field transient-core-shift-field (.getDeclaredField transient-core-vec-class "shift")) (.setAccessible transient-core-shift-field true) (def ^Field transient-core-tail-field (.getDeclaredField transient-core-vec-class "tail")) (.setAccessible transient-core-tail-field true) (def ^Field transient-core-cnt-field (.getDeclaredField transient-core-vec-class "cnt")) (.setAccessible transient-core-cnt-field true) (def transient-core-vec-tailoff-methods (filter #(= "tailoff" (.getName ^Method %)) (.getDeclaredMethods transient-core-vec-class))) (assert (= (count transient-core-vec-tailoff-methods) 1)) (def ^Method transient-core-vec-tailoff-method (first transient-core-vec-tailoff-methods)) (.setAccessible transient-core-vec-tailoff-method true) (def ^Class persistent-core-vec-class (class (vector))) (def persistent-core-vec-tailoff-methods (filter #(= "tailoff" (.getName ^Method %)) (.getDeclaredMethods persistent-core-vec-class))) (assert (= (count persistent-core-vec-tailoff-methods) 1)) (def ^Method persistent-core-vec-tailoff-method (first persistent-core-vec-tailoff-methods)) (.setAccessible persistent-core-vec-tailoff-method true) (def format clojure.core/format) (def printf clojure.core/printf) (defn internal-node? [obj] (contains? #{PersistentVector$Node VecNode} (class obj))) (defn persistent-vector? [obj] (contains? #{PersistentVector Vec Vector} (class obj))) (defn transient-vector? [obj] (contains? #{PersistentVector$TransientVector Transient} (class obj))) (defn is-vector? [obj] (contains? #{PersistentVector Vec Vector PersistentVector$TransientVector Transient} (class obj))) (defn dbg-tailoff [v] (cond (instance? PersistentVector v) (.invoke persistent-core-vec-tailoff-method v (object-array 0)) (= PersistentVector$TransientVector (class v)) (.invoke transient-core-vec-tailoff-method v (object-array 0)) :else (.tailoff v))) (defn dbg-tidx [v] (- (count v) (dbg-tailoff v))) (defn subvector-data [v] (if (instance? APersistentVector$SubVector v) (let [^APersistentVector$SubVector v v] {:orig-v v :subvector? true :v (.v v) :subvec-start (.start v) :subvec-end (.end v)}) {:orig-v v :subvector? false :v v})) ;; All of the classes below have a .tailoff method implementation that ;; works correctly for that class. You can use the debug-tailoff ;; function to work around the fact that this method is not public for ;; some of the vector classes. (defn accessors-for [v] (condp identical? (class v) PersistentVector (let [nm object-nm, am object-am] {:get-root #(.-root ^PersistentVector %) :get-shift #(.-shift ^PersistentVector %) :get-tail #(.-tail ^PersistentVector %) :get-cnt #(.-cnt ^PersistentVector %) :get-array #(.array ^NodeManager nm %) :get-ranges #(ranges ^NodeManager nm %) :regular? #(.regular ^NodeManager nm %) :tail-len #(.alength ^ArrayManager am %) }) PersistentVector$TransientVector (let [nm object-nm, am object-am] {:get-root #(.get transient-core-root-field ^PersistentVector$TransientVector %) :get-shift #(.get transient-core-shift-field ^PersistentVector$TransientVector %) :get-tail #(.get transient-core-tail-field ^PersistentVector$TransientVector %) :get-cnt #(.get transient-core-cnt-field ^PersistentVector$TransientVector %) :get-array #(.array ^NodeManager nm %) :get-ranges #(ranges ^NodeManager nm %) :regular? #(.regular ^NodeManager nm %) :tail-len #(.alength ^ArrayManager am %) }) Vec (let [nm primitive-nm, am #(.-am ^Vec %)] {:get-root #(.-root ^Vec %) :get-shift #(.-shift ^Vec %) :get-tail #(.-tail ^Vec %) :get-cnt #(.-cnt ^Vec %) :get-array #(.array ^NodeManager nm %) :get-ranges #(ranges ^NodeManager nm %) :regular? #(.regular ^NodeManager nm %) :tail-len #(.alength ^ArrayManager am %) }) Vector (let [nm (.-nm ^Vector v), am #(.-am ^Vector %)] {:get-root #(.-root ^Vector %) :get-shift #(.-shift ^Vector %) :get-tail #(.-tail ^Vector %) :get-cnt #(.-cnt ^Vector %) :get-array #(.array ^NodeManager nm %) :get-ranges #(ranges ^NodeManager nm %) :regular? #(.regular ^NodeManager nm %) :tail-len #(.alength ^ArrayManager am %) }) Transient (let [nm (.-nm ^Transient v), am (.-am ^Transient v)] {:get-root #(.debugGetRoot ^Transient %) :get-shift #(.debugGetShift ^Transient %) :get-tail #(.debugGetTail ^Transient %) :get-cnt #(.debugGetCnt ^Transient %) :get-array #(.array ^NodeManager nm %) :get-ranges #(ranges ^NodeManager nm %) :regular? #(.regular ^NodeManager nm %) :tail-len #(.alength ^ArrayManager am %) }))) (defn unwrap-subvec-accessors-for [v] (let [{:keys [v] :as m} (subvector-data v) accessors (accessors-for v)] (merge m accessors))) (defn abbrev-for-type-of [obj] (let [cn (.getName (class obj)) d (.lastIndexOf cn ".")] (subs cn (inc d)))) (defn same-coll? [a b] (and (= (count a) (count b) (.size ^java.util.Collection a) (.size ^java.util.Collection b)) (= a b) (= b a) (= (hash a) (hash b)) (= (.hashCode ^Object a) (.hashCode ^Object b)))) ;; TBD: No cljs specific version yet (defn count-nodes [& vs] (let [m (java.util.IdentityHashMap.)] (doseq [v vs] (let [{:keys [v get-root get-shift get-array]} (unwrap-subvec-accessors-for v)] (letfn [(go [n shift] (when n (.put m n n) (if-not (zero? shift) (let [arr (get-array n) ns (take p/max-branches arr)] (doseq [n ns] (go n (- shift p/shift-increment)))))))] (go (get-root v) (get-shift v))))) (.size m))) (defn int-array? [x] (and (not (nil? x)) (.isArray (class x)) (= Integer/TYPE (. (class x) getComponentType)))) ;; TBD: No cljs-specific version of this function yet #_(defn ranges-not-int-array [x] (seq (remove int-array? (objects-in-slot-32-of-obj-arrays x)))) (defn atomicref? [x] (instance? AtomicReference x)) (defn thread? [x] (instance? java.lang.Thread x)) (defn non-identical-edit-nodes [v all-vector-tree-nodes] (let [{:keys [v]} (unwrap-subvec-accessors-for v) node-maps (all-vector-tree-nodes v) ^java.util.IdentityHashMap ihm (java.util.IdentityHashMap.)] (doseq [i node-maps] (when (= :internal (:kind i)) (.put ihm (.edit (:node i)) true))) ihm)) (defn edit-nodes-errors [v all-vector-tree-nodes] (let [{:keys [v get-root]} (unwrap-subvec-accessors-for v) klass (class v) ^java.util.IdentityHashMap ihm (non-identical-edit-nodes v all-vector-tree-nodes) objs-maybe-some-nils (.keySet ihm) ;; I do not believe that Clojure's built-in vector types can ;; ever have edit fields equal to nil, but there are some ;; cases where I have seen core.rrb-vector edit fields equal ;; to nil. As far as I can tell this seems harmless, as long ;; as it is in a persistent vector, not a transient one. objs (remove nil? objs-maybe-some-nils) neither-nil-nor-atomicref (remove atomicref? objs)] (if (seq neither-nil-nor-atomicref) {:error true :description (str "Found edit object with class " (class (first neither-nil-nor-atomicref)) " - expecting nil or AtomicReference") :data ihm :not-atomic-refs neither-nil-nor-atomicref} (let [refd-objs (map #(.get ^AtomicReference %) objs) non-nils (remove nil? refd-objs) not-threads (remove thread? non-nils) root-edit (.edit (get-root v))] (cond (seq not-threads) {:error true :description (str "Found edit AtomicReference ref'ing neither nil" " nor a Thread object") :data ihm} (persistent-vector? v) (if (= (count non-nils) 0) {:error false} {:error true :description (str "Within a persistent (i.e. not transient)" " vector, found at least one edit" " AtomicReference object that ref's a Thread" " object. Expected all of them to be nil.") :data ihm :val1 (count non-nils) :val2 non-nils}) (transient-vector? v) (cond (not= (count non-nils) 1) {:error true :description (str "Within a transient vector, found " (count non-nils) " edit AtomicReference" " object(s) that ref's a Thread object." " Expected exactly 1.") :data ihm :val1 (count non-nils) :val2 non-nils} (not (atomicref? root-edit)) {:error true :description (str "Within a transient vector, found root edit" " field that was ref'ing an object with class " (class root-edit) " - expected AtomicReference.") :data root-edit} (not (thread? (.get ^AtomicReference root-edit))) (let [obj (.get ^AtomicReference root-edit)] {:error true :description (str "Within a transient vector, found root edit" " field ref'ing an AtomicReference object," " but that in turn ref'd something with class " (class obj) " - expected java.lang.Thread.") :data obj}) :else {:error false}) :else {:error true :description (str "Unknown class " klass " for object checked" " by edit-nodes-wrong-number-of-threads") :data v}))))) ================================================ FILE: src/parameterized/clojure/clojure/core/rrb_vector/fork_join.clj ================================================ (ns clojure.core.rrb-vector.fork-join (:require [clojure.core.reducers :as r])) (def pool @#'r/pool) (def task @#'r/fjtask) (def invoke @#'r/fjinvoke) (def fork @#'r/fjfork) (def join @#'r/fjjoin) ================================================ FILE: src/parameterized/clojure/clojure/core/rrb_vector/interop.clj ================================================ (ns clojure.core.rrb-vector.interop (:require [clojure.core.rrb-vector.protocols :refer [PSliceableVector slicev PSpliceableVector splicev]] [clojure.core.rrb-vector.rrbt :refer [as-rrbt]]) (:import (clojure.core Vec) (clojure.lang PersistentVector APersistentVector$SubVector) (clojure.core.rrb_vector.rrbt Vector))) (extend-protocol PSliceableVector Vec (slicev [v start end] (slicev (as-rrbt v) start end)) PersistentVector (slicev [v start end] (slicev (as-rrbt v) start end)) APersistentVector$SubVector (slicev [v start end] (slicev (as-rrbt v) start end))) (extend-protocol PSpliceableVector Vec (splicev [v1 v2] (splicev (as-rrbt v1) v2)) PersistentVector (splicev [v1 v2] (splicev (as-rrbt v1) v2)) APersistentVector$SubVector (splicev [v1 v2] (splicev (as-rrbt v1) v2))) ================================================ FILE: src/parameterized/clojure/clojure/core/rrb_vector/nodes.clj ================================================ (ns clojure.core.rrb-vector.nodes (:require [clojure.core.rrb-vector.parameters :as p]) (:import (clojure.core VecNode ArrayManager) (clojure.lang PersistentVector PersistentVector$Node) (java.util.concurrent.atomic AtomicReference))) (set! *warn-on-reflection* true) (set! *unchecked-math* true) ;; :warn-on-boxed ;;; array managers (defmacro mk-am [t] (#'clojure.core/mk-am &env &form t)) (definline object [x] x) (def ams (assoc @#'clojure.core/ams :object (mk-am object))) (def object-am (ams :object)) ;;; empty nodes ;; The checking-* functions for the parameterized version expect an ;; empty node with at most p/max-branches children. The Clojure ;; versions have 32, unless one also parameterizes the Clojure code, ;; which I have not done. I will instead try changing the definition ;; of these empty nodes to have the branch factor of the parameterized ;; version, so the checking-* functions will not give errors for them. (def NOEDIT (java.util.concurrent.atomic.AtomicReference. nil)) (def empty-pv-node (PersistentVector$Node. NOEDIT (object-array p/max-branches))) (def empty-gvec-node (VecNode. nil (object-array p/max-branches))) ;;; node managers (definterface NodeManager (node [^java.util.concurrent.atomic.AtomicReference edit arr]) (empty []) (array [node]) (^java.util.concurrent.atomic.AtomicReference edit [node]) (^boolean regular [node]) (clone [^clojure.core.ArrayManager am ^int shift node])) (def object-nm (reify NodeManager (node [_ edit arr] (PersistentVector$Node. edit arr)) (empty [_] empty-pv-node) (array [_ node] (.-array ^PersistentVector$Node node)) (edit [_ node] (.-edit ^PersistentVector$Node node)) (regular [_ node] (not (== (alength ^objects (.-array ^PersistentVector$Node node)) (int p/non-regular-array-len)))) (clone [_ am shift node] (PersistentVector$Node. (.-edit ^PersistentVector$Node node) (aclone ^objects (.-array ^PersistentVector$Node node)))))) (def primitive-nm (reify NodeManager (node [_ edit arr] (VecNode. edit arr)) (empty [_] empty-gvec-node) (array [_ node] (.-arr ^VecNode node)) (edit [_ node] (.-edit ^VecNode node)) (regular [_ node] (not (== (alength ^objects (.-arr ^VecNode node)) (int p/non-regular-array-len)))) (clone [_ am shift node] (if (zero? shift) (VecNode. (.-edit ^VecNode node) (.aclone am (.-arr ^VecNode node))) (VecNode. (.-edit ^VecNode node) (aclone ^objects (.-arr ^VecNode node))))))) ;;; ranges (defmacro ranges [nm node] `(ints (aget ~(with-meta `(.array ~nm ~node) {:tag 'objects}) p/max-branches))) (defn last-range [^NodeManager nm node] (let [rngs (ranges nm node) i (unchecked-dec-int (aget rngs p/max-branches))] (aget rngs i))) (defn regular-ranges [shift cnt] (let [step (bit-shift-left (int 1) (int shift)) rngs (int-array p/non-regular-array-len)] (loop [i (int 0) r step] (if (< r cnt) (do (aset rngs i r) (recur (unchecked-inc-int i) (unchecked-add-int r step))) (do (aset rngs i (int cnt)) (aset rngs p/max-branches (unchecked-inc-int i)) rngs))))) ;;; root overflow (defn overflow? [^NodeManager nm root shift cnt] (if (.regular nm root) (> (bit-shift-right (unchecked-inc-int (int cnt)) (int p/shift-increment)) (bit-shift-left (int 1) (int shift))) (let [rngs (ranges nm root) slc (aget rngs p/max-branches)] (and (== slc (int p/max-branches)) (or (== (int shift) (int p/shift-increment)) (recur nm (aget ^objects (.array nm root) (unchecked-dec-int slc)) (unchecked-subtract-int (int shift) (int p/shift-increment)) (unchecked-add-int (unchecked-subtract-int (aget rngs p/max-branches-minus-1) (aget rngs p/max-branches-minus-2)) (int p/max-branches)))))))) ;;; find nil / 0 (defn index-of-0 ^long [arr] (let [arr (ints arr)] (loop [l 0 h p/max-branches-minus-1] (if (>= l (unchecked-dec h)) (if (zero? (aget arr l)) l (if (zero? (aget arr h)) h p/max-branches)) (let [mid (unchecked-add l (bit-shift-right (unchecked-subtract h l) 1))] (if (zero? (aget arr mid)) (recur l mid) (recur (unchecked-inc-int mid) h))))))) (defn index-of-nil ^long [arr] (loop [l 0 h p/max-branches-minus-1] (if (>= l (unchecked-dec h)) (if (nil? (aget ^objects arr l)) l (if (nil? (aget ^objects arr h)) h p/max-branches)) (let [mid (unchecked-add l (bit-shift-right (unchecked-subtract h l) 1))] (if (nil? (aget ^objects arr mid)) (recur l mid) (recur (unchecked-inc-int mid) h)))))) ;;; children (defn first-child [^NodeManager nm node] (aget ^objects (.array nm node) 0)) (defn last-child [^NodeManager nm node] (let [arr (.array nm node)] (if (.regular nm node) (aget ^objects arr (dec (index-of-nil arr))) (aget ^objects arr (unchecked-dec-int (aget (ranges nm node) p/max-branches)))))) (defn remove-leftmost-child [^NodeManager nm shift parent] (let [arr (.array nm parent)] (if (nil? (aget ^objects arr 1)) nil (let [regular? (.regular nm parent) new-arr (object-array (if regular? p/max-branches p/non-regular-array-len))] (System/arraycopy arr 1 new-arr 0 p/max-branches-minus-1) (if-not regular? (let [rngs (ranges nm parent) rng0 (aget rngs 0) new-rngs (int-array p/non-regular-array-len) lim (aget rngs p/max-branches)] (System/arraycopy rngs 1 new-rngs 0 (dec lim)) (loop [i 0] (when (< i lim) (aset new-rngs i (- (aget new-rngs i) rng0)) (recur (inc i)))) (aset new-rngs p/max-branches (dec (aget rngs p/max-branches))) (aset new-rngs (dec (aget rngs p/max-branches)) (int 0)) (aset ^objects new-arr p/max-branches new-rngs))) (.node nm (.edit nm parent) new-arr))))) (defn replace-leftmost-child [^NodeManager nm shift parent pcnt child d] (if (.regular nm parent) (let [step (bit-shift-left 1 shift) rng0 (- step d) ncnt (- pcnt d) li (bit-and (bit-shift-right shift (dec pcnt)) p/branch-mask) arr (.array nm parent) new-arr (object-array p/non-regular-array-len) new-rngs (int-array p/non-regular-array-len)] (aset ^objects new-arr 0 child) (System/arraycopy arr 1 new-arr 1 li) (aset ^objects new-arr p/max-branches new-rngs) (aset new-rngs 0 (int rng0)) (aset new-rngs li (int ncnt)) (aset new-rngs p/max-branches (int (inc li))) (loop [i 1] (when (<= i li) (aset new-rngs i (+ (aget new-rngs (dec i)) step)) (recur (inc i)))) (.node nm nil new-arr)) (let [new-arr (aclone ^objects (.array nm parent)) rngs (ranges nm parent) new-rngs (int-array p/non-regular-array-len) li (dec (aget rngs p/max-branches))] (aset new-rngs p/max-branches (aget rngs p/max-branches)) (aset ^objects new-arr p/max-branches new-rngs) (aset ^objects new-arr 0 child) (loop [i 0] (when (<= i li) (aset new-rngs i (- (aget rngs i) (int d))) (recur (inc i)))) (.node nm nil new-arr)))) (defn replace-rightmost-child [^NodeManager nm shift parent child d] (if (.regular nm parent) (let [arr (.array nm parent) i (unchecked-dec (index-of-nil arr))] (if (.regular nm child) (let [new-arr (aclone ^objects arr)] (aset ^objects new-arr i child) (.node nm nil new-arr)) (let [arr (.array nm parent) new-arr (object-array p/non-regular-array-len) step (bit-shift-left 1 shift) rngs (int-array p/non-regular-array-len)] (aset rngs p/max-branches (inc i)) (aset ^objects new-arr p/max-branches rngs) (System/arraycopy arr 0 new-arr 0 i) (aset ^objects new-arr i child) (loop [j 0 r step] (when (<= j i) (aset rngs j r) (recur (inc j) (+ r step)))) (aset rngs i (int (last-range nm child))) (.node nm nil new-arr)))) (let [rngs (ranges nm parent) new-rngs (aclone rngs) i (dec (aget rngs p/max-branches)) new-arr (aclone ^objects (.array nm parent))] (aset ^objects new-arr i child) (aset ^objects new-arr p/max-branches new-rngs) (aset new-rngs i (int (+ (aget rngs i) d))) (.node nm nil new-arr)))) ;;; fold-tail (defn new-path [^NodeManager nm ^ArrayManager am shift node] (let [reg? (== p/max-branches (.alength am (.array nm node))) len (if reg? p/max-branches p/non-regular-array-len) arr (object-array len) rngs (if-not reg? (doto (int-array p/non-regular-array-len) (aset 0 (.alength am (.array nm node))) (aset p/max-branches 1))) ret (.node nm nil arr)] (loop [arr arr shift shift] (if (== shift p/shift-increment) (do (if-not reg? (aset arr p/max-branches rngs)) (aset arr 0 node)) (let [a (object-array len) e (.node nm nil a)] (aset arr 0 e) (if-not reg? (aset arr p/max-branches rngs)) (recur a (- shift p/shift-increment))))) ret)) (defn fold-tail [^NodeManager nm ^ArrayManager am node shift cnt tail] (let [tlen (.alength am tail) reg? (and (.regular nm node) (== tlen p/max-branches)) arr (.array nm node) li (index-of-nil arr) new-arr (object-array (if reg? p/max-branches p/non-regular-array-len)) rngs (if-not (.regular nm node) (ranges nm node)) cret (if (== shift p/shift-increment) (.node nm nil tail) (fold-tail nm am (aget ^objects arr (dec li)) (- shift p/shift-increment) (if (.regular nm node) (mod cnt (bit-shift-left 1 shift)) (let [li (unchecked-dec-int (aget rngs p/max-branches))] (if (pos? li) (unchecked-subtract-int (aget rngs li) (aget rngs (unchecked-dec-int li))) (aget rngs 0)))) tail)) new-rngs (ints (if-not reg? (if rngs (aclone rngs) (regular-ranges shift cnt))))] (when-not (and (or (nil? cret) (== shift p/shift-increment)) (== li p/max-branches)) (System/arraycopy arr 0 new-arr 0 li) (when-not reg? (if (or (nil? cret) (== shift p/shift-increment)) (do (aset new-rngs li (+ (if (pos? li) (aget new-rngs (dec li)) (int 0)) tlen)) (aset new-rngs p/max-branches (inc li))) (do (when (pos? li) (aset new-rngs (dec li) (+ (aget new-rngs (dec li)) tlen))) (aset new-rngs p/max-branches li)))) (if-not reg? (aset new-arr p/max-branches new-rngs)) (if (nil? cret) (aset new-arr li (new-path nm am (unchecked-subtract-int shift p/shift-increment) (.node nm nil tail))) (aset new-arr (if (== shift p/shift-increment) li (dec li)) cret)) (.node nm nil new-arr)))) ================================================ FILE: src/parameterized/clojure/clojure/core/rrb_vector/parameters.clj ================================================ (ns clojure.core.rrb-vector.parameters) ;; The values in comments before each def are the value of that ;; parameter: ;; * when the shift-increment is 5 ;; * when the shift-increment is 3 ;; * when the shift-increment is 2 ;; 5 3 2 (def shift-increment 5) ;; 10 6 4 (def shift-increment-times-2 (* 2 shift-increment)) ;; 32 8 4 (def max-branches (bit-shift-left 1 shift-increment)) ;; 0x1f 0x7 0x3 (def branch-mask (dec max-branches)) ;; 31 7 3 (def max-branches-minus-1 (dec max-branches)) ;; 30 6 2 (def max-branches-minus-2 (- max-branches 2)) ;; 33 9 5 (def non-regular-array-len (inc max-branches)) ;; 1024 64 16 (def max-branches-squared (* max-branches max-branches)) ================================================ FILE: src/parameterized/clojure/clojure/core/rrb_vector/protocols.clj ================================================ (ns clojure.core.rrb-vector.protocols) (defprotocol PSpliceableVector (splicev [v1 v2])) (defprotocol PSliceableVector (slicev [v start end])) (defprotocol PTransientDebugAccess (debugGetRoot [v]) (debugGetShift [v]) (debugGetTail [v]) (debugGetCnt [v])) ================================================ FILE: src/parameterized/clojure/clojure/core/rrb_vector/rrbt.clj ================================================ (ns clojure.core.rrb-vector.rrbt (:refer-clojure :exclude [assert ->VecSeq]) (:require [clojure.core.rrb-vector.parameters :as p] [clojure.core.rrb-vector.protocols :refer [PSliceableVector slicev PSpliceableVector splicev PTransientDebugAccess]] [clojure.core.rrb-vector.nodes :refer [ranges overflow? last-range regular-ranges first-child last-child remove-leftmost-child replace-leftmost-child replace-rightmost-child fold-tail new-path index-of-nil object-am object-nm primitive-nm]] [clojure.core.rrb-vector.transients :refer [transient-helper]] [clojure.core.rrb-vector.fork-join :as fj] [clojure.core.protocols :refer [IKVReduce]] [clojure.core.reducers :as r :refer [CollFold coll-fold]]) (:import (clojure.core ArrayManager Vec ArrayChunk) (clojure.lang RT Util Box PersistentVector APersistentVector$SubVector) (clojure.core.rrb_vector.nodes NodeManager) (java.util.concurrent.atomic AtomicReference))) (set! *warn-on-reflection* true) (set! *unchecked-math* true) ;; :warn-on-boxed (def ^:const rrbt-concat-threshold (inc p/max-branches)) (def ^:const max-extra-search-steps 2) (def ^:const elide-assertions? true) (def ^:const elide-debug-printouts? true) (defmacro assert [& args] (if-not elide-assertions? (apply #'clojure.core/assert &form &env args))) (defmacro dbg [& args] (if-not elide-debug-printouts? `(prn ~@args))) (defmacro dbg- [& args]) (defn throw-unsupported [] (throw (UnsupportedOperationException.))) (defmacro compile-if [test then else] (if (eval test) then else)) (defmacro ^:private caching-hash [coll hash-fn hash-key] `(let [h# ~hash-key] (if-not (== h# 0) h# (let [h# (~hash-fn ~coll)] (set! ~hash-key (int h#)) h#)))) (defn ^:private hash-gvec-seq [xs] (let [cnt (count xs)] (loop [h (int 1) xs (seq xs)] (if xs (let [x (first xs)] (recur (unchecked-add-int (unchecked-multiply-int 31 h) (clojure.lang.Util/hash x)) (next xs))) h)))) (definterface IVecImpl (^int tailoff []) (arrayFor [^int i]) (pushTail [^int shift ^int cnt parent tailnode]) (popTail [^int shift ^int cnt node]) (newPath [^java.util.concurrent.atomic.AtomicReference edit ^int shift node]) (doAssoc [^int shift node ^int i val])) (deftype VecSeq [^ArrayManager am ^IVecImpl vec anode ^int i ^int offset ^clojure.lang.IPersistentMap _meta ^:unsynchronized-mutable ^int _hash ^:unsynchronized-mutable ^int _hasheq] clojure.core.protocols/InternalReduce (internal-reduce [_ f val] (loop [result val aidx i off offset] (if (< aidx (count vec)) (let [node (.arrayFor vec aidx) alen (.alength am node) result (loop [result result node-idx off] (if (< node-idx alen) (let [result (f result (.aget am node node-idx))] (if (reduced? result) result (recur result (inc node-idx)))) result))] (if (reduced? result) @result (recur result (+ aidx alen) 0))) result))) Object (toString [this] (pr-str this)) (hashCode [this] (caching-hash this hash-gvec-seq _hash)) (equals [this that] (cond (identical? this that) true (not (or (sequential? that) (instance? java.util.List that))) false :else (loop [xs this ys (seq that)] (if xs (if ys (if (clojure.lang.Util/equals (first xs) (first ys)) (recur (next xs) (next ys)) false) false) (nil? ys))))) clojure.lang.IHashEq (hasheq [this] (let [h _hasheq] (if (== h 0) (compile-if (resolve 'clojure.core/hash-ordered-coll) (let [h (hash-ordered-coll this)] (do (set! _hasheq (int h)) h)) (loop [h (int 1) xs (seq this)] (if xs (recur (unchecked-add-int (unchecked-multiply-int (int 31) h) (Util/hasheq (first xs))) (next xs)) (do (set! _hasheq (int h)) h)))) h))) clojure.lang.IMeta (meta [this] _meta) clojure.lang.IObj (withMeta [this m] (VecSeq. am vec anode i offset m _hash _hasheq)) clojure.lang.Counted (count [this] (unchecked-subtract-int (unchecked-subtract-int (count vec) i) offset)) clojure.lang.ISeq (first [_] (.aget am anode offset)) (next [this] (if (< (inc offset) (.alength am anode)) (VecSeq. am vec anode i (inc offset) nil 0 0) (.chunkedNext this))) (more [this] (let [s (.next this)] (or s (clojure.lang.PersistentList/EMPTY)))) (cons [this o] (clojure.lang.Cons. o this)) (equiv [this o] (cond (identical? this o) true (or (instance? clojure.lang.Sequential o) (instance? java.util.List o)) (loop [me this you (seq o)] (if (nil? me) (nil? you) (and (clojure.lang.Util/equiv (first me) (first you)) (recur (next me) (next you))))) :else false)) (empty [_] clojure.lang.PersistentList/EMPTY) clojure.lang.Seqable (seq [this] this) clojure.lang.IChunkedSeq (chunkedFirst [_] (ArrayChunk. am anode offset (.alength am anode))) (chunkedNext [_] (let [nexti (+ i (.alength am anode))] (when (< nexti (count vec)) (VecSeq. am vec (.arrayFor vec nexti) nexti 0 nil 0 0)))) (chunkedMore [this] (let [s (.chunkedNext this)] (or s (clojure.lang.PersistentList/EMPTY)))) java.lang.Iterable (iterator [this] (let [xs (clojure.lang.Box. (seq this))] (reify java.util.Iterator (next [this] (locking xs (if-let [v (.-val xs)] (let [x (first v)] (set! (.-val xs) (next v)) x) (throw (java.util.NoSuchElementException. "no more elements in VecSeq iterator"))))) (hasNext [this] (locking xs (not (nil? (.-val xs))))) (remove [this] (throw-unsupported))))) java.io.Serializable java.util.Collection (contains [this o] (boolean (some #(= % o) this))) (containsAll [this c] (every? #(.contains this %) c)) (isEmpty [this] (zero? (count this))) (toArray [this] (into-array Object this)) (^"[Ljava.lang.Object;" toArray [this ^"[Ljava.lang.Object;" arr] (let [cnt (count this)] (if (>= (count arr) cnt) (do (dotimes [i cnt] (aset arr i (nth vec i))) arr) (into-array Object this)))) (size [this] (count this)) (add [_ o] (throw-unsupported)) (addAll [_ c] (throw-unsupported)) (clear [_] (throw-unsupported)) (^boolean remove [_ o] (throw-unsupported)) (removeAll [_ c] (throw-unsupported)) (retainAll [_ c] (throw-unsupported)) java.util.List (get [this i] (nth this i)) (indexOf [this o] (loop [xs (seq this) i 0] (if xs (let [x (first xs)] (if (= o x) i (recur (next xs) (unchecked-inc-int i)))) -1))) (lastIndexOf [this o] (loop [xs (rseq vec) l (unchecked-dec-int (- (count vec) i))] (cond (neg? l) -1 (= o (first xs)) l :else (recur (next xs) (unchecked-dec-int l))))) (listIterator [this] (.listIterator this 0)) (listIterator [this n] (let [n (java.util.concurrent.atomic.AtomicInteger. n)] (reify java.util.ListIterator (hasNext [_] (< (.get n) (count this))) (hasPrevious [_] (pos? n)) (next [_] (try (nth vec (unchecked-add-int i (unchecked-add-int offset (unchecked-dec-int (.incrementAndGet n))))) (catch IndexOutOfBoundsException e (throw (java.util.NoSuchElementException. "no more elements in VecSeq list iterator"))))) (nextIndex [_] (.get n)) (previous [_] (nth vec (unchecked-add i (unchecked-add offset (.decrementAndGet n))))) (previousIndex [_] (unchecked-dec-int (.get n))) (add [_ e] (throw-unsupported)) (remove [_] (throw-unsupported)) (set [_ e] (throw-unsupported))))) (subList [this a z] (seq (slicev vec (unchecked-add (unchecked-add i offset) a) (unchecked-add (unchecked-add i offset) z)))) (add [_ i o] (throw-unsupported)) (addAll [_ i c] (throw-unsupported)) (^Object remove [_ ^int i] (throw-unsupported)) (set [_ i e] (throw-unsupported))) (defprotocol AsRRBT (as-rrbt [v])) (defn slice-right [^NodeManager nm ^ArrayManager am node shift end] (let [shift (int shift) end (int end)] (if (zero? shift) ;; potentially return a short node, although it would be better to ;; make sure a regular leaf is always left at the right, with any ;; items over the final p/max-branches moved into tail (and then potentially ;; back into the tree should the tail become too long...) (let [arr (.array nm node) new-arr (.array am end)] (System/arraycopy arr 0 new-arr 0 end) (.node nm nil new-arr)) (let [regular? (.regular nm node) rngs (if-not regular? (ranges nm node)) i (bit-and (bit-shift-right (unchecked-dec-int end) shift) (int p/branch-mask)) i (if regular? i (loop [j i] (if (<= end (aget rngs j)) j (recur (unchecked-inc-int j))))) child-end (if regular? (let [ce (unchecked-remainder-int end (bit-shift-left (int 1) shift))] (if (zero? ce) (bit-shift-left (int 1) shift) ce)) (if (pos? i) (unchecked-subtract-int end (aget rngs (unchecked-dec-int i))) end)) arr (.array nm node) new-child (slice-right nm am (aget ^objects arr i) (unchecked-subtract-int shift (int p/shift-increment)) child-end) regular-child? (if (== shift (int p/shift-increment)) (== (int p/max-branches) (.alength am (.array nm new-child))) (.regular nm new-child)) new-arr (object-array (if (and regular? regular-child?) p/max-branches p/non-regular-array-len)) new-child-rng (if regular-child? (let [m (mod child-end (bit-shift-left 1 shift))] (if (zero? m) (bit-shift-left 1 shift) m)) (if (== shift (int p/shift-increment)) (.alength am (.array nm new-child)) (last-range nm new-child)))] (System/arraycopy arr 0 new-arr 0 i) (aset ^objects new-arr i new-child) (if-not (and regular? regular-child?) (let [new-rngs (int-array p/non-regular-array-len) step (bit-shift-left (int 1) shift)] (if regular? (dotimes [j i] (aset new-rngs j (unchecked-multiply-int (inc j) step))) (dotimes [j i] (aset new-rngs j (aget rngs j)))) (aset new-rngs i (unchecked-add-int (if (pos? i) (aget new-rngs (unchecked-dec-int i)) (int 0)) new-child-rng)) (aset new-rngs p/max-branches (unchecked-inc-int i)) (aset new-arr p/max-branches new-rngs))) (.node nm nil new-arr))))) (defn slice-left [^NodeManager nm ^ArrayManager am node shift start end] (let [shift (int shift) start (int start) end (int end)] (if (zero? shift) ;; potentially return a short node (let [arr (.array nm node) new-len (unchecked-subtract-int (.alength am arr) start) new-arr (.array am new-len)] (System/arraycopy arr start new-arr 0 new-len) (.node nm nil new-arr)) (let [regular? (.regular nm node) arr (.array nm node) rngs (if-not regular? (ranges nm node)) i (bit-and (bit-shift-right start shift) (int p/branch-mask)) i (if regular? i (loop [j i] (if (< start (aget rngs j)) j (recur (unchecked-inc-int j))))) len (if regular? (loop [i i] (if (or (== i (int p/max-branches)) (nil? (aget ^objects arr i))) i (recur (unchecked-inc-int i)))) (aget rngs p/max-branches)) child-start (if (pos? i) (unchecked-subtract-int start (if regular? (unchecked-multiply-int i (bit-shift-left (int 1) shift)) (aget rngs (unchecked-dec-int i)))) start) child-end (int (if regular? (min (bit-shift-left (int 1) shift) (if (pos? i) (unchecked-subtract-int end (unchecked-multiply-int i (bit-shift-left (int 1) shift))) end)) (let [capped-end (min (aget rngs i) end)] (if (pos? i) (unchecked-subtract-int capped-end (aget rngs (unchecked-dec-int i))) capped-end)))) new-child (slice-left nm am (aget ^objects arr i) (unchecked-subtract-int shift (int p/shift-increment)) child-start child-end) new-len (unchecked-subtract-int len i) new-len (if (nil? new-child) (unchecked-dec-int new-len) new-len)] (cond (zero? new-len) nil regular? (let [new-arr (object-array p/non-regular-array-len) rngs (int-array p/non-regular-array-len) rng0 (if (or (nil? new-child) (== shift (int p/shift-increment)) (.regular nm new-child)) (unchecked-subtract-int (bit-shift-left (int 1) shift) (bit-and (bit-shift-right start (unchecked-subtract-int shift (int p/shift-increment))) (int p/branch-mask))) (int (last-range nm new-child))) step (bit-shift-left (int 1) shift)] (loop [j (int 0) r rng0] (when (< j new-len) (aset rngs j r) (recur (unchecked-inc-int j) (unchecked-add-int r step)))) (when (> new-len 1) (aset rngs (dec new-len) (- end start))) (aset rngs p/max-branches new-len) (System/arraycopy arr (if (nil? new-child) (unchecked-inc-int i) i) new-arr 0 new-len) (if-not (nil? new-child) (aset new-arr 0 new-child)) (aset new-arr p/max-branches rngs) (.node nm (.edit nm node) new-arr)) :else (let [new-arr (object-array p/non-regular-array-len) new-rngs (int-array p/non-regular-array-len)] (loop [j (int 0) i i] (when (< j new-len) (aset new-rngs j (unchecked-subtract-int (aget rngs i) start)) (recur (unchecked-inc-int j) (unchecked-inc-int i)))) (aset new-rngs p/max-branches new-len) (System/arraycopy arr (if (nil? new-child) (unchecked-inc-int i) i) new-arr 0 new-len) (if-not (nil? new-child) (aset new-arr 0 new-child)) (aset new-arr p/max-branches new-rngs) (.node nm (.edit nm node) new-arr))))))) (declare splice-rrbts ->Transient) (deftype Vector [^NodeManager nm ^ArrayManager am ^int cnt ^int shift root tail ^clojure.lang.IPersistentMap _meta ^:unsynchronized-mutable ^int _hash ^:unsynchronized-mutable ^int _hasheq] Object (equals [this that] (cond (identical? this that) true (or (instance? clojure.lang.IPersistentVector that) (instance? java.util.RandomAccess that)) (and (== cnt (count that)) (loop [i (int 0)] (cond (== i cnt) true (.equals (.nth this i) (nth that i)) (recur (unchecked-inc-int i)) :else false))) (or (instance? clojure.lang.Sequential that) (instance? java.util.List that)) (.equals (seq this) (seq that)) :else false)) (hashCode [this] (let [h _hash] (if (== h 0) (loop [h (int 1) i (int 0)] (if (== i cnt) (do (set! _hash (int h)) h) (let [val (.nth this i)] (recur (unchecked-add-int (unchecked-multiply-int (int 31) h) (Util/hash val)) (unchecked-inc-int i))))) h))) (toString [this] (pr-str this)) clojure.lang.IHashEq (hasheq [this] (let [h _hasheq] (if (== h 0) (compile-if (resolve 'clojure.core/hash-ordered-coll) (let [h (hash-ordered-coll this)] (do (set! _hasheq (int h)) h)) (loop [h (int 1) xs (seq this)] (if xs (recur (unchecked-add-int (unchecked-multiply-int (int 31) h) (Util/hasheq (first xs))) (next xs)) (do (set! _hasheq (int h)) h)))) h))) clojure.lang.Counted (count [_] cnt) clojure.lang.IMeta (meta [_] _meta) clojure.lang.IObj (withMeta [_ m] (Vector. nm am cnt shift root tail m _hash _hasheq)) clojure.lang.Indexed (nth [this i] (if (and (<= (int 0) i) (< i cnt)) (let [tail-off (unchecked-subtract-int cnt (.alength am tail))] (if (<= tail-off i) (.aget am tail (unchecked-subtract-int i tail-off)) (loop [i i node root shift shift] (if (zero? shift) (let [arr (.array nm node)] (.aget am arr (bit-and (bit-shift-right i shift) (int p/branch-mask)))) (if (.regular nm node) (let [arr (.array nm node) idx (bit-and (bit-shift-right i shift) (int p/branch-mask))] (loop [i i node (aget ^objects arr idx) shift (unchecked-subtract-int shift (int p/shift-increment))] (let [arr (.array nm node) idx (bit-and (bit-shift-right i shift) (int p/branch-mask))] (if (zero? shift) (.aget am arr idx) (recur i (aget ^objects arr idx) (unchecked-subtract-int shift (int p/shift-increment))))))) (let [arr (.array nm node) rngs (ranges nm node) idx (loop [j (bit-and (bit-shift-right i shift) (int p/branch-mask))] (if (< i (aget rngs j)) j (recur (unchecked-inc-int j)))) i (if (zero? idx) (int i) (unchecked-subtract-int (int i) (aget rngs (unchecked-dec-int idx))))] (recur i (aget ^objects arr idx) (unchecked-subtract-int shift (int p/shift-increment))))))))) (throw (IndexOutOfBoundsException.)))) (nth [this i not-found] (if (and (>= i (int 0)) (< i cnt)) (.nth this i) not-found)) clojure.lang.IPersistentCollection (cons [this val] (if (< (.alength am tail) (int p/max-branches)) (let [tail-len (.alength am tail) new-tail (.array am (unchecked-inc-int tail-len))] (System/arraycopy tail 0 new-tail 0 tail-len) (.aset am new-tail tail-len val) (Vector. nm am (unchecked-inc-int cnt) shift root new-tail _meta 0 0)) (let [tail-node (.node nm (.edit nm root) tail) new-tail (let [new-arr (.array am 1)] (.aset am new-arr 0 val) new-arr)] (if (overflow? nm root shift cnt) (if (.regular nm root) (let [new-arr (object-array p/max-branches) new-root (.node nm (.edit nm root) new-arr)] (doto new-arr (aset (int 0) root) (aset (int 1) (.newPath this (.edit nm root) shift tail-node))) (Vector. nm am (unchecked-inc-int cnt) (unchecked-add-int shift (int p/shift-increment)) new-root new-tail _meta 0 0)) (let [new-arr (object-array p/non-regular-array-len) new-rngs (ints (int-array p/non-regular-array-len)) new-root (.node nm (.edit nm root) new-arr) root-total-range (aget (ranges nm root) (int p/max-branches-minus-1))] (doto new-arr (aset (int 0) root) (aset (int 1) (.newPath this (.edit nm root) shift tail-node)) (aset (int p/max-branches) new-rngs)) (doto new-rngs (aset (int 0) root-total-range) (aset (int 1) (unchecked-add-int root-total-range (int p/max-branches))) (aset (int p/max-branches) (int 2))) (Vector. nm am (unchecked-inc-int cnt) (unchecked-add-int shift (int p/shift-increment)) new-root new-tail _meta 0 0))) (Vector. nm am (unchecked-inc-int cnt) shift (.pushTail this shift cnt root tail-node) new-tail _meta 0 0))))) (empty [_] (Vector. nm am 0 p/shift-increment (.empty nm) (.array am 0) _meta 0 0)) (equiv [this that] (cond (or (instance? clojure.lang.IPersistentVector that) (instance? java.util.RandomAccess that)) (and (== cnt (count that)) (loop [i (int 0)] (cond (== i cnt) true (= (.nth this i) (nth that i)) (recur (unchecked-inc-int i)) :else false))) (or (instance? clojure.lang.Sequential that) (instance? java.util.List that)) (Util/equiv (seq this) (seq that)) :else false)) clojure.lang.IPersistentStack (peek [this] (when (pos? cnt) (.nth this (unchecked-dec-int cnt)))) (pop [this] (cond (zero? cnt) (throw (IllegalStateException. "Can't pop empty vector")) (== 1 cnt) (Vector. nm am 0 p/shift-increment (.empty nm) (.array am 0) _meta 0 0) (> (.alength am tail) (int 1)) (let [new-tail (.array am (unchecked-dec-int (.alength am tail)))] (System/arraycopy tail 0 new-tail 0 (.alength am new-tail)) (Vector. nm am (unchecked-dec-int cnt) shift root new-tail _meta 0 0)) :else (let [new-tail (.arrayFor this (unchecked-subtract-int cnt (int 2))) root-cnt (.tailoff this) new-root (.popTail this shift root-cnt root)] (cond (nil? new-root) (Vector. nm am (unchecked-dec-int cnt) shift (.empty nm) new-tail _meta 0 0) (and (> shift (int p/shift-increment)) (nil? (aget ^objects (.array nm new-root) 1))) (Vector. nm am (unchecked-dec-int cnt) (unchecked-subtract-int shift (int p/shift-increment)) (aget ^objects (.array nm new-root) 0) new-tail _meta 0 0) :else (Vector. nm am (unchecked-dec-int cnt) shift new-root new-tail _meta 0 0))))) clojure.lang.IPersistentVector (assocN [this i val] (cond (and (<= (int 0) i) (< i cnt)) (let [tail-off (.tailoff this)] (if (>= i tail-off) (let [new-tail (.array am (.alength am tail)) idx (unchecked-subtract-int i tail-off)] (System/arraycopy tail 0 new-tail 0 (.alength am tail)) (.aset am new-tail idx val) (Vector. nm am cnt shift root new-tail _meta 0 0)) (Vector. nm am cnt shift (.doAssoc this shift root i val) tail _meta 0 0))) (== i cnt) (.cons this val) :else (throw (IndexOutOfBoundsException.)))) (length [this] (.count this)) clojure.lang.Reversible (rseq [this] (if (pos? cnt) (clojure.lang.APersistentVector$RSeq. this (unchecked-dec-int cnt)) nil)) clojure.lang.Associative (assoc [this k v] (if (Util/isInteger k) (.assocN this k v) (throw (IllegalArgumentException. "Key must be integer")))) (containsKey [this k] (and (Util/isInteger k) (<= (int 0) (int k)) (< (int k) cnt))) (entryAt [this k] (if (.containsKey this k) (clojure.lang.MapEntry. k (.nth this (int k))) nil)) clojure.lang.ILookup (valAt [this k not-found] (if (Util/isInteger k) (let [i (int k)] (if (and (>= i (int 0)) (< i cnt)) (.nth this i) not-found)) not-found)) (valAt [this k] (.valAt this k nil)) clojure.lang.IFn (invoke [this k] (if (Util/isInteger k) (let [i (int k)] (if (and (>= i (int 0)) (< i cnt)) (.nth this i) (throw (IndexOutOfBoundsException.)))) (throw (IllegalArgumentException. "Key must be integer")))) (applyTo [this args] (let [n (RT/boundedLength args 1)] (case n 0 (throw (clojure.lang.ArityException. n (.. this (getClass) (getSimpleName)))) 1 (.invoke this (first args)) 2 (throw (clojure.lang.ArityException. n (.. this (getClass) (getSimpleName))))))) clojure.lang.Seqable (seq [this] (if (zero? cnt) nil (VecSeq. am this (.arrayFor this 0) 0 0 nil 0 0))) clojure.lang.Sequential clojure.lang.IEditableCollection (asTransient [this] (->Transient nm am (identical? am object-am) cnt shift (.editableRoot transient-helper nm am root) (.editableTail transient-helper am tail) (.alength am tail))) IVecImpl (tailoff [_] (unchecked-subtract-int cnt (.alength am tail))) (arrayFor [this i] (if (and (<= (int 0) i) (< i cnt)) (if (>= i (.tailoff this)) tail (loop [i (int i) node root shift shift] (if (zero? shift) (.array nm node) (if (.regular nm node) (loop [node (aget ^objects (.array nm node) (bit-and (bit-shift-right i shift) (int p/branch-mask))) shift (unchecked-subtract-int shift (int p/shift-increment))] (if (zero? shift) (.array nm node) (recur (aget ^objects (.array nm node) (bit-and (bit-shift-right i shift) (int p/branch-mask))) (unchecked-subtract-int shift (int p/shift-increment))))) (let [rngs (ranges nm node) j (loop [j (bit-and (bit-shift-right i shift) (int p/branch-mask))] (if (< i (aget rngs j)) j (recur (unchecked-inc-int j)))) i (if (pos? j) (unchecked-subtract-int i (aget rngs (unchecked-dec-int j))) i)] (recur (int i) (aget ^objects (.array nm node) j) (unchecked-subtract-int shift (int p/shift-increment)))))))) (throw (IndexOutOfBoundsException.)))) (pushTail [this shift cnt node tail-node] (if (.regular nm node) (let [arr (aclone ^objects (.array nm node)) ret (.node nm (.edit nm node) arr)] (loop [node ret shift (int shift)] (let [arr (.array nm node) subidx (bit-and (bit-shift-right (unchecked-dec-int cnt) shift) (int p/branch-mask))] (if (== shift (int p/shift-increment)) (aset ^objects arr subidx tail-node) (if-let [child (aget ^objects arr subidx)] (let [new-carr (aclone ^objects (.array nm child)) new-child (.node nm (.edit nm root) new-carr)] (aset ^objects arr subidx new-child) (recur new-child (unchecked-subtract-int shift (int p/shift-increment)))) (aset ^objects arr subidx (.newPath this (.edit nm root) (unchecked-subtract-int shift (int p/shift-increment)) tail-node)))))) ret) (let [arr (aclone ^objects (.array nm node)) rngs (ranges nm node) li (unchecked-dec-int (aget rngs p/max-branches)) ret (.node nm (.edit nm node) arr) cret (if (== shift (int p/shift-increment)) nil (let [child (aget ^objects arr li) ccnt (unchecked-add-int (int (if (pos? li) (unchecked-subtract-int (aget rngs li) (aget rngs (unchecked-dec-int li))) (aget rngs 0))) ;; add p/max-branches elems to account for the new ;; full tail we plan to add to the ;; subtree. (int p/max-branches))] ;; See Note 2 in file transients.clj (if-not (overflow? nm child (unchecked-subtract-int shift (int p/shift-increment)) ccnt) (.pushTail this (unchecked-subtract-int shift (int p/shift-increment)) ccnt (aget ^objects arr li) tail-node))))] (if cret (do (aset ^objects arr li cret) (aset rngs li (unchecked-add-int (aget rngs li) (int p/max-branches))) ret) (do (when (>= li p/max-branches-minus-1) ;; See Note 1 in file transients.clj (let [msg (str "Assigning index " (inc li) " of vector" " object array to become a node, when that" " index should only be used for storing" " range arrays.") data {:shift shift, :cnt cnt, :node node, :tail-node tail-node, :rngs rngs, :li li, :cret cret}] (throw (ex-info msg data)))) (aset ^objects arr (unchecked-inc-int li) (.newPath this (.edit nm root) (unchecked-subtract-int shift (int p/shift-increment)) tail-node)) (aset rngs (unchecked-inc-int li) (unchecked-add-int (aget rngs li) (int p/max-branches))) (aset rngs p/max-branches (unchecked-inc-int (aget rngs p/max-branches))) ret))))) (popTail [this shift cnt node] (if (.regular nm node) (let [subidx (bit-and (bit-shift-right (unchecked-subtract-int cnt (int 2)) (int shift)) (int p/branch-mask))] (cond (> (int shift) (int p/shift-increment)) (let [new-child (.popTail this (unchecked-subtract-int (int shift) (int p/shift-increment)) cnt (aget ^objects (.array nm node) subidx))] (if (and (nil? new-child) (zero? subidx)) nil (let [arr (aclone ^objects (.array nm node))] (aset arr subidx new-child) (.node nm (.edit nm root) arr)))) (zero? subidx) nil :else (let [arr (aclone ^objects (.array nm node))] (aset arr subidx nil) (.node nm (.edit nm root) arr)))) (let [rngs (ranges nm node) subidx (unchecked-dec-int (aget rngs p/max-branches)) new-rngs (aclone rngs)] (cond (> (int shift) (int p/shift-increment)) (let [child (aget ^objects (.array nm node) subidx) child-cnt (if (zero? subidx) (aget rngs 0) (unchecked-subtract-int (aget rngs subidx) (aget rngs (unchecked-dec-int subidx)))) new-child (.popTail this (unchecked-subtract-int (int shift) (int p/shift-increment)) child-cnt child)] (cond (and (nil? new-child) (zero? subidx)) nil (.regular nm child) (let [arr (aclone ^objects (.array nm node))] (aset new-rngs subidx (unchecked-subtract-int (aget new-rngs subidx) (int p/max-branches))) (aset arr subidx new-child) (aset arr (int p/max-branches) new-rngs) (if (nil? new-child) (aset new-rngs p/max-branches (unchecked-dec-int (aget new-rngs p/max-branches)))) (.node nm (.edit nm root) arr)) :else (let [rng (int (last-range nm child)) diff (unchecked-subtract-int rng (if new-child (last-range nm new-child) 0)) arr (aclone ^objects (.array nm node))] (aset new-rngs subidx (unchecked-subtract-int (aget new-rngs subidx) diff)) (aset arr subidx new-child) (aset arr (int p/max-branches) new-rngs) (if (nil? new-child) (aset new-rngs p/max-branches (unchecked-dec-int (aget new-rngs p/max-branches)))) (.node nm (.edit nm root) arr)))) (zero? subidx) nil :else (let [arr (aclone ^objects (.array nm node)) child (aget arr subidx) new-rngs (aclone rngs)] (aset arr subidx nil) (aset arr (int p/max-branches) new-rngs) (aset new-rngs subidx 0) (aset new-rngs p/max-branches (unchecked-dec-int (aget new-rngs (int p/max-branches)))) (.node nm (.edit nm root) arr)))))) (newPath [this ^AtomicReference edit ^int shift node] (if (== (.alength am tail) (int p/max-branches)) (let [shift (int shift)] (loop [s (int 0) node node] (if (== s shift) node (let [arr (object-array p/max-branches) ret (.node nm edit arr)] (aset arr 0 node) (recur (unchecked-add-int s (int p/shift-increment)) ret))))) (let [shift (int shift)] (loop [s (int 0) node node] (if (== s shift) node (let [arr (object-array p/non-regular-array-len) rngs (int-array p/non-regular-array-len) ret (.node nm edit arr)] (aset arr 0 node) (aset arr p/max-branches rngs) (aset rngs p/max-branches 1) (aset rngs 0 (.alength am tail)) (recur (unchecked-add-int s (int p/shift-increment)) ret))))))) (doAssoc [this shift node i val] (if (.regular nm node) (let [node (.clone nm am shift node)] (loop [shift (int shift) node node] (if (zero? shift) (let [arr (.array nm node)] (.aset am arr (bit-and i (int p/branch-mask)) val)) (let [arr (.array nm node) subidx (bit-and (bit-shift-right i shift) (int p/branch-mask)) next-shift (int (unchecked-subtract-int shift (int p/shift-increment))) child (.clone nm am next-shift (aget ^objects arr subidx))] (aset ^objects arr subidx child) (recur next-shift child)))) node) (let [arr (aclone ^objects (.array nm node)) rngs (ranges nm node) subidx (bit-and (bit-shift-right i shift) (int p/branch-mask)) subidx (loop [subidx subidx] (if (< i (aget rngs subidx)) subidx (recur (unchecked-inc-int subidx)))) i (if (zero? subidx) i (unchecked-subtract-int i (aget rngs (unchecked-dec-int subidx))))] (aset arr subidx (.doAssoc this (unchecked-subtract-int (int shift) (int p/shift-increment)) (aget arr subidx) i val)) (.node nm (.edit nm node) arr)))) IKVReduce (kv-reduce [this f init] (loop [i (int 0) j (int 0) init init arr (.arrayFor this i) lim (unchecked-dec-int (.alength am arr)) step (unchecked-inc-int lim)] (let [init (f init (unchecked-add-int i j) (.aget am arr j))] (if (reduced? init) @init (if (< j lim) (recur i (unchecked-inc-int j) init arr lim step) (let [i (unchecked-add-int i step)] (if (< i cnt) (let [arr (.arrayFor this i) len (.alength am arr) lim (unchecked-dec-int len)] (recur i (int 0) init arr lim len)) init))))))) CollFold ;; adapted from #'clojure.core.reducers/foldvec (coll-fold [this n combinef reducef] (let [n (int n)] (cond (zero? cnt) (combinef) (<= cnt n) (r/reduce reducef (combinef) this) :else (let [split (quot cnt 2) v1 (slicev this 0 split) v2 (slicev this split cnt) fc (fn [child] #(coll-fold child n combinef reducef))] (fj/invoke #(let [f1 (fc v1) t2 (fj/task (fc v2))] (fj/fork t2) (combinef (f1) (fj/join t2)))))))) PSliceableVector (slicev [this start end] (let [start (int start) end (int end) new-cnt (unchecked-subtract-int end start)] (cond (or (neg? start) (> end cnt)) (throw (IndexOutOfBoundsException.)) (== start end) ;; NB. preserves metadata (empty this) (> start end) (throw (IllegalStateException. "start index greater than end index")) :else (let [tail-off (.tailoff this)] (if (>= start tail-off) (let [new-tail (.array am new-cnt)] (System/arraycopy tail (unchecked-subtract-int start tail-off) new-tail 0 new-cnt) (Vector. nm am new-cnt (int p/shift-increment) (.empty nm) new-tail _meta 0 0)) (let [tail-cut? (> end tail-off) new-root (if tail-cut? root (slice-right nm am root shift end)) new-root (if (zero? start) new-root (slice-left nm am new-root shift start (min end tail-off))) new-tail (if tail-cut? (let [new-len (unchecked-subtract-int end tail-off) new-tail (.array am new-len)] (System/arraycopy tail 0 new-tail 0 new-len) new-tail) (.arrayFor (Vector. nm am new-cnt shift new-root (.array am 0) nil 0 0) (unchecked-dec-int new-cnt))) new-root (if tail-cut? new-root (.popTail (Vector. nm am new-cnt shift new-root (.array am 0) nil 0 0) shift new-cnt new-root))] (if (nil? new-root) (Vector. nm am new-cnt p/shift-increment (.empty nm) new-tail _meta 0 0) (loop [r new-root s (int shift)] (if (and (> s (int p/shift-increment)) (nil? (aget ^objects (.array nm r) 1))) (recur (aget ^objects (.array nm r) 0) (unchecked-subtract-int s (int p/shift-increment))) (Vector. nm am new-cnt s r new-tail _meta 0 0)))))))))) PSpliceableVector (splicev [this that] (splice-rrbts nm am this (as-rrbt that))) AsRRBT (as-rrbt [this] this) java.io.Serializable java.lang.Comparable (compareTo [this that] (if (identical? this that) 0 (let [^clojure.lang.IPersistentVector v (cast clojure.lang.IPersistentVector that) vcnt (.count v)] (cond (< cnt vcnt) -1 (> cnt vcnt) 1 :else (loop [i (int 0)] (if (== i cnt) 0 (let [comp (Util/compare (.nth this i) (.nth v i))] (if (zero? comp) (recur (unchecked-inc-int i)) comp)))))))) java.lang.Iterable (iterator [this] (let [i (java.util.concurrent.atomic.AtomicInteger. 0)] (reify java.util.Iterator (hasNext [_] (< (.get i) cnt)) (next [_] (try (.nth this (unchecked-dec-int (.incrementAndGet i))) (catch IndexOutOfBoundsException e (throw (java.util.NoSuchElementException. "no more elements in RRB vector iterator"))))) (remove [_] (throw-unsupported))))) java.util.Collection (contains [this o] (boolean (some #(= % o) this))) (containsAll [this c] (every? #(.contains this %) c)) (isEmpty [_] (zero? cnt)) (toArray [this] (into-array Object this)) (^"[Ljava.lang.Object;" toArray [this ^"[Ljava.lang.Object;" arr] (if (>= (count arr) cnt) (do (dotimes [i cnt] (aset arr i (.nth this i))) arr) (into-array Object this))) (size [_] cnt) (add [_ o] (throw-unsupported)) (addAll [_ c] (throw-unsupported)) (clear [_] (throw-unsupported)) (^boolean remove [_ o] (throw-unsupported)) (removeAll [_ c] (throw-unsupported)) (retainAll [_ c] (throw-unsupported)) java.util.RandomAccess java.util.List (get [this i] (.nth this i)) (indexOf [this o] (loop [i (int 0)] (cond (== i cnt) -1 (= o (.nth this i)) i :else (recur (unchecked-inc-int i))))) (lastIndexOf [this o] (loop [i (unchecked-dec-int cnt)] (cond (neg? i) -1 (= o (.nth this i)) i :else (recur (unchecked-dec-int i))))) (listIterator [this] (.listIterator this 0)) (listIterator [this i] (let [i (java.util.concurrent.atomic.AtomicInteger. i)] (reify java.util.ListIterator (hasNext [_] (< (.get i) cnt)) (hasPrevious [_] (pos? i)) (next [_] (try (.nth this (unchecked-dec-int (.incrementAndGet i))) (catch IndexOutOfBoundsException e (throw (java.util.NoSuchElementException. "no more elements in RRB vector list iterator"))))) (nextIndex [_] (.get i)) (previous [_] (.nth this (.decrementAndGet i))) (previousIndex [_] (unchecked-dec-int (.get i))) (add [_ e] (throw-unsupported)) (remove [_] (throw-unsupported)) (set [_ e] (throw-unsupported))))) (subList [this a z] (slicev this a z)) (add [_ i o] (throw-unsupported)) (addAll [_ i c] (throw-unsupported)) (^Object remove [_ ^int i] (throw-unsupported)) (set [_ i e] (throw-unsupported))) (extend-protocol AsRRBT Vec (as-rrbt [^Vec this] (Vector. primitive-nm (.-am this) (.-cnt this) (.-shift this) (.-root this) (.-tail this) (.-_meta this) 0 0)) PersistentVector (as-rrbt [^PersistentVector this] (Vector. object-nm object-am (count this) (.-shift this) (.-root this) (.-tail this) (meta this) 0 0)) APersistentVector$SubVector (as-rrbt [^APersistentVector$SubVector this] (let [v (.-v this) start (.-start this) end (.-end this)] (slicev (as-rrbt v) start end))) java.util.Map$Entry (as-rrbt [^java.util.Map$Entry this] (as-rrbt [(.getKey this) (.getValue this)]))) (defn shift-from-to [^NodeManager nm node from to] (cond (== from to) node (.regular nm node) (recur nm (.node nm (.edit nm node) (doto (object-array p/max-branches) (aset 0 node))) (unchecked-add-int (int p/shift-increment) (int from)) to) :else (recur nm (.node nm (.edit nm node) (doto (object-array p/non-regular-array-len) (aset 0 node) (aset p/max-branches (ints (doto (int-array p/non-regular-array-len) (aset 0 (int (last-range nm node))) (aset p/max-branches (int 1))))))) (unchecked-add-int (int p/shift-increment) (int from)) to))) (defn pair ^"[Ljava.lang.Object;" [x y] (doto (object-array 2) (aset 0 x) (aset 1 y))) (defn slot-count [^NodeManager nm ^ArrayManager am node shift] (let [arr (.array nm node)] (if (zero? shift) (.alength am arr) (if (.regular nm node) (index-of-nil arr) (let [rngs (ranges nm node)] (aget rngs p/max-branches)))))) (defn subtree-branch-count [^NodeManager nm ^ArrayManager am node shift] ;; NB. positive shifts only (let [arr (.array nm node) cs (- shift p/shift-increment)] (if (.regular nm node) (loop [i 0 sbc 0] (if (== i p/max-branches) sbc (if-let [child (aget ^objects arr i)] (recur (inc i) (+ sbc (long (slot-count nm am child cs)))) sbc))) (let [lim (aget (ranges nm node) p/max-branches)] (loop [i 0 sbc 0] (if (== i lim) sbc (let [child (aget ^objects arr i)] (recur (inc i) (+ sbc (long (slot-count nm am child cs))))))))))) (defn leaf-seq [^NodeManager nm arr] (mapcat #(.array nm %) (take (index-of-nil arr) arr))) (defn rebalance-leaves [^NodeManager nm ^ArrayManager am n1 cnt1 n2 cnt2 ^Box transferred-leaves] (let [slc1 (slot-count nm am n1 p/shift-increment) slc2 (slot-count nm am n2 p/shift-increment) a (+ slc1 slc2) sbc1 (subtree-branch-count nm am n1 p/shift-increment) sbc2 (subtree-branch-count nm am n2 p/shift-increment) p (+ sbc1 sbc2) e (- a (inc (quot (dec p) p/max-branches)))] (cond (<= e max-extra-search-steps) (pair n1 n2) (<= (+ sbc1 sbc2) p/max-branches-squared) (let [reg? (zero? (mod p p/max-branches)) new-arr (object-array (if reg? p/max-branches p/non-regular-array-len)) new-n1 (.node nm nil new-arr)] (loop [i 0 bs (partition-all p/max-branches (concat (leaf-seq nm (.array nm n1)) (leaf-seq nm (.array nm n2))))] (when-first [block bs] (let [a (.array am (count block))] (loop [i 0 xs (seq block)] (when xs (.aset am a i (first xs)) (recur (inc i) (next xs)))) (aset new-arr i (.node nm nil a)) (recur (inc i) (next bs))))) (if-not reg? (aset new-arr p/max-branches (regular-ranges p/shift-increment p))) (set! (.-val transferred-leaves) sbc2) (pair new-n1 nil)) :else (let [reg? (zero? (mod p p/max-branches)) new-arr1 (object-array p/max-branches) new-arr2 (object-array (if reg? p/max-branches p/non-regular-array-len)) new-n1 (.node nm nil new-arr1) new-n2 (.node nm nil new-arr2)] (loop [i 0 bs (partition-all p/max-branches (concat (leaf-seq nm (.array nm n1)) (leaf-seq nm (.array nm n2))))] (when-first [block bs] (let [a (.array am (count block))] (loop [i 0 xs (seq block)] (when xs (.aset am a i (first xs)) (recur (inc i) (next xs)))) (if (< i p/max-branches) (aset new-arr1 i (.node nm nil a)) (aset new-arr2 (- i p/max-branches) (.node nm nil a))) (recur (inc i) (next bs))))) (if-not reg? (aset new-arr2 p/max-branches (regular-ranges p/shift-increment (- p p/max-branches-squared)))) (set! (.-val transferred-leaves) (- p/max-branches-squared sbc1)) (pair new-n1 new-n2))))) (defn child-seq [^NodeManager nm node shift cnt] (let [arr (.array nm node) rngs (if (.regular nm node) (ints (regular-ranges shift cnt)) (ranges nm node)) cs (if rngs (aget rngs p/max-branches) (index-of-nil arr)) cseq (fn cseq [c r] (let [arr (.array nm c) rngs (if (.regular nm c) (ints (regular-ranges (- shift p/shift-increment) r)) (ranges nm c)) gcs (if rngs (aget rngs p/max-branches) (index-of-nil arr))] (map list (take gcs arr) (take gcs (map - rngs (cons 0 rngs))))))] (mapcat cseq (take cs arr) (take cs (map - rngs (cons 0 rngs)))))) (defn rebalance [^NodeManager nm ^ArrayManager am shift n1 cnt1 n2 cnt2 ^Box transferred-leaves] (if (nil? n2) (pair n1 nil) (let [slc1 (slot-count nm am n1 shift) slc2 (slot-count nm am n2 shift) a (+ slc1 slc2) sbc1 (subtree-branch-count nm am n1 shift) sbc2 (subtree-branch-count nm am n2 shift) p (+ sbc1 sbc2) e (- a (inc (quot (dec p) p/max-branches)))] (cond (<= e max-extra-search-steps) (pair n1 n2) (<= (+ sbc1 sbc2) p/max-branches-squared) (let [new-arr (object-array p/non-regular-array-len) new-rngs (int-array p/non-regular-array-len) new-n1 (.node nm nil new-arr)] (loop [i 0 bs (partition-all p/max-branches (concat (child-seq nm n1 shift cnt1) (child-seq nm n2 shift cnt2)))] (when-first [block bs] (let [a (object-array p/non-regular-array-len) r (int-array p/non-regular-array-len)] (aset a p/max-branches r) (aset r p/max-branches (count block)) (loop [i 0 o (int 0) gcs (seq block)] (when-first [[gc gcr] gcs] (aset ^objects a i gc) (aset r i (unchecked-add-int o (int gcr))) (recur (inc i) (unchecked-add-int o (int gcr)) (next gcs)))) (aset ^objects new-arr i (.node nm nil a)) (aset new-rngs i (+ (aget r (dec (aget r p/max-branches))) (if (pos? i) (aget new-rngs (dec i)) (int 0)))) (aset new-rngs p/max-branches (inc i)) (recur (inc i) (next bs))))) (aset new-arr p/max-branches new-rngs) (set! (.-val transferred-leaves) cnt2) (pair new-n1 nil)) :else (let [new-arr1 (object-array p/non-regular-array-len) new-arr2 (object-array p/non-regular-array-len) new-rngs1 (int-array p/non-regular-array-len) new-rngs2 (int-array p/non-regular-array-len) new-n1 (.node nm nil new-arr1) new-n2 (.node nm nil new-arr2)] (loop [i 0 bs (partition-all p/max-branches (concat (child-seq nm n1 shift cnt1) (child-seq nm n2 shift cnt2)))] (when-first [block bs] (let [a (object-array p/non-regular-array-len) r (int-array p/non-regular-array-len)] (aset a p/max-branches r) (aset r p/max-branches (count block)) (loop [i 0 o (int 0) gcs (seq block)] (when-first [[gc gcr] gcs] (aset a i gc) (aset r i (unchecked-add-int o (int gcr))) (recur (inc i) (unchecked-add-int o (int gcr)) (next gcs)))) (if (and (< i p/max-branches) (> (+ (* i p/max-branches) (count block)) sbc1)) (let [tbs (- (+ (* i p/max-branches) (count block)) sbc1) li (dec (aget r p/max-branches)) d (if (>= tbs p/max-branches) (aget r li) (- (aget r li) (aget r (- li tbs))))] (set! (.-val transferred-leaves) (+ (.-val transferred-leaves) d)))) (let [new-arr (if (< i p/max-branches) new-arr1 new-arr2) new-rngs (if (< i p/max-branches) new-rngs1 new-rngs2) i (mod i p/max-branches)] (aset ^objects new-arr i (.node nm nil a)) (aset new-rngs i (+ (aget r (dec (aget r p/max-branches))) (if (pos? i) (aget new-rngs (dec i)) (int 0)))) (aset new-rngs p/max-branches (int (inc i)))) (recur (inc i) (next bs))))) (aset new-arr1 p/max-branches new-rngs1) (aset new-arr2 p/max-branches new-rngs2) (pair new-n1 new-n2)))))) (defn zippath [^NodeManager nm ^ArrayManager am shift n1 cnt1 n2 cnt2 ^Box transferred-leaves] (if (== shift p/shift-increment) (rebalance-leaves nm am n1 cnt1 n2 cnt2 transferred-leaves) (let [c1 (last-child nm n1) c2 (first-child nm n2) ccnt1 (if (.regular nm n1) (let [m (mod cnt1 (bit-shift-left 1 shift))] (if (zero? m) (bit-shift-left 1 shift) m)) (let [rngs (ranges nm n1) i (dec (aget rngs p/max-branches))] (if (zero? i) (aget rngs 0) (- (aget rngs i) (aget rngs (dec i)))))) ccnt2 (if (.regular nm n2) (let [m (mod cnt2 (bit-shift-left 1 shift))] (if (zero? m) (bit-shift-left 1 shift) m)) (aget (ranges nm n2) 0)) next-transferred-leaves (Box. 0) [new-c1 new-c2] (zippath nm am (- shift p/shift-increment) c1 ccnt1 c2 ccnt2 next-transferred-leaves) d (.-val next-transferred-leaves)] (set! (.-val transferred-leaves) (+ (.-val transferred-leaves) d)) (rebalance nm am shift (if (identical? c1 new-c1) n1 (replace-rightmost-child nm shift n1 new-c1 d)) (+ cnt1 d) (if new-c2 (if (identical? c2 new-c2) n2 (replace-leftmost-child nm shift n2 cnt2 new-c2 d)) (remove-leftmost-child nm shift n2)) (- cnt2 d) transferred-leaves)))) (defn squash-nodes [^NodeManager nm shift n1 cnt1 n2 cnt2] (let [arr1 (.array nm n1) arr2 (.array nm n2) li1 (index-of-nil arr1) li2 (index-of-nil arr2) slots (concat (take li1 arr1) (take li2 arr2))] (if (> (count slots) p/max-branches) (pair n1 n2) (let [new-rngs (int-array p/non-regular-array-len) new-arr (object-array p/non-regular-array-len) rngs1 (take li1 (if (.regular nm n1) (regular-ranges shift cnt1) (ranges nm n1))) rngs2 (take li2 (if (.regular nm n2) (regular-ranges shift cnt2) (ranges nm n2))) rngs2 (let [r (last rngs1)] (map #(+ % r) rngs2)) rngs (concat rngs1 rngs2)] (aset new-arr p/max-branches new-rngs) (loop [i 0 cs (seq slots)] (when cs (aset new-arr i (first cs)) (recur (inc i) (next cs)))) (loop [i 0 rngs (seq rngs)] (if rngs (do (aset new-rngs i (int (first rngs))) (recur (inc i) (next rngs))) (aset new-rngs p/max-branches i))) (pair (.node nm nil new-arr) nil))))) (def peephole-optimization-config (atom {:debug-fn nil})) (def peephole-optimization-count (atom 0)) ;; TBD: Transducer versions of child-nodes and bounded-grandchildren ;; are included here for when we are willing to rely upon Clojure ;; 1.7.0 as the minimum version supported by the core.rrb-vector ;; library. They are faster. #_(defn child-nodes [node ^NodeManager nm] (into [] (comp (take-while (complement nil?)) (take p/max-branches)) (.array nm node))) (defn child-nodes [node ^NodeManager nm] (->> (.array nm node) (take-while (complement nil?)) (take p/max-branches))) ;; (take (inc p/max-branches)) is just a technique to avoid generating more ;; grandchildren than necessary. If there are at least (inc p/max-branches), we do not ;; care how many there are. #_(defn bounded-grandchildren [nm children] (into [] (comp (map #(child-nodes % nm)) cat (take (inc p/max-branches))) children)) (defn bounded-grandchildren [nm children] (->> children (mapcat #(child-nodes % nm)) (take (inc p/max-branches)))) ;; TBD: Do functions like last-non-nil-idx and ;; count-vector-elements-beneath already exist elsewhere in this ;; library? It seems like they might. ;; A regular tree node is guaranteed to have only p/max-branches-way branching at ;; all nodes, except perhaps along the right spine, where it can be ;; partial. From a regular tree node down, all leaf arrays ;; (containing vector elements directly) are restricted to contain a ;; full p/max-branches vector elements. This code relies on these invariants to ;; quickly calculate the number of vector elements beneath a regular ;; node in O(log N) time. (defn last-non-nil-idx [^objects arr] (loop [i (int (dec (alength arr)))] (if (neg? i) i (if (nil? (aget arr (int i))) (recur (unchecked-dec-int i)) i)))) (defn count-vector-elements-beneath [node shift ^NodeManager nm] (if (.regular nm node) (loop [node node shift shift acc 0] (if (zero? shift) (if (nil? node) acc ;; The +p/max-branches is for the regular leaf node reached at shift 0 (+ acc p/max-branches)) (let [^objects arr (.array nm node) max-child-idx (int (last-non-nil-idx arr)) num-elems-in-full-child (int (bit-shift-left 1 shift))] (if (< max-child-idx 0) acc (recur (aget ^objects arr max-child-idx) (unchecked-subtract-int shift (int p/shift-increment)) (unchecked-add-int acc (unchecked-multiply-int max-child-idx num-elems-in-full-child))))))) ;; irregular case (let [rngs (ranges nm node)] (aget rngs (dec (aget rngs p/max-branches)))))) (defn peephole-optimize-root [^Vector v] (let [config @peephole-optimization-config] (if (<= (.-shift v) p/shift-increment-times-2) ;; Tree depth cannot be reduced if shift <= p/shift-increment. ;; TBD: If shift=p/shift-increment-times-2, the grandchildren nodes need to be handled ;; by an am array manager for primitive vectors, which I haven't ;; written code for yet below, but so far this peephole ;; optimizer seems to be working sufficiently well without ;; handling that case. v (let [root (.-root v) ^NodeManager nm (.-nm v) children (child-nodes root nm) grandchildren (bounded-grandchildren nm children) num-granchildren-bounded (count grandchildren) many-grandchildren? (> num-granchildren-bounded p/max-branches)] (if many-grandchildren? ;; If it is possible to reduce tree depth, it requires going ;; deeper than just to the grandchildren, which is beyond ;; what this peephole optimizer is intended to do. v ;; Create a new root node that points directly at the ;; grandchildren, since there are few enough of them. (let [^objects new-arr (object-array p/non-regular-array-len) ^ints new-rngs (int-array p/non-regular-array-len) new-root (.node nm (.edit nm root) new-arr) shift (.-shift v) grandchild-shift (- shift (* 2 p/shift-increment))] (swap! peephole-optimization-count inc) (loop [idx 0 remaining-gc grandchildren elem-sum (int 0)] (if-let [remaining-gc (seq remaining-gc)] (let [grandchild (first remaining-gc) num-elems-this-grandchild (count-vector-elements-beneath grandchild grandchild-shift nm) next-elem-sum (int (+ elem-sum num-elems-this-grandchild))] (aset new-arr idx grandchild) (aset new-rngs idx next-elem-sum) (recur (inc idx) (rest remaining-gc) next-elem-sum)))) (aset new-rngs p/max-branches num-granchildren-bounded) (aset new-arr p/max-branches new-rngs) (let [new-v (Vector. nm (.-am v) (.-cnt v) (- shift p/shift-increment) new-root (.-tail v) (.-_meta v) 0 0)] (when (:debug-fn config) ((:debug-fn config) v new-v)) new-v))))))) (def max-vector-elements Integer/MAX_VALUE) ;; Larger shift values than 64 definitely break assumptions all over ;; the RRB vector implementation, e.g. (bit-shift-right 255 65) ;; returns the same result as (bit-shift-right 255 1), I believe ;; because the shift amount argument is effectively modulo'd by 64. ;; Larger shift values than 30 are unlikely to make sense, given that ;; the maximum number of vector elements supported is somewhere near ;; Integer/MAX_VALUE=2^31-1. (defn shift-too-large? [^Vector v] (> (.-shift v) 30)) ;; The maximum number of vector elements in a tree, not counting any ;; elements in the tail, with a given shift value is: ;; ;; (bit-shift-left 1 (+ shift p/shift-increment)) ;; ;; It is perfectly normal to have vectors with a root tree node with ;; only 1 non-nil child, so at a fraction 1/(p/max-branches) of maximum capacity. I ;; do not know the exact minimum fraction that RRB vectors as ;; implemented here should allow, but I suspect it is well over ;; 1/(p/max-branches-squared). (defn poor-branching? [^Vector v] (let [tail-off (.tailoff v)] (if (zero? tail-off) false (let [shift-amount (unchecked-subtract-int (.-shift v) (int p/shift-increment)) max-capacity-divided-by-max-branches-squared (bit-shift-left 1 shift-amount)] (< tail-off max-capacity-divided-by-max-branches-squared))))) ;; Note 3: ;; Consider checking the performance of an expression like the one ;; used now against the one below: ;;(into (clojure.lang.LazilyPersistentVector/create v1) v2)) ;; If the LazilyPersistentVector/create version is faster, it would be ;; good to create a version of the create method that returns ;; primitive vectors when given a primitive vector, and an Object ;; vector when given an Object vector. The existing method in Clojure ;; core always returns an Object vector, even when given a primitive ;; vector. ;; TBD: Is there any promise about what metadata catvec returns? ;; Always the same as on the first argument? (def fallback-config (atom {:debug-fn nil})) (def fallback-to-slow-splice-count1 (atom 0)) (def fallback-to-slow-splice-count2 (atom 0)) (defn fallback-to-slow-splice-if-needed [^Vector v1 ^Vector v2 ^Vector splice-result] (let [config @fallback-config] (if (or (shift-too-large? splice-result) (poor-branching? splice-result)) (do (dbg (str "splice-rrbts result had shift " (.-shift splice-result) " and " (.tailoff splice-result) " elements not counting" " the tail. Falling back to slower method of concatenation.")) (if (poor-branching? v1) ;; The v1 we started with was not good, either. (do (swap! fallback-to-slow-splice-count1 inc) (dbg (str "splice-rrbts first arg had shift " (.-shift v1) " and " (.tailoff v1) " elements not counting" " the tail. Building the result from scratch.")) ;: See Note 3 (let [new-splice-result (-> (empty v1) (into v1) (into v2))] (when (:debug-fn config) ((:debug-fn config) splice-result new-splice-result)) new-splice-result)) ;; Assume that v1 is balanced enough that we can use into to ;; add all elements of v2 to it, without problems. ;; TBD: That assumption might be incorrect. Consider ;; checking the result of this, too, and fall back again to ;; the true case above? (let [new-splice-result (into v1 v2)] (swap! fallback-to-slow-splice-count2 inc) (when (:debug-fn config) ((:debug-fn config) splice-result new-splice-result)) new-splice-result))) ;; else the fast result is good splice-result))) (defn splice-rrbts-main [^NodeManager nm ^ArrayManager am ^Vector v1 ^Vector v2] (cond (zero? (count v1)) v2 (> (+ (long (count v1)) (long (count v2))) max-vector-elements) (let [c1 (long (count v1)), c2 (long (count v2))] (throw (IllegalArgumentException. (str "Attempted to concatenate two vectors whose total" " number of elements is " (+ c1 c2) ", which is" " larger than the maximum number of elements " max-vector-elements " supported in a vector ")))) (< (count v2) rrbt-concat-threshold) (into v1 v2) :else (let [s1 (.-shift v1) s2 (.-shift v2) r1 (.-root v1) o? (overflow? nm r1 s1 (+ (count v1) (- p/max-branches (.alength am (.-tail v1))))) r1 (if o? (let [tail (.-tail v1) tail-node (.node nm nil tail) reg? (and (.regular nm r1) (== (.alength am tail) p/max-branches)) arr (object-array (if reg? p/max-branches p/non-regular-array-len))] (aset arr 0 r1) (aset arr 1 (new-path nm am s1 tail-node)) (if-not reg? (let [rngs (int-array p/non-regular-array-len)] (aset rngs p/max-branches 2) (aset rngs 0 (- (count v1) (.alength am tail))) (aset rngs 1 (count v1)) (aset arr p/max-branches rngs))) (.node nm nil arr)) (fold-tail nm am r1 s1 (.tailoff v1) (.-tail v1))) s1 (if o? (+ s1 p/shift-increment) s1) r2 (.-root v2) s (max s1 s2) r1 (shift-from-to nm r1 s1 s) r2 (shift-from-to nm r2 s2 s) transferred-leaves (Box. 0) [n1 n2] (zippath nm am s r1 (count v1) r2 (- (count v2) (.alength am (.-tail v2))) transferred-leaves) d (.-val transferred-leaves) ncnt1 (+ (count v1) d) ncnt2 (- (count v2) (.alength am (.-tail v2)) d) [n1 n2] (if (identical? n2 r2) (squash-nodes nm s n1 ncnt1 n2 ncnt2) (object-array (list n1 n2))) ncnt1 (if n2 (int ncnt1) (unchecked-add-int (int ncnt1) (int ncnt2))) ncnt2 (if n2 (int ncnt2) (int 0))] (if n2 (let [arr (object-array p/non-regular-array-len) new-root (.node nm nil arr)] (aset arr 0 n1) (aset arr 1 n2) (aset arr p/max-branches (doto (int-array p/non-regular-array-len) (aset 0 ncnt1) (aset 1 (+ ncnt1 ncnt2)) (aset p/max-branches 2))) (Vector. nm am (+ (count v1) (count v2)) (+ s p/shift-increment) new-root (.-tail v2) nil 0 0)) (loop [r n1 s (int s)] (if (and (> s (int p/shift-increment)) (nil? (aget ^objects (.array nm r) 1))) (recur (aget ^objects (.array nm r) 0) (unchecked-subtract-int s (int p/shift-increment))) (Vector. nm am (+ (count v1) (count v2)) s r (.-tail v2) nil 0 0))))))) (defn splice-rrbts [^NodeManager nm ^ArrayManager am ^Vector v1 ^Vector v2] (let [r1 (splice-rrbts-main nm am v1 v2) r2 (peephole-optimize-root r1)] (fallback-to-slow-splice-if-needed v1 v2 r2))) (defn array-copy [^ArrayManager am from i to j len] (loop [i (int i) j (int j) len (int len)] (when (pos? len) (.aset am to j (.aget am from i)) (recur (unchecked-inc-int i) (unchecked-inc-int j) (unchecked-dec-int len))))) (deftype Transient [^NodeManager nm ^ArrayManager am ^boolean objects? ^:unsynchronized-mutable ^int cnt ^:unsynchronized-mutable ^int shift ^:unsynchronized-mutable root ^:unsynchronized-mutable tail ^:unsynchronized-mutable ^int tidx] clojure.lang.Counted (count [this] (.ensureEditable transient-helper nm root) cnt) clojure.lang.Indexed (nth [this i] (.ensureEditable transient-helper nm root) (if (and (<= (int 0) i) (< i cnt)) (let [tail-off (unchecked-subtract-int cnt (.alength am tail))] (if (<= tail-off i) (.aget am tail (unchecked-subtract-int i tail-off)) (loop [i i node root shift shift] (if (zero? shift) (let [arr (.array nm node)] (.aget am arr (bit-and (bit-shift-right i shift) (int p/branch-mask)))) (if (.regular nm node) (let [arr (.array nm node) idx (bit-and (bit-shift-right i shift) (int p/branch-mask))] (loop [i i node (aget ^objects arr idx) shift (unchecked-subtract-int shift (int p/shift-increment))] (let [arr (.array nm node) idx (bit-and (bit-shift-right i shift) (int p/branch-mask))] (if (zero? shift) (.aget am arr idx) (recur i (aget ^objects arr idx) (unchecked-subtract-int shift (int p/shift-increment))))))) (let [arr (.array nm node) rngs (ranges nm node) idx (loop [j (bit-and (bit-shift-right i shift) (int p/branch-mask))] (if (< i (aget rngs j)) j (recur (unchecked-inc-int j)))) i (if (zero? idx) (int i) (unchecked-subtract-int (int i) (aget rngs (unchecked-dec-int idx))))] (recur i (aget ^objects arr idx) (unchecked-subtract-int shift (int p/shift-increment))))))))) (throw (IndexOutOfBoundsException.)))) (nth [this i not-found] (.ensureEditable transient-helper nm root) (if (and (>= i (int 0)) (< i cnt)) (.nth this i) not-found)) clojure.lang.ILookup (valAt [this k not-found] (.ensureEditable transient-helper nm root) (if (Util/isInteger k) (let [i (int k)] (if (and (>= i (int 0)) (< i cnt)) (.nth this i) not-found)) not-found)) (valAt [this k] (.valAt this k nil)) clojure.lang.IFn (invoke [this k] (.ensureEditable transient-helper nm root) (if (Util/isInteger k) (let [i (int k)] (if (and (>= i (int 0)) (< i cnt)) (.nth this i) (throw (IndexOutOfBoundsException.)))) (throw (IllegalArgumentException. "Key must be integer")))) (applyTo [this args] (.ensureEditable transient-helper nm root) (let [n (RT/boundedLength args 1)] (case n 0 (throw (clojure.lang.ArityException. n (.. this (getClass) (getSimpleName)))) 1 (.invoke this (first args)) 2 (throw (clojure.lang.ArityException. n (.. this (getClass) (getSimpleName))))))) clojure.lang.ITransientCollection (conj [this val] (.ensureEditable transient-helper nm root) (if (< tidx p/max-branches) (do (.aset am tail tidx val) (set! cnt (unchecked-inc-int cnt)) (set! tidx (unchecked-inc-int tidx)) this) (let [tail-node (.node nm (.edit nm root) tail) new-tail (.array am p/max-branches)] (.aset am new-tail 0 val) (set! tail new-tail) (set! tidx (int 1)) (if (overflow? nm root shift cnt) (if (.regular nm root) (let [new-arr (object-array p/max-branches)] (doto new-arr (aset 0 root) (aset 1 (.newPath transient-helper nm am tail (.edit nm root) shift tail-node))) (set! root (.node nm (.edit nm root) new-arr)) (set! shift (unchecked-add-int shift (int p/shift-increment))) (set! cnt (unchecked-inc-int cnt)) this) (let [new-arr (object-array p/non-regular-array-len) new-rngs (int-array p/non-regular-array-len) new-root (.node nm (.edit nm root) new-arr) root-total-range (aget (ranges nm root) p/max-branches-minus-1)] (doto new-arr (aset 0 root) (aset 1 (.newPath transient-helper nm am tail (.edit nm root) shift tail-node)) (aset p/max-branches new-rngs)) (doto new-rngs (aset 0 root-total-range) (aset 1 (unchecked-add-int root-total-range (int p/max-branches))) (aset p/max-branches 2)) (set! root new-root) (set! shift (unchecked-add-int shift (int p/shift-increment))) (set! cnt (unchecked-inc-int cnt)) this)) (let [new-root (.pushTail transient-helper nm am shift cnt (.edit nm root) root tail-node)] (set! root new-root) (set! cnt (unchecked-inc-int cnt)) this))))) (persistent [this] (.ensureEditable transient-helper nm root) (.set (.edit nm root) nil) (let [trimmed-tail (.array am tidx)] (array-copy am tail 0 trimmed-tail 0 tidx) (Vector. nm am cnt shift root trimmed-tail nil 0 0))) clojure.lang.ITransientVector (assocN [this i val] (.ensureEditable transient-helper nm root) (cond (and (<= 0 i) (< i cnt)) (let [tail-off (unchecked-subtract-int cnt tidx)] (if (<= tail-off i) (.aset am tail (unchecked-subtract-int i tail-off) val) (set! root (.doAssoc transient-helper nm am shift (.edit nm root) root i val))) this) (== i cnt) (.conj this val) :else (throw (IndexOutOfBoundsException.)))) (pop [this] (.ensureEditable transient-helper nm root) (cond (zero? cnt) (throw (IllegalStateException. "Can't pop empty vector")) (== 1 cnt) (do (set! cnt (int 0)) (set! tidx (int 0)) (if objects? (.aset am tail 0 nil)) this) (> tidx 1) (do (set! cnt (unchecked-dec-int cnt)) (set! tidx (unchecked-dec-int tidx)) (if objects? (.aset am tail tidx nil)) this) :else (let [new-tail-base (.arrayFor this (unchecked-subtract-int cnt (int 2))) new-tail (.editableTail transient-helper am new-tail-base) new-tidx (.alength am new-tail-base) new-root (.popTail transient-helper nm am shift cnt (.edit nm root) root)] (cond (nil? new-root) (do (set! cnt (unchecked-dec-int cnt)) (set! root (.ensureEditable transient-helper nm am (.edit nm root) (.empty nm) p/shift-increment)) (set! tail new-tail) (set! tidx new-tidx) this) (and (> shift p/shift-increment) (nil? (aget ^objects (.array nm new-root) 1))) (do (set! cnt (unchecked-dec-int cnt)) (set! shift (unchecked-subtract-int shift (int p/shift-increment))) (set! root (.ensureEditable transient-helper nm am (.edit nm root) (aget ^objects (.array nm new-root) 0) (unchecked-subtract-int shift (int p/shift-increment)))) (set! tail new-tail) (set! tidx new-tidx) this) :else (do (set! cnt (unchecked-dec-int cnt)) (set! root new-root) (set! tail new-tail) (set! tidx new-tidx) this))))) clojure.lang.ITransientAssociative (assoc [this k v] (.assocN this k v)) ;; temporary kludge IVecImpl (tailoff [_] (unchecked-subtract-int cnt tidx)) (arrayFor [this i] (if (and (<= (int 0) i) (< i cnt)) (if (>= i (.tailoff this)) tail (loop [i (int i) node root shift shift] (if (zero? shift) (.array nm node) (if (.regular nm node) (loop [node (aget ^objects (.array nm node) (bit-and (bit-shift-right i shift) (int p/branch-mask))) shift (unchecked-subtract-int shift (int p/shift-increment))] (if (zero? shift) (.array nm node) (recur (aget ^objects (.array nm node) (bit-and (bit-shift-right i shift) (int p/branch-mask))) (unchecked-subtract-int shift (int p/shift-increment))))) (let [rngs (ranges nm node) j (loop [j (bit-and (bit-shift-right i shift) (int p/branch-mask))] (if (< i (aget rngs j)) j (recur (unchecked-inc-int j)))) i (if (pos? j) (unchecked-subtract-int i (aget rngs (unchecked-dec-int j))) i)] (recur (int i) (aget ^objects (.array nm node) j) (unchecked-subtract-int shift (int p/shift-increment)))))))) (throw (IndexOutOfBoundsException.)))) PTransientDebugAccess (debugGetRoot [_] root) (debugGetShift [_] shift) (debugGetTail [_] tail) (debugGetCnt [_] cnt)) ================================================ FILE: src/parameterized/clojure/clojure/core/rrb_vector/transients.clj ================================================ (ns clojure.core.rrb-vector.transients (:require [clojure.core.rrb-vector.parameters :as p] [clojure.core.rrb-vector.nodes :refer [ranges last-range overflow?]]) (:import (clojure.core.rrb_vector.nodes NodeManager) (clojure.core ArrayManager) (java.util.concurrent.atomic AtomicReference))) (set! *warn-on-reflection* true) (set! *unchecked-math* true) ;; :warn-on-boxed (definterface ITransientHelper (editableRoot [^clojure.core.rrb_vector.nodes.NodeManager nm ^clojure.core.ArrayManager am root]) (editableTail [^clojure.core.ArrayManager am tail]) (ensureEditable [^clojure.core.rrb_vector.nodes.NodeManager nm root]) (ensureEditable [^clojure.core.rrb_vector.nodes.NodeManager nm ^clojure.core.ArrayManager am ^java.util.concurrent.atomic.AtomicReference root-edit current-node ^int shift]) (pushTail [^clojure.core.rrb_vector.nodes.NodeManager nm ^clojure.core.ArrayManager am ^int shift ^int cnt ^java.util.concurrent.atomic.AtomicReference root-edit current-node tail-node]) (popTail [^clojure.core.rrb_vector.nodes.NodeManager nm ^clojure.core.ArrayManager am ^int shift ^int cnt ^java.util.concurrent.atomic.AtomicReference root-edit current-node]) (doAssoc [^clojure.core.rrb_vector.nodes.NodeManager nm ^clojure.core.ArrayManager am ^int shift ^java.util.concurrent.atomic.AtomicReference root-edit current-node ^int i val]) (newPath [^clojure.core.rrb_vector.nodes.NodeManager nm ^clojure.core.ArrayManager am tail ^java.util.concurrent.atomic.AtomicReference edit ^int shift current-node])) (def ^ITransientHelper transient-helper (reify ITransientHelper (editableRoot [this nm am root] (let [new-arr (clojure.core/aclone ^objects (.array nm root))] (if (== p/non-regular-array-len (alength ^objects new-arr)) (aset new-arr p/max-branches (aclone (ints (aget ^objects new-arr p/max-branches))))) (.node nm (AtomicReference. (Thread/currentThread)) new-arr))) (editableTail [this am tail] (let [ret (.array am p/max-branches)] (System/arraycopy tail 0 ret 0 (.alength am tail)) ret)) (ensureEditable [this nm root] (let [owner (->> root (.edit nm) (.get))] (cond (identical? owner (Thread/currentThread)) nil (not (nil? owner)) (throw (IllegalAccessError. "Transient used by non-owner thread")) :else (throw (IllegalAccessError. "Transient used after persistent! call"))))) (ensureEditable [this nm am root-edit current-node shift] (if (identical? root-edit (.edit nm current-node)) current-node (if (zero? shift) (let [new-arr (.aclone am (.array nm current-node))] (.node nm root-edit new-arr)) (let [new-arr (aclone ^objects (.array nm current-node))] (if (== p/non-regular-array-len (alength ^objects new-arr)) (aset new-arr p/max-branches (aclone (ints (aget ^objects new-arr p/max-branches))))) (.node nm root-edit new-arr))))) ;; Note 1: This condition check and exception are a little bit ;; closer to the source of the cause for what was issue CRRBV-20, ;; added in case there is still some remaining way to cause this ;; condition to occur. ;; Note 2: In the worst case, when the tree is nearly full, ;; calling overflow? here takes run time O(tree_depth^2) here. ;; That could be made O(tree_depth). One way would be to call ;; pushTail in hopes that it succeeds, but return some distinctive ;; value indicating a failure on the full condition, and create ;; the node via a .newPath call at most recent recursive pushTail ;; call that has an empty slot available. (pushTail [this nm am shift cnt root-edit current-node tail-node] (let [ret (.ensureEditable this nm am root-edit current-node shift)] (if (.regular nm ret) (do (loop [n ret shift shift] (let [arr (.array nm n) subidx (bit-and (bit-shift-right (dec cnt) shift) p/branch-mask)] (if (== shift p/shift-increment) (aset ^objects arr subidx tail-node) (let [child (aget ^objects arr subidx)] (if (nil? child) (aset ^objects arr subidx (.newPath this nm am (.array nm tail-node) root-edit (unchecked-subtract-int shift p/shift-increment) tail-node)) (let [editable-child (.ensureEditable this nm am root-edit child (unchecked-subtract-int shift p/shift-increment))] (aset ^objects arr subidx editable-child) (recur editable-child (- shift (int p/shift-increment))))))))) ret) (let [arr (.array nm ret) rngs (ranges nm ret) li (unchecked-dec-int (aget rngs p/max-branches)) cret (if (== shift p/shift-increment) nil (let [child (.ensureEditable this nm am root-edit (aget ^objects arr li) (unchecked-subtract-int shift p/shift-increment)) ccnt (unchecked-add-int (int (if (pos? li) (unchecked-subtract-int (aget rngs li) (aget rngs (unchecked-dec-int li))) (aget rngs 0))) ;; add p/max-branches elems to account for the ;; new full tail we plan to add to ;; the subtree. (int p/max-branches))] ;; See Note 2 (if-not (overflow? nm child (unchecked-subtract-int shift p/shift-increment) ccnt) (.pushTail this nm am (unchecked-subtract-int shift p/shift-increment) ccnt root-edit child tail-node))))] (if cret (do (aset ^objects arr li cret) (aset rngs li (unchecked-add-int (aget rngs li) p/max-branches)) ret) (do (when (>= li p/max-branches-minus-1) ;; See Note 1 (let [msg (str "Assigning index " (inc li) " of vector" " object array to become a node, when that" " index should only be used for storing" " range arrays.") data {:shift shift, :cnd cnt, :current-node current-node, :tail-node tail-node, :rngs rngs, :li li, :cret cret}] (throw (ex-info msg data)))) (aset ^objects arr (inc li) (.newPath this nm am (.array nm tail-node) root-edit (unchecked-subtract-int shift p/shift-increment) tail-node)) (aset rngs (unchecked-inc-int li) (unchecked-add-int (aget rngs li) p/max-branches)) (aset rngs p/max-branches (unchecked-inc-int (aget rngs p/max-branches))) ret)))))) (popTail [this nm am shift cnt root-edit current-node] (let [ret (.ensureEditable this nm am root-edit current-node shift)] (if (.regular nm ret) (let [subidx (bit-and (bit-shift-right (unchecked-subtract-int cnt (int 2)) (int shift)) (int p/branch-mask))] (cond (> shift p/shift-increment) (let [child (.popTail this nm am (unchecked-subtract-int shift p/shift-increment) cnt ;; TBD: Should this be smaller than cnt? root-edit (aget ^objects (.array nm ret) subidx))] (if (and (nil? child) (zero? subidx)) nil (let [arr (.array nm ret)] (aset ^objects arr subidx child) ret))) (zero? subidx) nil :else (let [arr (.array nm ret)] (aset ^objects arr subidx nil) ret))) (let [rngs (ranges nm ret) subidx (unchecked-dec-int (aget rngs p/max-branches))] (cond (> shift p/shift-increment) (let [child (aget ^objects (.array nm ret) subidx) child-cnt (if (zero? subidx) (aget rngs 0) (unchecked-subtract-int (aget rngs subidx) (aget rngs (unchecked-dec-int subidx)))) new-child (.popTail this nm am (unchecked-subtract-int shift p/shift-increment) child-cnt root-edit child)] (cond (and (nil? new-child) (zero? subidx)) nil (.regular nm child) (let [arr (.array nm ret)] (aset rngs subidx (unchecked-subtract-int (aget rngs subidx) p/max-branches)) (aset ^objects arr subidx new-child) (if (nil? new-child) (aset rngs p/max-branches (unchecked-dec-int (aget rngs p/max-branches)))) ret) :else (let [rng (last-range nm child) diff (unchecked-subtract-int rng (if new-child (last-range nm new-child) 0)) arr (.array nm ret)] (aset rngs subidx (unchecked-subtract-int (aget rngs subidx) diff)) (aset ^objects arr subidx new-child) (if (nil? new-child) (aset rngs p/max-branches (unchecked-dec-int (aget rngs p/max-branches)))) ret))) (zero? subidx) nil :else (let [arr (.array nm ret) child (aget ^objects arr subidx)] (aset ^objects arr subidx nil) (aset rngs subidx 0) (aset rngs p/max-branches (unchecked-dec-int (aget rngs p/max-branches))) ret)))))) (doAssoc [this nm am shift root-edit current-node i val] (let [ret (.ensureEditable this nm am root-edit current-node shift)] (if (.regular nm ret) (loop [shift shift node ret] (if (zero? shift) (let [arr (.array nm node)] (.aset am arr (bit-and i p/branch-mask) val)) (let [arr (.array nm node) subidx (bit-and (bit-shift-right i shift) p/branch-mask) next-shift (int (unchecked-subtract-int shift p/shift-increment)) child (.ensureEditable this nm am root-edit (aget ^objects arr subidx) next-shift)] (aset ^objects arr subidx child) (recur next-shift child)))) (let [arr (.array nm ret) rngs (ranges nm ret) subidx (bit-and (bit-shift-right i shift) p/branch-mask) subidx (loop [subidx subidx] (if (< i (aget rngs subidx)) subidx (recur (unchecked-inc-int subidx)))) i (if (zero? subidx) i (unchecked-subtract-int i (aget rngs (unchecked-dec-int subidx))))] (aset ^objects arr subidx (.doAssoc this nm am (unchecked-subtract-int shift p/shift-increment) root-edit (aget ^objects arr subidx) i val)))) ret)) (newPath [this nm am tail edit shift current-node] (if (== (.alength am tail) p/max-branches) (loop [s 0 n current-node] (if (== s shift) n (let [arr (object-array p/max-branches) ret (.node nm edit arr)] (aset ^objects arr 0 n) (recur (unchecked-add s (int p/shift-increment)) ret)))) (loop [s 0 n current-node] (if (== s shift) n (let [arr (object-array p/non-regular-array-len) rngs (int-array p/non-regular-array-len) ret (.node nm edit arr)] (aset ^objects arr 0 n) (aset ^objects arr p/max-branches rngs) (aset rngs p/max-branches 1) (aset rngs 0 (.alength am tail)) (recur (unchecked-add s (int p/shift-increment)) ret)))))))) ================================================ FILE: src/parameterized/clojure/clojure/core/rrb_vector.clj ================================================ (ns clojure.core.rrb-vector "An implementation of the confluently persistent vector data structure introduced in Bagwell, Rompf, \"RRB-Trees: Efficient Immutable Vectors\", EPFL-REPORT-169879, September, 2011. RRB-Trees build upon Clojure's PersistentVectors, adding logarithmic time concatenation and slicing. The main API entry points are clojure.core.rrb-vector/catvec, performing vector concatenation, and clojure.core.rrb-vector/subvec, which produces a new vector containing the appropriate subrange of the input vector (in contrast to clojure.core/subvec, which returns a view on the input vector). core.rrb-vector's vectors can store objects or unboxed primitives. The implementation allows for seamless interoperability with clojure.lang.PersistentVector, clojure.core.Vec (more commonly known as gvec) and clojure.lang.APersistentVector$SubVector instances: clojure.core.rrb-vector/catvec and clojure.core.rrb-vector/subvec convert their inputs to clojure.core.rrb-vector.rrbt.Vector instances whenever necessary (this is a very fast constant time operation for PersistentVector and gvec; for SubVector it is O(log n), where n is the size of the underlying vector). clojure.core.rrb-vector also exports its own versions of vector and vector-of and vec which always produce clojure.core.rrb-vector.rrbt.Vector instances. Note that vector-of accepts :object as one of the possible type arguments, in addition to keywords naming primitive types." {:author "Michał Marczyk"} (:refer-clojure :exclude [vector vector-of vec subvec]) (:require [clojure.core.rrb-vector.parameters :as p] [clojure.core.rrb-vector.protocols :refer [slicev splicev]] [clojure.core.rrb-vector.nodes :refer [ams object-am object-nm primitive-nm empty-pv-node empty-gvec-node]] [clojure.core.rrb-vector.rrbt :refer [as-rrbt]] clojure.core.rrb-vector.interop) (:import (clojure.core.rrb_vector.rrbt Vector) (clojure.core.rrb_vector.nodes NodeManager) (clojure.core ArrayManager))) (set! *warn-on-reflection* true) (set! *unchecked-math* true) ;; :warn-on-boxed (defn catvec "Concatenates the given vectors in logarithmic time." ([] []) ([v1] v1) ([v1 v2] (splicev v1 v2)) ([v1 v2 v3] (splicev (splicev v1 v2) v3)) ([v1 v2 v3 v4] (splicev (splicev v1 v2) (splicev v3 v4))) ([v1 v2 v3 v4 & vn] (splicev (splicev (splicev v1 v2) (splicev v3 v4)) (apply catvec vn)))) (defn subvec "Returns a new vector containing the elements of the given vector v lying between the start (inclusive) and end (exclusive) indices in logarithmic time. end defaults to end of vector. The resulting vector shares structure with the original, but does not hold on to any elements of the original vector lying outside the given index range." ([v start] (slicev v start (count v))) ([v start end] (slicev v start end))) (defmacro ^:private gen-vector-method [& params] (let [arr (with-meta (gensym "arr__") {:tag 'objects})] `(let [~arr (object-array ~(count params))] ~@(map-indexed (fn [i param] `(aset ~arr ~i ~param)) params) (Vector. ^NodeManager object-nm ^ArrayManager object-am ~(count params) p/shift-increment empty-pv-node ~arr nil 0 0)))) (defn vector "Creates a new vector containing the args." ([] (gen-vector-method)) ([x1] (gen-vector-method x1)) ([x1 x2] (gen-vector-method x1 x2)) ([x1 x2 x3] (gen-vector-method x1 x2 x3)) ([x1 x2 x3 x4] (gen-vector-method x1 x2 x3 x4)) ([x1 x2 x3 x4 & xn] (loop [v (transient (vector x1 x2 x3 x4)) xn xn] (if xn (recur (.conj ^clojure.lang.ITransientCollection v (first xn)) (next xn)) (persistent! v))))) (defn vec "Returns a vector containing the contents of coll. If coll is a vector, returns an RRB vector using the internal tree of coll." [coll] (if (vector? coll) (as-rrbt coll) (apply vector coll))) (defmacro ^:private gen-vector-of-method [t & params] (let [am (gensym "am__") nm (gensym "nm__") arr (gensym "arr__")] `(let [~am ^ArrayManager (ams ~t) ~nm ^NodeManager (if (identical? ~t :object) object-nm primitive-nm) ~arr (.array ~am ~(count params))] ~@(map-indexed (fn [i param] `(.aset ~am ~arr ~i ~param)) params) (Vector. ~nm ~am ~(count params) p/shift-increment (if (identical? ~t :object) empty-pv-node empty-gvec-node) ~arr nil 0 0)))) (defn vector-of "Creates a new vector capable of storing homogenous items of type t, which should be one of :object, :int, :long, :float, :double, :byte, :short, :char, :boolean. Primitives are stored unboxed. Optionally takes one or more elements to populate the vector." ([t] (gen-vector-of-method t)) ([t x1] (gen-vector-of-method t x1)) ([t x1 x2] (gen-vector-of-method t x1 x2)) ([t x1 x2 x3] (gen-vector-of-method t x1 x2 x3)) ([t x1 x2 x3 x4] (gen-vector-of-method t x1 x2 x3 x4)) ([t x1 x2 x3 x4 & xn] (loop [v (transient (vector-of t x1 x2 x3 x4)) xn xn] (if xn (recur (.conj ^clojure.lang.ITransientCollection v (first xn)) (next xn)) (persistent! v))))) ================================================ FILE: src/test/cljs/clojure/core/rrb_vector/long_test.cljs ================================================ (ns clojure.core.rrb-vector.long-test (:require [clojure.test :as test :refer [deftest testing is are]] [clojure.core.rrb-vector.test-utils :as u] [clojure.core.rrb-vector :as fv] [clojure.core.rrb-vector.debug :as dv] [clojure.core.rrb-vector.debug-platform-dependent :as dpd])) ;; The intent is to keep this file as close to ;; src/test/clojure/clojure/core/rrb_vector/long_test.clj as possible, ;; so that when we start requiring Clojure 1.7.0 and later for this ;; library, this file and that one can be replaced with a common test ;; file with the suffix .cljc ;; Note that the namespace of this file _intentionally_ does not match ;; the pattern of namespaces that are run for ClojureScript tests by ;; default. That is because of how long the tests in this file take ;; to run. It seems best to include them in the set of tests in such ;; a way that it is only run when a developer explicitly wants to run ;; longer tests. It should not be run by default when running on ;; build.clojure.org. ;; Currently the Clojure/JVM versions of these tests _are_ run by ;; default, and on build.clojure.org, but at least the ones in here ;; now run significantly faster on Clojure/JVM than they do in any of ;; the JavaScript runtimes I have tested with. (dv/set-debug-opts! dv/full-debug-opts) (def generative-test-length :short) (def check-subvec-params (case generative-test-length :short [125 100000 10] :medium [250 200000 20] :long [250 200000 20])) (deftest test-slicing-generative (testing "slicing (generative)" (is (try (apply dv/generative-check-subvec u/extra-checks? check-subvec-params) (catch js/Error e (throw (ex-info (dpd/format "%s: %s %s" (u/ex-message-copy e) (:init-cnt (ex-data e)) (:s&es (ex-data e))) {} (u/ex-cause-copy e)))))))) ;; short: 2 to 3 sec ;; medium: 50 to 60 sec (def check-catvec-params (case generative-test-length :short [ 10 30 10 60000] :medium [250 30 10 60000] :long [250 30 10 60000])) (deftest test-splicing-generative (testing "splicing (generative)" (is (try (apply dv/generative-check-catvec u/extra-checks? check-catvec-params) (catch js/Error e (throw (ex-info (dpd/format "%s: %s" (u/ex-message-copy e) (:cnts (ex-data e))) {} (u/ex-cause-copy e)))))))) ;; This problem reproduction code is from CRRBV-17 ticket: ;; https://clojure.atlassian.net/projects/CRRBV/issues/CRRBV-17 (def benchmark-size 100000) ;; This small variation of the program in the ticket simply does ;; progress debug printing occasionally, as well as extra debug ;; checking of the results occasionally. ;; If you enable the printing of the message that begins ;; with "splice-rrbts result had shift" in function ;; fallback-to-slow-splice-if-needed, then run this test, you will see ;; it called hundreds or perhaps thousands of times. The fallback ;; approach is effective at avoiding a crash for this scenario, but at ;; a dramatic extra run-time cost. (defn vector-push-f [v my-catvec extra-checks-catvec] (loop [v v i 0] (let [check? (or (zero? (mod i 10000)) (and (> i 99000) (zero? (mod i 100))) (and (> i 99900) (zero? (mod i 10))))] (when check? (print "i=" i " ") (u/print-optimizer-counts)) (if (< i benchmark-size) (recur (if check? (extra-checks-catvec (fv/vector i) v) (my-catvec (fv/vector i) v)) (inc i)) v)))) ;; Approximate run times for this test on a 2015 MacBook Pro ;; 36 sec - clj 1.10.1, OpenJDK 11.0.4 ;; 465 sec - cljs 1.10.439, OpenJDK 11.0.4, Nashorn JS runtime ;; 138 sec - cljs 1.10.238, OpenJDK 11.0.4, nodejs 8.10.0 ;; 137 sec - cljs 1.10.238, OpenJDK 11.0.4, Spidermonkey JavaScript-C52.9.1 (deftest test-crrbv-17 (u/reset-optimizer-counts!) (is (= (reverse (range benchmark-size)) (vector-push-f (fv/vector) fv/catvec dv/checking-catvec)))) ================================================ FILE: src/test/cljs/clojure/core/rrb_vector/test_cljs.cljs ================================================ (ns clojure.core.rrb-vector.test-cljs (:require [cljs.test :as test] [clojure.core.rrb-vector :as fv] [clojure.core.rrb-vector.test-utils :as u] clojure.core.rrb-vector.test-common clojure.core.rrb-vector.test-cljs-only clojure.core.rrb-vector.long-test)) ;; This file was copied from namespace clojure.data.xml.test-cljs ;; in the data.xml library tests, then modified for use by ;; core.rrb-vector, so that core.rrb-vector's ClojureScript tests ;; could also be run on build.clojure.org via a mvn command. (def ^:dynamic *results*) (defmethod test/report [::test/default :end-run-tests] [m] (assert (nil? *results*)) (set! *results* m)) ;; Function -main-nashorn is called when running tests using a maven ;; command such as this one: ;; mvn -DCLOJURE_VERSION=1.10.1 -Dclojure.version=1.10.1 clean test ;; Function run is called when running tests using a 'lein cljsbuild' ;; command such as this one: ;; lein with-profile +cljs,+1.10 cljsbuild test ;; To run the tests in a namespace from here, it must be :require'd ;; above, and its name must match the regex given to a run-all-tests ;; call. You may also call run-all-tests with no args to run tests in ;; all namespaces, but even then it must be :require'd above. (defn ^:export -main-nashorn [] (set! *print-newline* false) (set! *print-fn* js/print) (set! *print-err-fn* js/print) (binding [*results* nil] (println "Running Basic Tests") (test/run-all-tests #"clojure\.core\.rrb-vector\.test-.*") ;;(test/run-all-tests #"clojure\.core\.rrb-vector\..*test.*") (pr-str *results*))) (defn ^:export run [] (println "Running Basic Tests") (test/run-all-tests #"clojure\.core\.rrb-vector\.test-.*") ;;(test/run-all-tests #"clojure\.core\.rrb-vector\..*test.*") ) ================================================ FILE: src/test/cljs/clojure/core/rrb_vector/test_cljs_only.cljs ================================================ (ns clojure.core.rrb-vector.test-cljs-only (:require [clojure.test :as test :refer [deftest testing is are]] [clojure.core.rrb-vector.test-utils :as u] [clojure.core.rrb-vector :as fv] [clojure.core.rrb-vector.debug :as dv] [clojure.core.rrb-vector.debug-platform-dependent :as dpd])) (dv/set-debug-opts! dv/full-debug-opts) ================================================ FILE: src/test/cljs/clojure/core/rrb_vector/test_common.cljs ================================================ (ns clojure.core.rrb-vector.test-common (:require [clojure.test :as test :refer [deftest testing is are]] [clojure.core.reducers :as r] [clojure.core.rrb-vector.test-utils :as u] [clojure.core.rrb-vector :as fv] [clojure.core.rrb-vector.debug :as dv] [clojure.core.rrb-vector.debug-platform-dependent :as pd])) ;; The intent is to keep this file as close to ;; src/test/clojure/clojure/core/rrb_vector/test_common.clj as ;; possible, so that when we start requiring Clojure 1.7.0 and later ;; for this library, this file and that one can be replaced with a ;; common test file with the suffix .cljc (dv/set-debug-opts! dv/full-debug-opts) (deftest test-slicing (testing "slicing" (is (dv/check-subvec u/extra-checks? 32000 10 29999 1234 18048 10123 10191)))) (deftest test-splicing (testing "splicing" (is (dv/check-catvec u/extra-checks? 1025 1025 3245 1025 32768 1025 1025 10123 1025 1025)) (is (dv/check-catvec u/extra-checks? 10 40 40 40 40 40 40 40 40)) (is (apply dv/check-catvec u/extra-checks? (repeat 30 33))) (is (dv/check-catvec u/extra-checks? 26091 31388 1098 43443 46195 4484 48099 7905 13615 601 13878 250 10611 9271 53170)) ;; Order that catvec will perform splicev calls: (let [my-splice (if u/extra-checks? dv/checking-splicev fv/catvec) counts [26091 31388 1098 43443 46195 4484 48099 7905 13615 601 13878 250 10611 9271 53170] prefix-sums (reductions + counts) ranges (map range (cons 0 prefix-sums) prefix-sums) [v01 v02 v03 v04 v05 v06 v07 v08 v09 v10 v11 v12 v13 v14 v15] (map fv/vec ranges) v01-02 (my-splice v01 v02) ;; top level catvec call v03-04 (my-splice v03 v04) ;; top level catvec call v01-04 (my-splice v01-02 v03-04) ;; top level catvec call v05-06 (my-splice v05 v06) ;; recurse level 1 catvec call v07-08 (my-splice v07 v08) ;; recurse level 1 catvec call v05-08 (my-splice v05-06 v07-08) ;; recurse level 1 catvec call v09-10 (my-splice v09 v10) ;; recurse level 2 catvec call v11-12 (my-splice v11 v12) ;; recurse level 2 catvec call v09-12 (my-splice v09-10 v11-12) ;; recurse level 2 catvec call v13-14 (my-splice v13 v14) ;; recurse level 3 catvec call v13-15 (my-splice v13-14 v15) ;; recurse level 3 catvec call v09-15 (my-splice v09-12 v13-15) ;; recurse level 2 catvec call v05-15 (my-splice v05-08 v09-15) ;; recurse level 1 catvec call v01-15 (my-splice v01-04 v05-15) ;; top level catvec call exp-val (range (last prefix-sums))] (is (= -1 (dv/first-diff v01-15 exp-val))) (is (= -1 (dv/first-diff (into v01-04 v05-15) exp-val)))))) (deftest test-reduce (let [v1 (vec (range 128)) v2 (fv/vec (range 128))] (testing "reduce" (is (= (reduce + v1) (reduce + v2)))) (testing "reduce-kv" (is (= (reduce-kv + 0 v1) (reduce-kv + 0 v2)))))) (deftest test-reduce-2 (let [my-subvec (if u/extra-checks? dv/checking-subvec fv/subvec) v1 (my-subvec (dv/cvec (range 1003)) 500) v2 (dv/cvec (range 500 1003))] (is (= (reduce + 0 v1) (reduce + 0 v2) (reduce + 0 (r/map identity (seq v1))) (reduce + 0 (r/map identity (seq v2))))))) (deftest test-reduce-3 (let [v0 (vec []) rv0 (fv/vec [])] (testing "reduce" (is (= (reduce + v0) (reduce + rv0)))) (testing "reduce-kv" (is (= (reduce-kv + 0 v0) (reduce-kv + 0 rv0)))))) (deftest test-seq (let [v (fv/vec (range 128)) s (seq v)] (testing "seq contents" (is (= v s))) (testing "chunked-seq?" (is (chunked-seq? s))) (testing "internal-reduce" (is (satisfies? IReduce s))))) (deftest test-assoc (let [my-subvec (if u/extra-checks? dv/checking-subvec fv/subvec)] (let [v1 (fv/vec (range 40000)) v2 (reduce (fn [out [k v]] (assoc out k v)) (assoc v1 40000 :foo) (map-indexed vector (rseq v1)))] (is (= (concat (rseq v1) [:foo]) v2))) (are [i] (= :foo (-> (range 40000) (fv/vec) (my-subvec i) (assoc 10 :foo) (nth 10))) 1 32 1024 32768))) (deftest test-assoc! (let [my-subvec (if u/extra-checks? dv/checking-subvec fv/subvec)] (let [v1 (fv/vec (range 40000)) v2 (persistent! (reduce (fn [out [k v]] (assoc! out k v)) (assoc! (transient v1) 40000 :foo) (map-indexed vector (rseq v1))))] (is (= (concat (rseq v1) [:foo]) v2))) (are [i] (= :foo (-> (range 40000) (fv/vec) (my-subvec i) (transient) (assoc! 10 :foo) (persistent!) (nth 10))) 1 32 1024 32768))) (deftest test-relaxed (let [my-catvec (if u/extra-checks? dv/checking-catvec fv/catvec)] (is (= (into (my-catvec (dv/cvec (range 123)) (dv/cvec (range 68))) (range 64)) (concat (range 123) (range 68) (range 64)))) (is (= (dv/slow-into (my-catvec (dv/cvec (range 123)) (dv/cvec (range 68))) (range 64)) (concat (range 123) (range 68) (range 64)))))) (deftest test-hasheq (let [my-catvec (if u/extra-checks? dv/checking-catvec fv/catvec)] (is (= (hash []) (hash (fv/vector)))) ;; CRRBV-25 (let [v1 (dv/cvec (range 1024)) v2 (dv/cvec (range 1024)) v3 (my-catvec (dv/cvec (range 512)) (dv/cvec (range 512 1024))) s1 (seq v1) s2 (seq v2) s3 (seq v3)] (is (= (hash v1) (hash v2) (hash v3) (hash s1) (hash s2) (hash s3))) (is (= (hash (nthnext s1 120)) (hash (nthnext s2 120)) (hash (nthnext s3 120))))))) (deftest test-reduce-subvec-catvec (let [my-catvec (if u/extra-checks? dv/checking-catvec fv/catvec) my-subvec (if u/extra-checks? dv/checking-subvec fv/subvec)] (letfn [(insert-by-sub-catvec [v n] (my-catvec (my-subvec v 0 n) (dv/cvec ['x]) (my-subvec v n))) (repeated-subvec-catvec [i] (reduce insert-by-sub-catvec (dv/cvec (range i)) (range i 0 -1)))] (is (= (repeated-subvec-catvec 2371) (interleave (range 2371) (repeat 'x))))))) (def pos-infinity ##Inf) (deftest test-reduce-subvec-catvec2 (let [my-catvec (if u/extra-checks? dv/checking-catvec fv/catvec) my-subvec (if u/extra-checks? dv/checking-subvec fv/subvec)] (letfn [(insert-by-sub-catvec [v n] (my-catvec (my-subvec v 0 n) (dv/cvec ['x]) (my-subvec v n))) (repeated-subvec-catvec [i] (reduce insert-by-sub-catvec (dv/cvec (range i)) (take i (interleave (range (quot i 2) pos-infinity) (range (quot i 2) pos-infinity)))))] (let [n 2371 v (repeated-subvec-catvec n)] (is (every? #(or (integer? %) (= 'x %)) v)) (is (= (count v) (* 2 n))))))) (deftest test-splice-high-subtree-branch-count (let [my-catvec (if u/extra-checks? dv/checking-catvec fv/catvec) my-subvec (if u/extra-checks? dv/checking-subvec fv/subvec) x (fv/vec (repeat 1145 \a)) y (my-catvec (my-subvec x 0 778) (my-subvec x 778 779) (dv/cvec [1]) (my-subvec x 779)) z (my-catvec (my-subvec y 0 780) (dv/cvec [2]) (my-subvec y 780 781) (my-subvec y 781)) res (my-catvec (my-subvec z 0 780) (dv/cvec []) (dv/cvec [3]) (my-subvec z 781)) expected (concat (repeat 779 \a) [1] [3] (repeat 366 \a))] (is (= res expected)))) ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ;; This problem reproduction code is from CRRBV-12 ticket: ;; https://clojure.atlassian.net/projects/CRRBV/issues/CRRBV-12 ;; I would prefer to have all of the data that is the value of ;; crrbv-12-data read from a separate file, but it is not terribly ;; long, and having it in the code avoids having to figure out how to ;; find and read the file on N different JavaScript runtime ;; environments, for the ClojureScript version of the test. (def crrbv-12-data [7912 7831 5393 5795 6588 2394 6403 6237 6152 5890 6507 6388 6100 7400 6340 7624 6379 5430 6335 5883 5570 6220 6319 6442 5666 3901 6974 5440 6626 7782 6760 6066 7763 9547 5585 6724 5407 5675 7727 7666 6845 6658 5409 7304 7291 5826 6523 5529 7387 6275 7193 5563 6572 7150 2949 1133 7312 7267 7135 7787 5812 7372 4295 5937 2931 4846 6149 1901 6680 7319 7845 7517 6722 6535 6362 5457 6649 7757 7463 6755 7436 6364 7361 7174 6048 6657 6533 5763 6074 6744 6734 5668 61 3842 5395 6489 1723 6248 7664 6645 5943 5428 6995 6688 7088 6305 6198 6197 5765 3691 7157 7305 7631 6058 6655 7846 7746 686 6024 6473 6150 5951 1761 7900 7084 5637 6607 5561 5772 7232 8512 6249 7377 5437 4830 6939 6355 7100 7884 951 6765 7054 1367 4580 7284 5414 7344 7525 5801 6374 6685 6737 4413 7353 1851 5973 7538 7116 6359 6605 6743 6153 7398 4757 6623 7546 7013 7091 7501 5749 6368 7911 6675 3246 6304 6469 6868 7701 5768 6369 6996 6346 6171 5884 6757 7615 5986 9904 5982 7049 6011 7716 6646 6178 6636 6637 7700 3390 6107 6938 2513 5663 5309 5673 7069 6615 5825 7183 5600 2188 5807 7635 7257 4803 6740 5865 6869 6968 7404 5124 7565 6169 7681 6181 5427 9861 7669 5936 5588 5463 6059 5695 5784 6768 6922 5720 6229 9173 6486 6399 6013 5517 7198 7320 6970 5969 7593 7351 7622 6561 5739 6433 6452 6320 6979 6260 6763 5539 6292 7133 6571 6108 7455 8470 7148 7597 6935 6865 7852 6549 6506 5425 6552 5551 5612 7230 809 2694 6408 6783 7626 6703 2754 1015 6809 7584 5473 6165 7105 6447 5856 6739 5564 7886 7856 7355 5814 919 6900 6257 118 7259 7419 6278 7619 6401 5970 7537 2899 6012 7190 5500 6122 5817 7620 6402 5811 5412 6822 5643 6138 5948 5523 4884 6460 5828 7159 5405 6224 7192 8669 5827 538 7416 6598 5577 6769 7547 7323 6748 6398 1505 6211 6466 6699 6207 6444 6863 7646 5917 6796 5619 6282 354 6418 5687 2536 6238 1166 6376 3852 5955 188 7218 7477 6926 7694 7253 5880 5424 7392 6337 7438 7814 3205 6336 6465 6812 1102 6468 6034 6133 5849 7578 7863 5761 6372 7568 5813 6380 6481 6942 7676 5552 7015 7120 7838 5684 6101 6834 6092 7917 6124 867 7187 5527 7488 5900 6267 6443 724 6073 6608 6407 6040 5540 6061 5554 5469 6255 6542 7336 2272 6921 1078 5593 7045 5013 6870 6712 6537 6785 6333 5892 6633 7522 6697 5915 5567 6606 5820 7653 7554 6932 5824 9330 8780 7203 7204 7519 7633 6529 7564 5718 7605 6579 7621 4462 6009 6950 6430 5911 5946 6877 7830 6570 7421 6449 6684 8425 5983 5846 5505 6097 5773 5781 6463 6867 5774 6601 1577 5642 6959 6251 7741 7391 6036 6892 5097 6874 6580 6348 5904 6709 5976 7411 7223 6252 7414 6813 4378 5888 5546 6385 401 5912 7828 7775 5925 6151 7648 5810 7673 6250 5808 7251 1407 5644 7439 7901 1964 6631 6858 7630 7771 2892 946 6397 5443 5715 5665 7306 6233 5566 5447 7011 6314 2054 5786 2170 6901 6077 6239 7791 6960 7891 7878 7758 5829 7611 7059 5455 6654 6459 6949 7406 7854 5805 6564 7033 6445 5939 6706 6103 7614 7902 6527 7479 6196 6484 3521 7269 6055 7331 6184 6746 6936 5891 6687 5771 7136 6625 7865 5864 6704 7726 5842 6295 6910 5277 7528 5689 5674 7457 7086 5220 317 7720 6720 5913 7098 5450 7275 7521 7826 7007 6378 7277 6844 7177 5482 97 6730 7861 5601 6000 6039 6953 5624 6450 6736 7492 5499 5822 7276 2889 7102 6648 6291 865 7348 7330 1449 6719 5550 7326 6338 6714 7805 7082 6377 2791 7876 5870 7107 7505 5416 7057 6021 7037 6331 5698 6721 5180 7390 5938 9067 7215 4566 8051 6557 6161 5894 1379 7335 2602 6520 7199 6878 6366 6948 7202 4791 7338 7442 5987 7099 7632 5453 4755 4947 7786 6254 7103 7595 6670 6485 6117 6756 6339 7240 7609 6853 6299 7205 4857 7511 576 5835 5396 5997 5508 6413 6219 5403 7686 9189 6634 5503 6801 7508 5611 7667 7572 7587 6015 7153 7340 6279 5646 2004 2708 7119 5737 3258 7427 6204 6476 6511 2300 7055 5389 6984 5438 6002 6272 5756 5734 6913 6425 6847 5657 6357 6862 6030 5522 6943 3518 6139 6671 7764 6493 5691 6082 4635 6640 6898 7262 9391 6828 2277 6690 6464 5759 7441 6622 1262 7114 6294 7070 6539 6788 6167 7824 6382 2512 7322 5992 7696 5445 5538 6140 7151 6409 7085 6166 6263 1194 5544 7141 5906 2939 7389 7290 6491 6322 8324 7341 7246 5610 7536 6946 7540 7760 6293 5589 7009 7822 5456 6805 5841 7722 5559 7265 6903 3517 1243 6078 7180 6147 8063 7395 7551 5460 6421 7567 6546 6941 6301 5486 7347 6479 5990 5932 6881 7737 6051 7375 5762 6897 2967 7297 7263 6965 6752 6158 7556 6794 7641 7628 2374 6289 7286 7581 6008 491 6919 9157 7002 6585 7960 6967 7692 7128 5680 5037 5752 6223 5989 7545 6584 7282 6221 871 6116 5484 6350 6266 6889 6216 1892 924 5875 7658 5461 5410 8352 7072 5724 6931 6050 6125 5519 6711 7518 6613 7576 7989 5603 7214 6664 2933 5839 7454 9353 6512 7242 7768 6037 6567 6673 8438 7364 5406 6080 577 6895 5742 5722 6944 6273 5965 5464 6876 7719 7311 7258 6829 7280 6028 5740 9162 9858 6695 7239 6972 7025 7147 7039 6226 6135 7219 6477 6708 767 5432 7405 7580 3790 372 7523 6597 5922 6105 5434 9587 6173 7739 5984 5854 2153 6912 7476 7598 5985 5874 8723 5628 5496 7352 4829 6483 7211 6933 5545 7544 5444 5790 8223 1089 6676 5667 6749 6777 5429 6347 5399 5662 6446 5524 6909 5415 7742 6343 5921 7160 7175 7026 1838 6894 4355 52 6192 5341 6945 7366 7816 2006 7380 6531 6904 5958 6270 6069 5574 7349 7212 5256 6010 6961 2825 6691 7792 6017 6888 7707 6693 6456 5871 7238 7780 7256 5630 7744 6855 5077 6958 6046 6707 6530 6501 7298 5636 6121 1105 6243 5541 6814 6732 7500 6866 7093 7745 7030 4338 6517 5991 6458 6213 4695 5542 7853 5926 6550 5230 7432 7006 5858 7677 6495 7310 6432 7487 7670 7674 6245 7315 7893 4360 940 6303 5757 7697 7506 5491 1309 7695 2214 5553 6964 7403 7302 6589 7851 7186 6193 2964 6242 6545 7012 7010 5448 5767 6647 7610 7485 6509 6083 6525 5607 9982 6244 7832 7213 6308 1320 7092 5656 6342 7864 7140 2577 104 1343 6786 7654 6156 5584 6818 5604 6681 6038 6056 6594 6603 7040 5468 5957 7229 6735 5510 6700 7725 7431 7154 7682 6558 7158 7470 7749 5400 5397 7247 6582 5832 7041 7325 5777 6759 6577 6195 7895 9626 7042 6026 6741 7811 7942 8926 1499 6772 7561 5565 3587 7273 6172 7428 6787 7181 5754 7579 5535 5543 5818 7264 1854 6998 7425 5394 6661 6562 375 2990]) (defn quicksort [my-catvec v] (if (<= (count v) 1) v (let [[x & xs] v] (my-catvec (quicksort my-catvec (dv/cvec (filter #(<= % x) xs))) (dv/cvec [x]) (quicksort my-catvec (dv/cvec (filter #(> % x) xs))))))) (defn ascending? [coll] (every? (fn [[a b]] (<= a b)) (partition 2 1 coll))) (deftest test-crrbv-12 (let [my-catvec (if u/extra-checks? dv/checking-catvec fv/catvec) v (dv/cvec crrbv-12-data)] (testing "Ascending order after quicksort" (is (ascending? (quicksort my-catvec v)))) (testing "Repeated catvec followed by pop" (is (= [] (nth (iterate pop (nth (iterate #(my-catvec (dv/cvec [0]) %) []) 963)) 963)))))) ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; (defn npe-for-1025-then-pop! [kind] (let [my-pop! (if u/extra-checks? dv/checking-pop! pop!) bfactor-squared (* 32 32) mk-vector (case kind :object-array fv/vector) boundary 54 v1 (-> (mk-vector) (into (range boundary)) (into (range boundary (inc bfactor-squared)))) v2 (-> (mk-vector) (into (range bfactor-squared)) (transient) (my-pop!) (persistent!)) v3 (-> (mk-vector) (into (range boundary)) (into (range boundary (inc bfactor-squared))) (transient) (my-pop!) (persistent!)) v4 (-> (mk-vector) (into (range (inc bfactor-squared))) (transient) (my-pop!) (persistent!))] (is (= (seq v1) (range (inc bfactor-squared)))) (is (= (seq v2) (range (dec bfactor-squared)))) ;; This used to fail with core.rrb-vector version 0.0.14 with ;; NullPointerException while traversing the seq on clj. It gets ;; a different kind of error with cljs. (is (= (seq v3) (range bfactor-squared))) ;; This one caused a NullPointerException with version 0.0.14 ;; while traversing the seq (is (= (seq v4) (range bfactor-squared))))) (deftest test-npe-for-1025-then-pop! (doseq [kind [:object-array]] (npe-for-1025-then-pop! kind))) ;; This problem reproduction code is slightly modified from a version ;; provided in a comment by Mike Fikes on 2018-Dec-09 for this issue: ;; https://clojure.atlassian.net/projects/CRRBV/issues/CRRBV-20 (defn play [my-vector my-catvec my-subvec players rounds] (letfn [(swap [marbles split-ndx] (my-catvec (my-subvec marbles split-ndx) (my-subvec marbles 0 split-ndx))) (rotl [marbles n] (swap marbles (mod n (count marbles)))) (rotr [marbles n] (swap marbles (mod (- (count marbles) n) (count marbles)))) (place-marble [marbles marble] (let [marbles (rotl marbles 2)] [(my-catvec (my-vector marble) marbles) 0])) (remove-marble [marbles marble] (let [marbles (rotr marbles 7) first-marble (nth marbles 0)] [(my-subvec marbles 1) (+ marble first-marble)])) (play-round [marbles round] (if (zero? (mod round 23)) (remove-marble marbles round) (place-marble marbles round))) (add-score [scores player round-score] (if (zero? round-score) scores (assoc scores player (+ (get scores player 0) round-score))))] (loop [marbles (my-vector 0) round 1 player 1 scores {} ret []] (let [[marbles round-score] (play-round marbles round) scores (add-score scores player round-score)] (if (> round rounds) (conj ret {:round round :marbles marbles}) (recur marbles (inc round) (if (= player players) 1 (inc player)) scores (conj ret {:round round :marbles marbles}))))))) (defn play-core [& args] (apply play clojure.core/vector clojure.core/into clojure.core/subvec args)) (defn play-rrbv [& args] (let [my-catvec (if u/extra-checks? dv/checking-catvec fv/catvec) my-subvec (if u/extra-checks? dv/checking-subvec fv/subvec)] (apply play fv/vector my-catvec my-subvec args))) (deftest test-crrbv-20 ;; This one passes (is (= (play-core 10 1128) (play-rrbv 10 1128))) ;; This ends up with (play-rrbv 10 1129) throwing an exception, with ;; core.rrb-vector version 0.0.14 (is (= (play-core 10 1129) (play-rrbv 10 1129))) ;; The previous test demonstrates a bug in the transient RRB vector ;; implementation. The one below demonstrated a similar bug in the ;; persistent RRB vector implementation in version 0.0.14. (let [v1128 (:marbles (last (play-rrbv 10 1128))) v1129-pre (-> v1128 (fv/subvec 2) (conj 2001))] (is (every? integer? (conj v1129-pre 2002))))) (deftest test-crrbv-21 ;; The following sequence of operations gave a different exception ;; than the above with core.rrb-vector version 0.0.14, and was a ;; different root cause with a distinct fix required. I do not ;; recall whether it was the same root cause as ;; npe-for-1025-then-pop! but both test cases are included for extra ;; testing goodness. (let [v1128 (:marbles (last (play-rrbv 10 1128))) vpop1 (reduce (fn [v i] (pop v)) v1128 (range 1026))] (is (every? integer? (pop vpop1))) ;; The transient version below gave a similar exception with ;; version 0.0.14, but the call stack went through the transient ;; version of popTail, rather than the persistent version of ;; popTail that the one above does. (is (every? integer? (persistent! (pop! (transient vpop1))))))) (deftest test-crrbv-22 (testing "pop! from a regular transient vector with 32*32+1 elements" (let [v1025 (into (fv/vector) (range 1025))] (is (= (persistent! (pop! (transient v1025))) (range 1024))))) (testing "pop from a persistent regular vector with 32*32+1 elements" (let [v1025 (into (fv/vector) (range 1025))] (is (= (pop v1025) (range 1024)))))) ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ;; This code was copied from ;; https://github.com/mattiasw2/adventofcode1/blob/master/src/adventofcode1/nineteen_b.clj ;; mentioned in issue ;; https://clojure.atlassian.net/projects/CRRBV/issues/CRRBV-14 (defn puzzle-b [n my-vec my-catvec my-subvec] (letfn [(remove-at [arr idx] (my-catvec (my-subvec arr 0 idx) (my-subvec arr (inc idx)))) (create-arr [size] (my-vec (range 1 (inc size)))) (fv-rest [arr] (my-subvec arr 1)) (calculate-opposite [n] (int (/ n 2))) (move [elfs] (let [lc (count elfs)] (if (= 1 lc) {:ok (first elfs)} (let [current (first elfs) opposite-pos (calculate-opposite lc) _ (assert (> opposite-pos 0)) _ (assert (< opposite-pos lc)) opposite-elf (nth elfs opposite-pos) other2 (fv-rest (remove-at elfs opposite-pos))] (my-catvec other2 (dv/cvec [current])))))) (puzzle-b-sample [elfs round] (let [elfs2 (move elfs)] ;;(println "round=" round "# elfs=" (count elfs)) (if (:ok elfs2) (:ok elfs2) (recur elfs2 (inc round)))))] (puzzle-b-sample (create-arr n) 1))) (defn puzzle-b-core [n] (puzzle-b n clojure.core/vec clojure.core/into clojure.core/subvec)) (defn get-shift [v] (.-shift v)) (defn vstats [v] (str "cnt=" (count v) " shift=" (get-shift v) " %=" (pd/format "%5.1f" (* 100.0 (dv/fraction-full v))))) ;;(def custom-catvec-data (atom [])) (defn custom-catvec [& args] (let [my-catvec (if u/extra-checks? dv/checking-catvec fv/catvec) ;;n (count @custom-catvec-data) max-arg-shift (apply max (map get-shift args)) ret (apply my-catvec args) ret-shift (get-shift ret)] (when (or (>= ret-shift 30) (> ret-shift max-arg-shift)) (doall (map-indexed (fn [idx v] (println (str "custom-catvec ENTER v" idx " " (vstats v)))) args)) (println (str "custom-catvec LEAVE ret " (vstats ret)))) ;;(swap! custom-catvec-data conj {:args args :ret ret}) ;;(println "custom-catvec RECRD in index" n "of @custom-catvec-data") ret)) (defn puzzle-b-rrbv [n] (let [my-subvec (if u/extra-checks? dv/checking-subvec fv/subvec)] (puzzle-b n fv/vec custom-catvec my-subvec))) (deftest test-crrbv-14 ;; This one passes (u/reset-optimizer-counts!) (is (= (puzzle-b-core 977) (puzzle-b-rrbv 977))) (u/print-optimizer-counts) ;; (puzzle-b-rrbv 978) throws ;; ArrayIndexOutOfBoundsException (u/reset-optimizer-counts!) (is (integer? (puzzle-b-rrbv 978))) (u/print-optimizer-counts)) (deftest test-crrbv-30 (let [v1 [1 2 3] tv1 (transient [1 2 3]) fv1 (fv/vector 1 2 3) tfv1 (transient (fv/vector 1 2 3))] (doseq [[v msg] [[v1 ""] [fv1 ""] [(map #(nth v1 %) [0 1 2]) "#(nth v1 %)"] [(map #(v1 %) [0 1 2]) "#(v1 %)"] [(map #(nth tv1 %) [0 1 2]) "#(nth tv1 %)"] [(map #(tv1 %) [0 1 2]) "#(tv1 %)"] [(map #(nth fv1 %) [0 1 2]) "#(nth fv1 %)"] [(map #(fv1 %) [0 1 2]) "#(fv1 %)"] [(map #(nth tfv1 %) [0 1 2]) "#(nth tfv1 %)"] [(map #(tfv1 %) [0 1 2]) "#(tfv1 %)"]]] (is (= '(1 2 3) v) (str "Failing case: " msg))))) ================================================ FILE: src/test/cljs/clojure/core/rrb_vector/test_utils.cljs ================================================ (ns clojure.core.rrb-vector.test-utils (:require [clojure.test :as test] [clojure.core.rrb-vector.rrbt :as rrbt])) ;; Parts of this file are nearly identical to ;; src/test/clojure/clojure/core/rrb_vector/test_utils.clj, but also ;; significant parts are specific to each of the clj/cljs versions, so ;; while they could later be combined into a .cljc file, it may not ;; give much benefit to do so. (def extra-checks? false) (defn reset-optimizer-counts! [] (println "reset all optimizer counts to 0") (reset! rrbt/peephole-optimization-count 0) (reset! rrbt/fallback-to-slow-splice-count1 0) (reset! rrbt/fallback-to-slow-splice-count2 0)) (defn print-optimizer-counts [] (println "optimizer counts: peephole=" @rrbt/peephole-optimization-count "fallback1=" @rrbt/fallback-to-slow-splice-count1 "fallback2=" @rrbt/fallback-to-slow-splice-count2)) (defn now-msec [] (js/Date.now)) (def num-deftests-started (atom 0)) (def last-deftest-start-time (atom nil)) (defn print-test-env-info [] (println "extra-checks?=" extra-checks?) (println "*clojurescript-version*" *clojurescript-version*)) (defmethod test/report [:cljs.test/default :begin-test-var] [m] (let [n (swap! num-deftests-started inc)] (when (== n 1) (print-test-env-info))) (println) (println "starting cljs test" (:var m)) (reset! last-deftest-start-time (now-msec))) (defmethod test/report [:cljs.test/default :end-test-var] [m] (println "elapsed time (sec)" (/ (- (now-msec) @last-deftest-start-time) 1000.0))) ;; Enable tests to be run on versions of Clojure before 1.10, when ;; ex-message was added. (defn ex-message-copy "Returns the message attached to the given Error / ExceptionInfo object. For non-Errors returns nil." [ex] (when (instance? js/Error ex) (.-message ex))) (defn ex-cause-copy "Returns exception cause (an Error / ExceptionInfo) if ex is an ExceptionInfo. Otherwise returns nil." [ex] (when (instance? ExceptionInfo ex) (.-cause ex))) ================================================ FILE: src/test/clojure/clojure/core/rrb_vector/long_test.clj ================================================ (ns clojure.core.rrb-vector.long-test (:require [clojure.test :as test :refer [deftest testing is are]] [clojure.core.rrb-vector.test-utils :as u] [clojure.core.rrb-vector :as fv] [clojure.core.rrb-vector.debug :as dv] [clojure.core.rrb-vector.debug-platform-dependent :as dpd]) (:import (clojure.lang ExceptionInfo))) ;; The intent is to keep this file as close to ;; src/test/cljs/clojure/core/rrb_vector/long_test.cljs as possible, ;; so that when we start requiring Clojure 1.7.0 and later for this ;; library, this file and that one can be replaced with a common test ;; file with the suffix .cljc ;; Note that the namespace of this file _intentionally_ does not match ;; the pattern of namespaces that are run for ClojureScript tests by ;; default. That is because of how long the tests in this file take ;; to run. It seems best to include them in the set of tests in such ;; a way that it is only run when a developer explicitly wants to run ;; longer tests. It should not be run by default when running on ;; build.clojure.org. ;; Currently the Clojure/JVM versions of these tests _are_ run by ;; default, and on build.clojure.org, but at least the ones in here ;; now run significantly faster on Clojure/JVM than they do in any of ;; the JavaScript runtimes I have tested with. (dv/set-debug-opts! dv/full-debug-opts) (def generative-test-length :short) (def check-subvec-params (case generative-test-length :short [125 100000 10] :medium [250 200000 20] :long [250 200000 20])) (deftest test-slicing-generative (testing "slicing (generative)" (is (try (apply dv/generative-check-subvec u/extra-checks? check-subvec-params) (catch ExceptionInfo e (throw (ex-info (dpd/format "%s: %s %s" (u/ex-message-copy e) (:init-cnt (ex-data e)) (:s&es (ex-data e))) {} (u/ex-cause-copy e)))))))) ;; short: 2 to 3 sec ;; medium: 50 to 60 sec (def check-catvec-params (case generative-test-length :short [ 10 30 10 60000] :medium [250 30 10 60000] :long [250 30 10 60000])) (deftest test-splicing-generative (testing "splicing (generative)" (is (try (apply dv/generative-check-catvec u/extra-checks? check-catvec-params) (catch ExceptionInfo e (throw (ex-info (dpd/format "%s: %s" (u/ex-message-copy e) (:cnts (ex-data e))) {} (u/ex-cause-copy e)))))))) ;; This problem reproduction code is from CRRBV-17 ticket: ;; https://clojure.atlassian.net/projects/CRRBV/issues/CRRBV-17 (def benchmark-size 100000) ;; This small variation of the program in the ticket simply does ;; progress debug printing occasionally, as well as extra debug ;; checking of the results occasionally. ;; If you enable the printing of the message that begins ;; with "splice-rrbts result had shift" in function ;; fallback-to-slow-splice-if-needed, then run this test, you will see ;; it called hundreds or perhaps thousands of times. The fallback ;; approach is effective at avoiding a crash for this scenario, but at ;; a dramatic extra run-time cost. (defn vector-push-f [v my-catvec extra-checks-catvec] (loop [v v i 0] (let [check? (or (zero? (mod i 10000)) (and (> i 99000) (zero? (mod i 100))) (and (> i 99900) (zero? (mod i 10))))] (when check? (print "i=" i " ") (u/print-optimizer-counts)) (if (< i benchmark-size) (recur (if check? (extra-checks-catvec (fv/vector i) v) (my-catvec (fv/vector i) v)) (inc i)) v)))) ;; Approximate run times for this test on a 2015 MacBook Pro ;; 36 sec - clj 1.10.1, OpenJDK 11.0.4 ;; 465 sec - cljs 1.10.439, OpenJDK 11.0.4, Nashorn JS runtime ;; 138 sec - cljs 1.10.238, OpenJDK 11.0.4, nodejs 8.10.0 ;; 137 sec - cljs 1.10.238, OpenJDK 11.0.4, Spidermonkey JavaScript-C52.9.1 (deftest test-crrbv-17 (u/reset-optimizer-counts!) (is (= (reverse (range benchmark-size)) (vector-push-f (fv/vector) fv/catvec dv/checking-catvec)))) ================================================ FILE: src/test/clojure/clojure/core/rrb_vector/test_clj_only.clj ================================================ (ns clojure.core.rrb-vector.test-clj-only (:require [clojure.test :as test :refer [deftest testing is are]] [clojure.template :refer [do-template]] [clojure.reflect :as ref] [clojure.core.rrb-vector.test-utils :as u] [clojure.core.rrb-vector :as fv] [clojure.core.rrb-vector.debug :as dv] [clojure.test.check :as tc] [clojure.test.check.properties :as prop] [clojure.test.check.generators :as gen]) (:import (java.util NoSuchElementException))) (dv/set-debug-opts! dv/full-debug-opts) (defn clj-version-at-least [major-minor-vector] (let [clj-version ((juxt :major :minor) *clojure-version*) cmp (compare clj-version major-minor-vector)] (>= cmp 0))) (deftest test-iterators (let [v (fv/catvec (dv/cvec (range 1000)) (dv/cvec (range 1000 2048)))] (is (= (iterator-seq (.iterator ^Iterable v)) (iterator-seq (.iterator ^Iterable (seq v))) (iterator-seq (.listIterator ^java.util.List v)) (iterator-seq (.listIterator ^java.util.List (seq v))) (range 2048))) (is (= (iterator-seq (.listIterator ^java.util.List v 100)) (iterator-seq (.listIterator ^java.util.List (seq v) 100)) (range 100 2048))) (letfn [(iterator [xs] (.iterator ^Iterable xs)) (list-iterator ([xs] (.listIterator ^java.util.List xs)) ([xs start] (.listIterator ^java.util.List xs start)))] (do-template [iexpr cnt] (is (thrown? NoSuchElementException (let [iter iexpr] (dotimes [_ (inc cnt)] (.next ^java.util.Iterator iter))))) (iterator v) 2048 (iterator (seq v)) 2048 (list-iterator v) 2048 (list-iterator (seq v)) 2048 (list-iterator v 100) 1948 (list-iterator (seq v) 100) 1948)))) ;; This test can run in cljs, too, but at least in my testing only if ;; we use test.check version 0.10.0 or later. However, that seems to ;; be incompatible with running cljs tests with Clojure 1.6.0, so for ;; now at least this test is clj-only. ;; ;; Note: according to several deftest forms within the test.check ;; library's own internal set of tests, it ;; uses (is (:result (tc/quick-check ...))) to check whether a result ;; passes or fails. ;; ;; The doc string for the latest version of test.check as of ;; 2019-Sep-25 says :result is a legacy key, and that :pass? should ;; have the same value. I like the descriptiveness of :pass? better, ;; and would prefer to use that here, but core.rrb-vector is not using ;; that latest version of test.check yet. Consider updating to use ;; key :pass? instead of :result if core.rrb-vector updates to a ;; version of test.check that returns that key. ;; ;; When quick-check finds a failing test case, it still returns a map ;; that Clojure considers to be a logical true value, so the test will ;; still pass if you do `(is (tc/quick-check ...))`. (deftest test-reduce-subvec-catvec-generative (letfn [(insert-by-sub-catvec [v n] (fv/catvec (fv/subvec v 0 n) (dv/cvec ['x]) (fv/subvec v n))) (repeated-subvec-catvec [i] (reduce insert-by-sub-catvec (dv/cvec (range i)) (range i 0 -1)))] (is (:result (tc/quick-check 1000 (prop/for-all [cnt (gen/fmap (comp inc #(mod % 60000)) gen/pos-int)] (= (repeated-subvec-catvec cnt) (interleave (range cnt) (repeat 'x))))))))) ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ;; This code was copied from the issue: ;; https://clojure.atlassian.net/projects/CRRBV/issues/CRRBV-13 (defn assoc-in-bytevec [my-vector-of use-transient? n indices] (let [coll (into (my-vector-of :byte) (range n)) coll2 (reduce (fn [coll i] (if use-transient? (assoc! coll i -1) (assoc coll i -1))) (if use-transient? (transient coll) coll) indices)] (if use-transient? (persistent! coll2) coll2))) (defn assoc-in-bytevec-core [& args] (apply assoc-in-bytevec clojure.core/vector-of args)) (defn assoc-in-bytevec-rrbv [& args] (apply assoc-in-bytevec fv/vector-of args)) (deftest test-crrbv-13 (doseq [use-transient? [false true]] (doseq [args [[10 [5]] [32 [0]] [32 [32]] [64 [32]] [64 [64]]]] (is (= (apply assoc-in-bytevec-core false args) (apply assoc-in-bytevec-rrbv use-transient? args)) (str "args=" (cons use-transient? args)))) (doseq [args [[64 [0]] [64 [1]] [64 [31]]]] (is (= (apply assoc-in-bytevec-core false args) (apply assoc-in-bytevec-rrbv use-transient? args)) (str "args=" (cons use-transient? args)))))) ;; Double check that the type of the mutable fields used to store the ;; cached hash values of collections are 32-bit int, not 64-bit long, ;; because 64-bit long do not have the Java Memory Model thread-safety ;; guarantees that 32-bit int values do. (defn member-data-by-name [klass field-name-as-symbol] (let [klass-dat (ref/type-reflect klass) members (:members klass-dat)] (first (filter (fn [x] (= field-name-as-symbol (:name x))) members)))) (deftest test-crrbv-26 ;; For reasons I do not understand, the code below throws an ;; exception with Clojure 1.6.0 and earlier because it cannot find ;; the Vector and VecSeq classes. It seems to work fine on Clojure ;; 1.7.0 and later, and checking on those versions is enough for the ;; purposes of this test. (when (clj-version-at-least [1 7]) (let [vector-class (Class/forName "clojure.core.rrb_vector.rrbt.Vector") vecseq-class (Class/forName "clojure.core.rrb_vector.rrbt.VecSeq")] (is (= 'int (:type (member-data-by-name vector-class '_hash)))) (is (= 'int (:type (member-data-by-name vector-class '_hasheq)))) (is (= 'int (:type (member-data-by-name vecseq-class '_hash)))) (is (= 'int (:type (member-data-by-name vecseq-class '_hasheq))))))) ================================================ FILE: src/test/clojure/clojure/core/rrb_vector/test_cljs.clj ================================================ ; Copyright (c) Rich Hickey. All rights reserved. ; The use and distribution terms for this software are covered by the ; Eclipse Public License 1.0 (http://opensource.org/licenses/eclipse-1.0.php) ; which can be found in the file epl-v10.html at the root of this distribution. ; By using this software in any fashion, you are agreeing to be bound by ; the terms of this license. ; You must not remove this notice, or any other, from this software. (ns ^{:doc "Clojurescript tests for core.rrb-vector"} clojure.core.rrb-vector.test-cljs (:require [clojure.test :refer :all])) ;; This file was copied from namespace clojure.data.xml.test-cljs ;; in the data.xml library tests, then modified for use by ;; core.rrb-vector, so that core.rrb-vector's ClojureScript tests ;; could also be run on build.clojure.org via a mvn command. (deftest ^:cljs-nashorn clojurescript-test-suite (try (require 'clojure.core.rrb-vector.cljs-testsuite) (eval '(clojure.core.rrb-vector.cljs-testsuite/run-testsuite! "target/cljs-test-nashorn")) (catch Exception e (if (or (neg? (compare ((juxt :major :minor) *clojure-version*) [1 8])) (neg? (compare (System/getProperty "java.runtime.version") "1.8"))) (println "WARN: ignoring cljs testsuite error on clojure < 1.8 or jdk < 1.8" *clojure-version* (System/getProperty "java.runtime.name") (System/getProperty "java.vm.version") (System/getProperty "java.runtime.version") \newline (str e)) (do (println "ERROR: cljs nashorn test suite should be able to run on clojure >= 1.8 and jdk >= 1.8" *clojure-version* (System/getProperty "java.runtime.name") (System/getProperty "java.vm.version") (System/getProperty "java.runtime.version")) (throw e)))))) ================================================ FILE: src/test/clojure/clojure/core/rrb_vector/test_common.clj ================================================ (ns clojure.core.rrb-vector.test-common (:require [clojure.test :as test :refer [deftest testing is are]] [clojure.core.reducers :as r] [clojure.core.rrb-vector.test-utils :as u] [clojure.core.rrb-vector :as fv] [clojure.core.rrb-vector.debug :as dv] [clojure.core.rrb-vector.debug-platform-dependent :as pd])) ;; The intent is to keep this file as close to ;; src/test/cljs/clojure/core/rrb_vector/test_common.cljs as possible, ;; so that when we start requiring Clojure 1.7.0 and later for this ;; library, this file and that one can be replaced with a common test ;; file with the suffix .cljc (dv/set-debug-opts! dv/full-debug-opts) (deftest test-slicing (testing "slicing" (is (dv/check-subvec u/extra-checks? 32000 10 29999 1234 18048 10123 10191)))) (deftest test-splicing (testing "splicing" (is (dv/check-catvec u/extra-checks? 1025 1025 3245 1025 32768 1025 1025 10123 1025 1025)) (is (dv/check-catvec u/extra-checks? 10 40 40 40 40 40 40 40 40)) (is (apply dv/check-catvec u/extra-checks? (repeat 30 33))) (is (dv/check-catvec u/extra-checks? 26091 31388 1098 43443 46195 4484 48099 7905 13615 601 13878 250 10611 9271 53170)) ;; Order that catvec will perform splicev calls: (let [my-splice (if u/extra-checks? dv/checking-splicev fv/catvec) counts [26091 31388 1098 43443 46195 4484 48099 7905 13615 601 13878 250 10611 9271 53170] prefix-sums (reductions + counts) ranges (map range (cons 0 prefix-sums) prefix-sums) [v01 v02 v03 v04 v05 v06 v07 v08 v09 v10 v11 v12 v13 v14 v15] (map fv/vec ranges) v01-02 (my-splice v01 v02) ;; top level catvec call v03-04 (my-splice v03 v04) ;; top level catvec call v01-04 (my-splice v01-02 v03-04) ;; top level catvec call v05-06 (my-splice v05 v06) ;; recurse level 1 catvec call v07-08 (my-splice v07 v08) ;; recurse level 1 catvec call v05-08 (my-splice v05-06 v07-08) ;; recurse level 1 catvec call v09-10 (my-splice v09 v10) ;; recurse level 2 catvec call v11-12 (my-splice v11 v12) ;; recurse level 2 catvec call v09-12 (my-splice v09-10 v11-12) ;; recurse level 2 catvec call v13-14 (my-splice v13 v14) ;; recurse level 3 catvec call v13-15 (my-splice v13-14 v15) ;; recurse level 3 catvec call v09-15 (my-splice v09-12 v13-15) ;; recurse level 2 catvec call v05-15 (my-splice v05-08 v09-15) ;; recurse level 1 catvec call v01-15 (my-splice v01-04 v05-15) ;; top level catvec call exp-val (range (last prefix-sums))] (is (= -1 (dv/first-diff v01-15 exp-val))) (is (= -1 (dv/first-diff (into v01-04 v05-15) exp-val)))))) (deftest test-reduce (let [v1 (vec (range 128)) v2 (fv/vec (range 128))] (testing "reduce" (is (= (reduce + v1) (reduce + v2)))) (testing "reduce-kv" (is (= (reduce-kv + 0 v1) (reduce-kv + 0 v2)))))) (deftest test-reduce-2 (let [my-subvec (if u/extra-checks? dv/checking-subvec fv/subvec) v1 (my-subvec (dv/cvec (range 1003)) 500) v2 (dv/cvec (range 500 1003))] (is (= (reduce + 0 v1) (reduce + 0 v2) (reduce + 0 (r/map identity (seq v1))) (reduce + 0 (r/map identity (seq v2))))))) (deftest test-reduce-3 (let [v0 (vec []) rv0 (fv/vec [])] (testing "reduce" (is (= (reduce + v0) (reduce + rv0)))) (testing "reduce-kv" (is (= (reduce-kv + 0 v0) (reduce-kv + 0 rv0)))))) (deftest test-seq (let [v (fv/vec (range 128)) s (seq v)] (testing "seq contents" (is (= v s))) (testing "chunked-seq?" (is (chunked-seq? s))) (testing "internal-reduce" (is (satisfies? clojure.core.protocols/InternalReduce s))))) (deftest test-assoc (let [my-subvec (if u/extra-checks? dv/checking-subvec fv/subvec)] (let [v1 (fv/vec (range 40000)) v2 (reduce (fn [out [k v]] (assoc out k v)) (assoc v1 40000 :foo) (map-indexed vector (rseq v1)))] (is (= (concat (rseq v1) [:foo]) v2))) (are [i] (= :foo (-> (range 40000) (fv/vec) (my-subvec i) (assoc 10 :foo) (nth 10))) 1 32 1024 32768))) (deftest test-assoc! (let [my-subvec (if u/extra-checks? dv/checking-subvec fv/subvec)] (let [v1 (fv/vec (range 40000)) v2 (persistent! (reduce (fn [out [k v]] (assoc! out k v)) (assoc! (transient v1) 40000 :foo) (map-indexed vector (rseq v1))))] (is (= (concat (rseq v1) [:foo]) v2))) (are [i] (= :foo (-> (range 40000) (fv/vec) (my-subvec i) (transient) (assoc! 10 :foo) (persistent!) (nth 10))) 1 32 1024 32768))) (deftest test-relaxed (let [my-catvec (if u/extra-checks? dv/checking-catvec fv/catvec)] (is (= (into (my-catvec (dv/cvec (range 123)) (dv/cvec (range 68))) (range 64)) (concat (range 123) (range 68) (range 64)))) (is (= (dv/slow-into (my-catvec (dv/cvec (range 123)) (dv/cvec (range 68))) (range 64)) (concat (range 123) (range 68) (range 64)))))) (deftest test-hasheq (let [my-catvec (if u/extra-checks? dv/checking-catvec fv/catvec)] (is (= (hash []) (hash (fv/vector)))) ;; CRRBV-25 (let [v1 (dv/cvec (range 1024)) v2 (dv/cvec (range 1024)) v3 (my-catvec (dv/cvec (range 512)) (dv/cvec (range 512 1024))) s1 (seq v1) s2 (seq v2) s3 (seq v3)] (is (= (hash v1) (hash v2) (hash v3) (hash s1) (hash s2) (hash s3))) (is (= (hash (nthnext s1 120)) (hash (nthnext s2 120)) (hash (nthnext s3 120))))))) (deftest test-reduce-subvec-catvec (let [my-catvec (if u/extra-checks? dv/checking-catvec fv/catvec) my-subvec (if u/extra-checks? dv/checking-subvec fv/subvec)] (letfn [(insert-by-sub-catvec [v n] (my-catvec (my-subvec v 0 n) (dv/cvec ['x]) (my-subvec v n))) (repeated-subvec-catvec [i] (reduce insert-by-sub-catvec (dv/cvec (range i)) (range i 0 -1)))] (is (= (repeated-subvec-catvec 2371) (interleave (range 2371) (repeat 'x))))))) (def pos-infinity Double/POSITIVE_INFINITY) (deftest test-reduce-subvec-catvec2 (let [my-catvec (if u/extra-checks? dv/checking-catvec fv/catvec) my-subvec (if u/extra-checks? dv/checking-subvec fv/subvec)] (letfn [(insert-by-sub-catvec [v n] (my-catvec (my-subvec v 0 n) (dv/cvec ['x]) (my-subvec v n))) (repeated-subvec-catvec [i] (reduce insert-by-sub-catvec (dv/cvec (range i)) (take i (interleave (range (quot i 2) pos-infinity) (range (quot i 2) pos-infinity)))))] (let [n 2371 v (repeated-subvec-catvec n)] (is (every? #(or (integer? %) (= 'x %)) v)) (is (= (count v) (* 2 n))))))) (deftest test-splice-high-subtree-branch-count (let [my-catvec (if u/extra-checks? dv/checking-catvec fv/catvec) my-subvec (if u/extra-checks? dv/checking-subvec fv/subvec) x (fv/vec (repeat 1145 \a)) y (my-catvec (my-subvec x 0 778) (my-subvec x 778 779) (dv/cvec [1]) (my-subvec x 779)) z (my-catvec (my-subvec y 0 780) (dv/cvec [2]) (my-subvec y 780 781) (my-subvec y 781)) res (my-catvec (my-subvec z 0 780) (dv/cvec []) (dv/cvec [3]) (my-subvec z 781)) expected (concat (repeat 779 \a) [1] [3] (repeat 366 \a))] (is (= res expected)))) ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ;; This problem reproduction code is from CRRBV-12 ticket: ;; https://clojure.atlassian.net/projects/CRRBV/issues/CRRBV-12 ;; I would prefer to have all of the data that is the value of ;; crrbv-12-data read from a separate file, but it is not terribly ;; long, and having it in the code avoids having to figure out how to ;; find and read the file on N different JavaScript runtime ;; environments, for the ClojureScript version of the test. (def crrbv-12-data [7912 7831 5393 5795 6588 2394 6403 6237 6152 5890 6507 6388 6100 7400 6340 7624 6379 5430 6335 5883 5570 6220 6319 6442 5666 3901 6974 5440 6626 7782 6760 6066 7763 9547 5585 6724 5407 5675 7727 7666 6845 6658 5409 7304 7291 5826 6523 5529 7387 6275 7193 5563 6572 7150 2949 1133 7312 7267 7135 7787 5812 7372 4295 5937 2931 4846 6149 1901 6680 7319 7845 7517 6722 6535 6362 5457 6649 7757 7463 6755 7436 6364 7361 7174 6048 6657 6533 5763 6074 6744 6734 5668 61 3842 5395 6489 1723 6248 7664 6645 5943 5428 6995 6688 7088 6305 6198 6197 5765 3691 7157 7305 7631 6058 6655 7846 7746 686 6024 6473 6150 5951 1761 7900 7084 5637 6607 5561 5772 7232 8512 6249 7377 5437 4830 6939 6355 7100 7884 951 6765 7054 1367 4580 7284 5414 7344 7525 5801 6374 6685 6737 4413 7353 1851 5973 7538 7116 6359 6605 6743 6153 7398 4757 6623 7546 7013 7091 7501 5749 6368 7911 6675 3246 6304 6469 6868 7701 5768 6369 6996 6346 6171 5884 6757 7615 5986 9904 5982 7049 6011 7716 6646 6178 6636 6637 7700 3390 6107 6938 2513 5663 5309 5673 7069 6615 5825 7183 5600 2188 5807 7635 7257 4803 6740 5865 6869 6968 7404 5124 7565 6169 7681 6181 5427 9861 7669 5936 5588 5463 6059 5695 5784 6768 6922 5720 6229 9173 6486 6399 6013 5517 7198 7320 6970 5969 7593 7351 7622 6561 5739 6433 6452 6320 6979 6260 6763 5539 6292 7133 6571 6108 7455 8470 7148 7597 6935 6865 7852 6549 6506 5425 6552 5551 5612 7230 809 2694 6408 6783 7626 6703 2754 1015 6809 7584 5473 6165 7105 6447 5856 6739 5564 7886 7856 7355 5814 919 6900 6257 118 7259 7419 6278 7619 6401 5970 7537 2899 6012 7190 5500 6122 5817 7620 6402 5811 5412 6822 5643 6138 5948 5523 4884 6460 5828 7159 5405 6224 7192 8669 5827 538 7416 6598 5577 6769 7547 7323 6748 6398 1505 6211 6466 6699 6207 6444 6863 7646 5917 6796 5619 6282 354 6418 5687 2536 6238 1166 6376 3852 5955 188 7218 7477 6926 7694 7253 5880 5424 7392 6337 7438 7814 3205 6336 6465 6812 1102 6468 6034 6133 5849 7578 7863 5761 6372 7568 5813 6380 6481 6942 7676 5552 7015 7120 7838 5684 6101 6834 6092 7917 6124 867 7187 5527 7488 5900 6267 6443 724 6073 6608 6407 6040 5540 6061 5554 5469 6255 6542 7336 2272 6921 1078 5593 7045 5013 6870 6712 6537 6785 6333 5892 6633 7522 6697 5915 5567 6606 5820 7653 7554 6932 5824 9330 8780 7203 7204 7519 7633 6529 7564 5718 7605 6579 7621 4462 6009 6950 6430 5911 5946 6877 7830 6570 7421 6449 6684 8425 5983 5846 5505 6097 5773 5781 6463 6867 5774 6601 1577 5642 6959 6251 7741 7391 6036 6892 5097 6874 6580 6348 5904 6709 5976 7411 7223 6252 7414 6813 4378 5888 5546 6385 401 5912 7828 7775 5925 6151 7648 5810 7673 6250 5808 7251 1407 5644 7439 7901 1964 6631 6858 7630 7771 2892 946 6397 5443 5715 5665 7306 6233 5566 5447 7011 6314 2054 5786 2170 6901 6077 6239 7791 6960 7891 7878 7758 5829 7611 7059 5455 6654 6459 6949 7406 7854 5805 6564 7033 6445 5939 6706 6103 7614 7902 6527 7479 6196 6484 3521 7269 6055 7331 6184 6746 6936 5891 6687 5771 7136 6625 7865 5864 6704 7726 5842 6295 6910 5277 7528 5689 5674 7457 7086 5220 317 7720 6720 5913 7098 5450 7275 7521 7826 7007 6378 7277 6844 7177 5482 97 6730 7861 5601 6000 6039 6953 5624 6450 6736 7492 5499 5822 7276 2889 7102 6648 6291 865 7348 7330 1449 6719 5550 7326 6338 6714 7805 7082 6377 2791 7876 5870 7107 7505 5416 7057 6021 7037 6331 5698 6721 5180 7390 5938 9067 7215 4566 8051 6557 6161 5894 1379 7335 2602 6520 7199 6878 6366 6948 7202 4791 7338 7442 5987 7099 7632 5453 4755 4947 7786 6254 7103 7595 6670 6485 6117 6756 6339 7240 7609 6853 6299 7205 4857 7511 576 5835 5396 5997 5508 6413 6219 5403 7686 9189 6634 5503 6801 7508 5611 7667 7572 7587 6015 7153 7340 6279 5646 2004 2708 7119 5737 3258 7427 6204 6476 6511 2300 7055 5389 6984 5438 6002 6272 5756 5734 6913 6425 6847 5657 6357 6862 6030 5522 6943 3518 6139 6671 7764 6493 5691 6082 4635 6640 6898 7262 9391 6828 2277 6690 6464 5759 7441 6622 1262 7114 6294 7070 6539 6788 6167 7824 6382 2512 7322 5992 7696 5445 5538 6140 7151 6409 7085 6166 6263 1194 5544 7141 5906 2939 7389 7290 6491 6322 8324 7341 7246 5610 7536 6946 7540 7760 6293 5589 7009 7822 5456 6805 5841 7722 5559 7265 6903 3517 1243 6078 7180 6147 8063 7395 7551 5460 6421 7567 6546 6941 6301 5486 7347 6479 5990 5932 6881 7737 6051 7375 5762 6897 2967 7297 7263 6965 6752 6158 7556 6794 7641 7628 2374 6289 7286 7581 6008 491 6919 9157 7002 6585 7960 6967 7692 7128 5680 5037 5752 6223 5989 7545 6584 7282 6221 871 6116 5484 6350 6266 6889 6216 1892 924 5875 7658 5461 5410 8352 7072 5724 6931 6050 6125 5519 6711 7518 6613 7576 7989 5603 7214 6664 2933 5839 7454 9353 6512 7242 7768 6037 6567 6673 8438 7364 5406 6080 577 6895 5742 5722 6944 6273 5965 5464 6876 7719 7311 7258 6829 7280 6028 5740 9162 9858 6695 7239 6972 7025 7147 7039 6226 6135 7219 6477 6708 767 5432 7405 7580 3790 372 7523 6597 5922 6105 5434 9587 6173 7739 5984 5854 2153 6912 7476 7598 5985 5874 8723 5628 5496 7352 4829 6483 7211 6933 5545 7544 5444 5790 8223 1089 6676 5667 6749 6777 5429 6347 5399 5662 6446 5524 6909 5415 7742 6343 5921 7160 7175 7026 1838 6894 4355 52 6192 5341 6945 7366 7816 2006 7380 6531 6904 5958 6270 6069 5574 7349 7212 5256 6010 6961 2825 6691 7792 6017 6888 7707 6693 6456 5871 7238 7780 7256 5630 7744 6855 5077 6958 6046 6707 6530 6501 7298 5636 6121 1105 6243 5541 6814 6732 7500 6866 7093 7745 7030 4338 6517 5991 6458 6213 4695 5542 7853 5926 6550 5230 7432 7006 5858 7677 6495 7310 6432 7487 7670 7674 6245 7315 7893 4360 940 6303 5757 7697 7506 5491 1309 7695 2214 5553 6964 7403 7302 6589 7851 7186 6193 2964 6242 6545 7012 7010 5448 5767 6647 7610 7485 6509 6083 6525 5607 9982 6244 7832 7213 6308 1320 7092 5656 6342 7864 7140 2577 104 1343 6786 7654 6156 5584 6818 5604 6681 6038 6056 6594 6603 7040 5468 5957 7229 6735 5510 6700 7725 7431 7154 7682 6558 7158 7470 7749 5400 5397 7247 6582 5832 7041 7325 5777 6759 6577 6195 7895 9626 7042 6026 6741 7811 7942 8926 1499 6772 7561 5565 3587 7273 6172 7428 6787 7181 5754 7579 5535 5543 5818 7264 1854 6998 7425 5394 6661 6562 375 2990]) (defn quicksort [my-catvec v] (if (<= (count v) 1) v (let [[x & xs] v] (my-catvec (quicksort my-catvec (dv/cvec (filter #(<= % x) xs))) (dv/cvec [x]) (quicksort my-catvec (dv/cvec (filter #(> % x) xs))))))) (defn ascending? [coll] (every? (fn [[a b]] (<= a b)) (partition 2 1 coll))) (deftest test-crrbv-12 (let [my-catvec (if u/extra-checks? dv/checking-catvec fv/catvec) v (dv/cvec crrbv-12-data)] (testing "Ascending order after quicksort" (is (ascending? (quicksort my-catvec v)))) (testing "Repeated catvec followed by pop" (is (= [] (nth (iterate pop (nth (iterate #(my-catvec (dv/cvec [0]) %) []) 963)) 963)))))) ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; (defn npe-for-1025-then-pop! [kind] (let [my-pop! (if u/extra-checks? dv/checking-pop! pop!) bfactor-squared (* 32 32) mk-vector (case kind :object-array fv/vector :long-array #(fv/vector-of :long)) boundary 54 v1 (-> (mk-vector) (into (range boundary)) (into (range boundary (inc bfactor-squared)))) v2 (-> (mk-vector) (into (range bfactor-squared)) (transient) (my-pop!) (persistent!)) v3 (-> (mk-vector) (into (range boundary)) (into (range boundary (inc bfactor-squared))) (transient) (my-pop!) (persistent!)) v4 (-> (mk-vector) (into (range (inc bfactor-squared))) (transient) (my-pop!) (persistent!))] (is (= (seq v1) (range (inc bfactor-squared)))) (is (= (seq v2) (range (dec bfactor-squared)))) ;; This used to fail with core.rrb-vector version 0.0.14 with ;; NullPointerException while traversing the seq on clj. It gets ;; a different kind of error with cljs. (is (= (seq v3) (range bfactor-squared))) ;; This one caused a NullPointerException with version 0.0.14 ;; while traversing the seq (is (= (seq v4) (range bfactor-squared))))) (deftest test-npe-for-1025-then-pop! (doseq [kind [:object-array :long-array]] (npe-for-1025-then-pop! kind))) ;; This problem reproduction code is slightly modified from a version ;; provided in a comment by Mike Fikes on 2018-Dec-09 for this issue: ;; https://clojure.atlassian.net/projects/CRRBV/issues/CRRBV-20 (defn play [my-vector my-catvec my-subvec players rounds] (letfn [(swap [marbles split-ndx] (my-catvec (my-subvec marbles split-ndx) (my-subvec marbles 0 split-ndx))) (rotl [marbles n] (swap marbles (mod n (count marbles)))) (rotr [marbles n] (swap marbles (mod (- (count marbles) n) (count marbles)))) (place-marble [marbles marble] (let [marbles (rotl marbles 2)] [(my-catvec (my-vector marble) marbles) 0])) (remove-marble [marbles marble] (let [marbles (rotr marbles 7) first-marble (nth marbles 0)] [(my-subvec marbles 1) (+ marble first-marble)])) (play-round [marbles round] (if (zero? (mod round 23)) (remove-marble marbles round) (place-marble marbles round))) (add-score [scores player round-score] (if (zero? round-score) scores (assoc scores player (+ (get scores player 0) round-score))))] (loop [marbles (my-vector 0) round 1 player 1 scores {} ret []] (let [[marbles round-score] (play-round marbles round) scores (add-score scores player round-score)] (if (> round rounds) (conj ret {:round round :marbles marbles}) (recur marbles (inc round) (if (= player players) 1 (inc player)) scores (conj ret {:round round :marbles marbles}))))))) (defn play-core [& args] (apply play clojure.core/vector clojure.core/into clojure.core/subvec args)) (defn play-rrbv [& args] (let [my-catvec (if u/extra-checks? dv/checking-catvec fv/catvec) my-subvec (if u/extra-checks? dv/checking-subvec fv/subvec)] (apply play fv/vector my-catvec my-subvec args))) (deftest test-crrbv-20 ;; This one passes (is (= (play-core 10 1128) (play-rrbv 10 1128))) ;; This ends up with (play-rrbv 10 1129) throwing an exception, with ;; core.rrb-vector version 0.0.14 (is (= (play-core 10 1129) (play-rrbv 10 1129))) ;; The previous test demonstrates a bug in the transient RRB vector ;; implementation. The one below demonstrated a similar bug in the ;; persistent RRB vector implementation in version 0.0.14. (let [v1128 (:marbles (last (play-rrbv 10 1128))) v1129-pre (-> v1128 (fv/subvec 2) (conj 2001))] (is (every? integer? (conj v1129-pre 2002))))) (deftest test-crrbv-21 ;; The following sequence of operations gave a different exception ;; than the above with core.rrb-vector version 0.0.14, and was a ;; different root cause with a distinct fix required. I do not ;; recall whether it was the same root cause as ;; npe-for-1025-then-pop! but both test cases are included for extra ;; testing goodness. (let [v1128 (:marbles (last (play-rrbv 10 1128))) vpop1 (reduce (fn [v i] (pop v)) v1128 (range 1026))] (is (every? integer? (pop vpop1))) ;; The transient version below gave a similar exception with ;; version 0.0.14, but the call stack went through the transient ;; version of popTail, rather than the persistent version of ;; popTail that the one above does. (is (every? integer? (persistent! (pop! (transient vpop1))))))) (deftest test-crrbv-22 (testing "pop! from a regular transient vector with 32*32+1 elements" (let [v1025 (into (fv/vector) (range 1025))] (is (= (persistent! (pop! (transient v1025))) (range 1024))))) (testing "pop from a persistent regular vector with 32*32+1 elements" (let [v1025 (into (fv/vector) (range 1025))] (is (= (pop v1025) (range 1024)))))) ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ;; This code was copied from ;; https://github.com/mattiasw2/adventofcode1/blob/master/src/adventofcode1/nineteen_b.clj ;; mentioned in issue ;; https://clojure.atlassian.net/projects/CRRBV/issues/CRRBV-14 (defn puzzle-b [n my-vec my-catvec my-subvec] (letfn [(remove-at [arr idx] (my-catvec (my-subvec arr 0 idx) (my-subvec arr (inc idx)))) (create-arr [size] (my-vec (range 1 (inc size)))) (fv-rest [arr] (my-subvec arr 1)) (calculate-opposite [n] (int (/ n 2))) (move [elfs] (let [lc (count elfs)] (if (= 1 lc) {:ok (first elfs)} (let [current (first elfs) opposite-pos (calculate-opposite lc) _ (assert (> opposite-pos 0)) _ (assert (< opposite-pos lc)) opposite-elf (nth elfs opposite-pos) other2 (fv-rest (remove-at elfs opposite-pos))] (my-catvec other2 (dv/cvec [current])))))) (puzzle-b-sample [elfs round] (let [elfs2 (move elfs)] ;;(println "round=" round "# elfs=" (count elfs)) (if (:ok elfs2) (:ok elfs2) (recur elfs2 (inc round)))))] (puzzle-b-sample (create-arr n) 1))) (defn puzzle-b-core [n] (puzzle-b n clojure.core/vec clojure.core/into clojure.core/subvec)) (defn get-shift [v] (.-shift v)) (defn vstats [v] (str "cnt=" (count v) " shift=" (get-shift v) " %=" (pd/format "%5.1f" (* 100.0 (dv/fraction-full v))))) ;;(def custom-catvec-data (atom [])) (defn custom-catvec [& args] (let [my-catvec (if u/extra-checks? dv/checking-catvec fv/catvec) ;;n (count @custom-catvec-data) max-arg-shift (apply max (map get-shift args)) ret (apply my-catvec args) ret-shift (get-shift ret)] (when (or (>= ret-shift 30) (> ret-shift max-arg-shift)) (doall (map-indexed (fn [idx v] (println (str "custom-catvec ENTER v" idx " " (vstats v)))) args)) (println (str "custom-catvec LEAVE ret " (vstats ret)))) ;;(swap! custom-catvec-data conj {:args args :ret ret}) ;;(println "custom-catvec RECRD in index" n "of @custom-catvec-data") ret)) (defn puzzle-b-rrbv [n] (let [my-subvec (if u/extra-checks? dv/checking-subvec fv/subvec)] (puzzle-b n fv/vec custom-catvec my-subvec))) (deftest test-crrbv-14 ;; This one passes (u/reset-optimizer-counts!) (is (= (puzzle-b-core 977) (puzzle-b-rrbv 977))) (u/print-optimizer-counts) ;; (puzzle-b-rrbv 978) throws ;; ArrayIndexOutOfBoundsException (u/reset-optimizer-counts!) (is (integer? (puzzle-b-rrbv 978))) (u/print-optimizer-counts)) (deftest test-crrbv-30 (let [v1 [1 2 3] tv1 (transient [1 2 3]) fv1 (fv/vector 1 2 3) tfv1 (transient (fv/vector 1 2 3))] (doseq [[v msg] [[v1 "v1"] [fv1 "fv1"] [(map #(nth v1 %) [0 1 2]) "#(nth v1 %)"] [(map #(v1 %) [0 1 2]) "#(v1 %)"] [(map #(nth tv1 %) [0 1 2]) "#(nth tv1 %)"] [(map #(tv1 %) [0 1 2]) "#(tv1 %)"] [(map #(nth fv1 %) [0 1 2]) "#(nth fv1 %)"] [(map #(fv1 %) [0 1 2]) "%(fv1 %)"] [(map #(nth tfv1 %) [0 1 2]) "#(nth tfv1 %)"] [(map #(tfv1 %) [0 1 2]) "#(tfv1 %)"]]] (is (= '(1 2 3) v) (str "Failing case: " msg))))) ================================================ FILE: src/test/clojure/clojure/core/rrb_vector/test_utils.clj ================================================ (ns clojure.core.rrb-vector.test-utils (:require [clojure.test :as test] [clojure.string :as str] [clojure.core.rrb-vector.rrbt :as rrbt])) ;; Parts of this file are nearly identical to ;; src/test/cljs/clojure/core/rrb_vector/test_utils.cljs, but also ;; significant parts are specific to each of the clj/cljs versions, so ;; while they could later be combined into a .cljc file, it may not ;; give much benefit to do so. (def extra-checks? false) (defn reset-optimizer-counts! [] (println "reset all optimizer counts to 0") (reset! rrbt/peephole-optimization-count 0) (reset! rrbt/fallback-to-slow-splice-count1 0) (reset! rrbt/fallback-to-slow-splice-count2 0)) (defn print-optimizer-counts [] (println "optimizer counts: peephole=" @rrbt/peephole-optimization-count "fallback1=" @rrbt/fallback-to-slow-splice-count1 "fallback2=" @rrbt/fallback-to-slow-splice-count2)) (defn now-msec [] (System/currentTimeMillis)) (def num-deftests-started (atom 0)) (def last-deftest-start-time (atom nil)) (defn print-jvm-classpath [] (let [cp-str (System/getProperty "java.class.path") cp-strs (str/split cp-str #":")] (println "java.class.path:") (doseq [cp-str cp-strs] (println " " cp-str)))) (defn print-test-env-info [] (try (let [shift-var (resolve 'clojure.core.rrb-vector.parameters/shift-increment)] (println "shift-increment=" @shift-var " (from parameters namespace)")) (catch Exception e (println "shift-increment=5 (assumed because no parameters namespace)"))) (println "extra-checks?=" extra-checks?) (let [p (System/getProperties)] (println "java.vm.name" (get p "java.vm.name")) (println "java.vm.version" (get p "java.vm.version")) (print-jvm-classpath) (println "(clojure-version)" (clojure-version)))) (defmethod test/report :begin-test-var [m] (let [n (swap! num-deftests-started inc)] (when (== n 1) (print-test-env-info))) (println) (println "starting clj test" (:var m)) (reset! last-deftest-start-time (now-msec))) (defmethod test/report :end-test-var [m] (println "elapsed time (sec)" (/ (- (now-msec) @last-deftest-start-time) 1000.0))) ;; Enable tests to be run on versions of Clojure before 1.10, when ;; ex-message was added. (defn ex-message-copy "Returns the message attached to ex if ex is a Throwable. Otherwise returns nil." {:added "1.10"} [ex] (when (instance? Throwable ex) (.getMessage ^Throwable ex))) (defn ex-cause-copy "Returns the cause of ex if ex is a Throwable. Otherwise returns nil." {:added "1.10"} [ex] (when (instance? Throwable ex) (.getCause ^Throwable ex))) ================================================ FILE: src/test/resources/clojure/core/rrb_vector/cljs_testsuite.clj ================================================ (ns clojure.core.rrb-vector.cljs-testsuite (:require [clojure.test :refer [is]] [cljs.repl.nashorn :as repl-nh] [cljs.build.api :as bapi] [clojure.java.io :as io] [clojure.core.rrb-vector.test-utils :as u]) (:import java.nio.file.Files java.nio.file.attribute.FileAttribute)) ;; This file was copied from namespace clojure.data.xml.cljs-testsuite ;; in the data.xml library tests, then modified for use by ;; core.rrb-vector, so that core.rrb-vector's ClojureScript tests ;; could also be run on build.clojure.org via a mvn command. (defn tempdir [] (str (Files/createTempDirectory "cljs-nashorn-" (into-array FileAttribute [])))) (defn compile-testsuite! [dir] (let [out (io/file dir "tests.js") inputs ["src/main/clojure" "src/test/clojure" "src/test/cljs"]] (println "INFO" "Compiling cljs testsuite from" inputs "into" (str out)) (bapi/build (apply bapi/inputs inputs) {:output-to (str out) :output-dir dir :main 'clojure.core.rrb-vector.test-cljs :optimizations :advanced :pseudo-names true :pretty-print true}))) (defn run-testsuite! [dir] (System/setProperty "nashorn.persistent.code.cache" "target/nashorn_code_cache") (let [t1 (u/now-msec) engine (repl-nh/create-engine)] (compile-testsuite! dir) (println "INFO" "Elapsed time (sec)" (/ (- (u/now-msec) t1) 1000.0) "to compile ClojureScript code") (println "INFO" "Running cljs tests in nashorn with persistent code cache in" (System/getProperty "nashorn.persistent.code.cache")) (.eval engine (io/reader (io/file dir "tests.js"))) (let [{:as res :keys [fail error]} (read-string (.eval engine "clojure.core.rrb_vector.test_cljs._main_nashorn()"))] (is (and (zero? fail) (zero? error)) (pr-str res))))) ================================================ FILE: src/test_local/clojure/clojure/core/rrb_vector_check.clj ================================================ (ns clojure.core.rrb-vector-check (:require [clojure.test :as test :refer [deftest testing is are]] [clojure.core.rrb-vector.test-utils :as u] [clojure.core.rrb-vector :as fv] [clojure.test.check.generators :as gen] [collection-check.core :refer [assert-vector-like]]) (:use clojure.test)) ;; On my 2015 MacBook Pro with JDK 11, a few num-tests values and ;; approximate run time of assert-vector-like test on Clojure ;; implementation: ;; 1,000: 16 sec ;; 10,000: 120 sec (def short-num-tests 1000) (def medium-num-tests 10000) (def long-num-tests 100000) ;;(def num-tests short-num-tests) (def num-tests medium-num-tests) ;;(def num-tests long-num-tests) ;; collection-check.core/assert-vector-like calls test.chuck/checking. ;; The README for the test.chuck library says that test.chuck/checking ;; is intended to be called directly within a `deftest` form, with no ;; need for any `is` or `are` calls, because test.chuck/checking makes ;; calls to those macros inside itself. ;; ;; I have confirmed, by intentionally making the function fv/vector ;; return incorrect values in some cases, that this deftest does fail ;; as it should, given the direct call to function assert-vector-like. (deftest collection-check (println "Before assert-vector-like with num-tests=" num-tests) (assert-vector-like num-tests (fv/vector) gen/int) (println "After assert-vector-like with num-tests=" num-tests) (is (every? nil? (.-array ^clojure.lang.PersistentVector$Node (.-root ^clojure.lang.PersistentVector (vector))))))