Repository: scandum/wolfsort
Branch: master
Commit: 56ad38959aee
Files: 18
Total size: 238.6 KB
Directory structure:
gitextract_sm3bx4qr/
├── LICENSE
├── README.md
└── src/
├── bench.c
├── blitsort.c
├── blitsort.h
├── crumsort.c
├── crumsort.h
├── extra_tests.c
├── fluxsort.c
├── fluxsort.h
├── gridsort.c
├── gridsort.h
├── quadsort.c
├── quadsort.h
├── skipsort.c
├── skipsort.h
├── wolfsort.c
└── wolfsort.h
================================================
FILE CONTENTS
================================================
================================================
FILE: LICENSE
================================================
This is free and unencumbered software released into the public domain.
Anyone is free to copy, modify, publish, use, compile, sell, or
distribute this software, either in source code form or as a compiled
binary, for any purpose, commercial or non-commercial, and by any
means.
In jurisdictions that recognize copyright laws, the author or authors
of this software dedicate any and all copyright interest in the
software to the public domain. We make this dedication for the benefit
of the public at large and to the detriment of our heirs and
successors. We intend this dedication to be an overt act of
relinquishment in perpetuity of all present and future rights to this
software under copyright law.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR
OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
OTHER DEALINGS IN THE SOFTWARE.
For more information, please refer to <http://unlicense.org>
================================================
FILE: README.md
================================================
Intro
-----
This document describes a stable adaptive hybrid bucket / quick / merge / drop sort named wolfsort.
The bucket sort, forming the core of wolfsort, is not a comparison sort, so wolfsort can be considered
a member of the radix-sort family. Quicksort and mergesort are well known. Dropsort gained popularity
after it was reinvented as Stalin sort. A [benchmark](https://github.com/scandum/wolfsort#benchmark-for-wolfsort-v1154-dripsort) is available at the bottom.
Why a hybrid?
-------------
While an adaptive merge sort is very fast at sorting ordered data, its inability to effectively
partition is its greatest weakness. A radix-like bucket sort, on the other hand, is unable to take advantage of
sorted data. While quicksort is fast at partitioning, a bucket sort is faster on medium-sized
arrays in the 1K - 1M element range. Dropsort in turn hybridizes surprisingly well with bucket
and sample sorts.
History
-------
Wolfsort 1, codename: quantumsort, started out with the concept that memory is in abundance on
modern systems. I theorized that by allocating 8n memory, performance could be increased by allowing
a bucket sort to partition in one pass.
Not all the memory would be used or ever accessed however, which is why I envisioned it as a type
of poor-man's quantum computing. The extra memory only serves to simplify computations. The concept
kind of worked, except that large memory allocations in C can be either very fast or very slow. I
didn't investigate why.
I also learned people don't like it when you use the term quantum computing outside of the proper
context, or perhaps they were upset about wolfsort's voracious appetite for memory. Hence the name wolfsort.
Wolfsort 2, codename: flowsort, is when I reinvented counting sort. Instead of making 1 pass and
using extra memory to deal with fluctuations in the data, flowsort makes one pass to calculate the
bucket sizes, then makes a second pass to neatly fill the buckets.
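The two-pass idea can be sketched as follows. This is a minimal illustration and not code taken from wolfsort.c: the function name `two_pass_distribute`, the shift-based bucket key, and the parameter layout are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: distribute in[0..n) into out[0..n) by bucket,
   where the bucket of a value is (value >> shift) and must be < buckets.
   Returns 0 on success, -1 if the offset table cannot be allocated. */
static int two_pass_distribute(const unsigned *in, unsigned *out, size_t n,
                               unsigned shift, size_t buckets)
{
    size_t *start = (size_t *)calloc(buckets + 1, sizeof *start);
    size_t i, b;

    if (start == NULL) return -1;

    /* Pass 1: count how many elements land in each bucket. */
    for (i = 0; i < n; i++) {
        b = in[i] >> shift;
        assert(b < buckets);
        start[b + 1]++;
    }

    /* A prefix sum turns the counts into each bucket's first output slot. */
    for (b = 0; b < buckets; b++) start[b + 1] += start[b];

    /* Pass 2: place every element in its bucket, preserving input order. */
    for (i = 0; i < n; i++) out[start[in[i] >> shift]++] = in[i];

    free(start);
    return 0;
}
```

Because the bucket sizes are known before anything is moved, no slack memory is needed to absorb fluctuations, at the cost of reading the input twice.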
Wolfsort 3, codename: dripsort, was inspired by the work of M. Lochbaum on [rhsort](https://github.com/mlochbaum/rhsort)
to use a method similar to dropsort to deal with bucket overflow, and to calculate the minimum and
maximum value to optimize for distributions with a small range of values. Dripsort once again makes
one pass and uses around 4n memory to deal with fluctuations in the data. Compared to v1 this is a
50% reduction in memory allocation, while at the same time significantly increasing robustness.
Analyzer
--------
Wolfsort uses the same analyzer as [fluxsort](https://github.com/scandum/fluxsort) to sort fully
in-order and fully reverse-order distributions in n comparisons. The array is split into 4 segments
for which a measure of presortedness is calculated. Mostly ordered segments are sorted with
[quadsort](https://github.com/scandum/quadsort), while mostly random segments are sorted with wolfsort.
In addition, the minimum and maximum value in the distribution is obtained.
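As a rough illustration of such an analysis pass, the sketch below counts out-of-order neighbours in each of the 4 segments and tracks the minimum and maximum along the way. The type `scan_result` and the functions `quarter_of` and `scan_quarters` are hypothetical names for this example. The actual analyzer in fluxsort may measure presortedness differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result type, not taken from the wolfsort sources. */
typedef struct {
    size_t disorder[4]; /* out-of-order adjacent pairs in each quarter */
    int min, max;       /* smallest and largest value seen */
} scan_result;

/* Quarter that index i belongs to; the remainder goes to the last quarter. */
static size_t quarter_of(size_t i, size_t quarter)
{
    size_t q = i / quarter;
    return q < 3 ? q : 3;
}

/* One pass over the array: count descents per quarter and track min/max. */
static scan_result scan_quarters(const int *a, size_t n)
{
    scan_result r = { {0, 0, 0, 0}, 0, 0 };
    size_t quarter, i;

    assert(a != NULL && n >= 4);

    quarter = n / 4;
    r.min = r.max = a[0];

    for (i = 0; i < n; i++) {
        if (a[i] < r.min) r.min = a[i];
        if (a[i] > r.max) r.max = a[i];

        /* Only compare neighbours that sit in the same quarter. */
        if (i + 1 < n && quarter_of(i, quarter) == quarter_of(i + 1, quarter)
            && a[i] > a[i + 1]) {
            r.disorder[quarter_of(i, quarter)]++;
        }
    }
    return r;
}
```

In this sketch a count of 0 marks a fully in-order segment and a count equal to the segment length minus one marks a strictly descending segment. Counts in between would be compared against some threshold to pick between quadsort and wolfsort.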
Setting the bucket size
-----------------------
For optimal performance wolfsort needs at least 8 buckets and should end up with between 1 and 16 elements
per bucket, so the bucket size is set to hold 8 elements on average. However, the buckets should remain
in the L1 cache, so the maximum number of buckets is set at 65536.
This sets the optimal range for wolfsort between 8 * 8 (64) and 8 * 65536 (524,288) elements. Beyond
the optimal range performance will degrade steadily. Once the average bucket size reaches the threshold
of 18 elements (1,179,648 total elements) the sort becomes less optimal than quicksort, though it retains
a computational advantage for a little while longer. However, by recursing once, wolfsort increases the
optimal range to 1 trillion elements.
By computing the minimum and maximum value in the data distribution, the number of buckets is optimized
further to target the sweet spot.
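The arithmetic above can be sketched in a few lines. The constants come from the text, but the function names `bucket_count` and `bucket_index`, and the linear scaling used to map a value to a bucket, are assumptions for illustration. Wolfsort's own mapping may be computed differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MIN_BUCKETS 8u
#define MAX_BUCKETS 65536u
#define TARGET_FILL 8u /* average elements per bucket, per the text above */

/* Pick a bucket count so that buckets hold about TARGET_FILL elements each. */
static size_t bucket_count(size_t n)
{
    size_t buckets = n / TARGET_FILL;

    if (buckets < MIN_BUCKETS) buckets = MIN_BUCKETS;
    if (buckets > MAX_BUCKETS) buckets = MAX_BUCKETS;
    return buckets;
}

/* Map a value in [min, max] to a bucket in [0, buckets) by linear scaling.
   Using the observed min and max keeps a narrow value range from piling
   into a handful of buckets. */
static size_t bucket_index(uint32_t value, uint32_t min, uint32_t max,
                           size_t buckets)
{
    uint64_t range = (uint64_t)max - min + 1;

    assert(min <= value && value <= max);
    return (size_t)(((uint64_t)(value - min) * buckets) / range);
}
```

With these constants, 64 elements give 8 buckets and 524,288 elements give 65536 buckets. At 1,179,648 elements the bucket count is still capped at 65536, so the average fill is 18.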
Dropsort
--------
Dropsort was first proposed as an alternative sorting algorithm by David Morgan in 2006. It makes one pass
and is lossy. The algorithm was reinvented in 2018 as Stalin sort. The concept of dropping hash entries in
a non-lossy manner was independently developed by Marshall Lochbaum in 2018 and is utilized in his 2022
release of rhsort (Robin Hood Sort).
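For reference, the classic lossy variant fits in a few lines. This is a generic sketch of dropsort as usually described, with a hypothetical function name, and is not code from this repository.

```c
#include <assert.h>
#include <stddef.h>

/* Lossy dropsort: keep each element that is not smaller than the last
   kept element and drop the rest. Works in place and returns how many
   elements were kept at the front of the array. */
static size_t dropsort(int *a, size_t n)
{
    size_t kept = 0, i;

    assert(a != NULL || n == 0);

    for (i = 0; i < n; i++) {
        if (kept == 0 || a[i] >= a[kept - 1]) {
            a[kept++] = a[i];
        }
    }
    return kept;
}
```

The kept prefix is always sorted, but any element that arrives out of order is simply lost, which is why a non-lossy variant has to put the dropped elements somewhere.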
Wolfsort allocates 4n memory to allow some deviancy in the data distribution and minimize bucket overflow.
In the case an element is too deviant and overflows the bucket, it is copied in-place to the input
array. In near-optimal cases this results in a minimal drip, in the worst case it will result in a downpour
of elements being copied to the input array.
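The drip can be sketched as below, assuming fixed-capacity buckets in a separate store. The capacity `CAP`, the function name `drip_partition`, and the memory layout are hypothetical. The README only states that about 4n memory is allocated and that overflowing elements are copied back to the input array.

```c
#include <assert.h>
#include <stddef.h>

#define CAP 4 /* hypothetical fixed bucket capacity */

/* Distribute a[0..n) into `buckets` slots of CAP elements each, stored in
   `store` (buckets * CAP ints). `fill` receives the per-bucket counts.
   Every value must lie in [min, max]. Elements that find their bucket
   full are written back to the front of `a`. Returns the number of such
   deviant elements. */
static size_t drip_partition(int *a, size_t n, int min, int max,
                             int *store, size_t *fill, size_t buckets)
{
    long long range = (long long)max - min + 1;
    size_t deviants = 0, i;

    assert(a && store && fill && buckets > 0 && min <= max);

    for (i = 0; i < buckets; i++) fill[i] = 0;

    for (i = 0; i < n; i++) {
        int v = a[i];
        size_t b = (size_t)(((long long)v - min) * (long long)buckets / range);

        if (fill[b] < CAP) {
            store[b * CAP + fill[b]++] = v; /* compliant element */
        } else {
            a[deviants++] = v; /* drip: deviants <= i, so nothing unread is overwritten */
        }
    }
    return deviants;
}
```

After the pass the deviant elements occupy `a[0..deviants)` in their original relative order, ready to be handed to a comparison sort.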
While a centrally planned partitioning system has its weaknesses, the worst case is mostly alleviated by using
fluxsort on the deviant elements once partitioning finishes. Fluxsort is adaptive and is generally
strong against distributions where wolfsort is weak.
The overall performance gain from incorporating dropsort into wolfsort is approximately 20%, but can reach
an order of magnitude when the fallback is synergetic with fluxsort. Deviant distributions can deceive
wolfsort for a time, but not a very long time.
Small number sorting
--------------------
Since wolfsort uses auxiliary memory, each partition is stable once partitioning completes. The next
step is to sort the content of each bucket using fluxsort. If the number of elements in a bucket is
below 32, fluxsort defaults to quadsort, which is highly optimized for sorting small arrays using a
combination of branchless parity merges and twice-unguarded insertion.
Once each bucket is sorted, all that remains is merging the two distributions of compliant and deviant
elements, and wolfsort is finished.
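The final step amounts to a two-way merge of sorted runs. The sketch below is a plain textbook merge with a hypothetical name, not the routine wolfsort uses. Which of the two distributions has to win ties to keep the sort stable is an assumption here.

```c
#include <assert.h>
#include <stddef.h>

/* Merge sorted runs x[0..nx) and y[0..ny) into out[0..nx+ny).
   Ties are taken from x first, so the merge is stable as long as the
   elements of x are meant to precede equal elements of y. */
static void merge_runs(const int *x, size_t nx, const int *y, size_t ny,
                       int *out)
{
    size_t i = 0, j = 0, k = 0;

    assert(out != NULL);

    while (i < nx && j < ny) {
        out[k++] = (y[j] < x[i]) ? y[j++] : x[i++];
    }
    while (i < nx) out[k++] = x[i++];
    while (j < ny) out[k++] = y[j++];
}
```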
Memory overhead
---------------
Wolfsort requires 4n memory for the partitioning process and n / 4 memory (up to a maximum of 65536)
for the buckets.
If not enough memory is available wolfsort falls back on fluxsort, which requires exactly 1n swap memory,
and if that's not sufficient fluxsort falls back on quadsort which can sort in-place. It is an
option to fall back on blitsort instead of quadsort, but since this would be an atypical case
and would increase dependencies, I didn't implement it.
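The fallback chain can be sketched as a sequence of allocation attempts. Everything named here (`strategy`, `alloc_fn`, `pick_strategy`, and the `limited_alloc` test allocator) is hypothetical and only models the 4n and 1n buffers described above. The bucket array is left out for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef enum { USE_WOLFSORT, USE_FLUXSORT, USE_QUADSORT } strategy;

/* Allocator signature, so the fallback can be exercised with a test double. */
typedef void *(*alloc_fn)(size_t bytes);

/* Demo allocator: refuses any request larger than alloc_limit bytes. */
static size_t alloc_limit = (size_t)-1;
static void *limited_alloc(size_t bytes)
{
    return bytes <= alloc_limit ? malloc(bytes) : NULL;
}

/* Try the 4n buffer first, then the 1n buffer, then do without.
   On success *buf holds the allocation (the caller frees it);
   otherwise *buf is NULL. */
static strategy pick_strategy(size_t n, size_t size, alloc_fn alloc, void **buf)
{
    assert(alloc != NULL && buf != NULL);

    *buf = NULL;
    if (size != 0 && n <= SIZE_MAX / size / 4) {
        *buf = alloc(4 * n * size);
        if (*buf) return USE_WOLFSORT;
    }
    if (size != 0 && n <= SIZE_MAX / size) {
        *buf = alloc(n * size);
        if (*buf) return USE_FLUXSORT;
    }
    return USE_QUADSORT; /* in-place fallback, no auxiliary buffer */
}
```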
64 bit integers
---------------
With the advent of fluxsort and crumsort the dominance of radix sorts has been pushed out of 64 bit territory. Increased memory-level-parallelism in future hardware, or algorithmic optimizations, might make radix sorts competitive again for 64 bit types. Wolfsort has a commented-out default to fluxsort.
128 bit floats
--------------
Wolfsort defaults to fluxsort for 128 bit floats. Keep in mind that in the real world you'll typically be sorting tables instead of arrays, so the benchmark isn't indicative of real world performance, as the sort will likely be copying 64 bit pointers instead of 128 bit floats.
God Mode
--------
Wolfsort supports a cheat mode where the sort becomes unstable. This trick was taken from rhsort. Since wolfsort aspires to have some utility as a stable sort, this method is disabled by default, including in the benchmark.
In the benchmark rhsort does use this optimization, but it's only relevant for the random % 100 distribution. For 32 bit random integers rhsort easily beats wolfsort without an unfair advantage.
LLVM
----
When compiling with Clang, quadsort and fluxsort will take advantage of branchless ternary operations, which gives a 15-30% performance gain. While not an algorithmic improvement, it's relevant to keep in mind, particularly when it comes to LLVM compiled Rust sorts with similar optimizations.
Interface
---------
Wolfsort uses the same interface as qsort, which is described in [man qsort](https://man7.org/linux/man-pages/man3/qsort.3p.html).
Wolfsort also comes with the `wolfsort_prim(void *array, size_t nmemb, size_t size)` function to perform primitive comparisons on arrays of 32 and 64 bit integers. Nmemb is the number of elements, while size should be either `sizeof(int)` or `sizeof(long long)` for signed integers, and `sizeof(int) + 1` or `sizeof(long long) + 1` for unsigned integers. Support for the char and short types can be easily added in wolfsort.h.
Wolfsort can only sort arrays of primitive integers by default. Wolfsort should be able to sort tables with some minor changes, but it'll require a different interface than qsort() provides.
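A call through the qsort-style interface might look like the sketch below. To keep it buildable without the repository headers it calls the standard `qsort` through a `SORT` macro. Defining `SORT` as `wolfsort` after including wolfsort.h is an assumption based on the statement that both share an interface. The names `SORT`, `cmp_int` and `sort_ints` are made up for this example.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in so the sketch builds on its own. With wolfsort.h included,
   SORT could be defined as wolfsort instead. */
#ifndef SORT
#define SORT qsort
#endif

/* qsort-style comparator for int that avoids the overflow of `x - y`. */
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;

    return (x > y) - (x < y);
}

static void sort_ints(int *array, size_t nmemb)
{
    assert(array != NULL || nmemb == 0);

    if (nmemb < 2) return;

    SORT(array, nmemb, sizeof(int), cmp_int);

    /* Primitive variant as described above (not called in this sketch):
       wolfsort_prim(array, nmemb, sizeof(int));      for signed int
       wolfsort_prim(array, nmemb, sizeof(int) + 1);  for unsigned int */
}
```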
Proof of concept
----------------
Wolfsort is primarily a proof of concept for a hybrid bucket / comparison sort. It only supports non-negative integers.
I'll briefly mention other sorting algorithms listed in the benchmark code / graphs. They can all be considered the fastest algorithms currently available in their particular class.
Blitsort
--------
[Blitsort](https://github.com/scandum/blitsort) is a hybrid in-place stable adaptive rotate quick / merge sort.
Crumsort
--------
[Crumsort](https://github.com/scandum/crumsort) is a hybrid in-place unstable adaptive quick / rotate merge sort.
Quadsort
--------
[Quadsort](https://github.com/scandum/quadsort) is an adaptive mergesort. It supports rotations as a fall-back to sort in-place. It has very good performance when it comes to sorting tables and generally outperforms timsort.
Gridsort
--------
[Gridsort](https://github.com/scandum/gridsort) is a stable comparison sort which stores data in a 2 dimensional self-balancing grid. It has some interesting properties and was the fastest comparison sort for random data for a brief period of time.
Fluxsort
--------
[Fluxsort](https://github.com/scandum/fluxsort) is a hybrid stable branchless out-of-place quick / merge sort.
Piposort
--------
[Piposort](https://github.com/scandum/piposort) is a simplified branchless quadsort with a much smaller code size and complexity while still being very fast. Piposort might be of use to people who want to port quadsort. This is a lot easier when you start out small.
rhsort
------
[rhsort](https://github.com/mlochbaum/rhsort) is a hybrid stable out-of-place counting / radix / drop / insertion sort. It has exceptional performance on random and generic data for medium array sizes.
Ska sort
--------
[Ska sort](https://github.com/skarupke/ska_sort) is an advanced radix sort that can sort strings and floats as well. It offers both an in-place and out-of-place version, but since the in-place unstable version is not very competitive with wolfsort, I only benchmark the stable and faster out-of-place ska_sort_copy variant.
Big O
-----
```
                 ┌───────────────────────┐┌────────────────────┐
                 │comparisons            ││swap memory         │
┌───────────────┐├───────┬───────┬───────┤├──────┬──────┬──────┤┌──────┐┌─────────┐┌─────────┐┌─────────┐
│name           ││min    │avg    │max    ││min   │avg   │max   ││stable││partition││adaptive ││compares │
├───────────────┤├───────┼───────┼───────┤├──────┼──────┼──────┤├──────┤├─────────┤├─────────┤├─────────┤
│blitsort       ││n      │n log n│n log n││1     │1     │1     ││yes   ││yes      ││yes      ││yes      │
├───────────────┤├───────┼───────┼───────┤├──────┼──────┼──────┤├──────┤├─────────┤├─────────┤├─────────┤
│crumsort       ││n      │n log n│n log n││1     │1     │1     ││no    ││yes      ││yes      ││yes      │
├───────────────┤├───────┼───────┼───────┤├──────┼──────┼──────┤├──────┤├─────────┤├─────────┤├─────────┤
│fluxsort       ││n      │n log n│n log n││n     │n     │n     ││yes   ││yes      ││yes      ││yes      │
├───────────────┤├───────┼───────┼───────┤├──────┼──────┼──────┤├──────┤├─────────┤├─────────┤├─────────┤
│gridsort       ││n      │n log n│n log n││n     │n     │n     ││yes   ││yes      ││yes      ││yes      │
├───────────────┤├───────┼───────┼───────┤├──────┼──────┼──────┤├──────┤├─────────┤├─────────┤├─────────┤
│quadsort       ││n      │n log n│n log n││1     │n     │n     ││yes   ││no       ││yes      ││yes      │
├───────────────┤├───────┼───────┼───────┤├──────┼──────┼──────┤├──────┤├─────────┤├─────────┤├─────────┤
│wolfsort       ││n      │n log n│n log n││n     │n     │n     ││yes   ││yes      ││yes      ││hybrid   │
├───────────────┤├───────┼───────┼───────┤├──────┼──────┼──────┤├──────┤├─────────┤├─────────┤├─────────┤
│rhsort         ││n      │n log n│n log n││n     │n     │n     ││yes   ││yes      ││semi     ││hybrid   │
├───────────────┤├───────┼───────┼───────┤├──────┼──────┼──────┤├──────┤├─────────┤├─────────┤├─────────┤
│skasort_copy   ││n k    │n k    │n k    ││n     │n     │n     ││yes   ││yes      ││no       ││no       │
└───────────────┘└───────┴───────┴───────┘└──────┴──────┴──────┘└──────┘└─────────┘└─────────┘└─────────┘
```
Benchmark for Wolfsort v1.2.1.3
-------------------------------
rhsort vs wolfsort vs ska_sort_copy on 100K elements
----------------------------------------------------
The following benchmark was on WSL gcc version 7.4.0 (Ubuntu 7.4.0-1ubuntu1~18.04.1) on 100,000 32 bit integers.
The source code was compiled using `g++ -O3 -fpermissive bench.c`. All comparisons are inlined through the cmp macro.
A table with the best and average time in seconds can be uncollapsed below the bar graph.

<details><summary><b>data table</b></summary>

| Name | Items | Type | Best | Average | Loops | Samples | Distribution |
| --------- | -------- | ---- | -------- | -------- | --------- | ------- | ---------------- |
| wolfsort | 100000 | 64 | 0.003006 | 0.003063 | 0 | 100 | random order |
| skasort | 100000 | 64 | 0.001818 | 0.001842 | 0 | 100 | random order |

| Name | Items | Type | Best | Average | Loops | Samples | Distribution |
| --------- | -------- | ---- | -------- | -------- | --------- | ------- | ---------------- |
| rhsort | 100000 | 32 | 0.000706 | 0.000729 | 0 | 100 | random order |
| wolfsort | 100000 | 32 | 0.001000 | 0.001026 | 0 | 100 | random order |
| skasort | 100000 | 32 | 0.000626 | 0.000640 | 0 | 100 | random order |
| | | | | | | | |
| rhsort | 100000 | 32 | 0.000115 | 0.000118 | 0 | 100 | random % 100 |
| wolfsort | 100000 | 32 | 0.000376 | 0.000382 | 0 | 100 | random % 100 |
| skasort | 100000 | 32 | 0.000780 | 0.000793 | 0 | 100 | random % 100 |
| | | | | | | | |
| rhsort | 100000 | 32 | 0.000302 | 0.000317 | 0 | 100 | ascending order |
| wolfsort | 100000 | 32 | 0.000086 | 0.000088 | 0 | 100 | ascending order |
| skasort | 100000 | 32 | 0.000709 | 0.000720 | 0 | 100 | ascending order |
| | | | | | | | |
| rhsort | 100000 | 32 | 0.000615 | 0.000633 | 0 | 100 | ascending saw |
| wolfsort | 100000 | 32 | 0.000379 | 0.000407 | 0 | 100 | ascending saw |
| skasort | 100000 | 32 | 0.000624 | 0.000637 | 0 | 100 | ascending saw |
| | | | | | | | |
| rhsort | 100000 | 32 | 0.000591 | 0.000615 | 0 | 100 | pipe organ |
| wolfsort | 100000 | 32 | 0.000248 | 0.000258 | 0 | 100 | pipe organ |
| skasort | 100000 | 32 | 0.000624 | 0.000639 | 0 | 100 | pipe organ |
| | | | | | | | |
| rhsort | 100000 | 32 | 0.000400 | 0.000420 | 0 | 100 | descending order |
| wolfsort | 100000 | 32 | 0.000097 | 0.000101 | 0 | 100 | descending order |
| skasort | 100000 | 32 | 0.000684 | 0.000693 | 0 | 100 | descending order |
| | | | | | | | |
| rhsort | 100000 | 32 | 0.000612 | 0.000629 | 0 | 100 | descending saw |
| wolfsort | 100000 | 32 | 0.000389 | 0.000393 | 0 | 100 | descending saw |
| skasort | 100000 | 32 | 0.000627 | 0.000639 | 0 | 100 | descending saw |
| | | | | | | | |
| rhsort | 100000 | 32 | 0.000633 | 0.000664 | 0 | 100 | random tail |
| wolfsort | 100000 | 32 | 0.000467 | 0.000473 | 0 | 100 | random tail |
| skasort | 100000 | 32 | 0.000622 | 0.000636 | 0 | 100 | random tail |
| | | | | | | | |
| rhsort | 100000 | 32 | 0.000671 | 0.000685 | 0 | 100 | random half |
| wolfsort | 100000 | 32 | 0.000689 | 0.000706 | 0 | 100 | random half |
| skasort | 100000 | 32 | 0.000628 | 0.000641 | 0 | 100 | random half |
| | | | | | | | |
| rhsort | 100000 | 32 | 0.002019 | 0.002052 | 0 | 100 | ascending tiles |
| wolfsort | 100000 | 32 | 0.000683 | 0.000691 | 0 | 100 | ascending tiles |
| skasort | 100000 | 32 | 0.001096 | 0.001113 | 0 | 100 | ascending tiles |
| | | | | | | | |
| rhsort | 100000 | 32 | 0.000837 | 0.000871 | 0 | 100 | bit reversal |
| wolfsort | 100000 | 32 | 0.000887 | 0.000928 | 0 | 100 | bit reversal |
| skasort | 100000 | 32 | 0.000775 | 0.000782 | 0 | 100 | bit reversal |
| | | | | | | | |
| rhsort | 100000 | 32 | 0.000118 | 0.000123 | 0 | 100 | random % 4 |
| wolfsort | 100000 | 32 | 0.000368 | 0.000371 | 0 | 100 | random % 4 |
| skasort | 100000 | 32 | 0.000785 | 0.000809 | 0 | 100 | random % 4 |
| | | | | | | | |
| rhsort | 100000 | 32 | 0.001278 | 0.001465 | 0 | 100 | semi random |
| wolfsort | 100000 | 32 | 0.000792 | 0.000811 | 0 | 100 | semi random |
| skasort | 100000 | 32 | 0.000805 | 0.000821 | 0 | 100 | semi random |
| | | | | | | | |
| rhsort | 100000 | 32 | 0.000198 | 0.000202 | 0 | 100 | random signal |
| wolfsort | 100000 | 32 | 0.000815 | 0.000829 | 0 | 100 | random signal |
| skasort | 100000 | 32 | 0.001099 | 0.001118 | 0 | 100 | random signal |
</details>
The following benchmark was on WSL 2 gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04).
The source code was compiled using `g++ -O3 -w -fpermissive bench.c`. It measures the performance on random data with array sizes
ranging from 10 to 10,000,000. It's generated by running the benchmark using `10000000 0 0` as the arguments. The benchmark is weighted, meaning the number of repetitions
halves each time the number of items doubles. A table with the best and average time in seconds can be uncollapsed below the bar graph.

<details><summary><b>data table</b></summary>

| Name | Items | Type | Best | Average | Compares | Samples | Distribution |
| --------- | -------- | ---- | -------- | -------- | --------- | ------- | ---------------- |
| rhsort | 10 | 32 | 0.135095 | 0.137011 | 0.0 | 10 | random 10 |
| wolfsort | 10 | 32 | 0.052087 | 0.052986 | 0.0 | 10 | random 10 |
| skasort | 10 | 32 | 0.099853 | 0.100198 | 0.0 | 10 | random 10 |
| | | | | | | | |
| rhsort | 100 | 32 | 0.069252 | 0.070421 | 0.0 | 10 | random 100 |
| wolfsort | 100 | 32 | 0.132208 | 0.132824 | 0.0 | 10 | random 100 |
| skasort | 100 | 32 | 0.232007 | 0.232507 | 0.0 | 10 | random 100 |
| | | | | | | | |
| rhsort | 1000 | 32 | 0.055916 | 0.056130 | 0.0 | 10 | random 1000 |
| wolfsort | 1000 | 32 | 0.101611 | 0.101913 | 0.0 | 10 | random 1000 |
| skasort | 1000 | 32 | 0.054757 | 0.055050 | 0.0 | 10 | random 1000 |
| | | | | | | | |
| rhsort | 10000 | 32 | 0.057062 | 0.057359 | 0.0 | 10 | random 10000 |
| wolfsort | 10000 | 32 | 0.118598 | 0.119373 | 0.0 | 10 | random 10000 |
| skasort | 10000 | 32 | 0.059786 | 0.060189 | 0.0 | 10 | random 10000 |
| | | | | | | | |
| rhsort | 100000 | 32 | 0.071273 | 0.073310 | 0.0 | 10 | random 100000 |
| wolfsort | 100000 | 32 | 0.102639 | 0.103917 | 0.0 | 10 | random 100000 |
| skasort | 100000 | 32 | 0.064120 | 0.064615 | 0.0 | 10 | random 100000 |
| | | | | | | | |
| rhsort | 1000000 | 32 | 0.181059 | 0.187563 | 0.0 | 10 | random 1000000 |
| wolfsort | 1000000 | 32 | 0.146630 | 0.147598 | 0.0 | 10 | random 1000000 |
| skasort | 1000000 | 32 | 0.070250 | 0.071571 | 0.0 | 10 | random 1000000 |
| | | | | | | | |
| rhsort | 10000000 | 32 | 0.412107 | 0.425066 | 0 | 10 | random 10000000 |
| wolfsort | 10000000 | 32 | 0.193120 | 0.200947 | 0 | 10 | random 10000000 |
| skasort | 10000000 | 32 | 0.115721 | 0.116621 | 0 | 10 | random 10000000 |
</details>
Benchmark for Wolfsort v1.2.1.3
-------------------------------
fluxsort vs gridsort vs quadsort vs wolfsort on 100K elements
-------------------------------------------------------------
The following benchmark was on WSL gcc version 7.4.0 (Ubuntu 7.4.0-1ubuntu1~18.04.1).
The source code was compiled using `g++ -O3 -fpermissive bench.c`. All comparisons are inlined through the cmp macro.
A table with the best and average time in seconds can be uncollapsed below the bar graph.

<details><summary><b>data table</b></summary>

| Name | Items | Type | Best | Average | Compares | Samples | Distribution |
| --------- | -------- | ---- | -------- | -------- | --------- | ------- | ---------------- |
| fluxsort | 100000 | 128 | 0.008328 | 0.008424 | 0 | 100 | random order |
| gridsort | 100000 | 128 | 0.007823 | 0.007932 | 0 | 100 | random order |
| quadsort | 100000 | 128 | 0.008260 | 0.008353 | 0 | 100 | random order |
| wolfsort | 100000 | 128 | 0.008330 | 0.008415 | 0 | 100 | random order |

| Name | Items | Type | Best | Average | Compares | Samples | Distribution |
| --------- | -------- | ---- | -------- | -------- | --------- | ------- | ---------------- |
| fluxsort | 100000 | 64 | 0.001971 | 0.001991 | 0 | 100 | random order |
| gridsort | 100000 | 64 | 0.002370 | 0.002398 | 0 | 100 | random order |
| quadsort | 100000 | 64 | 0.002230 | 0.002254 | 0 | 100 | random order |
| wolfsort | 100000 | 64 | 0.003023 | 0.003068 | 0 | 100 | random order |

| Name | Items | Type | Best | Average | Loops | Samples | Distribution |
| --------- | -------- | ---- | -------- | -------- | --------- | ------- | ---------------- |
| fluxsort | 100000 | 32 | 0.001868 | 0.001901 | 0 | 100 | random order |
| gridsort | 100000 | 32 | 0.002324 | 0.002357 | 0 | 100 | random order |
| quadsort | 100000 | 32 | 0.002149 | 0.002174 | 0 | 100 | random order |
| wolfsort | 100000 | 32 | 0.000988 | 0.001019 | 0 | 100 | random order |
| | | | | | | | |
| fluxsort | 100000 | 32 | 0.000733 | 0.000740 | 0 | 100 | random % 100 |
| gridsort | 100000 | 32 | 0.001921 | 0.001941 | 0 | 100 | random % 100 |
| quadsort | 100000 | 32 | 0.001627 | 0.001645 | 0 | 100 | random % 100 |
| wolfsort | 100000 | 32 | 0.000374 | 0.000378 | 0 | 100 | random % 100 |
| | | | | | | | |
| fluxsort | 100000 | 32 | 0.000043 | 0.000044 | 0 | 100 | ascending order |
| gridsort | 100000 | 32 | 0.000264 | 0.000271 | 0 | 100 | ascending order |
| quadsort | 100000 | 32 | 0.000052 | 0.000053 | 0 | 100 | ascending order |
| wolfsort | 100000 | 32 | 0.000087 | 0.000089 | 0 | 100 | ascending order |
| | | | | | | | |
| fluxsort | 100000 | 32 | 0.000305 | 0.000314 | 0 | 100 | ascending saw |
| gridsort | 100000 | 32 | 0.000621 | 0.000641 | 0 | 100 | ascending saw |
| quadsort | 100000 | 32 | 0.000411 | 0.000417 | 0 | 100 | ascending saw |
| wolfsort | 100000 | 32 | 0.000379 | 0.000384 | 0 | 100 | ascending saw |
| | | | | | | | |
| fluxsort | 100000 | 32 | 0.000193 | 0.000203 | 0 | 100 | pipe organ |
| gridsort | 100000 | 32 | 0.000446 | 0.000486 | 0 | 100 | pipe organ |
| quadsort | 100000 | 32 | 0.000252 | 0.000260 | 0 | 100 | pipe organ |
| wolfsort | 100000 | 32 | 0.000248 | 0.000259 | 0 | 100 | pipe organ |
| | | | | | | | |
| fluxsort | 100000 | 32 | 0.000054 | 0.000055 | 0 | 100 | descending order |
| gridsort | 100000 | 32 | 0.000284 | 0.000295 | 0 | 100 | descending order |
| quadsort | 100000 | 32 | 0.000068 | 0.000070 | 0 | 100 | descending order |
| wolfsort | 100000 | 32 | 0.000097 | 0.000100 | 0 | 100 | descending order |
| | | | | | | | |
| fluxsort | 100000 | 32 | 0.000315 | 0.000325 | 0 | 100 | descending saw |
| gridsort | 100000 | 32 | 0.000652 | 0.000667 | 0 | 100 | descending saw |
| quadsort | 100000 | 32 | 0.000440 | 0.000446 | 0 | 100 | descending saw |
| wolfsort | 100000 | 32 | 0.000389 | 0.000393 | 0 | 100 | descending saw |
| | | | | | | | |
| fluxsort | 100000 | 32 | 0.000607 | 0.000619 | 0 | 100 | random tail |
| gridsort | 100000 | 32 | 0.000847 | 0.000860 | 0 | 100 | random tail |
| quadsort | 100000 | 32 | 0.000685 | 0.000694 | 0 | 100 | random tail |
| wolfsort | 100000 | 32 | 0.000464 | 0.000471 | 0 | 100 | random tail |
| | | | | | | | |
| fluxsort | 100000 | 32 | 0.001074 | 0.001081 | 0 | 100 | random half |
| gridsort | 100000 | 32 | 0.001332 | 0.001355 | 0 | 100 | random half |
| quadsort | 100000 | 32 | 0.001230 | 0.001243 | 0 | 100 | random half |
| wolfsort | 100000 | 32 | 0.000686 | 0.000696 | 0 | 100 | random half |
| | | | | | | | |
| fluxsort | 100000 | 32 | 0.000317 | 0.000324 | 0 | 100 | ascending tiles |
| gridsort | 100000 | 32 | 0.000665 | 0.000693 | 0 | 100 | ascending tiles |
| quadsort | 100000 | 32 | 0.000789 | 0.000802 | 0 | 100 | ascending tiles |
| wolfsort | 100000 | 32 | 0.000686 | 0.000693 | 0 | 100 | ascending tiles |
| | | | | | | | |
| fluxsort | 100000 | 32 | 0.001714 | 0.001751 | 0 | 100 | bit reversal |
| gridsort | 100000 | 32 | 0.002045 | 0.002060 | 0 | 100 | bit reversal |
| quadsort | 100000 | 32 | 0.002083 | 0.002100 | 0 | 100 | bit reversal |
| wolfsort | 100000 | 32 | 0.000888 | 0.000912 | 0 | 100 | bit reversal |
| | | | | | | | |
| fluxsort | 100000 | 32 | 0.000215 | 0.000223 | 0 | 100 | random % 4 |
| gridsort | 100000 | 32 | 0.001283 | 0.001305 | 0 | 100 | random % 4 |
| quadsort | 100000 | 32 | 0.001080 | 0.001090 | 0 | 100 | random % 4 |
| wolfsort | 100000 | 32 | 0.000369 | 0.000371 | 0 | 100 | random % 4 |
| | | | | | | | |
| fluxsort | 100000 | 32 | 0.001072 | 0.001098 | 0 | 100 | semi random |
| gridsort | 100000 | 32 | 0.001355 | 0.001377 | 0 | 100 | semi random |
| quadsort | 100000 | 32 | 0.001062 | 0.001074 | 0 | 100 | semi random |
| wolfsort | 100000 | 32 | 0.000789 | 0.000803 | 0 | 100 | semi random |
| | | | | | | | |
| fluxsort | 100000 | 32 | 0.001079 | 0.001099 | 0 | 100 | random signal |
| gridsort | 100000 | 32 | 0.001296 | 0.001306 | 0 | 100 | random signal |
| quadsort | 100000 | 32 | 0.001014 | 0.001027 | 0 | 100 | random signal |
| wolfsort | 100000 | 32 | 0.000816 | 0.000828 | 0 | 100 | random signal |
</details>
fluxsort vs gridsort vs quadsort vs wolfsort on 10M elements
------------------------------------------------------------

<details><summary><b>data table</b></summary>

| Name | Items | Type | Best | Average | Compares | Samples | Distribution |
| --------- | -------- | ---- | -------- | -------- | --------- | ------- | ---------------- |
| fluxsort | 10000000 | 128 | 1.242395 | 1.264809 | 0 | 10 | random order |
| gridsort | 10000000 | 128 | 1.048748 | 1.110490 | 0 | 10 | random order |
| quadsort | 10000000 | 128 | 1.407639 | 1.418088 | 0 | 10 | random order |
| wolfsort | 10000000 | 128 | 1.239099 | 1.241608 | 0 | 10 | random order |

| Name | Items | Type | Best | Average | Compares | Samples | Distribution |
| --------- | -------- | ---- | -------- | -------- | --------- | ------- | ---------------- |
| fluxsort | 10000000 | 64 | 0.317327 | 0.318203 | 0 | 10 | random order |
| gridsort | 10000000 | 64 | 0.332430 | 0.334392 | 0 | 10 | random order |
| quadsort | 10000000 | 64 | 0.438257 | 0.439139 | 0 | 10 | random order |
| wolfsort | 10000000 | 64 | 0.481977 | 0.484055 | 0 | 10 | random order |

| Name | Items | Type | Best | Average | Loops | Samples | Distribution |
| --------- | -------- | ---- | -------- | -------- | --------- | ------- | ---------------- |
| fluxsort | 10000000 | 32 | 0.269351 | 0.271460 | 0 | 10 | random order |
| gridsort | 10000000 | 32 | 0.322099 | 0.323899 | 0 | 10 | random order |
| quadsort | 10000000 | 32 | 0.364457 | 0.365617 | 0 | 10 | random order |
| wolfsort | 10000000 | 32 | 0.189780 | 0.190911 | 0 | 10 | random order |
| | | | | | | | |
| fluxsort | 10000000 | 32 | 0.089973 | 0.090849 | 0 | 10 | random % 100 |
| gridsort | 10000000 | 32 | 0.172222 | 0.173147 | 0 | 10 | random % 100 |
| quadsort | 10000000 | 32 | 0.248361 | 0.250615 | 0 | 10 | random % 100 |
| wolfsort | 10000000 | 32 | 0.086473 | 0.087067 | 0 | 10 | random % 100 |
| | | | | | | | |
| fluxsort | 10000000 | 32 | 0.006437 | 0.006574 | 0 | 10 | ascending order |
| gridsort | 10000000 | 32 | 0.032321 | 0.032798 | 0 | 10 | ascending order |
| quadsort | 10000000 | 32 | 0.011736 | 0.012125 | 0 | 10 | ascending order |
| wolfsort | 10000000 | 32 | 0.010888 | 0.011015 | 0 | 10 | ascending order |
| | | | | | | | |
| fluxsort | 10000000 | 32 | 0.074940 | 0.075525 | 0 | 10 | ascending saw |
| gridsort | 10000000 | 32 | 0.067478 | 0.067893 | 0 | 10 | ascending saw |
| quadsort | 10000000 | 32 | 0.097133 | 0.098004 | 0 | 10 | ascending saw |
| wolfsort | 10000000 | 32 | 0.081797 | 0.082794 | 0 | 10 | ascending saw |
| | | | | | | | |
| fluxsort | 10000000 | 32 | 0.064577 | 0.065593 | 0 | 10 | pipe organ |
| gridsort | 10000000 | 32 | 0.048932 | 0.049336 | 0 | 10 | pipe organ |
| quadsort | 10000000 | 32 | 0.082533 | 0.083781 | 0 | 10 | pipe organ |
| wolfsort | 10000000 | 32 | 0.070334 | 0.071158 | 0 | 10 | pipe organ |
| | | | | | | | |
| fluxsort | 10000000 | 32 | 0.009807 | 0.010104 | 0 | 10 | descending order |
| gridsort | 10000000 | 32 | 0.034583 | 0.034814 | 0 | 10 | descending order |
| quadsort | 10000000 | 32 | 0.011396 | 0.011639 | 0 | 10 | descending order |
| wolfsort | 10000000 | 32 | 0.014198 | 0.014544 | 0 | 10 | descending order |
| | | | | | | | |
| fluxsort | 10000000 | 32 | 0.078279 | 0.079071 | 0 | 10 | descending saw |
| gridsort | 10000000 | 32 | 0.069702 | 0.070109 | 0 | 10 | descending saw |
| quadsort | 10000000 | 32 | 0.101826 | 0.102801 | 0 | 10 | descending saw |
| wolfsort | 10000000 | 32 | 0.085101 | 0.085973 | 0 | 10 | descending saw |
| | | | | | | | |
| fluxsort | 10000000 | 32 | 0.121948 | 0.122561 | 0 | 10 | random tail |
| gridsort | 10000000 | 32 | 0.109341 | 0.110117 | 0 | 10 | random tail |
| quadsort | 10000000 | 32 | 0.153324 | 0.153797 | 0 | 10 | random tail |
| wolfsort | 10000000 | 32 | 0.103558 | 0.104152 | 0 | 10 | random tail |
| | | | | | | | |
| fluxsort | 10000000 | 32 | 0.181347 | 0.183186 | 0 | 10 | random half |
| gridsort | 10000000 | 32 | 0.185691 | 0.186592 | 0 | 10 | random half |
| quadsort | 10000000 | 32 | 0.225265 | 0.225897 | 0 | 10 | random half |
| wolfsort | 10000000 | 32 | 0.159819 | 0.160569 | 0 | 10 | random half |
| | | | | | | | |
| fluxsort | 10000000 | 32 | 0.073673 | 0.074755 | 0 | 10 | ascending tiles |
| gridsort | 10000000 | 32 | 0.126309 | 0.126626 | 0 | 10 | ascending tiles |
| quadsort | 10000000 | 32 | 0.165332 | 0.167541 | 0 | 10 | ascending tiles |
| wolfsort | 10000000 | 32 | 0.093424 | 0.094040 | 0 | 10 | ascending tiles |
| | | | | | | | |
| fluxsort | 10000000 | 32 | 0.271679 | 0.272589 | 0 | 10 | bit reversal |
| gridsort | 10000000 | 32 | 0.296563 | 0.297984 | 0 | 10 | bit reversal |
| quadsort | 10000000 | 32 | 0.369105 | 0.370652 | 0 | 10 | bit reversal |
| wolfsort | 10000000 | 32 | 0.251209 | 0.252891 | 0 | 10 | bit reversal |
| | | | | | | | |
| fluxsort | 10000000 | 32 | 0.056011 | 0.056552 | 0 | 10 | random % 4 |
| gridsort | 10000000 | 32 | 0.191179 | 0.194017 | 0 | 10 | random % 4 |
| quadsort | 10000000 | 32 | 0.192466 | 0.193967 | 0 | 10 | random % 4 |
| wolfsort | 10000000 | 32 | 0.081668 | 0.082543 | 0 | 10 | random % 4 |
| | | | | | | | |
| fluxsort | 10000000 | 32 | 0.054231 | 0.054571 | 0 | 10 | semi random |
| gridsort | 10000000 | 32 | 0.146534 | 0.146907 | 0 | 10 | semi random |
| quadsort | 10000000 | 32 | 0.197462 | 0.200010 | 0 | 10 | semi random |
| wolfsort | 10000000 | 32 | 0.192603 | 0.194365 | 0 | 10 | semi random |
| | | | | | | | |
| fluxsort | 10000000 | 32 | 0.173080 | 0.176575 | 0 | 10 | random signal |
| gridsort | 10000000 | 32 | 0.137590 | 0.137932 | 0 | 10 | random signal |
| quadsort | 10000000 | 32 | 0.180939 | 0.181778 | 0 | 10 | random signal |
| wolfsort | 10000000 | 32 | 0.161181 | 0.161714 | 0 | 10 | random signal |
</details>

blitsort vs crumsort vs pdqsort vs wolfsort on 100K elements
-------------------------------------------------------------
The following benchmark was run on WSL with gcc version 7.4.0 (Ubuntu 7.4.0-1ubuntu1~18.04.1).
The source code was compiled using `g++ -O3 -fpermissive bench.c`. All comparisons are inlined through the cmp macro.
A table with the best and average time in seconds can be expanded below the bar graph.
Blitsort uses 512 elements of auxiliary memory, crumsort 512, pdqsort 64, and wolfsort 100000.

<details><summary><b>data table</b></summary>

| Name | Items | Type | Best | Average | Compares | Samples | Distribution |
| --------- | -------- | ---- | -------- | -------- | --------- | ------- | ---------------- |
| blitsort | 100000 | 128 | 0.010864 | 0.010994 | 0 | 100 | random order |
| crumsort | 100000 | 128 | 0.008143 | 0.008222 | 0 | 100 | random order |
| pdqsort | 100000 | 128 | 0.005954 | 0.006063 | 0 | 100 | random order |
| wolfsort | 100000 | 128 | 0.008308 | 0.008396 | 0 | 100 | random order |

| Name | Items | Type | Best | Average | Compares | Samples | Distribution |
| --------- | -------- | ---- | -------- | -------- | --------- | ------- | ---------------- |
| blitsort | 100000 | 64 | 0.002326 | 0.002354 | 0 | 100 | random order |
| crumsort | 100000 | 64 | 0.001835 | 0.001848 | 0 | 100 | random order |
| pdqsort | 100000 | 64 | 0.002752 | 0.002806 | 0 | 100 | random order |
| wolfsort | 100000 | 64 | 0.003014 | 0.003069 | 0 | 100 | random order |

| Name | Items | Type | Best | Average | Loops | Samples | Distribution |
| --------- | -------- | ---- | -------- | -------- | --------- | ------- | ---------------- |
| blitsort | 100000 | 32 | 0.002094 | 0.002117 | 0 | 100 | random order |
| crumsort | 100000 | 32 | 0.001764 | 0.001779 | 0 | 100 | random order |
| pdqsort | 100000 | 32 | 0.002747 | 0.002770 | 0 | 100 | random order |
| wolfsort | 100000 | 32 | 0.000983 | 0.001016 | 0 | 100 | random order |
| | | | | | | | |
| blitsort | 100000 | 32 | 0.000880 | 0.000891 | 0 | 100 | random % 100 |
| crumsort | 100000 | 32 | 0.000602 | 0.000641 | 0 | 100 | random % 100 |
| pdqsort | 100000 | 32 | 0.000795 | 0.000805 | 0 | 100 | random % 100 |
| wolfsort | 100000 | 32 | 0.000376 | 0.000381 | 0 | 100 | random % 100 |
| | | | | | | | |
| blitsort | 100000 | 32 | 0.000043 | 0.000045 | 0 | 100 | ascending order |
| crumsort | 100000 | 32 | 0.000043 | 0.000044 | 0 | 100 | ascending order |
| pdqsort | 100000 | 32 | 0.000084 | 0.000088 | 0 | 100 | ascending order |
| wolfsort | 100000 | 32 | 0.000086 | 0.000088 | 0 | 100 | ascending order |
| | | | | | | | |
| blitsort | 100000 | 32 | 0.000440 | 0.000450 | 0 | 100 | ascending saw |
| crumsort | 100000 | 32 | 0.000410 | 0.000419 | 0 | 100 | ascending saw |
| pdqsort | 100000 | 32 | 0.003222 | 0.003246 | 0 | 100 | ascending saw |
| wolfsort | 100000 | 32 | 0.000379 | 0.000382 | 0 | 100 | ascending saw |
| | | | | | | | |
| blitsort | 100000 | 32 | 0.000242 | 0.000251 | 0 | 100 | pipe organ |
| crumsort | 100000 | 32 | 0.000229 | 0.000243 | 0 | 100 | pipe organ |
| pdqsort | 100000 | 32 | 0.002842 | 0.002864 | 0 | 100 | pipe organ |
| wolfsort | 100000 | 32 | 0.000249 | 0.000257 | 0 | 100 | pipe organ |
| | | | | | | | |
| blitsort | 100000 | 32 | 0.000054 | 0.000055 | 0 | 100 | descending order |
| crumsort | 100000 | 32 | 0.000054 | 0.000055 | 0 | 100 | descending order |
| pdqsort | 100000 | 32 | 0.000190 | 0.000198 | 0 | 100 | descending order |
| wolfsort | 100000 | 32 | 0.000097 | 0.000100 | 0 | 100 | descending order |
| | | | | | | | |
| blitsort | 100000 | 32 | 0.000452 | 0.000466 | 0 | 100 | descending saw |
| crumsort | 100000 | 32 | 0.000421 | 0.000431 | 0 | 100 | descending saw |
| pdqsort | 100000 | 32 | 0.004200 | 0.004245 | 0 | 100 | descending saw |
| wolfsort | 100000 | 32 | 0.000383 | 0.000402 | 0 | 100 | descending saw |
| | | | | | | | |
| blitsort | 100000 | 32 | 0.000782 | 0.000829 | 0 | 100 | random tail |
| crumsort | 100000 | 32 | 0.000714 | 0.000755 | 0 | 100 | random tail |
| pdqsort | 100000 | 32 | 0.002638 | 0.002759 | 0 | 100 | random tail |
| wolfsort | 100000 | 32 | 0.000463 | 0.000483 | 0 | 100 | random tail |
| | | | | | | | |
| blitsort | 100000 | 32 | 0.001210 | 0.001275 | 0 | 100 | random half |
| crumsort | 100000 | 32 | 0.001063 | 0.001096 | 0 | 100 | random half |
| pdqsort | 100000 | 32 | 0.002738 | 0.002780 | 0 | 100 | random half |
| wolfsort | 100000 | 32 | 0.000685 | 0.000712 | 0 | 100 | random half |
| | | | | | | | |
| blitsort | 100000 | 32 | 0.001105 | 0.001278 | 0 | 100 | ascending tiles |
| crumsort | 100000 | 32 | 0.001393 | 0.001435 | 0 | 100 | ascending tiles |
| pdqsort | 100000 | 32 | 0.002367 | 0.002398 | 0 | 100 | ascending tiles |
| wolfsort | 100000 | 32 | 0.000682 | 0.000689 | 0 | 100 | ascending tiles |
| | | | | | | | |
| blitsort | 100000 | 32 | 0.001956 | 0.001988 | 0 | 100 | bit reversal |
| crumsort | 100000 | 32 | 0.001762 | 0.001794 | 0 | 100 | bit reversal |
| pdqsort | 100000 | 32 | 0.002731 | 0.002758 | 0 | 100 | bit reversal |
| wolfsort | 100000 | 32 | 0.000890 | 0.000921 | 0 | 100 | bit reversal |
| | | | | | | | |
| blitsort | 100000 | 32 | 0.000328 | 0.000341 | 0 | 100 | random % 4 |
| crumsort | 100000 | 32 | 0.000206 | 0.000216 | 0 | 100 | random % 4 |
| pdqsort | 100000 | 32 | 0.000382 | 0.000391 | 0 | 100 | random % 4 |
| wolfsort | 100000 | 32 | 0.000367 | 0.000378 | 0 | 100 | random % 4 |
| | | | | | | | |
| blitsort | 100000 | 32 | 0.001209 | 0.001244 | 0 | 100 | semi random |
| crumsort | 100000 | 32 | 0.000309 | 0.000319 | 0 | 100 | semi random |
| pdqsort | 100000 | 32 | 0.000479 | 0.000500 | 0 | 100 | semi random |
| wolfsort | 100000 | 32 | 0.000791 | 0.000828 | 0 | 100 | semi random |
| | | | | | | | |
| blitsort | 100000 | 32 | 0.001893 | 0.001926 | 0 | 100 | random signal |
| crumsort | 100000 | 32 | 0.001714 | 0.001742 | 0 | 100 | random signal |
| pdqsort | 100000 | 32 | 0.002950 | 0.002976 | 0 | 100 | random signal |
| wolfsort | 100000 | 32 | 0.000814 | 0.000834 | 0 | 100 | random signal |
</details>

blitsort vs crumsort vs pdqsort vs wolfsort on 10M elements
-----------------------------------------------------------
Blitsort uses 512 elements of auxiliary memory, crumsort 512, pdqsort 64, and wolfsort 10000000.

<details><summary><b>data table</b></summary>

| Name | Items | Type | Best | Average | Compares | Samples | Distribution |
| --------- | -------- | ---- | -------- | -------- | --------- | ------- | ---------------- |
| blitsort | 10000000 | 128 | 2.172622 | 2.191956 | 0 | 10 | random order |
| crumsort | 10000000 | 128 | 1.134328 | 1.135821 | 0 | 10 | random order |
| pdqsort | 10000000 | 128 | 0.805620 | 0.808041 | 0 | 10 | random order |
| wolfsort | 10000000 | 128 | 1.237174 | 1.238863 | 0 | 10 | random order |

| Name | Items | Type | Best | Average | Compares | Samples | Distribution |
| --------- | -------- | ---- | -------- | -------- | --------- | ------- | ---------------- |
| blitsort | 10000000 | 64 | 0.434356 | 0.443134 | 0 | 10 | random order |
| crumsort | 10000000 | 64 | 0.250065 | 0.251453 | 0 | 10 | random order |
| pdqsort | 10000000 | 64 | 0.359586 | 0.360388 | 0 | 10 | random order |
| wolfsort | 10000000 | 64 | 0.480904 | 0.482835 | 0 | 10 | random order |

| Name | Items | Type | Best | Average | Loops | Samples | Distribution |
| --------- | -------- | ---- | -------- | -------- | --------- | ------- | ---------------- |
| blitsort | 10000000 | 32 | 0.332071 | 0.339524 | 0 | 10 | random order |
| crumsort | 10000000 | 32 | 0.231584 | 0.232056 | 0 | 10 | random order |
| pdqsort | 10000000 | 32 | 0.347793 | 0.348437 | 0 | 10 | random order |
| wolfsort | 10000000 | 32 | 0.189250 | 0.189762 | 0 | 10 | random order |
| | | | | | | | |
| blitsort | 10000000 | 32 | 0.126792 | 0.128439 | 0 | 10 | random % 100 |
| crumsort | 10000000 | 32 | 0.060683 | 0.061353 | 0 | 10 | random % 100 |
| pdqsort | 10000000 | 32 | 0.079284 | 0.079891 | 0 | 10 | random % 100 |
| wolfsort | 10000000 | 32 | 0.086577 | 0.087157 | 0 | 10 | random % 100 |
| | | | | | | | |
| blitsort | 10000000 | 32 | 0.006581 | 0.006784 | 0 | 10 | ascending order |
| crumsort | 10000000 | 32 | 0.006690 | 0.006801 | 0 | 10 | ascending order |
| pdqsort | 10000000 | 32 | 0.011712 | 0.011851 | 0 | 10 | ascending order |
| wolfsort | 10000000 | 32 | 0.010958 | 0.011520 | 0 | 10 | ascending order |
| | | | | | | | |
| blitsort | 10000000 | 32 | 0.070514 | 0.071260 | 0 | 10 | ascending saw |
| crumsort | 10000000 | 32 | 0.064829 | 0.066035 | 0 | 10 | ascending saw |
| pdqsort | 10000000 | 32 | 0.560995 | 0.561774 | 0 | 10 | ascending saw |
| wolfsort | 10000000 | 32 | 0.081644 | 0.082279 | 0 | 10 | ascending saw |
| | | | | | | | |
| blitsort | 10000000 | 32 | 0.041220 | 0.041924 | 0 | 10 | pipe organ |
| crumsort | 10000000 | 32 | 0.039335 | 0.040018 | 0 | 10 | pipe organ |
| pdqsort | 10000000 | 32 | 0.363633 | 0.364187 | 0 | 10 | pipe organ |
| wolfsort | 10000000 | 32 | 0.070536 | 0.071400 | 0 | 10 | pipe organ |
| | | | | | | | |
| blitsort | 10000000 | 32 | 0.010271 | 0.010549 | 0 | 10 | descending order |
| crumsort | 10000000 | 32 | 0.010254 | 0.010499 | 0 | 10 | descending order |
| pdqsort | 10000000 | 32 | 0.023129 | 0.023708 | 0 | 10 | descending order |
| wolfsort | 10000000 | 32 | 0.014583 | 0.015316 | 0 | 10 | descending order |
| | | | | | | | |
| blitsort | 10000000 | 32 | 0.073410 | 0.074402 | 0 | 10 | descending saw |
| crumsort | 10000000 | 32 | 0.068284 | 0.069154 | 0 | 10 | descending saw |
| pdqsort | 10000000 | 32 | 0.942142 | 0.958606 | 0 | 10 | descending saw |
| wolfsort | 10000000 | 32 | 0.085338 | 0.086014 | 0 | 10 | descending saw |
| | | | | | | | |
| blitsort | 10000000 | 32 | 0.124089 | 0.130327 | 0 | 10 | random tail |
| crumsort | 10000000 | 32 | 0.103030 | 0.104337 | 0 | 10 | random tail |
| pdqsort | 10000000 | 32 | 0.337862 | 0.342594 | 0 | 10 | random tail |
| wolfsort | 10000000 | 32 | 0.103381 | 0.108048 | 0 | 10 | random tail |
| | | | | | | | |
| blitsort | 10000000 | 32 | 0.191479 | 0.193036 | 0 | 10 | random half |
| crumsort | 10000000 | 32 | 0.146732 | 0.147742 | 0 | 10 | random half |
| pdqsort | 10000000 | 32 | 0.342803 | 0.343424 | 0 | 10 | random half |
| wolfsort | 10000000 | 32 | 0.159515 | 0.160787 | 0 | 10 | random half |
| | | | | | | | |
| blitsort | 10000000 | 32 | 0.182256 | 0.190378 | 0 | 10 | ascending tiles |
| crumsort | 10000000 | 32 | 0.188875 | 0.195063 | 0 | 10 | ascending tiles |
| pdqsort | 10000000 | 32 | 0.285777 | 0.286996 | 0 | 10 | ascending tiles |
| wolfsort | 10000000 | 32 | 0.093709 | 0.094315 | 0 | 10 | ascending tiles |
| | | | | | | | |
| blitsort | 10000000 | 32 | 0.324983 | 0.326345 | 0 | 10 | bit reversal |
| crumsort | 10000000 | 32 | 0.230872 | 0.231599 | 0 | 10 | bit reversal |
| pdqsort | 10000000 | 32 | 0.343915 | 0.344677 | 0 | 10 | bit reversal |
| wolfsort | 10000000 | 32 | 0.250331 | 0.251319 | 0 | 10 | bit reversal |
| | | | | | | | |
| blitsort | 10000000 | 32 | 0.061197 | 0.062058 | 0 | 10 | random % 4 |
| crumsort | 10000000 | 32 | 0.030134 | 0.030564 | 0 | 10 | random % 4 |
| pdqsort | 10000000 | 32 | 0.043492 | 0.043673 | 0 | 10 | random % 4 |
| wolfsort | 10000000 | 32 | 0.081548 | 0.082020 | 0 | 10 | random % 4 |
| | | | | | | | |
| blitsort | 10000000 | 32 | 0.066686 | 0.067764 | 0 | 10 | semi random |
| crumsort | 10000000 | 32 | 0.045479 | 0.046088 | 0 | 10 | semi random |
| pdqsort | 10000000 | 32 | 0.060253 | 0.060612 | 0 | 10 | semi random |
| wolfsort | 10000000 | 32 | 0.190505 | 0.191946 | 0 | 10 | semi random |
| | | | | | | | |
| blitsort | 10000000 | 32 | 0.272456 | 0.274928 | 0 | 10 | random signal |
| crumsort | 10000000 | 32 | 0.224115 | 0.225966 | 0 | 10 | random signal |
| pdqsort | 10000000 | 32 | 0.382742 | 0.384505 | 0 | 10 | random signal |
| wolfsort | 10000000 | 32 | 0.160946 | 0.161769 | 0 | 10 | random signal |
</details>
================================================
FILE: src/bench.c
================================================
/*
To compile use either:
gcc -O3 bench.c
or
clang -O3 bench.c
or
g++ -O3 bench.c
*/
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <sys/time.h>
#include <time.h>
#include <errno.h>
#include <math.h>
#define cmp(a,b) (*(a) > *(b)) // inlined primitive comparisons; comment out to benchmark with the comparison functions instead
const char *sorts[] = { "*", "quadsort", "gridsort", "blitsort", "fluxsort", "skipsort", "crumsort", "wolfsort", "sort::std" };
//#define SKIP_STRINGS
//#define SKIP_DOUBLES
//#define SKIP_LONGS
#if __has_include("blitsort.h")
#include "blitsort.h" // curl "https://raw.githubusercontent.com/scandum/blitsort/master/src/blitsort.{c,h}" -o "blitsort.#1"
#endif
#if __has_include("crumsort.h")
#include "crumsort.h" // curl "https://raw.githubusercontent.com/scandum/crumsort/master/src/crumsort.{c,h}" -o "crumsort.#1"
#endif
#if __has_include("dripsort.h")
#include "dripsort.h"
#endif
#if __has_include("flowsort.h")
#include "flowsort.h"
#endif
#if __has_include("fluxsort.h")
#include "fluxsort.h" // curl "https://raw.githubusercontent.com/scandum/fluxsort/master/src/fluxsort.{c,h}" -o "fluxsort.#1"
#endif
#if __has_include("gridsort.h")
#include "gridsort.h" // curl "https://raw.githubusercontent.com/scandum/gridsort/master/src/gridsort.{c,h}" -o "gridsort.#1"
#endif
#if __has_include("octosort.h")
#include "octosort.h" // curl "https://raw.githubusercontent.com/scandum/octosort/master/src/octosort.{c,h}" -o "octosort.#1"
#endif
#if __has_include("piposort.h")
#include "piposort.h" // curl "https://raw.githubusercontent.com/scandum/piposort/master/src/piposort.{c,h}" -o "piposort.#1"
#endif
#if __has_include("quadsort.h")
#include "quadsort.h" // curl "https://raw.githubusercontent.com/scandum/quadsort/master/src/quadsort.{c,h}" -o "quadsort.#1"
#endif
#if __has_include("skipsort.h")
#include "skipsort.h"
#endif
#if __has_include("wolfsort.h")
#include "wolfsort.h" // curl "https://raw.githubusercontent.com/scandum/wolfsort/master/src/wolfsort.{c,h}" -o "wolfsort.#1"
#endif
#if __has_include("rhsort.c")
#define RHSORT_C
#include "rhsort.c" // curl https://raw.githubusercontent.com/mlochbaum/rhsort/master/rhsort.c > rhsort.c
#endif
#ifdef __GNUG__
#include <algorithm>
#if __has_include("pdqsort.h")
#include "pdqsort.h" // curl https://raw.githubusercontent.com/orlp/pdqsort/master/pdqsort.h > pdqsort.h
#endif
#if __has_include("ska_sort.hpp")
#define SKASORT_HPP
#include "ska_sort.hpp" // curl https://raw.githubusercontent.com/skarupke/ska_sort/master/ska_sort.hpp > ska_sort.hpp
#endif
#if __has_include("timsort.hpp")
#include "timsort.hpp" // curl https://raw.githubusercontent.com/timsort/cpp-TimSort/master/include/gfx/timsort.hpp > timsort.hpp
#endif
#endif
#if __has_include("antiqsort.c")
#include "antiqsort.c"
#endif
//typedef int CMPFUNC (const void *a, const void *b);
typedef void SRTFUNC(void *array, size_t nmemb, size_t size, CMPFUNC *cmpf);
// Comment out __attribute__ ((noinline)) and comparisons++ for full
// throttle. Like so: #define COMPARISON_PP //comparisons++
size_t comparisons;
#define COMPARISON_PP comparisons++
#define NO_INLINE __attribute__ ((noinline))
// primitive type comparison functions
NO_INLINE int cmp_int(const void * a, const void * b)
{
COMPARISON_PP;
return *(int *) a - *(int *) b;
// const int l = *(const int *)a;
// const int r = *(const int *)b;
// return l - r;
// return l > r;
// return (l > r) - (l < r);
}
NO_INLINE int cmp_rev(const void * a, const void * b)
{
int fa = *(int *)a;
int fb = *(int *)b;
COMPARISON_PP;
return fb - fa;
}
NO_INLINE int cmp_stable(const void * a, const void * b)
{
int fa = *(int *)a;
int fb = *(int *)b;
COMPARISON_PP;
return fa / 100000 - fb / 100000;
}
NO_INLINE int cmp_long(const void * a, const void * b)
{
const long long fa = *(const long long *) a;
const long long fb = *(const long long *) b;
COMPARISON_PP;
return (fa > fb) - (fa < fb);
// return (fa > fb);
}
NO_INLINE int cmp_float(const void * a, const void * b)
{
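const float fa = *(const float *) a;
const float fb = *(const float *) b;
// a raw float difference truncates to 0 when |fa - fb| < 1, so compare explicitly
return (fa > fb) - (fa < fb);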
}
NO_INLINE int cmp_long_double(const void * a, const void * b)
{
const long double fa = *(const long double *) a;
const long double fb = *(const long double *) b;
COMPARISON_PP;
return (fa > fb) - (fa < fb);
/* if (isnan(fa) || isnan(fb))
{
return isnan(fa) - isnan(fb);
}
return (fa > fb);
*/
}
// pointer comparison functions
NO_INLINE int cmp_str(const void * a, const void * b)
{
COMPARISON_PP;
return strcmp(*(const char **) a, *(const char **) b);
}
NO_INLINE int cmp_int_ptr(const void * a, const void * b)
{
const int *fa = *(const int **) a;
const int *fb = *(const int **) b;
COMPARISON_PP;
return (*fa > *fb) - (*fa < *fb);
}
NO_INLINE int cmp_long_ptr(const void * a, const void * b)
{
const long long *fa = *(const long long **) a;
const long long *fb = *(const long long **) b;
COMPARISON_PP;
return (*fa > *fb) - (*fa < *fb);
}
NO_INLINE int cmp_long_double_ptr(const void * a, const void * b)
{
const long double *fa = *(const long double **) a;
const long double *fb = *(const long double **) b;
COMPARISON_PP;
return (*fa > *fb) - (*fa < *fb);
}
// c++ comparison functions
#ifdef __GNUG__
NO_INLINE bool cpp_cmp_int(const int &a, const int &b)
{
COMPARISON_PP;
return a < b;
}
NO_INLINE bool cpp_cmp_str(char const* const a, char const* const b)
{
COMPARISON_PP;
return strcmp(a, b) < 0;
}
#endif
long long utime()
{
struct timeval now_time;
gettimeofday(&now_time, NULL);
return now_time.tv_sec * 1000000LL + now_time.tv_usec;
}
void seed_rand(unsigned long long seed)
{
srand(seed);
}
void test_sort(void *array, void *unsorted, void *valid, int minimum, int maximum, int samples, int repetitions, SRTFUNC *srt, const char *name, const char *desc, size_t size, CMPFUNC *cmpf)
{
long long start, end, total, best, average_time, average_comp;
char temp[100];
static char compare = 0;
long long *ptla = (long long *) array, *ptlv = (long long *) valid;
long double *ptda = (long double *) array, *ptdv = (long double *) valid;
int *pta = (int *) array, *ptv = (int *) valid, rep, sam, max, cnt, name32;
#ifdef SKASORT_HPP
void *swap;
#endif
if (*name == '*')
{
if (!strcmp(desc, "random order") || !strcmp(desc, "random 1-4") || !strcmp(desc, "random 4") || !strcmp(desc, "random string") || !strcmp(desc, "random 10"))
{
if (comparisons)
{
compare = 1;
printf("%s\n", "| Name | Items | Type | Best | Average | Compares | Samples | Distribution |");
printf("%s\n", "| --------- | -------- | ---- | -------- | -------- | --------- | ------- | ---------------- |");
}
else
{
printf("%s\n", "| Name | Items | Type | Best | Average | Loops | Samples | Distribution |");
printf("%s\n", "| --------- | -------- | ---- | -------- | -------- | --------- | ------- | ---------------- |");
}
}
else
{
printf("%s\n", "| | | | | | | | |");
}
return;
}
name32 = name[0] + (name[1] ? name[1] * 32 : 0) + (name[2] ? name[2] * 1024 : 0);
best = average_time = average_comp = 0;
if (minimum == 7 && maximum == 7)
{
pta = (int *) unsorted;
printf("\e[1;32m%10d %10d %10d %10d %10d %10d %10d\e[0m\n", pta[0], pta[1], pta[2], pta[3], pta[4], pta[5], pta[6]);
pta = (int *) array;
}
for (sam = 0 ; sam < samples ; sam++)
{
total = average_comp = 0;
max = minimum;
start = utime();
for (rep = repetitions - 1 ; rep >= 0 ; rep--)
{
memcpy(array, (char *) unsorted + maximum * rep * size, max * size);
comparisons = 0;
// edit char *sorts to add / remove sorts
switch (name32)
{
#ifdef BLITSORT_H
case 'b' + 'l' * 32 + 'i' * 1024: blitsort(array, max, size, cmpf); break;
#endif
#ifdef CRUMSORT_H
case 'c' + 'r' * 32 + 'u' * 1024: crumsort(array, max, size, cmpf); break;
#endif
#ifdef DRIPSORT_H
case 'd' + 'r' * 32 + 'i' * 1024: dripsort(array, max, size, cmpf); break;
#endif
#ifdef FLOWSORT_H
case 'f' + 'l' * 32 + 'o' * 1024: flowsort(array, max, size, cmpf); break;
#endif
#ifdef FLUXSORT_H
case 'f' + 'l' * 32 + 'u' * 1024: fluxsort(array, max, size, cmpf); break;
case 's' + '_' * 32 + 'f' * 1024: fluxsort_size(array, max, size, cmpf); break;
#endif
#ifdef GRIDSORT_H
case 'g' + 'r' * 32 + 'i' * 1024: gridsort(array, max, size, cmpf); break;
#endif
#ifdef OCTOSORT_H
case 'o' + 'c' * 32 + 't' * 1024: octosort(array, max, size, cmpf); break;
#endif
#ifdef PIPOSORT_H
case 'p' + 'i' * 32 + 'p' * 1024: piposort(array, max, size, cmpf); break;
#endif
#ifdef QUADSORT_H
case 'q' + 'u' * 32 + 'a' * 1024: quadsort(array, max, size, cmpf); break;
case 's' + '_' * 32 + 'q' * 1024: quadsort_size(array, max, size, cmpf); break;
#endif
#ifdef SKIPSORT_H
case 's' + 'k' * 32 + 'i' * 1024: skipsort(array, max, size, cmpf); break;
#endif
#ifdef WOLFSORT_H
case 'w' + 'o' * 32 + 'l' * 1024: wolfsort(array, max, size, cmpf); break;
#endif
case 'q' + 's' * 32 + 'o' * 1024: qsort(array, max, size, cmpf); break;
#ifdef RHSORT_C
case 'r' + 'h' * 32 + 's' * 1024: if (size == sizeof(int)) rhsort32(pta, max); else return; break;
#endif
#ifdef __GNUG__
case 's' + 'o' * 32 + 'r' * 1024: if (size == sizeof(int)) std::sort(pta, pta + max); else if (size == sizeof(long long)) std::sort(ptla, ptla + max); else std::sort(ptda, ptda + max); break;
case 's' + 't' * 32 + 'a' * 1024: if (size == sizeof(int)) std::stable_sort(pta, pta + max); else if (size == sizeof(long long)) std::stable_sort(ptla, ptla + max); else std::stable_sort(ptda, ptda + max); break;
#ifdef PDQSORT_H
case 'p' + 'd' * 32 + 'q' * 1024: if (size == sizeof(int)) pdqsort(pta, pta + max); else if (size == sizeof(long long)) pdqsort(ptla, ptla + max); else pdqsort(ptda, ptda + max); break;
#endif
#ifdef SKASORT_HPP
case 's' + 'k' * 32 + 'a' * 1024: swap = malloc(max * size); if (size == sizeof(int)) ska_sort_copy(pta, pta + max, (int *) swap); else if (size == sizeof(long long)) ska_sort_copy(ptla, ptla + max, (long long *) swap); else repetitions = 0; free(swap); break;
#endif
#ifdef GFX_TIMSORT_HPP
case 't' + 'i' * 32 + 'm' * 1024: if (size == sizeof(int)) gfx::timsort(pta, pta + max, cpp_cmp_int); else if (size == sizeof(long long)) gfx::timsort(ptla, ptla + max); else gfx::timsort(ptda, ptda + max); break;
#endif
#endif
default:
switch (name32)
{
case 's' + 'o' * 32 + 'r' * 1024:
case 's' + 't' * 32 + 'a' * 1024:
case 'p' + 'd' * 32 + 'q' * 1024:
case 'r' + 'h' * 32 + 's' * 1024:
case 's' + 'k' * 32 + 'a' * 1024:
case 't' + 'i' * 32 + 'm' * 1024:
printf("unknown sort: %s (compile with g++ instead of gcc?)\n", name);
return;
default:
printf("unknown sort: %s\n", name);
return;
}
}
average_comp += comparisons;
if (minimum < maximum && ++max > maximum)
{
max = minimum;
}
}
end = utime();
total = end - start;
if (!best || total < best)
{
best = total;
}
average_time += total;
}
if (minimum == 7 && maximum == 7)
{
printf("\e[1;32m%10d %10d %10d %10d %10d %10d %10d\e[0m\n", pta[0], pta[1], pta[2], pta[3], pta[4], pta[5], pta[6]);
}
if (repetitions == 0)
{
return;
}
average_time /= samples;
if (cmpf == cmp_stable)
{
for (cnt = 1 ; cnt < maximum ; cnt++)
{
if (pta[cnt - 1] > pta[cnt])
{
sprintf(temp, "\e[1;31m%16s\e[0m", "unstable");
desc = temp;
break;
}
}
}
if (compare)
{
if (repetitions <= 1)
{
printf("|%10s |%9d | %4d |%9f |%9f |%10d | %7d | %16s |\e[0m\n", name, maximum, (int) size * 8, best / 1000000.0, average_time / 1000000.0, (int) comparisons, samples, desc);
}
else
{
printf("|%10s |%9d | %4d |%9f |%9f |%10.1f | %7d | %16s |\e[0m\n", name, maximum, (int) size * 8, best / 1000000.0, average_time / 1000000.0, (float) average_comp / repetitions, samples, desc);
}
}
else
{
printf("|%10s | %8d | %4d | %f | %f | %9d | %7d | %16s |\e[0m\n", name, maximum, (int) size * 8, best / 1000000.0, average_time / 1000000.0, repetitions, samples, desc);
}
if (minimum != maximum || cmpf == cmp_stable)
{
return;
}
for (cnt = 1 ; cnt < maximum ; cnt++)
{
if (cmpf == cmp_str)
{
char **ptsa = (char **) array;
if (strcmp((char *) ptsa[cnt - 1], (char *) ptsa[cnt]) > 0)
{
printf("%17s: not properly sorted at index %d. (%s vs %s)\n", name, cnt, (char *) ptsa[cnt - 1], (char *) ptsa[cnt]);
break;
}
}
else if (size == sizeof(int *) && cmpf == cmp_long_double_ptr)
{
long double **pptda = (long double **) array;
if (cmp_long_double_ptr(&pptda[cnt - 1], &pptda[cnt]) > 0)
{
printf("%17s: not properly sorted at index %d. (%Lf vs %Lf)\n", name, cnt, *pptda[cnt - 1], *pptda[cnt]);
break;
}
}
else if (cmpf == cmp_long_ptr)
{
long long **pptla = (long long **) array;
if (cmp_long_ptr(&pptla[cnt - 1], &pptla[cnt]) > 0)
{
printf("%17s: not properly sorted at index %d. (%lld vs %lld)\n", name, cnt, *pptla[cnt - 1], *pptla[cnt]);
break;
}
}
else if (cmpf == cmp_int_ptr)
{
int **pptia = (int **) array;
if (cmp_int_ptr(&pptia[cnt - 1], &pptia[cnt]) > 0)
{
printf("%17s: not properly sorted at index %d. (%d vs %d)\n", name, cnt, *pptia[cnt - 1], *pptia[cnt]);
break;
}
}
else if (size == sizeof(int))
{
if (pta[cnt - 1] > pta[cnt])
{
printf("%17s: not properly sorted at index %d. (%d vs %d)\n", name, cnt, pta[cnt - 1], pta[cnt]);
break;
}
if (pta[cnt - 1] == pta[cnt])
{
// printf("%17s: Found a repeat value at index %d. (%d)\n", name, cnt, pta[cnt]);
}
}
else if (size == sizeof(long long))
{
if (ptla[cnt - 1] > ptla[cnt])
{
printf("%17s: not properly sorted at index %d. (%lld vs %lld)\n", name, cnt, ptla[cnt - 1], ptla[cnt]);
break;
}
}
else if (size == sizeof(long double))
{
if (cmp_long_double(&ptda[cnt - 1], &ptda[cnt]) > 0)
{
printf("%17s: not properly sorted at index %d. (%Lf vs %Lf)\n", name, cnt, ptda[cnt - 1], ptda[cnt]);
break;
}
}
}
// compare every element, including index 0, against the reference array
for (cnt = 0 ; cnt < maximum ; cnt++)
{
if (size == sizeof(int))
{
if (pta[cnt] != ptv[cnt])
{
printf(" validate: array[%d] != valid[%d]. (%d vs %d)\n", cnt, cnt, pta[cnt], ptv[cnt]);
break;
}
}
else if (size == sizeof(long long))
{
if (ptla[cnt] != ptlv[cnt])
{
if (cmpf == cmp_str)
{
char **ptsa = (char **) array;
char **ptsv = (char **) valid;
printf(" validate: array[%d] != valid[%d]. (%s vs %s) %s\n", cnt, cnt, (char *) ptsa[cnt], (char *) ptsv[cnt], !strcmp((char *) ptsa[cnt], (char *) ptsv[cnt]) ? "\e[1;31munstable\e[0m" : "");
break;
}
if (cmpf == cmp_long_ptr)
{
long long **ptla = (long long **) array;
long long **ptlv = (long long **) valid;
printf(" validate: array[%d] != valid[%d]. (%lld vs %lld) %s\n", cnt, cnt, *ptla[cnt], *ptlv[cnt], (*ptla[cnt] == *ptlv[cnt]) ? "\e[1;31munstable\e[0m" : "");
break;
}
if (cmpf == cmp_int_ptr)
{
int **ptia = (int **) array;
int **ptiv = (int **) valid;
printf(" validate: array[%d] != valid[%d]. (%d vs %d) %s\n", cnt, cnt, *ptia[cnt], *ptiv[cnt], (*ptia[cnt] == *ptiv[cnt]) ? "\e[1;31munstable\e[0m" : "");
break;
}
printf(" validate: array[%d] != valid[%d]. (%lld vs %lld)\n", cnt, cnt, ptla[cnt], ptlv[cnt]);
break;
}
}
else if (size == sizeof(long double))
{
if (ptda[cnt] != ptdv[cnt])
{
printf(" validate: array[%d] != valid[%d]. (%Lf vs %Lf)\n", cnt, cnt, ptda[cnt], ptdv[cnt]);
break;
}
}
}
}
void validate()
{
int seed = time(NULL);
int cnt, val, max = 1000;
int *a_array, *r_array, *v_array;
seed_rand(seed);
a_array = (int *) malloc(max * sizeof(int));
r_array = (int *) malloc(max * sizeof(int));
v_array = (int *) malloc(max * sizeof(int));
for (cnt = 0 ; cnt < max ; cnt++) r_array[cnt] = rand();
for (cnt = 0 ; cnt < max ; cnt++)
{
memcpy(a_array, r_array, cnt * sizeof(int));
memcpy(v_array, r_array, cnt * sizeof(int));
quadsort_prim(a_array, cnt, sizeof(int));
qsort(v_array, cnt, sizeof(int), cmp_int);
for (val = 0 ; val < cnt ; val++)
{
if (val && v_array[val - 1] > v_array[val]) {printf("\e[1;31mvalidate rand: seed %d: size: %d Not properly sorted at index %d.\n", seed, cnt, val); return;}
if (a_array[val] != v_array[val]) {printf("\e[1;31mvalidate rand: seed %d: size: %d Not verified at index %d.\n", seed, cnt, val); return;}
}
}
// ascending saw
for (cnt = 0 ; cnt < max ; cnt++) r_array[cnt] = cnt % (max / 5);
for (cnt = 0 ; cnt < max ; cnt += 7)
{
memcpy(a_array, r_array, cnt * sizeof(int));
memcpy(v_array, r_array, cnt * sizeof(int));
quadsort(a_array, cnt, sizeof(int), cmp_int);
qsort(v_array, cnt, sizeof(int), cmp_int);
for (val = 0 ; val < cnt ; val++)
{
if (val && v_array[val - 1] > v_array[val]) {printf("\e[1;31mvalidate ascending saw: seed %d: size: %d Not properly sorted at index %d.\n", seed, cnt, val); return;}
if (a_array[val] != v_array[val]) {printf("\e[1;31mvalidate ascending saw: seed %d: size: %d Not verified at index %d.\n", seed, cnt, val); return;}
}
}
// descending saw
for (cnt = 0 ; cnt < max ; cnt++)
{
r_array[cnt] = (max - cnt + 1) % (max / 11);
}
for (cnt = 1 ; cnt < max ; cnt += 7)
{
memcpy(a_array, r_array, cnt * sizeof(int));
memcpy(v_array, r_array, cnt * sizeof(int));
quadsort(a_array, cnt, sizeof(int), cmp_int);
qsort(v_array, cnt, sizeof(int), cmp_int);
for (val = 0 ; val < cnt ; val++)
{
if (val && v_array[val - 1] > v_array[val]) {printf("\e[1;31mvalidate descending saw: seed %d: size: %d Not properly sorted at index %d.\n\n", seed, cnt, val); return;}
if (a_array[val] != v_array[val]) {printf("\e[1;31mvalidate descending saw: seed %d: size: %d Not verified at index %d.\n\n", seed, cnt, val); return;}
}
}
// random half
for (cnt = 0 ; cnt < max ; cnt++) r_array[cnt] = (cnt < max / 2) ? cnt : rand();
for (cnt = 1 ; cnt < max ; cnt += 7)
{
memcpy(a_array, r_array, cnt * sizeof(int));
memcpy(v_array, r_array, cnt * sizeof(int));
quadsort(a_array, cnt, sizeof(int), cmp_int);
qsort(v_array, cnt, sizeof(int), cmp_int);
for (val = 0 ; val < cnt ; val++)
{
if (val && v_array[val - 1] > v_array[val]) {printf("\e[1;31mvalidate random half: seed %d: size: %d Not properly sorted at index %d.\n", seed, cnt, val); return;}
if (a_array[val] != v_array[val]) {printf("\e[1;31mvalidate random half: seed %d: size: %d Not verified at index %d.\n", seed, cnt, val); return;}
}
}
free(a_array);
free(r_array);
free(v_array);
}
unsigned int bit_reverse(unsigned int x)
{
x = (((x & 0xaaaaaaaa) >> 1) | ((x & 0x55555555) << 1));
x = (((x & 0xcccccccc) >> 2) | ((x & 0x33333333) << 2));
x = (((x & 0xf0f0f0f0) >> 4) | ((x & 0x0f0f0f0f) << 4));
x = (((x & 0xff00ff00) >> 8) | ((x & 0x00ff00ff) << 8));
return((x >> 16) | (x << 16));
}
void run_test(void *a_array, void *r_array, void *v_array, int minimum, int maximum, int samples, int repetitions, int copies, const char *desc, size_t size, CMPFUNC *cmpf)
{
int cnt, rep;
memcpy(v_array, r_array, maximum * size);
for (rep = 0 ; rep < copies ; rep++)
{
memcpy((char *) r_array + rep * maximum * size, v_array, maximum * size);
}
quadsort(v_array, maximum, size, cmpf);
for (cnt = 0 ; (size_t) cnt < sizeof(sorts) / sizeof(char *) ; cnt++)
{
test_sort(a_array, r_array, v_array, minimum, maximum, samples, repetitions, qsort, sorts[cnt], desc, size, cmpf);
}
}
void range_test(int max, int samples, int repetitions, int seed)
{
int cnt, last;
int mem = max * 10 > 32768 * 64 ? max * 10 : 32768 * 64;
char dist[40];
int *a_array = (int *) malloc(max * sizeof(int));
int *r_array = (int *) malloc(mem * sizeof(int));
int *v_array = (int *) malloc(max * sizeof(int));
srand(seed);
for (cnt = 0 ; cnt < mem ; cnt++)
{
r_array[cnt] = rand();
}
if (max <= 4096)
{
for (last = 1, samples = 32768*4, repetitions = 4 ; repetitions <= max ; repetitions *= 2, samples /= 2)
{
if (max >= repetitions)
{
sprintf(dist, "random %d-%d", last, repetitions);
memcpy(v_array, r_array, repetitions * sizeof(int));
quadsort(v_array, repetitions, sizeof(int), cmp_int);
for (cnt = 0 ; (size_t) cnt < sizeof(sorts) / sizeof(char *) ; cnt++)
{
test_sort(a_array, r_array, v_array, last, repetitions, 50, samples, qsort, sorts[cnt], dist, sizeof(int), cmp_int);
}
last = repetitions + 1;
}
}
free(a_array);
free(r_array);
free(v_array);
return;
}
if (max == 10000000)
{
repetitions = 10000000;
for (max = 10 ; max <= 10000000 ; max *= 10)
{
repetitions /= 10;
memcpy(v_array, r_array, max * sizeof(int));
quadsort_prim(v_array, max, sizeof(int));
sprintf(dist, "random %d", max);
for (cnt = 0 ; (size_t) cnt < sizeof(sorts) / sizeof(char *) ; cnt++)
{
test_sort(a_array, r_array, v_array, max, max, 10, repetitions, qsort, sorts[cnt], dist, sizeof(int), cmp_int);
}
}
}
else
{
for (samples = 32768*4, repetitions = 4 ; samples > 0 ; repetitions *= 2, samples /= 2)
{
if (max >= repetitions)
{
memcpy(v_array, r_array, repetitions * sizeof(int));
quadsort(v_array, repetitions, sizeof(int), cmp_int);
sprintf(dist, "random %d", repetitions);
for (cnt = 0 ; (size_t) cnt < sizeof(sorts) / sizeof(char *) ; cnt++)
{
test_sort(a_array, r_array, v_array, repetitions, repetitions, 100, samples, qsort, sorts[cnt], dist, sizeof(int), cmp_int);
}
}
}
}
free(a_array);
free(r_array);
free(v_array);
return;
}
#define VAR int
int main(int argc, char **argv)
{
int max = 100000;
int samples = 10;
int repetitions = 1;
int seed = 0;
int cnt, mem;
VAR *a_array, *r_array, *v_array, sum;
if (argc >= 1 && argv[1] && *argv[1])
{
max = atoi(argv[1]);
}
if (argc >= 2 && argv[2] && *argv[2])
{
samples = atoi(argv[2]);
}
if (argc >= 3 && argv[3] && *argv[3])
{
repetitions = atoi(argv[3]);
}
if (argc >= 4 && argv[4] && *argv[4])
{
seed = atoi(argv[4]);
}
validate();
seed = seed ? seed : time(NULL);
printf("Info: int = %zu, long long = %zu, long double = %zu\n\n", sizeof(int) * 8, sizeof(long long) * 8, sizeof(long double) * 8);
printf("Benchmark: array size: %d, samples: %d, repetitions: %d, seed: %d\n\n", max, samples, repetitions, seed);
if (repetitions == 0)
{
range_test(max, samples, repetitions, seed);
return 0;
}
mem = max * repetitions;
#ifndef SKIP_STRINGS
#ifndef cmp
// C string
{
char **sa_array = (char **) malloc(max * sizeof(char **));
char **sr_array = (char **) malloc(mem * sizeof(char **));
char **sv_array = (char **) malloc(max * sizeof(char **));
char *buffer = (char *) malloc(mem * 16);
seed_rand(seed);
for (cnt = 0 ; cnt < mem ; cnt++)
{
sprintf(buffer + cnt * 16, "%X", rand() % 1000000);
sr_array[cnt] = buffer + cnt * 16;
}
run_test(sa_array, sr_array, sv_array, max, max, samples, repetitions, 0, "random string", sizeof(char **), cmp_str);
free(sa_array);
free(sr_array);
free(sv_array);
free(buffer);
}
// long double table
{
long double **da_array = (long double **) malloc(max * sizeof(long double *));
long double **dr_array = (long double **) malloc(mem * sizeof(long double *));
long double **dv_array = (long double **) malloc(max * sizeof(long double *));
long double *buffer = (long double *) malloc(mem * sizeof(long double));
if (da_array == NULL || dr_array == NULL || dv_array == NULL || buffer == NULL)
{
printf("main(%d,%d,%d): malloc: %s\n", max, samples, repetitions, strerror(errno));
return 0;
}
seed_rand(seed);
for (cnt = 0 ; cnt < mem ; cnt++)
{
buffer[cnt] = (long double) rand();
buffer[cnt] += (long double) ((unsigned long long) rand() << 32ULL);
dr_array[cnt] = buffer + cnt;
}
run_test(da_array, dr_array, dv_array, max, max, samples, repetitions, 0, "random double", sizeof(long double *), cmp_long_double_ptr);
free(da_array);
free(dr_array);
free(dv_array);
free(buffer);
}
// long long table
{
long long **la_array = (long long **) malloc(max * sizeof(long long *));
long long **lr_array = (long long **) malloc(mem * sizeof(long long *));
long long **lv_array = (long long **) malloc(max * sizeof(long long *));
long long *buffer = (long long *) malloc(mem * sizeof(long long));
if (la_array == NULL || lr_array == NULL || lv_array == NULL || buffer == NULL)
{
printf("main(%d,%d,%d): malloc: %s\n", max, samples, repetitions, strerror(errno));
return 0;
}
seed_rand(seed);
for (cnt = 0 ; cnt < mem ; cnt++)
{
buffer[cnt] = (long long) rand();
buffer[cnt] += (long long) ((unsigned long long) rand() << 32ULL);
lr_array[cnt] = buffer + cnt;
}
run_test(la_array, lr_array, lv_array, max, max, samples, repetitions, 0, "random long", sizeof(long long *), cmp_long_ptr);
free(la_array);
free(lr_array);
free(lv_array);
free(buffer);
}
// int table
{
int **la_array = (int **) malloc(max * sizeof(int *));
int **lr_array = (int **) malloc(mem * sizeof(int *));
int **lv_array = (int **) malloc(max * sizeof(int *));
int *buffer = (int *) malloc(mem * sizeof(int));
if (la_array == NULL || lr_array == NULL || lv_array == NULL || buffer == NULL)
{
printf("main(%d,%d,%d): malloc: %s\n", max, samples, repetitions, strerror(errno));
return 0;
}
seed_rand(seed);
for (cnt = 0 ; cnt < mem ; cnt++)
{
buffer[cnt] = rand();
lr_array[cnt] = buffer + cnt;
}
run_test(la_array, lr_array, lv_array, max, max, samples, repetitions, 0, "random int", sizeof(int *), cmp_int_ptr);
free(la_array);
free(lr_array);
free(lv_array);
free(buffer);
printf("\n");
}
#endif
#endif
// 128 bit
#ifndef SKIP_DOUBLES
long double *da_array = (long double *) malloc(max * sizeof(long double));
long double *dr_array = (long double *) malloc(mem * sizeof(long double));
long double *dv_array = (long double *) malloc(max * sizeof(long double));
if (da_array == NULL || dr_array == NULL || dv_array == NULL)
{
printf("main(%d,%d,%d): malloc: %s\n", max, samples, repetitions, strerror(errno));
return 0;
}
seed_rand(seed);
for (cnt = 0 ; cnt < mem ; cnt++)
{
dr_array[cnt] = (long double) rand();
dr_array[cnt] += (long double) ((unsigned long long) rand() << 32ULL);
dr_array[cnt] += 1.0L / 3.0L;
}
memcpy(dv_array, dr_array, max * sizeof(long double));
quadsort(dv_array, max, sizeof(long double), cmp_long_double);
for (cnt = 0 ; (size_t) cnt < sizeof(sorts) / sizeof(char *) ; cnt++)
{
test_sort(da_array, dr_array, dv_array, max, max, samples, repetitions, qsort, sorts[cnt], "random order", sizeof(long double), cmp_long_double);
}
#ifndef cmp
#ifdef QUADSORT_H
test_sort(da_array, dr_array, dv_array, max, max, samples, repetitions, qsort, "s_quadsort", "random order", sizeof(long double), cmp_long_double_ptr);
#endif
#endif
free(da_array);
free(dr_array);
free(dv_array);
printf("\n");
#endif
// 64 bit
#ifndef SKIP_LONGS
long long *la_array = (long long *) malloc(max * sizeof(long long));
long long *lr_array = (long long *) malloc(mem * sizeof(long long));
long long *lv_array = (long long *) malloc(max * sizeof(long long));
if (la_array == NULL || lr_array == NULL || lv_array == NULL)
{
printf("main(%d,%d,%d): malloc: %s\n", max, samples, repetitions, strerror(errno));
return 0;
}
seed_rand(seed);
for (cnt = 0 ; cnt < mem ; cnt++)
{
lr_array[cnt] = rand();
lr_array[cnt] += (unsigned long long) rand() << 32ULL;
}
memcpy(lv_array, lr_array, max * sizeof(long long));
quadsort(lv_array, max, sizeof(long long), cmp_long);
for (cnt = 0 ; (size_t) cnt < sizeof(sorts) / sizeof(char *) ; cnt++)
{
test_sort(la_array, lr_array, lv_array, max, max, samples, repetitions, qsort, sorts[cnt], "random order", sizeof(long long), cmp_long);
}
free(la_array);
free(lr_array);
free(lv_array);
printf("\n");
#endif
// 32 bit
a_array = (VAR *) malloc(max * sizeof(VAR));
r_array = (VAR *) malloc(mem * sizeof(VAR));
v_array = (VAR *) malloc(max * sizeof(VAR));
int quad0 = 0;
int nmemb = max;
int half1 = nmemb / 2;
int half2 = nmemb - half1;
int quad1 = half1 / 2;
int quad2 = half1 - quad1;
int quad3 = half2 / 2;
int quad4 = half2 - quad3;
int span3 = quad1 + quad2 + quad3;
// random
seed_rand(seed);
for (cnt = 0 ; cnt < mem ; cnt++)
{
r_array[cnt] = rand();
}
run_test(a_array, r_array, v_array, max, max, samples, repetitions, 0, "random order", sizeof(VAR), cmp_int);
// random % 100
for (cnt = 0 ; cnt < mem ; cnt++)
{
r_array[cnt] = rand() % 100;
}
run_test(a_array, r_array, v_array, max, max, samples, repetitions, 0, "random % 100", sizeof(VAR), cmp_int);
// ascending
for (cnt = sum = 0 ; cnt < mem ; cnt++)
{
r_array[cnt] = sum; sum += rand() % 5;
}
run_test(a_array, r_array, v_array, max, max, samples, repetitions, 0, "ascending order", sizeof(VAR), cmp_int);
// ascending saw
for (cnt = 0 ; cnt < max ; cnt++)
{
r_array[cnt] = rand();
}
quadsort(r_array + quad0, quad1, sizeof(VAR), cmp_int);
quadsort(r_array + quad1, quad2, sizeof(VAR), cmp_int);
quadsort(r_array + half1, quad3, sizeof(VAR), cmp_int);
quadsort(r_array + span3, quad4, sizeof(VAR), cmp_int);
run_test(a_array, r_array, v_array, max, max, samples, repetitions, repetitions, "ascending saw", sizeof(VAR), cmp_int);
// pipe organ
for (cnt = 0 ; cnt < max ; cnt++)
{
r_array[cnt] = rand();
}
quadsort(r_array + quad0, half1, sizeof(VAR), cmp_int);
qsort(r_array + half1, half2, sizeof(VAR), cmp_rev);
for (cnt = half1 + 1 ; cnt < max ; cnt++)
{
if (r_array[cnt] >= r_array[cnt - 1])
{
r_array[cnt] = r_array[cnt - 1] - 1; // guarantee the run is strictly descending
}
}
run_test(a_array, r_array, v_array, max, max, samples, repetitions, repetitions, "pipe organ", sizeof(VAR), cmp_int);
// descending
for (cnt = 0, sum = mem * 10 ; cnt < mem ; cnt++)
{
r_array[cnt] = sum; sum -= 1 + rand() % 5;
}
run_test(a_array, r_array, v_array, max, max, samples, repetitions, 0, "descending order", sizeof(VAR), cmp_int);
// descending saw
for (cnt = 0 ; cnt < max ; cnt++)
{
r_array[cnt] = rand();
}
qsort(r_array + quad0, quad1, sizeof(VAR), cmp_rev);
qsort(r_array + quad1, quad2, sizeof(VAR), cmp_rev);
qsort(r_array + half1, quad3, sizeof(VAR), cmp_rev);
qsort(r_array + span3, quad4, sizeof(VAR), cmp_rev);
for (cnt = 1 ; cnt < max ; cnt++)
{
if (cnt == quad1 || cnt == half1 || cnt == span3) continue;
if (r_array[cnt] >= r_array[cnt - 1])
{
r_array[cnt] = r_array[cnt - 1] - 1; // guarantee the run is strictly descending
}
}
run_test(a_array, r_array, v_array, max, max, samples, repetitions, repetitions, "descending saw", sizeof(VAR), cmp_int);
// random tail 25%
for (cnt = 0 ; cnt < max ; cnt++)
{
r_array[cnt] = rand();
}
quadsort(r_array, span3, sizeof(VAR), cmp_int);
run_test(a_array, r_array, v_array, max, max, samples, repetitions, repetitions, "random tail", sizeof(VAR), cmp_int);
// random 50%
for (cnt = 0 ; cnt < max ; cnt++)
{
r_array[cnt] = rand();
}
quadsort(r_array, half1, sizeof(VAR), cmp_int);
run_test(a_array, r_array, v_array, max, max, samples, repetitions, repetitions, "random half", sizeof(VAR), cmp_int);
// tiles
for (cnt = 0 ; cnt < mem ; cnt++)
{
if (cnt % 2 == 0)
{
r_array[cnt] = 16777216 + cnt;
}
else
{
r_array[cnt] = 33554432 + cnt;
}
}
run_test(a_array, r_array, v_array, max, max, samples, repetitions, 0, "ascending tiles", sizeof(VAR), cmp_int);
// bit-reversal
for (cnt = 0 ; cnt < mem ; cnt++)
{
r_array[cnt] = bit_reverse(cnt);
}
run_test(a_array, r_array, v_array, max, max, samples, repetitions, 0, "bit reversal", sizeof(VAR), cmp_int);
#ifndef cmp
#ifdef ANTIQSORT
test_antiqsort;
#endif
#endif
#define QUAD_DEBUG
#if __has_include("extra_tests.c")
#include "extra_tests.c"
#endif
free(a_array);
free(r_array);
free(v_array);
return 0;
}
================================================
FILE: src/blitsort.c
================================================
// blitsort 1.2.1.3 - Igor van den Hoven ivdhoven@gmail.com
#define BLIT_AUX 512 // set to 0 for sqrt(n) cache size
#define BLIT_OUT 96 // should be smaller or equal to BLIT_AUX
void FUNC(blit_partition)(VAR *array, VAR *swap, size_t swap_size, size_t nmemb, CMPFUNC *cmp);
void FUNC(blit_analyze)(VAR *array, VAR *swap, size_t swap_size, size_t nmemb, CMPFUNC *cmp)
{
unsigned char loop, asum, bsum, csum, dsum;
unsigned int astreaks, bstreaks, cstreaks, dstreaks;
size_t quad1, quad2, quad3, quad4, half1, half2;
size_t cnt, abalance, bbalance, cbalance, dbalance;
VAR *pta, *ptb, *ptc, *ptd;
half1 = nmemb / 2;
quad1 = half1 / 2;
quad2 = half1 - quad1;
half2 = nmemb - half1;
quad3 = half2 / 2;
quad4 = half2 - quad3;
pta = array;
ptb = array + quad1;
ptc = array + half1;
ptd = array + half1 + quad3;
astreaks = bstreaks = cstreaks = dstreaks = 0;
abalance = bbalance = cbalance = dbalance = 0;
for (cnt = nmemb ; cnt > 132 ; cnt -= 128)
{
for (asum = bsum = csum = dsum = 0, loop = 32 ; loop ; loop--)
{
asum += cmp(pta, pta + 1) > 0; pta++;
bsum += cmp(ptb, ptb + 1) > 0; ptb++;
csum += cmp(ptc, ptc + 1) > 0; ptc++;
dsum += cmp(ptd, ptd + 1) > 0; ptd++;
}
abalance += asum; astreaks += asum = (asum == 0) | (asum == 32);
bbalance += bsum; bstreaks += bsum = (bsum == 0) | (bsum == 32);
cbalance += csum; cstreaks += csum = (csum == 0) | (csum == 32);
dbalance += dsum; dstreaks += dsum = (dsum == 0) | (dsum == 32);
if (cnt > 516 && asum + bsum + csum + dsum == 0)
{
abalance += 48; pta += 96;
bbalance += 48; ptb += 96;
cbalance += 48; ptc += 96;
dbalance += 48; ptd += 96;
cnt -= 384;
}
}
for ( ; cnt > 7 ; cnt -= 4)
{
abalance += cmp(pta, pta + 1) > 0; pta++;
bbalance += cmp(ptb, ptb + 1) > 0; ptb++;
cbalance += cmp(ptc, ptc + 1) > 0; ptc++;
dbalance += cmp(ptd, ptd + 1) > 0; ptd++;
}
if (quad1 < quad2) {bbalance += cmp(ptb, ptb + 1) > 0; ptb++;}
if (quad1 < quad3) {cbalance += cmp(ptc, ptc + 1) > 0; ptc++;}
if (quad1 < quad4) {dbalance += cmp(ptd, ptd + 1) > 0; ptd++;}
cnt = abalance + bbalance + cbalance + dbalance;
if (cnt == 0)
{
if (cmp(pta, pta + 1) <= 0 && cmp(ptb, ptb + 1) <= 0 && cmp(ptc, ptc + 1) <= 0)
{
return;
}
}
asum = quad1 - abalance == 1;
bsum = quad2 - bbalance == 1;
csum = quad3 - cbalance == 1;
dsum = quad4 - dbalance == 1;
if (asum | bsum | csum | dsum)
{
unsigned char span1 = (asum && bsum) * (cmp(pta, pta + 1) > 0);
unsigned char span2 = (bsum && csum) * (cmp(ptb, ptb + 1) > 0);
unsigned char span3 = (csum && dsum) * (cmp(ptc, ptc + 1) > 0);
switch (span1 | span2 * 2 | span3 * 4)
{
case 0: break;
case 1: FUNC(quad_reversal)(array, ptb); abalance = bbalance = 0; break;
case 2: FUNC(quad_reversal)(pta + 1, ptc); bbalance = cbalance = 0; break;
case 3: FUNC(quad_reversal)(array, ptc); abalance = bbalance = cbalance = 0; break;
case 4: FUNC(quad_reversal)(ptb + 1, ptd); cbalance = dbalance = 0; break;
case 5: FUNC(quad_reversal)(array, ptb);
FUNC(quad_reversal)(ptb + 1, ptd); abalance = bbalance = cbalance = dbalance = 0; break;
case 6: FUNC(quad_reversal)(pta + 1, ptd); bbalance = cbalance = dbalance = 0; break;
case 7: FUNC(quad_reversal)(array, ptd); return;
}
if (asum && abalance) {FUNC(quad_reversal)(array, pta); abalance = 0;}
if (bsum && bbalance) {FUNC(quad_reversal)(pta + 1, ptb); bbalance = 0;}
if (csum && cbalance) {FUNC(quad_reversal)(ptb + 1, ptc); cbalance = 0;}
if (dsum && dbalance) {FUNC(quad_reversal)(ptc + 1, ptd); dbalance = 0;}
}
#ifdef cmp
cnt = nmemb / 256; // more than 50% ordered
#else
cnt = nmemb / 512; // more than 25% ordered
#endif
asum = astreaks > cnt;
bsum = bstreaks > cnt;
csum = cstreaks > cnt;
dsum = dstreaks > cnt;
#ifndef cmp
if (quad1 > QUAD_CACHE)
{
asum = bsum = csum = dsum = 1;
}
#endif
switch (asum + bsum * 2 + csum * 4 + dsum * 8)
{
case 0:
FUNC(blit_partition)(array, swap, swap_size, nmemb, cmp);
return;
case 1:
if (abalance) FUNC(quadsort_swap)(array, swap, swap_size, quad1, cmp);
FUNC(blit_partition)(pta + 1, swap, swap_size, quad2 + half2, cmp);
break;
case 2:
FUNC(blit_partition)(array, swap, swap_size, quad1, cmp);
if (bbalance) FUNC(quadsort_swap)(pta + 1, swap, swap_size, quad2, cmp);
FUNC(blit_partition)(ptb + 1, swap, swap_size, half2, cmp);
break;
case 3:
if (abalance) FUNC(quadsort_swap)(array, swap, swap_size, quad1, cmp);
if (bbalance) FUNC(quadsort_swap)(pta + 1, swap, swap_size, quad2, cmp);
FUNC(blit_partition)(ptb + 1, swap, swap_size, half2, cmp);
break;
case 4:
FUNC(blit_partition)(array, swap, swap_size, half1, cmp);
if (cbalance) FUNC(quadsort_swap)(ptb + 1, swap, swap_size, quad3, cmp);
FUNC(blit_partition)(ptc + 1, swap, swap_size, quad4, cmp);
break;
case 8:
FUNC(blit_partition)(array, swap, swap_size, half1 + quad3, cmp);
if (dbalance) FUNC(quadsort_swap)(ptc + 1, swap, swap_size, quad4, cmp);
break;
case 9:
if (abalance) FUNC(quadsort_swap)(array, swap, swap_size, quad1, cmp);
FUNC(blit_partition)(pta + 1, swap, swap_size, quad2 + quad3, cmp);
if (dbalance) FUNC(quadsort_swap)(ptc + 1, swap, swap_size, quad4, cmp);
break;
case 12:
FUNC(blit_partition)(array, swap, swap_size, half1, cmp);
if (cbalance) FUNC(quadsort_swap)(ptb + 1, swap, swap_size, quad3, cmp);
if (dbalance) FUNC(quadsort_swap)(ptc + 1, swap, swap_size, quad4, cmp);
break;
case 5:
case 6:
case 7:
case 10:
case 11:
case 13:
case 14:
case 15:
if (asum)
{
if (abalance) FUNC(quadsort_swap)(array, swap, swap_size, quad1, cmp);
}
else FUNC(blit_partition)(array, swap, swap_size, quad1, cmp);
if (bsum)
{
if (bbalance) FUNC(quadsort_swap)(pta + 1, swap, swap_size, quad2, cmp);
}
else FUNC(blit_partition)(pta + 1, swap, swap_size, quad2, cmp);
if (csum)
{
if (cbalance) FUNC(quadsort_swap)(ptb + 1, swap, swap_size, quad3, cmp);
}
else FUNC(blit_partition)(ptb + 1, swap, swap_size, quad3, cmp);
if (dsum)
{
if (dbalance) FUNC(quadsort_swap)(ptc + 1, swap, swap_size, quad4, cmp);
}
else FUNC(blit_partition)(ptc + 1, swap, swap_size, quad4, cmp);
break;
}
if (cmp(pta, pta + 1) <= 0)
{
if (cmp(ptc, ptc + 1) <= 0)
{
if (cmp(ptb, ptb + 1) <= 0)
{
return;
}
}
else
{
FUNC(rotate_merge_block)(array + half1, swap, swap_size, quad3, quad4, cmp);
}
}
else
{
FUNC(rotate_merge_block)(array, swap, swap_size, quad1, quad2, cmp);
if (cmp(ptc, ptc + 1) > 0)
{
FUNC(rotate_merge_block)(array + half1, swap, swap_size, quad3, quad4, cmp);
}
}
FUNC(rotate_merge_block)(array, swap, swap_size, half1, half2, cmp);
}
// The next 4 functions are used for pivot selection
VAR FUNC(blit_binary_median)(VAR *pta, VAR *ptb, size_t len, CMPFUNC *cmp)
{
while (len /= 2)
{
if (cmp(pta + len, ptb + len) <= 0) pta += len; else ptb += len;
}
return cmp(pta, ptb) > 0 ? *pta : *ptb;
}
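/*
A standalone sketch of the halving search in blit_binary_median, specialised to int and the <= operator instead of VAR and cmp. The names binary_median_int and binary_median_int_selfcheck are hypothetical, and the power-of-two length is an assumption of this sketch. Under that assumption each step discards the lower half of one sorted run and the upper half of the other, and the result is the upper median of the 2 * len combined values.

```c
#include <assert.h>
#include <stddef.h>

// Upper median of two sorted int runs a[0..len) and b[0..len).
// Assumption for this sketch: len is a power of two.
static int binary_median_int(const int *a, const int *b, size_t len)
{
    assert(len > 0 && (len & (len - 1)) == 0);

    while (len /= 2)
    {
        // Drop the lower half of one run; the shrinking len drops the
        // upper half of the other run.
        if (a[len] <= b[len]) a += len; else b += len;
    }
    return *a > *b ? *a : *b;
}

// Hypothetical self-check on interleaved and disjoint runs.
static void binary_median_int_selfcheck(void)
{
    const int odd[4]  = {1, 3, 5, 7};
    const int even[4] = {2, 4, 6, 8};
    const int low[4]  = {1, 2, 3, 4};
    const int high[4] = {5, 6, 7, 8};

    assert(binary_median_int(odd, even, 4) == 5);
    assert(binary_median_int(low, high, 4) == 5);
}
```

Only about log2(len) comparisons are needed, which is why the pivot sample can be sorted as two halves and never fully merged.
*/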
void FUNC(blit_trim_four)(VAR *pta, CMPFUNC *cmp)
{
VAR swap;
size_t x;
x = cmp(pta, pta + 1) > 0; swap = pta[!x]; pta[0] = pta[x]; pta[1] = swap; pta += 2;
x = cmp(pta, pta + 1) > 0; swap = pta[!x]; pta[0] = pta[x]; pta[1] = swap; pta -= 2;
x = (cmp(pta, pta + 2) <= 0) * 2; pta[2] = pta[x]; pta++;
x = (cmp(pta, pta + 2) > 0) * 2; pta[0] = pta[x];
}
VAR FUNC(blit_median_of_nine)(VAR *array, VAR *swap, size_t nmemb, CMPFUNC *cmp)
{
VAR *pta;
size_t x, y, z;
z = nmemb / 9;
pta = array;
for (x = 0 ; x < 9 ; x++)
{
swap[x] = *pta;
pta += z;
}
FUNC(blit_trim_four)(swap, cmp);
FUNC(blit_trim_four)(swap + 4, cmp);
swap[0] = swap[5];
swap[3] = swap[8];
FUNC(blit_trim_four)(swap, cmp);
swap[0] = swap[6];
x = cmp(swap + 0, swap + 1) > 0;
y = cmp(swap + 0, swap + 2) > 0;
z = cmp(swap + 1, swap + 2) > 0;
return swap[(x == y) + (y ^ z)];
}
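/*
A standalone sketch of the branchless median-of-three index used in the final return of blit_median_of_nine, specialised to int. The names median_of_three_index and median_of_three_selfcheck are hypothetical. The three comparison results are combined arithmetically, so no conditional jump is needed to pick the middle value.

```c
#include <assert.h>
#include <stddef.h>

// Index (0, 1 or 2) of the median of v[0], v[1], v[2], chosen without branching.
static size_t median_of_three_index(const int *v)
{
    size_t x = v[0] > v[1];
    size_t y = v[0] > v[2];
    size_t z = v[1] > v[2];

    // x != y: v[0] lies between the other two, and y ^ z is 0, giving index 0.
    // x == y: v[0] is an extreme, so start at index 1 and let y ^ z move to 2.
    return (x == y) + (y ^ z);
}

// Hypothetical self-check over every ordering of three distinct values.
static void median_of_three_selfcheck(void)
{
    const int perms[6][3] = {
        {1, 2, 3}, {1, 3, 2}, {2, 1, 3}, {2, 3, 1}, {3, 1, 2}, {3, 2, 1}
    };
    size_t i;

    for (i = 0; i < 6; i++)
    {
        assert(perms[i][median_of_three_index(perms[i])] == 2);
    }
}
```
*/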
VAR FUNC(blit_median_of_cbrt)(VAR *array, VAR *swap, size_t swap_size, size_t nmemb, int *generic, CMPFUNC *cmp)
{
VAR *pta, *pts;
size_t cnt, div, cbrt;
for (cbrt = 32 ; nmemb > cbrt * cbrt * cbrt && cbrt < swap_size ; cbrt *= 2) {}
div = nmemb / cbrt;
pta = array; // + (size_t) &div / 16 % div; // for a non-deterministic offset
pts = swap;
for (cnt = 0 ; cnt < cbrt ; cnt++)
{
pts[cnt] = *pta;
pta += div;
}
cbrt /= 2;
FUNC(quadsort_swap)(pts, pts + cbrt * 2, cbrt, cbrt, cmp);
FUNC(quadsort_swap)(pts + cbrt, pts + cbrt * 2, cbrt, cbrt, cmp);
*generic = (cmp(pts + cbrt * 2 - 1, pts) <= 0) & (cmp(pts + cbrt - 1, pts) <= 0);
return FUNC(blit_binary_median)(pts, pts + cbrt, cbrt, cmp);
}
// As per suggestion by Marshall Lochbaum to improve generic data handling
size_t FUNC(blit_reverse_partition)(VAR *array, VAR *swap, VAR *piv, size_t swap_size, size_t nmemb, CMPFUNC *cmp)
{
if (nmemb > swap_size)
{
size_t l, r, h = nmemb / 2;
l = FUNC(blit_reverse_partition)(array + 0, swap, piv, swap_size, h, cmp);
r = FUNC(blit_reverse_partition)(array + h, swap, piv, swap_size, nmemb - h, cmp);
FUNC(trinity_rotation)(array + l, swap, swap_size, h - l + r, h - l);
return l + r;
}
#if !defined __clang__
size_t cnt, val, m = 0;
VAR *pta = array;
for (cnt = nmemb / 4 ; cnt ; cnt--)
{
val = cmp(piv, pta) > 0; swap[-m] = array[m] = *pta++; m += val; swap++;
val = cmp(piv, pta) > 0; swap[-m] = array[m] = *pta++; m += val; swap++;
val = cmp(piv, pta) > 0; swap[-m] = array[m] = *pta++; m += val; swap++;
val = cmp(piv, pta) > 0; swap[-m] = array[m] = *pta++; m += val; swap++;
}
for (cnt = nmemb % 4 ; cnt ; cnt--)
{
val = cmp(piv, pta) > 0; swap[-m] = array[m] = *pta++; m += val; swap++;
}
swap -= nmemb;
#else
size_t cnt, m;
VAR *tmp, *ptx = array, *pta = array, *pts = swap;
for (cnt = nmemb / 4 ; cnt ; cnt--)
{
tmp = cmp(piv, ptx) > 0 ? pta++ : pts++; *tmp = *ptx++;
tmp = cmp(piv, ptx) > 0 ? pta++ : pts++; *tmp = *ptx++;
tmp = cmp(piv, ptx) > 0 ? pta++ : pts++; *tmp = *ptx++;
tmp = cmp(piv, ptx) > 0 ? pta++ : pts++; *tmp = *ptx++;
}
for (cnt = nmemb % 4 ; cnt ; cnt--)
{
tmp = cmp(piv, ptx) > 0 ? pta++ : pts++; *tmp = *ptx++;
}
m = pta - array;
#endif
memcpy(array + m, swap, (nmemb - m) * sizeof(VAR));
return m;
}
size_t FUNC(blit_default_partition)(VAR *array, VAR *swap, VAR *piv, size_t swap_size, size_t nmemb, CMPFUNC *cmp)
{
if (nmemb > swap_size)
{
size_t l, r, h = nmemb / 2;
l = FUNC(blit_default_partition)(array + 0, swap, piv, swap_size, h, cmp);
r = FUNC(blit_default_partition)(array + h, swap, piv, swap_size, nmemb - h, cmp);
FUNC(trinity_rotation)(array + l, swap, swap_size, h - l + r, h - l);
return l + r;
}
#if !defined __clang__
size_t cnt, val, m = 0;
VAR *pta = array;
for (cnt = nmemb / 4 ; cnt ; cnt--)
{
val = cmp(pta, piv) <= 0; swap[-m] = array[m] = *pta++; m += val; swap++;
val = cmp(pta, piv) <= 0; swap[-m] = array[m] = *pta++; m += val; swap++;
val = cmp(pta, piv) <= 0; swap[-m] = array[m] = *pta++; m += val; swap++;
val = cmp(pta, piv) <= 0; swap[-m] = array[m] = *pta++; m += val; swap++;
}
for (cnt = nmemb % 4 ; cnt ; cnt--)
{
val = cmp(pta, piv) <= 0; swap[-m] = array[m] = *pta++; m += val; swap++;
}
swap -= nmemb;
#else
size_t cnt, m;
VAR *tmp, *ptx = array, *pta = array, *pts = swap;
for (cnt = nmemb / 4 ; cnt ; cnt--)
{
tmp = cmp(ptx, piv) <= 0 ? pta++ : pts++; *tmp = *ptx++;
tmp = cmp(ptx, piv) <= 0 ? pta++ : pts++; *tmp = *ptx++;
tmp = cmp(ptx, piv) <= 0 ? pta++ : pts++; *tmp = *ptx++;
tmp = cmp(ptx, piv) <= 0 ? pta++ : pts++; *tmp = *ptx++;
}
for (cnt = nmemb % 4 ; cnt ; cnt--)
{
tmp = cmp(ptx, piv) <= 0 ? pta++ : pts++; *tmp = *ptx++;
}
m = pta - array;
#endif
memcpy(array + m, swap, sizeof(VAR) * (nmemb - m));
return m;
}
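/*
A standalone sketch of the branchless write pattern in the non-clang branch of blit_default_partition, for the case where the input fits in the swap buffer. It is specialised to int, and the names partition_le_int and partition_le_int_selfcheck are hypothetical. Every element is stored in both the array and the swap buffer; only the increment of m decides which copy is kept, so the loop body contains no data-dependent branch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Stable partition of array[0..n) around piv; swap must have room for n ints.
// Returns the number of elements <= piv, which end up at the front.
static size_t partition_le_int(int *array, int *swap, size_t n, int piv)
{
    size_t i, m = 0;

    for (i = 0; i < n; i++)
    {
        int v = array[i];

        // Store v in both candidate slots; advancing m decides which copy survives.
        array[m] = v;
        swap[i - m] = v;
        m += (v <= piv);
    }
    // The elements > piv were collected in swap in their original order.
    memcpy(array + m, swap, (n - m) * sizeof(int));

    return m;
}

// Hypothetical self-check: the relative order inside each side is preserved.
static void partition_le_int_selfcheck(void)
{
    int data[6] = {5, 1, 8, 3, 9, 2};
    int swap[6];
    const int want[6] = {1, 3, 2, 5, 8, 9};

    assert(partition_le_int(data, swap, 6, 4) == 3);
    assert(memcmp(data, want, sizeof(want)) == 0);
}
```

Because m never exceeds i, the write to array[m] can only touch a slot that has already been read, which is what makes the in-place half of the scheme safe.
*/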
void FUNC(blit_partition)(VAR *array, VAR *swap, size_t swap_size, size_t nmemb, CMPFUNC *cmp)
{
size_t a_size = 0, s_size;
VAR piv, max = 0;
int generic = 0;
while (1)
{
if (nmemb <= 2048)
{
piv = FUNC(blit_median_of_nine)(array, swap, nmemb, cmp);
}
else
{
piv = FUNC(blit_median_of_cbrt)(array, swap, swap_size, nmemb, &generic, cmp);
if (generic) break;
}
if (a_size && cmp(&max, &piv) <= 0)
{
a_size = FUNC(blit_reverse_partition)(array, swap, &piv, swap_size, nmemb, cmp);
s_size = nmemb - a_size;
nmemb = a_size;
if (s_size <= a_size / 16 || a_size <= BLIT_OUT) break;
a_size = 0;
continue;
}
a_size = FUNC(blit_default_partition)(array, swap, &piv, swap_size, nmemb, cmp);
s_size = nmemb - a_size;
if (a_size <= s_size / 16 || s_size <= BLIT_OUT)
{
if (s_size == 0)
{
a_size = FUNC(blit_reverse_partition)(array, swap, &piv, swap_size, a_size, cmp);
s_size = nmemb - a_size;
nmemb = a_size;
if (s_size <= a_size / 16 || a_size <= BLIT_OUT) break;
a_size = 0;
continue;
}
FUNC(quadsort_swap)(array + a_size, swap, swap_size, s_size, cmp);
}
else
{
FUNC(blit_partition)(array + a_size, swap, swap_size, s_size, cmp);
}
nmemb = a_size;
if (s_size <= a_size / 16 || a_size <= BLIT_OUT) break;
max = piv;
}
FUNC(quadsort_swap)(array, swap, swap_size, nmemb, cmp);
}
void FUNC(blitsort)(void *array, size_t nmemb, CMPFUNC *cmp)
{
if (nmemb <= 132)
{
FUNC(quadsort)(array, nmemb, cmp);
}
else
{
VAR *pta = (VAR *) array;
#if BLIT_AUX
size_t swap_size = BLIT_AUX;
#else
size_t swap_size = 1 << 19;
while (nmemb / swap_size < swap_size / 128)
{
swap_size /= 4;
}
#endif
VAR swap[swap_size];
FUNC(blit_analyze)(pta, swap, swap_size, nmemb, cmp);
}
}
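/*
A standalone sketch of the swap size selection that applies when BLIT_AUX is set to 0. The name blit_sqrt_swap_size is hypothetical; the loop is the one in the #else branch above. Starting from 1 << 19, the size is quartered until swap_size * swap_size <= 128 * nmemb roughly holds, which keeps the buffer on the order of sqrt(nmemb).

```c
#include <assert.h>
#include <stddef.h>

// Largest size in the sequence 2^19, 2^17, 2^15, ... for which
// nmemb / size >= size / 128 (integer division), i.e. roughly sqrt(128 * nmemb).
static size_t blit_sqrt_swap_size(size_t nmemb)
{
    size_t swap_size = (size_t) 1 << 19;

    while (nmemb / swap_size < swap_size / 128)
    {
        swap_size /= 4;
    }
    return swap_size;
}

// Hypothetical self-check with hand-computed values.
static void blit_sqrt_swap_size_selfcheck(void)
{
    assert(blit_sqrt_swap_size(133) == 128);
    assert(blit_sqrt_swap_size(100000) == 2048);
    assert(blit_sqrt_swap_size(1000000) == 8192);
}
```

For one million elements this yields 8192 slots, compared with the fixed 512 used when BLIT_AUX keeps its default value.
*/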
void FUNC(blitsort_swap)(void *array, void *swap, size_t swap_size, size_t nmemb, CMPFUNC *cmp)
{
if (nmemb <= 132)
{
FUNC(quadsort_swap)(array, swap, swap_size, nmemb, cmp);
}
else
{
VAR *pta = (VAR *) array;
VAR *pts = (VAR *) swap;
FUNC(blit_analyze)(pta, pts, swap_size, nmemb, cmp);
}
}
#undef BLIT_AUX
#undef BLIT_OUT
================================================
FILE: src/blitsort.h
================================================
// blitsort 1.2.1.3 - Igor van den Hoven ivdhoven@gmail.com
#ifndef BLITSORT_H
#define BLITSORT_H
#include <stdlib.h>
#include <stdio.h>
#include <assert.h>
#include <errno.h>
#include <stdalign.h>
#include <float.h>
#include <string.h>
typedef int CMPFUNC (const void *a, const void *b);
//#define cmp(a,b) (*(a) > *(b))
#ifndef QUADSORT_H
#include "quadsort.h"
#endif
// When sorting an array of pointers, like a string array, the QUAD_CACHE needs
// to be set for proper performance when sorting large arrays.
// quadsort_prim() can be used to sort arrays of 32 and 64 bit integers
// without a comparison function or cache restrictions.
// With a 6 MB L3 cache a value of 262144 works well.
#ifdef cmp
#define QUAD_CACHE 4294967295
#else
//#define QUAD_CACHE 131072
#define QUAD_CACHE 262144
//#define QUAD_CACHE 524288
//#define QUAD_CACHE 4294967295
#endif
//////////////////////////////////////////////////////////
// ┌───────────────────────────────────────────────────┐//
// │ ██████┐ ██████┐ ██████┐ ██████┐████████┐ │//
// │ └────██┐└────██┐ ██┌──██┐└─██┌─┘└──██┌──┘ │//
// │ █████┌┘ █████┌┘ ██████┌┘ ██│ ██│ │//
// │ └───██┐██┌───┘ ██┌──██┐ ██│ ██│ │//
// │ ██████┌┘███████┐ ██████┌┘██████┐ ██│ │//
// │ └─────┘ └──────┘ └─────┘ └─────┘ └─┘ │//
// └───────────────────────────────────────────────────┘//
//////////////////////////////////////////////////////////
#define VAR int
#define FUNC(NAME) NAME##32
#include "blitsort.c"
#undef VAR
#undef FUNC
// blitsort_prim
#define VAR int
#define FUNC(NAME) NAME##_int32
#ifndef cmp
#define cmp(a,b) (*(a) > *(b))
#include "blitsort.c"
#undef cmp
#else
#include "blitsort.c"
#endif
#undef VAR
#undef FUNC
#define VAR unsigned int
#define FUNC(NAME) NAME##_uint32
#ifndef cmp
#define cmp(a,b) (*(a) > *(b))
#include "blitsort.c"
#undef cmp
#else
#include "blitsort.c"
#endif
#undef VAR
#undef FUNC
//////////////////////////////////////////////////////////
// ┌───────────────────────────────────────────────────┐//
// │ █████┐ ██┐ ██┐ ██████┐ ██████┐████████┐ │//
// │ ██┌───┘ ██│ ██│ ██┌──██┐└─██┌─┘└──██┌──┘ │//
// │ ██████┐ ███████│ ██████┌┘ ██│ ██│ │//
// │ ██┌──██┐└────██│ ██┌──██┐ ██│ ██│ │//
// │ └█████┌┘ ██│ ██████┌┘██████┐ ██│ │//
// │ └────┘ └─┘ └─────┘ └─────┘ └─┘ │//
// └───────────────────────────────────────────────────┘//
//////////////////////////////////////////////////////////
#define VAR long long
#define FUNC(NAME) NAME##64
#include "blitsort.c"
#undef VAR
#undef FUNC
// blitsort_prim
#define VAR long long
#define FUNC(NAME) NAME##_int64
#ifndef cmp
#define cmp(a,b) (*(a) > *(b))
#include "blitsort.c"
#undef cmp
#else
#include "blitsort.c"
#endif
#undef VAR
#undef FUNC
#define VAR unsigned long long
#define FUNC(NAME) NAME##_uint64
#ifndef cmp
#define cmp(a,b) (*(a) > *(b))
#include "blitsort.c"
#undef cmp
#else
#include "blitsort.c"
#endif
#undef VAR
#undef FUNC
// This section is outside of 32/64 bit pointer territory, so no cache checks
// necessary, unless sorting 32+ byte structures.
#undef QUAD_CACHE
#define QUAD_CACHE 4294967295
//////////////////////////////////////////////////////////
//┌────────────────────────────────────────────────────┐//
//│ █████┐ ██████┐ ██████┐████████┐ │//
//│ ██┌──██┐ ██┌──██┐└─██┌─┘└──██┌──┘ │//
//│ └█████┌┘ ██████┌┘ ██│ ██│ │//
//│ ██┌──██┐ ██┌──██┐ ██│ ██│ │//
//│ └█████┌┘ ██████┌┘██████┐ ██│ │//
//│ └────┘ └─────┘ └─────┘ └─┘ │//
//└────────────────────────────────────────────────────┘//
//////////////////////////////////////////////////////////
#define VAR char
#define FUNC(NAME) NAME##8
#include "blitsort.c"
#undef VAR
#undef FUNC
//////////////////////////////////////////////////////////
//┌────────────────────────────────────────────────────┐//
//│ ▄██┐ █████┐ ██████┐ ██████┐████████┐│//
//│ ████│ ██┌───┘ ██┌──██┐└─██┌─┘└──██┌──┘│//
//│ └─██│ ██████┐ ██████┌┘ ██│ ██│ │//
//│ ██│ ██┌──██┐ ██┌──██┐ ██│ ██│ │//
//│ ██████┐└█████┌┘ ██████┌┘██████┐ ██│ │//
//│ └─────┘ └────┘ └─────┘ └─────┘ └─┘ │//
//└────────────────────────────────────────────────────┘//
//////////////////////////////////////////////////////////
#define VAR short
#define FUNC(NAME) NAME##16
#include "blitsort.c"
#undef VAR
#undef FUNC
//////////////////////////////////////////////////////////
//┌────────────────────────────────────────────────────┐//
//│ ▄██┐ ██████┐ █████┐ ██████┐ ██████┐████████┐ │//
//│ ████│ └────██┐██┌──██┐ ██┌──██┐└─██┌─┘└──██┌──┘ │//
//│ └─██│ █████┌┘└█████┌┘ ██████┌┘ ██│ ██│ │//
//│ ██│ ██┌───┘ ██┌──██┐ ██┌──██┐ ██│ ██│ │//
//│ ██████┐███████┐└█████┌┘ ██████┌┘██████┐ ██│ │//
//│ └─────┘└──────┘ └────┘ └─────┘ └─────┘ └─┘ │//
//└────────────────────────────────────────────────────┘//
//////////////////////////////////////////////////////////
// 128 reflects the name, though the actual size is 80, 96, or 128 bits,
// depending on platform.
#if (DBL_MANT_DIG < LDBL_MANT_DIG)
#define VAR long double
#define FUNC(NAME) NAME##128
#include "blitsort.c"
#undef VAR
#undef FUNC
#endif
///////////////////////////////////////////////////////////
//┌─────────────────────────────────────────────────────┐//
//│ ██████┐██┐ ██┐███████┐████████┐ ██████┐ ███┐ ███┐│//
//│██┌────┘██│ ██│██┌────┘└──██┌──┘██┌───██┐████┐████││//
//│██│ ██│ ██│███████┐ ██│ ██│ ██│██┌███┌██││//
//│██│ ██│ ██│└────██│ ██│ ██│ ██│██│└█┌┘██││//
//│└██████┐└██████┌┘███████│ ██│ └██████┌┘██│ └┘ ██││//
//│ └─────┘ └─────┘ └──────┘ └─┘ └─────┘ └─┘ └─┘│//
//└─────────────────────────────────────────────────────┘//
///////////////////////////////////////////////////////////
/*
typedef struct {char bytes[32];} struct256;
#define VAR struct256
#define FUNC(NAME) NAME##256
#include "blitsort.c"
#undef VAR
#undef FUNC
*/
/////////////////////////////////////////////////////////////////////////////
//┌────────────────────────────────────────────────────────────────────────┐//
//│ ██████┐ ██┐ ██████┐████████┐███████┐ ██████┐ ██████┐ ████████┐ │//
//│ ██┌──██┐██│ └─██┌─┘└──██┌──┘██┌────┘██┌───██┐██┌──██┐└──██┌──┘ │//
//│ ██████┌┘██│ ██│ ██│ ███████┐██│ ██│██████┌┘ ██│ │//
//│ ██┌──██┐██│ ██│ ██│ └────██│██│ ██│██┌──██┐ ██│ │//
//│ ██████┌┘███████┐██████┐ ██│ ███████│└██████┌┘██│ ██│ ██│ │//
//│ └─────┘ └──────┘└─────┘ └─┘ └──────┘ └─────┘ └─┘ └─┘ └─┘ │//
//└────────────────────────────────────────────────────────────────────────┘//
/////////////////////////////////////////////////////////////////////////////
void blitsort(void *array, size_t nmemb, size_t size, CMPFUNC *cmp)
{
if (nmemb < 2)
{
return;
}
switch (size)
{
case sizeof(char):
blitsort8(array, nmemb, cmp);
return;
case sizeof(short):
blitsort16(array, nmemb, cmp);
return;
case sizeof(int):
blitsort32(array, nmemb, cmp);
return;
case sizeof(long long):
blitsort64(array, nmemb, cmp);
return;
#if (DBL_MANT_DIG < LDBL_MANT_DIG)
case sizeof(long double):
blitsort128(array, nmemb, cmp);
return;
#endif
// case sizeof(struct256):
// blitsort256(array, nmemb, cmp);
// return;
default:
#if (DBL_MANT_DIG < LDBL_MANT_DIG)
assert(size == sizeof(char) || size == sizeof(short) || size == sizeof(int) || size == sizeof(long long) || size == sizeof(long double));
#else
assert(size == sizeof(char) || size == sizeof(short) || size == sizeof(int) || size == sizeof(long long));
#endif
// qsort(array, nmemb, size, cmp);
}
}
// suggested size values for primitives:
// case 0: unsigned char
// case 1: signed char
// case 2: signed short
// case 3: unsigned short
// case 4: signed int
// case 5: unsigned int
// case 6: float
// case 7: double
// case 8: signed long long
// case 9: unsigned long long
// case ?: long double, use sizeof(long double):
void blitsort_prim(void *array, size_t nmemb, size_t size)
{
if (nmemb < 2)
{
return;
}
switch (size)
{
case 4:
blitsort_int32(array, nmemb, NULL);
return;
case 5:
blitsort_uint32(array, nmemb, NULL);
return;
case 8:
blitsort_int64(array, nmemb, NULL);
return;
case 9:
blitsort_uint64(array, nmemb, NULL);
return;
default:
assert(size == sizeof(int) || size == sizeof(int) + 1 || size == sizeof(long long) || size == sizeof(long long) + 1);
return;
}
}
#undef QUAD_CACHE
#endif
================================================
FILE: src/crumsort.c
================================================
// crumsort 1.2.1.3 - Igor van den Hoven ivdhoven@gmail.com
#define CRUM_AUX 512
#define CRUM_OUT 96
void FUNC(fulcrum_partition)(VAR *array, VAR *swap, VAR *max, size_t swap_size, size_t nmemb, CMPFUNC *cmp);
void FUNC(crum_analyze)(VAR *array, VAR *swap, size_t swap_size, size_t nmemb, CMPFUNC *cmp)
{
unsigned char loop, asum, bsum, csum, dsum;
unsigned int astreaks, bstreaks, cstreaks, dstreaks;
size_t quad1, quad2, quad3, quad4, half1, half2;
size_t cnt, abalance, bbalance, cbalance, dbalance;
VAR *pta, *ptb, *ptc, *ptd;
half1 = nmemb / 2;
quad1 = half1 / 2;
quad2 = half1 - quad1;
half2 = nmemb - half1;
quad3 = half2 / 2;
quad4 = half2 - quad3;
pta = array;
ptb = array + quad1;
ptc = array + half1;
ptd = array + half1 + quad3;
astreaks = bstreaks = cstreaks = dstreaks = 0;
abalance = bbalance = cbalance = dbalance = 0;
for (cnt = nmemb ; cnt > 132 ; cnt -= 128)
{
for (asum = bsum = csum = dsum = 0, loop = 32 ; loop ; loop--)
{
asum += cmp(pta, pta + 1) > 0; pta++;
bsum += cmp(ptb, ptb + 1) > 0; ptb++;
csum += cmp(ptc, ptc + 1) > 0; ptc++;
dsum += cmp(ptd, ptd + 1) > 0; ptd++;
}
abalance += asum; astreaks += asum = (asum == 0) | (asum == 32);
bbalance += bsum; bstreaks += bsum = (bsum == 0) | (bsum == 32);
cbalance += csum; cstreaks += csum = (csum == 0) | (csum == 32);
dbalance += dsum; dstreaks += dsum = (dsum == 0) | (dsum == 32);
if (cnt > 516 && asum + bsum + csum + dsum == 0)
{
abalance += 48; pta += 96;
bbalance += 48; ptb += 96;
cbalance += 48; ptc += 96;
dbalance += 48; ptd += 96;
cnt -= 384;
}
}
for ( ; cnt > 7 ; cnt -= 4)
{
abalance += cmp(pta, pta + 1) > 0; pta++;
bbalance += cmp(ptb, ptb + 1) > 0; ptb++;
cbalance += cmp(ptc, ptc + 1) > 0; ptc++;
dbalance += cmp(ptd, ptd + 1) > 0; ptd++;
}
if (quad1 < quad2) {bbalance += cmp(ptb, ptb + 1) > 0; ptb++;}
if (quad1 < quad3) {cbalance += cmp(ptc, ptc + 1) > 0; ptc++;}
if (quad1 < quad4) {dbalance += cmp(ptd, ptd + 1) > 0; ptd++;}
cnt = abalance + bbalance + cbalance + dbalance;
if (cnt == 0)
{
if (cmp(pta, pta + 1) <= 0 && cmp(ptb, ptb + 1) <= 0 && cmp(ptc, ptc + 1) <= 0)
{
return;
}
}
asum = quad1 - abalance == 1;
bsum = quad2 - bbalance == 1;
csum = quad3 - cbalance == 1;
dsum = quad4 - dbalance == 1;
if (asum | bsum | csum | dsum)
{
unsigned char span1 = (asum && bsum) * (cmp(pta, pta + 1) > 0);
unsigned char span2 = (bsum && csum) * (cmp(ptb, ptb + 1) > 0);
unsigned char span3 = (csum && dsum) * (cmp(ptc, ptc + 1) > 0);
switch (span1 | span2 * 2 | span3 * 4)
{
case 0: break;
case 1: FUNC(quad_reversal)(array, ptb); abalance = bbalance = 0; break;
case 2: FUNC(quad_reversal)(pta + 1, ptc); bbalance = cbalance = 0; break;
case 3: FUNC(quad_reversal)(array, ptc); abalance = bbalance = cbalance = 0; break;
case 4: FUNC(quad_reversal)(ptb + 1, ptd); cbalance = dbalance = 0; break;
case 5: FUNC(quad_reversal)(array, ptb);
FUNC(quad_reversal)(ptb + 1, ptd); abalance = bbalance = cbalance = dbalance = 0; break;
case 6: FUNC(quad_reversal)(pta + 1, ptd); bbalance = cbalance = dbalance = 0; break;
case 7: FUNC(quad_reversal)(array, ptd); return;
}
if (asum && abalance) {FUNC(quad_reversal)(array, pta); abalance = 0;}
if (bsum && bbalance) {FUNC(quad_reversal)(pta + 1, ptb); bbalance = 0;}
if (csum && cbalance) {FUNC(quad_reversal)(ptb + 1, ptc); cbalance = 0;}
if (dsum && dbalance) {FUNC(quad_reversal)(ptc + 1, ptd); dbalance = 0;}
}
#ifdef cmp
cnt = nmemb / 256; // switch to quadsort if at least 50% ordered
#else
cnt = nmemb / 512; // switch to quadsort if at least 25% ordered
#endif
asum = astreaks > cnt;
bsum = bstreaks > cnt;
csum = cstreaks > cnt;
dsum = dstreaks > cnt;
#ifndef cmp
if (quad1 > QUAD_CACHE)
{
asum = bsum = csum = dsum = 1;
}
#endif
switch (asum + bsum * 2 + csum * 4 + dsum * 8)
{
case 0:
FUNC(fulcrum_partition)(array, swap, NULL, swap_size, nmemb, cmp);
return;
case 1:
if (abalance) FUNC(quadsort_swap)(array, swap, swap_size, quad1, cmp);
FUNC(fulcrum_partition)(pta + 1, swap, NULL, swap_size, quad2 + half2, cmp);
break;
case 2:
FUNC(fulcrum_partition)(array, swap, NULL, swap_size, quad1, cmp);
if (bbalance) FUNC(quadsort_swap)(pta + 1, swap, swap_size, quad2, cmp);
FUNC(fulcrum_partition)(ptb + 1, swap, NULL, swap_size, half2, cmp);
break;
case 3:
if (abalance) FUNC(quadsort_swap)(array, swap, swap_size, quad1, cmp);
if (bbalance) FUNC(quadsort_swap)(pta + 1, swap, swap_size, quad2, cmp);
FUNC(fulcrum_partition)(ptb + 1, swap, NULL, swap_size, half2, cmp);
break;
case 4:
FUNC(fulcrum_partition)(array, swap, NULL, swap_size, half1, cmp);
if (cbalance) FUNC(quadsort_swap)(ptb + 1, swap, swap_size, quad3, cmp);
FUNC(fulcrum_partition)(ptc + 1, swap, NULL, swap_size, quad4, cmp);
break;
case 8:
FUNC(fulcrum_partition)(array, swap, NULL, swap_size, half1 + quad3, cmp);
if (dbalance) FUNC(quadsort_swap)(ptc + 1, swap, swap_size, quad4, cmp);
break;
case 9:
if (abalance) FUNC(quadsort_swap)(array, swap, swap_size, quad1, cmp);
FUNC(fulcrum_partition)(pta + 1, swap, NULL, swap_size, quad2 + quad3, cmp);
if (dbalance) FUNC(quadsort_swap)(ptc + 1, swap, swap_size, quad4, cmp);
break;
case 12:
FUNC(fulcrum_partition)(array, swap, NULL, swap_size, half1, cmp);
if (cbalance) FUNC(quadsort_swap)(ptb + 1, swap, swap_size, quad3, cmp);
if (dbalance) FUNC(quadsort_swap)(ptc + 1, swap, swap_size, quad4, cmp);
break;
case 5:
case 6:
case 7:
case 10:
case 11:
case 13:
case 14:
case 15:
if (asum)
{
if (abalance) FUNC(quadsort_swap)(array, swap, swap_size, quad1, cmp);
}
else FUNC(fulcrum_partition)(array, swap, NULL, swap_size, quad1, cmp);
if (bsum)
{
if (bbalance) FUNC(quadsort_swap)(pta + 1, swap, swap_size, quad2, cmp);
}
else FUNC(fulcrum_partition)(pta + 1, swap, NULL, swap_size, quad2, cmp);
if (csum)
{
if (cbalance) FUNC(quadsort_swap)(ptb + 1, swap, swap_size, quad3, cmp);
}
else FUNC(fulcrum_partition)(ptb + 1, swap, NULL, swap_size, quad3, cmp);
if (dsum)
{
if (dbalance) FUNC(quadsort_swap)(ptc + 1, swap, swap_size, quad4, cmp);
}
else FUNC(fulcrum_partition)(ptc + 1, swap, NULL, swap_size, quad4, cmp);
break;
}
if (cmp(pta, pta + 1) <= 0)
{
if (cmp(ptc, ptc + 1) <= 0)
{
if (cmp(ptb, ptb + 1) <= 0)
{
return;
}
}
else
{
FUNC(rotate_merge_block)(array + half1, swap, swap_size, quad3, quad4, cmp);
}
}
else
{
FUNC(rotate_merge_block)(array, swap, swap_size, quad1, quad2, cmp);
if (cmp(ptc, ptc + 1) > 0)
{
FUNC(rotate_merge_block)(array + half1, swap, swap_size, quad3, quad4, cmp);
}
}
FUNC(rotate_merge_block)(array, swap, swap_size, half1, half2, cmp);
}
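/*
The streak counters in crum_analyze classify each run of 32 adjacent
comparisons as ordered when either none or all of them are inversions. A
standalone sketch of that classification for a single window of int data
(window_is_streak is a made-up name for this illustration, not a library
function):

```c
#include <stddef.h>

// Examine the 32 adjacent pairs starting at pta, which needs 33 readable
// elements. Returns 1 when the window is entirely non-descending (0
// inversions) or entirely descending (32 inversions), otherwise 0. The
// inversion count is stored in *balance when balance is not NULL.
static int window_is_streak(const int *pta, size_t *balance)
{
    unsigned char sum = 0;
    int loop;

    for (loop = 32 ; loop ; loop--)
    {
        sum += pta[0] > pta[1];
        pta++;
    }
    if (balance)
    {
        *balance = sum;
    }
    return (sum == 0) | (sum == 32);
}
```

crum_analyze runs four such windows in lockstep, one per quadrant, and
compares the number of streaks per quadrant against a threshold to choose
between quadsort and fulcrum_partition.
*/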
// The next 4 functions are used for pivot selection
VAR *FUNC(crum_binary_median)(VAR *pta, VAR *ptb, size_t len, CMPFUNC *cmp)
{
while (len /= 2)
{
if (cmp(pta + len, ptb + len) <= 0) pta += len; else ptb += len;
}
return cmp(pta, ptb) > 0 ? pta : ptb;
}
VAR *FUNC(crum_median_of_cbrt)(VAR *array, VAR *swap, size_t swap_size, size_t nmemb, int *generic, CMPFUNC *cmp)
{
VAR *pta, *piv;
size_t cnt, cbrt, div;
for (cbrt = 32 ; nmemb > cbrt * cbrt * cbrt && cbrt < swap_size ; cbrt *= 2) {}
div = nmemb / cbrt;
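// offset the first sample by a value derived from the stack address of div, so the sample positions are not fixed
pta = array + nmemb - 1 - (size_t) &div / 64 % div;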
piv = array + cbrt;
for (cnt = cbrt ; cnt ; cnt--)
{
swap[0] = *--piv; *piv = *pta; *pta = swap[0];
pta -= div;
}
cbrt /= 2;
FUNC(quadsort_swap)(piv, swap, swap_size, cbrt, cmp);
FUNC(quadsort_swap)(piv + cbrt, swap, swap_size, cbrt, cmp);
*generic = (cmp(piv + cbrt * 2 - 1, piv) <= 0) & (cmp(piv + cbrt - 1, piv) <= 0);
return FUNC(crum_binary_median)(piv, piv + cbrt, cbrt, cmp);
}
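/*
The first loop in crum_median_of_cbrt picks the sample count: it starts at 32
and doubles until the cube covers nmemb or the count reaches swap_size. The
sketch below isolates that loop; cbrt_sample_size is a hypothetical helper
name used only here.

```c
#include <stddef.h>

// Return the number of samples crum_median_of_cbrt would draw for an array
// of nmemb elements when swap_size elements of auxiliary memory are available.
static size_t cbrt_sample_size(size_t nmemb, size_t swap_size)
{
    size_t cbrt;

    for (cbrt = 32 ; nmemb > cbrt * cbrt * cbrt && cbrt < swap_size ; cbrt *= 2) {}

    return cbrt;
}
```

With swap_size set to 512 (the CRUM_AUX value), 100000 elements give 64
samples, and very large arrays are capped at 512.
*/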
size_t FUNC(crum_median_of_three)(VAR *array, size_t v0, size_t v1, size_t v2, CMPFUNC *cmp)
{
size_t v[3] = {v0, v1, v2};
char x, y, z;
x = cmp(array + v0, array + v1) > 0;
y = cmp(array + v0, array + v2) > 0;
z = cmp(array + v1, array + v2) > 0;
return v[(x == y) + (y ^ z)];
}
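/*
The index expression in crum_median_of_three can be checked in isolation. The
sketch below is a standalone copy specialised to int, with a plain '>' in
place of cmp; the name median_of_three_index is made up for this illustration
and is not part of the library.

```c
#include <stddef.h>

// Return whichever of the indices v0, v1, v2 holds the median value.
// x, y and z are the three pairwise "greater than" results.
static size_t median_of_three_index(const int *array, size_t v0, size_t v1, size_t v2)
{
    size_t v[3] = {v0, v1, v2};
    int x = array[v0] > array[v1];
    int y = array[v0] > array[v2];
    int z = array[v1] > array[v2];

    // x != y             : v0 lies between the other two -> index 0
    // x == y and y == z  : v1 is in the middle           -> index 1
    // x == y and y != z  : v2 is in the middle           -> index 2
    return v[(x == y) + (y ^ z)];
}
```

For example, with int a[] = {7, 3, 5} the call median_of_three_index(a, 0, 1, 2)
selects index 2, the position of the value 5.
*/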
VAR *FUNC(crum_median_of_nine)(VAR *array, size_t nmemb, CMPFUNC *cmp)
{
size_t x, y, z, div = nmemb / 16;
x = FUNC(crum_median_of_three)(array, div * 2, div * 1, div * 4, cmp);
y = FUNC(crum_median_of_three)(array, div * 8, div * 6, div * 10, cmp);
z = FUNC(crum_median_of_three)(array, div * 14, div * 12, div * 15, cmp);
return array + FUNC(crum_median_of_three)(array, x, y, z, cmp);
}
size_t FUNC(fulcrum_default_partition)(VAR *array, VAR *swap, VAR *ptx, VAR *piv, size_t swap_size, size_t nmemb, CMPFUNC *cmp)
{
size_t i, cnt, val, m = 0;
VAR *ptl, *ptr, *pta, *tpa;
memcpy(swap, array, 32 * sizeof(VAR));
memcpy(swap + 32, array + nmemb - 32, 32 * sizeof(VAR));
ptl = array;
ptr = array + nmemb - 1;
pta = array + 32;
tpa = array + nmemb - 33;
cnt = nmemb / 16 - 4;
while (1)
{
if (pta - ptl - m <= 48)
{
if (cnt-- == 0) break;
for (i = 16 ; i ; i--)
{
val = cmp(pta, piv) <= 0; ptl[m] = ptr[m] = *pta++; m += val; ptr--;
}
}
if (pta - ptl - m >= 16)
{
if (cnt-- == 0) break;
for (i = 16 ; i ; i--)
{
val = cmp(tpa, piv) <= 0; ptl[m] = ptr[m] = *tpa--; m += val; ptr--;
}
}
}
if (pta - ptl - m <= 48)
{
for (cnt = nmemb % 16 ; cnt ; cnt--)
{
val = cmp(pta, piv) <= 0; ptl[m] = ptr[m] = *pta++; m += val; ptr--;
}
}
else
{
for (cnt = nmemb % 16 ; cnt ; cnt--)
{
val = cmp(tpa, piv) <= 0; ptl[m] = ptr[m] = *tpa--; m += val; ptr--;
}
}
pta = swap;
for (cnt = 16 ; cnt ; cnt--)
{
val = cmp(pta, piv) <= 0; ptl[m] = ptr[m] = *pta++; m += val; ptr--;
val = cmp(pta, piv) <= 0; ptl[m] = ptr[m] = *pta++; m += val; ptr--;
val = cmp(pta, piv) <= 0; ptl[m] = ptr[m] = *pta++; m += val; ptr--;
val = cmp(pta, piv) <= 0; ptl[m] = ptr[m] = *pta++; m += val; ptr--;
}
return m;
}
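/*
The loop body used by the partition functions writes every element to two
places and lets the comparison result decide which copy survives. A minimal
out-of-place sketch of that idea, assuming int data and a separate
destination buffer (the function name and the src/dst split are illustrative
only; the functions in this file instead work in place, using the 64 elements
saved in swap to open up room):

```c
#include <stddef.h>

// Partition src[0..nmemb) into dst[0..nmemb) around pivot without branching
// on the comparison. Returns m, the number of elements <= pivot; these end up
// in dst[0..m) and the larger elements in dst[m..nmemb).
static size_t dual_write_partition(const int *src, int *dst, size_t nmemb, int pivot)
{
    size_t i, m = 0;

    for (i = 0 ; i < nmemb ; i++)
    {
        size_t val = src[i] <= pivot;

        // the left slot is m; the right slot counts down from the end by
        // the number of larger elements seen so far (i - m)
        dst[m] = src[i];
        dst[nmemb - 1 - i + m] = src[i];

        // advancing m keeps the left copy, otherwise the right copy stays
        m += val;
    }
    return m;
}
```

Elements greater than the pivot land in reverse order of arrival, so this
sketch is not stable.
*/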
// As per suggestion by Marshall Lochbaum to improve generic data handling by mimicking dual-pivot quicksort
size_t FUNC(fulcrum_reverse_partition)(VAR *array, VAR *swap, VAR *ptx, VAR *piv, size_t swap_size, size_t nmemb, CMPFUNC *cmp)
{
size_t i, cnt, val, m = 0;
VAR *ptl, *ptr, *pta, *tpa;
memcpy(swap, array, 32 * sizeof(VAR));
memcpy(swap + 32, array + nmemb - 32, 32 * sizeof(VAR));
ptl = array;
ptr = array + nmemb - 1;
pta = array + 32;
tpa = array + nmemb - 33;
cnt = nmemb / 16 - 4;
while (1)
{
if (pta - ptl - m <= 48)
{
if (cnt-- == 0) break;
for (i = 16 ; i ; i--)
{
val = cmp(piv, pta) > 0; ptl[m] = ptr[m] = *pta++; m += val; ptr--;
}
}
if (pta - ptl - m >= 16)
{
if (cnt-- == 0) break;
for (i = 16 ; i ; i--)
{
val = cmp(piv, tpa) > 0; ptl[m] = ptr[m] = *tpa--; m += val; ptr--;
}
}
}
if (pta - ptl - m <= 48)
{
for (cnt = nmemb % 16 ; cnt ; cnt--)
{
val = cmp(piv, pta) > 0; ptl[m] = ptr[m] = *pta++; m += val; ptr--;
}
}
else
{
for (cnt = nmemb % 16 ; cnt ; cnt--)
{
val = cmp(piv, tpa) > 0; ptl[m] = ptr[m] = *tpa--; m += val; ptr--;
}
}
pta = swap;
for (cnt = 16 ; cnt ; cnt--)
{
val = cmp(piv, pta) > 0; ptl[m] = ptr[m] = *pta++; m += val; ptr--;
val = cmp(piv, pta) > 0; ptl[m] = ptr[m] = *pta++; m += val; ptr--;
val = cmp(piv, pta) > 0; ptl[m] = ptr[m] = *pta++; m += val; ptr--;
val = cmp(piv, pta) > 0; ptl[m] = ptr[m] = *pta++; m += val; ptr--;
}
return m;
}
void FUNC(fulcrum_partition)(VAR *array, VAR *swap, VAR *max, size_t swap_size, size_t nmemb, CMPFUNC *cmp)
{
size_t a_size, s_size;
VAR *ptp, piv;
int generic = 0;
while (1)
{
if (nmemb <= 2048)
{
ptp = FUNC(crum_median_of_nine)(array, nmemb, cmp);
}
else
{
ptp = FUNC(crum_median_of_cbrt)(array, swap, swap_size, nmemb, &generic, cmp);
if (generic) break;
}
piv = *ptp;
if (max && cmp(max, &piv) <= 0)
{
a_size = FUNC(fulcrum_reverse_partition)(array, swap, array, &piv, swap_size, nmemb, cmp);
s_size = nmemb - a_size;
nmemb = a_size;
if (s_size <= a_size / 32 || a_size <= CRUM_OUT) break;
max = NULL;
continue;
}
*ptp = array[--nmemb];
a_size = FUNC(fulcrum_default_partition)(array, swap, array, &piv, swap_size, nmemb, cmp);
s_size = nmemb - a_size;
ptp = array + a_size; array[nmemb] = *ptp; *ptp = piv;
if (a_size <= s_size / 32 || s_size <= CRUM_OUT)
{
FUNC(quadsort_swap)(ptp + 1, swap, swap_size, s_size, cmp);
}
else
{
FUNC(fulcrum_partition)(ptp + 1, swap, max, swap_size, s_size, cmp);
}
nmemb = a_size;
if (s_size <= a_size / 32 || a_size <= CRUM_OUT)
{
if (a_size <= CRUM_OUT) break;
a_size = FUNC(fulcrum_reverse_partition)(array, swap, array, &piv, swap_size, nmemb, cmp);
s_size = nmemb - a_size;
nmemb = a_size;
if (s_size <= a_size / 32 || a_size <= CRUM_OUT) break;
max = NULL;
continue;
}
max = ptp;
}
FUNC(quadsort_swap)(array, swap, swap_size, nmemb, cmp);
}
void FUNC(crumsort)(void *array, size_t nmemb, CMPFUNC *cmp)
{
if (nmemb <= 256)
{
VAR swap[nmemb];
FUNC(quadsort_swap)(array, swap, nmemb, nmemb, cmp);
return;
}
VAR *pta = (VAR *) array;
#if CRUM_AUX
size_t swap_size = CRUM_AUX;
#else
size_t swap_size = 128;
while (swap_size * swap_size <= nmemb)
{
swap_size *= 4;
}
#endif
VAR swap[swap_size];
FUNC(crum_analyze)(pta, swap, swap_size, nmemb, cmp);
}
void FUNC(crumsort_swap)(void *array, void *swap, size_t swap_size, size_t nmemb, CMPFUNC *cmp)
{
if (nmemb <= 256)
{
FUNC(quadsort_swap)(array, swap, swap_size, nmemb, cmp);
}
else
{
VAR *pta = (VAR *) array;
VAR *pts = (VAR *) swap;
FUNC(crum_analyze)(pta, pts, swap_size, nmemb, cmp);
}
}
================================================
FILE: src/crumsort.h
================================================
// crumsort 1.2.1.3 - Igor van den Hoven ivdhoven@gmail.com
#ifndef CRUMSORT_H
#define CRUMSORT_H
#include <stdlib.h>
#include <stdio.h>
#include <assert.h>
#include <errno.h>
#include <stdalign.h>
#include <float.h>
#include <string.h>
typedef int CMPFUNC (const void *a, const void *b);
//#define cmp(a,b) (*(a) > *(b))
#ifndef QUADSORT_H
#include "quadsort.h"
#endif
// When sorting an array of pointers, like a string array, the QUAD_CACHE needs
// to be set for proper performance when sorting large arrays.
// crumsort_prim() can be used to sort arrays of 32-bit and 64-bit integers
// without a comparison function or cache restrictions.
// With a 6 MB L3 cache a value of 262144 works well.
#ifdef cmp
#define QUAD_CACHE 4294967295
#else
//#define QUAD_CACHE 131072
#define QUAD_CACHE 262144
//#define QUAD_CACHE 524288
//#define QUAD_CACHE 4294967295
#endif
//////////////////////////////////////////////////////////
// ┌───────────────────────────────────────────────────┐//
// │ ██████┐ ██████┐ ██████┐ ██████┐████████┐ │//
// │ └────██┐└────██┐ ██┌──██┐└─██┌─┘└──██┌──┘ │//
// │ █████┌┘ █████┌┘ ██████┌┘ ██│ ██│ │//
// │ └───██┐██┌───┘ ██┌──██┐ ██│ ██│ │//
// │ ██████┌┘███████┐ ██████┌┘██████┐ ██│ │//
// │ └─────┘ └──────┘ └─────┘ └─────┘ └─┘ │//
// └───────────────────────────────────────────────────┘//
//////////////////////////////////////////////////////////
#define VAR int
#define FUNC(NAME) NAME##32
#include "crumsort.c"
#undef VAR
#undef FUNC
// crumsort_prim
#define VAR int
#define FUNC(NAME) NAME##_int32
#ifndef cmp
#define cmp(a,b) (*(a) > *(b))
#include "crumsort.c"
#undef cmp
#else
#include "crumsort.c"
#endif
#undef VAR
#undef FUNC
#define VAR unsigned int
#define FUNC(NAME) NAME##_uint32
#ifndef cmp
#define cmp(a,b) (*(a) > *(b))
#include "crumsort.c"
#undef cmp
#else
#include "crumsort.c"
#endif
#undef VAR
#undef FUNC
//////////////////////////////////////////////////////////
// ┌───────────────────────────────────────────────────┐//
// │ █████┐ ██┐ ██┐ ██████┐ ██████┐████████┐ │//
// │ ██┌───┘ ██│ ██│ ██┌──██┐└─██┌─┘└──██┌──┘ │//
// │ ██████┐ ███████│ ██████┌┘ ██│ ██│ │//
// │ ██┌──██┐└────██│ ██┌──██┐ ██│ ██│ │//
// │ └█████┌┘ ██│ ██████┌┘██████┐ ██│ │//
// │ └────┘ └─┘ └─────┘ └─────┘ └─┘ │//
// └───────────────────────────────────────────────────┘//
//////////////////////////////////////////////////////////
#define VAR long long
#define FUNC(NAME) NAME##64
#include "crumsort.c"
#undef VAR
#undef FUNC
// crumsort_prim
#define VAR long long
#define FUNC(NAME) NAME##_int64
#ifndef cmp
#define cmp(a,b) (*(a) > *(b))
#include "crumsort.c"
#undef cmp
#else
#include "crumsort.c"
#endif
#undef VAR
#undef FUNC
#define VAR unsigned long long
#define FUNC(NAME) NAME##_uint64
#ifndef cmp
#define cmp(a,b) (*(a) > *(b))
#include "crumsort.c"
#undef cmp
#else
#include "crumsort.c"
#endif
#undef VAR
#undef FUNC
// This section is outside of 32/64 bit pointer territory, so no cache checks
// necessary, unless sorting 32+ byte structures.
#undef QUAD_CACHE
#define QUAD_CACHE 4294967295
//////////////////////////////////////////////////////////
//┌────────────────────────────────────────────────────┐//
//│ █████┐ ██████┐ ██████┐████████┐ │//
//│ ██┌──██┐ ██┌──██┐└─██┌─┘└──██┌──┘ │//
//│ └█████┌┘ ██████┌┘ ██│ ██│ │//
//│ ██┌──██┐ ██┌──██┐ ██│ ██│ │//
//│ └█████┌┘ ██████┌┘██████┐ ██│ │//
//│ └────┘ └─────┘ └─────┘ └─┘ │//
//└────────────────────────────────────────────────────┘//
//////////////////////////////////////////////////////////
#define VAR char
#define FUNC(NAME) NAME##8
#include "crumsort.c"
#undef VAR
#undef FUNC
//////////////////////////////////////////////////////////
//┌────────────────────────────────────────────────────┐//
//│ ▄██┐ █████┐ ██████┐ ██████┐████████┐│//
//│ ████│ ██┌───┘ ██┌──██┐└─██┌─┘└──██┌──┘│//
//│ └─██│ ██████┐ ██████┌┘ ██│ ██│ │//
//│ ██│ ██┌──██┐ ██┌──██┐ ██│ ██│ │//
//│ ██████┐└█████┌┘ ██████┌┘██████┐ ██│ │//
//│ └─────┘ └────┘ └─────┘ └─────┘ └─┘ │//
//└────────────────────────────────────────────────────┘//
//////////////////////////////////////////////////////////
#define VAR short
#define FUNC(NAME) NAME##16
#include "crumsort.c"
#undef VAR
#undef FUNC
//////////////////////////////////////////////////////////
//┌────────────────────────────────────────────────────┐//
//│ ▄██┐ ██████┐ █████┐ ██████┐ ██████┐████████┐ │//
//│ ████│ └────██┐██┌──██┐ ██┌──██┐└─██┌─┘└──██┌──┘ │//
//│ └─██│ █████┌┘└█████┌┘ ██████┌┘ ██│ ██│ │//
//│ ██│ ██┌───┘ ██┌──██┐ ██┌──██┐ ██│ ██│ │//
//│ ██████┐███████┐└█████┌┘ ██████┌┘██████┐ ██│ │//
//│ └─────┘└──────┘ └────┘ └─────┘ └─────┘ └─┘ │//
//└────────────────────────────────────────────────────┘//
//////////////////////////////////////////////////////////
// 128 reflects the name, though the actual size of a long double is 64, 80,
// 96, or 128 bits, depending on platform.
#if (DBL_MANT_DIG < LDBL_MANT_DIG)
#define VAR long double
#define FUNC(NAME) NAME##128
#include "crumsort.c"
#undef VAR
#undef FUNC
#endif
///////////////////////////////////////////////////////////
//┌─────────────────────────────────────────────────────┐//
//│ ██████┐██┐ ██┐███████┐████████┐ ██████┐ ███┐ ███┐│//
//│██┌────┘██│ ██│██┌────┘└──██┌──┘██┌───██┐████┐████││//
//│██│ ██│ ██│███████┐ ██│ ██│ ██│██┌███┌██││//
//│██│ ██│ ██│└────██│ ██│ ██│ ██│██│└█┌┘██││//
//│└██████┐└██████┌┘███████│ ██│ └██████┌┘██│ └┘ ██││//
//│ └─────┘ └─────┘ └──────┘ └─┘ └─────┘ └─┘ └─┘│//
//└─────────────────────────────────────────────────────┘//
///////////////////////////////////////////////////////////
/*
typedef struct {char bytes[32];} struct256;
#define VAR struct256
#define FUNC(NAME) NAME##256
#include "crumsort.c"
#undef VAR
#undef FUNC
*/
//////////////////////////////////////////////////////////////////////////
//┌─────────────────────────────────────────────────────────────────────┐//
//│ ██████┐██████┐ ██┐ ██┐███┐ ███┐███████┐ ██████┐ ██████┐ ████████┐│//
//│██┌────┘██┌──██┐██│ ██│████┐████│██┌────┘██┌───██┐██┌──██┐└──██┌──┘│//
//│██│ ██████┌┘██│ ██│██┌███┌██│███████┐██│ ██│██████┌┘ ██│ │//
//│██│ ██┌──██┐██│ ██│██│└█┌┘██│└────██│██│ ██│██┌──██┐ ██│ │//
//│└██████┐██│ ██│└██████┌┘██│ └┘ ██│███████│└██████┌┘██│ ██│ ██│ │//
//│ └─────┘└─┘ └─┘ └─────┘ └─┘ └─┘└──────┘ └─────┘ └─┘ └─┘ └─┘ │//
//└─────────────────────────────────────────────────────────────────────┘//
//////////////////////////////////////////////////////////////////////////
void crumsort(void *array, size_t nmemb, size_t size, CMPFUNC *cmp)
{
if (nmemb < 2)
{
return;
}
switch (size)
{
case sizeof(char):
crumsort8(array, nmemb, cmp);
return;
case sizeof(short):
crumsort16(array, nmemb, cmp);
return;
case sizeof(int):
crumsort32(array, nmemb, cmp);
return;
case sizeof(long long):
crumsort64(array, nmemb, cmp);
return;
#if (DBL_MANT_DIG < LDBL_MANT_DIG)
case sizeof(long double):
crumsort128(array, nmemb, cmp);
return;
#endif
// case sizeof(struct256):
// crumsort256(array, nmemb, cmp);
// return;
default:
#if (DBL_MANT_DIG < LDBL_MANT_DIG)
assert(size == sizeof(char) || size == sizeof(short) || size == sizeof(int) || size == sizeof(long long) || size == sizeof(long double));
#else
assert(size == sizeof(char) || size == sizeof(short) || size == sizeof(int) || size == sizeof(long long));
#endif
// qsort(array, nmemb, size, cmp);
}
}
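/*
crumsort() takes the same four arguments as the C library qsort(), so a
qsort-style comparator can be passed unchanged. The sketch below uses qsort()
as a stand-in so that it compiles without this header; replacing the call
with crumsort() is the assumed usage. example_cmp_int and sort_ints are
hypothetical names. Note that the dispatcher above only accepts element sizes
matching char, short, int, long long and, where available, long double.

```c
#include <stdlib.h>

// qsort-style comparator for int; avoids the overflow of a plain subtraction
static int example_cmp_int(const void *a, const void *b)
{
    int x = *(const int *) a;
    int y = *(const int *) b;

    return (x > y) - (x < y);
}

// Sort nmemb ints in ascending order. qsort() stands in for crumsort() here.
static void sort_ints(int *array, size_t nmemb)
{
    qsort(array, nmemb, sizeof(int), example_cmp_int);
}
```
*/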
// suggested size values for primitives:
// case 0: unsigned char
// case 1: signed char
// case 2: signed short
// case 3: unsigned short
// case 4: signed int
// case 5: unsigned int
// case 6: float
// case 7: double
// case 8: signed long long
// case 9: unsigned long long
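// case ?: long double, use sizeof(long double)
// Only cases 4, 5, 8 and 9 are implemented by crumsort_prim(); any other value hits the assert in the default case.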
void crumsort_prim(void *array, size_t nmemb, size_t size)
{
if (nmemb < 2)
{
return;
}
switch (size)
{
case 4:
crumsort_int32(array, nmemb, NULL);
return;
case 5:
crumsort_uint32(array, nmemb, NULL);
return;
case 8:
crumsort_int64(array, nmemb, NULL);
return;
case 9:
crumsort_uint64(array, nmemb, NULL);
return;
default:
assert(size == sizeof(int) || size == sizeof(int) + 1 || size == sizeof(long long) || size == sizeof(long long) + 1);
return;
}
}
#undef QUAD_CACHE
#endif
================================================
FILE: src/extra_tests.c
================================================
#ifdef QUAD_DEBUG
// random % 4
for (cnt = 0 ; cnt < mem ; cnt++)
{
r_array[cnt] = rand() % 4;
}
run_test(a_array, r_array, v_array, max, max, samples, repetitions, 0, "random % 4", sizeof(VAR), cmp_int);
// semi random
for (cnt = 0 ; cnt < mem ; cnt++)
{
r_array[cnt] = rand() % 8 / 7 * rand();
}
run_test(a_array, r_array, v_array, max, max, samples, repetitions, 0, "semi random", sizeof(VAR), cmp_int);
// random signal
for (cnt = 0 ; cnt < mem ; cnt++)
{
if (cnt < mem / 2)
{
r_array[cnt] = cnt + rand() % 16;
}
else
{
r_array[cnt] = mem - cnt + rand() % 16;
}
}
run_test(a_array, r_array, v_array, max, max, samples, repetitions, 0, "random signal", sizeof(VAR), cmp_int);
// exponential
for (cnt = 0 ; cnt < mem ; cnt++)
{
r_array[cnt] = (size_t) (cnt * cnt) % 10000; //(1 << 30);
}
run_test(a_array, r_array, v_array, max, max, samples, repetitions, 0, "exponential", sizeof(VAR), cmp_int);
// chaos fragments -- sort the first 98% of each quarter, leaving the remainder random
for (cnt = 0 ; cnt < max ; cnt++)
{
r_array[cnt] = rand();
}
quadsort(r_array + quad0, quad1 / 100 * 98, sizeof(VAR), cmp_int);
quadsort(r_array + quad1, quad1 / 100 * 98, sizeof(VAR), cmp_int);
quadsort(r_array + half1, quad1 / 100 * 98, sizeof(VAR), cmp_int);
quadsort(r_array + span3, quad1 / 100 * 98, sizeof(VAR), cmp_int);
run_test(a_array, r_array, v_array, max, max, samples, repetitions, repetitions, "chaos fragments", sizeof(VAR), cmp_int);
// order fragments -- sort six short runs, each 2% of a quarter long; this tends to make timsort/powersort slower than fully random
for (cnt = 0 ; cnt < max ; cnt++)
{
r_array[cnt] = rand();
}
quadsort(r_array + quad0 / 1, quad1 * 2 / 100, sizeof(VAR), cmp_int);
quadsort(r_array + quad1 / 2, quad1 * 2 / 100, sizeof(VAR), cmp_int);
quadsort(r_array + quad1 / 1, quad1 * 2 / 100, sizeof(VAR), cmp_int);
quadsort(r_array + half1 / 1, quad1 * 2 / 100, sizeof(VAR), cmp_int);
quadsort(r_array + span3 / 2, quad1 * 2 / 100, sizeof(VAR), cmp_int);
quadsort(r_array + span3 / 1, quad1 * 2 / 100, sizeof(VAR), cmp_int);
run_test(a_array, r_array, v_array, max, max, samples, repetitions, repetitions, "order fragments", sizeof(VAR), cmp_int);
// Make array 95% generic
for (cnt = 0 ; cnt < max ; cnt++)
{
if (rand() % 20 == 0)
{
r_array[cnt] = rand();
}
else
{
r_array[cnt] = 1000000000;
}
}
run_test(a_array, r_array, v_array, max, max, samples, repetitions, repetitions, "95% generic", sizeof(VAR), cmp_int);
// Three saws
for (cnt = 0 ; cnt < max ; cnt++)
{
r_array[cnt] = rand();
}
quadsort(r_array, max / 3, sizeof(VAR), cmp_int);
quadsort(r_array + max / 3, max / 3, sizeof(VAR), cmp_int);
quadsort(r_array + max / 3 * 2, max / 3, sizeof(VAR), cmp_int);
run_test(a_array, r_array, v_array, max, max, samples, repetitions, repetitions, "three saws", sizeof(VAR), cmp_int);
// various combinations of reverse and ascending order data
/*
for (cnt = 0 ; cnt < max ; cnt++) r_array[cnt] = rand();
quadsort(r_array + quad0, half1, sizeof(VAR), cmp_int);
quadsort(r_array + half1, half2, sizeof(VAR), cmp_int);
run_test(a_array, r_array, v_array, max, max, samples, repetitions, repetitions, "aaaaa aaaaa", sizeof(VAR), cmp_int);
for (cnt = 0 ; cnt < max ; cnt++) r_array[cnt] = rand();
quadsort(r_array + quad1 / 2, nmemb - quad1 / 2, sizeof(VAR), cmp_int);
run_test(a_array, r_array, v_array, max, max, samples, repetitions, repetitions, "raaaaaaaaaa", sizeof(VAR), cmp_int);
size_t span2 = quad2 + quad3 + quad4;
for (cnt = 0 ; cnt < max ; cnt++) r_array[cnt] = rand();
quadsort(r_array + quad1, span2, sizeof(VAR), cmp_int);
run_test(a_array, r_array, v_array, max, max, samples, repetitions, repetitions, "rr aaaaaaaa", sizeof(VAR), cmp_int);
for (cnt = 0 ; cnt < max ; cnt++) r_array[cnt] = rand();
quadsort(r_array + quad0, quad1, sizeof(VAR), cmp_int);
quadsort(r_array + half1, half2, sizeof(VAR), cmp_int);
run_test(a_array, r_array, v_array, max, max, samples, repetitions, repetitions, "aa rr aaaaa", sizeof(VAR), cmp_int);
for (cnt = 0 ; cnt < max ; cnt++) r_array[cnt] = rand();
quadsort(r_array + quad0, half1, sizeof(VAR), cmp_int);
quadsort(r_array + span3, quad4, sizeof(VAR), cmp_int);
run_test(a_array, r_array, v_array, max, max, samples, repetitions, repetitions, "aaaaa rr aa", sizeof(VAR), cmp_int);
for (cnt = 0 ; cnt < max ; cnt++) r_array[cnt] = rand();
quadsort(r_array + quad0, nmemb, sizeof(VAR), cmp_int);
qsort(r_array + quad0, half1, sizeof(VAR), cmp_rev);
qsort(r_array + half1, half2, sizeof(VAR), cmp_rev);
run_test(a_array, r_array, v_array, max, max, samples, repetitions, repetitions, "rrrrr rrrrr", sizeof(VAR), cmp_int);
for (cnt = 0 ; cnt < max ; cnt++) r_array[cnt] = rand();
quadsort(r_array + quad0, nmemb, sizeof(VAR), cmp_int);
qsort(r_array + quad0, quad1, sizeof(VAR), cmp_rev);
qsort(r_array + quad1, quad2, sizeof(VAR), cmp_rev);
qsort(r_array + half1, quad3, sizeof(VAR), cmp_rev);
qsort(r_array + span3, quad4, sizeof(VAR), cmp_rev);
run_test(a_array, r_array, v_array, max, max, samples, repetitions, repetitions, "rr rr rr rr", sizeof(VAR), cmp_int);
*/
#endif
================================================
FILE: src/fluxsort.c
================================================
// fluxsort 1.2.1.3 - Igor van den Hoven ivdhoven@gmail.com
#define FLUX_OUT 96
void FUNC(flux_partition)(VAR *array, VAR *swap, VAR *ptx, VAR *ptp, size_t nmemb, CMPFUNC *cmp);
// Determine whether to use mergesort or quicksort
void FUNC(flux_analyze)(VAR *array, VAR *swap, size_t swap_size, size_t nmemb, CMPFUNC *cmp)
{
unsigned char loop, asum, bsum, csum, dsum;
unsigned int astreaks, bstreaks, cstreaks, dstreaks;
size_t quad1, quad2, quad3, quad4, half1, half2;
size_t cnt, abalance, bbalance, cbalance, dbalance;
VAR *pta, *ptb, *ptc, *ptd;
half1 = nmemb / 2;
quad1 = half1 / 2;
quad2 = half1 - quad1;
half2 = nmemb - half1;
quad3 = half2 / 2;
quad4 = half2 - quad3;
pta = array;
ptb = array + quad1;
ptc = array + half1;
ptd = array + half1 + quad3;
astreaks = bstreaks = cstreaks = dstreaks = 0;
abalance = bbalance = cbalance = dbalance = 0;
if (quad1 < quad2) {bbalance += cmp(ptb, ptb + 1) > 0; ptb++;}
if (quad1 < quad3) {cbalance += cmp(ptc, ptc + 1) > 0; ptc++;}
if (quad1 < quad4) {dbalance += cmp(ptd, ptd + 1) > 0; ptd++;}
for (cnt = nmemb ; cnt > 132 ; cnt -= 128)
{
for (asum = bsum = csum = dsum = 0, loop = 32 ; loop ; loop--)
{
asum += cmp(pta, pta + 1) > 0; pta++;
bsum += cmp(ptb, ptb + 1) > 0; ptb++;
csum += cmp(ptc, ptc + 1) > 0; ptc++;
dsum += cmp(ptd, ptd + 1) > 0; ptd++;
}
abalance += asum; astreaks += asum = (asum == 0) | (asum == 32);
bbalance += bsum; bstreaks += bsum = (bsum == 0) | (bsum == 32);
cbalance += csum; cstreaks += csum = (csum == 0) | (csum == 32);
dbalance += dsum; dstreaks += dsum = (dsum == 0) | (dsum == 32);
if (cnt > 516 && asum + bsum + csum + dsum == 0)
{
abalance += 48; pta += 96;
bbalance += 48; ptb += 96;
cbalance += 48; ptc += 96;
dbalance += 48; ptd += 96;
cnt -= 384;
}
}
for ( ; cnt > 7 ; cnt -= 4)
{
abalance += cmp(pta, pta + 1) > 0; pta++;
bbalance += cmp(ptb, ptb + 1) > 0; ptb++;
cbalance += cmp(ptc, ptc + 1) > 0; ptc++;
dbalance += cmp(ptd, ptd + 1) > 0; ptd++;
}
cnt = abalance + bbalance + cbalance + dbalance;
if (cnt == 0)
{
if (cmp(pta, pta + 1) <= 0 && cmp(ptb, ptb + 1) <= 0 && cmp(ptc, ptc + 1) <= 0)
{
return;
}
}
asum = quad1 - abalance == 1;
bsum = quad2 - bbalance == 1;
csum = quad3 - cbalance == 1;
dsum = quad4 - dbalance == 1;
if (asum | bsum | csum | dsum)
{
unsigned char span1 = (asum && bsum) * (cmp(pta, pta + 1) > 0);
unsigned char span2 = (bsum && csum) * (cmp(ptb, ptb + 1) > 0);
unsigned char span3 = (csum && dsum) * (cmp(ptc, ptc + 1) > 0);
switch (span1 | span2 * 2 | span3 * 4)
{
case 0: break;
case 1: FUNC(quad_reversal)(array, ptb); abalance = bbalance = 0; break;
case 2: FUNC(quad_reversal)(pta + 1, ptc); bbalance = cbalance = 0; break;
case 3: FUNC(quad_reversal)(array, ptc); abalance = bbalance = cbalance = 0; break;
case 4: FUNC(quad_reversal)(ptb + 1, ptd); cbalance = dbalance = 0; break;
case 5: FUNC(quad_reversal)(array, ptb);
FUNC(quad_reversal)(ptb + 1, ptd); abalance = bbalance = cbalance = dbalance = 0; break;
case 6: FUNC(quad_reversal)(pta + 1, ptd); bbalance = cbalance = dbalance = 0; break;
case 7: FUNC(quad_reversal)(array, ptd); return;
}
if (asum && abalance) {FUNC(quad_reversal)(array, pta); abalance = 0;}
if (bsum && bbalance) {FUNC(quad_reversal)(pta + 1, ptb); bbalance = 0;}
if (csum && cbalance) {FUNC(quad_reversal)(ptb + 1, ptc); cbalance = 0;}
if (dsum && dbalance) {FUNC(quad_reversal)(ptc + 1, ptd); dbalance = 0;}
}
#ifdef cmp
cnt = nmemb / 256; // switch to quadsort if at least 50% ordered
#else
cnt = nmemb / 512; // switch to quadsort if at least 25% ordered
#endif
asum = astreaks > cnt;
bsum = bstreaks > cnt;
csum = cstreaks > cnt;
dsum = dstreaks > cnt;
#ifndef cmp
if (quad1 > QUAD_CACHE)
{
asum = bsum = csum = dsum = 1;
}
#endif
switch (asum + bsum * 2 + csum * 4 + dsum * 8)
{
case 0:
FUNC(flux_partition)(array, swap, array, swap + nmemb, nmemb, cmp);
return;
case 1:
if (abalance) FUNC(quadsort_swap)(array, swap, swap_size, quad1, cmp);
FUNC(flux_partition)(pta + 1, swap, pta + 1, swap + quad2 + half2, quad2 + half2, cmp);
break;
case 2:
FUNC(flux_partition)(array, swap, array, swap + quad1, quad1, cmp);
if (bbalance) FUNC(quadsort_swap)(pta + 1, swap, swap_size, quad2, cmp);
FUNC(flux_partition)(ptb + 1, swap, ptb + 1, swap + half2, half2, cmp);
break;
case 3:
if (abalance) FUNC(quadsort_swap)(array, swap, swap_size, quad1, cmp);
if (bbalance) FUNC(quadsort_swap)(pta + 1, swap, swap_size, quad2, cmp);
FUNC(flux_partition)(ptb + 1, swap, ptb + 1, swap + half2, half2, cmp);
break;
case 4:
FUNC(flux_partition)(array, swap, array, swap + half1, half1, cmp);
if (cbalance) FUNC(quadsort_swap)(ptb + 1, swap, swap_size, quad3, cmp);
FUNC(flux_partition)(ptc + 1, swap, ptc + 1, swap + quad4, quad4, cmp);
break;
case 8:
FUNC(flux_partition)(array, swap, array, swap + half1 + quad3, half1 + quad3, cmp);
if (dbalance) FUNC(quadsort_swap)(ptc + 1, swap, swap_size, quad4, cmp);
break;
case 9:
if (abalance) FUNC(quadsort_swap)(array, swap, swap_size, quad1, cmp);
FUNC(flux_partition)(pta + 1, swap, pta + 1, swap + quad2 + quad3, quad2 + quad3, cmp);
if (dbalance) FUNC(quadsort_swap)(ptc + 1, swap, swap_size, quad4, cmp);
break;
case 12:
FUNC(flux_partition)(array, swap, array, swap + half1, half1, cmp);
if (cbalance) FUNC(quadsort_swap)(ptb + 1, swap, swap_size, quad3, cmp);
if (dbalance) FUNC(quadsort_swap)(ptc + 1, swap, swap_size, quad4, cmp);
break;
case 5:
case 6:
case 7:
case 10:
case 11:
case 13:
case 14:
case 15:
if (asum)
{
if (abalance) FUNC(quadsort_swap)(array, swap, swap_size, quad1, cmp);
}
else FUNC(flux_partition)(array, swap, array, swap + quad1, quad1, cmp);
if (bsum)
{
if (bbalance) FUNC(quadsort_swap)(pta + 1, swap, swap_size, quad2, cmp);
}
else FUNC(flux_partition)(pta + 1, swap, pta + 1, swap + quad2, quad2, cmp);
if (csum)
{
if (cbalance) FUNC(quadsort_swap)(ptb + 1, swap, swap_size, quad3, cmp);
}
else FUNC(flux_partition)(ptb + 1, swap, ptb + 1, swap + quad3, quad3, cmp);
if (dsum)
{
if (dbalance) FUNC(quadsort_swap)(ptc + 1, swap, swap_size, quad4, cmp);
}
else FUNC(flux_partition)(ptc + 1, swap, ptc + 1, swap + quad4, quad4, cmp);
break;
}
if (cmp(pta, pta + 1) <= 0)
{
if (cmp(ptc, ptc + 1) <= 0)
{
if (cmp(ptb, ptb + 1) <= 0)
{
return;
}
memcpy(swap, array, nmemb * sizeof(VAR));
}
else
{
FUNC(cross_merge)(swap + half1, array + half1, quad3, quad4, cmp);
memcpy(swap, array, half1 * sizeof(VAR));
}
}
else
{
if (cmp(ptc, ptc + 1) <= 0)
{
memcpy(swap + half1, array + half1, half2 * sizeof(VAR));
FUNC(cross_merge)(swap, array, quad1, quad2, cmp);
}
else
{
FUNC(cross_merge)(swap + half1, ptb + 1, quad3, quad4, cmp);
FUNC(cross_merge)(swap, array, quad1, quad2, cmp);
}
}
FUNC(cross_merge)(array, swap, half1, half2, cmp);
}
// The next 4 functions are used for pivot selection
VAR FUNC(binary_median)(VAR *pta, VAR *ptb, size_t len, CMPFUNC *cmp)
{
while (len /= 2)
{
if (cmp(pta + len, ptb + len) <= 0) pta += len; else ptb += len;
}
return cmp(pta, ptb) > 0 ? *pta : *ptb;
}
void FUNC(trim_four)(VAR *pta, CMPFUNC *cmp)
{
VAR swap;
size_t x;
x = cmp(pta, pta + 1) > 0; swap = pta[!x]; pta[0] = pta[x]; pta[1] = swap; pta += 2;
x = cmp(pta, pta + 1) > 0; swap = pta[!x]; pta[0] = pta[x]; pta[1] = swap; pta -= 2;
x = (cmp(pta, pta + 2) <= 0) * 2; pta[2] = pta[x]; pta++;
x = (cmp(pta, pta + 2) > 0) * 2; pta[0] = pta[x];
}
VAR FUNC(median_of_nine)(VAR *array, size_t nmemb, CMPFUNC *cmp)
{
VAR *pta, swap[9];
size_t x, y, z;
z = nmemb / 9;
pta = array;
for (x = 0 ; x < 9 ; x++)
{
swap[x] = *pta;
pta += z;
}
FUNC(trim_four)(swap, cmp);
FUNC(trim_four)(swap + 4, cmp);
swap[0] = swap[5];
swap[3] = swap[8];
FUNC(trim_four)(swap, cmp);
swap[0] = swap[6];
x = cmp(swap + 0, swap + 1) > 0;
y = cmp(swap + 0, swap + 2) > 0;
z = cmp(swap + 1, swap + 2) > 0;
return swap[(x == y) + (y ^ z)];
}
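/*
The last lines of median_of_nine pick the median of three values without
branching on the data. The following is a minimal standalone sketch of that
index arithmetic only. It assumes plain int comparisons in place of cmp, and
median_of_three_index is a hypothetical name that does not exist in fluxsort.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical helper, not part of fluxsort: returns the index (0, 1 or 2)
// of the median of v[0], v[1], v[2].
size_t median_of_three_index(const int *v)
{
    assert(v != NULL);

    size_t x = v[0] > v[1];
    size_t y = v[0] > v[2];
    size_t z = v[1] > v[2];

    // If x != y then v[0] lies between the other two and the sum is 0.
    // If x == y then v[0] is the minimum or the maximum, and y ^ z selects
    // between v[1] and v[2].
    return (x == y) + (y ^ z);
}
```

For {3, 1, 2} this gives x = 1, y = 1, z = 0, so the index is 1 + 1 = 2 and
the median is v[2].
*/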
VAR FUNC(median_of_cbrt)(VAR *array, VAR *swap, VAR *ptx, size_t nmemb, int *generic, CMPFUNC *cmp)
{
VAR *pta, *pts;
size_t cnt, div, cbrt;
for (cbrt = 32 ; nmemb > cbrt * cbrt * cbrt ; cbrt *= 2) {}
div = nmemb / cbrt;
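// Starting offset in [0, div), derived from the stack address of div as a cheap way to vary the sample positions
pta = ptx + (size_t) &div / 16 % div;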
pts = ptx == array ? swap : array;
for (cnt = 0 ; cnt < cbrt ; cnt++)
{
pts[cnt] = *pta;
pta += div;
}
cbrt /= 2;
FUNC(quadsort_swap)(pts, pts + cbrt * 2, cbrt, cbrt, cmp);
FUNC(quadsort_swap)(pts + cbrt, pts + cbrt * 2, cbrt, cbrt, cmp);
*generic = (cmp(pts + cbrt * 2 - 1, pts) <= 0) & (cmp(pts + cbrt - 1, pts) <= 0);
return FUNC(binary_median)(pts, pts + cbrt, cbrt, cmp);
}
// As per suggestion by Marshall Lochbaum to improve generic data handling by mimicking dual-pivot quicksort
void FUNC(flux_reverse_partition)(VAR *array, VAR *swap, VAR *ptx, VAR *piv, size_t nmemb, CMPFUNC *cmp)
{
size_t a_size, s_size;
#if !defined __clang__
{
size_t cnt, m, val;
VAR *pts = swap;
for (m = 0, cnt = nmemb / 8 ; cnt ; cnt--)
{
val = cmp(piv, ptx) > 0; pts[-m] = array[m] = *ptx++; m += val; pts++;
val = cmp(piv, ptx) > 0; pts[-m] = array[m] = *ptx++; m += val; pts++;
val = cmp(piv, ptx) > 0; pts[-m] = array[m] = *ptx++; m += val; pts++;
val = cmp(piv, ptx) > 0; pts[-m] = array[m] = *ptx++; m += val; pts++;
val = cmp(piv, ptx) > 0; pts[-m] = array[m] = *ptx++; m += val; pts++;
val = cmp(piv, ptx) > 0; pts[-m] = array[m] = *ptx++; m += val; pts++;
val = cmp(piv, ptx) > 0; pts[-m] = array[m] = *ptx++; m += val; pts++;
val = cmp(piv, ptx) > 0; pts[-m] = array[m] = *ptx++; m += val; pts++;
}
for (cnt = nmemb % 8 ; cnt ; cnt--)
{
val = cmp(piv, ptx) > 0; pts[-m] = array[m] = *ptx++; m += val; pts++;
}
a_size = m;
s_size = nmemb - a_size;
}
#else
{
size_t cnt;
VAR *tmp, *pta = array, *pts = swap;
for (cnt = nmemb / 8 ; cnt ; cnt--)
{
tmp = cmp(piv, ptx) > 0 ? pta++ : pts++; *tmp = *ptx++;
tmp = cmp(piv, ptx) > 0 ? pta++ : pts++; *tmp = *ptx++;
tmp = cmp(piv, ptx) > 0 ? pta++ : pts++; *tmp = *ptx++;
tmp = cmp(piv, ptx) > 0 ? pta++ : pts++; *tmp = *ptx++;
tmp = cmp(piv, ptx) > 0 ? pta++ : pts++; *tmp = *ptx++;
tmp = cmp(piv, ptx) > 0 ? pta++ : pts++; *tmp = *ptx++;
tmp = cmp(piv, ptx) > 0 ? pta++ : pts++; *tmp = *ptx++;
tmp = cmp(piv, ptx) > 0 ? pta++ : pts++; *tmp = *ptx++;
}
for (cnt = nmemb % 8 ; cnt ; cnt--)
{
tmp = cmp(piv, ptx) > 0 ? pta++ : pts++; *tmp = *ptx++;
}
a_size = pta - array;
s_size = pts - swap;
}
#endif
memcpy(array + a_size, swap, s_size * sizeof(VAR));
if (s_size <= a_size / 16 || a_size <= FLUX_OUT)
{
FUNC(quadsort_swap)(array, swap, a_size, a_size, cmp);
return;
}
FUNC(flux_partition)(array, swap, array, piv, a_size, cmp);
}
size_t FUNC(flux_default_partition)(VAR *array, VAR *swap, VAR *ptx, VAR *piv, size_t nmemb, CMPFUNC *cmp)
{
size_t run = 0, a = 0, m = 0;
#if !defined __clang__
size_t val;
for (a = 8 ; a <= nmemb ; a += 8)
{
val = cmp(ptx, piv) <= 0; swap[-m] = array[m] = *ptx++; m += val; swap++;
val = cmp(ptx, piv) <= 0; swap[-m] = array[m] = *ptx++; m += val; swap++;
val = cmp(ptx, piv) <= 0; swap[-m] = array[m] = *ptx++; m += val; swap++;
val = cmp(ptx, piv) <= 0; swap[-m] = array[m] = *ptx++; m += val; swap++;
val = cmp(ptx, piv) <= 0; swap[-m] = array[m] = *ptx++; m += val; swap++;
val = cmp(ptx, piv) <= 0; swap[-m] = array[m] = *ptx++; m += val; swap++;
val = cmp(ptx, piv) <= 0; swap[-m] = array[m] = *ptx++; m += val; swap++;
val = cmp(ptx, piv) <= 0; swap[-m] = array[m] = *ptx++; m += val; swap++;
if (m == a) run = a;
}
for (a = nmemb % 8 ; a ; a--)
{
val = cmp(ptx, piv) <= 0; swap[-m] = array[m] = *ptx++; m += val; swap++;
}
swap -= nmemb;
#else
VAR *tmp, *pta = array, *pts = swap;
for (a = 8 ; a <= nmemb ; a += 8)
{
tmp = cmp(ptx, piv) <= 0 ? pta++ : pts++; *tmp = *ptx++;
tmp = cmp(ptx, piv) <= 0 ? pta++ : pts++; *tmp = *ptx++;
tmp = cmp(ptx, piv) <= 0 ? pta++ : pts++; *tmp = *ptx++;
tmp = cmp(ptx, piv) <= 0 ? pta++ : pts++; *tmp = *ptx++;
tmp = cmp(ptx, piv) <= 0 ? pta++ : pts++; *tmp = *ptx++;
tmp = cmp(ptx, piv) <= 0 ? pta++ : pts++; *tmp = *ptx++;
tmp = cmp(ptx, piv) <= 0 ? pta++ : pts++; *tmp = *ptx++;
tmp = cmp(ptx, piv) <= 0 ? pta++ : pts++; *tmp = *ptx++;
if (pta == array || pts == swap) run = a;
}
for (a = nmemb % 8 ; a ; a--)
{
tmp = cmp(ptx, piv) <= 0 ? pta++ : pts++; *tmp = *ptx++;
}
m = pta - array;
#endif
if (run <= nmemb / 4)
{
return m;
}
if (m == nmemb)
{
return m;
}
a = nmemb - m;
memcpy(array + m, swap, a * sizeof(VAR));
FUNC(quadsort_swap)(array + m, swap, a, a, cmp);
FUNC(quadsort_swap)(array, swap, m, m, cmp);
return 0;
}
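/*
The unrolled non-clang loops in flux_default_partition store every element to
both destinations and advance only one write position by the comparison
result. The following is a minimal standalone sketch of that idea. It assumes
int elements and separate lo and hi buffers, whereas the original writes
through array and a moving swap pointer. branchless_partition is a
hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical helper, not part of fluxsort: stable partition of src[0..n)
// around pivot. Elements <= pivot end up in lo[0..m), the rest in
// hi[0..n-m). Both lo and hi must have room for n elements.
size_t branchless_partition(const int *src, int *lo, int *hi, size_t n, int pivot)
{
    assert(src != NULL && lo != NULL && hi != NULL);

    size_t m = 0; // number of elements <= pivot seen so far

    for (size_t i = 0; i < n; i++)
    {
        size_t val = src[i] <= pivot;

        // Write to both sides. The slot that is not advanced is simply
        // overwritten by a later element.
        lo[m] = src[i];
        hi[i - m] = src[i];
        m += val;
    }
    return m;
}
```

The data-dependent branch is replaced by an extra store per element, which is
the trade-off the unrolled loops above make as well.
*/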
void FUNC(flux_partition)(VAR *array, VAR *swap, VAR *ptx, VAR *piv, size_t nmemb, CMPFUNC *cmp)
{
size_t a_size = 0, s_size;
int generic = 0;
while (1)
{
--piv;
if (nmemb <= 2048)
{
*piv = FUNC(median_of_nine)(ptx, nmemb, cmp);
}
else
{
*piv = FUNC(median_of_cbrt)(array, swap, ptx, nmemb, &generic, cmp);
if (generic)
{
if (ptx == swap)
{
memcpy(array, swap, nmemb * sizeof(VAR));
}
FUNC(quadsort_swap)(array, swap, nmemb, nmemb, cmp);
return;
}
}
if (a_size && cmp(piv + 1, piv) <= 0)
{
FUNC(flux_reverse_partition)(array, swap, array, piv, nmemb, cmp);
return;
}
a_size = FUNC(flux_default_partition)(array, swap, ptx, piv, nmemb, cmp);
s_size = nmemb - a_size;
if (a_size <= s_size / 32 || s_size <= FLUX_OUT)
{
if (a_size == 0)
{
return;
}
if (s_size == 0)
{
FUNC(flux_reverse_partition)(array, swap, array, piv, a_size, cmp);
return;
}
memcpy(array + a_size, swap, s_size * sizeof(VAR));
FUNC(quadsort_swap)(array + a_size, swap, s_size, s_size, cmp);
}
else
{
FUNC(flux_partition)(array + a_size, swap, swap, piv, s_size, cmp);
}
if (s_size <= a_size / 32 || a_size <= FLUX_OUT)
{
if (a_size <= FLUX_OUT)
{
FUNC(quadsort_swap)(array, swap, a_size, a_size, cmp);
}
else
{
FUNC(flux_reverse_partition)(array, swap, array, piv, a_size, cmp);
}
return;
}
nmemb = a_size;
ptx = array;
}
}
void FUNC(fluxsort)(void *array, size_t nmemb, CMPFUNC *cmp)
{
if (nmemb <= 132)
{
FUNC(quadsort)(array, nmemb, cmp);
}
else
{
VAR *pta = (VAR *) array;
VAR *swap = (VAR *) malloc(nmemb * sizeof(VAR));
if (swap == NULL)
{
FUNC(quadsort)(array, nmemb, cmp);
return;
}
FUNC(flux_analyze)(pta, swap, nmemb, nmemb, cmp);
free(swap);
}
}
void FUNC(fluxsort_swap)(void *array, void *swap, size_t swap_size, size_t nmemb, CMPFUNC *cmp)
{
if (nmemb <= 132)
{
FUNC(quadsort_swap)(array, swap, swap_size, nmemb, cmp);
}
else
{
VAR *pta = (VAR *) array;
VAR *pts = (VAR *) swap;
FUNC(flux_analyze)(pta, pts, swap_size, nmemb, cmp);
}
}
================================================
FILE: src/fluxsort.h
================================================
// fluxsort 1.2.1.3 - Igor van den Hoven ivdhoven@gmail.com
#ifndef FLUXSORT_H
#define FLUXSORT_H
#include <stdlib.h>
#include <stdio.h>
#include <assert.h>
#include <errno.h>
#include <float.h>
#include <string.h>
typedef int CMPFUNC (const void *a, const void *b);
//#define cmp(a,b) (*(a) > *(b))
#ifndef QUADSORT_H
#include "quadsort.h"
#endif
// When sorting an array of 32/64 bit pointers, like a string array, QUAD_CACHE
// needs to be adjusted in quadsort.h and here for proper performance when
// sorting large arrays.
#ifdef cmp
#define QUAD_CACHE 4294967295
#else
//#define QUAD_CACHE 131072
#define QUAD_CACHE 262144
//#define QUAD_CACHE 524288
//#define QUAD_CACHE 4294967295
#endif
//////////////////////////////////////////////////////////
// ┌───────────────────────────────────────────────────┐//
// │ ██████┐ ██████┐ ██████┐ ██████┐████████┐ │//
// │ └────██┐└────██┐ ██┌──██┐└─██┌─┘└──██┌──┘ │//
// │ █████┌┘ █████┌┘ ██████┌┘ ██│ ██│ │//
// │ └───██┐██┌───┘ ██┌──██┐ ██│ ██│ │//
// │ ██████┌┘███████┐ ██████┌┘██████┐ ██│ │//
// │ └─────┘ └──────┘ └─────┘ └─────┘ └─┘ │//
// └───────────────────────────────────────────────────┘//
//////////////////////////////////////////////////////////
#define VAR int
#define FUNC(NAME) NAME##32
#include "fluxsort.c"
#undef VAR
#undef FUNC
// fluxsort_prim
#define VAR int
#define FUNC(NAME) NAME##_int32
#ifndef cmp
#define cmp(a,b) (*(a) > *(b))
#include "fluxsort.c"
#undef cmp
#else
#include "fluxsort.c"
#endif
#undef VAR
#undef FUNC
#define VAR unsigned int
#define FUNC(NAME) NAME##_uint32
#ifndef cmp
#define cmp(a,b) (*(a) > *(b))
#include "fluxsort.c"
#undef cmp
#else
#include "fluxsort.c"
#endif
#undef VAR
#undef FUNC
//////////////////////////////////////////////////////////
// ┌───────────────────────────────────────────────────┐//
// │ █████┐ ██┐ ██┐ ██████┐ ██████┐████████┐ │//
// │ ██┌───┘ ██│ ██│ ██┌──██┐└─██┌─┘└──██┌──┘ │//
// │ ██████┐ ███████│ ██████┌┘ ██│ ██│ │//
// │ ██┌──██┐└────██│ ██┌──██┐ ██│ ██│ │//
// │ └█████┌┘ ██│ ██████┌┘██████┐ ██│ │//
// │ └────┘ └─┘ └─────┘ └─────┘ └─┘ │//
// └───────────────────────────────────────────────────┘//
//////////////////////////////////////////////////////////
#define VAR long long
#define FUNC(NAME) NAME##64
#include "fluxsort.c"
#undef VAR
#undef FUNC
// fluxsort_prim
#define VAR long long
#define FUNC(NAME) NAME##_int64
#ifndef cmp
#define cmp(a,b) (*(a) > *(b))
#include "fluxsort.c"
#undef cmp
#else
#include "fluxsort.c"
#endif
#undef VAR
#undef FUNC
#define VAR unsigned long long
#define FUNC(NAME) NAME##_uint64
#ifndef cmp
#define cmp(a,b) (*(a) > *(b))
#include "fluxsort.c"
#undef cmp
#else
#include "fluxsort.c"
#endif
#undef VAR
#undef FUNC
// This section is outside of 32/64 bit pointer territory, so no cache checks
// necessary, unless sorting 32+ byte structures.
#undef QUAD_CACHE
#define QUAD_CACHE 4294967295
//////////////////////////////////////////////////////////
//┌────────────────────────────────────────────────────┐//
//│ █████┐ ██████┐ ██████┐████████┐ │//
//│ ██┌──██┐ ██┌──██┐└─██┌─┘└──██┌──┘ │//
//│ └█████┌┘ ██████┌┘ ██│ ██│ │//
//│ ██┌──██┐ ██┌──██┐ ██│ ██│ │//
//│ └█████┌┘ ██████┌┘██████┐ ██│ │//
//│ └────┘ └─────┘ └─────┘ └─┘ │//
//└────────────────────────────────────────────────────┘//
//////////////////////////////////////////////////////////
#define VAR char
#define FUNC(NAME) NAME##8
#include "fluxsort.c"
#undef VAR
#undef FUNC
//////////////////////////////////////////////////////////
//┌────────────────────────────────────────────────────┐//
//│ ▄██┐ █████┐ ██████┐ ██████┐████████┐│//
//│ ████│ ██┌───┘ ██┌──██┐└─██┌─┘└──██┌──┘│//
//│ └─██│ ██████┐ ██████┌┘ ██│ ██│ │//
//│ ██│ ██┌──██┐ ██┌──██┐ ██│ ██│ │//
//│ ██████┐└█████┌┘ ██████┌┘██████┐ ██│ │//
//│ └─────┘ └────┘ └─────┘ └─────┘ └─┘ │//
//└────────────────────────────────────────────────────┘//
//////////////////////////////////////////////////////////
#define VAR short
#define FUNC(NAME) NAME##16
#include "fluxsort.c"
#undef VAR
#undef FUNC
//////////////////////////////////////////////////////////
//┌────────────────────────────────────────────────────┐//
//│ ▄██┐ ██████┐ █████┐ ██████┐ ██████┐████████┐ │//
//│ ████│ └────██┐██┌──██┐ ██┌──██┐└─██┌─┘└──██┌──┘ │//
//│ └─██│ █████┌┘└█████┌┘ ██████┌┘ ██│ ██│ │//
//│ ██│ ██┌───┘ ██┌──██┐ ██┌──██┐ ██│ ██│ │//
//│ ██████┐███████┐└█████┌┘ ██████┌┘██████┐ ██│ │//
//│ └─────┘└──────┘ └────┘ └─────┘ └─────┘ └─┘ │//
//└────────────────────────────────────────────────────┘//
//////////////////////////////////////////////////////////
#if (DBL_MANT_DIG < LDBL_MANT_DIG)
#define VAR long double
#define FUNC(NAME) NAME##128
#include "fluxsort.c"
#undef VAR
#undef FUNC
#endif
//////////////////////////////////////////////////////////////////////////
//┌────────────────────────────────────────────────────────────────────┐//
//│███████┐██┐ ██┐ ██┐██┐ ██┐███████┐ ██████┐ ██████┐ ████████┐ │//
//│██┌────┘██│ ██│ ██│└██┐██┌┘██┌────┘██┌───██┐██┌──██┐└──██┌──┘ │//
//│█████┐ ██│ ██│ ██│ └███┌┘ ███████┐██│ ██│██████┌┘ ██│ │//
//│██┌──┘ ██│ ██│ ██│ ██┌██┐ └────██│██│ ██│██┌──██┐ ██│ │//
//│██│ ███████┐└██████┌┘██┌┘ ██┐███████│└██████┌┘██│ ██│ ██│ │//
//│└─┘ └──────┘ └─────┘ └─┘ └─┘└──────┘ └─────┘ └─┘ └─┘ └─┘ │//
//└────────────────────────────────────────────────────────────────────┘//
//////////////////////////////////////////////////////////////////////////
void fluxsort(void *array, size_t nmemb, size_t size, CMPFUNC *cmp)
{
if (nmemb < 2)
{
return;
}
switch (size)
{
case sizeof(char):
fluxsort8(array, nmemb, cmp);
return;
case sizeof(short):
fluxsort16(array, nmemb, cmp);
return;
case sizeof(int):
fluxsort32(array, nmemb, cmp);
return;
case sizeof(long long):
fluxsort64(array, nmemb, cmp);
return;
#if (DBL_MANT_DIG < LDBL_MANT_DIG)
case sizeof(long double):
fluxsort128(array, nmemb, cmp);
return;
#endif
default:
#if (DBL_MANT_DIG < LDBL_MANT_DIG)
assert(size == sizeof(char) || size == sizeof(short) || size == sizeof(int) || size == sizeof(long long) || size == sizeof(long double));
#else
assert(size == sizeof(char) || size == sizeof(short) || size == sizeof(int) || size == sizeof(long long));
#endif
}
}
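// This must match quadsort_prim(). The size argument doubles as a type tag:
// sizeof(int) and sizeof(long long) select the signed sorts, while
// sizeof(int) + 1 and sizeof(long long) + 1 select the unsigned sorts.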
void fluxsort_prim(void *array, size_t nmemb, size_t size)
{
if (nmemb < 2)
{
return;
}
switch (size)
{
case 4:
fluxsort_int32(array, nmemb, NULL);
return;
case 5:
fluxsort_uint32(array, nmemb, NULL);
return;
case 8:
fluxsort_int64(array, nmemb, NULL);
return;
case 9:
fluxsort_uint64(array, nmemb, NULL);
return;
default:
assert(size == sizeof(int) || size == sizeof(int) + 1 || size == sizeof(long long) || size == sizeof(long long) + 1);
return;
}
}
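// Sort an array of structures by sorting an index of pointers to them. The
// comparison function is called by reference: each argument points to a
// stored element pointer (char **), not to the structure itself.
void fluxsort_size(void *array, size_t nmemb, size_t size, CMPFUNC *cmp)
{
char **pti, *pta, *pts;
size_t index, offset;
if (nmemb < 2 || size == 0)
{
return; // nothing to sort, and malloc(0) may legitimately return NULL
}
pta = (char *) array;
pti = (char **) malloc(nmemb * sizeof(char *));
assert(pti != NULL);
for (index = offset = 0 ; index < nmemb ; index++)
{
pti[index] = pta + offset;
offset += size;
}
switch (sizeof(char *)) // the index holds pointers, so dispatch on pointer width
{
case 4: fluxsort32(pti, nmemb, cmp); break;
case 8: fluxsort64(pti, nmemb, cmp); break;
}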
pts = (char *) malloc(nmemb * size);
assert(pts != NULL);
for (index = 0 ; index < nmemb ; index++)
{
memcpy(pts, pti[index], size);
pts += size;
}
pts -= nmemb * size;
memcpy(array, pts, nmemb * size);
free(pti);
free(pts);
}
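/*
fluxsort_size sorts an index of element pointers and then gathers the
structures in sorted order. The sketch below shows the same two-step shape
with the standard qsort standing in for fluxsort. struct record,
cmp_record_ptr and sort_records_by_pointer are hypothetical names, and the
error return is an assumption that replaces the asserts used above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct record { int key; char tag; }; // hypothetical payload

// Comparison by reference: a and b point at stored pointers, not at records.
static int cmp_record_ptr(const void *a, const void *b)
{
    const struct record *ra = *(const struct record * const *) a;
    const struct record *rb = *(const struct record * const *) b;

    return (ra->key > rb->key) - (ra->key < rb->key);
}

// Returns 0 on success and -1 if memory could not be allocated.
int sort_records_by_pointer(struct record *array, size_t nmemb)
{
    if (nmemb < 2) return 0;

    struct record **index = malloc(nmemb * sizeof *index);
    struct record *sorted = malloc(nmemb * sizeof *sorted);

    if (index == NULL || sorted == NULL)
    {
        free(index);
        free(sorted);
        return -1;
    }
    for (size_t i = 0; i < nmemb; i++) index[i] = array + i;

    // Step 1: sort the pointers, which are cheap to move.
    qsort(index, nmemb, sizeof *index, cmp_record_ptr);

    // Step 2: gather the records in sorted order and copy them back.
    for (size_t i = 0; i < nmemb; i++) sorted[i] = *index[i];
    memcpy(array, sorted, nmemb * sizeof *array);

    free(index);
    free(sorted);
    return 0;
}
```

Sorting pointers keeps each move at pointer size regardless of how large the
structures are, at the cost of one extra indirection per comparison.
*/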
#undef QUAD_CACHE
#endif
================================================
FILE: src/gridsort.c
================================================
// gridsort 1.2.1.3 - Igor van den Hoven ivdhoven@gmail.com
STRUCT(x_node)
{
VAR *swap;
size_t y_size;
size_t y;
VAR *y_base;
STRUCT(y_node) **y_axis;
};
STRUCT(y_node)
{
size_t z_size;
VAR *z_axis1;
VAR *z_axis2;
};
STRUCT(x_node) *FUNC(create_grid)(VAR *array, size_t nmemb, CMPFUNC *cmp)
{
STRUCT(x_node) *x_node = (STRUCT(x_node) *) malloc(sizeof(STRUCT(x_node)));
STRUCT(y_node) *y_node;
for (BSC_Z = BSC_X ; BSC_Z * BSC_Z / 4 < nmemb ; BSC_Z *= 4);
x_node->swap = (VAR *) malloc(BSC_Z * 2 * sizeof(VAR));
x_node->y_base = (VAR *) malloc(BSC_Z * sizeof(VAR));
x_node->y_axis = (STRUCT(y_node) **) malloc(BSC_Z * sizeof(STRUCT(y_node) *));
FUNC(quadsort_swap)(array, x_node->swap, BSC_Z * 2, BSC_Z * 2, cmp);
for (int cnt = 0 ; cnt < 2 ; cnt++)
{
y_node = (STRUCT(y_node) *) malloc(sizeof(STRUCT(y_node)));
y_node->z_axis1 = (VAR *) malloc(BSC_Z * sizeof(VAR));
memcpy(y_node->z_axis1, array + cnt * BSC_Z, BSC_Z * sizeof(VAR));
y_node->z_axis2 = (VAR *) malloc(BSC_Z * sizeof(VAR));
y_node->z_size = 0;
x_node->y_axis[cnt] = y_node;
x_node->y_base[cnt] = y_node->z_axis1[0];
}
x_node->y_size = 2;
x_node->y = 0;
return x_node;
}
// used by destroy_grid
// y_node->z_axis1 should be sorted and of BSC_Z size.
// y_node->z_axis2 should be unsorted and of y_node->z_size size.
void FUNC(twin_merge_cpy)(STRUCT(x_node) *x_node, VAR *dest, STRUCT(y_node) *y_node, CMPFUNC *cmp)
{
VAR *ptl = y_node->z_axis1;
VAR *ptr = y_node->z_axis2;
size_t nmemb1 = BSC_Z;
size_t nmemb2 = y_node->z_size;
VAR *tpl = y_node->z_axis1 + nmemb1 - 1;
VAR *tpr = y_node->z_axis2 + nmemb2 - 1;
VAR *ptd = dest;
VAR *tpd = dest + nmemb1 + nmemb2 - 1;
size_t loop, x, y;
FUNC(quadsort_swap)(ptr, x_node->swap, nmemb2, nmemb2, cmp);
while (1)
{
if (tpl - ptl > 8)
{
ptl8_ptr: if (cmp(ptl + 7, ptr) <= 0)
{
memcpy(ptd, ptl, 8 * sizeof(VAR)); ptd += 8; ptl += 8;
if (tpl - ptl > 8) {goto ptl8_ptr;} continue;
}
tpl8_tpr: if (cmp(tpl - 7, tpr) > 0)
{
tpd -= 7; tpl -= 7; memcpy(tpd--, tpl--, 8 * sizeof(VAR));
if (tpl - ptl > 8) {goto tpl8_tpr;} continue;
}
}
if (tpr - ptr > 8)
{
ptl_ptr8: if (cmp(ptl, ptr + 7) > 0)
{
memcpy(ptd, ptr, 8 * sizeof(VAR)); ptd += 8; ptr += 8;
if (tpr - ptr > 8) {goto ptl_ptr8;} continue;
}
tpl_tpr8: if (cmp(tpl, tpr - 7) <= 0)
{
tpd -= 7; tpr -= 7; memcpy(tpd--, tpr--, 8 * sizeof(VAR));
if (tpr - ptr > 8) {goto tpl_tpr8;} continue;
}
}
if (tpd - ptd < 16)
{
break;
}
loop = 8; do
{
head_branchless_merge(ptd, x, ptl, ptr, cmp);
tail_branchless_merge(tpd, y, tpl, tpr, cmp);
}
while (--loop);
}
while (tpl - ptl > 1 && tpr - ptr > 1)
{
if (cmp(ptl + 1, ptr) <= 0)
{
*ptd++ = *ptl++; *ptd++ = *ptl++;
}
else if (cmp(ptl, ptr + 1) > 0)
{
*ptd++ = *ptr++; *ptd++ = *ptr++;
}
else
{
x = cmp(ptl, ptr) <= 0; y = !x; ptd[x] = *ptr; ptr += 1; ptd[y] = *ptl; ptl += 1; ptd += 2;
x = cmp(ptl, ptr) <= 0; y = !x; ptd[x] = *ptr; ptr += y; ptd[y] = *ptl; ptl += x; ptd++;
}
}
while (ptl <= tpl && ptr <= tpr)
{
*ptd++ = cmp(ptl, ptr) <= 0 ? *ptl++ : *ptr++;
}
while (ptl <= tpl)
{
*ptd++ = *ptl++;
}
while (ptr <= tpr)
{
*ptd++ = *ptr++;
}
}
void FUNC(parity_twin_merge)(VAR *ptl, VAR *ptr, VAR *ptd, VAR *tpd, size_t block, CMPFUNC *cmp)
{
VAR *tpl, *tpr;
#if !defined __clang__
unsigned char x, y;
#endif
tpl = ptl + block - 1;
tpr = ptr + block - 1;
for (block-- ; block ; block--)
{
head_branchless_merge(ptd, x, ptl, ptr, cmp);
tail_branchless_merge(tpd, y, tpl, tpr, cmp);
}
*ptd = cmp(ptl, ptr) <= 0 ? *ptl : *ptr;
*tpd = cmp(tpl, tpr) > 0 ? *tpl : *tpr;
}
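/*
parity_twin_merge fills the destination from both ends at once. The following
is a minimal standalone sketch of that pattern. It assumes int elements and
plain comparisons in place of the head_branchless_merge and
tail_branchless_merge macros. parity_merge_sketch is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical helper, not part of gridsort: merges two sorted blocks of
// equal length into dest, which must hold 2 * block elements.
void parity_merge_sketch(const int *left, const int *right, int *dest, size_t block)
{
    assert(block > 0);

    const int *ptl = left, *ptr = right;                         // heads
    const int *tpl = left + block - 1, *tpr = right + block - 1; // tails
    int *ptd = dest, *tpd = dest + 2 * block - 1;

    // Each pass emits the smallest remaining element at the front and the
    // largest remaining element at the back.
    for (size_t i = 1; i < block; i++)
    {
        *ptd++ = (*ptl <= *ptr) ? *ptl++ : *ptr++;
        *tpd-- = (*tpl > *tpr) ? *tpl-- : *tpr--;
    }
    // The final pair is written without moving the source pointers.
    *ptd = (*ptl <= *ptr) ? *ptl : *ptr;
    *tpd = (*tpl > *tpr) ? *tpl : *tpr;
}
```

Because both blocks have the same length, the front pass consumes exactly
block elements in total and never reads past either block, so the loop needs
no bounds checks.
*/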
// merge two sorted arrays across two buckets
// [AB][AB] --> [AA][ ] + [BB][ ]
void FUNC(twin_merge)(STRUCT(x_node) *x_node, STRUCT(y_node) *y_node1, STRUCT(y_node) *y_node2, CMPFUNC *cmp)
{
VAR *pta, *ptb, *tpa, *tpb, *pts;
FUNC(quadsort_swap)(y_node1->z_axis2, x_node->swap, BSC_Z, BSC_Z, cmp);
pta = y_node1->z_axis1;
ptb = y_node1->z_axis2;
tpa = pta + BSC_Z - 1;
tpb = ptb + BSC_Z - 1;
if (cmp(tpa, ptb) <= 0)
{
pts = y_node1->z_axis2;
y_node1->z_axis2 = y_node2->z_axis1;
y_node2->z_axis1 = pts;
return;
}
if (cmp(pta, tpb) > 0)
{
pts = y_node1->z_axis1;
y_node1->z_axis1 = y_node1->z_axis2;
y_node1->z_axis2 = y_node2->z_axis1;
y_node2->z_axis1 = pts;
return;
}
FUNC(parity_twin_merge)(pta, ptb, y_node2->z_axis2, y_node2->z_axis1 + BSC_Z - 1, BSC_Z, cmp);
pta = y_node1->z_axis1; y_node1->z_axis1 = y_node2->z_axis2; y_node2->z_axis2 = pta;
}
void FUNC(destroy_grid)(STRUCT(x_node) *x_node, VAR *array, CMPFUNC *cmp)
{
STRUCT(y_node) *y_node;
size_t y, z;
for (y = z = 0 ; y < x_node->y_size ; y++)
{
y_node = x_node->y_axis[y];
if (y_node->z_size)
{
FUNC(twin_merge_cpy)(x_node, &array[z], y_node, cmp);
}
else
{
memcpy(&array[z], y_node->z_axis1, BSC_Z * sizeof(VAR));
}
z += BSC_Z + y_node->z_size;
free(y_node->z_axis1);
free(y_node->z_axis2);
free(y_node);
}
free(x_node->y_axis);
free(x_node->y_base);
free(x_node->swap);
free(x_node);
}
size_t FUNC(adaptive_binary_search)(STRUCT(x_node) *x_node, VAR *array, VAR key, CMPFUNC *cmp)
{
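// Note: this static flag persists across calls, so the search is neither reentrant nor thread-safe
static unsigned int run;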
size_t top, mid;
VAR *base = array;
if (!run)
{
top = x_node->y_size;
goto monobound;
}
if (x_node->y == x_node->y_size - 1)
{
if (cmp(base + x_node->y, &key) <= 0)
{
return x_node->y;
}
top = x_node->y;
goto monobound;
}
if (x_node->y == 0)
{
base++;
if (cmp(base, &key) > 0)
{
return 0;
}
top = x_node->y_size - 1;
goto monobound;
}
base += x_node->y;
if (cmp(base, &key) <= 0)
{
if (cmp(base + 1, &key) > 0)
{
goto end;
}
base++;
top = x_node->y_size - x_node->y - 1;
}
else
{
base--;
if (cmp(base, &key) <= 0)
{
goto end;
}
top = x_node->y - 1;
base = array;
}
monobound:
while (top > 1)
{
mid = top / 2;
if (cmp(base + mid, &key) <= 0)
{
base += mid;
}
top -= mid;
}
end:
top = base - array;
run = x_node->y == top;
return x_node->y = top;
}
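/*
The monobound loop at the end of adaptive_binary_search narrows the range
with a fixed number of steps and no early exit. The following is a minimal
standalone sketch of just that loop. It assumes int keys and plain
comparisons in place of cmp. monobound_search is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical helper, not part of gridsort: returns the index of the last
// element <= key in a sorted array of nmemb > 0 elements. array[0] is never
// compared, so 0 is also returned when every element is greater than key.
size_t monobound_search(const int *array, size_t nmemb, int key)
{
    assert(array != NULL && nmemb > 0);

    const int *base = array;
    size_t top = nmemb, mid;

    while (top > 1)
    {
        mid = top / 2;

        if (base[mid] <= key)
        {
            base += mid; // the answer lies at or after base + mid
        }
        top -= mid;      // the range shrinks by mid either way
    }
    return (size_t) (base - array);
}
```

Returning 0 for keys below the first element suits the grid, where such a key
still belongs in the first bucket.
*/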
void FUNC(insert_y_node)(STRUCT(x_node) *x_node, size_t y)
{
size_t end = ++x_node->y_size;
if (x_node->y_size % BSC_Z == 0)
{
x_node->y_base = (VAR *) realloc(x_node->y_base, (x_node->y_size + BSC_Z) * sizeof(VAR));
x_node->y_axis = (STRUCT(y_node) **) realloc(x_node->y_axis, (x_node->y_size + BSC_Z) * sizeof(STRUCT(y_node) *));
}
while (y < --end)
{
x_node->y_axis[end] = x_node->y_axis[end - 1];
x_node->y_base[end] = x_node->y_base[end - 1];
}
x_node->y_axis[y] = (STRUCT(y_node) *) malloc(sizeof(STRUCT(y_node)));
x_node->y_axis[y]->z_axis1 = (VAR *) malloc(BSC_Z * sizeof(VAR));
x_node->y_axis[y]->z_axis2 = (VAR *) malloc(BSC_Z * sizeof(VAR));
}
void FUNC(split_y_node)(STRUCT(x_node) *x_node, size_t y1, size_t y2, CMPFUNC *cmp)
{
STRUCT(y_node) *y_node1, *y_node2;
FUNC(insert_y_node)(x_node, y2);
y_node1 = x_node->y_axis[y1];
y_node2 = x_node->y_axis[y2];
FUNC(twin_merge)(x_node, y_node1, y_node2, cmp);
y_node1->z_size = y_node2->z_size = 0;
x_node->y_base[y1] = y_node1->z_axis1[0];
x_node->y_base[y2] = y_node2->z_axis1[0];
}
void FUNC(insert_z_node)(STRUCT(x_node) *x_node, VAR key, CMPFUNC *cmp)
{
STRUCT(y_node) *y_node;
size_t y;
y = FUNC(adaptive_binary_search)(x_node, x_node->y_base, key, cmp);
y_node = x_node->y_axis[y];
y_node->z_axis2[y_node->z_size++] = key;
if (y_node->z_size == BSC_Z)
{
FUNC(split_y_node)(x_node, y, y + 1, cmp);
}
}
/////////////////////////////////////////////////////////////////////////////
//┌───────────────────────────────────────────────────────────────────────┐//
//│ ██████┐ ██████┐ ██████┐██████┐ ███████┐ ██████┐ ██████┐ ████████┐ │//
//│ ██┌────┘ ██┌──██┐└─██┌─┘██┌──██┐██┌────┘██┌───██┐██┌──██┐└──██┌──┘ │//
//│ ██│ ███┐██████┌┘ ██│ ██│ ██│███████┐██│ ██│██████┌┘ ██│ │//
//│ ██│ ██│██┌──██┐ ██│ ██│ ██│└────██│██│ ██│██┌──██┐ ██│ │//
//│ └██████┌┘██│ ██│██████┐██████┌┘███████│└██████┌┘██│ ██│ ██│ │//
//│ └─────┘ └─┘ └─┘└─────┘└─────┘ └──────┘ └─────┘ └─┘ └─┘ └─┘ │//
//└───────────────────────────────────────────────────────────────────────┘//
/////////////////////////////////////////////////////////////////////////////
void FUNC(gridsort)(void *array, size_t nmemb, size_t size, CMPFUNC *cmp)
{
size_t cnt = nmemb;
VAR *pta = (VAR *) array;
STRUCT(x_node) *grid = FUNC(create_grid)(pta, cnt, cmp);
pta += BSC_Z * 2;
cnt -= BSC_Z * 2;
while (cnt--)
{
FUNC(insert_z_node)(grid, *pta++, cmp);
}
FUNC(destroy_grid)(grid, (VAR *) array, cmp);
}
================================================
FILE: src/gridsort.h
================================================
// gridsort 1.2.1.3 - Igor van den Hoven ivdhoven@gmail.com
#ifndef GRIDSORT_H
#define GRIDSORT_H
//#define cmp(a,b) (*(a) > *(b))
#ifndef QUADSORT_H
#include "quadsort.h"
#endif
#include <stdlib.h>
#include <stdio.h>
#include <assert.h>
#include <errno.h>
typedef int CMPFUNC (const void *a, const void *b);
#define BSC_X 32
#define BSC_Y 2
size_t BSC_Z;
//////////////////////////////////////////////////////////
//┌────────────────────────────────────────────────────┐//
//│ █████┐ ██████┐ ██████┐████████┐ │//
//│ ██┌──██┐ ██┌──██┐└─██┌─┘└──██┌──┘ │//
//│ └█████┌┘ ██████┌┘ ██│ ██│ │//
//│ ██┌──██┐ ██┌──██┐ ██│ ██│ │//
//│ └█████┌┘ ██████┌┘██████┐ ██│ │//
//│ └────┘ └─────┘ └─────┘ └─┘ │//
//└────────────────────────────────────────────────────┘//
//////////////////////////////////////////////////////////
#undef VAR
#undef FUNC
#undef STRUCT
#define VAR char
#define FUNC(NAME) NAME##8
#define STRUCT(NAME) struct NAME##8
#include "gridsort.c"
//////////////////////////////////////////////////////////
//┌────────────────────────────────────────────────────┐//
//│ ▄██┐ █████┐ ██████┐ ██████┐████████┐│//
//│ ████│ ██┌───┘ ██┌──██┐└─██┌─┘└──██┌──┘│//
//│ └─██│ ██████┐ ██████┌┘ ██│ ██│ │//
//│ ██│ ██┌──██┐ ██┌──██┐ ██│ ██│ │//
//│ ██████┐└█████┌┘ ██████┌┘██████┐ ██│ │//
//│ └─────┘ └────┘ └─────┘ └─────┘ └─┘ │//
//└────────────────────────────────────────────────────┘//
//////////////////////////////////////////////////////////
#undef VAR
#undef FUNC
#undef STRUCT
#define VAR short
#define FUNC(NAME) NAME##16
#define STRUCT(NAME) struct NAME##16
#include "gridsort.c"
//////////////////////////////////////////////////////////
// ┌───────────────────────────────────────────────────┐//
// │ ██████┐ ██████┐ ██████┐ ██████┐████████┐ │//
// │ └────██┐└────██┐ ██┌──██┐└─██┌─┘└──██┌──┘ │//
// │ █████┌┘ █████┌┘ ██████┌┘ ██│ ██│ │//
// │ └───██┐██┌───┘ ██┌──██┐ ██│ ██│ │//
// │ ██████┌┘███████┐ ██████┌┘██████┐ ██│ │//
// │ └─────┘ └──────┘ └─────┘ └─────┘ └─┘ │//
// └───────────────────────────────────────────────────┘//
//////////////////////////////////////////////////////////
#undef VAR
#undef FUNC
#undef STRUCT
#define VAR int
#define FUNC(NAME) NAME##32
#define STRUCT(NAME) struct NAME##32
#include "gridsort.c"
//////////////////////////////////////////////////////////
// ┌───────────────────────────────────────────────────┐//
// │ █████┐ ██┐ ██┐ ██████┐ ██████┐████████┐ │//
// │ ██┌───┘ ██│ ██│ ██┌──██┐└─██┌─┘└──██┌──┘ │//
// │ ██████┐ ███████│ ██████┌┘ ██│ ██│ │//
// │ ██┌──██┐└────██│ ██┌──██┐ ██│ ██│ │//
// │ └█████┌┘ ██│ ██████┌┘██████┐ ██│ │//
// │ └────┘ └─┘ └─────┘ └─────┘ └─┘ │//
// └───────────────────────────────────────────────────┘//
//////////////////////////////////////////////////////////
#undef VAR
#undef FUNC
#undef STRUCT
#define VAR long long
#define FUNC(NAME) NAME##64
#define STRUCT(NAME) struct NAME##64
#include "gridsort.c"
//////////////////////////////////////////////////////////
//┌────────────────────────────────────────────────────┐//
//│ ▄██┐ ██████┐ █████┐ ██████┐ ██████┐████████┐ │//
//│ ████│ └────██┐██┌──██┐ ██┌──██┐└─██┌─┘└──██┌──┘ │//
//│ └─██│ █████┌┘└█████┌┘ ██████┌┘ ██│ ██│ │//
//│ ██│ ██┌───┘ ██┌──██┐ ██┌──██┐ ██│ ██│ │//
//│ ██████┐███████┐└█████┌┘ ██████┌┘██████┐ ██│ │//
//│ └─────┘└──────┘ └────┘ └─────┘ └─────┘ └─┘ │//
//└────────────────────────────────────────────────────┘//
//////////////////////////////////////////////////////////
#undef VAR
#undef FUNC
#undef STRUCT
#define VAR long double
#define FUNC(NAME) NAME##128
#define STRUCT(NAME) struct NAME##128
#include "gridsort.c"
/////////////////////////////////////////////////////////////////////////////
//┌───────────────────────────────────────────────────────────────────────┐//
//│ ██████┐ ██████┐ ██████┐██████┐ ███████┐ ██████┐ ██████┐ ████████┐ │//
//│ ██┌────┘ ██┌──██┐└─██┌─┘██┌──██┐██┌────┘██┌───██┐██┌──██┐└──██┌──┘ │//
//│ ██│ ███┐██████┌┘ ██│ ██│ ██│███████┐██│ ██│██████┌┘ ██│ │//
//│ ██│ ██│██┌──██┐ ██│ ██│ ██│└────██│██│ ██│██┌──██┐ ██│ │//
//│ └██████┌┘██│ ██│██████┐██████┌┘███████│└██████┌┘██│ ██│ ██│ │//
//│ └─────┘ └─┘ └─┘└─────┘└─────┘ └──────┘ └─────┘ └─┘ └─┘ └─┘ │//
//└───────────────────────────────────────────────────────────────────────┘//
/////////////////////////////////////////////////////////////////////////////
void gridsort(void *array, size_t nmemb, size_t size, CMPFUNC *cmp)
{
if (nmemb < BSC_X * BSC_X)
{
quadsort(array, nmemb, size, cmp);
return;
}
switch (size)
{
case sizeof(char):
gridsort8(array, nmemb, size, cmp);
return;
case sizeof(short):
gridsort16(array, nmemb, size, cmp);
return;
case sizeof(int):
gridsort32(array, nmemb, size, cmp);
return;
case sizeof(long long):
gridsort64(array, nmemb, size, cmp);
return;
case sizeof(long double):
gridsort128(array, nmemb, size, cmp);
return;
default:
assert(size == sizeof(char) || size == sizeof(short) || size == sizeof(int) || size == sizeof(long long) || size == sizeof(long double));
}
}
#undef VAR
#undef FUNC
#undef STRUCT
#endif
================================================
FILE: src/quadsort.c
================================================
// quadsort 1.2.1.3 - Igor van den Hoven ivdhoven@gmail.com
// the next seven functions are used for sorting 0 to 31 elements
void FUNC(parity_swap_four)(VAR *array, CMPFUNC *cmp)
{
VAR tmp, *pta = array;
size_t x;
branchless_swap(pta, tmp, x, cmp); pta += 2;
branchless_swap(pta, tmp, x, cmp); pta--;
if (cmp(pta, pta + 1) > 0)
{
tmp = pta[0]; pta[0] = pta[1]; pta[1] = tmp; pta--;
branchless_swap(pta, tmp, x, cmp); pta += 2;
branchless_swap(pta, tmp, x, cmp); pta--;
branchless_swap(pta, tmp, x, cmp);
}
}
void FUNC(parity_swap_five)(VAR *array, CMPFUNC *cmp)
{
VAR tmp, *pta = array;
size_t x, y;
branchless_swap(pta, tmp, x, cmp); pta += 2;
branchless_swap(pta, tmp, x, cmp); pta -= 1;
branchless_swap(pta, tmp, x, cmp); pta += 2;
branchless_swap(pta, tmp, y, cmp); pta = array;
if (x + y)
{
branchless_swap(pta, tmp, x, cmp); pta += 2;
branchless_swap(pta, tmp, x, cmp); pta -= 1;
branchless_swap(pta, tmp, x, cmp); pta += 2;
branchless_swap(pta, tmp, x, cmp); pta = array;
branchless_swap(pta, tmp, x, cmp); pta += 2;
branchless_swap(pta, tmp, x, cmp); pta -= 1;
}
}
void FUNC(parity_swap_six)(VAR *array, VAR *swap, CMPFUNC *cmp)
{
VAR tmp, *pta = array, *ptl, *ptr;
size_t x, y;
branchless_swap(pta, tmp, x, cmp); pta++;
branchless_swap(pta, tmp, x, cmp); pta += 3;
branchless_swap(pta, tmp, x, cmp); pta--;
branchless_swap(pta, tmp, x, cmp); pta = array;
if (cmp(pta + 2, pta + 3) <= 0)
{
branchless_swap(pta, tmp, x, cmp); pta += 4;
branchless_swap(pta, tmp, x, cmp);
return;
}
x = cmp(pta, pta + 1) > 0; y = !x; swap[0] = pta[x]; swap[1] = pta[y]; swap[2] = pta[2]; pta += 4;
x = cmp(pta, pta + 1) > 0; y = !x; swap[4] = pta[x]; swap[5] = pta[y]; swap[3] = pta[-1];
pta = array; ptl = swap; ptr = swap + 3;
head_branchless_merge(pta, x, ptl, ptr, cmp);
head_branchless_merge(pta, x, ptl, ptr, cmp);
head_branchless_merge(pta, x, ptl, ptr, cmp);
pta = array + 5; ptl = swap + 2; ptr = swap + 5;
tail_branchless_merge(pta, y, ptl, ptr, cmp);
tail_branchless_merge(pta, y, ptl, ptr, cmp);
*pta = cmp(ptl, ptr) > 0 ? *ptl : *ptr;
}
void FUNC(parity_swap_seven)(VAR *array, VAR *swap, CMPFUNC *cmp)
{
VAR tmp, *pta = array, *ptl, *ptr;
size_t x, y;
branchless_swap(pta, tmp, x, cmp); pta += 2;
branchless_swap(pta, tmp, x, cmp); pta += 2;
branchless_swap(pta, tmp, x, cmp); pta -= 3;
branchless_swap(pta, tmp, y, cmp); pta += 2;
branchless_swap(pta, tmp, x, cmp); pta += 2; y += x;
branchless_swap(pta, tmp, x, cmp); pta -= 1; y += x;
if (y == 0) return;
branchless_swap(pta, tmp, x, cmp); pta = array;
x = cmp(pta, pta + 1) > 0; swap[0] = pta[x]; swap[1] = pta[!x]; swap[2] = pta[2]; pta += 3;
x = cmp(pta, pta + 1) > 0; swap[3] = pta[x]; swap[4] = pta[!x]; pta += 2;
x = cmp(pta, pta + 1) > 0; swap[5] = pta[x]; swap[6] = pta[!x];
pta = array; ptl = swap; ptr = swap + 3;
head_branchless_merge(pta, x, ptl, ptr, cmp);
head_branchless_merge(pta, x, ptl, ptr, cmp);
head_branchless_merge(pta, x, ptl, ptr, cmp);
pta = array + 6; ptl = swap + 2; ptr = swap + 6;
tail_branchless_merge(pta, y, ptl, ptr, cmp);
tail_branchless_merge(pta, y, ptl, ptr, cmp);
tail_branchless_merge(pta, y, ptl, ptr, cmp);
*pta = cmp(ptl, ptr) > 0 ? *ptl : *ptr;
}
// sorts 0 to 7 elements; larger counts fall through the switch unchanged
void FUNC(tiny_sort)(VAR *array, VAR *swap, size_t nmemb, CMPFUNC *cmp)
{
VAR tmp;
size_t x;
switch (nmemb)
{
case 0:
case 1:
return;
case 2:
branchless_swap(array, tmp, x, cmp);
return;
case 3:
branchless_swap(array, tmp, x, cmp); array++;
branchless_swap(array, tmp, x, cmp); array--;
branchless_swap(array, tmp, x, cmp);
return;
case 4:
FUNC(parity_swap_four)(array, cmp);
return;
case 5:
FUNC(parity_swap_five)(array, cmp);
return;
case 6:
FUNC(parity_swap_six)(array, swap, cmp);
return;
case 7:
FUNC(parity_swap_seven)(array, swap, cmp);
return;
}
}
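// merges the sorted runs from[0..left) and from[left..left+right) into dest,
// working from both ends at once; left must equal right or be one smaller than right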
void FUNC(parity_merge)(VAR *dest, VAR *from, size_t left, size_t right, CMPFUNC *cmp)
{
VAR *ptl, *ptr, *tpl, *tpr, *tpd, *ptd;
#if !defined __clang__
size_t x, y;
#endif
ptl = from;
ptr = from + left;
ptd = dest;
tpl = ptr - 1;
tpr = tpl + right;
tpd = dest + left + right - 1;
if (left < right)
{
*ptd++ = cmp(ptl, ptr) <= 0 ? *ptl++ : *ptr++;
}
*ptd++ = cmp(ptl, ptr) <= 0 ? *ptl++ : *ptr++;
#if !defined cmp && !defined __clang__ // cache limit workaround for gcc
if (left > QUAD_CACHE)
{
while (--left)
{
*ptd++ = cmp(ptl, ptr) <= 0 ? *ptl++ : *ptr++;
*tpd-- = cmp(tpl, tpr) > 0 ? *tpl-- : *tpr--;
}
}
else
#endif
{
while (--left)
{
head_branchless_merge(ptd, x, ptl, ptr, cmp);
tail_branchless_merge(tpd, y, tpl, tpr, cmp);
}
}
*tpd = cmp(tpl, tpr) > 0 ? *tpl : *tpr;
}
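// sorts nmemb elements by recursively sorting four quarters and combining them
// with three parity merges; swap must hold at least nmemb elements
void FUNC(tail_swap)(VAR *array, VAR *swap, size_t nmemb, CMPFUNC *cmp)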
{
if (nmemb < 8)
{
FUNC(tiny_sort)(array, swap, nmemb, cmp);
return;
}
size_t quad1, quad2, quad3, quad4, half1, half2;
half1 = nmemb / 2;
quad1 = half1 / 2;
quad2 = half1 - quad1;
half2 = nmemb - half1;
quad3 = half2 / 2;
quad4 = half2 - quad3;
VAR *pta = array;
FUNC(tail_swap)(pta, swap, quad1, cmp); pta += quad1;
FUNC(tail_swap)(pta, swap, quad2, cmp); pta += quad2;
FUNC(tail_swap)(pta, swap, quad3, cmp); pta += quad3;
FUNC(tail_swap)(pta, swap, quad4, cmp);
if (cmp(array + quad1 - 1, array + quad1) <= 0 && cmp(array + half1 - 1, array + half1) <= 0 && cmp(pta - 1, pta) <= 0)
{
return;
}
FUNC(parity_merge)(swap, array, quad1, quad2, cmp);
FUNC(parity_merge)(swap + half1, array + half1, quad3, quad4, cmp);
FUNC(parity_merge)(array, swap, half1, half2, cmp);
}
// the next three functions create sorted blocks of 32 elements
// reverses the elements in the inclusive range [pta, ptz]
void FUNC(quad_reversal)(VAR *pta, VAR *ptz)
{
VAR *ptb, *pty, tmp1, tmp2;
size_t loop = (ptz - pta) / 2;
ptb = pta + loop;
pty = ptz - loop;
if (loop % 2 == 0)
{
tmp2 = *ptb; *ptb-- = *pty; *pty++ = tmp2; loop--;
}
loop /= 2;
do
{
tmp1 = *pta; *pta++ = *ptz; *ptz-- = tmp1;
tmp2 = *ptb; *ptb-- = *pty; *pty++ = tmp2;
}
while (loop--);
}
void FUNC(quad_swap_merge)(VAR *array, VAR *swap, CMPFUNC *cmp)
{
VAR *pts, *ptl, *ptr;
#if !defined __clang__
size_t x;
#endif
parity_merge_two(array + 0, swap + 0, x, ptl, ptr, pts, cmp);
parity_merge_two(array + 4, swap + 4, x, ptl, ptr, pts, cmp);
parity_merge_four(swap, array, x, ptl, ptr, pts, cmp);
}
void FUNC(tail_merge)(VAR *array, VAR *swap, size_t swap_size, size_t nmemb, size_t block, CMPFUNC *cmp);
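// sorts the array into blocks of 32 elements (the final block may be shorter),
// reversing strictly descending runs in place; returns 1 if the whole array was
// one descending run and is now fully sorted, otherwise 0
size_t FUNC(quad_swap)(VAR *array, size_t nmemb, CMPFUNC *cmp)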
{
VAR tmp, swap[32];
size_t count;
VAR *pta, *pts;
unsigned char v1, v2, v3, v4, x;
pta = array;
count = nmemb / 8;
while (count--)
{
v1 = cmp(pta + 0, pta + 1) > 0;
v2 = cmp(pta + 2, pta + 3) > 0;
v3 = cmp(pta + 4, pta + 5) > 0;
v4 = cmp(pta + 6, pta + 7) > 0;
switch (v1 + v2 * 2 + v3 * 4 + v4 * 8)
{
case 0:
if (cmp(pta + 1, pta + 2) <= 0 && cmp(pta + 3, pta + 4) <= 0 && cmp(pta + 5, pta + 6) <= 0)
{
goto ordered;
}
FUNC(quad_swap_merge)(pta, swap, cmp);
break;
case 15:
if (cmp(pta + 1, pta + 2) > 0 && cmp(pta + 3, pta + 4) > 0 && cmp(pta + 5, pta + 6) > 0)
{
pts = pta;
goto reversed;
}
default:
not_ordered:
x = !v1; tmp = pta[x]; pta[0] = pta[v1]; pta[1] = tmp; pta += 2;
x = !v2; tmp = pta[x]; pta[0] = pta[v2]; pta[1] = tmp; pta += 2;
x = !v3; tmp = pta[x]; pta[0] = pta[v3]; pta[1] = tmp; pta += 2;
x = !v4; tmp = pta[x]; pta[0] = pta[v4]; pta[1] = tmp; pta -= 6;
FUNC(quad_swap_merge)(pta, swap, cmp);
}
pta += 8;
continue;
ordered:
pta += 8;
if (count--)
{
if ((v1 = cmp(pta + 0, pta + 1) > 0) | (v2 = cmp(pta + 2, pta + 3) > 0) | (v3 = cmp(pta + 4, pta + 5) > 0) | (v4 = cmp(pta + 6, pta + 7) > 0))
{
if (v1 + v2 + v3 + v4 == 4 && cmp(pta + 1, pta + 2) > 0 && cmp(pta + 3, pta + 4) > 0 && cmp(pta + 5, pta + 6) > 0)
{
pts = pta;
goto reversed;
}
goto not_ordered;
}
if (cmp(pta + 1, pta + 2) <= 0 && cmp(pta + 3, pta + 4) <= 0 && cmp(pta + 5, pta + 6) <= 0)
{
goto ordered;
}
FUNC(quad_swap_merge)(pta, swap, cmp);
pta += 8;
continue;
}
break;
reversed:
pta += 8;
if (count--)
{
if ((v1 = cmp(pta + 0, pta + 1) <= 0) | (v2 = cmp(pta + 2, pta + 3) <= 0) | (v3 = cmp(pta + 4, pta + 5) <= 0) | (v4 = cmp(pta + 6, pta + 7) <= 0))
{
// not reversed
}
else
{
if (cmp(pta - 1, pta) > 0 && cmp(pta + 1, pta + 2) > 0 && cmp(pta + 3, pta + 4) > 0 && cmp(pta + 5, pta + 6) > 0)
{
goto reversed;
}
}
FUNC(quad_reversal)(pts, pta - 1);
if (v1 + v2 + v3 + v4 == 4 && cmp(pta + 1, pta + 2) <= 0 && cmp(pta + 3, pta + 4) <= 0 && cmp(pta + 5, pta + 6) <= 0)
{
goto ordered;
}
if (v1 + v2 + v3 + v4 == 0 && cmp(pta + 1, pta + 2) > 0 && cmp(pta + 3, pta + 4) > 0 && cmp(pta + 5, pta + 6) > 0)
{
pts = pta;
goto reversed;
}
x = !v1; tmp = pta[v1]; pta[0] = pta[x]; pta[1] = tmp; pta += 2;
x = !v2; tmp = pta[v2]; pta[0] = pta[x]; pta[1] = tmp; pta += 2;
x = !v3; tmp = pta[v3]; pta[0] = pta[x]; pta[1] = tmp; pta += 2;
x = !v4; tmp = pta[v4]; pta[0] = pta[x]; pta[1] = tmp; pta -= 6;
if (cmp(pta + 1, pta + 2) > 0 || cmp(pta + 3, pta + 4) > 0 || cmp(pta + 5, pta + 6) > 0)
{
FUNC(quad_swap_merge)(pta, swap, cmp);
}
pta += 8;
continue;
}
switch (nmemb % 8)
{
case 7: if (cmp(pta + 5, pta + 6) <= 0) break;
case 6: if (cmp(pta + 4, pta + 5) <= 0) break;
case 5: if (cmp(pta + 3, pta + 4) <= 0) break;
case 4: if (cmp(pta + 2, pta + 3) <= 0) break;
case 3: if (cmp(pta + 1, pta + 2) <= 0) break;
case 2: if (cmp(pta + 0, pta + 1) <= 0) break;
case 1: if (cmp(pta - 1, pta + 0) <= 0) break;
case 0:
FUNC(quad_reversal)(pts, pta + nmemb % 8 - 1);
if (pts == array)
{
return 1;
}
goto reverse_end;
}
FUNC(quad_reversal)(pts, pta - 1);
break;
}
FUNC(tail_swap)(pta, swap, nmemb % 8, cmp);
reverse_end:
pta = array;
for (count = nmemb / 32 ; count-- ; pta += 32)
{
if (cmp(pta + 7, pta + 8) <= 0 && cmp(pta + 15, pta + 16) <= 0 && cmp(pta + 23, pta + 24) <= 0)
{
continue;
}
FUNC(parity_merge)(swap, pta, 8, 8, cmp);
FUNC(parity_merge)(swap + 16, pta + 16, 8, 8, cmp);
FUNC(parity_merge)(pta, swap, 16, 16, cmp);
}
if (nmemb % 32 > 8)
{
FUNC(tail_merge)(pta, swap, 32, nmemb % 32, 8, cmp);
}
return 0;
}
// The next six functions are quad merge support routines
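// merges the sorted runs from[0..left) and from[left..left+right) into dest,
// copying 8 elements at a time where one run is entirely ahead of the other
void FUNC(cross_merge)(VAR *dest, VAR *from, size_t left, size_t right, CMPFUNC *cmp)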
{
VAR *ptl, *tpl, *ptr, *tpr, *ptd, *tpd;
size_t loop;
#if !defined __clang__
size_t x, y;
#endif
ptl = from;
ptr = from + left;
tpl = ptr - 1;
tpr = tpl + right;
if (left + 1 >= right && right >= left && left >= 32)
{
if (cmp(ptl + 15, ptr) > 0 && cmp(ptl, ptr + 15) <= 0 && cmp(tpl, tpr - 15) > 0 && cmp(tpl - 15, tpr) <= 0)
{
FUNC(parity_merge)(dest, from, left, right, cmp);
return;
}
}
ptd = dest;
tpd = dest + left + right - 1;
while (1)
{
if (tpl - ptl > 8)
{
ptl8_ptr: if (cmp(ptl + 7, ptr) <= 0)
{
memcpy(ptd, ptl, 8 * sizeof(VAR)); ptd += 8; ptl += 8;
if (tpl - ptl > 8) {goto ptl8_ptr;} continue;
}
tpl8_tpr: if (cmp(tpl - 7, tpr) > 0)
{
tpd -= 7; tpl -= 7; memcpy(tpd--, tpl--, 8 * sizeof(VAR));
if (tpl - ptl > 8) {goto tpl8_tpr;} continue;
}
}
if (tpr - ptr > 8)
{
ptl_ptr8: if (cmp(ptl, ptr + 7) > 0)
{
memcpy(ptd, ptr, 8 * sizeof(VAR)); ptd += 8; ptr += 8;
if (tpr - ptr > 8) {goto ptl_ptr8;} continue;
}
tpl_tpr8: if (cmp(tpl, tpr - 7) <= 0)
{
tpd -= 7; tpr -= 7; memcpy(tpd--, tpr--, 8 * sizeof(VAR));
if (tpr - ptr > 8) {goto tpl_tpr8;} continue;
}
}
if (tpd - ptd < 16)
{
break;
}
#if !defined cmp && !defined __clang__
if (left > QUAD_CACHE)
{
loop = 8; do
{
*ptd++ = cmp(ptl, ptr) <= 0 ? *ptl++ : *ptr++;
*tpd-- = cmp(tpl, tpr) > 0 ? *tpl-- : *tpr--;
}
while (--loop);
}
else
#endif
{
loop = 8; do
{
head_branchless_merge(ptd, x, ptl, ptr, cmp);
tail_branchless_merge(tpd, y, tpl, tpr, cmp);
}
while (--loop);
}
}
while (ptl <= tpl && ptr <= tpr)
{
*ptd++ = cmp(ptl, ptr) <= 0 ? *ptl++ : *ptr++;
}
while (ptl <= tpl)
{
*ptd++ = *ptl++;
}
while (ptr <= tpr)
{
*ptd++ = *ptr++;
}
}
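// merges four adjacent sorted blocks of `block` elements in array into one sorted run,
// using swap as the intermediate buffer and skipping merges for pairs already in order
void FUNC(quad_merge_block)(VAR *array, VAR *swap, size_t block, CMPFUNC *cmp)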
{
VAR *pt1, *pt2, *pt3;
size_t block_x_2 = block * 2;
pt1 = array + block;
pt2 = pt1 + block;
pt3 = pt2 + block;
switch ((cmp(pt1 - 1, pt1) <= 0) | (cmp(pt3 - 1, pt3) <= 0) * 2)
{
case 0:
FUNC(cross_merge)(swap, array, block, block, cmp);
FUNC(cross_merge)(swap + block_x_2, pt2, block, block, cmp);
break;
case 1:
memcpy(swap, array, block_x_2 * sizeof(VAR));
FUNC(cross_merge)(swap + block_x_2, pt2, block, block, cmp);
break;
case 2:
FUNC(cross_merge)(swap, array, block, block, cmp);
memcpy(swap + block_x_2, pt2, block_x_2 * sizeof(VAR));
break;
case 3:
if (cmp(pt2 - 1, pt2) <= 0)
return;
memcpy(swap, array, block_x_2 * 2 * sizeof(VAR));
}
FUNC(cross_merge)(array, swap, block_x_2, block_x_2, cmp);
}
size_t FUNC(quad_merge)(VAR *array, VAR *swap, size_t swap_size, size_t nmemb, size_t block, CMPFUNC *cmp)
{
VAR *pta, *pte;
pte = array + nmemb;
block *= 4;
while (block <= nmemb && block <= swap_size)
{
pta = array;
do
{
FUNC(quad_merge_block)(pta, swap, block / 4, cmp);
pta += block;
}
while (pta + block <= pte);
FUNC(tail_merge)(pta, swap, swap_size, pte - pta, block / 4, cmp);
block *= 4;
}
FUNC(tail_merge)(array, swap, swap_size, nmemb, block / 4, cmp);
return block / 2;
}
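// merges array[0..block) with array[block..nmemb) by copying the left run into swap
// and merging forward; swap must hold at least block elements
void FUNC(partial_forward_merge)(VAR *array, VAR *swap, size_t swap_size, size_t nmemb, size_t block, CMPFUNC *cmp)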
{
VAR *ptl, *ptr, *tpl, *tpr;
size_t x;
if (nmemb == block)
{
return;
}
ptr = array + block;
tpr = array + nmemb - 1;
if (cmp(ptr - 1, ptr) <= 0)
{
return;
}
memcpy(swap, array, block * sizeof(VAR));
ptl = swap;
tpl = swap + block - 1;
while (ptl < tpl - 1 && ptr < tpr - 1)
{
ptr2: if (cmp(ptl, ptr + 1) > 0)
{
*array++ = *ptr++; *array++ = *ptr++;
if (ptr < tpr - 1) {goto ptr2;} break;
}
if (cmp(ptl + 1, ptr) <= 0)
{
*array++ = *ptl++; *array++ = *ptl++;
if (ptl < tpl - 1) {goto ptl2;} break;
}
goto cross_swap;
ptl2: if (cmp(ptl + 1, ptr) <= 0)
{
*array++ = *ptl++; *array++ = *ptl++;
if (ptl < tpl - 1) {goto ptl2;} break;
}
if (cmp(ptl, ptr + 1) > 0)
{
*array++ = *ptr++; *array++ = *ptr++;
if (ptr < tpr - 1) {goto ptr2;} break;
}
cross_swap:
x = cmp(ptl, ptr) <= 0; array[x] = *ptr; ptr += 1; array[!x] = *ptl; ptl += 1; array += 2;
head_branchless_merge(array, x, ptl, ptr, cmp);
}
while (ptl <= tpl && ptr <= tpr)
{
*array++ = cmp(ptl, ptr) <= 0 ? *ptl++ : *ptr++;
}
while (ptl <= tpl)
{
*array++ = *ptl++;
}
}
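// merges array[0..block) with array[block..nmemb) by copying the right run into swap
// and merging backward; when swap can hold all nmemb elements and the right run has
// at least 64 elements, it uses cross_merge instead
void FUNC(partial_backward_merge)(VAR *array, VAR *swap, size_t swap_size, size_t nmemb, size_t block, CMPFUNC *cmp)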
{
VAR *tpl, *tpa, *tpr;
size_t right, loop, x;
if (nmemb == block)
{
return;
}
tpl = array + block - 1;
tpa = array + nmemb - 1;
if (cmp(tpl, tpl + 1) <= 0)
{
return;
}
right = nmemb - block;
if (nmemb <= swap_size && right >= 64)
{
FUNC(cross_merge)(swap, array, block, right, cmp);
memcpy(array, swap, nmemb * sizeof(VAR));
return;
}
memcpy(swap, array + block, right * sizeof(VAR));
tpr = swap + right - 1;
while (tpl > array + 16 && tpr > swap + 16)
{
tpl_tpr16: if (cmp(tpl, tpr - 15) <= 0)
{
loop = 16; do *tpa-- = *tpr--; while (--loop);
if (tpr > swap + 16) {goto tpl_tpr16;} break;
}
tpl16_tpr: if (cmp(tpl - 15, tpr) > 0)
{
loop = 16; do *tpa-- = *tpl--; while (--loop);
if (tpl > array + 16) {goto tpl16_tpr;} break;
}
loop = 8; do
{
if (cmp(tpl, tpr - 1) <= 0)
{
*tpa-- = *tpr--; *tpa-- = *tpr--;
}
else if (cmp(tpl - 1, tpr) > 0)
{
*tpa-- = *tpl--; *tpa-- = *tpl--;
}
else
{
x = cmp(tpl, tpr) <= 0; tpa--; tpa[x] = *tpr; tpr -= 1; tpa[!x] = *tpl; tpl -= 1; tpa--;
tail_branchless_merge(tpa, x, tpl, tpr, cmp);
}
}
while (--loop);
}
while (tpr > swap + 1 && tpl > array + 1)
{
tpr2: if (cmp(tpl, tpr - 1) <= 0)
{
*tpa-- = *tpr--; *tpa-- = *tpr--;
if (tpr > swap + 1) {goto tpr2;} break;
}
if (cmp(tpl - 1, tpr) > 0)
{
*tpa-- = *tpl--; *tpa-- = *tpl--;
if (tpl > array + 1) {goto tpl2;} break;
}
goto cross_swap;
tpl2: if (cmp(tpl - 1, tpr) > 0)
{
*tpa-- = *tpl--; *tpa-- = *tpl--;
if (tpl > array + 1) {goto tpl2;} break;
}
if (cmp(tpl, tpr - 1) <= 0)
{
*tpa-- = *tpr--; *tpa-- = *tpr--;
if (tpr > swap + 1) {goto tpr2;} break;
}
cross_swap:
x = cmp(tpl, tpr) <= 0; tpa--; tpa[x] = *tpr; tpr -= 1; tpa[!x] = *tpl; tpl -= 1; tpa--;
tail_branchless_merge(tpa, x, tpl, tpr, cmp);
}
while (tpr >= swap && tpl >= array)
{
*tpa-- = cmp(tpl, tpr) > 0 ? *tpl-- : *tpr--;
}
while (tpr >= swap)
{
*tpa-- = *tpr--;
}
}
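// bottom-up merge: repeatedly merges adjacent sorted runs of `block` elements,
// doubling block until it reaches nmemb or exceeds swap_size
void FUNC(tail_merge)(VAR *array, VAR *swap, size_t swap_size, size_t nmemb, size_t block, CMPFUNC *cmp)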
{
VAR *pta, *pte;
pte = array + nmemb;
while (block < nmemb && block <= swap_size)
{
for (pta = array ; pta + block < pte ; pta += block * 2)
{
if (pta + block * 2 < pte)
{
FUNC(partial_backward_merge)(pta, swap, swap_size, block * 2, block, cmp);
continue;
}
FUNC(partial_backward_merge)(pta, swap, swap_size, pte - pta, block, cmp);
break;
}
block *= 2;
}
}
// the next four functions provide in-place rotate merge support
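// rotates array[0..nmemb) in place so that the first `left` elements end up
// after the remaining nmemb - left elements
void FUNC(trinity_rotation)(VAR *array, VAR *swap, size_t swap_size, size_t nmemb, size_t left)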
{
VAR temp;
size_t bridge, right = nmemb - left;
if (swap_size > 65536)
{
swap_size = 65536;
}
if (left < right)
{
if (left <= swap_size)
{
memcpy(swap, array, left * sizeof(VAR));
memmove(array, array + left, right * sizeof(VAR));
memcpy(array + right, swap, left * sizeof(VAR));
}
else
{
VAR *pta, *ptb, *ptc, *ptd;
pta = array;
ptb = pta + left;
bridge = right - left;
if (bridge <= swap_size && bridge > 3)
{
ptc = pta + right;
ptd = ptc + left;
memcpy(swap, ptb, bridge * sizeof(VAR));
while (left--)
{
*--ptc = *--ptd; *ptd = *--ptb;
}
memcpy(pta, swap, bridge * sizeof(VAR));
}
else
{
ptc = ptb;
ptd = ptc + right;
bridge = left / 2;
while (bridge--)
{
temp = *--ptb; *ptb = *pta; *pta++ = *ptc; *ptc++ = *--ptd; *ptd = temp;
}
bridge = (ptd - ptc) / 2;
while (bridge--)
{
temp = *ptc; *ptc++ = *--ptd; *ptd = *pta; *pta++ = temp;
}
bridge = (ptd - pta) / 2;
while (bridge--)
{
temp = *pta; *pta++ = *--ptd; *ptd = temp;
}
}
}
}
else if (right < left)
{
if (right <= swap_size)
{
memcpy(swap, array + left, right * sizeof(VAR));
memmove(array + right, array, left * sizeof(VAR));
memcpy(array, swap, right * sizeof(VAR));
}
else
{
VAR *pta, *ptb, *ptc, *ptd;
pta = array;
ptb = pta + left;
bridge = left - right;
if (bridge <= swap_size && bridge > 3)
{
ptc = pta + right;
ptd = ptc + left;
memcpy(swap, ptc, bridge * sizeof(VAR));
while (right--)
{
*ptc++ = *pta; *pta++ = *ptb++;
}
memcpy(ptd - bridge, swap, bridge * sizeof(VAR));
}
else
{
ptc = ptb;
ptd = ptc + right;
bridge = right / 2;
while (bridge--)
{
temp = *--ptb; *ptb = *pta; *pta++ = *ptc; *ptc++ = *--ptd; *ptd = temp;
}
bridge = (ptb - pta) / 2;
while (bridge--)
{
temp = *--ptb; *ptb = *pta; *pta++ = *--ptd; *ptd = temp;
}
bridge = (ptd - pta) / 2;
while (bridge--)
{
temp = *pta; *pta++ = *--ptd; *ptd = temp;
}
}
}
}
else
{
VAR *pta, *ptb;
pta = array;
ptb = pta + left;
while (left--)
{
temp = *pta; *pta++ = *ptb; *ptb++ = temp;
}
}
}
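// lower-bound search over the sorted range array[0..top): returns the index of the
// first element that does not compare less than value; top must be at least 1
size_t FUNC(monobound_binary_first)(VAR *array, VAR *value, size_t top, CMPFUNC *cmp)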
{
VAR *end;
size_t mid;
end = array + top;
while (top > 1)
{
mid = top / 2;
if (cmp(value, end - mid) <= 0)
{
end -= mid;
}
top -= mid;
}
if (cmp(value, end - 1) <= 0)
{
end--;
}
return (end - array);
}
void FUNC(rotate_merge_block)(VAR *array, VAR *swap, size_t swap_size, size_t lblock, size_t right, CMPFUNC *cmp)
{
size_t left, rblock, unbalanced;
if (cmp(array + lblock - 1, array + lblock) <= 0)
{
return;
}
rblock = lblock / 2;
lblock -= rblock;
left = FUNC(monobound_binary_first)(array + lblock + rblock, array + lblock, right, cmp);
right -= left;
// [ lblock ] [ rblock ] [ left ] [ right ]
if (left)
{
if (lblock + left <= swap_size)
{
memcpy(swap, array, lblock * sizeof(VAR));
memcpy(swap + lblock, array + lblock + rblock, left * sizeof(VAR));
memmove(array + lblock + left, array + lblock, rblock * sizeof(VAR));
FUNC(cross_merge)(array, swap, lblock, left, cmp);
}
else
{
FUNC(trinity_rotation)(array + lblock, swap, swap_size, rblock + left, rblock);
unbalanced = (left * 2 < lblock) | (lblock * 2 < left);
if (unbalanced && left <= swap_size)
{
FUNC(partial_backward_merge)(array, swap, swap_size, lblock + left, lblock, cmp);
}
else if (unbalanced && lblock <= swap_size)
{
FUNC(partial_forward_merge)(array, swap, swap_size, lblock + left, lblock, cmp);
}
else
{
FUNC(rotate_merge_block)(array, swa
SYMBOL INDEX (45 symbols across 12 files)
FILE: src/bench.c
  function NO_INLINE (line 104) | NO_INLINE int cmp_int(const void * a, const void * b)
  function NO_INLINE (line 118) | NO_INLINE int cmp_rev(const void * a, const void * b)
  function NO_INLINE (line 128) | NO_INLINE int cmp_stable(const void * a, const void * b)
  function NO_INLINE (line 138) | NO_INLINE int cmp_long(const void * a, const void * b)
  function NO_INLINE (line 149) | NO_INLINE int cmp_float(const void * a, const void * b)
  function NO_INLINE (line 154) | NO_INLINE int cmp_long_double(const void * a, const void * b)
  function NO_INLINE (line 174) | NO_INLINE int cmp_str(const void * a, const void * b)
  function NO_INLINE (line 181) | NO_INLINE int cmp_int_ptr(const void * a, const void * b)
  function NO_INLINE (line 191) | NO_INLINE int cmp_long_ptr(const void * a, const void * b)
  function NO_INLINE (line 201) | NO_INLINE int cmp_long_double_ptr(const void * a, const void * b)
  function NO_INLINE (line 215) | NO_INLINE bool cpp_cmp_int(const int &a, const int &b)
  function NO_INLINE (line 222) | NO_INLINE bool cpp_cmp_str(char const* const a, char const* const b)
  function utime (line 231) | long long utime()
  function seed_rand (line 240) | void seed_rand(unsigned long long seed)
  function test_sort (line 245) | void test_sort(void *array, void *unsorted, void *valid, int minimum, in...
  function validate (line 570) | void validate()
  function bit_reverse (line 664) | unsigned int bit_reverse(unsigned int x)
  function run_test (line 674) | void run_test(void *a_array, void *r_array, void *v_array, int minimum, ...
  function range_test (line 692) | void range_test(int max, int samples, int repetitions, int seed)
  function main (line 778) | int main(int argc, char **argv)
FILE: src/blitsort.c
  function VAR (line 220) | VAR FUNC(blit_binary_median)(VAR *pta, VAR *ptb, size_t len, CMPFUNC *cmp)
  function VAR (line 241) | VAR FUNC(blit_median_of_nine)(VAR *array, VAR *swap, size_t nmemb, CMPFU...
  function VAR (line 274) | VAR FUNC(blit_median_of_cbrt)(VAR *array, VAR *swap, size_t swap_size, s...
FILE: src/blitsort.h
  function blitsort (line 226) | void blitsort(void *array, size_t nmemb, size_t size, CMPFUNC *cmp)
  function blitsort_prim (line 283) | void blitsort_prim(void *array, size_t nmemb, size_t size)
FILE: src/crumsort.c
  function VAR (line 220) | VAR *FUNC(crum_binary_median)(VAR *pta, VAR *ptb, size_t len, CMPFUNC *cmp)
  function VAR (line 229) | VAR *FUNC(crum_median_of_cbrt)(VAR *array, VAR *swap, size_t swap_size, ...
  function VAR (line 270) | VAR *FUNC(crum_median_of_nine)(VAR *array, size_t nmemb, CMPFUNC *cmp)
FILE: src/crumsort.h
  function crumsort (line 227) | void crumsort(void *array, size_t nmemb, size_t size, CMPFUNC *cmp)
  function crumsort_prim (line 284) | void crumsort_prim(void *array, size_t nmemb, size_t size)
FILE: src/fluxsort.c
  function VAR (line 227) | VAR FUNC(binary_median)(VAR *pta, VAR *ptb, size_t len, CMPFUNC *cmp)
  function VAR (line 248) | VAR FUNC(median_of_nine)(VAR *array, size_t nmemb, CMPFUNC *cmp)
  function VAR (line 281) | VAR FUNC(median_of_cbrt)(VAR *array, VAR *swap, VAR *ptx, size_t nmemb, ...
FILE: src/fluxsort.h
  function fluxsort (line 200) | void fluxsort(void *array, size_t nmemb, size_t size, CMPFUNC *cmp)
  function fluxsort_prim (line 241) | void fluxsort_prim(void *array, size_t nmemb, size_t size)
  function fluxsort_size (line 270) | void fluxsort_size(void *array, size_t nmemb, size_t size, CMPFUNC *cmp)
FILE: src/gridsort.c
  function STRUCT (line 3) | STRUCT(x_node)
  function STRUCT (line 12) | STRUCT(y_node)
FILE: src/gridsort.h
  function gridsort (line 140) | void gridsort(void *array, size_t nmemb, size_t size, CMPFUNC *cmp)
FILE: src/quadsort.h
  function quadsort (line 300) | void quadsort(void *array, size_t nmemb, size_t size, CMPFUNC *cmp)
  function quadsort_prim (line 357) | void quadsort_prim(void *array, size_t nmemb, size_t size)
  function quadsort_size (line 386) | void quadsort_size(void *array, size_t nmemb, size_t size, CMPFUNC *cmp)
FILE: src/skipsort.h
  function skipsort (line 160) | void skipsort(void *array, size_t nmemb, size_t size, CMPFUNC *cmp)
FILE: src/wolfsort.h
  function wolfsort (line 202) | void wolfsort(void *array, size_t nmemb, size_t size, CMPFUNC *cmp)
  function wolfsort_prim (line 256) | void wolfsort_prim(void *array, size_t nmemb, size_t size)