Repository: deric/clustering-benchmark
Branch: master
Commit: fc7aba73c873
Files: 197
Total size: 16.6 MB

Directory structure:
gitextract_0ddfptui/

├── .gitignore
├── README-old.asc
├── README.md
├── consensus
├── evolve-sc
├── nb-configuration.xml
├── pom.xml
├── run
├── src/
│   ├── main/
│   │   ├── java/
│   │   │   └── org/
│   │   │       └── clueminer/
│   │   │           ├── clustering/
│   │   │           │   └── benchmark/
│   │   │           │       ├── AbsParams.java
│   │   │           │       ├── Bench.java
│   │   │           │       ├── BenchParams.java
│   │   │           │       ├── ClusteringBenchmark.java
│   │   │           │       ├── Container.java
│   │   │           │       ├── Experiment.java
│   │   │           │       ├── GnuplotReporter.java
│   │   │           │       ├── Main.java
│   │   │           │       ├── ParamExperiment.java
│   │   │           │       ├── chameleon2/
│   │   │           │       │   └── Cham2Bench.java
│   │   │           │       ├── consensus/
│   │   │           │       │   ├── ConsensusExp.java
│   │   │           │       │   ├── ConsensusParams.java
│   │   │           │       │   └── ConsensusRun.java
│   │   │           │       ├── cutoff/
│   │   │           │       │   ├── CutoffComparison.java
│   │   │           │       │   ├── CutoffExp.java
│   │   │           │       │   ├── CutoffParams.java
│   │   │           │       │   └── FirstJumpOptimization.java
│   │   │           │       ├── evolve/
│   │   │           │       │   ├── EvolveExp.java
│   │   │           │       │   └── EvolveParams.java
│   │   │           │       ├── exp/
│   │   │           │       │   ├── Data.java
│   │   │           │       │   ├── EvolveScores.java
│   │   │           │       │   ├── HclusPar.java
│   │   │           │       │   ├── HclusPar2.java
│   │   │           │       │   └── Hclust.java
│   │   │           │       ├── gen/
│   │   │           │       │   ├── NsgaGen.java
│   │   │           │       │   ├── NsgaGenExp.java
│   │   │           │       │   └── NsgaGenParams.java
│   │   │           │       ├── nsga/
│   │   │           │       │   ├── NsgaExp.java
│   │   │           │       │   ├── NsgaParams.java
│   │   │           │       │   └── NsgaScore.java
│   │   │           │       └── partition/
│   │   │           │           ├── PartitionBench.java
│   │   │           │           ├── PartitionExp.java
│   │   │           │           └── PartitionParams.java
│   │   │           └── data/
│   │   │               ├── DataLoader.java
│   │   │               └── ResourceList.java
│   │   ├── nbm/
│   │   │   └── manifest.mf
│   │   └── resources/
│   │       ├── datasets/
│   │       │   ├── artificial/
│   │       │   │   ├── 2d-10c.arff
│   │       │   │   ├── 2d-20c-no0.arff
│   │       │   │   ├── 2d-3c-no123.arff
│   │       │   │   ├── 2d-4c-no4.arff
│   │       │   │   ├── 2d-4c-no9.arff
│   │       │   │   ├── 2d-4c.arff
│   │       │   │   ├── 2dnormals.arff
│   │       │   │   ├── 2sp2glob.arff
│   │       │   │   ├── 3-spiral.arff
│   │       │   │   ├── 3MC.arff
│   │       │   │   ├── D31.arff
│   │       │   │   ├── DS-577.arff
│   │       │   │   ├── DS-850.arff
│   │       │   │   ├── R15.arff
│   │       │   │   ├── aggregation.arff
│   │       │   │   ├── aml28.arff
│   │       │   │   ├── atom.arff
│   │       │   │   ├── banana.arff
│   │       │   │   ├── birch-rg1.arff
│   │       │   │   ├── birch-rg2.arff
│   │       │   │   ├── birch-rg3.arff
│   │       │   │   ├── blobs.arff
│   │       │   │   ├── cassini.arff
│   │       │   │   ├── chainlink.arff
│   │       │   │   ├── circle.arff
│   │       │   │   ├── cluto-t4-8k.arff
│   │       │   │   ├── cluto-t5-8k.arff
│   │       │   │   ├── cluto-t7-10k.arff
│   │       │   │   ├── cluto-t8-8k.arff
│   │       │   │   ├── complex8.arff
│   │       │   │   ├── complex9.arff
│   │       │   │   ├── compound.arff
│   │       │   │   ├── cuboids.arff
│   │       │   │   ├── cure-t0-2000n-2D.arff
│   │       │   │   ├── cure-t1-2000n-2D.arff
│   │       │   │   ├── cure-t2-4k.arff
│   │       │   │   ├── curves1.arff
│   │       │   │   ├── curves2.arff
│   │       │   │   ├── dartboard1.arff
│   │       │   │   ├── dartboard2.arff
│   │       │   │   ├── dense-disk-3000.arff
│   │       │   │   ├── dense-disk-5000.arff
│   │       │   │   ├── diamond9.arff
│   │       │   │   ├── disk-1000n.arff
│   │       │   │   ├── disk-3000n.arff
│   │       │   │   ├── disk-4000n.arff
│   │       │   │   ├── disk-4500n.arff
│   │       │   │   ├── disk-4600n.arff
│   │       │   │   ├── disk-5000n.arff
│   │       │   │   ├── disk-6000n.arff
│   │       │   │   ├── donut1.arff
│   │       │   │   ├── donut2.arff
│   │       │   │   ├── donut3.arff
│   │       │   │   ├── donutcurves.arff
│   │       │   │   ├── dpb.arff
│   │       │   │   ├── dpc.arff
│   │       │   │   ├── ds2c2sc13.arff
│   │       │   │   ├── ds3c3sc6.arff
│   │       │   │   ├── ds4c2sc8.arff
│   │       │   │   ├── elliptical_10_2.arff
│   │       │   │   ├── elly-2d10c13s.arff
│   │       │   │   ├── engytime.arff
│   │       │   │   ├── flame.arff
│   │       │   │   ├── fourty.arff
│   │       │   │   ├── gaussians1.arff
│   │       │   │   ├── golfball.arff
│   │       │   │   ├── hepta.arff
│   │       │   │   ├── hypercube.arff
│   │       │   │   ├── impossible.arff
│   │       │   │   ├── insect.arff
│   │       │   │   ├── jain.arff
│   │       │   │   ├── long1.arff
│   │       │   │   ├── long2.arff
│   │       │   │   ├── long3.arff
│   │       │   │   ├── longsquare.arff
│   │       │   │   ├── lsun.arff
│   │       │   │   ├── mopsi-finland.arff
│   │       │   │   ├── mopsi-joensuu.arff
│   │       │   │   ├── pathbased.arff
│   │       │   │   ├── pmf.arff
│   │       │   │   ├── rings.arff
│   │       │   │   ├── s-set1.arff
│   │       │   │   ├── s-set2.arff
│   │       │   │   ├── s-set3.arff
│   │       │   │   ├── s-set4.arff
│   │       │   │   ├── shapes.arff
│   │       │   │   ├── simplex.arff
│   │       │   │   ├── sizes1.arff
│   │       │   │   ├── sizes2.arff
│   │       │   │   ├── sizes3.arff
│   │       │   │   ├── sizes4.arff
│   │       │   │   ├── sizes5.arff
│   │       │   │   ├── smile1.arff
│   │       │   │   ├── smile2.arff
│   │       │   │   ├── smile3.arff
│   │       │   │   ├── spherical_4_3.arff
│   │       │   │   ├── spherical_5_2.arff
│   │       │   │   ├── spherical_6_2.arff
│   │       │   │   ├── spiral.arff
│   │       │   │   ├── spiralsquare.arff
│   │       │   │   ├── square1.arff
│   │       │   │   ├── square2.arff
│   │       │   │   ├── square3.arff
│   │       │   │   ├── square4.arff
│   │       │   │   ├── square5.arff
│   │       │   │   ├── st900.arff
│   │       │   │   ├── target.arff
│   │       │   │   ├── tetra.arff
│   │       │   │   ├── threenorm.arff
│   │       │   │   ├── triangle1.arff
│   │       │   │   ├── triangle2.arff
│   │       │   │   ├── twenty.arff
│   │       │   │   ├── twodiamonds.arff
│   │       │   │   ├── wingnut.arff
│   │       │   │   ├── xclara.arff
│   │       │   │   ├── xor.arff
│   │       │   │   ├── zelnik1.arff
│   │       │   │   ├── zelnik2.arff
│   │       │   │   ├── zelnik3.arff
│   │       │   │   ├── zelnik4.arff
│   │       │   │   ├── zelnik5.arff
│   │       │   │   └── zelnik6.arff
│   │       │   └── real-world/
│   │       │       ├── arrhythmia.arff
│   │       │       ├── balance-scale.arff
│   │       │       ├── cpu.arff
│   │       │       ├── dermatology.arff
│   │       │       ├── ecoli.arff
│   │       │       ├── german.arff
│   │       │       ├── glass.arff
│   │       │       ├── haberman.arff
│   │       │       ├── heart-statlog.arff
│   │       │       ├── iono.arff
│   │       │       ├── iris.arff
│   │       │       ├── letter.arff
│   │       │       ├── segment.arff
│   │       │       ├── sonar.arff
│   │       │       ├── tae.arff
│   │       │       ├── thy.arff
│   │       │       ├── vehicle.arff
│   │       │       ├── vowel.arff
│   │       │       ├── water-treatment.arff
│   │       │       ├── wdbc.arff
│   │       │       ├── wine.arff
│   │       │       ├── wisc.arff
│   │       │       ├── yeast.arff
│   │       │       └── zoo.arff
│   │       ├── log4j2.properties
│   │       └── org/
│   │           └── clueminer/
│   │               └── clustering/
│   │                   └── benchmark/
│   │                       └── Bundle.properties
│   └── test/
│       ├── java/
│       │   └── org/
│       │       └── clueminer/
│       │           └── clustering/
│       │               └── benchmark/
│       │                   ├── ExperimentTest.java
│       │                   ├── HclustBenchmarkTest.java
│       │                   └── chameleon2/
│       │                       └── Cham2BenchTest.java
│       └── resources/
│           └── log4j2.properties
└── updreadme.rb

================================================
FILE CONTENTS
================================================

================================================
FILE: .gitignore
================================================
*.class

# Package Files #
*.jar
*.war
*.ear
*~
.classpath
.project
.settings/
target/
logs/
/nbproject/private/

/nbproject/

================================================
FILE: README-old.asc
================================================
# Clustering datasets

## Datasets

This project contains a collection of labeled clustering problems found in the literature. Most of the datasets were artificially created.

All datasets can be found in the link:https://github.com/deric/clustering-benchmark/tree/master/src/main/resources/datasets/artificial[data folder].
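The datasets are stored as ARFF files, and each entry below lists the number of data points and the dimension. As a rough illustration of how such figures could be checked, here is a minimal sketch that counts `@attribute` declarations and data rows in ARFF text. The class `ArffSummary` is hypothetical and is not the project's `DataLoader`; it assumes dense (non-sparse) ARFF, unquoted attribute names, and one data row per line. If one attribute holds the class label, the dimension would be the attribute count minus one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper: summarizes ARFF text. Not part of the benchmark project. */
final class ArffSummary {
    final List<String> attributes = new ArrayList<>();
    int rows;

    /** Counts attribute declarations and data rows in dense ARFF text. */
    static ArffSummary parse(String arff) {
        ArffSummary s = new ArffSummary();
        boolean inData = false;
        for (String raw : arff.split("\\R")) {
            String line = raw.trim();
            // Skip blank lines and '%' comments.
            if (line.isEmpty() || line.startsWith("%")) continue;
            String lower = line.toLowerCase(Locale.ROOT);
            if (!inData) {
                if (lower.startsWith("@attribute")) {
                    // "@attribute <name> <type>": keep the name only.
                    String[] parts = line.split("\\s+", 3);
                    if (parts.length >= 2) s.attributes.add(parts[1]);
                } else if (lower.startsWith("@data")) {
                    inData = true;
                }
            } else {
                s.rows++; // every non-comment line after @data is one instance
            }
        }
        return s;
    }

    public static void main(String[] args) {
        // Made-up sample, not taken from any dataset in this repository.
        String sample = String.join("\n",
            "% hypothetical two-point dataset",
            "@relation demo",
            "@attribute x numeric",
            "@attribute y numeric",
            "@attribute class {0,1}",
            "@data",
            "1.0,2.0,0",
            "3.5,4.5,1");
        ArffSummary s = parse(sample);
        System.out.println("attributes=" + s.attributes + " rows=" + s.rows);
    }
}
```

To summarize a real file, read it into a string first (for example with `Files.readString`) and pass the result to `parse`.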

### 2d-10c

[align="right",options="header"]
|===
| data points | clusters | dimension
| 2990  |  10 |  2
|===

image::https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/2d-10c.png["2d-10c",400,float="left"]

[.float .right]
* link:https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/2d-10c.arff[ARFF]
* link:https://github.com/deric/handl-data-generators[generator]

> J. Handl and J. Knowles, “Multiobjective clustering with automatic
> determination of the number of clusters,” UMIST, Tech. Rep., 2004.

### atom

[align="right",options="header"]
|===
| data points | clusters | dimension
| 800         |        2 |  3
|===

image::https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/atom.png["atom",400,float="left"]

[.float .right]
* source: link:https://www.uni-marburg.de/fb12/datenbionik/data?language_sync=1[FCPS]
* link:https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/atom.arff[ARFF]

### aggregation

[align="right",options="header"]
|===
| data points | clusters | dimension
| 788  |  7 |  2
|===

image::https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/aggregation.png[aggregation,400,float="left"]

[.float .right]
* link:https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/aggregation.arff[ARFF]
* link:http://cs.joensuu.fi/sipu/datasets/[original source]

> Gionis, A., H. Mannila, and P. Tsaparas, Clustering aggregation.
> ACM Transactions on Knowledge Discovery from Data (TKDD), 2007. 1(1): p. 1-30.

### chainlink

[align="right",options="header"]
|===
| data points | clusters | dimension
| 1000        |        2 |  3
|===

image::https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/chainlink.png["chainlink",400,float="left"]

[.float .right]
* source: link:https://www.uni-marburg.de/fb12/datenbionik/data?language_sync=1[FCPS]
* link:https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/chainlink.arff[ARFF]

> Alfred Ultsch, Clustering with SOM: U*C,
> in Proc. Workshop on Self-Organizing Feature Maps, pp. 31-37, Paris, 2005.

### D31

[align="right",style="asciidoc",options="noborders,wide"]
|===
| data points |  3100
| clusters    | 31
| dimensions  | 2
| image::https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/D31.png["D31",400,float="left"] | * link:https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/D31.arff[ARFF]
|===

> Veenman, C.J., M.J.T. Reinders, and E. Backer,
> A maximum variance cluster algorithm. IEEE Trans. Pattern Analysis and Machine Intelligence 2002. 24(9): p. 1273-1280.

### 3MC

[align="right",options="header",style="literal"]
|===
| data points | clusters | dimension
| 400         |        3 |  2
|===

[.float .right]
image::https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/3MC.png["3MC",400,float="left"]


### DS577

[align="right",options="header"]
|===
| data points | clusters | dimension
| 577        |        3 |  2
|===

image::https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/DS577.png["DS577",400,float="left"]

[.float .right]
* link:https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/DS577.arff[ARFF]

> M. C. Su, C. H. Chou, and C. C. Hsieh, “Fuzzy C-Means Algorithm with a Point Symmetry Distance,”
> International Journal of Fuzzy Systems, vol. 7, no. 4, pp. 175-181, 2005.


### cluto-t4_8k

[align="right",options="header"]
|===
| data points | clusters | dimension
| 8000        |        7 |  2
|===

image::https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/cluto-t4_8k.png["cluto-t4_8k",400,float="left"]

[.float .right]
* link:https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/cluto-t4.8k.arff[ARFF]

> G. Karypis, “CLUTO A Clustering Toolkit,”
> Dept. of Computer Science, University of Minnesota, Tech. Rep. 02-017, 2002, available at
> http://www.cs.umn.edu/~cluto.


## Experiments

This project contains a set of benchmarks of clustering methods on various datasets. The project depends on the [Clueminer project](https://github.com/deric/clueminer).

To run a benchmark, first compile the dependencies into a single JAR file:

    mvn assembly:assembly

# Consensus experiment

The consensus experiment runs the same algorithm repeatedly:

```
./run consensus --dataset "triangle1" --repeat 10
```
By default, the k-means algorithm is used.
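To make the idea of repeated runs concrete, here is a minimal, self-contained sketch that runs a plain Lloyd's k-means several times with different random seeds and compares the within-cluster sum of squared errors (SSE) of each run. This is only an illustration under stated assumptions: the class `RepeatedKMeans`, its methods, and the toy data are hypothetical, and the sketch does not reproduce how the project's consensus experiment combines or scores runs. It assumes `k` is no larger than the number of points.

```java
import java.util.Arrays;
import java.util.Random;

/** Hypothetical sketch: repeat k-means with different seeds and compare SSE. */
final class RepeatedKMeans {

    /** Runs Lloyd's k-means once and returns the cluster index of each point. */
    static int[] kMeans(double[][] pts, int k, long seed, int maxIter) {
        Random rnd = new Random(seed);
        int n = pts.length, d = pts[0].length;
        // Pick k distinct points as initial centroids via a Fisher-Yates shuffle.
        int[] order = new int[n];
        for (int i = 0; i < n; i++) order[i] = i;
        for (int i = n - 1; i > 0; i--) {
            int j = rnd.nextInt(i + 1);
            int t = order[i]; order[i] = order[j]; order[j] = t;
        }
        double[][] c = new double[k][];
        for (int j = 0; j < k; j++) c[j] = pts[order[j]].clone();

        int[] assign = new int[n];
        Arrays.fill(assign, -1);
        for (int iter = 0; iter < maxIter; iter++) {
            // Assignment step: each point goes to its nearest centroid.
            boolean changed = false;
            for (int i = 0; i < n; i++) {
                int best = 0;
                for (int j = 1; j < k; j++)
                    if (dist2(pts[i], c[j]) < dist2(pts[i], c[best])) best = j;
                if (assign[i] != best) { assign[i] = best; changed = true; }
            }
            if (!changed) break; // converged
            // Update step: move each centroid to the mean of its points.
            double[][] sum = new double[k][d];
            int[] cnt = new int[k];
            for (int i = 0; i < n; i++) {
                cnt[assign[i]]++;
                for (int x = 0; x < d; x++) sum[assign[i]][x] += pts[i][x];
            }
            for (int j = 0; j < k; j++)
                if (cnt[j] > 0) // an empty cluster keeps its previous centroid
                    for (int x = 0; x < d; x++) c[j][x] = sum[j][x] / cnt[j];
        }
        return assign;
    }

    /** Squared Euclidean distance. */
    static double dist2(double[] a, double[] b) {
        double s = 0;
        for (int x = 0; x < a.length; x++) s += (a[x] - b[x]) * (a[x] - b[x]);
        return s;
    }

    /** Sum of squared distances of points to their cluster means. */
    static double sse(double[][] pts, int[] assign, int k) {
        int d = pts[0].length;
        double[][] mean = new double[k][d];
        int[] cnt = new int[k];
        for (int i = 0; i < pts.length; i++) {
            cnt[assign[i]]++;
            for (int x = 0; x < d; x++) mean[assign[i]][x] += pts[i][x];
        }
        for (int j = 0; j < k; j++)
            if (cnt[j] > 0)
                for (int x = 0; x < d; x++) mean[j][x] /= cnt[j];
        double total = 0;
        for (int i = 0; i < pts.length; i++) total += dist2(pts[i], mean[assign[i]]);
        return total;
    }

    public static void main(String[] args) {
        // Made-up toy data: two groups on a line.
        double[][] pts = {{0}, {1}, {10}, {11}};
        int k = 2, repeats = 10;
        double best = Double.POSITIVE_INFINITY;
        for (int r = 0; r < repeats; r++) {
            int[] labels = kMeans(pts, k, r, 100);
            double score = sse(pts, labels, k);
            best = Math.min(best, score);
            System.out.printf("run %d: labels=%s sse=%.3f%n", r, Arrays.toString(labels), score);
        }
        System.out.printf("best sse over %d runs: %.3f%n", repeats, best);
    }
}
```

Because k-means depends on its random initialization, different seeds can produce different label orderings or, on harder data, different partitions. Repeating the run is one way to see how stable a result is.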

For available datasets see [resources folder](https://github.com/deric/clustering-benchmark/tree/master/src/main/resources/datasets/artificial).


================================================
FILE: README.md
================================================
# Clustering benchmarks

## Datasets

This project contains a collection of labeled clustering problems found in the literature. Most of the datasets were artificially created.

The benchmark includes:

  * [artificial datasets](https://github.com/deric/clustering-benchmark/tree/master/src/main/resources/datasets/artificial)
  * [real world datasets](https://github.com/deric/clustering-benchmark/tree/master/src/main/resources/datasets/real-world)

### Artificial data

<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/2d-10c.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/2d-10c.png" alt="2d-10c" title="2d-10c" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/2d-20c-no0.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/2d-20c-no0.png" alt="2d-20c-no0" title="2d-20c-no0" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/2d-3c-no123.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/2d-3c-no123.png" alt="2d-3c-no123" title="2d-3c-no123" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/2d-4c-no4.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/2d-4c-no4.png" alt="2d-4c-no4" title="2d-4c-no4" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/2d-4c-no9.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/2d-4c-no9.png" alt="2d-4c-no9" title="2d-4c-no9" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/2d-4c.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/2d-4c.png" alt="2d-4c" title="2d-4c" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/2sp2glob.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/2sp2glob.png" alt="2sp2glob" title="2sp2glob" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/3-spiral.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/3-spiral.png" alt="3-spiral" title="3-spiral" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/3MC.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/3mc.png" alt="3MC" title="3MC" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/D31.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/d31.png" alt="D31" title="D31" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/DS577.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/ds577.png" alt="DS577" title="DS577" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/DS850.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/ds850.png" alt="DS850" title="DS850" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/R15.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/r15.png" alt="R15" title="R15" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/aggregation.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/aggregation.png" alt="aggregation" title="aggregation" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/atom.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/atom.png" alt="atom" title="atom" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/banana.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/banana.png" alt="banana" title="banana" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/birch-rg1.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/birch-rg1.png" alt="birch-rg1" title="birch-rg1" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/birch-rg2.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/birch-rg2.png" alt="birch-rg2" title="birch-rg2" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/birch-rg3.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/birch-rg3.png" alt="birch-rg3" title="birch-rg3" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/chainlink.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/chainlink.png" alt="chainlink" title="chainlink" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/cluto-t4.8k.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/cluto-t4.8k.png" alt="cluto-t4.8k" title="cluto-t4.8k" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/cluto-t5.8k.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/cluto-t5.8k.png" alt="cluto-t5.8k" title="cluto-t5.8k" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/cluto-t7.10k.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/cluto-t7.10k.png" alt="cluto-t7.10k" title="cluto-t7.10k" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/cluto-t8.8k.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/cluto-t8.8k.png" alt="cluto-t8.8k" title="cluto-t8.8k" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/complex8.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/complex8.png" alt="complex8" title="complex8" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/complex9.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/complex9.png" alt="complex9" title="complex9" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/compound.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/compound.png" alt="compound" title="compound" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/cure-t0-2000n-2D.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/cure-t0-2000n-2d.png" alt="cure-t0-2000n-2D" title="cure-t0-2000n-2D" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/cure-t1-2000n-2D.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/cure-t1-2000n-2d.png" alt="cure-t1-2000n-2D" title="cure-t1-2000n-2D" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/cure-t2-4k.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/cure-t2-4k.png" alt="cure-t2-4k" title="cure-t2-4k" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/curves1.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/curves1.png" alt="curves1" title="curves1" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/curves2.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/curves2.png" alt="curves2" title="curves2" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/dartboard1.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/dartboard1.png" alt="dartboard1" title="dartboard1" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/dartboard2.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/dartboard2.png" alt="dartboard2" title="dartboard2" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/dense-disk-3000.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/dense-disk-3000.png" alt="dense-disk-3000" title="dense-disk-3000" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/dense-disk-5000.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/dense-disk-5000.png" alt="dense-disk-5000" title="dense-disk-5000" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/diamond9.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/diamond9.png" alt="diamond9" title="diamond9" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/disk-1000n.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/disk-1000n.png" alt="disk-1000n" title="disk-1000n" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/disk-3000n.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/disk-3000n.png" alt="disk-3000n" title="disk-3000n" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/disk-4000n.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/disk-4000n.png" alt="disk-4000n" title="disk-4000n" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/disk-4500n.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/disk-4500n.png" alt="disk-4500n" title="disk-4500n" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/disk-4600n.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/disk-4600n.png" alt="disk-4600n" title="disk-4600n" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/disk-5000n.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/disk-5000n.png" alt="disk-5000n" title="disk-5000n" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/disk-6000n.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/disk-6000n.png" alt="disk-6000n" title="disk-6000n" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/donut1.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/donut1.png" alt="donut1" title="donut1" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/donut2.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/donut2.png" alt="donut2" title="donut2" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/donut3.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/donut3.png" alt="donut3" title="donut3" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/donutcurves.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/donutcurves.png" alt="donutcurves" title="donutcurves" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/ds2c2sc13.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/ds2c2sc13.png" alt="ds2c2sc13" title="ds2c2sc13" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/ds3c3sc6.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/ds3c3sc6.png" alt="ds3c3sc6" title="ds3c3sc6" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/ds4c2sc8.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/ds4c2sc8.png" alt="ds4c2sc8" title="ds4c2sc8" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/elliptical_10_2.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/elliptical_10_2.png" alt="elliptical_10_2" title="elliptical_10_2" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/elly-2d10c13s.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/elly-2d10c13s.png" alt="elly-2d10c13s" title="elly-2d10c13s" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/engytime.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/engytime.png" alt="engytime" title="engytime" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/flame.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/flame.png" alt="flame" title="flame" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/fourty.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/fourty.png" alt="fourty" title="fourty" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/golfball.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/golfball.png" alt="golfball" title="golfball" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/hepta.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/hepta.png" alt="hepta" title="hepta" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/insect.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/insect.png" alt="insect" title="insect" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/jain.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/jain.png" alt="jain" title="jain" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/long1.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/long1.png" alt="long1" title="long1" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/long2.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/long2.png" alt="long2" title="long2" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/long3.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/long3.png" alt="long3" title="long3" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/longsquare.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/longsquare.png" alt="longsquare" title="longsquare" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/lsun.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/lsun.png" alt="lsun" title="lsun" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/mopsi-finland.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/mopsi-finland.png" alt="mopsi-finland" title="mopsi-finland" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/mopsi-joensuu.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/mopsi-joensuu.png" alt="mopsi-joensuu" title="mopsi-joensuu" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/pathbased.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/pathbased.png" alt="pathbased" title="pathbased" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/rings.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/rings.png" alt="rings" title="rings" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/s-set1.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/s-set1.png" alt="s-set1" title="s-set1" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/s-set2.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/s-set2.png" alt="s-set2" title="s-set2" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/s-set3.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/s-set3.png" alt="s-set3" title="s-set3" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/s-set4.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/s-set4.png" alt="s-set4" title="s-set4" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/sizes1.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/sizes1.png" alt="sizes1" title="sizes1" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/sizes2.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/sizes2.png" alt="sizes2" title="sizes2" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/sizes3.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/sizes3.png" alt="sizes3" title="sizes3" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/sizes4.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/sizes4.png" alt="sizes4" title="sizes4" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/sizes5.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/sizes5.png" alt="sizes5" title="sizes5" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/smile1.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/smile1.png" alt="smile1" title="smile1" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/smile2.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/smile2.png" alt="smile2" title="smile2" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/smile3.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/smile3.png" alt="smile3" title="smile3" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/spherical_4_3.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/spherical_4_3.png" alt="spherical_4_3" title="spherical_4_3" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/spherical_5_2.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/spherical_5_2.png" alt="spherical_5_2" title="spherical_5_2" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/spherical_6_2.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/spherical_6_2.png" alt="spherical_6_2" title="spherical_6_2" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/spiral.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/spiral.png" alt="spiral" title="spiral" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/spiralsquare.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/spiralsquare.png" alt="spiralsquare" title="spiralsquare" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/square1.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/square1.png" alt="square1" title="square1" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/square2.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/square2.png" alt="square2" title="square2" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/square3.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/square3.png" alt="square3" title="square3" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/square4.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/square4.png" alt="square4" title="square4" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/square5.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/square5.png" alt="square5" title="square5" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/st900.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/st900.png" alt="st900" title="st900" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/target.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/target.png" alt="target" title="target" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/tetra.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/tetra.png" alt="tetra" title="tetra" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/triangle1.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/triangle1.png" alt="triangle1" title="triangle1" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/triangle2.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/triangle2.png" alt="triangle2" title="triangle2" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/twenty.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/twenty.png" alt="twenty" title="twenty" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/twodiamonds.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/twodiamonds.png" alt="twodiamonds" title="twodiamonds" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/wingnut.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/wingnut.png" alt="wingnut" title="wingnut" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/xclara.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/xclara.png" alt="xclara" title="xclara" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/zelnik1.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/zelnik1.png" alt="zelnik1" title="zelnik1" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/zelnik2.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/zelnik2.png" alt="zelnik2" title="zelnik2" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/zelnik3.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/zelnik3.png" alt="zelnik3" title="zelnik3" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/zelnik4.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/zelnik4.png" alt="zelnik4" title="zelnik4" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/zelnik5.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/zelnik5.png" alt="zelnik5" title="zelnik5" width="239px" style="max-width: 100%;float:left;"/></a>
<a href="https://github.com/deric/clustering-benchmark/blob/master/src/main/resources/datasets/artificial/zelnik6.arff"><img src="https://github.com/deric/clustering-benchmark/blob/images/fig/artificial/zelnik6.png" alt="zelnik6" title="zelnik6" width="239px" style="max-width: 100%;float:left;"/></a>


## Experiments

This project contains a set of benchmarks of clustering methods on various datasets. It depends on the [Clueminer project](https://github.com/deric/clueminer).

To run the benchmarks, first build a single JAR file that bundles all dependencies:

    mvn assembly:assembly

### Consensus experiment

Runs the same algorithm repeatedly on one dataset:

```
./run consensus --dataset "triangle1" --repeat 10
```
By default, the k-means algorithm is used.
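
Repeating a run matters for k-means because its result depends on the randomly chosen initial centers. The following is a minimal, self-contained sketch of that idea, not the Clueminer implementation: the class `RepeatedKMeansSketch`, its methods, the toy data and the choice of seeds `0..repeat-1` are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.Random;

/** Hypothetical sketch: repeat Lloyd's k-means with different seeds and compare the SSE. */
final class RepeatedKMeansSketch {

    /** Runs Lloyd's algorithm once and returns the final sum of squared errors (SSE). */
    static double runOnce(double[][] data, int k, long seed, int maxIter) {
        Random rnd = new Random(seed);
        int dim = data[0].length;
        double[][] centers = new double[k][];
        // Initialize centers from randomly chosen data points (duplicates are possible).
        for (int c = 0; c < k; c++) {
            centers[c] = data[rnd.nextInt(data.length)].clone();
        }
        int[] assign = new int[data.length];
        Arrays.fill(assign, -1);
        for (int iter = 0; iter < maxIter; iter++) {
            boolean changed = false;
            // Assignment step: attach every point to its nearest center.
            for (int i = 0; i < data.length; i++) {
                int best = nearest(data[i], centers);
                if (best != assign[i]) {
                    assign[i] = best;
                    changed = true;
                }
            }
            if (!changed) {
                break; // converged: no point switched cluster
            }
            // Update step: move each center to the mean of its points.
            double[][] sums = new double[k][dim];
            int[] counts = new int[k];
            for (int i = 0; i < data.length; i++) {
                counts[assign[i]]++;
                for (int d = 0; d < dim; d++) {
                    sums[assign[i]][d] += data[i][d];
                }
            }
            for (int c = 0; c < k; c++) {
                if (counts[c] == 0) {
                    continue; // an empty cluster keeps its previous center
                }
                for (int d = 0; d < dim; d++) {
                    centers[c][d] = sums[c][d] / counts[c];
                }
            }
        }
        double sse = 0;
        for (double[] p : data) {
            sse += sqDist(p, centers[nearest(p, centers)]);
        }
        return sse;
    }

    /** Index of the center closest to p (the first one wins on ties). */
    static int nearest(double[] p, double[][] centers) {
        int best = 0;
        for (int c = 1; c < centers.length; c++) {
            if (sqDist(p, centers[c]) < sqDist(p, centers[best])) {
                best = c;
            }
        }
        return best;
    }

    static double sqDist(double[] a, double[] b) {
        double s = 0;
        for (int d = 0; d < a.length; d++) {
            double diff = a[d] - b[d];
            s += diff * diff;
        }
        return s;
    }

    /** Repeats the run with seeds 0..repeat-1 and returns the SSE of every run. */
    static double[] repeat(double[][] data, int k, int repeat) {
        double[] sse = new double[repeat];
        for (int r = 0; r < repeat; r++) {
            sse[r] = runOnce(data, k, r, 100);
        }
        return sse;
    }

    public static void main(String[] args) {
        double[][] data = {{0, 0}, {0, 1}, {1, 0}, {10, 10}, {10, 11}, {11, 10}};
        double[] sse = repeat(data, 2, 10);
        System.out.println("SSE per run: " + Arrays.toString(sse));
        System.out.println("best SSE: " + Arrays.stream(sse).min().getAsDouble());
    }
}
```

Comparing the per-run SSE values shows how much a single run can be trusted. Runs that end with a clearly higher SSE stopped in a worse local optimum.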

For available datasets, see the [resources folder](https://github.com/deric/clustering-benchmark/tree/master/src/main/resources/datasets/artificial).
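
The datasets are stored as ARFF files. As a rough sketch of what such a file contains, the following reads the attribute names and the raw data rows using only the Java standard library. It assumes the common dense ARFF layout (`%` comments, `@attribute <name> <type>` header lines, then `@data` followed by comma-separated rows) and does not handle quoted names containing spaces or the sparse format. The class `ArffSketch` is hypothetical and is not how Clueminer loads data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Hypothetical sketch of a minimal reader for dense ARFF files. */
final class ArffSketch {

    /** Attribute names from the header and the raw string values of each data row. */
    static final class Arff {
        final List<String> attributes = new ArrayList<>();
        final List<String[]> rows = new ArrayList<>();
    }

    static Arff parse(List<String> lines) {
        Arff arff = new Arff();
        boolean inData = false;
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("%")) {
                continue; // skip blank lines and comments
            }
            if (!inData) {
                String lower = line.toLowerCase(Locale.ROOT);
                if (lower.startsWith("@attribute")) {
                    // "@attribute <name> <type>": keep only the name
                    String[] parts = line.split("\\s+", 3);
                    if (parts.length >= 2) {
                        arff.attributes.add(parts[1]);
                    }
                } else if (lower.startsWith("@data")) {
                    inData = true; // everything after this line is a data row
                }
            } else {
                String[] values = line.split(",");
                for (int i = 0; i < values.length; i++) {
                    values[i] = values[i].trim();
                }
                arff.rows.add(values);
            }
        }
        return arff;
    }

    public static void main(String[] args) {
        // Toy content made up for this example, not taken from the repository.
        List<String> sample = Arrays.asList(
                "% toy example",
                "@relation toy",
                "@attribute x numeric",
                "@attribute y numeric",
                "@attribute class {0,1}",
                "@data",
                "1.0,2.0,0",
                "3.5,4.5,1");
        Arff arff = parse(sample);
        System.out.println("attributes: " + arff.attributes);
        System.out.println("rows: " + arff.rows.size());
    }
}
```

To read a downloaded file instead of the inline sample, the lines can be obtained with `java.nio.file.Files.readAllLines(...)` and passed to `parse`.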


================================================
FILE: consensus
================================================
#!/bin/bash
meth=("KmB-COMUSA-RAND" "KmB-COMUSA-MO" "KmB-COMUSA-RAND-fixed")
for m in "${meth[@]}"; do
 ./run consensus --repeat 10 --method "$m" --dataset "$@"
done



================================================
FILE: evolve-sc
================================================
#!/bin/bash
# Call ./run directly: wrapping it in backticks would try to execute its output as a command.
./run evolve-sc --test --generations 20 --population 50 "$@"


================================================
FILE: nb-configuration.xml
================================================
<?xml version="1.0" encoding="UTF-8"?>
<project-shared-configuration>
    <!--
This file contains additional configuration written by modules in the NetBeans IDE.
The configuration is intended to be shared among all the users of project and
therefore it is assumed to be part of version control checkout.
Without this configuration present, some functionality in the IDE may be limited or fail altogether.
-->
    <spellchecker-wordlist xmlns="http://www.netbeans.org/ns/spellchecker-wordlist/1">
        <word>unsupervised</word>
    </spellchecker-wordlist>
    <properties xmlns="http://www.netbeans.org/ns/maven-properties-data/1">
        <!--
Properties that influence various parts of the IDE, especially code formatting and the like. 
You can copy and paste the single properties, into the pom.xml file and the IDE will pick them up.
That way multiple projects can share the same settings (useful for formatting rules for example).
Any value defined here will override the pom.xml file value but is only applicable to the current project.
-->
        <netbeans.hint.licensePath>${project.basedir}/../clueminer/license.txt</netbeans.hint.licensePath>
    </properties>
</project-shared-configuration>


================================================
FILE: pom.xml
================================================
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <parent>
        <artifactId>clueminer-parent</artifactId>
        <groupId>org.clueminer</groupId>
        <version>0.1.0-SNAPSHOT</version>
        <relativePath>../..</relativePath>
    </parent>

    <groupId>org.clueminer</groupId>
    <artifactId>clustering-benchmark</artifactId>
    <version>0.1.0-SNAPSHOT</version>
    <packaging>nbm</packaging>

    <name>clustering-benchmark</name>

    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    </properties>

    <repositories>
        <!--
        Repository hosting NetBeans modules, especially APIs.
        Versions are based on IDE releases, e.g.: RELEASE691
        To create your own repository, use: nbm:populate-repository
        -->
        <repository>
            <id>netbeans</id>
            <name>NetBeans</name>
            <url>http://bits.netbeans.org/maven2/</url>
            <snapshots>
                <enabled>false</enabled>
            </snapshots>
        </repository>
    </repositories>

    <dependencies>
        <dependency>
            <groupId>org.netbeans.api</groupId>
            <artifactId>org-netbeans-api-annotations-common</artifactId>
        </dependency>
        <dependency>
            <groupId>${project.groupId}</groupId>
            <artifactId>dataset-api</artifactId>
            <version>${project.version}</version>
        </dependency>
        <dependency>
            <groupId>${project.groupId}</groupId>
            <artifactId>clustering-impl</artifactId>
            <version>${project.version}</version>
        </dependency>
        <dependency>
            <groupId>${project.groupId}</groupId>
            <artifactId>clustering-api</artifactId>
            <version>${project.version}</version>
        </dependency>
        <dependency>
            <groupId>${project.groupId}</groupId>
            <artifactId>dataset-impl</artifactId>
            <version>${project.version}</version>
        </dependency>
        <dependency>
            <groupId>${project.groupId}</groupId>
            <artifactId>dataset-io</artifactId>
            <version>${project.version}</version>
        </dependency>
        <dependency>
            <groupId>${project.groupId}</groupId>
            <artifactId>fixtures</artifactId>
            <version>${project.version}</version>
        </dependency>
        <dependency>
            <groupId>${project.groupId}</groupId>
            <artifactId>dataset-benchmark</artifactId>
            <version>${project.version}</version>
        </dependency>
        <dependency>
            <groupId>org.netbeans.api</groupId>
            <artifactId>org-openide-util-lookup</artifactId>
        </dependency>
        <dependency>
            <groupId>org.netbeans.api</groupId>
            <artifactId>org-openide-util</artifactId>
        </dependency>
        <dependency>
            <groupId>${project.groupId}</groupId>
            <artifactId>utils</artifactId>
            <version>${project.version}</version>
        </dependency>
        <dependency>
            <groupId>${project.groupId}</groupId>
            <artifactId>clustering-evolution</artifactId>
            <version>${project.version}</version>
        </dependency>
        <dependency>
            <groupId>${project.groupId}</groupId>
            <artifactId>clustering-eval</artifactId>
            <version>${project.version}</version>
        </dependency>
        <dependency>
            <groupId>${project.groupId}</groupId>
            <artifactId>clustering-dist</artifactId>
            <version>${project.version}</version>
        </dependency>
        <dependency>
            <groupId>${project.groupId}</groupId>
            <artifactId>fixtures-clustering</artifactId>
            <version>${project.version}</version>
        </dependency>
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>${clueminer.junit.version}</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>com.beust</groupId>
            <artifactId>jcommander</artifactId>
            <version>1.48</version>
        </dependency>
        <dependency>
            <groupId>${project.groupId}</groupId>
            <artifactId>core-lib-wrapper</artifactId>
            <version>${project.version}</version>
        </dependency>
        <dependency>
            <groupId>${project.groupId}</groupId>
            <artifactId>evolution-api</artifactId>
            <version>${project.version}</version>
        </dependency>
        <dependency>
            <groupId>org.clueminer</groupId>
            <artifactId>chameleon</artifactId>
            <version>${project.version}</version>
        </dependency>
        <dependency>
            <groupId>${project.groupId}</groupId>
            <artifactId>gnuplot</artifactId>
            <version>${project.version}</version>
        </dependency>
        <dependency>
            <groupId>${project.groupId}</groupId>
            <artifactId>partitioning-api</artifactId>
            <version>${project.version}</version>
        </dependency>
        <dependency>
            <groupId>${project.groupId}</groupId>
            <artifactId>graph-api</artifactId>
            <version>${project.version}</version>
        </dependency>
        <dependency>
            <groupId>${project.groupId}</groupId>
            <artifactId>graph-impl</artifactId>
            <version>${project.version}</version>
        </dependency>
        <dependency>
            <groupId>${project.groupId}</groupId>
            <artifactId>partitioning-impl</artifactId>
            <version>${project.version}</version>
        </dependency>
        <dependency>
            <groupId>org.netbeans.api</groupId>
            <artifactId>org-openide-util-ui</artifactId>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.codehaus.mojo</groupId>
                <artifactId>nbm-maven-plugin</artifactId>
                <extensions>true</extensions>
                <configuration>
                    <publicPackages>
                        <publicPackage>org.clueminer.clustering.benchmark</publicPackage>
                        <publicPackage>org.clueminer.data</publicPackage>
                    </publicPackages>
                </configuration>
            </plugin>

            <plugin>
                <!-- NetBeans 6.9+ requires JDK 6 -->
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>${maven.compiler}</version>
                <configuration>
                    <source>1.8</source>
                    <target>1.8</target>
                    <archive>
                        <manifest>
                            <mainClass>org.clueminer.clustering.benchmark.Main</mainClass>
                            <packageName>org.clueminer.clustering.benchmark</packageName>
                            <addClasspath>true</addClasspath>
                        </manifest>
                    </archive>
                </configuration>
            </plugin>

            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-jar-plugin</artifactId>
                <version>2.4</version>
                <configuration>
                    <!-- to have the jar plugin pickup the nbm generated manifest -->
                    <useDefaultManifestFile>true</useDefaultManifestFile>
                    <archive>
                        <manifest>
                            <mainClass>org.clueminer.clustering.benchmark.Main</mainClass>
                            <packageName>org.clueminer.clustering.benchmark</packageName>
                            <addClasspath>true</addClasspath>
                        </manifest>
                    </archive>
                </configuration>
            </plugin>
            <plugin>
                <artifactId>maven-assembly-plugin</artifactId>
                <configuration>
                    <descriptorRefs>
                        <descriptorRef>jar-with-dependencies</descriptorRef>
                    </descriptorRefs>
                    <archive>
                        <manifest>
                            <mainClass>org.clueminer.clustering.benchmark.Main</mainClass>
                        </manifest>
                    </archive>
                </configuration>
            </plugin>
            <plugin>
                <artifactId>maven-dependency-plugin</artifactId>
                <executions>
                    <execution>
                        <phase>install</phase>
                        <goals>
                            <goal>copy-dependencies</goal>
                        </goals>
                        <configuration>
                            <outputDirectory>${project.build.directory}/lib</outputDirectory>
                        </configuration>
                    </execution>
                </executions>
            </plugin>

        </plugins>
    </build>
</project>


================================================
FILE: run
================================================
#!/bin/bash
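# Arguments joined into one string; only needed for the Maven fallback below.
ARGS="$*"
MAIN="org.clueminer.clustering.benchmark.Main"
# Newest assembled JAR, if any (errors from ls are silenced when none exists).
jarfile="$(ls -t target/*jar-with-dependencies.jar 2>/dev/null | head -1)"
if [[ -f "$jarfile" ]]; then
  java -jar "$jarfile" "$@"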
else
  mvn "-Dexec.args=-classpath %classpath $MAIN $ARGS" -Dexec.executable=java -Dexec.classpathScope=runtime org.codehaus.mojo:exec-maven-plugin:1.2.1:exec
fi


================================================
FILE: src/main/java/org/clueminer/clustering/benchmark/AbsParams.java
================================================
/*
 * Copyright (C) 2011-2016 clueminer.org
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation, either version 3 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program.  If not, see <http://www.gnu.org/licenses/>.
 */
package org.clueminer.clustering.benchmark;

import com.beust.jcommander.Parameter;
import java.io.File;
import org.clueminer.utils.FileUtils;
import org.openide.util.NbBundle;

/**
 *
 * @author Tomas Barton
 */
public class AbsParams {

    @Parameter(names = "--dir", description = "directory for results", required = false)
    public String home = System.getProperty("user.home") + File.separatorChar
            + NbBundle.getMessage(FileUtils.class, "FOLDER_Home");

    @Parameter(names = "--repeat", description = "number of repetitions of each experiment")
    public int repeat = 5;

    @Parameter(names = "--log", description = "java log level")
    public String log = "INFO";

}


================================================
FILE: src/main/java/org/clueminer/clustering/benchmark/Bench.java
================================================
/*
 * Copyright (C) 2011-2016 clueminer.org
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation, either version 3 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program.  If not, see <http://www.gnu.org/licenses/>.
 */
package org.clueminer.clustering.benchmark;

import com.beust.jcommander.JCommander;
import com.beust.jcommander.ParameterException;
import java.io.File;
import java.util.HashMap;
import java.util.Map;
import org.clueminer.data.DataLoader;
import org.clueminer.dataset.api.DataProvider;
import org.clueminer.dataset.api.Dataset;
import org.clueminer.dataset.api.Instance;
import org.clueminer.dataset.benchmark.DatasetFixture;

/**
 *
 * @author Tomas Barton
 */
public abstract class Bench {

    protected static String benchmarkFolder;
    protected HashMap<String, Map.Entry<Dataset<? extends Instance>, Integer>> availableDatasets = new HashMap<>();
    protected DataProvider provider;

    public Bench() {
        //constructor without arguments
    }

    public static void ensureFolder(String folder) {
        File file = new File(folder);
        if (!file.exists()) {
            if (file.mkdirs()) {
                System.out.println("Directory " + folder + " created!");
            } else {
                System.out.println("Failed to create " + folder + " directory!");
            }
        }
    }

    public abstract void main(String[] args);

    public static void printUsage(String[] args, JCommander cmd, AbsParams params) {

        try {
            cmd.parse(args);

        } catch (ParameterException ex) {
            System.out.println(ex.getMessage());
            cmd.usage();
            //invalid arguments are an error, signal it with a non-zero exit code
            System.exit(1);
        }
    }

    protected void loadDatasets() {
        Map<Dataset<? extends Instance>, Integer> datasets = DatasetFixture.allDatasets();
        for (Map.Entry<Dataset<? extends Instance>, Integer> entry : datasets.entrySet()) {
            Dataset<? extends Instance> d = entry.getKey();
            availableDatasets.put(d.getName(), entry);
        }
    }

    protected void loadBenchArtificial() {
        provider = DataLoader.createLoader("datasets", "artificial");
    }

    protected void loadBenchRealWorld() {
        provider = DataLoader.createLoader("datasets", "real-world");
    }

    /**
     * Load a specific dataset by name (case-insensitive match)
     *
     * @param name name of the dataset to load
     */
    protected void load(String name) {
        Map<Dataset<? extends Instance>, Integer> datasets = DatasetFixture.allDatasets();
        for (Map.Entry<Dataset<? extends Instance>, Integer> entry : datasets.entrySet()) {
            Dataset<? extends Instance> d = entry.getKey();
            if (d.getName().equalsIgnoreCase(name)) {
                availableDatasets.put(d.getName(), entry);
            }
        }
    }

    public static String safeName(String name) {
        return name.toLowerCase().replace(" ", "_");
    }

}


================================================
FILE: src/main/java/org/clueminer/clustering/benchmark/BenchParams.java
================================================
/*
 * Copyright (C) 2011-2016 clueminer.org
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation, either version 3 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program.  If not, see <http://www.gnu.org/licenses/>.
 */
package org.clueminer.clustering.benchmark;

import com.beust.jcommander.Parameter;

/**
 *
 * @author Tomas Barton
 */
public class BenchParams extends AbsParams {

    @Parameter(names = "--n", description = "size of biggest dataset", required = false)
    public int n = 20;

    @Parameter(names = "--n-small", description = "size of smallest dataset", required = false)
    public int nSmall = 5;

    @Parameter(names = "--steps", description = "number of datasets which will be generated")
    public int steps = 4;

    @Parameter(names = "--dimension", description = "number of attributes of each dataset")
    public int dimension = 5;

    @Parameter(names = "--linkage", description = "linkage method")
    public String linkage = "Single Linkage";

}


================================================
FILE: src/main/java/org/clueminer/clustering/benchmark/ClusteringBenchmark.java
================================================
/*
 * Copyright (C) 2011-2016 clueminer.org
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation, either version 3 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program.  If not, see <http://www.gnu.org/licenses/>.
 */
package org.clueminer.clustering.benchmark;

import org.clueminer.clustering.aggl.linkage.CompleteLinkage;
import org.clueminer.clustering.aggl.linkage.SingleLinkage;
import org.clueminer.clustering.api.AgglomerativeClustering;
import org.clueminer.clustering.api.AlgParams;
import org.clueminer.clustering.api.ClusteringAlgorithm;
import org.clueminer.clustering.api.ClusteringFactory;
import org.clueminer.dataset.api.Dataset;
import org.clueminer.dataset.api.Instance;
import org.clueminer.utils.Props;

/**
 * Execute single clustering
 *
 * @author Tomas Barton
 * @param <E>
 */
public class ClusteringBenchmark<E extends Instance> {

    public Container<E> cluster(Dataset<E> dataset, Props props) {
        ClusteringFactory cf = ClusteringFactory.getInstance();
        ClusteringAlgorithm algorithm = cf.getProvider(props.get(AlgParams.ALG));

        return new Container<>(algorithm, dataset, props);
    }

    public Container<E> cluster(ClusteringAlgorithm algorithm, Dataset<E> dataset, Props props) {
        return new Container<>(algorithm, dataset, props);
    }

    public Container<E> hclust(final AgglomerativeClustering algorithm, final Dataset<E> dataset, final String linkage) {
        Props props = new Props();
        props.put(AlgParams.LINKAGE, linkage);
        final Container<E> runnable = new Container<>(algorithm, dataset, props);
        return runnable;
    }

    public Container<E> singleLinkage(final AgglomerativeClustering algorithm, final Dataset<E> dataset) {
        return hclust(algorithm, dataset, SingleLinkage.name);
    }

    public Container<E> completeLinkage(final AgglomerativeClustering algorithm, final Dataset<E> dataset) {
        return hclust(algorithm, dataset, CompleteLinkage.name);
    }

}


================================================
FILE: src/main/java/org/clueminer/clustering/benchmark/Container.java
================================================
/*
 * Copyright (C) 2011-2016 clueminer.org
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation, either version 3 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program.  If not, see <http://www.gnu.org/licenses/>.
 */
package org.clueminer.clustering.benchmark;

import org.clueminer.clustering.TreeDiff;
import org.clueminer.clustering.api.AgglomerativeClustering;
import org.clueminer.clustering.api.AlgParams;
import org.clueminer.clustering.api.Clustering;
import org.clueminer.clustering.api.ClusteringAlgorithm;
import org.clueminer.clustering.api.ClusteringType;
import org.clueminer.clustering.api.HierarchicalResult;
import org.clueminer.dataset.api.Dataset;
import org.clueminer.dataset.api.Instance;
import org.clueminer.exec.ClusteringExecutorCached;
import org.clueminer.utils.Props;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/**
 *
 * @author Tomas Barton
 * @param <E>
 */
public class Container<E extends Instance> implements Runnable {

    private HierarchicalResult result;
    private Clustering clustering;
    private final Dataset<E> dataset;
    private Props params;
    private static final Logger LOG = LoggerFactory.getLogger(Container.class);
    private ClusteringExecutorCached executor;

    public Container(ClusteringAlgorithm algorithm, Dataset<E> dataset) {
        this.executor = new ClusteringExecutorCached();
        executor.setAlgorithm(algorithm);
        this.dataset = dataset;
        this.params = new Props();
    }

    public Container(ClusteringAlgorithm algorithm, Dataset<E> dataset, Props params) {
        this.executor = new ClusteringExecutorCached();
        executor.setAlgorithm(algorithm);
        this.dataset = dataset;
        this.params = params;
    }

    public HierarchicalResult hierarchical(AgglomerativeClustering algorithm, Dataset<E> dataset, Props params) {
        params.put(AlgParams.CLUSTERING_TYPE, ClusteringType.ROWS_CLUSTERING);
        return algorithm.hierarchy(dataset, params);
    }

    @Override
    public void run() {
        if (executor.getAlgorithm() instanceof AgglomerativeClustering) {
            this.result = executor.hclustRows(dataset, params);
        } else {
            this.clustering = executor.clusterRows(dataset, params);
        }
    }

    public Clustering cluster(ClusteringAlgorithm algorithm, Dataset<E> dataset, Props params) {
        return executor.clusterRows(dataset, params);
    }

    public boolean equals(Container other) {
        if (this.result == null || other.result == null) {
            throw new RuntimeException("got null result. this = " + result + " other = " + other.result);
        }
        return TreeDiff.compare(this.result, other.result);
    }

}


================================================
FILE: src/main/java/org/clueminer/clustering/benchmark/Experiment.java
================================================
/*
 * Copyright (C) 2011-2016 clueminer.org
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation, either version 3 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program.  If not, see <http://www.gnu.org/licenses/>.
 */
package org.clueminer.clustering.benchmark;

import java.util.Random;
import org.clueminer.clustering.api.AgglomerativeClustering;
import org.clueminer.clustering.api.ClusteringAlgorithm;
import org.clueminer.dataset.api.Dataset;
import org.clueminer.dataset.api.Instance;
import org.clueminer.dataset.impl.ArrayDataset;
import org.clueminer.report.NanoBench;
import org.clueminer.utils.Props;
import org.openide.util.Exceptions;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/**
 *
 * @author Tomas Barton
 * @param <E>
 */
public class Experiment<E extends Instance> implements Runnable {

    protected final Random rand;
    protected final BenchParams params;
    private ClusteringAlgorithm[] algorithms;
    protected final String results;
    private static final Logger LOG = LoggerFactory.getLogger(Experiment.class);

    public Experiment(BenchParams params, String results) {
        rand = new Random();
        this.params = params;
        this.results = results;
    }

    public Experiment(BenchParams params, String results, ClusteringAlgorithm[] algorithms) {
        rand = new Random();
        this.params = params;
        this.results = results;
        this.algorithms = algorithms;
    }

    @Override
    public void run() {
        //an increment of 0 would make the loop over dataset sizes never terminate
        int inc = Math.max(1, (params.n - params.nSmall) / params.steps);

        String[] names = new String[algorithms.length];
        int j = 0;
        for (ClusteringAlgorithm alg : algorithms) {
            names[j++] = alg.getName();
        }

        GnuplotReporter reporter = new GnuplotReporter(results,
                new String[]{"algorithm", "linkage", "n"}, names, params.nSmall + "-" + params.n,
                10);
        LOG.info("increment = {}", inc);
        ClusteringBenchmark bench = new ClusteringBenchmark();
        Container container;
        AgglomerativeClustering aggl;
        for (int i = params.nSmall; i <= params.n; i += inc) {
            Dataset<E> dataset = generateData(i, params.dimension);
            for (ClusteringAlgorithm alg : algorithms) {
                String[] opts = new String[]{alg.getName(), params.linkage, String.valueOf(dataset.size())};

                if (alg instanceof AgglomerativeClustering) {
                    aggl = (AgglomerativeClustering) alg;
                    container = bench.hclust(aggl, dataset, params.linkage);
                } else {
                    container = bench.cluster(alg, dataset, new Props());
                }

                NanoBench.create().measurements(params.repeat).collect(reporter, opts).measure(
                        alg.getName() + " - " + dataset.size(),
                        container
                );
                // Get the Java runtime
                Runtime runtime = Runtime.getRuntime();
                // Run the garbage collector
                runtime.gc();
                try {
                    Thread.sleep(1000);
                } catch (InterruptedException ex) {
                    Exceptions.printStackTrace(ex);
                }
            }
        }
        reporter.finish();
    }

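    /**
     * Generate a random dataset of doubles with the given dimensions. Values
     * are drawn uniformly from the interval [0, 1).
     *
     * @param size number of instances (rows)
     * @param dim  number of attributes (columns)
     * @return dataset filled with random values
     */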
    public Dataset<E> generateData(int size, int dim) {
        LOG.info("generating data: {}x{}", size, dim);
        Dataset<E> dataset = new ArrayDataset<>(size, dim);
        for (int i = 0; i < dim; i++) {
            dataset.attributeBuilder().create("attr-" + i, "NUMERIC");
        }
        for (int i = 0; i < size; i++) {
            dataset.instance(i).setName(String.valueOf(i));
            for (int j = 0; j < dim; j++) {
                dataset.set(i, j, rand.nextDouble());
            }
        }

        return dataset;
    }

}
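
The dataset sizes visited by `run()` follow from `--n`, `--n-small` and `--steps`. The standalone sketch below reproduces only the loop bounds; `SizeSchedule` and `sizes` are hypothetical names that do not exist in the repository, and rejecting a zero increment is an assumption of the sketch. It suggests that `--steps` sets the increment rather than the exact number of datasets, since integer division rounds the increment down and both end points are included.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the repository: mirrors only the loop
// bounds used in Experiment.run().
class SizeSchedule {

    static List<Integer> sizes(int nSmall, int n, int steps) {
        int inc = (n - nSmall) / steps; // integer division rounds down
        if (inc < 1) {
            // assumption of this sketch: reject inputs for which the loop would not advance
            throw new IllegalArgumentException("n - nSmall must be at least steps");
        }
        List<Integer> result = new ArrayList<>();
        for (int i = nSmall; i <= n; i += inc) {
            result.add(i);
        }
        return result;
    }

    public static void main(String[] args) {
        // defaults from BenchParams: --n-small 5, --n 20, --steps 4
        System.out.println(sizes(5, 20, 4));
    }
}
```

With the defaults from `BenchParams` the increment is 3 and the sketch prints `[5, 8, 11, 14, 17, 20]`, i.e. six sizes for `--steps 4`.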


================================================
FILE: src/main/java/org/clueminer/clustering/benchmark/GnuplotReporter.java
================================================
/*
 * Copyright (C) 2011-2016 clueminer.org
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation, either version 3 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program.  If not, see <http://www.gnu.org/licenses/>.
 */
package org.clueminer.clustering.benchmark;

import com.google.common.collect.ObjectArrays;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.PrintWriter;
import java.io.UnsupportedEncodingException;
import java.util.LinkedList;
import org.clueminer.gnuplot.GnuplotHelper;
import org.clueminer.gnuplot.PointTypeIterator;
import org.clueminer.report.BigORes;
import org.clueminer.report.Reporter;
import org.openide.util.Exceptions;

/**
 *
 * @author Tomas Barton
 */
public class GnuplotReporter extends GnuplotHelper implements Reporter {

    private final String dataDir;
    private final File dataFile;
    private final LinkedList<String> plots;

    public GnuplotReporter(String folder, String[] opts, String[] algorithms, String suffix, int xCol) {
        this.dataDir = folder + File.separatorChar + "data";
        mkdir(dataDir);
        this.dataFile = new File(dataDir + File.separatorChar + "results-" + suffix + ".csv");
        this.plots = new LinkedList<>();
        writeHeader(opts);

        String memPath = dataDir + File.separatorChar + "mem" + suffix + ".gpt";
        String cpuPath = dataDir + File.separatorChar + "cpu" + suffix + ".gpt";
        String cpu2Path = dataDir + File.separatorChar + "cpu2" + suffix + ".gpt";
        String tpsPath = dataDir + File.separatorChar + "tps" + suffix + ".gpt";
        String timePath = dataDir + File.separatorChar + "time" + suffix + ".gpt";

        writePlotScript(new File(memPath),
                plotComplexity(8, "memory (kB)", xCol, 7, dataFile.getName(), algorithms, "Memory usage of hierarchical clustering algorithms - " + opts[1], false));
        writePlotScript(new File(cpuPath),
                plotCpu(8, "CPU", xCol, 2, dataFile.getName(), algorithms, "CPU usage of hierarchical clustering algorithms - " + opts[1], false));
        writePlotScript(new File(cpu2Path),
                plotComplexity(8, "CPU time", xCol, 2, dataFile.getName(), algorithms, "CPU usage of hierarchical clustering algorithms - " + opts[1], false));
        writePlotScript(new File(tpsPath),
                plotComplexity(8, "tps", xCol, 5, dataFile.getName(), algorithms, "Transactions per second - " + opts[1], true));

        writePlotScript(new File(timePath),
                plotComplexity(8, "time", xCol, 4, dataFile.getName(), algorithms, "Execution time - " + opts[1], true));

        writeBashScript(folder);
    }

    private void writeHeader(String[] opts) {
        String[] head = new String[]{"label", "avg time (ms)", "memory (MB)", "total time (s)", "tps", "repeats", "memory (kB)"};
        String[] line = ObjectArrays.concat(head, opts, String.class);
        writeCsvLine(dataFile, line, false);
    }

    /**
     *
     * @param result
     */
    @Override
    public void finalResult(BigORes result) {
        String[] res = new String[]{result.getLabel(), result.avgTimeMs(),
            result.totalMemoryInMb(), result.totalTimeInS(), result.tps(),
            result.measurements(), result.totalMemoryInKb()
        };
        String[] line = ObjectArrays.concat(res, result.getOpts(), String.class);
        writeCsvLine(dataFile, line, true);
    }

    /**
     * Write a Gnuplot script to a file and remember its name (without
     * extension) for the generated bash scripts
     *
     * @param file   file to write the Gnuplot script to
     * @param script content of the Gnuplot script
     */
    private void writePlotScript(File file, String script) {
        //try-with-resources closes the writer even when writing fails
        try (PrintWriter template = new PrintWriter(file, "UTF-8")) {
            template.write(script);
        } catch (FileNotFoundException | UnsupportedEncodingException ex) {
            Exceptions.printStackTrace(ex);
        }
        plots.add(withoutExtension(file));
    }

    private String plotCpu(int labelPos, String yLabel, int x, int y, String dataFile, String[] algorithms, String title, boolean logscale) {
        String res = "set datafile separator \",\"\n"
                + "set key outside bottom horizontal box\n"
                + "set title \"" + title + "\"\n"
                + "set xlabel \"data size\" font \"Times,12\"\n"
                + "set ylabel \"" + yLabel + "\" font \"Times,12\"\n"
                //   + "set xtics 0,0.5 nomirror\n"
                //   + "set ytics 0,0.5 nomirror\n"
                + "set mytics 2\n"
                + "set mx2tics 2\n"
                + "set grid\n"
                + "set pointsize 0.5\n"
                + "f(x) = 0.5 * x**2\n";
        if (logscale) {
            res += "set logscale y 2\n";
        }
        int i = 0;
        PointTypeIterator pti = new PointTypeIterator();
        for (String alg : algorithms) {
            if (i == 0) {
                res += "plot ";
            }
            res += "\"< awk -F\\\",\\\" '{if($" + labelPos + " == \\\"" + alg
                    + "\\\") print}' " + dataFile + "\" u " + x + ":" + y
                    + " t \"" + alg + "\" w linespoints pt " + pti.next();
            res += ", \\\n";
            i++;
        }
        res += "f(x) title 'x^2' with lines linestyle 18\n";
        return res;
    }

    private String plotComplexity(int labelPos, String yLabel, int x, int y, String dataFile, String[] algorithms, String title, boolean logscale) {
        String res = "set datafile separator \",\"\n"
                + "set key outside bottom horizontal box\n"
                + "set title \"" + title + "\"\n"
                + "set xlabel \"data size\" font \"Times,12\"\n"
                + "set ylabel \"" + yLabel + "\" font \"Times,12\"\n"
                //   + "set xtics 0,0.5 nomirror\n"
                //   + "set ytics 0,0.5 nomirror\n"
                + "set mytics 2\n"
                + "set mx2tics 2\n"
                + "set grid\n"
                + "set pointsize 0.5\n";
        if (logscale) {
            res += "set logscale y 2\n";
        }
        int i = 0;
        int last = algorithms.length - 1;
        PointTypeIterator pti = new PointTypeIterator();
        for (String alg : algorithms) {
            if (i == 0) {
                res += "plot ";
            }
            res += "\"< awk -F\\\",\\\" '{if($" + labelPos + " == \\\"" + alg
                    + "\\\") print}' " + dataFile + "\" u " + x + ":" + y
                    + " t \"" + alg + "\" w linespoints pt " + pti.next();
            if (i != last) {
                res += ", \\\n";
            } else {
                res += "\n";
            }

            i++;
        }
        return res;
    }

    /**
     * Should be called when all plot files are written
     */
    public void finish() {
        //TODO maybe some cleanup?
    }

    private void writeBashScript(String dataDir) {
        try {
            bashPlotScript(plots.toArray(new String[plots.size()]), dataDir, "data", "set term pdf font 'Times-New-Roman,8'", "pdf");
            bashPlotScript(plots.toArray(new String[plots.size()]), dataDir, "data", "set terminal pngcairo size 1024,768 enhanced font 'Verdana,10'", "png");

        } catch (IOException ex) {
            //also covers FileNotFoundException and UnsupportedEncodingException
            Exceptions.printStackTrace(ex);
        }
    }

    /**
     *
     * @param plots      plot names without extension
     * @param dir        base dir
     * @param gnuplotDir directory with gnuplot file
     * @param term
     * @param ext        extension of the output format
     * @throws FileNotFoundException
     * @throws UnsupportedEncodingException
     * @throws IOException
     */
    public static void bashPlotScript(String[] plots, String dir, String gnuplotDir, String term, String ext)
            throws FileNotFoundException, UnsupportedEncodingException, IOException {
        //bash script to generate results
        String shFile = dir + File.separatorChar + "_plot-" + ext;
        try (PrintWriter template = new PrintWriter(shFile, "UTF-8")) {
            template.write(bashTemplate(gnuplotDir));
            template.write("TERM=\"" + term + "\"\n");
            int pos;
            for (String plot : plots) {
                pos = plot.indexOf(".");
                if (pos > 0) {
                    //remove extension part
                    plot = plot.substring(0, pos);
                }
                template.write("gnuplot -e \"${TERM}\" " + plot + gnuplotExtension
                        + " > $PWD" + File.separatorChar + ".." + File.separatorChar + plot + "." + ext + "\n");
            }
        }
        Runtime.getRuntime().exec("chmod u+x " + shFile);
    }

}
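
The `labelPos`, `x` and `y` arguments passed to `plotCpu` and `plotComplexity` are 1-based CSV column numbers, as used by awk's `$n` and Gnuplot's `u x:y`. The sketch below derives them from the header that `writeHeader` produces (seven fixed columns followed by `opts`). `CsvColumns` and `column` are hypothetical names introduced for illustration only.

```java
import java.util.Arrays;
import java.util.stream.Stream;

// Hypothetical helper, not part of the repository.
class CsvColumns {

    // fixed columns written by GnuplotReporter.writeHeader
    static final String[] HEAD = {"label", "avg time (ms)", "memory (MB)",
        "total time (s)", "tps", "repeats", "memory (kB)"};

    // 1-based position of a column in the header HEAD + opts, or -1 if absent
    static int column(String name, String... opts) {
        String[] header = Stream.concat(Arrays.stream(HEAD), Arrays.stream(opts))
                .toArray(String[]::new);
        int idx = Arrays.asList(header).indexOf(name);
        return idx < 0 ? -1 : idx + 1;
    }

    public static void main(String[] args) {
        String[] opts = {"algorithm", "linkage", "n"}; // as passed by Experiment
        System.out.println("algorithm -> " + column("algorithm", opts));
        System.out.println("n -> " + column("n", opts));
    }
}
```

For the `opts` used by `Experiment` this gives column 8 for `algorithm` and column 10 for `n`, which matches the `labelPos` of 8 and the `xCol` of 10 passed there. With the `opts` used by `ParamExperiment` (`algorithm`, `n`, `config`) the `n` column moves to 9.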


================================================
FILE: src/main/java/org/clueminer/clustering/benchmark/Main.java
================================================
/*
 * Copyright (C) 2011-2016 clueminer.org
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation, either version 3 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program.  If not, see <http://www.gnu.org/licenses/>.
 */
package org.clueminer.clustering.benchmark;

import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import org.clueminer.clustering.benchmark.chameleon2.Cham2Bench;
import org.clueminer.clustering.benchmark.consensus.ConsensusExp;
import org.clueminer.clustering.benchmark.cutoff.CutoffExp;
import org.clueminer.clustering.benchmark.exp.Data;
import org.clueminer.clustering.benchmark.exp.EvolveScores;
import org.clueminer.clustering.benchmark.exp.HclusPar;
import org.clueminer.clustering.benchmark.exp.HclusPar2;
import org.clueminer.clustering.benchmark.exp.Hclust;
import org.clueminer.clustering.benchmark.gen.NsgaGen;
import org.clueminer.clustering.benchmark.nsga.NsgaScore;
import org.clueminer.clustering.benchmark.partition.PartitionExp;

/**
 *
 * @author deric
 */
public class Main {

    private static final Map<String, Bench> map = new HashMap<>();
    private static Main instance;

    public Main() {
        map.put("hclust", new Hclust());
        map.put("data", new Data());
        map.put("hclust-par", new HclusPar());
        map.put("hclust-par2", new HclusPar2());
        map.put("evolve-sc", new EvolveScores());
        map.put("nsga", new NsgaScore());
        map.put("nsga-gen", new NsgaGen());
        map.put("consensus", new ConsensusExp());
        map.put("cutoff", new CutoffExp());
        map.put("partition", new PartitionExp());
        map.put("chameleon2", new Cham2Bench());
    }

    /**
     * Entrypoint to all experiments
     *
     * @param args the command line arguments
     */
    public static void main(String[] args) {
        if (instance == null) {
            instance = new Main();
        }
        if (args.length == 0) {
            usage();
        }
        String exp = args[0];
        if (!Main.map.containsKey(exp)) {
            usage();
        }

        String[] other = Arrays.copyOfRange(args, 1, args.length);
        Bench bench = Main.map.get(exp);
        //run it
        bench.main(other);
    }

    private static void usage() {
        System.out.println("Usage: java -jar {jar name} [experiment name] [[optional arguments]]");
        System.out.println("Valid experiment values are:");
        for (String key : map.keySet()) {
            for (int i = 0; i < 5; i++) {
                System.out.print(" ");
            }
            System.out.print("- " + key + "\n");
        }
        System.out.println("use '[experiment] --help' to find out more about optional arguments");
        System.out.println("--------------------------");
        System.exit(1);
    }

}
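
`Main` treats the first command line argument as the experiment name and passes the remaining arguments to the selected `Bench`. The sketch below shows the same split with placeholder handlers; `DispatchSketch`, `EXPERIMENTS` and `dispatch` are hypothetical names, and returning a string instead of running a benchmark is a simplification made for this example.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for Main, not part of the repository.
class DispatchSketch {

    static final Map<String, Function<String[], String>> EXPERIMENTS = new LinkedHashMap<>();

    static {
        // placeholder handlers; the real map holds Bench instances
        EXPERIMENTS.put("hclust", rest -> "hclust " + Arrays.toString(rest));
        EXPERIMENTS.put("cutoff", rest -> "cutoff " + Arrays.toString(rest));
    }

    static String dispatch(String[] args) {
        if (args.length == 0 || !EXPERIMENTS.containsKey(args[0])) {
            return "usage"; // Main prints the usage text and exits here
        }
        // everything after the experiment name belongs to the experiment
        String[] rest = Arrays.copyOfRange(args, 1, args.length);
        return EXPERIMENTS.get(args[0]).apply(rest);
    }

    public static void main(String[] args) {
        System.out.println(dispatch(new String[]{"hclust", "--n", "100"}));
    }
}
```

Calling `dispatch` with `hclust --n 100` hands `[--n, 100]` to the `hclust` handler, while an empty or unknown first argument falls through to the usage branch.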


================================================
FILE: src/main/java/org/clueminer/clustering/benchmark/ParamExperiment.java
================================================
/*
 * Copyright (C) 2011-2016 clueminer.org
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation, either version 3 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program.  If not, see <http://www.gnu.org/licenses/>.
 */
package org.clueminer.clustering.benchmark;

import org.clueminer.clustering.api.AlgParams;
import org.clueminer.dataset.api.Dataset;
import org.clueminer.dataset.api.Instance;
import org.clueminer.report.NanoBench;
import org.clueminer.utils.Props;
import org.openide.util.Exceptions;

/**
 *
 * @author deric
 * @param <E>
 */
public class ParamExperiment<E extends Instance> extends Experiment<E> {

    private Props[] configs;

    public ParamExperiment(BenchParams params, String results) {
        super(params, results);
    }

    public ParamExperiment(BenchParams params, String results, Props[] configs) {
        super(params, results);

        this.configs = configs;
    }

    @Override
    public void run() {
        //an increment of 0 would make the loop over dataset sizes never terminate
        int inc = Math.max(1, (params.n - params.nSmall) / params.steps);

        String[] names = new String[configs.length];
        int j = 0;
        for (Props alg : configs) {
            names[j++] = alg.get(AlgParams.ALG);
        }

        //json props must be last column (in order to avoid issues with gnuplot parsing commas in json)
        GnuplotReporter reporter = new GnuplotReporter(results,
                new String[]{"algorithm", "n", "config"},
                names, params.nSmall + "-" + params.n, 9);
        System.out.println("increment = " + inc);
        ClusteringBenchmark bench = new ClusteringBenchmark();
        Container container;
        for (int i = params.nSmall; i <= params.n; i += inc) {
            Dataset<E> dataset = generateData(i, params.dimension);
            for (Props props : configs) {
                String[] opts = new String[]{props.get(AlgParams.ALG), String.valueOf(dataset.size()), props.toJson()};
                try {
                    container = bench.cluster(dataset, props);
                    NanoBench.create().measurements(params.repeat).collect(reporter, opts).measure(
                            props.get(AlgParams.ALG) + " - " + dataset.size(),
                            container
                    );
                } catch (Exception ex) {
                    Exceptions.printStackTrace(ex);
                }
                // Get the Java runtime
                Runtime runtime = Runtime.getRuntime();
                // Run the garbage collector
                runtime.gc();
                try {
                    Thread.sleep(1000);
                } catch (InterruptedException ex) {
                    Exceptions.printStackTrace(ex);
                }
            }
        }
        reporter.finish();
    }

}


================================================
FILE: src/main/java/org/clueminer/clustering/benchmark/chameleon2/Cham2Bench.java
================================================
/*
 * Copyright (C) 2011-2016 clueminer.org
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation, either version 3 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program.  If not, see <http://www.gnu.org/licenses/>.
 */
package org.clueminer.clustering.benchmark.chameleon2;

import java.io.File;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import org.clueminer.clustering.api.AlgParams;
import static org.clueminer.clustering.benchmark.Bench.ensureFolder;
import org.clueminer.clustering.benchmark.BenchParams;
import org.clueminer.clustering.benchmark.ParamExperiment;
import org.clueminer.clustering.benchmark.exp.Hclust;
import org.clueminer.log.ClmLog;
import org.clueminer.utils.Props;

/**
 *
 * @author deric
 */
public class Cham2Bench extends Hclust {

    /**
     * @param args the command line arguments
     */
    @Override
    public void main(String[] args) {
        BenchParams params = parseArguments(args);
        ClmLog.setup(params.log);

        benchmarkFolder = params.home + File.separatorChar + "chameleon2";
        ensureFolder(benchmarkFolder);

        System.out.println("# n = " + params.n);
        System.out.println("=== starting experiment:");

        Props ch2 = new Props();
        ch2.put(AlgParams.ALG, "Chameleon");

        Props hc = new Props();
        hc.put(AlgParams.ALG, "HC-LW(ms)");
        hc.put(AlgParams.LINKAGE, "Single");

        Props dbscan = new Props();
        dbscan.put(AlgParams.ALG, "DBSCAN");

        Props km = new Props();
        km.put(AlgParams.ALG, "k-means");

        Props[] algorithms = new Props[]{
            ch2,
            hc,
            dbscan, km
        };
        ParamExperiment exp = new ParamExperiment(params, benchmarkFolder, algorithms);
        ExecutorService execService = Executors.newFixedThreadPool(1);
        execService.submit(exp);
        execService.shutdown();
    }

}


================================================
FILE: src/main/java/org/clueminer/clustering/benchmark/consensus/ConsensusExp.java
================================================
/*
 * Copyright (C) 2011-2016 clueminer.org
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation, either version 3 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program.  If not, see <http://www.gnu.org/licenses/>.
 */
package org.clueminer.clustering.benchmark.consensus;

import com.beust.jcommander.JCommander;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import org.clueminer.clustering.benchmark.Bench;
import static org.clueminer.clustering.benchmark.Bench.ensureFolder;
import static org.clueminer.clustering.benchmark.Bench.printUsage;
import org.clueminer.log.ClmLog;

/**
 *
 * @author deric
 */
public class ConsensusExp extends Bench {

    public static final String name = "consensus";

    protected static ConsensusParams parseArguments(String[] args) {
        ConsensusParams params = new ConsensusParams();
        JCommander cmd = new JCommander(params);
        printUsage(args, cmd, params);
        return params;
    }

    @Override
    public void main(String[] args) {
        ConsensusParams params = parseArguments(args);
        ClmLog.setup(params.log);

        loadBenchArtificial();
        System.out.println("dataset: " + params.dataset);

        benchmarkFolder = params.home + '/' + "benchmark" + '/' + name;
        ensureFolder(benchmarkFolder);
        System.out.println("writing results to: " + benchmarkFolder);

        System.out.println("=== starting " + name);
        ConsensusRun exp = new ConsensusRun(params, benchmarkFolder, provider.getDataset(params.dataset));
        ExecutorService execService = Executors.newFixedThreadPool(1);
        execService.submit(exp);
        execService.shutdown();
    }
}


================================================
FILE: src/main/java/org/clueminer/clustering/benchmark/consensus/ConsensusParams.java
================================================
/*
 * Copyright (C) 2011-2016 clueminer.org
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation, either version 3 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program.  If not, see <http://www.gnu.org/licenses/>.
 */
package org.clueminer.clustering.benchmark.consensus;

import com.beust.jcommander.Parameter;
import org.clueminer.clustering.benchmark.AbsParams;

/**
 *
 * @author deric
 */
public class ConsensusParams extends AbsParams {

    @Parameter(names = "--dataset", description = "use specific dataset")
    public String dataset = null;

    @Parameter(names = "--algorithm", description = "clustering algorithm name")
    public String algorithm = "K-Means bagging";

    @Parameter(names = "--k", description = "expected number of clusters (some methods might not respect this)")
    public int k = -1;

    @Parameter(names = "--method", description = "Initialization and consensus approach")
    public String method = "";

    @Parameter(names = "--fixed", description = "whether to use 'correct' k as parameter")
    public boolean fixedK = false;

}


================================================
FILE: src/main/java/org/clueminer/clustering/benchmark/consensus/ConsensusRun.java
================================================
/*
 * Copyright (C) 2011-2016 clueminer.org
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation, either version 3 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program.  If not, see <http://www.gnu.org/licenses/>.
 */
package org.clueminer.clustering.benchmark.consensus;

import com.google.common.base.Supplier;
import com.google.common.collect.Maps;
import com.google.common.collect.Table;
import com.google.common.collect.Tables;
import java.io.File;
import java.util.LinkedList;
import java.util.Map;
import org.clueminer.bagging.COMUSA;
import org.clueminer.bagging.CoAssociationReduce;
import org.clueminer.bagging.KMeansBagging;
import org.clueminer.clustering.aggl.linkage.AverageLinkage;
import org.clueminer.clustering.api.AlgParams;
import org.clueminer.clustering.api.ClusterEvaluation;
import org.clueminer.clustering.api.Clustering;
import org.clueminer.clustering.api.ClusteringAlgorithm;
import org.clueminer.clustering.api.ClusteringFactory;
import org.clueminer.clustering.api.Executor;
import org.clueminer.clustering.api.factory.EvaluationFactory;
import static org.clueminer.clustering.benchmark.Bench.ensureFolder;
import static org.clueminer.clustering.benchmark.Bench.safeName;
import org.clueminer.dataset.api.Dataset;
import org.clueminer.dataset.api.Instance;
import org.clueminer.dataset.benchmark.ResultsCollector;
import org.clueminer.exec.ClusteringExecutorCached;
import org.clueminer.utils.Props;
import org.openide.util.Exceptions;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/**
 *
 * @author deric
 */
public class ConsensusRun implements Runnable {

    private static ResultsCollector rc;
    private ConsensusParams params;
    private String benchmarkFolder;
    //table for keeping results from experiments
    private Table<String, String, Double> table;
    private static final Logger LOG = LoggerFactory.getLogger(ConsensusRun.class);
    private Dataset<? extends Instance> dataset;

    public ConsensusRun(ConsensusParams params, String benchmarkFolder, Dataset<? extends Instance> dataset) {
        this.params = params;
        this.benchmarkFolder = benchmarkFolder;
        this.dataset = dataset;

        createTable();
        rc = new ResultsCollector(table);
    }

    private void createTable() {
        table = Tables.newCustomTable(
                Maps.<String, Map<String, Double>>newHashMap(),
                new Supplier<Map<String, Double>>() {
            @Override
            public Map<String, Double> get() {
                return Maps.newHashMap();
            }
        });
    }

    @Override
    public void run() {
        try {
            String name;
            String algorithm;
            String folder;
            EvaluationFactory ef = EvaluationFactory.getInstance();
            LinkedList<ClusterEvaluation> evals = new LinkedList<>();
            evals.add(ef.getProvider("NMI-sqrt"));
            evals.add(ef.getProvider("NMI-sum"));
            evals.add(ef.getProvider("Adjusted Rand"));
            evals.add(ef.getProvider("Deviation"));

            ClusteringAlgorithm alg = ClusteringFactory.getInstance().getProvider(params.algorithm);
            algorithm = safeName(alg.getName());
            Executor exec = new ClusteringExecutorCached(alg);

            createTable();
            name = safeName(dataset.getName());
            folder = benchmarkFolder + File.separatorChar + name;
            ensureFolder(folder);

            String csvRes = folder + File.separatorChar + algorithm + "_" + params.method + "_" + name + ".csv";
            LOG.info("dataset: {} size: {} num attr: {}", name, dataset.size(), dataset.attributeCount());
            //ensureFolder(benchmarkFolder + File.separatorChar + name);
            Clustering c;
            Props props = algorithmSetup(params.method);
            if (params.fixedK) {
                props.putBoolean(KMeansBagging.FIXED_K, true);
            }
            if (params.k > 0) {
                props.putInt("k", params.k);
            } else if (!props.containsKey("k") && props.getBoolean(KMeansBagging.FIXED_K, false)) {
                //use "correct" number of clusters if k not specified
                props.putInt("k", dataset.getClasses().size());
            }
            double score;
            System.out.println(props.toString());
            for (int i = 0; i < params.repeat; i++) {
                c = exec.clusterRows(dataset, props);
                for (ClusterEvaluation eval : evals) {
                    if (c.getEvaluationTable() != null) {
                        score = c.getEvaluationTable().getScore(eval);
                    } else {
                        score = eval.score(c);
                    }
                    System.out.println(eval.getName() + ": " + score + ", clusters: " + c.size());
                    table.put("run " + i, eval.getName(), score);
                }
            }
            rc.writeAvgColsCsv(table, csvRes);

        } catch (Exception e) {
            Exceptions.printStackTrace(e);
        }
    }

    private Props algorithmSetup(String alg) {
        Props p = new Props();
        p.putInt(KMeansBagging.BAGGING, 10);
        switch (alg) {
            case "KmB-COMUSA-RAND":
                p.put(KMeansBagging.CONSENSUS, COMUSA.name);
                p.put(KMeansBagging.INIT_METHOD, "RANDOM");
                p.putDouble(COMUSA.RELAX, 1.0);
                p.putInt(KMeansBagging.MAX_K, 25);
                break;
            case "KmB-COMUSA-MO":
                p.put(KMeansBagging.CONSENSUS, COMUSA.name);
                p.put(KMeansBagging.INIT_METHOD, "MO");
                p.putDouble(COMUSA.RELAX, 1.0);
                p.put("mo_1", "AIC");
                p.put("mo_2", "SD index");
                p.putInt(KMeansBagging.MAX_K, 25);
                break;
            case "KmB-COMUSA-RAND-fixed":
                p.put(KMeansBagging.CONSENSUS, COMUSA.name);
                p.put(KMeansBagging.INIT_METHOD, "RANDOM");
                p.putDouble(COMUSA.RELAX, 1.0);
                p.putBoolean(KMeansBagging.FIXED_K, true);
                break;
            case "KmB-CoAssocHAC-MO-avg":
                p.put(KMeansBagging.CONSENSUS, CoAssociationReduce.name);
                p.put(KMeansBagging.INIT_METHOD, "MO");
                p.put("mo_1", "AIC");
                p.put("mo_2", "SD index");
                p.put(AlgParams.LINKAGE, AverageLinkage.name);
                break;
            case "KmB-CoAssocHAC-MO-AIC_SD":
                p.put(KMeansBagging.CONSENSUS, CoAssociationReduce.name);
                p.put(KMeansBagging.INIT_METHOD, "MO");
                p.put("mo_1", "AIC");
                p.put("mo_2", "SD index");
                break;
            default:
                break;

        }
        return p;
    }

}


================================================
FILE: src/main/java/org/clueminer/clustering/benchmark/cutoff/CutoffComparison.java
================================================
/*
 * Copyright (C) 2011-2016 clueminer.org
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation, either version 3 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program.  If not, see <http://www.gnu.org/licenses/>.
 */
package org.clueminer.clustering.benchmark.cutoff;

import java.io.File;
import java.io.PrintWriter;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.Map;
import org.clueminer.clustering.api.AgglomerativeClustering;
import org.clueminer.clustering.api.Cluster;
import org.clueminer.clustering.api.ClusterEvaluation;
import org.clueminer.clustering.api.Clustering;
import org.clueminer.clustering.api.ClusteringFactory;
import org.clueminer.clustering.api.CutoffStrategy;
import org.clueminer.clustering.api.HierarchicalResult;
import org.clueminer.clustering.api.InternalEvaluator;
import org.clueminer.clustering.api.ScoreException;
import org.clueminer.clustering.api.factory.CutoffStrategyFactory;
import org.clueminer.clustering.api.factory.EvaluationFactory;
import org.clueminer.clustering.api.factory.InternalEvaluatorFactory;
import static org.clueminer.clustering.benchmark.Bench.ensureFolder;
import static org.clueminer.clustering.benchmark.Bench.safeName;
import org.clueminer.io.csv.CSVWriter;
import org.clueminer.dataset.api.Dataset;
import org.clueminer.dataset.api.Instance;
import org.clueminer.eval.hclust.HillClimbCutoff;
import org.clueminer.utils.Props;
import org.openide.util.Exceptions;
import org.slf4j.LoggerFactory;

/**
 *
 * @author Tomas Bruna
 */
public class CutoffComparison implements Runnable {

    private final CutoffParams params;
    private final String benchmarkFolder;
    private final ArrayList<Dataset<? extends Instance>> datasets;
    private Map<String, Average> averages;
    private LinkedList<ClusterEvaluation> externalEvals;
    private CSVWriter csv;
    private static final org.slf4j.Logger LOG = LoggerFactory.getLogger(CutoffComparison.class);

    public CutoffComparison(CutoffParams params, String benchmarkFolder, ArrayList<Dataset<? extends Instance>> datasets) {
        this.params = params;
        this.benchmarkFolder = benchmarkFolder;
        this.datasets = datasets;
        loadExternalEvals();
    }

    @Override
    public void run() {
        try {
            String folder;
            AgglomerativeClustering alg = (AgglomerativeClustering) ClusteringFactory.getInstance().getProvider(params.algorithm);
            folder = benchmarkFolder + File.separatorChar + "Cutoff comparison";
            ensureFolder(folder);
            String csvRes = folder + File.separatorChar + "Cutoff comparison with " + safeName(alg.getName()) + " on " + datasets.size() + " datasets.csv";

            PrintWriter writer = new PrintWriter(csvRes, "UTF-8");
            csv = new CSVWriter(writer, ',');
            csv.writeLine(alg.getName());

            initAverages();
            for (Dataset<? extends Instance> dataset : datasets) {
                System.out.println("Running comparisons on " + dataset.getName());
                //create dendrogram
                HierarchicalResult rowsResult = alg.hierarchy(dataset, new Props());
                writeHeader(dataset);

                String strategies[] = params.strategies.split(",");
                String internalEvals[] = params.internalEvals.split(",");
                //try different cutoff methods
                for (String strategy : strategies) {
                    for (String internalEval : internalEvals) {
                        CutoffStrategy cutoff = getCutoffStrategy(strategy.trim(), internalEval.trim());
                        rowsResult.findCutoff(cutoff);
                        Clustering c = rowsResult.getClustering();

                        averages.get(strategy.trim() + internalEval.trim()).addValues(c);
                        writeValues(cutoff, internalEval.trim(), c);

                        // strategies other than HillClimbCutoff are run once, not once per internal evaluator
                        if (!(cutoff instanceof HillClimbCutoff)) {
                            break;
                        }
                    }
                }
                csv.writeLine("");
                csv.writeLine("");
            }
            writeAverages();
            csv.close();
        } catch (Exception e) {
            Exceptions.printStackTrace(e);
        }
    }

    private void loadExternalEvals() {
        String evals[] = params.externalEvals.split(",");
        externalEvals = new LinkedList<>();
        EvaluationFactory ef = EvaluationFactory.getInstance();
        for (String eval : evals) {
            externalEvals.add(ef.getProvider(eval.trim()));
        }
    }

    private void writeHeader(Dataset<? extends Instance> dataset) {
        csv.writeLine("Dataset_" + safeName(dataset.getName()));
        String row[] = new String[externalEvals.size() + 2];
        row[0] = "Cutoff strategy";
        row[1] = "Internal eval";
        int i = 2;
        for (ClusterEvaluation eval : externalEvals) {
            row[i] = eval.getName();
            i++;
        }
        csv.writeNext(row);
    }

    private void writeValues(CutoffStrategy cutoff, String internalEval, Clustering c) {
        String row[] = new String[externalEvals.size() + 2];
        row[0] = cutoff.getName();
        if (cutoff instanceof HillClimbCutoff) {
            row[1] = internalEval;
        } else {
            row[1] = "";
        }
        int i = 2;
        for (ClusterEvaluation eval : externalEvals) {
            double score;
            if (c.getEvaluationTable() != null) {
                score = c.getEvaluationTable().getScore(eval);
            } else {
                try {
                    score = eval.score(c);
                } catch (ScoreException ex) {
                    score = Double.NaN;
                    LOG.info("failed to compute score {}: {}", eval.getName(), ex.getMessage());
                }
            }
            row[i] = String.valueOf(score);
            i++;
        }
        csv.writeNext(row);
    }

    private void writeAverages() {
        csv.writeLine("AVERAGES");
        String row[] = new String[externalEvals.size() + 2];
        for (Average average : averages.values()) {
            row[0] = average.name;
            row[1] = "";
            int i = 2;
            for (ClusterEvaluation eval : externalEvals) {
                row[i] = String.valueOf(average.getAverage(eval.getName()));
                i++;
            }
            csv.writeNext(row);
        }
    }

    private CutoffStrategy getCutoffStrategy(String strategy, String eval) {
        CutoffStrategy cutoffStrategy = CutoffStrategyFactory.getInstance().getProvider(strategy);
        InternalEvaluatorFactory<Instance, Cluster<Instance>> ief = InternalEvaluatorFactory.getInstance();
        InternalEvaluator evaluator = ief.getProvider(eval);
        cutoffStrategy.setEvaluator(evaluator);
        return cutoffStrategy;
    }

    private void initAverages() {
        averages = new HashMap<>();
        String strategies[] = params.strategies.split(",");
        String internalEvals[] = params.internalEvals.split(",");
        for (String strategy : strategies) {
            for (String internalEval : internalEvals) {
                CutoffStrategy cutoff = getCutoffStrategy(strategy.trim(), internalEval.trim());
                if (!(cutoff instanceof HillClimbCutoff)) {
                    averages.put(strategy.trim() + internalEval.trim(), new Average(strategy.trim()));
                    break;
                } else {
                    averages.put(strategy.trim() + internalEval.trim(), new Average(strategy.trim() + " with " + internalEval.trim()));
                }
            }
        }

    }

    private class Average {

        String name;
        LinkedList<ClusterEvaluation> evals;
        Map<String, Double> sum;
        Map<String, Integer> cnt;

        Average(String name) {
            this.name = name;
            sum = new HashMap<>();
            cnt = new HashMap<>();
            for (ClusterEvaluation eval : externalEvals) {
                sum.put(eval.getName(), 0.0);
                cnt.put(eval.getName(), 0);
            }
        }

        public double getAverage(String eval) {
            return sum.get(eval) / cnt.get(eval);
        }

        public void addValues(Clustering c) {
            double score;
            for (ClusterEvaluation eval : externalEvals) {
                if (c.getEvaluationTable() != null) {
                    score = c.getEvaluationTable().getScore(eval);
                } else {
                    try {
                        score = eval.score(c);
                    } catch (ScoreException ex) {
                        score = Double.NaN;
                        LOG.info("failed to compute score {}: {}", eval.getName(), ex.getMessage());
                    }
                }
                sum.put(eval.getName(), sum.get(eval.getName()) + score);
                cnt.put(eval.getName(), cnt.get(eval.getName()) + 1);
            }
        }

    }
}


================================================
FILE: src/main/java/org/clueminer/clustering/benchmark/cutoff/CutoffExp.java
================================================
/*
 * Copyright (C) 2011-2016 clueminer.org
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation, either version 3 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program.  If not, see <http://www.gnu.org/licenses/>.
 */
package org.clueminer.clustering.benchmark.cutoff;

import com.beust.jcommander.JCommander;
import java.util.ArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import org.clueminer.clustering.benchmark.Bench;
import static org.clueminer.clustering.benchmark.Bench.ensureFolder;
import static org.clueminer.clustering.benchmark.Bench.printUsage;
import org.clueminer.dataset.api.Dataset;
import org.clueminer.dataset.api.Instance;

/**
 *
 * @author Tomas Bruna
 */
public class CutoffExp extends Bench {

    public static final String name = "cutoff";

    protected static CutoffParams parseArguments(String[] args) {
        CutoffParams params = new CutoffParams();
        JCommander cmd = new JCommander(params);
        printUsage(args, cmd, params);
        return params;
    }

    @Override
    public void main(String[] args) {
        CutoffParams params = parseArguments(args);

        loadBenchArtificial();
        System.out.println("datasets: " + params.datasets);

        benchmarkFolder = params.home + '/' + "benchmark" + '/' + name;
        ensureFolder(benchmarkFolder);
        System.out.println("writing results to: " + benchmarkFolder);

        System.out.println("=== starting " + name);
        Runnable exp = null;
        switch (params.mode) {
            case "comparison": {
                exp = new CutoffComparison(params, benchmarkFolder, createDatasetsArray(params.datasets));
                break;
            }
            case "firstJump": {
                exp = new FirstJumpOptimization(params, benchmarkFolder, createDatasetsArray(params.datasets));
                break;
            }
            default: {
                throw new IllegalArgumentException("Mode " + params.mode + " is not supported");
            }
        }

        ExecutorService execService = Executors.newFixedThreadPool(1);
        execService.submit(exp);
        execService.shutdown();
    }

    private ArrayList<Dataset<? extends Instance>> createDatasetsArray(String datasets) {
        String stringSets[] = datasets.split(",");
        ArrayList<Dataset<? extends Instance>> sets = new ArrayList<>(stringSets.length);
        for (String dataset : stringSets) {
            sets.add(provider.getDataset(dataset.trim()));
        }
        return sets;
    }

}


================================================
FILE: src/main/java/org/clueminer/clustering/benchmark/cutoff/CutoffParams.java
================================================
/*
 * Copyright (C) 2011-2016 clueminer.org
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation, either version 3 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program.  If not, see <http://www.gnu.org/licenses/>.
 */
package org.clueminer.clustering.benchmark.cutoff;

import com.beust.jcommander.Parameter;
import org.clueminer.clustering.benchmark.AbsParams;

/**
 *
 * @author Tomas Bruna
 */
public class CutoffParams extends AbsParams {

    @Parameter(names = "--datasets", description = "Datasets to test separated by ,")
    public String datasets = "triangle1, triangle2, flame, jain, long1, long2, long3, sizes1, sizes2, sizes3,"
            + " sizes4, sizes5, compound, atom, aggregation, lsun, pathbased, smile1, smile2, smile3, twodiamonds, "
            + "wingnut, target, st900, square1, square2, square3, square4, square5, spiral, spiral2, spherical_6_2, "
            + "spherical_5_2, longsquare, engytime, donutcurves, diamond9, complex8, complex9, chainlink, R15, D31, "
            + "2d-4c, 2d-20c-no0, 2d-10c";

    @Parameter(names = "--algorithm", description = "Clustering algorithm name")
    public String algorithm = "Chameleon";

    @Parameter(names = "--strategy", description = "Cutoff strategies to compare separated by ,")
    public String strategies = "hill-climb cutoff, hill-climb inc, First jump cutoff";

    @Parameter(names = "--internalEvals", description = "Internal evaluations to compare separated by ,")
    public String internalEvals = "Silhouette, SD index";

    @Parameter(names = "--externalEvals", description = "External evaluations to determine quality of results"
               + " separated by ,")
    public String externalEvals = "NMI-sqrt, NMI-sum, Deviation, Adjusted Rand";

    @Parameter(names = "--startRange", description = "Range of the start parameter in the First jump cutoff")
    public String startRange = "30-400";

    @Parameter(names = "--factorRange", description = "Range of the factor parameter in the First jump cutoff")
    public String factorRange = "1.01-6";

    @Parameter(names = "--mode", description = "Whether to compare different cutoff methods (option comparison)"
               + " or to try different parameters for the First jump cutoff (option firstJump)")
    public String mode = "comparison";

}


================================================
FILE: src/main/java/org/clueminer/clustering/benchmark/cutoff/FirstJumpOptimization.java
================================================
/*
 * Copyright (C) 2011-2016 clueminer.org
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation, either version 3 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program.  If not, see <http://www.gnu.org/licenses/>.
 */
package org.clueminer.clustering.benchmark.cutoff;

import java.io.File;
import java.io.IOException;
import java.io.PrintWriter;
import java.util.ArrayList;
import java.util.LinkedList;
import org.clueminer.clustering.api.AgglomerativeClustering;
import org.clueminer.clustering.api.ClusterEvaluation;
import org.clueminer.clustering.api.Clustering;
import org.clueminer.clustering.api.ClusteringFactory;
import org.clueminer.clustering.api.CutoffStrategy;
import org.clueminer.clustering.api.HierarchicalResult;
import org.clueminer.clustering.api.ScoreException;
import org.clueminer.clustering.api.factory.CutoffStrategyFactory;
import org.clueminer.clustering.api.factory.EvaluationFactory;
import static org.clueminer.clustering.benchmark.Bench.ensureFolder;
import static org.clueminer.clustering.benchmark.Bench.safeName;
import org.clueminer.io.csv.CSVWriter;
import org.clueminer.dataset.api.Dataset;
import org.clueminer.dataset.api.Instance;
import org.clueminer.eval.hclust.FirstJump;
import org.clueminer.utils.Props;
import org.openide.util.Exceptions;

/**
 *
 * @author Tomas Bruna
 */
public class FirstJumpOptimization implements Runnable {

    private final CutoffParams params;
    private final String benchmarkFolder;
    private final ArrayList<Dataset<? extends Instance>> datasets;
    private LinkedList<HierarchicalResult> dendrograms;
    FirstJump cutoff;
    ClusterEvaluation eval;

    public FirstJumpOptimization(CutoffParams params, String benchmarkFolder, ArrayList<Dataset<? extends Instance>> datasets) {
        this.params = params;
        this.benchmarkFolder = benchmarkFolder;
        this.datasets = datasets;
        EvaluationFactory ef = EvaluationFactory.getInstance();
        eval = ef.getProvider("NMI-sqrt");
        cutoff = (FirstJump) getCutoffStrategy("First jump cutoff");
    }

    @Override
    public void run() {
        try {
            String folder;
            AgglomerativeClustering alg = (AgglomerativeClustering) ClusteringFactory.getInstance().getProvider(params.algorithm);
            folder = benchmarkFolder + File.separatorChar + "FirstJumpParams";
            ensureFolder(folder);
            String csvRes = folder + File.separatorChar + safeName(alg.getName()) + "_" + "FirstJumpParams" + ".csv";

            PrintWriter writer = new PrintWriter(csvRes, "UTF-8");
            CSVWriter csv = new CSVWriter(writer, ',');
            csv.writeLine("Clustering_with_" + alg.getName());

            computeDendrograms(alg);

            String startRange[] = params.startRange.split("-");
            String factorRange[] = params.factorRange.split("-");

            // parse the range bounds once instead of on every loop iteration
            int startFrom = Integer.parseInt(startRange[0]);
            int startTo = Integer.parseInt(startRange[1]);
            double factorFrom = Double.parseDouble(factorRange[0]);
            double factorTo = Double.parseDouble(factorRange[1]);

            for (int i = startFrom; i <= startTo; i += 10) {
                double j = factorFrom;
                while (j <= factorTo) {
                    csv.writeLine("AVERAGE_WITH_" + i + "_AND_" + j + ": " + testParameters(i, j));
                    j += 0.1;
                }
            }
            csv.close();
        } catch (NumberFormatException | IOException e) {
            Exceptions.printStackTrace(e);
        }
    }

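    /**
     * Try cutoff with specified parameters on all results
     *
     * @param i start value passed to {@code FirstJump.setStart}
     * @param j factor passed to {@code FirstJump.setFactor}
     * @return average score over all dendrograms; NaN when a score could not
     * be computed or when there are no dendrograms
     */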
    private double testParameters(int i, double j) {
        System.out.println("Testing " + i + " and " + j);
        double score;
        Clustering c;
        cutoff.setStart(i);
        cutoff.setFactor(j);
        double sum = 0;
        int cnt = 0;
        //compute cutoff on all results
        for (HierarchicalResult rowsResult : dendrograms) {
            rowsResult.findCutoff(cutoff);
            c = rowsResult.getClustering();
            if (c.getEvaluationTable() != null) {
                score = c.getEvaluationTable().getScore(eval);
            } else {
                try {
                    score = eval.score(c);
                } catch (ScoreException ex) {
                    score = Double.NaN;
                    Exceptions.printStackTrace(ex);
                }
            }
            sum += score;
            cnt++;
        }
        return sum / cnt;
    }

    /**
     * Cluster all datasets and save results
     *
     * @param alg
     */
    private void computeDendrograms(AgglomerativeClustering alg) {
        dendrograms = new LinkedList<>();
        for (Dataset<? extends Instance> dataset : datasets) {
            HierarchicalResult rowsResult = alg.hierarchy(dataset, new Props());
            dendrograms.add(rowsResult);
        }
    }

    private CutoffStrategy getCutoffStrategy(String strategy) {
        CutoffStrategy cutoffStrategy = CutoffStrategyFactory.getInstance().getProvider(strategy);
        return cutoffStrategy;
    }

}


================================================
FILE: src/main/java/org/clueminer/clustering/benchmark/evolve/EvolveExp.java
================================================
/*
 * Copyright (C) 2011-2016 clueminer.org
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation, either version 3 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program.  If not, see <http://www.gnu.org/licenses/>.
 */
package org.clueminer.clustering.benchmark.evolve;

import com.google.common.base.Supplier;
import com.google.common.collect.Maps;
import com.google.common.collect.Table;
import com.google.common.collect.Tables;
import java.io.File;
import java.util.HashMap;
import java.util.Map;
import java.util.Map.Entry;
import org.clueminer.clustering.api.AlgParams;
import org.clueminer.clustering.api.ClusterEvaluation;
import org.clueminer.clustering.api.factory.ExternalEvaluatorFactory;
import static org.clueminer.clustering.benchmark.Bench.ensureFolder;
import static org.clueminer.clustering.benchmark.Bench.safeName;
import org.clueminer.dataset.api.Dataset;
import org.clueminer.dataset.api.Instance;
import org.clueminer.dataset.benchmark.GnuplotWriter;
import org.clueminer.dataset.benchmark.ResultsCollector;
import org.clueminer.evolution.multim.MultiMuteEvolution;
import org.clueminer.evolution.utils.ConsoleDump;
import org.openide.util.Exceptions;

/**
 * Evolves hierarchical clusterings, running one evolution per (unsupervised)
 * optimization criterion; each run optimizes a single criterion.
 *
 * @author Tomas Barton
 */
public class EvolveExp implements Runnable {

    private static ResultsCollector rc;
    private EvolveParams params;
    private String benchmarkFolder;
    private ClusterEvaluation[] scores;
    private HashMap<String, Entry<Dataset<? extends Instance>, Integer>> datasets;
    //table for keeping results from experiments
    private final Table<String, String, Double> table;

    public EvolveExp(EvolveParams params, String benchmarkFolder, ClusterEvaluation[] scores, HashMap<String, Entry<Dataset<? extends Instance>, Integer>> availableDatasets) {
        this.params = params;
        this.benchmarkFolder = benchmarkFolder;
        this.scores = scores;
        this.datasets = availableDatasets;

        table = Tables.newCustomTable(
                Maps.<String, Map<String, Double>>newHashMap(),
                new Supplier<Map<String, Double>>() {
            @Override
            public Map<String, Double> get() {
                return Maps.newHashMap();
            }
        });
        rc = new ResultsCollector(table);
    }

    @Override
    public void run() {
        try {
            MultiMuteEvolution evolution;
            String name;

            ClusterEvaluation ext = fetchExternal(params.external);
            //evolution.setAlgorithm(new HACLW());
            System.out.println("datasets size: " + datasets.size());
            for (Map.Entry<String, Map.Entry<Dataset<? extends Instance>, Integer>> e : datasets.entrySet()) {
                Dataset<? extends Instance> d = e.getValue().getKey();
                name = safeName(d.getName());
                String csvRes = benchmarkFolder + File.separatorChar + name + File.separatorChar + name + ".csv";
                System.out.println("=== dataset " + name);
                System.out.println("size: " + d.size());
                ensureFolder(benchmarkFolder + File.separatorChar + name);
                for (ClusterEvaluation eval : scores) {
                    evolution = new MultiMuteEvolution();
                    evolution.setDataset(d);
                    evolution.setEvaluator(eval);
                    evolution.setExternal(ext);
                    evolution.setGenerations(params.generations);
                    evolution.setPopulationSize(params.population);
                    GnuplotWriter gw = new GnuplotWriter(evolution, benchmarkFolder, name + File.separatorChar + safeName(eval.getName()));
                    gw.setPlotDumpMod(50);
                    gw.setCustomTitle("cutoff=" + evolution.getDefaultParam(AlgParams.CUTOFF_STRATEGY) + "(" + evolution.getDefaultParam(AlgParams.CUTOFF_SCORE) + ")");
                    //collect data from evolution
                    evolution.addEvolutionListener(new ConsoleDump());
                    evolution.addEvolutionListener(gw);
                    evolution.addEvolutionListener(rc);
                    evolution.run();
                    System.out.println("## updating results in: " + csvRes);
                    rc.writeToCsv(csvRes);
                }
            }
        } catch (Exception e) {
            Exceptions.printStackTrace(e);
        }
    }

    private ClusterEvaluation fetchExternal(String external) {
        return ExternalEvaluatorFactory.getInstance().getProvider(external);
    }

}


================================================
FILE: src/main/java/org/clueminer/clustering/benchmark/evolve/EvolveParams.java
================================================
/*
 * Copyright (C) 2011-2016 clueminer.org
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation, either version 3 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program.  If not, see <http://www.gnu.org/licenses/>.
 */
package org.clueminer.clustering.benchmark.evolve;

import com.beust.jcommander.Parameter;
import org.clueminer.clustering.benchmark.AbsParams;

/**
 *
 * @author Tomas Barton
 */
public class EvolveParams extends AbsParams {

    @Parameter(names = "--external", description = "reference criterion for comparing with internal criterion (Precision, Accuracy, NMI)")
    public String external = "AUC";

    @Parameter(names = "--test", description = "test only on one dataset")
    public boolean test = false;

    @Parameter(names = "--generations", description = "number of generations in evolution")
    public int generations = 10;

    @Parameter(names = "--population", description = "size of population in each generation")
    public int population = 10;

}


================================================
FILE: src/main/java/org/clueminer/clustering/benchmark/exp/Data.java
================================================
/*
 * Copyright (C) 2011-2016 clueminer.org
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation, either version 3 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program.  If not, see <http://www.gnu.org/licenses/>.
 */
package org.clueminer.clustering.benchmark.exp;

import com.google.common.base.Supplier;
import com.google.common.collect.Maps;
import com.google.common.collect.Table;
import com.google.common.collect.Tables;
import java.io.File;
import java.util.HashMap;
import java.util.Map;
import org.clueminer.clustering.algorithm.KMeans;
import org.clueminer.clustering.api.Cluster;
import org.clueminer.clustering.api.ExternalEvaluator;
import org.clueminer.clustering.api.InternalEvaluator;
import org.clueminer.clustering.api.factory.InternalEvaluatorFactory;
import org.clueminer.clustering.benchmark.Bench;
import org.clueminer.dataset.api.Dataset;
import org.clueminer.dataset.api.Instance;
import org.clueminer.dataset.benchmark.GnuplotWriter;
import org.clueminer.dataset.benchmark.ResultsCollector;
import org.clueminer.eval.external.JaccardIndex;
import org.clueminer.evolution.attr.AttrEvolution;
import org.clueminer.evolution.utils.ConsoleDump;
import org.clueminer.utils.FileUtils;
import org.openide.util.NbBundle;

/**
 *
 * @author tombart
 */
public class Data extends Bench {

    private AttrEvolution test;
    //table for keeping results from experiments
    private Table<String, String, Double> table;
    private static ResultsCollector rc;
    private static String csvOutput;
    private static Data instance;

    /**
     * @param args the command line arguments
     */
    @Override
    public void main(String[] args) {
        int i = 0, j;
        String arg;
        char flag;
        boolean vflag = false;
        String datasetName = "";

        while (i < args.length && args[i].startsWith("-")) {
            arg = args[i++];

            switch (arg) {
                // "wordy" argument without a value
                case "-verbose":
                    System.out.println("verbose mode on");
                    vflag = true;
                    break;
                // argument that requires a value
                case "-dataset":
                    if (i < args.length) {
                        datasetName = args[i++];
                    } else {
                        System.err.println("-dataset requires a name");
                    }
                    if (vflag) {
                        System.out.println("dataset = " + datasetName);
                    }
                    break;
                // series of single-character flags, e.g. -xn
                default:
                    for (j = 1; j < arg.length(); j++) {
                        flag = arg.charAt(j);
                        switch (flag) {
                            case 'x':
                                if (vflag) {
                                    System.out.println("Option x");
                                }
                                break;
                            case 'n':
                                if (vflag) {
                                    System.out.println("Option n");
                                }
                                break;
                            default:
                                System.err.println("Run: illegal option " + flag);
                                break;
                        }
                    }
                    break;
            }
        }
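        // show usage when a stray positional argument is left over or no dataset was given
        if (i < args.length || datasetName.isEmpty()) {
            System.err.println("Usage: Benchmark [-verbose] [-xn] [-dataset name]");
        }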

        init();
        execute(datasetName);
    }

    private void init() {
        table = Tables.newCustomTable(
                Maps.<String, Map<String, Double>>newHashMap(),
                new Supplier<Map<String, Double>>() {
            @Override
            public Map<String, Double> get() {
                return Maps.newHashMap();
            }
        });

        String home = System.getProperty("user.home") + File.separatorChar
                + NbBundle.getMessage(
                        FileUtils.class,
                        "FOLDER_Home");
        ensureFolder(home);
        benchmarkFolder = home + File.separatorChar + "benchmark";
        ensureFolder(benchmarkFolder);
        rc = new ResultsCollector(table);
        csvOutput = benchmarkFolder + File.separatorChar + "results.csv";

        //preload dataset names
        loadDatasets();
    }

    public void execute(String datasetName) {
        Map<Dataset<? extends Instance>, Integer> datasets = new HashMap<>();
        if (availableDatasets.containsKey(datasetName)) {
            Map.Entry<Dataset<? extends Instance>, Integer> entry = availableDatasets.get(datasetName);
            datasets.put(entry.getKey(), entry.getValue());
        } else {
            System.out.println("dataset " + datasetName + " not found");
            System.out.println("known datasets: ");
            for (String d : availableDatasets.keySet()) {
                System.out.print(d + " ");
            }
            System.out.println("---");
        }
        // DatasetFixture.allDatasets();

        InternalEvaluatorFactory<Instance, Cluster<Instance>> factory = InternalEvaluatorFactory.getInstance();
        ExternalEvaluator ext = new JaccardIndex();

        String name;
        System.out.println("working folder: " + benchmarkFolder);
        for (Map.Entry<Dataset<? extends Instance>, Integer> entry : datasets.entrySet()) {
            Dataset<? extends Instance> dataset = entry.getKey();
            name = dataset.getName();
            String csvRes = benchmarkFolder + File.separatorChar + name + File.separatorChar + name + ".csv";
            System.out.println("=== dataset " + name);
            System.out.println("size: " + dataset.size());
            System.out.println(dataset.toString());
            String dataDir = benchmarkFolder + File.separatorChar + name;
            (new File(dataDir)).mkdir();
            for (InternalEvaluator eval : factory.getAll()) {
                System.out.println("evaluator: " + eval.getName());
                test = new AttrEvolution(dataset, 20);
                test.setAlgorithm(new KMeans());
                test.setK(entry.getValue());
                test.setEvaluator(eval);
                test.setExternal(ext);
                GnuplotWriter gw = new GnuplotWriter(test, benchmarkFolder, name + "/" + name + "-" + safeName(eval.getName()));
                gw.setPlotDumpMod(50);
                //collect data from evolution
                test.addEvolutionListener(new ConsoleDump());
                test.addEvolutionListener(gw);
                test.addEvolutionListener(rc);
                test.run();
                rc.writeToCsv(csvRes);
            }
        }
    }
}


================================================
FILE: src/main/java/org/clueminer/clustering/benchmark/exp/EvolveScores.java
================================================
/*
 * Copyright (C) 2011-2016 clueminer.org
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation, either version 3 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program.  If not, see <http://www.gnu.org/licenses/>.
 */
package org.clueminer.clustering.benchmark.exp;

import com.beust.jcommander.JCommander;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import org.clueminer.clustering.api.ClusterEvaluation;
import org.clueminer.clustering.api.InternalEvaluator;
import org.clueminer.clustering.api.factory.InternalEvaluatorFactory;
import org.clueminer.clustering.benchmark.Bench;
import org.clueminer.clustering.benchmark.evolve.EvolveExp;
import org.clueminer.clustering.benchmark.evolve.EvolveParams;
import org.clueminer.dataset.api.Dataset;
import org.clueminer.dataset.api.Instance;
import org.clueminer.log.ClmLog;

/**
 *
 * @author Tomas Barton
 */
public class EvolveScores extends Bench {

    public static final String name = "evolve-sc";

    protected static EvolveParams parseArguments(String[] args) {
        EvolveParams params = new EvolveParams();
        JCommander cmd = new JCommander(params);
        printUsage(args, cmd, params);
        return params;
    }

    @Override
    public void main(String[] args) {
        EvolveParams params = parseArguments(args);
        if (params.test) {
            load("iris");
        } else {
            loadDatasets();
        }
        ClmLog.setup(params.log);
        System.out.println("loaded dataset");
        int i = 0;
        for (Map.Entry<String, Map.Entry<Dataset<? extends Instance>, Integer>> e : availableDatasets.entrySet()) {
            System.out.println((i++) + ":" + e.getKey());
        }

        benchmarkFolder = params.home + '/' + "benchmark" + '/' + name;
        ensureFolder(benchmarkFolder);
        System.out.println("writing results to: " + benchmarkFolder);

        System.out.println("=== starting " + name);
        List<InternalEvaluator> eval = InternalEvaluatorFactory.getInstance().getAll();
        ClusterEvaluation[] scores = eval.toArray(new ClusterEvaluation[eval.size()]);
        System.out.println("scores size: " + scores.length);
        EvolveExp exp = new EvolveExp(params, benchmarkFolder, scores, availableDatasets);
        ExecutorService execService = Executors.newFixedThreadPool(1);
        execService.submit(exp);
        execService.shutdown();
    }

}


================================================
FILE: src/main/java/org/clueminer/clustering/benchmark/exp/HclusPar.java
================================================
/*
 * Copyright (C) 2011-2016 clueminer.org
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation, either version 3 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program.  If not, see <http://www.gnu.org/licenses/>.
 */
package org.clueminer.clustering.benchmark.exp;

import java.io.File;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import org.clueminer.clustering.aggl.HCLW;
import org.clueminer.clustering.aggl.HCLWMS;
import org.clueminer.clustering.aggl.HacLwMsPar;
import org.clueminer.clustering.api.AgglomerativeClustering;
import static org.clueminer.clustering.benchmark.Bench.ensureFolder;
import org.clueminer.clustering.benchmark.BenchParams;
import org.clueminer.clustering.benchmark.Experiment;
import org.clueminer.log.ClmLog;

/**
 *
 * @author deric
 */
public class HclusPar extends Hclust {

    /**
     * @param args the command line arguments
     */
    @Override
    public void main(String[] args) {
        BenchParams params = parseArguments(args);
        ClmLog.setup(params.log);

        benchmarkFolder = params.home + File.separatorChar + "benchmark" + File.separatorChar + "hclust-par";
        ensureFolder(benchmarkFolder);

        System.out.println("# n = " + params.n);
        System.out.println("=== starting experiment:");
        AgglomerativeClustering[] algorithms = new AgglomerativeClustering[]{
            new HCLW(), new HCLWMS(), new HacLwMsPar(4), new HacLwMsPar(8), new HacLwMsPar(16), new HacLwMsPar(32)
        };
        Experiment exp = new Experiment(params, benchmarkFolder, algorithms);
        ExecutorService execService = Executors.newFixedThreadPool(1);
        execService.submit(exp);
        execService.shutdown();
    }

}


================================================
FILE: src/main/java/org/clueminer/clustering/benchmark/exp/HclusPar2.java
================================================
/*
 * Copyright (C) 2011-2016 clueminer.org
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation, either version 3 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program.  If not, see <http://www.gnu.org/licenses/>.
 */
package org.clueminer.clustering.benchmark.exp;

import java.io.File;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import org.clueminer.clustering.aggl.HCLW;
import org.clueminer.clustering.aggl.HCLWMS;
import org.clueminer.clustering.aggl.HacLwMsPar;
import org.clueminer.clustering.aggl.HacLwMsPar2;
import org.clueminer.clustering.api.AgglomerativeClustering;
import org.clueminer.clustering.benchmark.BenchParams;
import org.clueminer.clustering.benchmark.Experiment;
import static org.clueminer.clustering.benchmark.exp.Hclust.parseArguments;
import org.clueminer.log.ClmLog;

/**
 *
 * @author Tomas Barton
 */
public class HclusPar2 extends Hclust {

    /**
     * @param args the command line arguments
     */
    @Override
    public void main(String[] args) {
        BenchParams params = parseArguments(args);
        ClmLog.setup(params.log);

        benchmarkFolder = params.home + File.separatorChar + "benchmark" + File.separatorChar + "hclust-par2";
        ensureFolder(benchmarkFolder);

        System.out.println("# n = " + params.n);
        System.out.println("=== starting experiment:");
        AgglomerativeClustering[] algorithms = new AgglomerativeClustering[]{
            new HCLW(), new HCLWMS(), new HacLwMsPar(2), new HacLwMsPar(4), new HacLwMsPar2(2), new HacLwMsPar2(4)
        };
        Experiment exp = new Experiment(params, benchmarkFolder, algorithms);
        ExecutorService execService = Executors.newFixedThreadPool(1);
        execService.submit(exp);
        execService.shutdown();
    }

}


================================================
FILE: src/main/java/org/clueminer/clustering/benchmark/exp/Hclust.java
================================================
/*
 * Copyright (C) 2011-2016 clueminer.org
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation, either version 3 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program.  If not, see <http://www.gnu.org/licenses/>.
 */
package org.clueminer.clustering.benchmark.exp;

import com.beust.jcommander.JCommander;
import com.beust.jcommander.ParameterException;
import java.io.File;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import org.clueminer.clustering.aggl.HC;
import org.clueminer.clustering.aggl.HCLW;
import org.clueminer.clustering.aggl.HCLWMS;
import org.clueminer.clustering.api.AgglomerativeClustering;
import org.clueminer.clustering.benchmark.Bench;
import org.clueminer.clustering.benchmark.BenchParams;
import org.clueminer.clustering.benchmark.Experiment;
import org.clueminer.log.ClmLog;

/**
 *
 * @author deric
 */
public class Hclust extends Bench {

    protected static Hclust instance;

    /**
     * @param args the command line arguments
     */
    @Override
    public void main(String[] args) {
        BenchParams params = parseArguments(args);
        ClmLog.setup(params.log);

        benchmarkFolder = params.home + File.separatorChar + "benchmark" + File.separatorChar + "hclust";
        ensureFolder(benchmarkFolder);

        System.out.println("# n = " + params.n);
        System.out.println("=== starting experiment:");
        AgglomerativeClustering[] algorithms = new AgglomerativeClustering[]{new HC(), new HCLW(), new HCLWMS()};
        Experiment exp = new Experiment(params, benchmarkFolder, algorithms);
        ExecutorService execService = Executors.newFixedThreadPool(1);
        execService.submit(exp);
        execService.shutdown();
    }

    protected static BenchParams parseArguments(String[] args) {
        BenchParams params = new BenchParams();
        JCommander cmd = new JCommander(params);
        printUsage(args, cmd, params);
        return params;
    }

    public static void printUsage(String[] args, JCommander cmd, BenchParams params) {
        /* if (args.length == 0) { StringBuilder sb = new StringBuilder();
         * cmd.usage(sb);
         * sb.append("\n").append("attributes marked with * are mandatory");
         * System.out.println(sb);
         * System.err.println("missing mandatory arguments");
         * System.exit(0);
         * } */
        try {
            cmd.parse(args);
            // TODO validate values of parameters
            if (params.n <= 0 || params.dimension <= 0) {
                throw new ParameterException("invalid data dimensions " + params.n + " x " + params.dimension);
            }

            if (params.steps <= 0) {
                throw new ParameterException("invalid steps size " + params.steps);
            }

            if (params.nSmall == params.n) {
                throw new ParameterException("n can't be the same as n-small: " + params.nSmall);
            }

        } catch (ParameterException ex) {
            System.err.println(ex.getMessage());
            cmd.usage();
            // non-zero exit status signals invalid arguments to the caller
            System.exit(1);
        }
    }

}


================================================
FILE: src/main/java/org/clueminer/clustering/benchmark/gen/NsgaGen.java
================================================
/*
 * Copyright (C) 2011-2016 clueminer.org
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation, either version 3 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program.  If not, see <http://www.gnu.org/licenses/>.
 */
package org.clueminer.clustering.benchmark.gen;

import com.beust.jcommander.JCommander;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import org.clueminer.clustering.api.Cluster;
import org.clueminer.clustering.api.ClusterEvaluation;
import org.clueminer.clustering.api.factory.InternalEvaluatorFactory;
import org.clueminer.clustering.benchmark.Bench;
import static org.clueminer.clustering.benchmark.Bench.ensureFolder;
import static org.clueminer.clustering.benchmark.Bench.printUsage;
import org.clueminer.dataset.api.Dataset;
import org.clueminer.dataset.api.Instance;
import org.clueminer.log.ClmLog;

/**
 *
 * @author deric
 */
public class NsgaGen extends Bench {

    public static final String name = "nsga-gen";

    protected static NsgaGenParams parseArguments(String[] args) {
        NsgaGenParams params = new NsgaGenParams();
        JCommander cmd = new JCommander(params);
        printUsage(args, cmd, params);
        return params;
    }

    @Override
    public void main(String[] args) {
        NsgaGenParams params = parseArguments(args);
        if (params.dataset != null) {
            load(params.dataset);
        } else {
            loadDatasets();
        }
        ClmLog.setup(params.log);

        int i = 0;
        for (Map.Entry<String, Map.Entry<Dataset<? extends Instance>, Integer>> e : availableDatasets.entrySet()) {
            System.out.println((i++) + ":" + e.getKey());
        }

        benchmarkFolder = params.home + '/' + "benchmark" + '/' + name;
        ensureFolder(benchmarkFolder);
        System.out.println("writing results to: " + benchmarkFolder);

        System.out.println("=== starting " + name);
        InternalEvaluatorFactory<Instance, Cluster<Instance>> factory = InternalEvaluatorFactory.getInstance();
        ClusterEvaluation c1 = factory.getProvider(params.c1);
        ClusterEvaluation c2 = factory.getProvider(params.c2);
        NsgaGenExp exp = new NsgaGenExp(params, benchmarkFolder, c1, c2, availableDatasets);
        ExecutorService execService = Executors.newFixedThreadPool(1);
        execService.submit(exp);
        execService.shutdown();
    }
}


================================================
FILE: src/main/java/org/clueminer/clustering/benchmark/gen/NsgaGenExp.java
================================================
/*
 * Copyright (C) 2011-2016 clueminer.org
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation, either version 3 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program.  If not, see <http://www.gnu.org/licenses/>.
 */
package org.clueminer.clustering.benchmark.gen;

import com.google.common.base.Supplier;
import com.google.common.collect.Maps;
import com.google.common.collect.Table;
import com.google.common.collect.Tables;
import java.io.File;
import java.util.HashMap;
import java.util.Map;
import org.clueminer.clustering.api.ClusterEvaluation;
import org.clueminer.clustering.api.factory.EvaluationFactory;
import static org.clueminer.clustering.benchmark.Bench.safeName;
import org.clueminer.dataset.api.Dataset;
import org.clueminer.dataset.api.Instance;
import org.clueminer.dataset.benchmark.GnuplotMO;
import org.clueminer.dataset.benchmark.ResultsCollector;
import org.clueminer.evolution.mo.MoEvolution;
import org.clueminer.evolution.utils.ConsoleDump;
import org.openide.util.Exceptions;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

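/**
 * Runs multi-objective evolution with two internal criteria on every loaded
 * dataset for an increasing number of generations and writes the results of
 * each run to a per-dataset CSV file.
 *
 * @author deric
 */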
public class NsgaGenExp implements Runnable {

    private static ResultsCollector rc;
    private NsgaGenParams params;
    private String benchmarkFolder;
    private HashMap<String, Map.Entry<Dataset<? extends Instance>, Integer>> datasets;
    //table for keeping results from experiments
    private Table<String, String, Double> table;
    private ClusterEvaluation c1;
    private ClusterEvaluation c2;
    private static final Logger LOG = LoggerFactory.getLogger(NsgaGenExp.class);

    public NsgaGenExp(NsgaGenParams params, String benchmarkFolder, ClusterEvaluation c1, ClusterEvaluation c2, HashMap<String, Map.Entry<Dataset<? extends Instance>, Integer>> availableDatasets) {
        this.params = params;
        this.benchmarkFolder = benchmarkFolder;
        this.c1 = c1;
        this.c2 = c2;
        this.datasets = availableDatasets;

        createTable();
        rc = new ResultsCollector(table);
    }

    private void createTable() {
        table = Tables.newCustomTable(
                Maps.<String, Map<String, Double>>newHashMap(),
                new Supplier<Map<String, Double>>() {
            @Override
            public Map<String, Double> get() {
                return Maps.newHashMap();
            }
        });
    }

    @Override
    public void run() {
        try {
            MoEvolution evolution = new MoEvolution();

            evolution.setPopulationSize(params.population);
            evolution.setNumSolutions(params.solutions);
            evolution.setExternal(EvaluationFactory.getInstance().getProvider("Jaccard"));
            evolution.setMutationProbability(params.mutation);
            evolution.setCrossoverProbability(params.crossover);

            GnuplotMO gw = new GnuplotMO();
            //gw.setCustomTitle("cutoff=" + evolution.getDefaultParam(AgglParams.CUTOFF_STRATEGY) + "(" + evolution.getDefaultParam(AgglParams.CUTOFF_SCORE) + ")");
            //collect data from evolution
            evolution.addEvolutionListener(new ConsoleDump());
            evolution.addMOEvolutionListener(gw);
            evolution.addMOEvolutionListener(rc);
            evolution.addObjective(c1);
            evolution.addObjective(c2);

            int[] generations = new int[]{1, 10, 50, 100, 1000};

            String name;
            String folder;
            LOG.info("datasets size: {}", datasets.size());
            for (Map.Entry<String, Map.Entry<Dataset<? extends Instance>, Integer>> e : datasets.entrySet()) {
                Dataset<? extends Instance> d = e.getValue().getKey();
                name = safeName(d.getName());
                folder = benchmarkFolder + File.separatorChar + name;
                gw.mkdir(folder);
                String csvRes = folder + File.separatorChar + "_" + name + ".csv";
                LOG.info("dataset: {} size: {} num attr: {}", name, d.size(), d.attributeCount());
                //ensureFolder(benchmarkFolder + File.separatorChar + name);

                evolution.setDataset(d);

                for (int i = 0; i < generations.length; i++) {
                    int g = generations[i];
                    evolution.setGenerations(g);
                    gw.setCurrentDir(benchmarkFolder, name + "-" + g);
                    //for (int k = 0; k < params.repeat; k++) {
                    //   logger.log(Level.INFO, "run {0}: {1} & {2}", new Object[]{k, c1.getName(), c2.getName()});
                    evolution.run();
                    rc.writeToCsv(csvRes);
                    //}
                    evolution.fireFinishedBatch();
                }
                //NOTE: this replaces only the table field; rc was constructed
                //with the previous table instance
                createTable();
            }
        } catch (Exception e) {
            Exceptions.printStackTrace(e);
        }
    }

}


================================================
FILE: src/main/java/org/clueminer/clustering/benchmark/gen/NsgaGenParams.java
================================================
/*
 * Copyright (C) 2011-2016 clueminer.org
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation, either version 3 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program.  If not, see <http://www.gnu.org/licenses/>.
 */
package org.clueminer.clustering.benchmark.gen;

import com.beust.jcommander.Parameter;
import org.clueminer.clustering.benchmark.AbsParams;

/**
 * Command-line parameters of the NSGA generations experiment
 * ({@link NsgaGenExp}).
 *
 * @author deric
 */
public class NsgaGenParams extends AbsParams {

    @Parameter(names = "--population", description = "size of population in each generation")
    public int population = 20;

    @Parameter(names = "--solutions", description = "number of final solutions which will be returned as result")
    public int solutions = 10;

    @Parameter(names = "--mutation", description = "probability of mutation")
    public double mutation = 0.5;

    @Parameter(names = "--crossover", description = "probability of crossover")
    public double crossover = 0.5;

    @Parameter(names = "--dataset", description = "use specific dataset")
    public String dataset = null;

    @Parameter(names = "--c1", description = "criterion 1")
    public String c1 = "Davies-Bouldin";

    @Parameter(names = "--c2", description = "criterion 2")
    public String c2 = "AIC";

}


================================================
FILE: src/main/java/org/clueminer/clustering/benchmark/nsga/NsgaExp.java
================================================
/*
 * Copyright (C) 2011-2016 clueminer.org
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation, either version 3 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program.  If not, see <http://www.gnu.org/licenses/>.
 */
package org.clueminer.clustering.benchmark.nsga;

import com.google.common.base.Supplier;
import com.google.common.collect.Maps;
import com.google.common.collect.Table;
import com.google.common.collect.Tables;
import java.io.File;
import java.util.HashMap;
import java.util.Map;
import org.clueminer.clustering.api.ClusterEvaluation;
import org.clueminer.clustering.api.factory.EvaluationFactory;
import static org.clueminer.clustering.benchmark.Bench.safeName;
import org.clueminer.dataset.api.Dataset;
import org.clueminer.dataset.api.Instance;
import org.clueminer.dataset.benchmark.GnuplotMO;
import org.clueminer.dataset.benchmark.ResultsCollector;
import org.clueminer.evolution.mo.MoEvolution;
import org.clueminer.evolution.utils.ConsoleDump;
import org.openide.util.Exceptions;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

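/**
 * Runs multi-objective evolution on every loaded dataset for each unordered
 * pair of internal evaluation criteria, repeating every pair
 * {@code params.repeat} times.
 *
 * @author Tomas Barton
 */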
public class NsgaExp implements Runnable {

    private static ResultsCollector rc;
    private NsgaParams params;
    private String benchmarkFolder;
    private ClusterEvaluation[] scores;
    private HashMap<String, Map.Entry<Dataset<? extends Instance>, Integer>> datasets;
    //table for keeping results from experiments
    private Table<String, String, Double> table;
    private static final Logger LOG = LoggerFactory.getLogger(NsgaExp.class);

    public NsgaExp(NsgaParams params, String benchmarkFolder, ClusterEvaluation[] scores, HashMap<String, Map.Entry<Dataset<? extends Instance>, Integer>> availableDatasets) {
        this.params = params;
        this.benchmarkFolder = benchmarkFolder;
        this.scores = scores;
        this.datasets = availableDatasets;

        createTable();
        rc = new ResultsCollector(table);
    }

    @Override
    public void run() {
        try {
            MoEvolution evolution = new MoEvolution();
            evolution.setGenerations(params.generations);
            evolution.setPopulationSize(params.population);
            evolution.setNumSolutions(params.solutions);
            evolution.setExternal(EvaluationFactory.getInstance().getProvider(params.supervised));
            evolution.setMutationProbability(params.mutation);
            evolution.setCrossoverProbability(params.crossover);
            evolution.setkLimit(params.limitK);
            ClusterEvaluation c1, c2;

            GnuplotMO gw = new GnuplotMO();
            //gw.setCustomTitle("cutoff=" + evolution.getDefaultParam(AgglParams.CUTOFF_STRATEGY) + "(" + evolution.getDefaultParam(AgglParams.CUTOFF_SCORE) + ")");
            //collect data from evolution
            evolution.addEvolutionListener(new ConsoleDump());
            evolution.addMOEvolutionListener(gw);
            evolution.addMOEvolutionListener(rc);

            String name;
            LOG.info("datasets size: {}", datasets.size());
            for (Map.Entry<String, Map.Entry<Dataset<? extends Instance>, Integer>> e : datasets.entrySet()) {
                Dataset<? extends Instance> d = e.getValue().getKey();
                name = safeName(d.getName());
                String csvRes = benchmarkFolder + File.separatorChar + name + File.separatorChar + "_" + name + ".csv";
                LOG.info("dataset: {} size: {} num attr: {}", name, d.size(), d.attributeCount());
                //ensureFolder(benchmarkFolder + File.separatorChar + name);

                gw.setCurrentDir(benchmarkFolder, name);

                evolution.setDataset(d);

                for (int i = 0; i < scores.length; i++) {
                    c1 = scores[i];
                    //lower triangular matrix without diagonal
                    //(doesn't matter which criterion is first, we want to try
                    //all combinations)
                    for (int j = 0; j < i; j++) {
                        c2 = scores[j];
                        evolution.clearObjectives();
                        evolution.addObjective(c1);
                        evolution.addObjective(c2);
                        //run!
                        for (int k = 0; k < params.repeat; k++) {
                            LOG.info("run {}: {} & {}", k, c1.getName(), c2.getName());
                            evolution.run();
                            rc.writeToCsv(csvRes);
                        }
                        evolution.fireFinishedBatch();
                        LOG.info("finished {} & {}", c1.getName(), c2.getName());
                    }
                }
                //NOTE: this replaces only the table field; rc was constructed
                //with the previous table instance
                createTable();
            }
        } catch (Exception e) {
            Exceptions.printStackTrace(e);
        }
    }

    private void createTable() {
        table = Tables.newCustomTable(
                Maps.<String, Map<String, Double>>newHashMap(),
                new Supplier<Map<String, Double>>() {
            @Override
            public Map<String, Double> get() {
                return Maps.newHashMap();
            }
        });
    }
}
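/*
The nested loops in NsgaExp.run() walk the lower triangle of the criteria
matrix, so every unordered pair of criteria is evaluated exactly once. A
minimal, self-contained sketch of that enumeration is given here; the class
name PairEnumerationSketch and the sample criterion names are illustrative
assumptions and are not part of the benchmark.

```java
import java.util.ArrayList;
import java.util.List;

final class PairEnumerationSketch {

    // Returns all pairs {names[i], names[j]} with j < i,
    // i.e. the lower triangular matrix without the diagonal.
    static List<String[]> pairs(String[] names) {
        List<String[]> result = new ArrayList<>();
        for (int i = 0; i < names.length; i++) {
            for (int j = 0; j < i; j++) {
                result.add(new String[]{names[i], names[j]});
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // sample names only; the benchmark takes them from the evaluator factory
        String[] criteria = {"AIC", "BIC", "Davies-Bouldin", "Silhouette"};
        for (String[] p : pairs(criteria)) {
            System.out.println(p[0] + " & " + p[1]);
        }
    }
}
```

For n criteria this gives n(n - 1)/2 pairs, and each pair is run
params.repeat times per dataset.
*/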


================================================
FILE: src/main/java/org/clueminer/clustering/benchmark/nsga/NsgaParams.java
================================================
/*
 * Copyright (C) 2011-2016 clueminer.org
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation, either version 3 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program.  If not, see <http://www.gnu.org/licenses/>.
 */
package org.clueminer.clustering.benchmark.nsga;

import com.beust.jcommander.Parameter;
import org.clueminer.clustering.benchmark.AbsParams;

/**
 * Command-line parameters of the NSGA scores experiment ({@link NsgaScore}).
 *
 * @author Tomas Barton
 */
public class NsgaParams extends AbsParams {

    @Parameter(names = "--test", description = "test only on one dataset")
    public boolean test = false;

    @Parameter(names = "--generations", description = "number of generations in evolution")
    public int generations = 10;

    @Parameter(names = "--population", description = "size of population in each generation")
    public int population = 20;

    @Parameter(names = "--solutions", description = "number of final solutions which will be returned as result")
    public int solutions = 10;

    @Parameter(names = "--supervised", description = "supervised criterion for external evaluation")
    public String supervised = "Adjusted Rand";

    @Parameter(names = "--mutation", description = "probability of mutation")
    public double mutation = 0.5;

    @Parameter(names = "--crossover", description = "probability of crossover")
    public double crossover = 0.5;

    @Parameter(names = "--dataset", description = "use specific dataset")
    public String dataset = null;

    @Parameter(names = "--limit-k", description = "limit max. number of clusters to sqrt(n)")
    public boolean limitK = false;

}


================================================
FILE: src/main/java/org/clueminer/clustering/benchmark/nsga/NsgaScore.java
================================================
/*
 * Copyright (C) 2011-2016 clueminer.org
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation, either version 3 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program.  If not, see <http://www.gnu.org/licenses/>.
 */
package org.clueminer.clustering.benchmark.nsga;

import com.beust.jcommander.JCommander;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import org.clueminer.clustering.api.ClusterEvaluation;
import org.clueminer.clustering.api.InternalEvaluator;
import org.clueminer.clustering.api.factory.InternalEvaluatorFactory;
import org.clueminer.clustering.benchmark.Bench;
import static org.clueminer.clustering.benchmark.Bench.ensureFolder;
import static org.clueminer.clustering.benchmark.Bench.printUsage;
import org.clueminer.dataset.api.Dataset;
import org.clueminer.dataset.api.Instance;
import org.clueminer.log.ClmLog;

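/**
 * Benchmark entry point that uses all pairs of registered internal evaluators
 * as objectives of multi-objective evolution (see {@link NsgaExp}).
 *
 * @author Tomas Barton
 */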
public class NsgaScore extends Bench {

    public static final String name = "nsga-scores";

    protected static NsgaParams parseArguments(String[] args) {
        NsgaParams params = new NsgaParams();
        JCommander cmd = new JCommander(params);
        printUsage(args, cmd, params);
        return params;
    }

    @Override
    public void main(String[] args) {
        NsgaParams params = parseArguments(args);
        if (params.test || params.dataset != null) {
            if (params.test) {
                load("iris");
            } else {
                load(params.dataset);
            }
        } else {
            loadDatasets();
        }
        System.out.println("loaded dataset");
        ClmLog.setup(params.log);

        int i = 0;
        for (Map.Entry<String, Map.Entry<Dataset<? extends Instance>, Integer>> e : availableDatasets.entrySet()) {
            System.out.println((i++) + ":" + e.getKey());
        }

        benchmarkFolder = params.home + '/' + "benchmark" + '/' + name;
        ensureFolder(benchmarkFolder);
        System.out.println("writing results to: " + benchmarkFolder);

        System.out.println("=== starting " + name);
        List<InternalEvaluator> eval = InternalEvaluatorFactory.getInstance().getAll();
        ClusterEvaluation[] scores = eval.toArray(new ClusterEvaluation[eval.size()]);
        System.out.println("scores size: " + scores.length);
        NsgaExp exp = new NsgaExp(params, benchmarkFolder, scores, availableDatasets);
        ExecutorService execService = Executors.newFixedThreadPool(1);
        execService.submit(exp);
        execService.shutdown();
    }
}


================================================
FILE: src/main/java/org/clueminer/clustering/benchmark/partition/PartitionBench.java
================================================
/*
 * Copyright (C) 2011-2016 clueminer.org
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation, either version 3 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program.  If not, see <http://www.gnu.org/licenses/>.
 */
package org.clueminer.clustering.benchmark.partition;

import java.util.Random;
import static org.clueminer.chameleon.Chameleon.K;
import static org.clueminer.chameleon.Chameleon.MAX_PARTITION;
import org.clueminer.clustering.benchmark.GnuplotReporter;
import org.clueminer.dataset.api.Dataset;
import org.clueminer.dataset.api.Instance;
import org.clueminer.dataset.impl.ArrayDataset;
import org.clueminer.graph.adjacencyMatrix.AdjMatrixGraph;
import org.clueminer.graph.api.Graph;
import org.clueminer.graph.knn.KNNGraphBuilder;
import org.clueminer.partitioning.api.Partitioning;
import org.clueminer.report.NanoBench;
import org.clueminer.utils.Props;
import org.openide.util.Exceptions;

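/**
 * Measures the running time of graph partitioning algorithms on k-NN graphs
 * built from randomly generated datasets of increasing size.
 *
 * @author deric
 */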
public class PartitionBench implements Runnable {

    protected final Random rand;
    protected final PartitionParams params;
    protected final Partitioning[] algorithms;
    protected final String results;

    public PartitionBench(PartitionParams params, String results, Partitioning[] algorithms) {
        rand = new Random();
        this.params = params;
        this.results = results;
        this.algorithms = algorithms;
    }

    @Override
    public void run() {
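        //guard against a zero increment (n - nSmall < steps), which would make
        //the size loop below run forever
        int inc = Math.max(1, (params.n - params.nSmall) / params.steps);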

        String[] names = new String[algorithms.length];
        int j = 0;
        for (Partitioning alg : algorithms) {
            names[j++] = alg.getName();
        }

        GnuplotReporter reporter = new GnuplotReporter(results,
                new String[]{"algorithm", "edges", "n"}, names, params.nSmall + "-" + params.n, 10);
        System.out.println("increment = " + inc);
        KNNGraphBuilder knn = new KNNGraphBuilder();
        Props pref = new Props();
        for (int i = params.nSmall; i <= params.n; i += inc) {
            Graph g = new AdjMatrixGraph(i);
            Dataset<? extends Instance> dataset = generateData(i, params.dimension);
            int datasetK = determineK(dataset);
            int maxPartitionSize = determineMaxPartitionSize(dataset);
            pref.putInt(MAX_PARTITION, maxPartitionSize);
            pref.putInt(K, datasetK);
            g = knn.getNeighborGraph(dataset, g, datasetK);

            for (Partitioning alg : algorithms) {
                String[] opts = new String[]{alg.getName(), String.valueOf(g.getEdgeCount()), String.valueOf(dataset.size())};
                NanoBench.create().measurements(params.repeat).collect(reporter, opts).measure(
                        alg.getName() + " - " + dataset.size(),
                        bench(alg, g, maxPartitionSize, pref)
                );
                // Get the Java runtime
                Runtime runtime = Runtime.getRuntime();
                // Run the garbage collector
                runtime.gc();
                try {
                    Thread.sleep(1000);
                } catch (InterruptedException ex) {
                    Exceptions.printStackTrace(ex);
                }
            }
        }
        reporter.finish();
    }

    public Runnable bench(final Partitioning algorithm, final Graph g, final int maxPartitionSize, final Props props) {

        final Runnable runnable = new Runnable() {

            @Override
            public void run() {
                algorithm.partition(maxPartitionSize, g, props);
            }

        };
        return runnable;
    }

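    /**
     * Generates a random dataset with values drawn uniformly from [0, 1).
     *
     * @param size number of instances (rows)
     * @param dim  number of numeric attributes (columns)
     * @return dataset filled with random doubles
     */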
    protected Dataset<? extends Instance> generateData(int size, int dim) {
        System.out.println("generating data: " + size + " x " + dim);
        Dataset<? extends Instance> dataset = new ArrayDataset<>(size, dim);
        for (int i = 0; i < dim; i++) {
            dataset.attributeBuilder().create("attr-" + i, "NUMERIC");
        }
        for (int i = 0; i < size; i++) {
            dataset.instance(i).setName(String.valueOf(i));
            for (int j = 0; j < dim; j++) {
                dataset.set(i, j, rand.nextDouble());
            }
        }

        return dataset;
    }

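    /**
     * Number of neighbors for the k-NN graph: floor(log2(n)) for datasets
     * with fewer than 500 instances, twice that otherwise.
     */
    private int determineK(Dataset<? extends Instance> dataset) {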
        if (dataset.size() < 500) {
            return (int) (Math.log(dataset.size()) / Math.log(2));
        } else {
            return (int) (Math.log(dataset.size()) / Math.log(2)) * 2;
        }
    }

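    /**
     * Maximum partition size: 5 for fewer than 500 instances, n/100 for
     * fewer than 2000 instances, n/200 otherwise.
     */
    private int determineMaxPartitionSize(Dataset<? extends Instance> dataset) {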
        if (dataset.size() < 500) {
            return 5;
        } else if ((dataset.size() < 2000)) {
            return dataset.size() / 100;
        } else {
            return dataset.size() / 200;
        }
    }

}
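/*
The size sweep in run() and the two heuristics (determineK and
determineMaxPartitionSize) can be traced without any clueminer dependency.
The sketch here reproduces only the arithmetic; the class name
PartitionSweepSketch and its method names are hypothetical, the parameter
values are the defaults from PartitionParams, and the increment is clamped
to at least 1 so the loop always terminates.

```java
final class PartitionSweepSketch {

    // Same rule as determineK: floor(log2(n)), doubled from 500 instances up.
    static int neighbors(int n) {
        int log2 = (int) (Math.log(n) / Math.log(2));
        return n < 500 ? log2 : log2 * 2;
    }

    // Same rule as determineMaxPartitionSize.
    static int maxPartitionSize(int n) {
        if (n < 500) {
            return 5;
        } else if (n < 2000) {
            return n / 100;
        }
        return n / 200;
    }

    public static void main(String[] args) {
        int nSmall = 5, n = 20, steps = 4; // defaults from PartitionParams
        int inc = Math.max(1, (n - nSmall) / steps); // never step by zero
        for (int size = nSmall; size <= n; size += inc) {
            System.out.println("n=" + size + " k=" + neighbors(size)
                    + " maxPartition=" + maxPartitionSize(size));
        }
    }
}
```

Because the increment uses integer division, the largest generated dataset
can be smaller than n when (n - nSmall) is not a multiple of steps.
*/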


================================================
FILE: src/main/java/org/clueminer/clustering/benchmark/partition/PartitionExp.java
================================================
/*
 * Copyright (C) 2011-2016 clueminer.org
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation, either version 3 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program.  If not, see <http://www.gnu.org/licenses/>.
 */
package org.clueminer.clustering.benchmark.partition;

import com.beust.jcommander.JCommander;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import org.clueminer.clustering.benchmark.Bench;
import static org.clueminer.clustering.benchmark.Bench.ensureFolder;
import static org.clueminer.clustering.benchmark.Bench.printUsage;
import org.clueminer.partitioning.api.Partitioning;
import org.clueminer.partitioning.impl.RecursiveBisection;

/**
 * An experiment to test time complexity of graph partitioning methods
 *
 * @author deric
 */
public class PartitionExp extends Bench {

    public static final String name = "partition";

    protected static PartitionParams parseArguments(String[] args) {
        PartitionParams params = new PartitionParams();
        JCommander cmd = new JCommander(params);
        printUsage(args, cmd, params);
        return params;
    }

    @Override
    public void main(String[] args) {
        PartitionParams params = parseArguments(args);

        benchmarkFolder = params.home + '/' + "benchmark" + '/' + name;
        ensureFolder(benchmarkFolder);
        System.out.println("writing results to: " + benchmarkFolder);

        System.out.println("=== starting " + name);

        Partitioning[] algorithms = new Partitioning[]{
            new RecursiveBisection()
        };

        Runnable exp = new PartitionBench(params, name, algorithms);

        ExecutorService execService = Executors.newFixedThreadPool(1);
        execService.submit(exp);
        execService.shutdown();
    }

}


================================================
FILE: src/main/java/org/clueminer/clustering/benchmark/partition/PartitionParams.java
================================================
/*
 * Copyright (C) 2011-2016 clueminer.org
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation, either version 3 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program.  If not, see <http://www.gnu.org/licenses/>.
 */
package org.clueminer.clustering.benchmark.partition;

import com.beust.jcommander.Parameter;
import org.clueminer.clustering.benchmark.AbsParams;

/**
 * Command-line parameters of the graph partitioning benchmark
 * ({@link PartitionExp}).
 *
 * @author deric
 */
public class PartitionParams extends AbsParams {

    @Parameter(names = "--n", description = "size of biggest dataset", required = false)
    public int n = 20;

    @Parameter(names = "--n-small", description = "size of smallest dataset", required = false)
    public int nSmall = 5;

    @Parameter(names = "--steps", description = "number of datasets which will be generated")
    public int steps = 4;

    @Parameter(names = "--dimension", description = "number of attributes of each dataset")
    public int dimension = 5;

}


================================================
FILE: src/main/java/org/clueminer/data/DataLoader.java
================================================
/*
 * Copyright (C) 2011-2016 clueminer.org
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation, either version 3 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program.  If not, see <http://www.gnu.org/licenses/>.
 */
package org.clueminer.data;

import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URL;
import java.util.Collection;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.TreeMap;
import org.clueminer.dataset.api.DataProvider;
import org.clueminer.dataset.api.Dataset;
import org.clueminer.dataset.api.Instance;
import org.clueminer.dataset.impl.ArrayDataset;
import org.clueminer.exception.ParserError;
import org.clueminer.io.arff.ARFFHandler;
import org.openide.util.Exceptions;

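/**
 * Loads benchmark datasets (currently ARFF files only) from the classpath or
 * the file system and caches them by name.
 *
 * @author deric
 */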
public class DataLoader implements DataProvider {

    private final Map<String, String> datasets;
    private final Map<String, String> fullPaths;
    private final Map<String, Dataset<? extends Instance>> cache;
    private String prefix = "datasets" + File.separatorChar + "artificial";

    public DataLoader(Map<String, String> datasets, String prefix, Map<String, String> fullPaths) {
        this.datasets = datasets;
        this.cache = new HashMap<>(datasets.size());
        this.prefix = prefix;
        this.fullPaths = fullPaths;
    }

    @Override
    public String[] getDatasetNames() {
        return datasets.keySet().toArray(new String[0]);
    }

    @Override
    public Dataset<? extends Instance> getDataset(String name) {
        if (cache.containsKey(name)) {
            return cache.get(name);
        }
        if (!datasets.containsKey(name)) {
            throw new RuntimeException("unknown dataset " + name);
        }

        Dataset<? extends Instance> dataset = loadDataset(name, datasets.get(name), fullPaths.get(name));
        cache.put(name, dataset);
        return dataset;
    }

    @Override
    public Dataset<? extends Instance> first() {
        Iterator<String> it = datasets.keySet().iterator();
        if (!it.hasNext()) {
            throw new RuntimeException("no datasets were loaded");
        }
        return getDataset(it.next());
    }

    @Override
    public int count() {
        return datasets.size();
    }

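    /**
     * Loads a dataset by its name. Only the "arff" format is currently
     * supported; any other type results in a RuntimeException.
     *
     * @param name     dataset name (file name without extension)
     * @param type     file extension determining the format, e.g. "arff"
     * @param fullPath full path to the resource, used when the classpath
     *                 lookup fails
     * @return the dataset
     */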
    private Dataset<? extends Instance> loadDataset(String name, String type, String fullPath) {
        Dataset<? extends Instance> dataset = null;
        switch (type) {
            case "arff":
                //TODO: multi dimensions support
                dataset = new ArrayDataset<>(10, 2);
                dataset.setName(name);
                ARFFHandler arff = new ARFFHandler();
                try {
                    arff.load(resource(name + "." + type, fullPath), dataset);
                } catch (FileNotFoundException | ParserError ex) {
                    Exceptions.printStackTrace(ex);
                }
                break;
            default:
                throw new RuntimeException("unsupported format " + type);
        }

        return dataset;
    }

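    /**
     * Resolves a dataset resource to a file. A resource packed inside a JAR
     * cannot be opened directly as a file, so in that case its content is
     * copied to a temporary file which is deleted on JVM exit.
     *
     * @param path     resource name relative to the prefix
     * @param fullPath full path to the resource, used as a fallback
     * @return file with the resource content
     */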
    public File resource(String path, String fullPath) {
        String resource = prefix + File.separatorChar + path;
        File file;
        URL url = DataLoader.class.getResource(resource);
        if (url == null) {
            //probably on Windows
            file = new File(fullPath);
            if (file.exists()) {
                return file;
            }
            //non existing URL
            //no classpath, compiled as JAR
            //if path is in form: "jar:path.jar!resource/data.arff"
            int pos = fullPath.lastIndexOf("!");
            if (pos > 0) {
                resource = fullPath.substring(pos + 1);
                if (!resource.startsWith("/")) {
                    //necessary for loading as a stream
                    resource = "/" + resource;
                }
            }
            return loadResource(resource);
        }

        if (url.toString().startsWith("jar:")) {
            return loadResource(resource);
        } else {
            file = new File(url.getFile());
        }
        return file;
    }

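    private File loadResource(String resource) {
        File file = null;
        //try-with-resources closes both streams even when copying fails
        try (InputStream input = getClass().getResourceAsStream(resource)) {
            if (input == null) {
                //getResourceAsStream returns null for a missing resource
                System.err.println("resource not found: " + resource);
                return null;
            }
            file = File.createTempFile("nodesfile", ".tmp");
            file.deleteOnExit();
            try (OutputStream out = new FileOutputStream(file)) {
                int read;
                byte[] bytes = new byte[1024];

                while ((read = input.read(bytes)) != -1) {
                    out.write(bytes, 0, read);
                }
            }
        } catch (IOException ex) {
            System.err.println(ex.toString());
        }
        return file;
    }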

    public static DataProvider createLoader(String p1, String p2) {
        String path = p1 + File.pathSeparatorChar + p2;
        Map<String, String> datasets = new TreeMap<>();
        Map<String, String> paths = new HashMap<>();

        final Collection<String> list = ResourceList.getResources(p1, p2);
        int idx, dot;
        String dataset;
        String ext;
        for (final String name : list) {
            idx = name.lastIndexOf(File.separatorChar);
            dot = name.lastIndexOf(".");
            if (dot > 0) {
                dataset = name.substring(idx + 1, dot);
            } else {
                dataset = name;
            }
            ext = name.substring(dot + 1);
            datasets.put(dataset, ext);
            paths.put(dataset, name);
        }

        return new DataLoader(datasets, path, paths);
    }

    @Override
    public Iterator<Dataset<? extends Instance>> iterator() {
        return new DataLoaderIterator();
    }

    @Override
    public boolean hasDataset(String name) {
        return datasets.containsKey(name);
    }

    private class DataLoaderIterator implements Iterator<Dataset<? extends Instance>> {

        private final Iterator<String> it;

        public DataLoaderIterator() {
            it = datasets.keySet().iterator();
        }

        @Override
        public boolean hasNext() {
            return it.hasNext();
        }

        @Override
        public Dataset<? extends Instance> next() {
            return getDataset(it.next());
        }

        @Override
        public void remove() {
            throw new UnsupportedOperationException("not supported yet.");
        }

    }

}
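
A note on createLoader above: it splits each resource name on File.separatorChar, while the names produced by ResourceList.loadFromJar always use "/" after the "!" marker. The following is a minimal sketch of a separator-tolerant split, not the repository's implementation. The class DatasetNameSketch and its split method are hypothetical names introduced only for illustration.

```java
import java.io.File;

public class DatasetNameSketch {

    /**
     * Splits a resource path into {name, extension}. The extension is an
     * empty string when the file name contains no dot after its first char.
     */
    public static String[] split(String path) {
        // JAR entries always use '/', files on disk use the platform separator
        int idx = Math.max(path.lastIndexOf('/'), path.lastIndexOf(File.separatorChar));
        String fileName = path.substring(idx + 1);
        int dot = fileName.lastIndexOf('.');
        if (dot > 0) {
            return new String[]{fileName.substring(0, dot), fileName.substring(dot + 1)};
        }
        return new String[]{fileName, ""};
    }

    public static void main(String[] args) {
        // hypothetical JAR location, only the part after the last '/' matters
        String[] parts = split("jar:/tmp/bench.jar!datasets/artificial/2d-10c.arff");
        System.out.println(parts[0] + " (" + parts[1] + ")");
    }
}
```

Looking for the dot only inside the file name (rather than in the whole path) avoids picking up a dot from a directory or JAR name when the file itself has no extension.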


================================================
FILE: src/main/java/org/clueminer/data/ResourceList.java
================================================
/*
 * Copyright (C) 2011-2016 clueminer.org
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation, either version 3 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program.  If not, see <http://www.gnu.org/licenses/>.
 */
package org.clueminer.data;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.net.URISyntaxException;
import java.net.URL;
import java.net.URLDecoder;
import java.util.Collection;
import java.util.Enumeration;
import java.util.LinkedList;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.regex.Pattern;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import org.openide.util.Exceptions;
import org.openide.util.Utilities;

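/**
 * Code from Stack Overflow.
 *
 * @see <a href="http://stackoverflow.com/questions/3923129/get-a-list-of-resources-from-classpath-directory">http://stackoverflow.com/questions/3923129/get-a-list-of-resources-from-classpath-directory</a>
 * @author deric
 */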
public class ResourceList {

    private static final String OS = System.getProperty("os.name").toLowerCase();

    /**
     * Collects resources from all elements of java.class.path whose path
     * contains p1 followed by p2, with any single character between them as
     * the separator. (A pattern of ".*" would match all resources.)
     *
     * @param p1 first part of path
     * @param p2 second part of path
     * @return the resources in the order they are found
     */
    public static Collection<String> getResources(String p1, String p2) {
        final List<String> retval = new LinkedList<>();
        final String classPath = System.getProperty("java.class.path", ".");
        String pathSeparator;
        //platform independent regexp
        Pattern pattern = Pattern.compile("(.*)" + p1 + "(.)" + p2 + "(.*)");
        if (isWindows()) {
            try {
                Enumeration<URL> en = ResourceList.class.getClassLoader().getResources("datasets");
                if (en.hasMoreElements()) {
                    URL metaInf = en.nextElement();
                    File fileMetaInf = Utilities.toFile(metaInf.toURI());
                    browseFiles(retval, fileMetaInf, pattern);
                }
            } catch (IOException | URISyntaxException ex) {
                Exceptions.printStackTrace(ex);
            }
            if (retval.size() > 0) {
                return retval;
            }
            pathSeparator = ";";
        } else {
            pathSeparator = ":";
        }
        //when running from an IDE we can use the classpath and read files directly from disk
        final String[] classPathElements = classPath.split(pathSeparator);
        for (final String element : classPathElements) {
            retval.addAll(getResources(element, pattern));
        }
        if (retval.isEmpty()) {
            //last resort, when compiled into JAR
            loadFromJar(retval, pattern);
        }
        return retval;
    }

    private static void browseFiles(final List<String> retval, File fileMetaInf, final Pattern pattern) {
        File[] files = fileMetaInf.listFiles();
        if (files == null) {
            //not a directory or it could not be read
            return;
        }
        for (File f : files) {
            if (f.isDirectory()) {
                browseFiles(retval, f, pattern);
            } else {
                String fileName = f.getAbsolutePath();
                final boolean accept = pattern.matcher(fileName).matches();
                if (accept) {
                    retval.add(fileName);
                }
            }
        }
    }

    private static Collection<String> getResources(final String element, final Pattern pattern) {
        final List<String> retval = new LinkedList<>();
        final File file = new File(element);
        if (file.isDirectory()) {
            retval.addAll(getResourcesFromDirectory(file, pattern));
        } else if (file.exists()) {
            retval.addAll(getResourcesFromJarFile(file, pattern));
        } else {
            System.err.println("can't open file: " + file);
        }
        return retval;
    }

    private static Collection<String> getResourcesFromJarFile(final File file, final Pattern pattern) {
        final List<String> retval = new LinkedList<>();
        //try-with-resources closes the stream even when reading fails
        try (ZipInputStream zip = new ZipInputStream(new FileInputStream(file))) {
            ZipEntry ze;
            //getNextEntry() returns null when there are no more entries
            while ((ze = zip.getNextEntry()) != null) {
                final String fileName = ze.getName();
                final boolean accept = pattern.matcher(fileName).matches();
                if (accept) {
                    retval.add(fileName);
                }
            }
        } catch (IOException ex) {
            //also covers FileNotFoundException, a subclass of IOException
            Exceptions.printStackTrace(ex);
        }

        return retval;
    }

    private static Collection<String> getResourcesFromDirectory(
            final File directory,
            final Pattern pattern) {
        final List<String> retval = new LinkedList<>();
        final File[] fileList = directory.listFiles();
        if (fileList == null) {
            //not a directory or it could not be read
            return retval;
        }
        for (final File file : fileList) {
            if (file.isDirectory()) {
                retval.addAll(getResourcesFromDirectory(file, pattern));
            } else {
                try {
                    final String fileName = file.getCanonicalPath();
                    final boolean accept = pattern.matcher(fileName).matches();
                    if (accept) {
                        retval.add(fileName);
                    }
                } catch (final IOException e) {
                    throw new Error(e);
                }
            }
        }
        return retval;
    }

    public static boolean isWindows() {
        return (OS.contains("win"));
    }

    public static boolean isMac() {
        return (OS.contains("mac"));
    }

    public static boolean isUnix() {
        return (OS.contains("nix") || OS.contains("nux") || OS.contains("aix"));
    }

    public static boolean isSolaris() {
        return (OS.contains("sunos"));
    }

    /**
     * Lists resources from the compiled JAR and selects those matching the
     * pattern.
     *
     * @param retval list to which matching resources are added
     * @param pattern pattern that an entry name must match
     */
    private static void loadFromJar(List<String> retval, Pattern pattern) {
        try {
            Enumeration<URL> en = ResourceList.class.getClassLoader().getResources("datasets");
            if (en.hasMoreElements()) {
                URL metaInf = en.nextElement();
                File file;

                String path = metaInf.getPath();
                String jarFilePath = path.substring(path.indexOf(":") + 1, path.indexOf("!"));
                jarFilePath = URLDecoder.decode(jarFilePath, "UTF-8");
                file = new File(jarFilePath);

                try (JarFile jar = new JarFile(file)) {
                    JarEntry entry;
                    String fileName;
                    Enumeration<JarEntry> enumer = jar.entries();
                    while (enumer.hasMoreElements()) {
                        entry = enumer.nextElement();
                        fileName = entry.getName();
                        if (pattern.matcher(fileName).matches()) {
                            //don't add directories
                            if (!fileName.endsWith("/")) {
                                retval.add("jar:" + jarFilePath + "!" + fileName);
                            }
                        }
                    }

                }
            }
        } catch (IOException ex) {
            Exceptions.printStackTrace(ex);
        }
    }

}
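
The JAR branch of ResourceList above boils down to "enumerate the archive entries and keep the names that match a pattern". The following self-contained sketch shows that step in isolation. It is an illustration under stated assumptions, not the repository's code: the class JarListingSketch, its list and sampleJar methods, and the three entry names written into the throwaway JAR are all made up for the example.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;
import java.util.regex.Pattern;

public class JarListingSketch {

    /** Lists non-directory entries of a JAR whose names match the pattern. */
    public static List<String> list(File jarFile, Pattern pattern) {
        List<String> names = new ArrayList<>();
        try (JarFile jar = new JarFile(jarFile)) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                JarEntry entry = entries.nextElement();
                if (!entry.isDirectory() && pattern.matcher(entry.getName()).matches()) {
                    names.add(entry.getName());
                }
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return names;
    }

    /** Builds a small throwaway JAR (hypothetical content) so the sketch runs on its own. */
    public static File sampleJar() {
        String[] entryNames = {
            "datasets/artificial/a.arff",
            "datasets/real-world/b.arff",
            "other/c.txt"
        };
        try {
            File jar = File.createTempFile("sketch", ".jar");
            jar.deleteOnExit();
            try (JarOutputStream out = new JarOutputStream(new FileOutputStream(jar))) {
                for (String name : entryNames) {
                    out.putNextEntry(new JarEntry(name));
                    out.write("@RELATION x\n".getBytes(StandardCharsets.UTF_8));
                    out.closeEntry();
                }
            }
            return jar;
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) {
        // same pattern shape as getResources(p1, p2) with p1 = "datasets", p2 = "artificial"
        Pattern pattern = Pattern.compile("(.*)datasets(.)artificial(.*)");
        System.out.println(list(sampleJar(), pattern));
    }
}
```

JarFile reads the archive's central directory, so every entry is visited regardless of how the entry data is laid out, and JarEntry.isDirectory() replaces the manual check for a trailing "/".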


================================================
FILE: src/main/nbm/manifest.mf
================================================
Manifest-Version: 1.0
OpenIDE-Module-Localizing-Bundle: org/clueminer/clustering/benchmark/Bundle.properties


================================================
FILE: src/main/resources/datasets/artificial/2d-10c.arff
================================================
@RELATION 2d-10c
%
@ATTRIBUTE a0 REAL
@ATTRIBUTE a1 REAL
@ATTRIBUTE CLASS {0,1,2,3,4,5,6,7,8,9}
%
@DATA
1.00007,40.9378,0
0.99736,41.1714,0
0.134799,41.8113,0
2.47585,41.6346,0
-3.0587,41.3887,0
0.73266,41.4888,0
0.35745,41.8429,0
0.157674,41.7513,0
2.69103,41.8298,0
0.536346,41.5344,0
2.46873,41.4813,0
-1.91946,40.8824,0
3.44394,41.3349,0
2.88157,41.1534,0
-0.53933,41.4057,0
-0.0276636,41.1127,0
0.0933292,41.0051,0
0.209173,41.7993,0
-0.0668266,41.7829,0
1.79969,42.3177,0
0.510155,41.6298,0
0.138588,41.1531,0
-0.0175984,41.694,0
0.618498,41.4093,0
-2.63433,41.9534,0
2.01141,41.222,0
5.57577,41.5869,0
1.40539,41.53,0
3.55797,41.8688,0
2.68052,41.3868,0
2.95654,41.486,0
-0.391807,41.2361,0
4.00381,41.4221,0
1.44796,41.0426,0
0.201187,41.4562,0
0.795719,41.6564,0
0.502018,41.5292,0
0.290078,41.6987,0
1.19913,40.799,0
-1.28096,41.1171,0
0.030134,41.4093,0
1.99597,41.7064,0
3.11367,41.4872,0
0.546399,41.5257,0
-1.00433,41.3616,0
1.22587,42.0243,0
-0.202468,41.863,0
-1.40362,40.9709,0
0.977844,40.6544,0
2.66858,41.613,0
3.87692,41.5591,0
2.62824,41.4168,0
2.06543,41.3289,0
-1.63482,41.3223,0
2.00123,41.8628,0
3.64504,42.0131,0
-2.36311,41.4501,0
2.6411,41.6433,0
-2.92099,40.9228,0
1.66685,41.3844,0
1.35976,41.4936,0
2.40343,42.0717,0
-0.156466,41.3901,0
-0.149501,41.3361,0
2.31285,41.3771,0
2.73428,41.6682,0
0.703931,40.8069,0
2.65785,41.2811,0
4.18733,41.6706,0
0.21038,41.3822,0
-2.37222,41.5596,0
2.14842,42.1958,0
3.90696,41.487,0
0.805582,40.9885,0
2.88554,41.7446,0
0.4224,41.6881,0
4.28893,41.7292,0
0.709065,41.6168,0
-1.114,41.4017,0
0.00767159,41.3293,0
2.98833,41.0246,0
-0.895292,41.5051,0
3.71451,41.4277,0
-0.34788,40.9151,0
3.6333,41.5624,0
3.86924,41.0564,0
-0.228855,40.791,0
3.10995,41.9408,0
2.41484,41.1329,0
3.09267,41.8441,0
-1.40569,40.9844,0
2.83167,41.5767,0
-2.08925,41.8549,0
-0.0413047,41.4111,0
0.99321,41.2289,0
5.49799,42.229,0
4.56593,41.3131,0
2.62767,41.2001,0
1.86702,41.0572,0
0.99447,40.9706,0
1.35799,41.9985,0
0.965753,41.2944,0
5.09559,41.0741,0
1.36247,41.1762,0
1.69922,41.2959,0
3.16355,41.7048,0
5.25389,41.8303,0
0.531934,41.5568,0
3.28948,41.5843,0
0.10212,41.6421,0
-2.09836,41.0296,0
1.1946,41.225,0
0.0879961,41.7027,0
-0.344577,41.9871,0
-0.195498,41.2088,0
2.2287,41.8828,0
3.68595,41.7489,0
2.78496,41.4375,0
-1.03866,41.3832,0
0.83488,40.8743,0
1.10206,41.6165,0
2.73045,42.2562,0
1.85942,42.4662,0
-2.90953,41.7232,0
3.17061,41.78,0
-1.28675,41.7713,0
3.19275,41.3024,0
2.44799,41.3377,0
1.01221,41.3916,0
2.86381,41.3201,0
2.53895,41.8226,0
0.652358,41.6833,0
2.95625,41.2105,0
2.05624,41.7799,0
2.73907,41.5838,0
0.19292,41.4039,0
-1.90824,41.2266,0
3.31292,41.8747,0
-1.17188,41.499,0
-1.19959,41.2945,0
3.40639,41.4131,0
-0.309029,41.7895,0
2.29291,41.4736,0
2.85405,41.4468,0
0.906787,41.8417,0
1.03301,41.6805,0
2.7908,41.5484,0
1.60142,41.0711,0
3.07553,41.3319,0
2.66801,41.3151,0
1.57486,41.8305,0
4.79379,42.3375,0
0.683916,41.4464,0
4.30425,41.5736,0
-4.99684,41.4194,0
0.407559,41.5083,0
1.31942,41.2904,0
-2.36349,41.9116,0
4.31274,41.5094,0
3.94453,41.3956,0
46.3802,3.32142,1
48.4628,3.09102,1
46.1492,3.13285,1
48.8009,2.98271,1
46.13,3.22293,1
48.8492,3.34834,1
52.4008,2.66164,1
45.6532,2.92506,1
51.7098,3.12781,1
49.3856,2.97191,1
54.8398,3.09188,1
51.3912,3.04305,1
48.7477,2.88549,1
47.3895,3.08509,1
45.5833,3.42697,1
52.1513,3.39225,1
54.7523,3.18117,1
47.9111,3.25294,1
43.0308,3.16003,1
52.5527,2.9644,1
51.0376,3.12574,1
39.7875,3.24483,1
51.4247,3.02434,1
54.9527,3.11502,1
55.269,3.23374,1
51.6444,3.05482,1
52.7037,3.40248,1
47.3123,3.11889,1
39.9097,3.19928,1
46.9546,3.21672,1
54.7801,3.45151,1
46.8688,3.30554,1
50.7167,3.3011,1
43.3099,3.13502,1
49.0462,3.30432,1
46.7978,3.14741,1
39.1666,3.25898,1
52.8303,3.2864,1
43.5847,3.09874,1
43.8313,3.26131,1
47.0354,3.37726,1
43.8303,3.00101,1
49.2136,2.82901,1
53.0703,3.25768,1
51.5605,3.13051,1
51.6565,3.03276,1
54.6803,3.21761,1
52.8974,3.11975,1
49.8031,3.24972,1
42.8679,3.06868,1
49.5025,3.32133,1
48.1066,3.25607,1
47.5742,3.31038,1
47.9516,3.0661,1
45.6754,3.06525,1
53.5019,3.2174,1
56.9209,3.22393,1
46.6076,3.35139,1
45.2508,3.26453,1
46.0753,3.31525,1
49.2275,2.83737,1
48.5368,3.22135,1
51.592,3.38263,1
52.079,3.253,1
47.6605,3.08802,1
52.4266,3.27257,1
47.2102,3.07995,1
51.7235,3.18778,1
48.3572,3.40303,1
50.4224,3.43891,1
48.8457,3.17306,1
52.3564,3.11401,1
47.8073,3.37175,1
45.2018,3.14158,1
42.0126,3.31301,1
47.1974,3.31478,1
44.5805,3.09813,1
47.9305,3.36138,1
46.6296,3.05323,1
51.7158,3.24734,1
48.241,3.1586,1
43.1457,3.10903,1
53.5328,3.23527,1
51.2686,3.05155,1
52.6787,3.26477,1
47.3229,3.25546,1
50.6885,2.86228,1
56.2172,3.26755,1
52.5805,3.00891,1
41.348,3.07628,1
44.471,3.22271,1
45.5043,3.17306,1
52.9761,3.46274,1
48.554,2.92317,1
51.0304,3.05824,1
55.3886,3.30659,1
43.4255,3.25965,1
51.9739,3.21241,1
55.4508,3.21214,1
48.8366,3.10592,1
49.6574,3.43084,1
48.2048,3.10344,1
48.3355,3.2727,1
51.1717,2.86299,1
48.2203,3.30793,1
45.1759,3.38008,1
52.557,3.41086,1
50.9924,2.98479,1
49.8621,3.36127,1
49.4336,3.19372,1
48.7223,3.22935,1
46.0376,3.53939,1
51.1494,3.09522,1
51.5877,3.53584,1
54.3164,3.11213,1
55.9106,3.02007,1
51.3509,3.13694,1
43.6607,3.15844,1
50.3681,3.26108,1
52.1362,3.10912,1
45.1316,3.19939,1
49.7714,2.84767,1
45.1409,3.17001,1
48.7293,3.39143,1
55.6218,3.21222,1
45.5667,3.1683,1
50.5703,3.15988,1
40.4224,3.12012,1
46.9099,2.




SYMBOL INDEX (172 symbols across 39 files)

FILE: src/main/java/org/clueminer/clustering/benchmark/AbsParams.java
  class AbsParams (line 28) | public class AbsParams {

FILE: src/main/java/org/clueminer/clustering/benchmark/Bench.java
  class Bench (line 34) | public abstract class Bench {
    method Bench (line 40) | public Bench() {
    method ensureFolder (line 44) | public static void ensureFolder(String folder) {
    method main (line 55) | public abstract void main(String[] args);
    method printUsage (line 57) | public static void printUsage(String[] args, JCommander cmd, AbsParams...
    method loadDatasets (line 69) | protected void loadDatasets() {
    method loadBenchArtificial (line 77) | protected void loadBenchArtificial() {
    method loadBenchRealWorld (line 81) | protected void loadBenchRealWorld() {
    method load (line 90) | protected void load(String name) {
    method safeName (line 100) | public static String safeName(String name) {

FILE: src/main/java/org/clueminer/clustering/benchmark/BenchParams.java
  class BenchParams (line 25) | public class BenchParams extends AbsParams {

FILE: src/main/java/org/clueminer/clustering/benchmark/ClusteringBenchmark.java
  class ClusteringBenchmark (line 35) | public class ClusteringBenchmark<E extends Instance> {
    method cluster (line 37) | public Container<E> cluster(Dataset<E> dataset, Props props) {
    method cluster (line 44) | public Container<E> cluster(ClusteringAlgorithm algorithm, Dataset<E> ...
    method hclust (line 48) | public Container<E> hclust(final AgglomerativeClustering algorithm, fi...
    method singleLinkage (line 55) | public Container<E> singleLinkage(final AgglomerativeClustering algori...
    method completeLinkage (line 59) | public Container<E> completeLinkage(final AgglomerativeClustering algo...

FILE: src/main/java/org/clueminer/clustering/benchmark/Container.java
  class Container (line 38) | public class Container<E extends Instance> implements Runnable {
    method Container (line 47) | public Container(ClusteringAlgorithm algorithm, Dataset<E> dataset) {
    method Container (line 54) | public Container(ClusteringAlgorithm algorithm, Dataset<E> dataset, Pr...
    method hierarchical (line 61) | public HierarchicalResult hierarchical(AgglomerativeClustering algorit...
    method run (line 66) | @Override
    method cluster (line 75) | public Clustering cluster(ClusteringAlgorithm algorithm, Dataset<E> da...
    method equals (line 79) | public boolean equals(Container other) {

FILE: src/main/java/org/clueminer/clustering/benchmark/Experiment.java
  class Experiment (line 36) | public class Experiment<E extends Instance> implements Runnable {
    method Experiment (line 44) | public Experiment(BenchParams params, String results) {
    method Experiment (line 50) | public Experiment(BenchParams params, String results, ClusteringAlgori...
    method run (line 57) | @Override
    method generateData (line 111) | public Dataset<E> generateData(int size, int dim) {

FILE: src/main/java/org/clueminer/clustering/benchmark/GnuplotReporter.java
  class GnuplotReporter (line 36) | public class GnuplotReporter extends GnuplotHelper implements Reporter {
    method GnuplotReporter (line 42) | public GnuplotReporter(String folder, String[] opts, String[] algorith...
    method writeHeader (line 70) | private void writeHeader(String[] opts) {
    method finalResult (line 80) | @Override
    method writePlotScript (line 99) | private void writePlotScript(File file, String script) {
    method plotCpu (line 111) | private String plotCpu(int labelPos, String yLabel, int x, int y, Stri...
    method plotComplexity (line 143) | private String plotComplexity(int labelPos, String yLabel, int x, int ...
    method finish (line 182) | public void finish() {
    method writeBashScript (line 186) | private void writeBashScript(String dataDir) {
    method bashPlotScript (line 211) | public static void bashPlotScript(String[] plots, String dir, String g...

FILE: src/main/java/org/clueminer/clustering/benchmark/Main.java
  class Main (line 38) | public class Main {
    method Main (line 43) | public Main() {
    method main (line 62) | public static void main(String[] args) {
    method usage (line 80) | private static void usage() {

FILE: src/main/java/org/clueminer/clustering/benchmark/ParamExperiment.java
  class ParamExperiment (line 31) | public class ParamExperiment<E extends Instance> extends Experiment<E> {
    method ParamExperiment (line 35) | public ParamExperiment(BenchParams params, String results) {
    method ParamExperiment (line 39) | public ParamExperiment(BenchParams params, String results, Props[] con...
    method run (line 45) | @Override

FILE: src/main/java/org/clueminer/clustering/benchmark/chameleon2/Cham2Bench.java
  class Cham2Bench (line 34) | public class Cham2Bench extends Hclust {
    method main (line 39) | @Override

FILE: src/main/java/org/clueminer/clustering/benchmark/consensus/ConsensusExp.java
  class ConsensusExp (line 31) | public class ConsensusExp extends Bench {
    method parseArguments (line 35) | protected static ConsensusParams parseArguments(String[] args) {
    method main (line 42) | @Override

FILE: src/main/java/org/clueminer/clustering/benchmark/consensus/ConsensusParams.java
  class ConsensusParams (line 26) | public class ConsensusParams extends AbsParams {

FILE: src/main/java/org/clueminer/clustering/benchmark/consensus/ConsensusRun.java
  class ConsensusRun (line 52) | public class ConsensusRun implements Runnable {
    method ConsensusRun (line 62) | public ConsensusRun(ConsensusParams params, String benchmarkFolder, Da...
    method createTable (line 71) | private void createTable() {
    method run (line 82) | @Override
    method algorithmSetup (line 139) | private Props algorithmSetup(String alg) {

FILE: src/main/java/org/clueminer/clustering/benchmark/cutoff/CutoffComparison.java
  class CutoffComparison (line 51) | public class CutoffComparison implements Runnable {
    method CutoffComparison (line 61) | public CutoffComparison(CutoffParams params, String benchmarkFolder, A...
    method run (line 68) | @Override
    method loadExternalEvals (line 115) | private void loadExternalEvals() {
    method writeHeader (line 124) | private void writeHeader(Dataset<? extends Instance> dataset) {
    method writeValues (line 137) | private void writeValues(CutoffStrategy cutoff, String internaEval, Cl...
    method writeAverages (line 164) | private void writeAverages() {
    method getCutoffStrategy (line 179) | private CutoffStrategy getCutoffStrategy(String strategy, String eval) {
    method initAverages (line 187) | private void initAverages() {
    class Average (line 205) | private class Average {
      method Average (line 212) | Average(String name) {
      method getAverage (line 222) | public double getAverage(String eval) {
      method addValues (line 226) | public void addValues(Clustering c) {

FILE: src/main/java/org/clueminer/clustering/benchmark/cutoff/CutoffExp.java
  class CutoffExp (line 33) | public class CutoffExp extends Bench {
    method parseArguments (line 37) | protected static CutoffParams parseArguments(String[] args) {
    method main (line 44) | @Override
    method createDatasetsArray (line 76) | private ArrayList<Dataset<? extends Instance>> createDatasetsArray(Str...

FILE: src/main/java/org/clueminer/clustering/benchmark/cutoff/CutoffParams.java
  class CutoffParams (line 26) | public class CutoffParams extends AbsParams {

FILE: src/main/java/org/clueminer/clustering/benchmark/cutoff/FirstJumpOptimization.java
  class FirstJumpOptimization (line 46) | public class FirstJumpOptimization implements Runnable {
    method FirstJumpOptimization (line 55) | public FirstJumpOptimization(CutoffParams params, String benchmarkFold...
    method run (line 64) | @Override
    method testParameters (line 102) | private double testParameters(int i, double j) {
    method computeDendrograms (line 135) | private void computeDendrograms(AgglomerativeClustering alg) {
    method getCutoffStrategy (line 143) | private CutoffStrategy getCutoffStrategy(String strategy) {

FILE: src/main/java/org/clueminer/clustering/benchmark/evolve/EvolveExp.java
  class EvolveExp (line 46) | public class EvolveExp implements Runnable {
    method EvolveExp (line 56) | public EvolveExp(EvolveParams params, String benchmarkFolder, ClusterE...
    method run (line 73) | @Override
    method fetchExternal (line 113) | private ClusterEvaluation fetchExternal(String external) {

FILE: src/main/java/org/clueminer/clustering/benchmark/evolve/EvolveParams.java
  class EvolveParams (line 26) | public class EvolveParams extends AbsParams {

FILE: src/main/java/org/clueminer/clustering/benchmark/exp/Data.java
  class Data (line 46) | public class Data extends Bench {
    method main (line 58) | @Override
    method init (line 117) | private void init() {
    method execute (line 141) | public void execute(String datasetName) {

FILE: src/main/java/org/clueminer/clustering/benchmark/exp/EvolveScores.java
  class EvolveScores (line 38) | public class EvolveScores extends Bench {
    method parseArguments (line 42) | protected static EvolveParams parseArguments(String[] args) {
    method main (line 49) | @Override

FILE: src/main/java/org/clueminer/clustering/benchmark/exp/HclusPar.java
  class HclusPar (line 35) | public class HclusPar extends Hclust {
    method main (line 40) | @Override

FILE: src/main/java/org/clueminer/clustering/benchmark/exp/HclusPar2.java
  class HclusPar2 (line 36) | public class HclusPar2 extends Hclust {
    method main (line 41) | @Override

FILE: src/main/java/org/clueminer/clustering/benchmark/exp/Hclust.java
  class Hclust (line 37) | public class Hclust extends Bench {
    method main (line 44) | @Override
    method parseArguments (line 61) | protected static BenchParams parseArguments(String[] args) {
    method printUsage (line 68) | public static void printUsage(String[] args, JCommander cmd, BenchPara...

FILE: src/main/java/org/clueminer/clustering/benchmark/gen/NsgaGen.java
  class NsgaGen (line 37) | public class NsgaGen extends Bench {
    method parseArguments (line 41) | protected static NsgaGenParams parseArguments(String[] args) {
    method main (line 48) | @Override

FILE: src/main/java/org/clueminer/clustering/benchmark/gen/NsgaGenExp.java
  class NsgaGenExp (line 43) | public class NsgaGenExp implements Runnable {
    method NsgaGenExp (line 55) | public NsgaGenExp(NsgaGenParams params, String benchmarkFolder, Cluste...
    method createTable (line 66) | private void createTable() {
    method run (line 77) | @Override

FILE: src/main/java/org/clueminer/clustering/benchmark/gen/NsgaGenParams.java
  class NsgaGenParams (line 26) | public class NsgaGenParams extends AbsParams {

FILE: src/main/java/org/clueminer/clustering/benchmark/nsga/NsgaExp.java
  class NsgaExp (line 43) | public class NsgaExp implements Runnable {
    method NsgaExp (line 54) | public NsgaExp(NsgaParams params, String benchmarkFolder, ClusterEvalu...
    method run (line 64) | @Override
    method createTable (line 124) | private void createTable() {

FILE: src/main/java/org/clueminer/clustering/benchmark/nsga/NsgaParams.java
  class NsgaParams (line 26) | public class NsgaParams extends AbsParams {

FILE: src/main/java/org/clueminer/clustering/benchmark/nsga/NsgaScore.java
  class NsgaScore (line 38) | public class NsgaScore extends Bench {
    method parseArguments (line 42) | protected static NsgaParams parseArguments(String[] args) {
    method main (line 49) | @Override

FILE: src/main/java/org/clueminer/clustering/benchmark/partition/PartitionBench.java
  class PartitionBench (line 38) | public class PartitionBench implements Runnable {
    method PartitionBench (line 45) | public PartitionBench(PartitionParams params, String results, Partitio...
    method run (line 52) | @Override
    method bench (line 96) | public Runnable bench(final Partitioning algorithm, final Graph g, fin...
    method generateData (line 116) | protected Dataset<? extends Instance> generateData(int size, int dim) {
    method determineK (line 132) | private int determineK(Dataset<? extends Instance> dataset) {
    method determineMaxPartitionSize (line 141) | private int determineMaxPartitionSize(Dataset<? extends Instance> data...

FILE: src/main/java/org/clueminer/clustering/benchmark/partition/PartitionExp.java
  class PartitionExp (line 33) | public class PartitionExp extends Bench {
    method parseArguments (line 37) | protected static PartitionParams parseArguments(String[] args) {
    method main (line 44) | @Override

FILE: src/main/java/org/clueminer/clustering/benchmark/partition/PartitionParams.java
  class PartitionParams (line 26) | public class PartitionParams extends AbsParams {

FILE: src/main/java/org/clueminer/data/DataLoader.java
  class DataLoader (line 43) | public class DataLoader implements DataProvider {
    method DataLoader (line 50) | public DataLoader(Map<String, String> datasets, String prefix, Map<Str...
    method getDatasetNames (line 57) | @Override
    method getDataset (line 62) | @Override
    method first (line 76) | @Override
    method count (line 85) | @Override
    method loadDataset (line 97) | private Dataset<? extends Instance> loadDataset(String name, String ty...
    method resource (line 126) | public File resource(String path, String fullPath) {
    method loadResource (line 158) | private File loadResource(String resource) {
    method createLoader (line 177) | public static DataProvider createLoader(String p1, String p2) {
    method iterator (line 202) | @Override
    method hasDataset (line 207) | @Override
    class DataLoaderIterator (line 217) | private class DataLoaderIterator implements Iterator<Dataset<? extends...
      method DataLoaderIterator (line 221) | public DataLoaderIterator() {
      method hasNext (line 225) | @Override
      method next (line 230) | @Override
      method remove (line 235) | @Override

FILE: src/main/java/org/clueminer/data/ResourceList.java
  class ResourceList (line 43) | public class ResourceList {
    method getResources (line 55) | public static Collection<String> getResources(String p1, String p2) {
    method browseFiles (line 91) | private static void browseFiles(final List<String> retval, File fileMe...
    method getResources (line 106) | private static Collection<String> getResources(final String element, f...
    method getResourcesFromJarFile (line 119) | private static Collection<String> getResourcesFromJarFile(final File f...
    method getResourcesFromDirectory (line 148) | private static Collection<String> getResourcesFromDirectory(
    method isWindows (line 171) | public static boolean isWindows() {
    method isMac (line 175) | public static boolean isMac() {
    method isUnix (line 179) | public static boolean isUnix() {
    method isSolaris (line 183) | public static boolean isSolaris() {
    method loadFromJar (line 193) | private static void loadFromJar(List<String> retval, Pattern pattern) {

FILE: src/test/java/org/clueminer/clustering/benchmark/ExperimentTest.java
  class ExperimentTest (line 31) | public class ExperimentTest {
    method testGenerateData (line 35) | @Test

FILE: src/test/java/org/clueminer/clustering/benchmark/HclustBenchmarkTest.java
  class HclustBenchmarkTest (line 36) | public class HclustBenchmarkTest {
    method HclustBenchmarkTest (line 40) | public HclustBenchmarkTest() {
    method testSingleLinkage (line 45) | @Test
    method testCompleteLinkage (line 56) | @Test
    method testSingleLinkageSameResultTwoAlg (line 67) | @Test
    method testSingleLinkageSameResult (line 89) | public void testSingleLinkageSameResult() {
    method testAverageLinkageResult (line 96) | @Test
    method testMedianLinkageResult (line 108) | public void testMedianLinkageResult() {
    method compareTreeResults (line 115) | private void compareTreeResults(Dataset<? extends Instance> dataset, S...

FILE: src/test/java/org/clueminer/clustering/benchmark/chameleon2/Cham2BenchTest.java
  class Cham2BenchTest (line 29) | public class Cham2BenchTest {
    method testMain (line 31) | @Test

FILE: updreadme.rb
  function basename (line 8) | def basename(file)

Condensed preview of 197 files, each showing path, character count, and a content snippet (full structured content: 18,072K chars).

[
  {
    "path": ".gitignore",
    "chars": 125,
    "preview": "*.class\n\n# Package Files #\n*.jar\n*.war\n*.ear\n*~\n.classpath\n.project\n.settings/\ntarget/\nlogs/\n/nbproject/private/\n\n/nbpro"
  },
  {
    "path": "README-old.asc",
    "chars": 5079,
    "preview": "# Clustering datasets\n\n## Datasets\n\nThis project contains collection of labeled clustering problems that can be found in"
  },
  {
    "path": "README.md",
    "chars": 33556,
    "preview": "# Clustering benchmarks\n\n## Datasets\n\nThis project contains collection of labeled clustering problems that can be found "
  },
  {
    "path": "consensus",
    "chars": 167,
    "preview": "#!/bin/bash\nmeth=(\"KmB-COMUSA-RAND\" \"KmB-COMUSA-MO\" \"KmB-COMUSA-RAND-fixed\")\nfor m in \"${meth[@]}\"; do\n ./run consensus "
  },
  {
    "path": "evolve-sc",
    "chars": 86,
    "preview": "#!/bin/bash\nARGS=\"evolve-sc --test --generations 20 --population 50 $@\"\n`./run $ARGS`\n"
  },
  {
    "path": "nb-configuration.xml",
    "chars": 1208,
    "preview": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<project-shared-configuration>\n    <!--\nThis file contains additional configurati"
  },
  {
    "path": "pom.xml",
    "chars": 9505,
    "preview": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<project xmlns=\"http://maven.apache.org/POM/4.0.0\" xmlns:xsi=\"http://www.w3.org/2"
  },
  {
    "path": "run",
    "chars": 350,
    "preview": "#!/bin/bash\nARGS=\"$@\"\nMAIN=\"org.clueminer.clustering.benchmark.Main\"\njarfile=\"$(ls -t target/*jar-with-dependencies.jar "
  },
  {
    "path": "src/main/java/org/clueminer/clustering/benchmark/AbsParams.java",
    "chars": 1391,
    "preview": "/*\n * Copyright (C) 2011-2016 clueminer.org\n *\n * This program is free software: you can redistribute it and/or modify\n "
  },
  {
    "path": "src/main/java/org/clueminer/clustering/benchmark/Bench.java",
    "chars": 3362,
    "preview": "/*\n * Copyright (C) 2011-2016 clueminer.org\n *\n * This program is free software: you can redistribute it and/or modify\n "
  },
  {
    "path": "src/main/java/org/clueminer/clustering/benchmark/BenchParams.java",
    "chars": 1453,
    "preview": "/*\n * Copyright (C) 2011-2016 clueminer.org\n *\n * This program is free software: you can redistribute it and/or modify\n "
  },
  {
    "path": "src/main/java/org/clueminer/clustering/benchmark/ClusteringBenchmark.java",
    "chars": 2454,
    "preview": "/*\n * Copyright (C) 2011-2016 clueminer.org\n *\n * This program is free software: you can redistribute it and/or modify\n "
  },
  {
    "path": "src/main/java/org/clueminer/clustering/benchmark/Container.java",
    "chars": 3188,
    "preview": "/*\n * Copyright (C) 2011-2016 clueminer.org\n *\n * This program is free software: you can redistribute it and/or modify\n "
  },
  {
    "path": "src/main/java/org/clueminer/clustering/benchmark/Experiment.java",
    "chars": 4486,
    "preview": "/*\n * Copyright (C) 2011-2016 clueminer.org\n *\n * This program is free software: you can redistribute it and/or modify\n "
  },
  {
    "path": "src/main/java/org/clueminer/clustering/benchmark/GnuplotReporter.java",
    "chars": 9449,
    "preview": "/*\n * Copyright (C) 2011-2016 clueminer.org\n *\n * This program is free software: you can redistribute it and/or modify\n "
  },
  {
    "path": "src/main/java/org/clueminer/clustering/benchmark/Main.java",
    "chars": 3282,
    "preview": "/*\n * Copyright (C) 2011-2016 clueminer.org\n *\n * This program is free software: you can redistribute it and/or modify\n "
  },
  {
    "path": "src/main/java/org/clueminer/clustering/benchmark/ParamExperiment.java",
    "chars": 3227,
    "preview": "/*\n * Copyright (C) 2011-2016 clueminer.org\n *\n * This program is free software: you can redistribute it and/or modify\n "
  },
  {
    "path": "src/main/java/org/clueminer/clustering/benchmark/chameleon2/Cham2Bench.java",
    "chars": 2401,
    "preview": "/*\n * Copyright (C) 2011-2016 clueminer.org\n *\n * This program is free software: you can redistribute it and/or modify\n "
  },
  {
    "path": "src/main/java/org/clueminer/clustering/benchmark/consensus/ConsensusExp.java",
    "chars": 2178,
    "preview": "/*\n * Copyright (C) 2011-2016 clueminer.org\n *\n * This program is free software: you can redistribute it and/or modify\n "
  },
  {
    "path": "src/main/java/org/clueminer/clustering/benchmark/consensus/ConsensusParams.java",
    "chars": 1551,
    "preview": "/*\n * Copyright (C) 2011-2016 clueminer.org\n *\n * This program is free software: you can redistribute it and/or modify\n "
  },
  {
    "path": "src/main/java/org/clueminer/clustering/benchmark/consensus/ConsensusRun.java",
    "chars": 7309,
    "preview": "/*\n * Copyright (C) 2011-2016 clueminer.org\n *\n * This program is free software: you can redistribute it and/or modify\n "
  },
  {
    "path": "src/main/java/org/clueminer/clustering/benchmark/cutoff/CutoffComparison.java",
    "chars": 9656,
    "preview": "/*\n * Copyright (C) 2011-2016 clueminer.org\n *\n * This program is free software: you can redistribute it and/or modify\n "
  },
  {
    "path": "src/main/java/org/clueminer/clustering/benchmark/cutoff/CutoffExp.java",
    "chars": 3040,
    "preview": "/*\n * Copyright (C) 2011-2016 clueminer.org\n *\n * This program is free software: you can redistribute it and/or modify\n "
  },
  {
    "path": "src/main/java/org/clueminer/clustering/benchmark/cutoff/CutoffParams.java",
    "chars": 2784,
    "preview": "/*\n * Copyright (C) 2011-2016 clueminer.org\n *\n * This program is free software: you can redistribute it and/or modify\n "
  },
  {
    "path": "src/main/java/org/clueminer/clustering/benchmark/cutoff/FirstJumpOptimization.java",
    "chars": 5495,
    "preview": "/*\n * Copyright (C) 2011-2016 clueminer.org\n *\n * This program is free software: you can redistribute it and/or modify\n "
  },
  {
    "path": "src/main/java/org/clueminer/clustering/benchmark/evolve/EvolveExp.java",
    "chars": 5105,
    "preview": "/*\n * Copyright (C) 2011-2016 clueminer.org\n *\n * This program is free software: you can redistribute it and/or modify\n "
  },
  {
    "path": "src/main/java/org/clueminer/clustering/benchmark/evolve/EvolveParams.java",
    "chars": 1464,
    "preview": "/*\n * Copyright (C) 2011-2016 clueminer.org\n *\n * This program is free software: you can redistribute it and/or modify\n "
  },
  {
    "path": "src/main/java/org/clueminer/clustering/benchmark/exp/Data.java",
    "chars": 7396,
    "preview": "/*\n * Copyright (C) 2011-2016 clueminer.org\n *\n * This program is free software: you can redistribute it and/or modify\n "
  },
  {
    "path": "src/main/java/org/clueminer/clustering/benchmark/exp/EvolveScores.java",
    "chars": 2952,
    "preview": "/*\n * Copyright (C) 2011-2016 clueminer.org\n *\n * This program is free software: you can redistribute it and/or modify\n "
  },
  {
    "path": "src/main/java/org/clueminer/clustering/benchmark/exp/HclusPar.java",
    "chars": 2211,
    "preview": "/*\n * Copyright (C) 2011-2016 clueminer.org\n *\n * This program is free software: you can redistribute it and/or modify\n "
  },
  {
    "path": "src/main/java/org/clueminer/clustering/benchmark/exp/HclusPar2.java",
    "chars": 2242,
    "preview": "/*\n * Copyright (C) 2011-2016 clueminer.org\n *\n * This program is free software: you can redistribute it and/or modify\n "
  },
  {
    "path": "src/main/java/org/clueminer/clustering/benchmark/exp/Hclust.java",
    "chars": 3610,
    "preview": "/*\n * Copyright (C) 2011-2016 clueminer.org\n *\n * This program is free software: you can redistribute it and/or modify\n "
  },
  {
    "path": "src/main/java/org/clueminer/clustering/benchmark/gen/NsgaGen.java",
    "chars": 2900,
    "preview": "/*\n * Copyright (C) 2011-2016 clueminer.org\n *\n * This program is free software: you can redistribute it and/or modify\n "
  },
  {
    "path": "src/main/java/org/clueminer/clustering/benchmark/gen/NsgaGenExp.java",
    "chars": 5302,
    "preview": "/*\n * Copyright (C) 2011-2016 clueminer.org\n *\n * This program is free software: you can redistribute it and/or modify\n "
  },
  {
    "path": "src/main/java/org/clueminer/clustering/benchmark/gen/NsgaGenParams.java",
    "chars": 1721,
    "preview": "/*\n * Copyright (C) 2011-2016 clueminer.org\n *\n * This program is free software: you can redistribute it and/or modify\n "
  },
  {
    "path": "src/main/java/org/clueminer/clustering/benchmark/nsga/NsgaExp.java",
    "chars": 5632,
    "preview": "/*\n * Copyright (C) 2011-2016 clueminer.org\n *\n * This program is free software: you can redistribute it and/or modify\n "
  },
  {
    "path": "src/main/java/org/clueminer/clustering/benchmark/nsga/NsgaParams.java",
    "chars": 2046,
    "preview": "/*\n * Copyright (C) 2011-2016 clueminer.org\n *\n * This program is free software: you can redistribute it and/or modify\n "
  },
  {
    "path": "src/main/java/org/clueminer/clustering/benchmark/nsga/NsgaScore.java",
    "chars": 3087,
    "preview": "/*\n * Copyright (C) 2011-2016 clueminer.org\n *\n * This program is free software: you can redistribute it and/or modify\n "
  },
  {
    "path": "src/main/java/org/clueminer/clustering/benchmark/partition/PartitionBench.java",
    "chars": 5401,
    "preview": "/*\n * Copyright (C) 2011-2016 clueminer.org\n *\n * This program is free software: you can redistribute it and/or modify\n "
  },
  {
    "path": "src/main/java/org/clueminer/clustering/benchmark/partition/PartitionExp.java",
    "chars": 2273,
    "preview": "/*\n * Copyright (C) 2011-2016 clueminer.org\n *\n * This program is free software: you can redistribute it and/or modify\n "
  },
  {
    "path": "src/main/java/org/clueminer/clustering/benchmark/partition/PartitionParams.java",
    "chars": 1398,
    "preview": "/*\n * Copyright (C) 2011-2016 clueminer.org\n *\n * This program is free software: you can redistribute it and/or modify\n "
  },
  {
    "path": "src/main/java/org/clueminer/data/DataLoader.java",
    "chars": 7526,
    "preview": "/*\n * Copyright (C) 2011-2016 clueminer.org\n *\n * This program is free software: you can redistribute it and/or modify\n "
  },
  {
    "path": "src/main/java/org/clueminer/data/ResourceList.java",
    "chars": 8136,
    "preview": "/*\n * Copyright (C) 2011-2016 clueminer.org\n *\n * This program is free software: you can redistribute it and/or modify\n "
  },
  {
    "path": "src/main/nbm/manifest.mf",
    "chars": 109,
    "preview": "Manifest-Version: 1.0\nOpenIDE-Module-Localizing-Bundle: org/clueminer/clustering/benchmark/Bundle.properties\n"
  },
  {
    "path": "src/main/resources/datasets/artificial/2d-10c.arff",
    "chars": 53705,
    "preview": "@RELATION 2d-10c\n%\n@ATTRIBUTE a0 REAL\n@ATTRIBUTE a1 REAL\n@ATTRIBUTE CLASS {0,1,2,3,4,5,6,7,8,9}\n%\n@DATA\n1.00007,40.9378,"
  },
  {
    "path": "src/main/resources/datasets/artificial/2d-20c-no0.arff",
    "chars": 29670,
    "preview": "% Data from J. Handl\n% http://personalpages.manchester.ac.uk/mbs/Julia.Handl/generators.html\n%\n@RELATION 2d-20c-no0\n\n@AT"
  },
  {
    "path": "src/main/resources/datasets/artificial/2d-3c-no123.arff",
    "chars": 14863,
    "preview": "% Data from J. Handl\n% http://personalpages.manchester.ac.uk/mbs/Julia.Handl/generators.html\n%\n@RELATION 2d-3c-no123\n\n@A"
  },
  {
    "path": "src/main/resources/datasets/artificial/2d-4c-no4.arff",
    "chars": 16549,
    "preview": "% Data from J. Handl\n% http://personalpages.manchester.ac.uk/mbs/Julia.Handl/generators.html\n%\n@RELATION 2d-4c-no4\n\n@ATT"
  },
  {
    "path": "src/main/resources/datasets/artificial/2d-4c-no9.arff",
    "chars": 17117,
    "preview": "% Data from J. Handl\n% http://personalpages.manchester.ac.uk/mbs/Julia.Handl/generators.html\n%\n@RELATION 2d-4c-no9\n\n@ATT"
  },
  {
    "path": "src/main/resources/datasets/artificial/2d-4c.arff",
    "chars": 22688,
    "preview": "% Data from J. Handl\n% http://personalpages.manchester.ac.uk/mbs/Julia.Handl/generators.html\n%\n@RELATION 2d-4c.arff\n\n@AT"
  },
  {
    "path": "src/main/resources/datasets/artificial/2dnormals.arff",
    "chars": 28999,
    "preview": "% Data generated using R package 'mlbench'\n%\n% > library(\"mlbench\")\n% > set.seed(1234)\n% > data <- mlbench.2dnormals(100"
  },
  {
    "path": "src/main/resources/datasets/artificial/2sp2glob.arff",
    "chars": 39141,
    "preview": "% Dataset from: http://lasid.sor.ufscar.br/2sp2globBPCollection/\n%\n% Jane Piantoni, Katti Faceli, Tiemi C. Sakata, Julio"
  },
  {
    "path": "src/main/resources/datasets/artificial/3-spiral.arff",
    "chars": 4489,
    "preview": "% Spiral, N=312, M=3, D=2\r\n% Chang, H. and D.Y. Yeung, Robust path-based spectral clustering. Pattern Recognition, 2008."
  },
  {
    "path": "src/main/resources/datasets/artificial/3MC.arff",
    "chars": 6195,
    "preview": "% Dataset from: http://mail.tku.edu.tw/chchou/others/others.htm\n%\n% M. C. Su, C. H. Chou, and C. C. Hsieh, “Fuzzy C-Mean"
  },
  {
    "path": "src/main/resources/datasets/artificial/D31.arff",
    "chars": 59466,
    "preview": "% D31\r\n% N=3100, M=31, D=2\r\n% Veenman, C.J., M.J.T. Reinders, and E. Backer, A maximum variance cluster algorithm. IEEE "
  },
  {
    "path": "src/main/resources/datasets/artificial/DS-577.arff",
    "chars": 11930,
    "preview": "% Dataset from: http://mail.tku.edu.tw/chchou/others/others.htm\n%\n% M. C. Su, C. H. Chou, and C. C. Hsieh, “Fuzzy C-Mean"
  },
  {
    "path": "src/main/resources/datasets/artificial/DS-850.arff",
    "chars": 16110,
    "preview": "% Dataset from: http://mail.tku.edu.tw/chchou/others/others.htm\n%\n% M. C. Su, C. H. Chou, and C. C. Hsieh, “Fuzzy C-Mean"
  },
  {
    "path": "src/main/resources/datasets/artificial/R15.arff",
    "chars": 9866,
    "preview": "% R15, N=600, M=15, D=2\r\n% Veenman, C.J., M.J.T. Reinders, and E. Backer, A maximum variance cluster algorithm. IEEE Tra"
  },
  {
    "path": "src/main/resources/datasets/artificial/aggregation.arff",
    "chars": 10764,
    "preview": "% Aggregation\r\n% N=788, M=7, D=2\r\n% Gionis, A., H. Mannila, and P. Tsaparas, Clustering aggregation. ACM Transactions on"
  },
  {
    "path": "src/main/resources/datasets/artificial/aml28.arff",
    "chars": 14806,
    "preview": "% The variant allele frequencies of somatic single nucleotide variants\n% from whole genome sequencing of a AML patient a"
  },
  {
    "path": "src/main/resources/datasets/artificial/atom.arff",
    "chars": 25074,
    "preview": "% This Fundamental Clustering Problems Suite (FCPS) may serve as a minimal test for new invented cluster algorithms.\n%\n%"
  },
  {
    "path": "src/main/resources/datasets/artificial/banana.arff",
    "chars": 96324,
    "preview": "@relation banana1\n\n@attribute x numeric\n@attribute y numeric\n@attribute class {Class 1, Class 2}\n\n@data\n0.228,0.559,Clas"
  },
  {
    "path": "src/main/resources/datasets/artificial/birch-rg1.arff",
    "chars": 3391298,
    "preview": "@relation birch-rg1\n\n@attribute x numeric\n@attribute y numeric\n\n@data\n-0.00146450671296638,32.7798726126056\n-0.001614216"
  },
  {
    "path": "src/main/resources/datasets/artificial/birch-rg2.arff",
    "chars": 3431163,
    "preview": "@relation birch-rg2\n\n@attribute x numeric\n@attribute y numeric\n\n@data\n10.0595619587971,12.8904495132233\n10.1105631655715"
  },
  {
    "path": "src/main/resources/datasets/artificial/birch-rg3.arff",
    "chars": 3379730,
    "preview": "@relation birch-rg3\n\n@attribute x numeric\n@attribute y numeric\n\n@data\n-0.0728094069993936,96.3490812438782\n-0.1309206570"
  },
  {
    "path": "src/main/resources/datasets/artificial/blobs.arff",
    "chars": 13095,
    "preview": "% Data generated using Python package 'sklearn' (sklearn.datasets.samples_generator)\r\n%\r\n% make_blobs(n_samples=300, cen"
  },
  {
    "path": "src/main/resources/datasets/artificial/cassini.arff",
    "chars": 29019,
    "preview": "% Data generated using R package 'mlbench'\n%\n% > library(\"mlbench\")\n% > set.seed(1234)\n% > cassini <- mlbench.cassini(10"
  },
  {
    "path": "src/main/resources/datasets/artificial/chainlink.arff",
    "chars": 33832,
    "preview": "% This Fundamental Clustering Problems Suite (FCPS) may serve as a minimal test for new invented cluster algorithms.\n%\n%"
  },
  {
    "path": "src/main/resources/datasets/artificial/circle.arff",
    "chars": 29329,
    "preview": "% Data generated using R package 'mlbench'\n%\n% > library(\"mlbench\")\n% > set.seed(1234)\n% > data <- mlbench.circle(1000, "
  },
  {
    "path": "src/main/resources/datasets/artificial/cluto-t4-8k.arff",
    "chars": 187509,
    "preview": "% source: G. Karypis, \"CLUTO A Clustering Toolkit,\" Dept. of Computer Science, University of Minnesota,\n% Tech. Rep. 02-"
  },
  {
    "path": "src/main/resources/datasets/artificial/cluto-t5-8k.arff",
    "chars": 183984,
    "preview": "% source: G. Karypis, \"CLUTO A Clustering Toolkit,\" Dept. of Computer Science, University of Minnesota,\n% Tech. Rep. 02-"
  },
  {
    "path": "src/main/resources/datasets/artificial/cluto-t7-10k.arff",
    "chars": 235555,
    "preview": "% source: G. Karypis, \"CLUTO A Clustering Toolkit,\" Dept. of Computer Science, University of Minnesota,\n% Tech. Rep. 02-"
  },
  {
    "path": "src/main/resources/datasets/artificial/cluto-t8-8k.arff",
    "chars": 185684,
    "preview": "% source: G. Karypis, \"CLUTO A Clustering Toolkit,\" Dept. of Computer Science, University of Minnesota,\n% Tech. Rep. 02-"
  },
  {
    "path": "src/main/resources/datasets/artificial/complex8.arff",
    "chars": 47276,
    "preview": "% from http://www2.cs.uh.edu/~ml_kdd/Complex&Diamond/complex&diamond.htm\r\n% Salvador, S. and Chan, P.,Determining the Nu"
  },
  {
    "path": "src/main/resources/datasets/artificial/complex9.arff",
    "chars": 53485,
    "preview": "% from http://www2.cs.uh.edu/~ml_kdd/Complex&Diamond/complex&diamond.htm\n% Salvador, S. and Chan, P.,Determining the Num"
  },
  {
    "path": "src/main/resources/datasets/artificial/compound.arff",
    "chars": 5744,
    "preview": "% Compound\r\n% N=399, M=6, D=2\r\n% Zahn, C.T., Graph-theoretical methods for detecting and describing gestalt clusters. IE"
  },
  {
    "path": "src/main/resources/datasets/artificial/cuboids.arff",
    "chars": 41181,
    "preview": "% Data generated using R package 'mlbench'\n%\n% > library(\"mlbench\")\n% > set.seed(1234)\n% > data <- mlbench.cuboids(1000)"
  },
  {
    "path": "src/main/resources/datasets/artificial/cure-t0-2000n-2D.arff",
    "chars": 42385,
    "preview": "@RELATION cure-t0-2000n-2D\n\n@ATTRIBUTE x REAL\n@ATTRIBUTE y REAL\n@ATTRIBUTE class {0,1,2}\n\n@DATA\n-0.227326,0.771386,0\n0.2"
  },
  {
    "path": "src/main/resources/datasets/artificial/cure-t1-2000n-2D.arff",
    "chars": 42321,
    "preview": "@RELATION cure-t1-2000n-2D\n\n@ATTRIBUTE x REAL\n@ATTRIBUTE y REAL\n@ATTRIBUTE class {0,1,2,3,4,5}\n\n@DATA\n-0.845336,-0.53319"
  },
  {
    "path": "src/main/resources/datasets/artificial/cure-t2-4k.arff",
    "chars": 88386,
    "preview": "@RELATION cure-t2-4k\n\n@ATTRIBUTE x REAL\n@ATTRIBUTE y REAL\n@ATTRIBUTE class {0,1,2,3,4,5,noise}\n\n@DATA\n-0.590353,-0.56673"
  },
  {
    "path": "src/main/resources/datasets/artificial/curves1.arff",
    "chars": 20888,
    "preview": "@RELATION curves1.arff\n\n@ATTRIBUTE a0 REAL\n@ATTRIBUTE a1 REAL\n@ATTRIBUTE class {0,1}\n\n@DATA\n-0.2,0.7,0\n-0.2,0.700314,0\n-"
  },
  {
    "path": "src/main/resources/datasets/artificial/curves2.arff",
    "chars": 20892,
    "preview": "@RELATION curves2.arff\n\n@ATTRIBUTE a0 REAL\n@ATTRIBUTE a1 REAL\n@ATTRIBUTE class {0,1}\n\n@DATA\n-0.2,0.7,0\n-0.2,0.700314,0\n-"
  },
  {
    "path": "src/main/resources/datasets/artificial/dartboard1.arff",
    "chars": 20867,
    "preview": "@RELATION dartboard1.arff\n\n@ATTRIBUTE a0 REAL\n@ATTRIBUTE a1 REAL\n@ATTRIBUTE class {0,1,2,3}\n\n@DATA\n-0.1,0.5,0\n-0.100126,"
  },
  {
    "path": "src/main/resources/datasets/artificial/dartboard2.arff",
    "chars": 20833,
    "preview": "@RELATION dartboard2.arff\n\n@ATTRIBUTE a0 REAL\n@ATTRIBUTE a1 REAL\n@ATTRIBUTE class {0,1,2,3}\n\n@DATA\n-0.3,0.5,0\n-0.300063,"
  },
  {
    "path": "src/main/resources/datasets/artificial/dense-disk-3000.arff",
    "chars": 57546,
    "preview": "@RELATION dense-disk-3k\n%\n@ATTRIBUTE x REAL\n@ATTRIBUTE y REAL\n@ATTRIBUTE CLASS {0,1}\n%\n@DATA\n7.58859,1.35137,0\n-7.5016,0"
  },
  {
    "path": "src/main/resources/datasets/artificial/dense-disk-5000.arff",
    "chars": 95772,
    "preview": "@RELATION dense-disk-5k\n%\n@ATTRIBUTE x REAL\n@ATTRIBUTE y REAL\n@ATTRIBUTE CLASS {0,1}\n%\n@DATA\n4.41933,-4.36495,0\n-1.77464"
  },
  {
    "path": "src/main/resources/datasets/artificial/diamond9.arff",
    "chars": 60384,
    "preview": "% from http://www2.cs.uh.edu/~ml_kdd/Complex&Diamond/complex&diamond.htm\r\n% Salvador, S. and Chan, P.,Determining the Nu"
  },
  {
    "path": "src/main/resources/datasets/artificial/disk-1000n.arff",
    "chars": 19258,
    "preview": "% -b 0.3 -g 0.03\n@RELATION disk-1000n\n\n@ATTRIBUTE x REAL\n@ATTRIBUTE y REAL\n@ATTRIBUTE class {0,1}\n\n@DATA\n6.77101,0.29793"
  },
  {
    "path": "src/main/resources/datasets/artificial/disk-3000n.arff",
    "chars": 57397,
    "preview": "@RELATION disk-3000n\n\n@ATTRIBUTE x REAL\n@ATTRIBUTE y REAL\n@ATTRIBUTE class {0,1}\n\n@DATA\n-8.54736,4.57022,0\n0.088762,-8.1"
  },
  {
    "path": "src/main/resources/datasets/artificial/disk-4000n.arff",
    "chars": 76360,
    "preview": "@RELATION disk-4000n\n\n@ATTRIBUTE x REAL\n@ATTRIBUTE y REAL\n@ATTRIBUTE class {0,1}\n\n@DATA\n6.23269,-7.17151,0\n8.35739,2.604"
  },
  {
    "path": "src/main/resources/datasets/artificial/disk-4500n.arff",
    "chars": 85940,
    "preview": "% disk in disk: -b 0.3 -g 0.03\n@RELATION disk-4500n\n\n@ATTRIBUTE x REAL\n@ATTRIBUTE y REAL\n@ATTRIBUTE class {0,1}\n\n@DATA\n4"
  },
  {
    "path": "src/main/resources/datasets/artificial/disk-4600n.arff",
    "chars": 87508,
    "preview": "@RELATION disk-4600n\n\n@ATTRIBUTE x REAL\n@ATTRIBUTE y REAL\n@ATTRIBUTE class {0,1}\n\n@DATA\n2.30561,-9.61493,0\n-0.770364,-7."
  },
  {
    "path": "src/main/resources/datasets/artificial/disk-5000n.arff",
    "chars": 95195,
    "preview": "% disk-in-disk generator -b 0.2 -g 0.03\n@RELATION disk-5000n\n\n@ATTRIBUTE x REAL\n@ATTRIBUTE y REAL\n@ATTRIBUTE class {0,1}"
  },
  {
    "path": "src/main/resources/datasets/artificial/disk-6000n.arff",
    "chars": 114280,
    "preview": "@RELATION disk-6000n\n\n@ATTRIBUTE x REAL\n@ATTRIBUTE y REAL\n@ATTRIBUTE class {0,1}\n\n@DATA\n-7.95237,4.51553,0\n-7.47322,-2.4"
  },
  {
    "path": "src/main/resources/datasets/artificial/donut1.arff",
    "chars": 20902,
    "preview": "@RELATION donut1.arff\n\n@ATTRIBUTE a0 REAL\n@ATTRIBUTE a1 REAL\n@ATTRIBUTE class {0,1}\n\n@DATA\n-0.303573,0.72294,0\n-0.288011"
  },
  {
    "path": "src/main/resources/datasets/artificial/donut2.arff",
    "chars": 20866,
    "preview": "@RELATION donut2.arff\n\n@ATTRIBUTE a0 REAL\n@ATTRIBUTE a1 REAL\n@ATTRIBUTE class {0,1}\n\n@DATA\n-0.303573,0.72294,0\n-0.288011"
  },
  {
    "path": "src/main/resources/datasets/artificial/donut3.arff",
    "chars": 20808,
    "preview": "@RELATION donut3.arff\n\n@ATTRIBUTE a0 REAL\n@ATTRIBUTE a1 REAL\n@ATTRIBUTE class {0,1,2}\n\n@DATA\n-0.263573,0.72294,0\n-0.2480"
  },
  {
    "path": "src/main/resources/datasets/artificial/donutcurves.arff",
    "chars": 21126,
    "preview": "@RELATION donutcurves.arff\n\n@ATTRIBUTE a0 REAL\n@ATTRIBUTE a1 REAL\n@ATTRIBUTE class {0,1,2,3}\n\n@DATA\n0.0464265,0.72294,0\n"
  },
  {
    "path": "src/main/resources/datasets/artificial/dpb.arff",
    "chars": 109719,
    "preview": "% Dataset used in Figure 2B\n%\n% A Rodriguez, A Laio, Clustering by fast search and find of density peaks, SCIENCE, 1492,"
  },
  {
    "path": "src/main/resources/datasets/artificial/dpc.arff",
    "chars": 27551,
    "preview": "% Dataset used in Figure 2C\n%\n% A Rodriguez, A Laio, Clustering by fast search and find of density peaks, SCIENCE, 1492,"
  },
  {
    "path": "src/main/resources/datasets/artificial/ds2c2sc13.arff",
    "chars": 10996,
    "preview": "@RELATION ds2c2sc13\n%\n% Katti Faceli, Tiemi C. Sakata, Marcilio C. P. de Souto, and Andr\\&\\#233; C. P. L. F. de Carvalho"
  },
  {
    "path": "src/main/resources/datasets/artificial/ds3c3sc6.arff",
    "chars": 16495,
    "preview": "@RELATION ds3c3sc6\n%\n% Katti Faceli, Tiemi C. Sakata, Marcilio C. P. de Souto, and Andr\\&\\#233; C. P. L. F. de Carvalho."
  },
  {
    "path": "src/main/resources/datasets/artificial/ds4c2sc8.arff",
    "chars": 8953,
    "preview": "@RELATION ds4c2sc8\n%\n% Katti Faceli, Tiemi C. Sakata, Marcilio C. P. de Souto, and Andr\\&\\#233; C. P. L. F. de Carvalho."
  },
  {
    "path": "src/main/resources/datasets/artificial/elliptical_10_2.arff",
    "chars": 7390,
    "preview": "% Data_10_2 or AD_10_2: This data is in 2-d space, and has 10 clusters.  The total number of  points is 500.\n%\n% If you "
  },
  {
    "path": "src/main/resources/datasets/artificial/elly-2d10c13s.arff",
    "chars": 58223,
    "preview": "% Data from J. Handl\n% http://personalpages.manchester.ac.uk/mbs/Julia.Handl/generators.html\n% source code: https://gith"
  },
  {
    "path": "src/main/resources/datasets/artificial/engytime.arff",
    "chars": 83065,
    "preview": "% This Fundamental Clustering Problems Suite (FCPS) may serve as a minimal test for new invented cluster algorithms.\n%\n%"
  },
  {
    "path": "src/main/resources/datasets/artificial/flame.arff",
    "chars": 3391,
    "preview": "% Flame, N=240, M=2, D=2\r\n% Fu, L. and E. Medico, FLAME, a novel fuzzy clustering method for the analysis of DNA microar"
  },
  {
    "path": "src/main/resources/datasets/artificial/fourty.arff",
    "chars": 19266,
    "preview": "% source Handl, Knowles: MOCK\n@RELATION fourty\n\n@ATTRIBUTE x REAL\n@ATTRIBUTE y REAL\n@ATTRIBUTE class {0,1,2,3,4,5,6,7,8,"
  },
  {
    "path": "src/main/resources/datasets/artificial/gaussians1.arff",
    "chars": 3026,
    "preview": "@relation gaussians1\n% %% generated two Gaussian clouds (R lang)\n% cl1 <- cbind(rnorm(100,0.2,0.05),rnorm(100,0.8,0.06))"
  },
  {
    "path": "src/main/resources/datasets/artificial/golfball.arff",
    "chars": 119364,
    "preview": "% This Fundamental Clustering Problems Suite (FCPS) may serve as a minimal test for new invented cluster algorithms.\n%\n%"
  },
  {
    "path": "src/main/resources/datasets/artificial/hepta.arff",
    "chars": 6998,
    "preview": "% This Fundamental Clustering Problems Suite (FCPS) may serve as a minimal test for new invented cluster algorithms.\n%\n%"
  },
  {
    "path": "src/main/resources/datasets/artificial/hypercube.arff",
    "chars": 33605,
    "preview": "% Data generated using R package 'mlbench'\n%\n% > library(\"mlbench\") \n% > set.seed(1234)\n% > data <- mlbench.hypercube(10"
  },
  {
    "path": "src/main/resources/datasets/artificial/impossible.arff",
    "chars": 70586,
    "preview": "@RELATION impossible\n%\n% A dataset inspired by Jain, Anil K. \"Data clustering: 50 years beyond K-means.\"\n% Pattern recog"
  },
  {
    "path": "src/main/resources/datasets/artificial/insect.arff",
    "chars": 760,
    "preview": "% source: http://people.sc.fsu.edu/~jburkardt/datasets/martinez/martinez.html\n% The INSECT data set contains 10 measurem"
  },
  {
    "path": "src/main/resources/datasets/artificial/jain.arff",
    "chars": 5167,
    "preview": "% Jain\r\n% N=373, M=2, D=2\r\n% Jain, A. and M. Law, Data clustering: A user's dilemma. Lecture Notes in Computer Science, "
  },
  {
    "path": "src/main/resources/datasets/artificial/long1.arff",
    "chars": 20626,
    "preview": "@RELATION long1\n\n% Handl, Julia, and Joshua Knowles. \"Multiobjective clustering with automatic determination of the numb"
  },
  {
    "path": "src/main/resources/datasets/artificial/long2.arff",
    "chars": 20786,
    "preview": "@RELATION long2\n\n% Handl, Julia, and Joshua Knowles. \"Multiobjective clustering with automatic determination of the numb"
  },
  {
    "path": "src/main/resources/datasets/artificial/long3.arff",
    "chars": 21168,
    "preview": "@RELATION long3\n\n% Handl, Julia, and Joshua Knowles. \"Multiobjective clustering with automatic determination of the numb"
  },
  {
    "path": "src/main/resources/datasets/artificial/longsquare.arff",
    "chars": 16903,
    "preview": "% Handl, Julia, and Joshua Knowles. \"Multiobjective clustering with automatic determination of the number of clusters.\"\n"
  },
  {
    "path": "src/main/resources/datasets/artificial/lsun.arff",
    "chars": 8481,
    "preview": "% This Fundamental Clustering Problems Suite (FCPS) may serve as a minimal test for new invented cluster algorithms.\n%\n%"
  },
  {
    "path": "src/main/resources/datasets/artificial/mopsi-finland.arff",
    "chars": 202138,
    "preview": "% source: http://sse.tongji.edu.cn/zhaoqinpei/Software/\r\n\r\n@relation mopsi-finland\r\n\r\n@ATTRIBUTE x REAL\r\n@ATTRIBUTE y RE"
  },
  {
    "path": "src/main/resources/datasets/artificial/mopsi-joensuu.arff",
    "chars": 96523,
    "preview": "% source: http://sse.tongji.edu.cn/zhaoqinpei/Software/\r\n\r\n@relation mopsi-joensuu\r\n\r\n@ATTRIBUTE x REAL\r\n@ATTRIBUTE y RE"
  },
  {
    "path": "src/main/resources/datasets/artificial/pathbased.arff",
    "chars": 4075,
    "preview": "% Pathbased dataset, N=300, M=3, D=2\n%\n% Chang, H. and D.Y. Yeung, Robust path-based spectral clustering. Pattern Recogn"
  },
  {
    "path": "src/main/resources/datasets/artificial/pmf.arff",
    "chars": 11922,
    "preview": "@RELATION PMF\n\n@ATTRIBUTE PMF-VAF REAL\n@ATTRIBUTE AML-VAF REAL\n@ATTRIBUTE Relapse-PMF-VAF REAL\n@ATTRIBUTE class {0,1,2,3"
  },
  {
    "path": "src/main/resources/datasets/artificial/rings.arff",
    "chars": 28987,
    "preview": "% Three rings generated by RapidMiner\n\n@RELATION Rings\n\n@ATTRIBUTE x REAL\n@ATTRIBUTE y REAL\n@ATTRIBUTE class {0,1,2}\n\n@D"
  },
  {
    "path": "src/main/resources/datasets/artificial/s-set1.arff",
    "chars": 102051,
    "preview": "@RELATION s-set1\n%\n@ATTRIBUTE x REAL\n@ATTRIBUTE y REAL\n@ATTRIBUTE CLASS {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15}\n%\n@DATA\n"
  },
  {
    "path": "src/main/resources/datasets/artificial/s-set2.arff",
    "chars": 102099,
    "preview": "% source: http://cs.joensuu.fi/sipu/datasets/\n%\n% Synthetic 2-d data with N=5000 vectors and M=15 Gaussian clusters with"
  },
  {
    "path": "src/main/resources/datasets/artificial/s-set3.arff",
    "chars": 70327,
    "preview": "% source: http://cs.joensuu.fi/sipu/datasets/\n%\n% Synthetic 2-d data with N=5000 vectors and M=15 Gaussian clusters with"
  },
  {
    "path": "src/main/resources/datasets/artificial/s-set4.arff",
    "chars": 70350,
    "preview": "% source: http://cs.joensuu.fi/sipu/datasets/\n%\n% Synthetic 2-d data with N=5000 vectors and M=15 Gaussian clusters with"
  },
  {
    "path": "src/main/resources/datasets/artificial/shapes.arff",
    "chars": 28983,
    "preview": "% Data generated using R package 'mlbench'\n%\n% > library(\"mlbench\") \n% > set.seed(1234)\n% > data <- mlbench.shapes(1000)"
  },
  {
    "path": "src/main/resources/datasets/artificial/simplex.arff",
    "chars": 21758,
    "preview": "% Data generated using R package 'mlbench'\n%\n% > library(\"mlbench\")\n% > set.seed(1234)\n% > data <- mlbench.simplex(n = 1"
  },
  {
    "path": "src/main/resources/datasets/artificial/sizes1.arff",
    "chars": 18783,
    "preview": "@RELATION sizes1.arff\n\n% Handl, Julia, and Joshua Knowles. \"Multiobjective clustering with automatic determination of th"
  },
  {
    "path": "src/main/resources/datasets/artificial/sizes2.arff",
    "chars": 18567,
    "preview": "@RELATION sizes2.arff\n\n% Handl, Julia, and Joshua Knowles. \"Multiobjective clustering with automatic determination of th"
  },
  {
    "path": "src/main/resources/datasets/artificial/sizes3.arff",
    "chars": 18467,
    "preview": "@RELATION sizes3.arff\n\n% Handl, Julia, and Joshua Knowles. \"Multiobjective clustering with automatic determination of th"
  },
  {
    "path": "src/main/resources/datasets/artificial/sizes4.arff",
    "chars": 18391,
    "preview": "@RELATION sizes4.arff\n\n% Handl, Julia, and Joshua Knowles. \"Multiobjective clustering with automatic determination of th"
  },
  {
    "path": "src/main/resources/datasets/artificial/sizes5.arff",
    "chars": 18343,
    "preview": "@RELATION sizes5.arff\n\n% Handl, Julia, and Joshua Knowles. \"Multiobjective clustering with automatic determination of th"
  },
  {
    "path": "src/main/resources/datasets/artificial/smile1.arff",
    "chars": 21178,
    "preview": "@RELATION smile1\n\n% Handl, Julia, and Joshua Knowles. \"Multiobjective clustering with automatic determination of the num"
  },
  {
    "path": "src/main/resources/datasets/artificial/smile2.arff",
    "chars": 21159,
    "preview": "@RELATION smile2\n\n% Handl, Julia, and Joshua Knowles. \"Multiobjective clustering with automatic determination of the num"
  },
  {
    "path": "src/main/resources/datasets/artificial/smile3.arff",
    "chars": 21317,
    "preview": "@RELATION smile3\n\n% Handl, Julia, and Joshua Knowles. \"Multiobjective clustering with automatic determination of the num"
  },
  {
    "path": "src/main/resources/datasets/artificial/spherical_4_3.arff",
    "chars": 8279,
    "preview": "% Data_4_3 or AD_4_3: This data is in 3-d space, and has 4 clusters.\n% The total number of  points is 400.\n%\n%  If you u"
  },
  {
    "path": "src/main/resources/datasets/artificial/spherical_5_2.arff",
    "chars": 4123,
    "preview": "% Data_5_2 or AD_5_2: This data is in 2-d space, and has 5 clusters.\n% The total number of  points is 250.\n%\n% If you us"
  },
  {
    "path": "src/main/resources/datasets/artificial/spherical_6_2.arff",
    "chars": 6798,
    "preview": "% Data_6_2: This data is in 2-d space, and has 6clusters.  The total number of  points is 300.\n%\n% If you use this data,"
  },
  {
    "path": "src/main/resources/datasets/artificial/spiral.arff",
    "chars": 19488,
    "preview": "@RELATION spiral.arff\n\n% Handl, Julia, and Joshua Knowles. \"Multiobjective clustering with automatic determination of th"
  },
  {
    "path": "src/main/resources/datasets/artificial/spiralsquare.arff",
    "chars": 28230,
    "preview": "@RELATION spiralsquare\n%\n@ATTRIBUTE a0 REAL\n@ATTRIBUTE a1 REAL\n@ATTRIBUTE CLASS {0,1,2,3,4,5}\n%\n@DATA\n15.464,19.4409,0\n1"
  },
  {
    "path": "src/main/resources/datasets/artificial/square1.arff",
    "chars": 18799,
    "preview": "@RELATION square1\n\n@ATTRIBUTE a0 REAL\n@ATTRIBUTE a1 REAL\n@ATTRIBUTE class {0,1,2,3}\n\n@DATA\n9.28531,14.5879,0\n12.3977,8.4"
  },
  {
    "path": "src/main/resources/datasets/artificial/square2.arff",
    "chars": 18790,
    "preview": "@RELATION square2\n\n@ATTRIBUTE a0 REAL\n@ATTRIBUTE a1 REAL\n@ATTRIBUTE class {0,1,2,3}\n\n@DATA\n8.28531,13.5879,0\n11.3977,7.4"
  },
  {
    "path": "src/main/resources/datasets/artificial/square3.arff",
    "chars": 18788,
    "preview": "@RELATION square3\n\n@ATTRIBUTE a0 REAL\n@ATTRIBUTE a1 REAL\n@ATTRIBUTE class {0,1,2,3}\n\n@DATA\n7.28531,12.5879,0\n10.3977,6.4"
  },
  {
    "path": "src/main/resources/datasets/artificial/square4.arff",
    "chars": 18792,
    "preview": "@RELATION square4\n\n@ATTRIBUTE a0 REAL\n@ATTRIBUTE a1 REAL\n@ATTRIBUTE class {0,1,2,3}\n\n@DATA\n6.28531,11.5879,0\n9.39775,5.4"
  },
  {
    "path": "src/main/resources/datasets/artificial/square5.arff",
    "chars": 18807,
    "preview": "@RELATION square5\n\n@ATTRIBUTE a0 REAL\n@ATTRIBUTE a1 REAL\n@ATTRIBUTE class {0,1,2,3}\n\n@DATA\n5.28531,10.5879,0\n8.39775,4.4"
  },
  {
    "path": "src/main/resources/datasets/artificial/st900.arff",
    "chars": 19636,
    "preview": "% Also sometimes referred to as st900_2_9. This data is in 2-d space, and has 9 clusters.\n% The total number of  points "
  },
  {
    "path": "src/main/resources/datasets/artificial/target.arff",
    "chars": 16483,
    "preview": "% This Fundamental Clustering Problems Suite (FCPS) may serve as a minimal test for new invented cluster algorithms.\n%\n%"
  },
  {
    "path": "src/main/resources/datasets/artificial/tetra.arff",
    "chars": 12727,
    "preview": "% This Fundamental Clustering Problems Suite (FCPS) may serve as a minimal test for new invented cluster algorithms.\n%\n%"
  },
  {
    "path": "src/main/resources/datasets/artificial/threenorm.arff",
    "chars": 29363,
    "preview": "% Data generated using R package 'mlbench'\n%\n% > library(\"mlbench\")\n% > set.seed(1234)\n% > data <- mlbench.threenorm(n=1"
  },
  {
    "path": "src/main/resources/datasets/artificial/triangle1.arff",
    "chars": 18956,
    "preview": "@RELATION triangle1\n\n@ATTRIBUTE a0 REAL\n@ATTRIBUTE a1 REAL\n@ATTRIBUTE class {0,1,2,3}\n\n@DATA\n10.6427,13.294,0\n12.1989,10"
  },
  {
    "path": "src/main/resources/datasets/artificial/triangle2.arff",
    "chars": 19198,
    "preview": "@RELATION triangle2\n\n@ATTRIBUTE a0 REAL\n@ATTRIBUTE a1 REAL\n@ATTRIBUTE class {0,1,2,3}\n\n@DATA\n10.2853,15.5879,0\n13.3977,9"
  },
  {
    "path": "src/main/resources/datasets/artificial/twenty.arff",
    "chars": 19153,
    "preview": "% source Handl, Knowles: MOCK\n@RELATION fourty\n\n@ATTRIBUTE x REAL\n@ATTRIBUTE y REAL\n@ATTRIBUTE class {0,1,2,3,4,5,6,7,8,"
  },
  {
    "path": "src/main/resources/datasets/artificial/twodiamonds.arff",
    "chars": 14531,
    "preview": "% This Fundamental Clustering Problems Suite (FCPS) may serve as a minimal test for new invented cluster algorithms.\n%\n%"
  },
  {
    "path": "src/main/resources/datasets/artificial/wingnut.arff",
    "chars": 19814,
    "preview": "% This Fundamental Clustering Problems Suite (FCPS) may serve as a minimal test for new invented cluster algorithms.\n%\n%"
  },
  {
    "path": "src/main/resources/datasets/artificial/xclara.arff",
    "chars": 60648,
    "preview": "% source http://vincentarelbundock.github.io/Rdatasets/datasets.html\n@RELATION xclara\n%\n@ATTRIBUTE x REAL\n@ATTRIBUTE y R"
  },
  {
    "path": "src/main/resources/datasets/artificial/xor.arff",
    "chars": 42612,
    "preview": "% Data generated using R package 'mlbench'\n%\n% > library(\"mlbench\")\n% > set.seed(1234)\n% > data <- mlbench.xor(1000,3)\n%"
  },
  {
    "path": "src/main/resources/datasets/artificial/zelnik1.arff",
    "chars": 12201,
    "preview": "% Toy dataset from http://www.vision.caltech.edu/lihi/Demos/SelfTuningClustering.html\n%\n% Zelnik-Manor, Lihi, and Pietro"
  },
  {
    "path": "src/main/resources/datasets/artificial/zelnik2.arff",
    "chars": 12404,
    "preview": "% Toy dataset from http://www.vision.caltech.edu/lihi/Demos/SelfTuningClustering.html\n%\n% Zelnik-Manor, Lihi, and Pietro"
  },
  {
    "path": "src/main/resources/datasets/artificial/zelnik3.arff",
    "chars": 10907,
    "preview": "% Toy dataset from http://www.vision.caltech.edu/lihi/Demos/SelfTuningClustering.html\n%\n% Zelnik-Manor, Lihi, and Pietro"
  },
  {
    "path": "src/main/resources/datasets/artificial/zelnik4.arff",
    "chars": 25637,
    "preview": "% Toy dataset from http://www.vision.caltech.edu/lihi/Demos/SelfTuningClustering.html\n%\n% Zelnik-Manor, Lihi, and Pietro"
  },
  {
    "path": "src/main/resources/datasets/artificial/zelnik5.arff",
    "chars": 20681,
    "preview": "% Toy dataset from http://www.vision.caltech.edu/lihi/Demos/SelfTuningClustering.html\n%\n% Zelnik-Manor, Lihi, and Pietro"
  },
  {
    "path": "src/main/resources/datasets/artificial/zelnik6.arff",
    "chars": 9772,
    "preview": "% Toy dataset from http://www.vision.caltech.edu/lihi/Demos/SelfTuningClustering.html\n%\n% Zelnik-Manor, Lihi, and Pietro"
  },
  {
    "path": "src/main/resources/datasets/real-world/arrhythmia.arff",
    "chars": 368581,
    "preview": "%  1. Title: Cardiac Arrhythmia Database\n%\n%  2. Sources:\n%     (a) Original owners od Database:\n%         -- 1. H. Alta"
  },
  {
    "path": "src/main/resources/datasets/real-world/balance-scale.arff",
    "chars": 8707,
    "preview": "%1. Title: Balance Scale Weight & Distance Database\n%\n%2. Source Information:\n%    (a) Source: Generated to model psycho"
  },
  {
    "path": "src/main/resources/datasets/real-world/cpu.arff",
    "chars": 5559,
    "preview": "%\n% As used by Kilpatrick, D. & Cameron-Jones, M. (1998). Numeric prediction\n% using instance-based learning with encodi"
  },
  {
    "path": "src/main/resources/datasets/real-world/dermatology.arff",
    "chars": 32417,
    "preview": "% 1. Title: Dermatology Database\n% \n% 2. Source Information:\n%    (a) Original owners:\n%        -- 1. Nilsel Ilter, M.D."
  },
  {
    "path": "src/main/resources/datasets/real-world/ecoli.arff",
    "chars": 16158,
    "preview": "% 1. Title: Protein Localization Sites\n%\n%\n% 2. Creator and Maintainer:\n%        Kenta Nakai\n%              Institue of "
  },
  {
    "path": "src/main/resources/datasets/real-world/german.arff",
    "chars": 80805,
    "preview": "@relation german\n\n@attribute Status_of_existing_checking_account {A12,A14,A11,A13}\n@attribute Duration_in_month numeric\n"
  },
  {
    "path": "src/main/resources/datasets/real-world/glass.arff",
    "chars": 18191,
    "preview": "% source: https://archive.ics.uci.edu/ml/datasets/Glass+Identification\n%\n% 1. Title: Glass Identification Database\n%\n% 2"
  },
  {
    "path": "src/main/resources/datasets/real-world/haberman.arff",
    "chars": 4863,
    "preview": "% 1. Title: Haberman's Survival Data\n% \n% 2. Sources:\n%    (a) Donor:   Tjen-Sien Lim (limt@stat.wisc.edu)\n%    (b) Date"
  },
  {
    "path": "src/main/resources/datasets/real-world/heart-statlog.arff",
    "chars": 12950,
    "preview": "% source: https://archive.ics.uci.edu/ml/datasets/Statlog+(Heart)\n%\n% This database contains 13 attributes (which have b"
  },
  {
    "path": "src/main/resources/datasets/real-world/iono.arff",
    "chars": 87168,
    "preview": "%\n% GALE automatic stratified fold generation for CV\n%\n% Generated at        : Mon Nov 27 15:46:14 GMT+00:00 2000\n% Rand"
  },
  {
    "path": "src/main/resources/datasets/real-world/iris.arff",
    "chars": 4987,
    "preview": "%\n% GALE automatic stratified fold generation for CV\n%\n% Generated at        : Mon Nov 27 15:50:01 GMT+00:00 2000\n% Rand"
  },
  {
    "path": "src/main/resources/datasets/real-world/letter.arff",
    "chars": 719873,
    "preview": "% 1. TITLE:\n% \tLetter Image Recognition Data\n%\n%    The objective is to identify each of a large number of black-and-whi"
  },
  {
    "path": "src/main/resources/datasets/real-world/segment.arff",
    "chars": 306014,
    "preview": "% 1. Title: Image Segmentation data\n% \n% 2. Source Information\n%    -- Creators: Vision Group, University of Massachuset"
  },
  {
    "path": "src/main/resources/datasets/real-world/sonar.arff",
    "chars": 94706,
    "preview": "% NAME: Sonar, Mines vs. Rocks\n% \n% SUMMARY: This is the data set used by Gorman and Sejnowski in their study\n% of the c"
  },
  {
    "path": "src/main/resources/datasets/real-world/tae.arff",
    "chars": 4110,
    "preview": "% 1. Title: Teaching Assistant Evaluation\n% \n% 2. Sources:\n%    (a) Collector: Wei-Yin Loh (Department of Statistics, UW"
  },
  {
    "path": "src/main/resources/datasets/real-world/thy.arff",
    "chars": 4992,
    "preview": "@relation new-thyroid\n@attribute a1 real\n@attribute a2 real\n@attribute a3 real\n@attribute a4 real\n@attribute a5 real\n@at"
  },
  {
    "path": "src/main/resources/datasets/real-world/vehicle.arff",
    "chars": 63838,
    "preview": "% !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!IMPORTANT!!!!!!!!!!!!!!!!!!!!!!!!!!!!\n% \n%         This dataset comes from the Tur"
  },
  {
    "path": "src/main/resources/datasets/real-world/vowel.arff",
    "chars": 91401,
    "preview": "% \n%                 Introduction\n%                 ============\n% \n% In my work on context-sensitive learning, I used t"
  },
  {
    "path": "src/main/resources/datasets/real-world/water-treatment.arff",
    "chars": 130125,
    "preview": "% 1. Title: Faults in a urban waste water treatment plant\r\n% \r\n% 2. Source Information:\r\n%    -- Creators: Manel Poch (i"
  },
  {
    "path": "src/main/resources/datasets/real-world/wdbc.arff",
    "chars": 131185,
    "preview": "% 1. Title: Wisconsin Diagnostic Breast Cancer (WDBC)\r\n%\r\n% 2. Source Information\r\n%\r\n% a) Creators:\r\n%\r\n%   Dr. William"
  },
  {
    "path": "src/main/resources/datasets/real-world/wine.arff",
    "chars": 14528,
    "preview": "% 1. Title of Database: Wine recognition data\n% \tUpdated Sept 21, 1998 by C.Blake : Added attribute information\n% \n% 2. "
  },
  {
    "path": "src/main/resources/datasets/real-world/wisc.arff",
    "chars": 45347,
    "preview": "%\r\n% File generated by BRAIN 2.00\r\n% \r\n% Generated at:\t\tFri Nov 04 13:17:17 2005\r\n\r\n% File name:\t\t./../wisconsin-breast-"
  },
  {
    "path": "src/main/resources/datasets/real-world/yeast.arff",
    "chars": 95251,
    "preview": "@RELATION 'Yeast'\n\n@ATTRIBUTE SequenceName string\n@ATTRIBUTE mcg real\n@ATTRIBUTE gvh real\n@ATTRIBUTE alm real\n@ATTRIBUTE"
  },
  {
    "path": "src/main/resources/datasets/real-world/zoo.arff",
    "chars": 5644,
    "preview": "@relation zoo\n\n@attribute HAIR integer [0, 1]\n@attribute FEATHERS integer [0, 1]\n@attribute EGGS integer [0, 1]\n@attribu"
  },
  {
    "path": "src/main/resources/log4j2.properties",
    "chars": 285,
    "preview": "status = error\ndest = err\nname = PropertiesConfig\n\nappender.console.type = Console\nappender.console.name = STDOUT\nappend"
  },
  {
    "path": "src/main/resources/org/clueminer/clustering/benchmark/Bundle.properties",
    "chars": 221,
    "preview": "# Localized module labels. Defaults taken from POM (<name>, <description>, <groupId>) if unset.\n#OpenIDE-Module-Name=\n#O"
  },
  {
    "path": "src/test/java/org/clueminer/clustering/benchmark/ExperimentTest.java",
    "chars": 1694,
    "preview": "/*\n * Copyright (C) 2011-2016 clueminer.org\n *\n * This program is free software: you can redistribute it and/or modify\n "
  },
  {
    "path": "src/test/java/org/clueminer/clustering/benchmark/HclustBenchmarkTest.java",
    "chars": 4961,
    "preview": "/*\n * Copyright (C) 2011-2016 clueminer.org\n *\n * This program is free software: you can redistribute it and/or modify\n "
  },
  {
    "path": "src/test/java/org/clueminer/clustering/benchmark/chameleon2/Cham2BenchTest.java",
    "chars": 1508,
    "preview": "/*\n * Copyright (C) 2011-2016 clueminer.org\n *\n * This program is free software: you can redistribute it and/or modify\n "
  },
  {
    "path": "src/test/resources/log4j2.properties",
    "chars": 285,
    "preview": "status = error\ndest = err\nname = PropertiesConfig\n\nappender.console.type = Console\nappender.console.name = STDOUT\nappend"
  },
  {
    "path": "updreadme.rb",
    "chars": 570,
    "preview": "#!/bin/ruby\n\ndatasets=[]\nDir[\"src/main/resources/datasets/artificial/*.arff\"].each do |f|\n  datasets << File.basename(f)"
  }
]

About this extraction

This page contains the full source code of the deric/clustering-benchmark GitHub repository, extracted and formatted as plain text. The extraction includes 197 files (16.6 MB), approximately 4.4M tokens, and a symbol index with 172 extracted functions, classes, methods, constants, and types.
