Repository: Confusezius/Revisiting_Deep_Metric_Learning_PyTorch Branch: master Commit: efddbf23ccbe Files: 73 Total size: 325.4 KB Directory structure: gitextract_la23kpcg/ ├── .gitignore ├── LICENSE ├── README.md ├── Result_Evaluations.py ├── Sample_Runs/ │ └── ICML2020_RevisitDML_SampleRuns.sh ├── architectures/ │ ├── __init__.py │ ├── bninception.py │ ├── googlenet.py │ └── resnet50.py ├── batchminer/ │ ├── __init__.py │ ├── distance.py │ ├── intra_random.py │ ├── lifted.py │ ├── npair.py │ ├── parametric.py │ ├── random.py │ ├── random_distance.py │ ├── rho_distance.py │ ├── semihard.py │ └── softhard.py ├── criteria/ │ ├── __init__.py │ ├── adversarial_separation.py │ ├── angular.py │ ├── arcface.py │ ├── contrastive.py │ ├── histogram.py │ ├── lifted.py │ ├── margin.py │ ├── multisimilarity.py │ ├── npair.py │ ├── proxynca.py │ ├── quadruplet.py │ ├── snr.py │ ├── softmax.py │ ├── softtriplet.py │ └── triplet.py ├── datasampler/ │ ├── __init__.py │ ├── class_random_sampler.py │ ├── d2_coreset_sampler.py │ ├── disthist_batchmatch_sampler.py │ ├── fid_batchmatch_sampler.py │ ├── greedy_coreset_sampler.py │ ├── random_sampler.py │ └── samplers.py ├── datasets/ │ ├── __init__.py │ ├── basic_dataset_scaffold.py │ ├── cars196.py │ ├── cub200.py │ └── stanford_online_products.py ├── evaluation/ │ └── __init__.py ├── main.py ├── metrics/ │ ├── __init__.py │ ├── c_f1.py │ ├── c_mAP_1000.py │ ├── c_mAP_c.py │ ├── c_mAP_lim.py │ ├── c_nmi.py │ ├── c_recall.py │ ├── compute_stack.py │ ├── dists.py │ ├── e_recall.py │ ├── f1.py │ ├── mAP.py │ ├── mAP_1000.py │ ├── mAP_c.py │ ├── mAP_lim.py │ ├── nmi.py │ └── rho_spectrum.py ├── parameters.py ├── toy_experiments/ │ └── toy_example_diagonal_lines.py └── utilities/ ├── __init__.py ├── logger.py └── misc.py ================================================ FILE CONTENTS ================================================ ================================================ FILE: .gitignore 
================================================ __pycache__ *.pyc Training_Results wandb diva_main.py ================================================ FILE: LICENSE ================================================ MIT License Copyright (c) 2020 Karsten Roth Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ================================================ FILE: README.md ================================================ # Deep Metric Learning Research in PyTorch --- ## What can I find here? This repository contains all code and implementations used in: ``` Revisiting Training Strategies and Generalization Performance in Deep Metric Learning ``` accepted to **ICML 2020**. **Link**: https://arxiv.org/abs/2002.08473 The code is meant to serve as a research starting point in Deep Metric Learning. 
By implementing key baselines under a consistent setting and logging a vast set of metrics, it should be easier to verify that method gains are genuine and not due to implementation differences, and to better understand the factors that actually drive performance.

It is set up in a modular way to allow for fast and detailed prototyping, but with key elements written in a way that allows the code to be directly copied into other pipelines. In addition, multiple training and test metrics are logged in W&B to allow for easy and large-scale evaluation.

Finally, please find a public W&B repo with key runs performed in the paper here: https://app.wandb.ai/confusezius/RevisitDML.

**Contact**: Karsten Roth, karsten.rh1@gmail.com

*Suggestions are always welcome!*

---

## Some Notes:

If you use this code in your research, please cite
```
@misc{roth2020revisiting,
    title={Revisiting Training Strategies and Generalization Performance in Deep Metric Learning},
    author={Karsten Roth and Timo Milbich and Samarth Sinha and Prateek Gupta and Björn Ommer and Joseph Paul Cohen},
    year={2020},
    eprint={2002.08473},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}
```

This repository contains (in parts) code that has been adapted from:

* https://github.com/idstcv/SoftTriple
* https://github.com/bnu-wangxun/Deep_Metric
* https://github.com/valerystrizh/pytorch-histogram-loss
* https://github.com/Confusezius/Deep-Metric-Learning-Baselines

Make sure to also check out the following repo with a great plug-and-play implementation of DML methods:

* https://github.com/KevinMusgrave/pytorch-metric-learning

---

**[All implemented methods and metrics are listed at the bottom!](#-implemented-methods)**

---

## Paper-related Information

#### Reproduce results from our paper **[Revisiting Training Strategies and Generalization Performance in Deep Metric Learning](https://arxiv.org/pdf/2002.08473.pdf)**

* *ALL* standardized runs that were used are available in `Sample_Runs/ICML2020_RevisitDML_SampleRuns.sh`.
* These runs are also logged in this public W&B repo: https://app.wandb.ai/confusezius/RevisitDML.
* All runs and their respective metrics can be downloaded and evaluated to generate the plots in our paper by following `Result_Evaluations.py`. This also allows for introspection of other relations, and converts results directly into LaTeX table format with means and standard deviations.
* To utilize different batch-creation methods, simply set the flag `--data_sampler` to the method of choice. Allowed flags are listed in `datasampler/__init__.py`.
* To use the proposed spectral regularization for tuple-based methods, set `--batch_mining rho_distance`, with the flip probability set via `--miner_rho_distance_cp`, e.g. `0.2`.
* A script to run the toy experiments in the paper is provided in `toy_experiments`.

**Note**: There may be small deviations in results depending on the hardware (e.g. P100 vs. RTX GPUs) and software (different PyTorch/CUDA versions) used to run these experiments, but they should be covered by the standard deviations reported in the paper.

---

## How to use this Repo

### Requirements:

* PyTorch 1.2.0+ & Faiss-GPU
* Python 3.6+
* pretrainedmodels, torchvision 0.3.0+

An exemplary setup of a virtual environment containing everything needed:
```
(1) wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
(2) bash Miniconda3-latest-Linux-x86_64.sh (say yes to append path to bashrc)
(3) source .bashrc
(4) conda create -n DL python=3.6
(5) conda activate DL
(6) conda install matplotlib scipy scikit-learn scikit-image tqdm pandas pillow
(7) conda install pytorch torchvision faiss-gpu cudatoolkit=10.0 -c pytorch
(8) pip install wandb pretrainedmodels
(9) Run the scripts!
```

### Datasets:

Data for

* CUB200-2011 (http://www.vision.caltech.edu/visipedia/CUB-200.html)
* CARS196 (https://ai.stanford.edu/~jkrause/cars/car_dataset.html)
* Stanford Online Products (http://cvgl.stanford.edu/projects/lifted_struct/)

can be downloaded either from the respective project sites or directly via Dropbox:

* CUB200-2011 (1.08 GB): https://www.dropbox.com/s/tjhf7fbxw5f9u0q/cub200.tar?dl=0
* CARS196 (1.86 GB): https://www.dropbox.com/s/zi2o92hzqekbmef/cars196.tar?dl=0
* SOP (2.84 GB): https://www.dropbox.com/s/fu8dgxulf10hns9/online_products.tar?dl=0

**The latter ensures that the folder structure is already consistent with this pipeline and the dataloaders**.

Otherwise, please make sure that the datasets have the following internal structure:

* For CUB200-2011/CARS196:
```
cub200/cars196
└───images
|    └───001.Black_footed_Albatross
|    │   Black_Footed_Albatross_0001_796111
|    │   ...
|    ...
```

* For Stanford Online Products:
```
online_products
└───images
|    └───bicycle_final
|    │   111085122871_0.jpg
|    ...
└───Info_Files
|    │   bicycle.txt
|    │   ...
```

Assuming your folder is placed in e.g. `<$datapath/cub200>`, pass `$datapath` as input to `--source`.

### Training:

Training is done by using `main.py` and setting the respective flags, all of which are listed and explained in `parameters.py`. A vast set of exemplary runs is provided in `Sample_Runs/ICML2020_RevisitDML_SampleRuns.sh`.

**[I.]** **A basic sample run using default parameters would look like this**:

```
python main.py --loss margin --batch_mining distance --log_online \
               --project DML_Project --group Margin_with_Distance --seed 0 \
               --gpu 0 --bs 112 --data_sampler class_random --samples_per_class 2 \
               --arch resnet50_frozen_normalize --source $datapath --n_epochs 150 \
               --lr 0.00001 --embed_dim 128 --evaluate_on_gpu
```

The purpose of each flag explained:

* `--loss <loss>`: Name of the training objective used. See folder `criteria` for implementations of these methods.
* `--batch_mining <batchminer>`: Name of the batch-miner to use (for tuple-based ranking methods). See folder `batchminer` for implementations of these methods.
* `--log_online`: Log metrics online via either W&B (default) or CometML. Regardless, plots, weights and parameters are all stored offline as well.
* `--project`, `--group`: Project name as well as name of the run. Different seeds will be logged into the same `--group` online. The group as well as the used seed also define the local savename.
* `--seed`, `--gpu`, `--source`: Basic parameters setting the training seed, the used GPU and the path to the parent folder containing the respective datasets.
* `--arch`: The utilized backbone, e.g. ResNet50. You can append `_frozen` and `_normalize` to the name to ensure that BatchNorm layers are frozen and embeddings are normalized, respectively.
* `--data_sampler`, `--samples_per_class`: How to construct a batch. The default method, `class_random`, selects classes at random and places `<samples_per_class>` samples into the batch until the batch is filled.
* `--lr`, `--n_epochs`, `--bs`, `--embed_dim`: Learning rate, number of training epochs, batch size and embedding dimensionality.
* `--evaluate_on_gpu`: If set, all metrics are computed using the GPU - this requires Faiss-GPU and may need additional GPU memory.

#### Some Notes:

* During training, metrics listed in `--evaluation_metrics` will be logged for both the training and the validation/test set. If you do not care about detailed training metric logging, simply set the flag `--no_train_metrics`. A checkpoint is saved for improvements in metrics listed in `--storage_metrics` on the training, validation or test sets. Detailed information regarding the available metrics can be found at the bottom of this `README`.
* If one wishes to use a training/validation split, simply set `--use_tv_split` and `--tv_split_perc <percentage>`.
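The `class_random` batch construction described above can be sketched in a few lines of plain Python. This is a minimal, hypothetical illustration of the sampling logic only (function name and signature are ours); the actual implementation lives in `datasampler/class_random_sampler.py` and inherits from `torch.utils.data.sampler.Sampler`:

```python
import random
from collections import defaultdict

def class_random_batches(labels, batch_size, samples_per_class, num_batches, seed=0):
    """Sketch of class-random batching: repeatedly pick a random class and
    draw `samples_per_class` of its indices until the batch is filled."""
    rng = random.Random(seed)

    # Group dataset indices by their class label.
    indices_by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        indices_by_class[label].append(idx)
    classes = sorted(indices_by_class)

    batches = []
    for _ in range(num_batches):
        batch = []
        while len(batch) < batch_size:
            cls = rng.choice(classes)
            batch.extend(rng.sample(indices_by_class[cls], samples_per_class))
        batches.append(batch[:batch_size])
    return batches
```

With e.g. `--bs 112 --samples_per_class 2`, each batch then consists of 56 same-class pairs drawn from randomly chosen classes, which is what tuple-based miners rely on.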
**[II.]** **Advanced Runs**:

```
python main.py --loss margin --batch_mining distance --loss_margin_beta 0.6 --miner_distance_lower_cutoff 0.5 ... (basic parameters)
```

* To use specific parameters that are loss-, batchminer- or e.g. datasampler-related, simply set the respective flag.
* For structure and ease of use, parameters relating to a specific loss function/batchminer etc. are marked as e.g. `--loss_<lossname>_<parameter_name>`, see `parameters.py`.
* However, every parameter can be called from every class, as all parameters are stored in a shared namespace that is passed to all methods. This makes it easy to create novel fusion losses and the like.

### Evaluating Results with W&B

Here is some information on using W&B (highly encouraged!)

* Create an account here (free): https://wandb.ai
* After the account is set up, make sure to include your API key in `parameters.py` under `--wandb_key`.
* To make sure that W&B data can be stored, run `wandb on` in the folder pointed to by `--save_path`.
* When data is logged online to W&B, one can use `Result_Evaluations.py` to download all data, create named metric and correlation plots, and output a summary in the form of a LaTeX-ready table with means and standard deviations of all metrics. **This ensures that there are no errors between computed and reported results.**

### Creating custom methods:

1. **Create custom objectives**: Simply take a look at e.g. `criteria/margin.py` and ensure that the method has the following properties:
   * Inherit from `torch.nn.Module` and define a custom `forward()` function.
   * When using trainable parameters, make sure to either provide a `self.lr` to set the learning rate of the loss-specific parameters, or set `self.optim_dict_list`, which is a list containing optimization dictionaries passed to the optimizer (see e.g. `criteria/proxynca.py`). If both are set, `self.optim_dict_list` has priority.
   * Depending on the loss, remember to set the variables `ALLOWED_MINING_OPS = None or list of allowed mining operations`, `REQUIRES_BATCHMINER = False or True` and `REQUIRES_OPTIM = False or True` to denote whether the method needs a batchminer or optimization of internal parameters.

2. **Create custom batchminers**: Simply take a look at e.g. `batchminer/distance.py` - the miner needs to be a class with a defined `__call__()` function, taking in a batch and labels and returning e.g. a list of triplets.

3. **Create custom datasamplers**: Simply take a look at e.g. `datasampler/class_random_sampler.py`. The sampler needs to inherit from `torch.utils.data.sampler.Sampler` and has to provide a `__iter__()` and a `__len__()` function. It has to yield a set of indices that are used to create the batch.

---

# Implemented Methods

For a detailed explanation of everything, please refer to the supplementary of our paper!

### DML criteria

* **Angular** [[Deep Metric Learning with Angular Loss](https://arxiv.org/pdf/1708.01682.pdf)] `--loss angular`
* **ArcFace** [[ArcFace: Additive Angular Margin Loss for Deep Face Recognition](https://arxiv.org/pdf/1801.07698.pdf)] `--loss arcface`
* **Contrastive** [[Dimensionality Reduction by Learning an Invariant Mapping](http://yann.lecun.com/exdb/publis/pdf/hadsell-chopra-lecun-06.pdf)] `--loss contrastive`
* **Generalized Lifted Structure** [[In Defense of the Triplet Loss for Person Re-Identification](https://arxiv.org/abs/1703.07737)] `--loss lifted`
* **Histogram** [[Learning Deep Embeddings with Histogram Loss](https://arxiv.org/pdf/1611.00822.pdf)] `--loss histogram`
* **Marginloss** [[Sampling Matters in Deep Embedding Learning](https://arxiv.org/abs/1706.07567)] `--loss margin`
* **MultiSimilarity** [[Multi-Similarity Loss with General Pair Weighting for Deep Metric Learning](https://arxiv.org/abs/1904.06627)] `--loss multisimilarity`
* **N-Pair** [[Improved Deep Metric Learning with Multi-class N-pair Loss
Objective](https://papers.nips.cc/paper/6200-improved-deep-metric-learning-with-multi-class-n-pair-loss-objective)] `--loss npair`
* **ProxyNCA** [[No Fuss Distance Metric Learning using Proxies](https://arxiv.org/pdf/1703.07464.pdf)] `--loss proxynca`
* **Quadruplet** [[Beyond triplet loss: a deep quadruplet network for person re-identification](https://arxiv.org/abs/1704.01719)] `--loss quadruplet`
* **Signal-to-Noise Ratio (SNR)** [[Signal-to-Noise Ratio: A Robust Distance Metric for Deep Metric Learning](https://arxiv.org/pdf/1904.02616.pdf)] `--loss snr`
* **SoftTriple** [[SoftTriple Loss: Deep Metric Learning Without Triplet Sampling](https://arxiv.org/abs/1909.05235)] `--loss softtriplet`
* **Normalized Softmax** [[Classification is a Strong Baseline for Deep Metric Learning](https://arxiv.org/abs/1811.12649)] `--loss softmax`
* **Triplet** [[FaceNet: A Unified Embedding for Face Recognition and Clustering](https://arxiv.org/abs/1503.03832)] `--loss triplet`

### DML batchminer

* **Random** [[FaceNet: A Unified Embedding for Face Recognition and Clustering](https://arxiv.org/abs/1503.03832)] `--batch_mining random`
* **Semihard** [[FaceNet: A Unified Embedding for Face Recognition and Clustering](https://arxiv.org/abs/1503.03832)] `--batch_mining semihard`
* **Softhard** [https://github.com/Confusezius/Deep-Metric-Learning-Baselines] `--batch_mining softhard`
* **Distance-based** [[Sampling Matters in Deep Embedding Learning](https://arxiv.org/abs/1706.07567)] `--batch_mining distance`
* **Rho-Distance** [[Revisiting Training Strategies and Generalization Performance in Deep Metric Learning](https://arxiv.org/abs/2002.08473)] `--batch_mining rho_distance`
* **Parametric** [[PADS: Policy-Adapted Sampling for Visual Similarity Learning](https://arxiv.org/abs/2003.11113)] `--batch_mining parametric`

### Architectures

* **ResNet50** [[Deep Residual Learning for Image Recognition](https://arxiv.org/abs/1512.03385)] e.g. `--arch resnet50_frozen_normalize`.
* **Inception-BN** [[Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift](https://arxiv.org/abs/1502.03167)] e.g. `--arch bninception_normalize_frozen`.
* **GoogLeNet** (torchvision variant w/ BN) [[Going Deeper with Convolutions](https://arxiv.org/abs/1409.4842)] e.g. `--arch googlenet`.

### Datasets

* **CUB200-2011** [[Caltech-UCSD Birds-200-2011](http://www.vision.caltech.edu/visipedia/CUB-200-2011.html)] `--dataset cub200`.
* **CARS196** [[Cars Dataset](https://ai.stanford.edu/~jkrause/cars/car_dataset.html)] `--dataset cars196`.
* **Stanford Online Products** [[Deep Metric Learning via Lifted Structured Feature Embedding](https://cvgl.stanford.edu/projects/lifted_struct/)] `--dataset online_products`.

### Evaluation Metrics

**Metrics based on Euclidean Distances**

* **Recall@k**: Include R@1 e.g. with `e_recall@1` in the list of evaluation metrics `--evaluation_metrics`.
* **Normalized Mutual Information (NMI)**: Include with `nmi`.
* **F1**: Include with `f1`.
* **mAP (class-averaged)**: Include standard mAP at Recall with `mAP_lim`. You may also include `mAP_1000` for mAP limited to Recall@1000, and `mAP_c` for mAP limited to Recall@Max_Num_Samples_Per_Class. Note that all of these are heavily correlated.

**Metrics based on Cosine Similarities** *(not included by default)*

* **Cosine Recall@k**: Cosine-similarity variant of Recall@k. Include with `c_recall@k` in `--evaluation_metrics`.
* **Cosine Normalized Mutual Information (NMI)**: Include with `c_nmi`.
* **Cosine F1**: Include with `c_f1`.
* **Cosine mAP (class-averaged)**: Include cosine-similarity mAP at Recall variants with `c_mAP_lim`. You may also include `c_mAP_1000` for mAP limited to Recall@1000, and `c_mAP_c` for mAP limited to Recall@Max_Num_Samples_Per_Class.

**Embedding Space Metrics**

* **Spectral Variance**: This metric refers to the spectral decay metric used in our ICML paper. Include it with `rho_spectrum@1`.
To exclude the `k` largest spectral values for a more robust estimate, simply include `rho_spectrum@k+1`. Adding `rho_spectrum@0` logs the whole singular value distribution, and `rho_spectrum@-1` computes KL(q,p) instead of KL(p,q).
* **Mean Intraclass Distance**: Include the mean intraclass distance via `dists@intra`.
* **Mean Interclass Distance**: Include the mean interclass distance via `dists@inter`.
* **Ratio Intra- to Interclass Distance**: Include the ratio of distances via `dists@intra_over_inter`.

================================================
FILE: Result_Evaluations.py
================================================
"""
This script downloads and evaluates W&B run data to produce plots and tables used in the original paper.
"""
import numpy as np
import wandb
import matplotlib.pyplot as plt


def get_data(project):
    from tqdm import tqdm
    api = wandb.Api()

    # Project is specified by <entity/project-name>
    runs = api.runs(project)

    info_list = []
    # history_list = []
    for run in tqdm(runs, desc='Downloading data...'):
        config = {k: v for k, v in run.config.items() if not k.startswith('_')}
        info_dict = {'metrics': run.history(), 'config': config}
        info_list.append((run.name, info_dict))
    return info_list


all_df = get_data("confusezius/RevisitDML")

names_to_check = list(np.unique(['_s'.join(x[0].split('_s')[:-1]) for x in all_df]))
metrics = ['Test: discriminative_e_recall: e_recall@1', 'Test: discriminative_e_recall: e_recall@2', \
           'Test: discriminative_e_recall: e_recall@4', 'Test: discriminative_nmi: nmi', \
           'Test: discriminative_f1: f1', 'Test: discriminative_mAP: mAP']
metric_names = ['R@1', 'R@2', 'R@4', 'NMI', 'F1', 'mAP']
idxs = {x: [i for i, y in enumerate(all_df) if x == '_s'.join(y[0].split('_s')[:-1])] for x in names_to_check}

vals = {}
for group, runs in idxs.items():
    if 'CUB' in group:
        min_len = 40
    elif 'CAR' in group:
        min_len = 40
    elif 'SOP' in group:
        min_len = 40
    vals[group] = {metric_name: [] for metric_name in metric_names}
    vals[group]['Max_Epoch'] = []
    vals[group]['Intra_over_Inter'] = []
    vals[group]['Intra'] = []
    vals[group]['Inter'] = []
    vals[group]['Rho1'] = []
    vals[group]['Rho2'] = []
    vals[group]['Rho3'] = []
    vals[group]['Rho4'] = []

    for i, run in enumerate(runs):
        name, data = all_df[run]
        for metric, metric_name in zip(metrics, metric_names):
            if len(data['metrics']):
                sub_data = list(data['metrics'][metric])
                if len(sub_data) > min_len:
                    vals[group][metric_name].append(np.nanmax(sub_data))
                    if metric_name == 'R@1':
                        r_argmax = np.nanargmax(sub_data)
                        vals[group]['Max_Epoch'].append(r_argmax)
                        vals[group]['Intra_over_Inter'].append(data['metrics']['Train: discriminative_dists: dists@intra_over_inter'][r_argmax])
                        vals[group]['Intra'].append(data['metrics']['Train: discriminative_dists: dists@intra'][r_argmax])
                        vals[group]['Inter'].append(data['metrics']['Train: discriminative_dists: dists@inter'][r_argmax])
                        vals[group]['Rho1'].append(data['metrics']['Train: discriminative_rho_spectrum: rho_spectrum@-1'][r_argmax])
                        vals[group]['Rho2'].append(data['metrics']['Train: discriminative_rho_spectrum: rho_spectrum@1'][r_argmax])
                        vals[group]['Rho3'].append(data['metrics']['Train: discriminative_rho_spectrum: rho_spectrum@2'][r_argmax])
                        vals[group]['Rho4'].append(data['metrics']['Train: discriminative_rho_spectrum: rho_spectrum@10'][r_argmax])
    vals[group] = {metric_name: (np.mean(metric_vals), np.std(metric_vals)) for metric_name, metric_vals in vals[group].items()}

###
cub_vals = {key: item for key, item in vals.items() if 'CUB' in key}
car_vals = {key: item for key, item in vals.items() if 'CAR' in key}
sop_vals = {key: item for key, item in vals.items() if 'SOP' in key}


##########
def name_filter(n):
    n = '_'.join(n.split('_')[1:])
    return n

def name_adjust(n, prep='', app='', for_plot=True):
    if 'Margin_b06_Distance' in n:
        t = 'Margin (D), \\beta=0.6' if for_plot else 'Margin (D, \\beta=0.6)'
    elif 'Margin_b12_Distance' in n:
        t = 'Margin (D), \\beta=1.2' if for_plot else 'Margin (D, \\beta=1.2)'
    elif 'ArcFace' in n:
        t = 'ArcFace'
    elif 'Histogram' in n:
        t = 'Histogram'
    elif 'SoftTriple' in n:
        t = 'SoftTriple'
    elif 'Contrastive' in n:
        t = 'Contrastive (D)'
    elif 'Triplet_Distance' in n:
        t = 'Triplet (D)'
    elif 'Quadruplet_Distance' in n:
        t = 'Quadruplet (D)'
    elif 'SNR_Distance' in n:
        t = 'SNR (D)'
    elif 'Triplet_Random' in n:
        t = 'Triplet (R)'
    elif 'Triplet_Semihard' in n:
        t = 'Triplet (S)'
    elif 'Triplet_Softhard' in n:
        t = 'Triplet (H)'
    elif 'Softmax' in n:
        t = 'Softmax'
    elif 'MS' in n:
        t = 'Multisimilarity'
    else:
        t = '_'.join(n.split('_')[1:])
    if for_plot:
        t = r'${0}$'.format(t)
    return prep + t + app


########
def single_table(vals):
    print_str = ''
    for name, metrics in vals.items():
        prep = 'R-' if 'reg_' in name else ''
        name = name_adjust(name, for_plot=False, prep=prep)
        add = '{0} & ${1:2.2f}\\pm{2:2.2f}$ & ${3:2.2f}\\pm{4:2.2f}$ & ${5:2.2f}\\pm{6:2.2f}$ & ${7:2.2f}\\pm{8:2.2f}$ & ${9:2.2f}\\pm{10:2.2f}$ & ${11:2.2f}\\pm{12:2.2f}$'.format(
            name, metrics['R@1'][0]*100, metrics['R@1'][1]*100, metrics['R@2'][0]*100, metrics['R@2'][1]*100,
            metrics['F1'][0]*100, metrics['F1'][1]*100, metrics['mAP'][0]*100, metrics['mAP'][1]*100,
            metrics['NMI'][0]*100, metrics['NMI'][1]*100, metrics['Max_Epoch'][0], metrics['Max_Epoch'][1])
        print_str += add
        print_str += '\\'
        print_str += '\\'
        print_str += '\n'
    return print_str

print(single_table(cub_vals))
print(single_table(car_vals))
print(single_table(sop_vals))


########
def shared_table():
    cub_names, car_names, sop_names = list(cub_vals.keys()), list(car_vals.keys()), list(sop_vals.keys())
    cub_names = [name_adjust(n, for_plot=False, prep='R-' if 'reg_' in n else '') for n in cub_names]
    cub_vals_2 = {name_adjust(n, for_plot=False, prep='R-' if 'reg_' in n else ''): item for n, item in cub_vals.items()}
    car_names = [name_adjust(n, for_plot=False, prep='R-' if 'reg_' in n else '') for n in car_names]
    car_vals_2 = {name_adjust(n, for_plot=False, prep='R-' if 'reg_' in n else ''): item for n, item in car_vals.items()}
    sop_names = [name_adjust(n, for_plot=False, prep='R-' if 'reg_' in n else '') for n in sop_names]
    sop_vals_2 = {name_adjust(n, for_plot=False, prep='R-' if 'reg_' in n else ''): item for n, item in sop_vals.items()}
    cub_vvals, car_vvals, sop_vvals = list(cub_vals.values()), list(car_vals.values()), list(sop_vals.values())

    unique_names = np.unique(np.concatenate([cub_names, car_names, sop_names], axis=0).reshape(-1))
    unique_names = sorted([x for x in unique_names if 'R-' not in x]) + sorted([x for x in unique_names if 'R-' in x])

    print_str = ''
    for name in unique_names:
        cub_rm, cub_rs = ('{0:2.2f}'.format(cub_vals_2[name]['R@1'][0]*100), '{0:2.2f}'.format(cub_vals_2[name]['R@1'][1]*100)) if name in cub_vals_2 else ('-', '-')
        cub_nm, cub_ns = ('{0:2.2f}'.format(cub_vals_2[name]['NMI'][0]*100), '{0:2.2f}'.format(cub_vals_2[name]['NMI'][1]*100)) if name in cub_vals_2 else ('-', '-')
        car_rm, car_rs = ('{0:2.2f}'.format(car_vals_2[name]['R@1'][0]*100), '{0:2.2f}'.format(car_vals_2[name]['R@1'][1]*100)) if name in car_vals_2 else ('-', '-')
        car_nm, car_ns = ('{0:2.2f}'.format(car_vals_2[name]['NMI'][0]*100), '{0:2.2f}'.format(car_vals_2[name]['NMI'][1]*100)) if name in car_vals_2 else ('-', '-')
        sop_rm, sop_rs = ('{0:2.2f}'.format(sop_vals_2[name]['R@1'][0]*100), '{0:2.2f}'.format(sop_vals_2[name]['R@1'][1]*100)) if name in sop_vals_2 else ('-', '-')
        sop_nm, sop_ns = ('{0:2.2f}'.format(sop_vals_2[name]['NMI'][0]*100), '{0:2.2f}'.format(sop_vals_2[name]['NMI'][1]*100)) if name in sop_vals_2 else ('-', '-')
        add = '{0} & ${1}\\pm{2}$ & ${3}\\pm{4}$ & ${5}\\pm{6}$ & ${7}\\pm{8}$ & ${9}\\pm{10}$ & ${11}\\pm{12}$'.format(
            name, cub_rm, cub_rs, cub_nm, cub_ns, car_rm, car_rs, car_nm, car_ns, sop_rm, sop_rs, sop_nm, sop_ns)
        print_str += add
        print_str += '\\'
        print_str += '\\'
        print_str += '\n'
    return print_str

print(shared_table())


"""==================================================="""
def give_basic_metr(vals, key='CUB'):
    if key == 'CUB':
        Basic = sorted(list(filter(lambda x: '{}_'.format(key) in x, list(vals.keys()))))
    elif key == 'CARS':
        Basic = sorted(list(filter(lambda x: 'CARS_' in x, list(vals.keys()))))
    elif key == 'SOP':
        Basic = sorted(list(filter(lambda x: 'SOP_' in x, list(vals.keys()))))

    basic_recall = np.array([vals[k]['R@1'][0] for k in Basic])
    basic_recall_err = np.array([vals[k]['R@1'][1] for k in Basic])
    #
    basic_recall2 = np.array([vals[k]['R@2'][0] for k in Basic])
    basic_recall4 = np.array([vals[k]['R@4'][0] for k in Basic])
    basic_nmi = np.array([vals[k]['NMI'][0] for k in Basic])
    basic_f1 = np.array([vals[k]['F1'][0] for k in Basic])
    basic_map = np.array([vals[k]['mAP'][0] for k in Basic])
    mets = [basic_recall, basic_recall2, basic_recall4, basic_nmi, basic_f1, basic_map]
    return Basic, mets, basic_recall, basic_recall_err

def give_reg_metr(vals, key='CUB'):
    if key == 'CUB':
        RhoReg = sorted(list(filter(lambda x: '{}reg_'.format(key) in x, list(vals.keys()))))
    elif key == 'CARS':
        RhoReg = sorted(list(filter(lambda x: 'CARreg_' in x, list(vals.keys()))))
    elif key == 'SOP':
        RhoReg = sorted(list(filter(lambda x: 'SOPreg_' in x, list(vals.keys()))))

    rho_recall = np.array([vals[k]['R@1'][0] for k in RhoReg])
    rho_recall_err = np.array([vals[k]['R@1'][1] for k in RhoReg])
    #
    rho_recall2 = np.array([vals[k]['R@2'][0] for k in RhoReg])
    rho_recall4 = np.array([vals[k]['R@4'][0] for k in RhoReg])
    rho_nmi = np.array([vals[k]['NMI'][0] for k in RhoReg])
    rho_f1 = np.array([vals[k]['F1'][0] for k in RhoReg])
    rho_map = np.array([vals[k]['mAP'][0] for k in RhoReg])
    mets = [rho_recall, rho_recall2, rho_recall4, rho_nmi, rho_f1, rho_map]
    return RhoReg, mets, rho_recall, rho_recall_err

cub_basic_names, cub_mets, cub_basic_recall, cub_basic_recall_err = give_basic_metr(cub_vals, key='CUB')
car_basic_names, car_mets, car_basic_recall, car_basic_recall_err = give_basic_metr(car_vals, key='CARS')
sop_basic_names, sop_mets, sop_basic_recall, sop_basic_recall_err = give_basic_metr(sop_vals, key='SOP')

cub_reg_names, cub_reg_mets, cub_reg_recall, cub_reg_recall_err = give_reg_metr(cub_vals, key='CUB')
car_reg_names, car_reg_mets, car_reg_recall, car_reg_recall_err = give_reg_metr(car_vals, key='CARS')
sop_reg_names, sop_reg_mets, sop_reg_recall, sop_reg_recall_err = give_reg_metr(sop_vals, key='SOP')


"""============================================================="""
# def produce_plot(basic_recall, basic_recall_err, BasicLosses, vals, ylim=[0.58, 0.635]):
#
#     intra = np.array([vals[k]['Intra'][0] for k in BasicLosses])
#     inter = np.array([vals[k]['Inter'][0] for k in BasicLosses])
#     ratio = np.array([vals[k]['Intra_over_Inter'][0] for k in BasicLosses])
#     rho1 = np.array([vals[k]['Rho1'][0] for k in BasicLosses])
#     rho2 = np.array([vals[k]['Rho2'][0] for k in BasicLosses])
#     rho3 = np.array([vals[k]['Rho3'][0] for k in BasicLosses])
#     rho4 = np.array([vals[k]['Rho4'][0] for k in BasicLosses])
#
#     def comp(met):
#         sort = np.argsort(met)
#         corr = np.corrcoef(met[sort],basic_recall[sort])[0,1]
#         m,b = np.polyfit(met[sort], basic_recall[sort], 1)
#         lim = [np.min(met)*0.9, np.max(met)*1.1]
#         x = np.linspace(lim[0], lim[1], 50)
#         linfit = m*x + b
#         return sort, corr, linfit, x, lim
#
#     intra_sort, intra_corr, intra_linfit, intra_x, intra_lim = comp(intra)
#     inter_sort, inter_corr, inter_linfit, inter_x, inter_lim = comp(inter)
#     ratio_sort, ratio_corr, ratio_linfit, ratio_x, ratio_lim = comp(ratio)
#     rho1_sort, rho1_corr, rho1_linfit, rho1_x, rho1_lim = comp(rho1)
#     rho2_sort, rho2_corr, rho2_linfit, rho2_x, rho2_lim = comp(rho2)
#     rho3_sort, rho3_corr, rho3_linfit, rho3_x, rho3_lim = comp(rho3)
#     rho4_sort, rho4_corr, rho4_linfit, rho4_x, rho4_lim = comp(rho4)
#
#     # # f,ax = plt.subplots(1,4)
#     f,ax = plt.subplots(1,7)
#     colors = np.array([np.random.rand(3,) for _ in range(len(basic_recall))])
#     for i in range(len(colors)):
#         ax[0].errorbar(intra[intra_sort][i], basic_recall[intra_sort][i], yerr=basic_recall_err[intra_sort][i], fmt='o', color=colors[intra_sort][i], ecolor='gray', elinewidth=3, capsize=0, label='Basic Criteria', markersize=8)
#         ax[1].errorbar(inter[inter_sort][i], basic_recall[inter_sort][i], yerr=basic_recall_err[inter_sort][i], fmt='o', color=colors[inter_sort][i], ecolor='gray', elinewidth=3, capsize=0, label='Basic Criteria', markersize=8)
#         ax[2].errorbar(ratio[ratio_sort][i], basic_recall[ratio_sort][i], yerr=basic_recall_err[ratio_sort][i], fmt='o', color=colors[ratio_sort][i], ecolor='gray', elinewidth=3, capsize=0, label='Basic Criteria', markersize=8)
#         # ax[3].errorbar(rho1[rho1_sort][i], basic_recall[rho1_sort][i], yerr=basic_recall_err[rho1_sort][i], fmt='o', color=colors[rho1_sort][i], ecolor='gray', elinewidth=3, capsize=0, label='Basic Criteria', markersize=8)
#         # ax[4].errorbar(rho2[rho2_sort][i], basic_recall[rho2_sort][i], yerr=basic_recall_err[rho2_sort][i], fmt='o', color=colors[rho2_sort][i], ecolor='gray', elinewidth=3, capsize=0, label='Basic Criteria', markersize=8)
#         ax[3].errorbar(rho3[rho3_sort][i], basic_recall[rho3_sort][i], yerr=basic_recall_err[rho3_sort][i], fmt='o', color=colors[rho3_sort][i], ecolor='gray', elinewidth=3, capsize=0, label='Basic Criteria', markersize=8)
#         # ax[6].errorbar(rho4[rho4_sort][i], basic_recall[rho4_sort][i], yerr=basic_recall_err[rho4_sort][i], fmt='o', color=colors[rho4_sort][i], ecolor='gray', elinewidth=3, capsize=0, label='Basic Criteria', markersize=8)
#     ax[1].set_yticks([])
#     ax[2].set_yticks([])
#     ax[3].set_yticks([])
#     # ax[4].set_yticks([])
#     # ax[5].set_yticks([])
#     # ax[6].set_yticks([])
#     ax[0].plot(intra_x, intra_linfit, 'k--', alpha=0.5, linewidth=3)
#     ax[1].plot(inter_x, inter_linfit, 'k--', alpha=0.5, linewidth=3)
#     ax[2].plot(ratio_x, ratio_linfit, 'k--', alpha=0.5, linewidth=3)
#     # ax[3].plot(rho1_x, rho1_linfit, 'k--', alpha=0.5, linewidth=3)
#     ax[3].plot(rho2_x, rho2_linfit, 'k--', alpha=0.5, linewidth=3)
#     # ax[5].plot(rho3_x, rho3_linfit, 'k--', alpha=0.5, linewidth=3)
#     # ax[6].plot(rho4_x, rho4_linfit, 'k--', alpha=0.5, linewidth=3)
#     ax[0].text('Correlation: {0:2.2f}'.format(intra_corr), fontsize=18)
#     ax[1].text('Correlation: {0:2.2f}'.format(inter_corr), fontsize=18)
#     ax[2].text('Correlation: {0:2.2f}'.format(ratio_corr), fontsize=18)
#     # ax[3].text('Correlation: {0:2.2f}'.format(rho1_corr), fontsize=18)
#     ax[3].text('Correlation: {0:2.2f}'.format(rho2_corr), fontsize=18)
#     # ax[5].set_title('Correlation: {0:2.2f}'.format(rho3_corr), fontsize=18)
#     # ax[6].set_title('Correlation: {0:2.2f}'.format(rho4_corr), fontsize=18)
#     ax[0].set_title(r'$\pi_{intra}$', fontsize=18)
#     ax[1].set_title(r'$\pi_{inter}$', fontsize=18)
#     ax[2].set_title(r'$\pi_{ratio}$', fontsize=18)
#     ax[3].set_title(r'$\rho(\Phi)$', fontsize=18)
#     ax[0].set_ylabel('Recall Performance', fontsize=18)
#     for a in ax.reshape(-1):
#         a.tick_params(axis='both', which='major', labelsize=16)
#         a.tick_params(axis='both', which='minor', labelsize=16)
#         a.set_ylim(ylim)
#     f.set_size_inches(22,8)
#     f.tight_layout()

# produce_plot(cub_basic_recall, cub_basic_recall_err, cub_basic_names, cub_vals, ylim=[0.581,0.635])
# produce_plot(car_basic_recall, car_basic_recall_err, car_basic_names, car_vals, ylim=[0.70,0.82])
# produce_plot(sop_basic_recall, sop_basic_recall_err, sop_basic_names, sop_vals, ylim=[0.67,0.79])

def full_rel_plot():
    recallss = [cub_basic_recall, car_basic_recall, sop_basic_recall]
    rerrss = [cub_basic_recall_err, car_basic_recall_err, sop_basic_recall_err]
    namess = [cub_basic_names, car_basic_names, sop_basic_names]
    valss = [cub_vals, car_vals, sop_vals]
    ylims = [[0.581, 0.638],[0.70,0.82],[0.67,0.79]]

    f,axes = plt.subplots(3,4)
    for k,(ax, recalls, rerrs, names, vals, ylim) in enumerate(zip(axes, recallss, rerrss, namess, valss, ylims)):
        col = 'red' if k==3 else 'gray'
        intra = np.array([vals[k]['Intra'][0] for k in names])
        inter = np.array([vals[k]['Inter'][0] for k in names])
        ratio = np.array([vals[k]['Intra_over_Inter'][0] for k in names])
        rho = np.array([vals[k]['Rho3'][0] for k in names])

        def comp(met):
            sort = np.argsort(met)
            corr = np.corrcoef(met[sort],recalls[sort])[0,1]
            m,b = np.polyfit(met[sort], recalls[sort], 1)
            lim = [np.min(met)*0.9, np.max(met)*1.1]
            x = np.linspace(lim[0], lim[1], 50)
            linfit = m*x + b
            return sort, corr, linfit, x, lim

        intra_sort, intra_corr, intra_linfit, intra_x, intra_lim = comp(intra)
        inter_sort, inter_corr, inter_linfit, inter_x, inter_lim = comp(inter)
        ratio_sort, ratio_corr, ratio_linfit, ratio_x, ratio_lim = comp(ratio)
        rho_sort, rho_corr, rho_linfit, rho_x, rho_lim = comp(rho)

        # f,ax = plt.subplots(1,7)
        colors = np.array([np.random.rand(3,) for _ in range(len(recalls))])
        for i in range(len(colors)):
            ax[0].errorbar(intra[intra_sort][i], recalls[intra_sort][i], yerr=rerrs[intra_sort][i], fmt='o', color=colors[intra_sort][i], ecolor='gray', elinewidth=3, capsize=0, label='Basic Criteria', markersize=8)
            ax[1].errorbar(inter[inter_sort][i], recalls[inter_sort][i], yerr=rerrs[inter_sort][i], fmt='o', color=colors[inter_sort][i], ecolor='gray', elinewidth=3, capsize=0, label='Basic Criteria', markersize=8)
            ax[2].errorbar(ratio[ratio_sort][i], recalls[ratio_sort][i], yerr=rerrs[ratio_sort][i], fmt='o', color=colors[ratio_sort][i], ecolor='gray', elinewidth=3, capsize=0, label='Basic Criteria', markersize=8)
            ax[3].errorbar(rho[rho_sort][i], recalls[rho_sort][i], yerr=rerrs[rho_sort][i], fmt='o', color=colors[rho_sort][i], ecolor='gray', elinewidth=3, capsize=0, label='Basic Criteria', markersize=8)

        ax[1].set_yticks([])
        ax[2].set_yticks([])
        ax[3].set_yticks([])

        ax[0].plot(intra_x, intra_linfit, 'k--', alpha=0.5, linewidth=3)
        ax[1].plot(inter_x, inter_linfit, 'k--', alpha=0.5, linewidth=3)
        ax[2].plot(ratio_x, ratio_linfit, 'k--', alpha=0.5, linewidth=3)
        ax[3].plot(rho_x, rho_linfit, 'r--', alpha=0.5, linewidth=3)

        ax[0].text(intra_lim[1]-0.7*(intra_lim[1]-intra_lim[0]),ylim[0]+0.05*(ylim[1]-ylim[0]),'Corr: {0:1.2f}'.format(intra_corr), bbox=dict(facecolor='gray', alpha=0.5), fontsize=26)
        ax[1].text(inter_lim[1]-0.7*(inter_lim[1]-inter_lim[0]),ylim[0]+0.05*(ylim[1]-ylim[0]),'Corr: {0:1.2f}'.format(inter_corr),
bbox=dict(facecolor='gray', alpha=0.5), fontsize=26) ax[2].text(ratio_lim[1]-0.7*(ratio_lim[1]-ratio_lim[0]),ylim[0]+0.05*(ylim[1]-ylim[0]),'Corr: {0:1.2f}'.format(ratio_corr), bbox=dict(facecolor='gray', alpha=0.5), fontsize=26) ax[3].text(rho_lim[1]-0.7*(rho_lim[1]-rho_lim[0]),ylim[0]+0.05*(ylim[1]-ylim[0]),'Corr: {0:1.2f}'.format(rho_corr), bbox=dict(facecolor='red', alpha=0.5), fontsize=26) if k==0: ax[0].set_title(r'$\pi_{intra}$', fontsize=26) ax[1].set_title(r'$\pi_{inter}$', fontsize=26) ax[2].set_title(r'$\pi_{ratio}$', fontsize=26) ax[3].set_title(r'$\rho(\Phi)$', fontsize=26, color='red') if k==0: ax[0].set_ylabel('CUB200-2011 R@1', fontsize=23) elif k==1: ax[0].set_ylabel('CARS196 R@1', fontsize=23) elif k==2: ax[0].set_ylabel('SOP R@1', fontsize=23) for a in ax.reshape(-1): a.tick_params(axis='both', which='major', labelsize=20) a.tick_params(axis='both', which='minor', labelsize=20) a.set_ylim(ylim) f.set_size_inches(21,15) f.tight_layout() f.savefig('comp_metric_relation.pdf') f.savefig('comp_metric_relation.png') full_rel_plot() """================================================""" import itertools as it cub_corr_mat = np.corrcoef(cub_mets) f,ax = plt.subplots(1,3) ax[0].imshow(cub_corr_mat, vmin=0, vmax=1, cmap='plasma') corr_x = [0,1,2,3,4,5] ax[0].set_xticklabels(metric_names) ax[0].set_yticklabels(metric_names) ax[0].set_xticks(corr_x) ax[0].set_yticks(corr_x) ax[0].set_xlim([-0.5,5.5]) ax[0].set_ylim([-0.5,5.5]) cs = list(it.product(corr_x, corr_x)) for c in cs: ax[0].text(c[0]-0.2, c[1]-0.11, '{0:1.2f}'.format(cub_corr_mat[c[0], c[1]]), fontsize=18) ax[0].tick_params(axis='both', which='major', labelsize=18) ax[0].tick_params(axis='both', which='minor', labelsize=18) car_corr_mat = np.corrcoef(car_mets) ax[1].imshow(car_corr_mat, vmin=0, vmax=1, cmap='plasma') corr_x = [0,1,2,3,4,5] ax[1].set_xticklabels(metric_names) ax[1].set_yticklabels(metric_names) ax[1].set_xticks(corr_x) ax[1].set_yticks(corr_x) ax[1].set_xlim([-0.5,5.5]) 
ax[1].set_ylim([-0.5, 5.5])
cs = list(it.product(corr_x, corr_x))
for c in cs:
    ax[1].text(c[0]-0.2, c[1]-0.11, '{0:1.2f}'.format(car_corr_mat[c[0], c[1]]), fontsize=18)
ax[1].tick_params(axis='both', which='major', labelsize=18)
ax[1].tick_params(axis='both', which='minor', labelsize=18)

sop_corr_mat = np.corrcoef(sop_mets)
ax[2].imshow(sop_corr_mat, vmin=0, vmax=1, cmap='plasma')
corr_x = [0, 1, 2, 3, 4, 5]
ax[2].set_xticklabels(metric_names)
ax[2].set_yticklabels(metric_names)
ax[2].set_xticks(corr_x)
ax[2].set_yticks(corr_x)
ax[2].set_xlim([-0.5, 5.5])
ax[2].set_ylim([-0.5, 5.5])
cs = list(it.product(corr_x, corr_x))
for c in cs:
    ax[2].text(c[0]-0.2, c[1]-0.11, '{0:1.2f}'.format(sop_corr_mat[c[0], c[1]]), fontsize=18)
ax[2].tick_params(axis='both', which='major', labelsize=18)
ax[2].tick_params(axis='both', which='minor', labelsize=18)

ax[0].set_title('CUB200-2011', fontsize=22)
ax[1].set_title('CARS196', fontsize=22)
ax[2].set_title('Stanford Online Products', fontsize=22)
f.set_size_inches(22, 8)
f.tight_layout()
f.savefig('metric_correlation_matrix.pdf')
f.savefig('metric_correlation_matrix.png')


"""=================================================="""
####
recallss, valss = [cub_basic_recall, car_basic_recall, sop_basic_recall], [cub_vals, car_vals, sop_vals]
errss = [cub_basic_recall_err, car_basic_recall_err, sop_basic_recall_err]
namess = [cub_basic_names, car_basic_names, sop_basic_names]

reg_recallss, reg_valss = [cub_reg_recall, car_reg_recall, sop_reg_recall], [cub_vals, car_vals, sop_vals]
reg_errss = [cub_reg_recall_err, car_reg_recall_err, sop_reg_recall_err]
reg_namess = [cub_reg_names, car_reg_names, sop_reg_names]
####

def plot(vals, recalls, errs, names, reg_vals=None, reg_recalls=None, reg_errs=None, reg_names=None,
         xlab=None, ylab=None, xlim=[0, 1], ylim=[0, 1], savename=None):
    from adjustText import adjust_text
    f, ax = plt.subplots(1)
    texts = []

    rho = np.array([vals[k]['Rho3'][0] for k in names])
    adj_names = names
    nnames = []
    for n in adj_names:
        nnames.append(name_adjust(n, prep='', app=''))
    print(nnames)

    ax.errorbar(rho, recalls, yerr=errs, color='deepskyblue', fmt='o', ecolor='deepskyblue', elinewidth=5, capsize=0, markersize=16, mec='k')
    recalls = np.array(recalls)
    for rho_v, rec_v, n in zip(rho, recalls, nnames):
        r = ax.text(rho_v, rec_v, n, fontsize=17, va='top', ha='left')
        # r = ax.text(rho_v, rec_v, n, fontsize=15, bbox=dict(facecolor='gray', alpha=0.5), va='left', ha='left')
        texts.append(r)

    if reg_names is not None:
        rho = np.array([vals[k]['Rho3'][0] for k in reg_names])
        adj_names = ['_'.join(x.split('_')[1:]) for x in reg_names]
        nnames = []
        for n in adj_names:
            nnames.append(name_adjust(n, prep='R-', app=''))
        ax.errorbar(rho, reg_recalls, yerr=reg_errs, color='orange', fmt='o', ecolor='gray', elinewidth=5, capsize=0, markersize=16, mec='k')
        for rho_v, rec_v, n in zip(rho, reg_recalls, nnames):
            r = ax.text(rho_v, rec_v, n, fontsize=17, va='top', ha='left', color='chocolate')
            texts.append(r)

    ax.tick_params(axis='both', which='major', labelsize=20)
    ax.tick_params(axis='both', which='minor', labelsize=20)
    if xlab is not None:
        ax.set_xlabel(xlab, fontsize=20)
    if ylab is not None:
        ax.set_ylabel(ylab, fontsize=20)
    ax.set_xlim(xlim)
    ax.set_ylim(ylim)
    ax.grid()
    f.set_size_inches(25, 5)
    f.tight_layout()
    adjust_text(texts, arrowprops=dict(arrowstyle="->", color='k', lw=1))
    f.savefig('{}.png'.format(savename))
    f.savefig('{}.pdf'.format(savename))


plot(valss[0], recallss[0], errss[0], namess[0], reg_valss[0], reg_recallss[0], reg_errss[0], reg_namess[0],
     xlab=r'$\rho(\Phi)$', ylab=r'$CUB200-2011, R@1$', xlim=[0, 0.59], ylim=[0.58, 0.66], savename='Detailed_Rel_Recall_Rho_CUB')
plot(valss[1], recallss[1], errss[1], namess[1], reg_valss[1], reg_recallss[1], reg_errss[1], reg_namess[1],
     xlab=r'$\rho(\Phi)$', ylab=r'$CARS196, R@1$', xlim=[0, 0.59], ylim=[0.7, 0.84], savename='Detailed_Rel_Recall_Rho_CAR')
plot(valss[2], recallss[2], errss[2], namess[2], reg_valss[2], reg_recallss[2], reg_errss[2], reg_namess[2],
     xlab=r'$\rho(\Phi)$', ylab=r'$SOP, R@1$', xlim=[0, 0.59], ylim=[0.67, 0.81], savename='Detailed_Rel_Recall_Rho_SOP')


"""=================================================="""
#### First Page Figure
plt.style.use('seaborn')

total_recall = np.array(cub_basic_recall.tolist() + cub_reg_recall.tolist())
total_err = np.array(cub_basic_recall_err.tolist() + cub_reg_recall_err.tolist())
total_names = np.array(cub_basic_names + cub_reg_names)
sort_idx = np.argsort(total_recall)

f, ax = plt.subplots(1)
basic_label, reg_label = False, False
for i, idx in enumerate(sort_idx):
    if 'reg_' not in total_names[idx]:
        if basic_label:
            ax.barh(i, total_recall[idx], xerr=total_err[idx], color='orange', alpha=0.6)
        else:
            ax.barh(i, total_recall[idx], xerr=total_err[idx], color='orange', alpha=0.6, label='Basic DML Criteria')
            basic_label = True
        ax.text(0.5703, i-0.2, name_adjust(total_names[idx]), fontsize=17)
    else:
        if reg_label:
            ax.barh(i, total_recall[idx], xerr=total_err[idx], color='forestgreen', alpha=0.8)
        else:
            ax.barh(i, total_recall[idx], xerr=total_err[idx], color='forestgreen', alpha=0.8, label='Regularized Variant')
            reg_label = True
        ax.text(0.5703, i-0.2, name_adjust(total_names[idx], prep='R-'), fontsize=17)

ax.legend(fontsize=20)
ax.set_yticks([])
ax.set_yticklabels([])
ax.set_xticks([0.58, 0.6, 0.62, 0.64])
ax.tick_params(axis='both', which='major', labelsize=22)
ax.tick_params(axis='both', which='minor', labelsize=22)
ax.set_title('CUB200-2011, R@1', fontsize=20)
ax.set_ylim([-0.5, 22.5])
ax.set_xlim([0.57, 0.655])
f.set_size_inches(15, 8)
f.tight_layout()
f.savefig('FirstPage.png')
f.savefig('FirstPage.pdf')



================================================
FILE: Sample_Runs/ICML2020_RevisitDML_SampleRuns.sh
================================================
python main.py --kernels 6 --source /home/karsten_dl/Dropbox/Projects/Datasets --n_epochs 150 --seed 0 --gpu 1 --bs 112 --samples_per_class 2 --loss npair --batch_mining npair --arch resnet50_frozen

"""============= Baseline Runs ---
CUB200-2011 ===================="""
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_Npair --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss npair --batch_mining npair --arch resnet50_frozen
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_Npair --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss npair --batch_mining npair --arch resnet50_frozen
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_Npair --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss npair --batch_mining npair --arch resnet50_frozen
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_Npair --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss npair --batch_mining npair --arch resnet50_frozen
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_Npair --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss npair --batch_mining npair --arch resnet50_frozen

python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_GenLifted --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss lifted --batch_mining lifted --arch resnet50_frozen
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_GenLifted --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss lifted --batch_mining lifted --arch resnet50_frozen
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_GenLifted --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss lifted --batch_mining lifted --arch resnet50_frozen
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_GenLifted --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss lifted --batch_mining lifted --arch resnet50_frozen
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_GenLifted --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss lifted --batch_mining lifted --arch resnet50_frozen

python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_ProxyNCA --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss proxynca --arch resnet50_frozen_normalize
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_ProxyNCA --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss proxynca --arch resnet50_frozen_normalize
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_ProxyNCA --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss proxynca --arch resnet50_frozen_normalize
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_ProxyNCA --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss proxynca --arch resnet50_frozen_normalize
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_ProxyNCA --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss proxynca --arch resnet50_frozen_normalize

python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_Histogram --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss histogram --arch resnet50_frozen_normalize --loss_histogram_nbins 65
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_Histogram --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss histogram --arch resnet50_frozen_normalize --loss_histogram_nbins 65
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_Histogram --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss histogram --arch resnet50_frozen_normalize --loss_histogram_nbins 65
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_Histogram --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss histogram --arch resnet50_frozen_normalize --loss_histogram_nbins 65
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_Histogram --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss histogram --arch resnet50_frozen_normalize --loss_histogram_nbins 65

python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_Contrastive --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss contrastive --batch_mining distance --arch resnet50_frozen_normalize
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_Contrastive --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss contrastive --batch_mining distance --arch resnet50_frozen_normalize
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_Contrastive --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss contrastive --batch_mining distance --arch resnet50_frozen_normalize
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_Contrastive --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss contrastive --batch_mining distance --arch resnet50_frozen_normalize
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_Contrastive --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss contrastive --batch_mining distance --arch resnet50_frozen_normalize

python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_SoftTriple --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss softtriplet --arch resnet50_frozen_normalize
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_SoftTriple --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss softtriplet --arch resnet50_frozen_normalize
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_SoftTriple --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss softtriplet --arch resnet50_frozen_normalize
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_SoftTriple --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss softtriplet --arch resnet50_frozen_normalize
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_SoftTriple --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss softtriplet --arch resnet50_frozen_normalize

python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_Angular --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss angular --batch_mining npair --arch resnet50_frozen
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_Angular --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss angular --batch_mining npair --arch resnet50_frozen
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_Angular --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss angular --batch_mining npair --arch resnet50_frozen
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_Angular --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss angular --batch_mining npair --arch resnet50_frozen
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_Angular --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss angular --batch_mining npair --arch resnet50_frozen

python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_ArcFace --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss arcface --arch resnet50_frozen_normalize
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_ArcFace --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss arcface --arch resnet50_frozen_normalize
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_ArcFace --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss arcface --arch resnet50_frozen_normalize
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_ArcFace --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss arcface --arch resnet50_frozen_normalize
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_ArcFace --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss arcface --arch resnet50_frozen_normalize

python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_Triplet_Random --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining random --arch resnet50_frozen_normalize
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_Triplet_Random --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining random --arch resnet50_frozen_normalize
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_Triplet_Random --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining random --arch resnet50_frozen_normalize
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_Triplet_Random --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining random --arch resnet50_frozen_normalize
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_Triplet_Random --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining random --arch resnet50_frozen_normalize

python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_Triplet_Semihard --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining semihard --arch resnet50_frozen_normalize
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_Triplet_Semihard --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining semihard --arch resnet50_frozen_normalize
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_Triplet_Semihard --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining semihard --arch resnet50_frozen_normalize
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_Triplet_Semihard --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining semihard --arch resnet50_frozen_normalize
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_Triplet_Semihard --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining semihard --arch resnet50_frozen_normalize

python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_Triplet_Softhard --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining softhard --arch resnet50_frozen_normalize
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_Triplet_Softhard --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining softhard --arch resnet50_frozen_normalize
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_Triplet_Softhard --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining softhard --arch resnet50_frozen_normalize
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_Triplet_Softhard --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining softhard --arch resnet50_frozen_normalize
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_Triplet_Softhard --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining softhard --arch resnet50_frozen_normalize

python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_Triplet_Distance --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining distance --arch resnet50_frozen_normalize
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_Triplet_Distance --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining distance --arch resnet50_frozen_normalize
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_Triplet_Distance --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining distance --arch resnet50_frozen_normalize
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_Triplet_Distance --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining distance --arch resnet50_frozen_normalize
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_Triplet_Distance --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining distance --arch resnet50_frozen_normalize

python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_Quadruplet_Distance --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss quadruplet --batch_mining distance --arch resnet50_frozen_normalize
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_Quadruplet_Distance --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss quadruplet --batch_mining distance --arch resnet50_frozen_normalize
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_Quadruplet_Distance --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss quadruplet --batch_mining distance --arch resnet50_frozen_normalize
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_Quadruplet_Distance --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss quadruplet --batch_mining distance --arch resnet50_frozen_normalize
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_Quadruplet_Distance --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss quadruplet --batch_mining distance --arch resnet50_frozen_normalize

python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_Margin_b06_Distance --loss_margin_beta 0.6 --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss margin --batch_mining distance --arch resnet50_frozen_normalize
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_Margin_b06_Distance --loss_margin_beta 0.6 --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss margin --batch_mining distance --arch resnet50_frozen_normalize
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_Margin_b06_Distance --loss_margin_beta 0.6 --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss margin --batch_mining distance --arch resnet50_frozen_normalize
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_Margin_b06_Distance --loss_margin_beta 0.6 --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss margin --batch_mining distance --arch resnet50_frozen_normalize
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_Margin_b06_Distance --loss_margin_beta 0.6 --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss margin --batch_mining distance --arch resnet50_frozen_normalize

python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_Margin_b12_Distance --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss margin --batch_mining distance --arch resnet50_frozen_normalize
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_Margin_b12_Distance --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss margin --batch_mining distance --arch resnet50_frozen_normalize
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_Margin_b12_Distance --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss margin --batch_mining distance --arch resnet50_frozen_normalize
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_Margin_b12_Distance --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss margin --batch_mining distance --arch resnet50_frozen_normalize
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_Margin_b12_Distance --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss margin --batch_mining distance --arch resnet50_frozen_normalize

python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_SNR_Distance --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss snr --batch_mining distance --arch resnet50_frozen_normalize
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_SNR_Distance --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss snr --batch_mining distance --arch resnet50_frozen_normalize
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_SNR_Distance --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss snr --batch_mining distance --arch resnet50_frozen_normalize
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_SNR_Distance --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss snr --batch_mining distance --arch resnet50_frozen_normalize
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_SNR_Distance --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss snr --batch_mining distance --arch resnet50_frozen_normalize

python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_MS --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss multisimilarity --arch resnet50_frozen_normalize
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_MS --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss multisimilarity --arch resnet50_frozen_normalize
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_MS --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss multisimilarity --arch resnet50_frozen_normalize
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_MS --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss multisimilarity --arch resnet50_frozen_normalize
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_MS --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss multisimilarity --arch resnet50_frozen_normalize

python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_Softmax --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss softmax --batch_mining distance --arch resnet50_frozen_normalize
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_Softmax --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss softmax --batch_mining distance --arch resnet50_frozen_normalize
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_Softmax --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss softmax --batch_mining distance --arch resnet50_frozen_normalize
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_Softmax --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss softmax --batch_mining distance --arch resnet50_frozen_normalize
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUB_Softmax --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss softmax --batch_mining distance --arch resnet50_frozen_normalize

### Spectrum-Regularized Ranking Losses
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUBreg_Contrastive --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss contrastive --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.4
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUBreg_Contrastive --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss contrastive --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.4
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUBreg_Contrastive --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss contrastive --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.4
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUBreg_Contrastive --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss contrastive --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.4
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUBreg_Contrastive --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss contrastive --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.4

python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUBreg_Margin_b06_Distance --loss_margin_beta 0.6 --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss margin --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.4
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUBreg_Margin_b06_Distance --loss_margin_beta 0.6 --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss margin --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.4
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUBreg_Margin_b06_Distance --loss_margin_beta 0.6 --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss margin --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.4
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUBreg_Margin_b06_Distance --loss_margin_beta 0.6 --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss margin --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.4
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUBreg_Margin_b06_Distance --loss_margin_beta 0.6 --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss margin --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.4

python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUBreg_Margin_b12_Distance_3 --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss margin --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.35
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUBreg_Margin_b12_Distance_3 --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss margin --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.35
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUBreg_Margin_b12_Distance_3 --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss margin --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.35
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUBreg_Margin_b12_Distance_3 --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss margin --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.35
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUBreg_Margin_b12_Distance_3 --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss margin --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.35

python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUBreg_Triplet_Distance_3 --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.4
python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUBreg_Triplet_Distance_3 --seed 1 --gpu 0 --bs 112
--samples_per_class 2 --loss triplet --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.4 python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUBreg_Triplet_Distance_3 --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.4 python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUBreg_Triplet_Distance_3 --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.4 python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUBreg_Triplet_Distance_3 --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.4 python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUBreg_SNR_Distance _3 --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss snr --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.3 python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUBreg_SNR_Distance _3 --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss snr --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.3 python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUBreg_SNR_Distance _3 --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss snr --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.3 python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUBreg_SNR_Distance _3 --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss snr --batch_mining 
rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.3 python main.py --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CUBreg_SNR_Distance _3 --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss snr --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.3 """============= Baseline Runs --- CARS196 ====================""" python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_Npair --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss npair --batch_mining npair --arch resnet50_frozen python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_Npair --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss npair --batch_mining npair --arch resnet50_frozen python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_Npair --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss npair --batch_mining npair --arch resnet50_frozen python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_Npair --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss npair --batch_mining npair --arch resnet50_frozen python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_Npair --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss npair --batch_mining npair --arch resnet50_frozen python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_GenLifted --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss lifted --batch_mining lifted --arch resnet50_frozen python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_GenLifted --seed 1 --gpu 0 
--bs 112 --samples_per_class 2 --loss lifted --batch_mining lifted --arch resnet50_frozen python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_GenLifted --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss lifted --batch_mining lifted --arch resnet50_frozen python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_GenLifted --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss lifted --batch_mining lifted --arch resnet50_frozen python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_GenLifted --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss lifted --batch_mining lifted --arch resnet50_frozen python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_ProxyNCA --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss proxynca --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_ProxyNCA --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss proxynca --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_ProxyNCA --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss proxynca --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_ProxyNCA --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss proxynca --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_ProxyNCA --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss proxynca --arch resnet50_frozen_normalize python main.py 
--dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_Histogram --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss histogram --arch resnet50_frozen_normalize --loss_histogram_nbins 65 python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_Histogram --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss histogram --arch resnet50_frozen_normalize --loss_histogram_nbins 65 python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_Histogram --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss histogram --arch resnet50_frozen_normalize --loss_histogram_nbins 65 python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_Histogram --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss histogram --arch resnet50_frozen_normalize --loss_histogram_nbins 65 python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_Histogram --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss histogram --arch resnet50_frozen_normalize --loss_histogram_nbins 65 python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_Contrastive --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss contrastive --batch_mining distance --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_Contrastive --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss contrastive --batch_mining distance --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_Contrastive --seed 2 --gpu 0 --bs 112 --samples_per_class 2 
--loss contrastive --batch_mining distance --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_Contrastive --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss contrastive --batch_mining distance --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_Contrastive --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss contrastive --batch_mining distance --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_SoftTriple --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss softtriplet --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_SoftTriple --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss softtriplet --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_SoftTriple --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss softtriplet --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_SoftTriple --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss softtriplet --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_SoftTriple --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss softtriplet --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_Angular --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss angular --batch_mining 
npair --arch resnet50_frozen python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_Angular --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss angular --batch_mining npair --arch resnet50_frozen python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_Angular --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss angular --batch_mining npair --arch resnet50_frozen python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_Angular --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss angular --batch_mining npair --arch resnet50_frozen python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_Angular --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss angular --batch_mining npair --arch resnet50_frozen python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_ArcFace --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss arcface --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_ArcFace --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss arcface --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_ArcFace --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss arcface --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_ArcFace --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss arcface --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 
--log_online --project RevisitDML --group CARS_ArcFace --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss arcface --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_Triplet_Random --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining random --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_Triplet_Random --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining random --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_Triplet_Random --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining random --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_Triplet_Random --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining random --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_Triplet_Random --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining random --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_Triplet_Semihard --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining semihard --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_Triplet_Semihard --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining semihard --arch resnet50_frozen_normalize python main.py --dataset 
cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_Triplet_Semihard --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining semihard --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_Triplet_Semihard --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining semihard --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_Triplet_Semihard --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining semihard --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_Triplet_Softhard --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining softhard --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_Triplet_Softhard --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining softhard --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_Triplet_Softhard --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining softhard --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_Triplet_Softhard --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining softhard --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_Triplet_Softhard --seed 4 --gpu 0 --bs 112 
--samples_per_class 2 --loss triplet --batch_mining softhard --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_Triplet_Distance --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining distance --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_Triplet_Distance --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining distance --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_Triplet_Distance --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining distance --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_Triplet_Distance --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining distance --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_Triplet_Distance --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining distance --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_Quadruplet_Distance --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss quadruplet --batch_mining distance --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_Quadruplet_Distance --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss quadruplet --batch_mining distance --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 
--source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_Quadruplet_Distance --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss quadruplet --batch_mining distance --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_Quadruplet_Distance --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss quadruplet --batch_mining distance --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_Quadruplet_Distance --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss quadruplet --batch_mining distance --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_Margin_b06_Distance --loss_margin_beta 0.6 --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss margin --batch_mining distance --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_Margin_b06_Distance --loss_margin_beta 0.6 --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss margin --batch_mining distance --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_Margin_b06_Distance --loss_margin_beta 0.6 --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss margin --batch_mining distance --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_Margin_b06_Distance --loss_margin_beta 0.6 --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss margin --batch_mining distance --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 
--log_online --project RevisitDML --group CARS_Margin_b06_Distance --loss_margin_beta 0.6 --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss margin --batch_mining distance --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_Margin_b12_Distance --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss margin --batch_mining distance --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_Margin_b12_Distance --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss margin --batch_mining distance --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_Margin_b12_Distance --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss margin --batch_mining distance --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_Margin_b12_Distance --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss margin --batch_mining distance --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_Margin_b12_Distance --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss margin --batch_mining distance --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_SNR_Distance --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss snr --batch_mining distance --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_SNR_Distance --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss snr 
--batch_mining distance --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_SNR_Distance --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss snr --batch_mining distance --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_SNR_Distance --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss snr --batch_mining distance --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_SNR_Distance --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss snr --batch_mining distance --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_MS --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss multisimilarity --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_MS --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss multisimilarity --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_MS --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss multisimilarity --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_MS --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss multisimilarity --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_MS --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss multisimilarity --arch resnet50_frozen_normalize python 
main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_Softmax --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss softmax --batch_mining distance --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_Softmax --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss softmax --batch_mining distance --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_Softmax --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss softmax --batch_mining distance --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_Softmax --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss softmax --batch_mining distance --arch resnet50_frozen_normalize python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARS_Softmax --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss softmax --batch_mining distance --arch resnet50_frozen_normalize ### Specturm-Regularized Ranking Losses python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARreg_Contrastive --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss contrastive --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.35 python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARreg_Contrastive --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss contrastive --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.35 python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project 
RevisitDML --group CARreg_Contrastive --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss contrastive --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.35
python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARreg_Contrastive --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss contrastive --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.35
python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARreg_Contrastive --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss contrastive --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.35
python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARreg_Margin_b06_Distance --loss_margin_beta 0.6 --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss margin --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.35
python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARreg_Margin_b06_Distance --loss_margin_beta 0.6 --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss margin --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.35
python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARreg_Margin_b06_Distance --loss_margin_beta 0.6 --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss margin --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.35
python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARreg_Margin_b06_Distance --loss_margin_beta 0.6 --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss margin --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.35
python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARreg_Margin_b06_Distance --loss_margin_beta 0.6 --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss margin --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.35
python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARreg_Margin_b12_Distance --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss margin --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.35
python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARreg_Margin_b12_Distance --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss margin --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.35
python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARreg_Margin_b12_Distance --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss margin --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.35
python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARreg_Margin_b12_Distance --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss margin --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.35
python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARreg_Margin_b12_Distance --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss margin --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.35
python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARreg_Triplet_Distance --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.35
python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARreg_Triplet_Distance --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.35
python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARreg_Triplet_Distance --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.35
python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARreg_Triplet_Distance --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.35
python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARreg_Triplet_Distance --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.35
python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARreg_SNR_Distance --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss snr --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.35
python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARreg_SNR_Distance --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss snr --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.35
python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARreg_SNR_Distance --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss snr --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.35
python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARreg_SNR_Distance --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss snr --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.35
python main.py --dataset cars196 --kernels 6 --source $datapath --n_epochs 150 --log_online --project RevisitDML --group CARreg_SNR_Distance --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss snr --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.35


"""============= Baseline Runs --- Online Products ====================="""
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_Npair --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss npair --batch_mining npair --arch resnet50_frozen
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_Npair --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss npair --batch_mining npair --arch resnet50_frozen
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_Npair --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss npair --batch_mining npair --arch resnet50_frozen
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_Npair --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss npair --batch_mining npair --arch resnet50_frozen
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_Npair --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss npair --batch_mining npair --arch resnet50_frozen
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_GenLifted --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss lifted --batch_mining lifted --arch resnet50_frozen
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_GenLifted --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss lifted --batch_mining lifted --arch resnet50_frozen
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_GenLifted --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss lifted --batch_mining lifted --arch resnet50_frozen
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_GenLifted --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss lifted --batch_mining lifted --arch resnet50_frozen
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_GenLifted --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss lifted --batch_mining lifted --arch resnet50_frozen
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_Histogram --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss histogram --arch resnet50_frozen_normalize --loss_histogram_nbins 11
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_Histogram --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss histogram --arch resnet50_frozen_normalize --loss_histogram_nbins 11
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_Histogram --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss histogram --arch resnet50_frozen_normalize --loss_histogram_nbins 11
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_Histogram --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss histogram --arch resnet50_frozen_normalize --loss_histogram_nbins 11
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_Histogram --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss histogram --arch resnet50_frozen_normalize --loss_histogram_nbins 11
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_Contrastive --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss contrastive --batch_mining distance --arch resnet50_frozen_normalize
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_Contrastive --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss contrastive --batch_mining distance --arch resnet50_frozen_normalize
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_Contrastive --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss contrastive --batch_mining distance --arch resnet50_frozen_normalize
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_Contrastive --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss contrastive --batch_mining distance --arch resnet50_frozen_normalize
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_Contrastive --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss contrastive --batch_mining distance --arch resnet50_frozen_normalize
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_Angular --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss angular --batch_mining npair --arch resnet50_frozen
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_Angular --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss angular --batch_mining npair --arch resnet50_frozen
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_Angular --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss angular --batch_mining npair --arch resnet50_frozen
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_Angular --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss angular --batch_mining npair --arch resnet50_frozen
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_Angular --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss angular --batch_mining npair --arch resnet50_frozen
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_ArcFace --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss arcface --arch resnet50_frozen_normalize
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_ArcFace --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss arcface --arch resnet50_frozen_normalize
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_ArcFace --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss arcface --arch resnet50_frozen_normalize
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_ArcFace --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss arcface --arch resnet50_frozen_normalize
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_ArcFace --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss arcface --arch resnet50_frozen_normalize
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_Triplet_Random --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining random --arch resnet50_frozen_normalize
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_Triplet_Random --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining random --arch resnet50_frozen_normalize
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_Triplet_Random --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining random --arch resnet50_frozen_normalize
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_Triplet_Random --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining random --arch resnet50_frozen_normalize
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_Triplet_Random --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining random --arch resnet50_frozen_normalize
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_Triplet_Semihard --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining semihard --arch resnet50_frozen_normalize
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_Triplet_Semihard --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining semihard --arch resnet50_frozen_normalize
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_Triplet_Semihard --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining semihard --arch resnet50_frozen_normalize
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_Triplet_Semihard --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining semihard --arch resnet50_frozen_normalize
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_Triplet_Semihard --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining semihard --arch resnet50_frozen_normalize
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_Triplet_Softhard --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining softhard --arch resnet50_frozen_normalize
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_Triplet_Softhard --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining softhard --arch resnet50_frozen_normalize
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_Triplet_Softhard --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining softhard --arch resnet50_frozen_normalize
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_Triplet_Softhard --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining softhard --arch resnet50_frozen_normalize
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_Triplet_Softhard --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining softhard --arch resnet50_frozen_normalize
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_Triplet_Distance --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining distance --arch resnet50_frozen_normalize
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_Triplet_Distance --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining distance --arch resnet50_frozen_normalize
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_Triplet_Distance --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining distance --arch resnet50_frozen_normalize
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_Triplet_Distance --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining distance --arch resnet50_frozen_normalize
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_Triplet_Distance --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining distance --arch resnet50_frozen_normalize
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_Quadruplet_Distance --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss quadruplet --batch_mining distance --arch resnet50_frozen_normalize
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_Quadruplet_Distance --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss quadruplet --batch_mining distance --arch resnet50_frozen_normalize
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_Quadruplet_Distance --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss quadruplet --batch_mining distance --arch resnet50_frozen_normalize
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_Quadruplet_Distance --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss quadruplet --batch_mining distance --arch resnet50_frozen_normalize
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_Quadruplet_Distance --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss quadruplet --batch_mining distance --arch resnet50_frozen_normalize
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_Margin_b06_Distance --loss_margin_beta 0.6 --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss margin --batch_mining distance --arch resnet50_frozen_normalize
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_Margin_b06_Distance --loss_margin_beta 0.6 --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss margin --batch_mining distance --arch resnet50_frozen_normalize
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_Margin_b06_Distance --loss_margin_beta 0.6 --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss margin --batch_mining distance --arch resnet50_frozen_normalize
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_Margin_b06_Distance --loss_margin_beta 0.6 --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss margin --batch_mining distance --arch resnet50_frozen_normalize
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_Margin_b06_Distance --loss_margin_beta 0.6 --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss margin --batch_mining distance --arch resnet50_frozen_normalize
python main.py --dataset online_products --kernels 2 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_Margin_b12_Distance --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss margin --batch_mining distance --arch resnet50_frozen_normalize
python main.py --dataset online_products --kernels 2 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_Margin_b12_Distance --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss margin --batch_mining distance --arch resnet50_frozen_normalize
python main.py --dataset online_products --kernels 2 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_Margin_b12_Distance --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss margin --batch_mining distance --arch resnet50_frozen_normalize
python main.py --dataset online_products --kernels 2 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_Margin_b12_Distance --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss margin --batch_mining distance --arch resnet50_frozen_normalize
python main.py --dataset online_products --kernels 2 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_Margin_b12_Distance --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss margin --batch_mining distance --arch resnet50_frozen_normalize
python main.py --dataset online_products --kernels 2 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_SNR_Distance --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss snr --batch_mining distance --arch resnet50_frozen_normalize
python main.py --dataset online_products --kernels 2 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_SNR_Distance --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss snr --batch_mining distance --arch resnet50_frozen_normalize
python main.py --dataset online_products --kernels 2 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_SNR_Distance --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss snr --batch_mining distance --arch resnet50_frozen_normalize
python main.py --dataset online_products --kernels 2 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_SNR_Distance --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss snr --batch_mining distance --arch resnet50_frozen_normalize
python main.py --dataset online_products --kernels 2 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_SNR_Distance --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss snr --batch_mining distance --arch resnet50_frozen_normalize
python main.py --dataset online_products --kernels 2 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_MS --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss multisimilarity --arch resnet50_frozen_normalize
python main.py --dataset online_products --kernels 2 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_MS --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss multisimilarity --arch resnet50_frozen_normalize
python main.py --dataset online_products --kernels 2 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_MS --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss multisimilarity --arch resnet50_frozen_normalize
python main.py --dataset online_products --kernels 2 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_MS --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss multisimilarity --arch resnet50_frozen_normalize
python main.py --dataset online_products --kernels 2 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_MS --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss multisimilarity --arch resnet50_frozen_normalize
python main.py --dataset online_products --kernels 2 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_Softmax --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss softmax --batch_mining distance --arch resnet50_frozen_normalize --loss_softmax_lr 0.002
python main.py --dataset online_products --kernels 2 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_Softmax --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss softmax --batch_mining distance --arch resnet50_frozen_normalize --loss_softmax_lr 0.002
python main.py --dataset online_products --kernels 2 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_Softmax --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss softmax --batch_mining distance --arch resnet50_frozen_normalize --loss_softmax_lr 0.002
python main.py --dataset online_products --kernels 2 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_Softmax --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss softmax --batch_mining distance --arch resnet50_frozen_normalize --loss_softmax_lr 0.002
python main.py --dataset online_products --kernels 2 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOP_Softmax --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss softmax --batch_mining distance --arch resnet50_frozen_normalize --loss_softmax_lr 0.002

### Spectrum-Regularized Ranking Losses
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOPreg_Contrastive --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss contrastive --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.15
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOPreg_Contrastive --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss contrastive --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.15
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOPreg_Contrastive --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss contrastive --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.15
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOPreg_Contrastive --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss contrastive --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.15
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOPreg_Contrastive --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss contrastive --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.15
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOPreg_Margin_b06_Distance --loss_margin_beta 0.6 --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss margin --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.15
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOPreg_Margin_b06_Distance --loss_margin_beta 0.6 --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss margin --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.15
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOPreg_Margin_b06_Distance --loss_margin_beta 0.6 --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss margin --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.15
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOPreg_Margin_b06_Distance --loss_margin_beta 0.6 --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss margin --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.15
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOPreg_Margin_b06_Distance --loss_margin_beta 0.6 --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss margin --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.15
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOPreg_Margin_b12_Distance --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss margin --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.15
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOPreg_Margin_b12_Distance --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss margin --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.15
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOPreg_Margin_b12_Distance --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss margin --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.15
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOPreg_Margin_b12_Distance --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss margin --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.15
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOPreg_Margin_b12_Distance --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss margin --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.15
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOPreg_Triplet_Distance --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.15
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOPreg_Triplet_Distance --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.15
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOPreg_Triplet_Distance --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.15
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOPreg_Triplet_Distance --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.15
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOPreg_Triplet_Distance --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss triplet --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.15
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOPreg_SNR_Distance --seed 0 --gpu 0 --bs 112 --samples_per_class 2 --loss snr --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.15
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOPreg_SNR_Distance --seed 1 --gpu 0 --bs 112 --samples_per_class 2 --loss snr --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.15
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOPreg_SNR_Distance --seed 2 --gpu 0 --bs 112 --samples_per_class 2 --loss snr --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.15
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOPreg_SNR_Distance --seed 3 --gpu 0 --bs 112 --samples_per_class 2 --loss snr --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.15
python main.py --dataset online_products --kernels 6 --source $datapath --n_epochs 100 --log_online --project RevisitDML --group SOPreg_SNR_Distance --seed 4 --gpu 0 --bs 112 --samples_per_class 2 --loss snr --batch_mining rho_distance --arch resnet50_frozen_normalize --miner_rho_distance_cp 0.15


================================================
FILE: architectures/__init__.py
================================================
import architectures.resnet50
import architectures.googlenet
import architectures.bninception

def select(arch, opt):
    # Note: the qualified module paths are used here; the modules are imported
    # as 'architectures.<name>', so a bare 'resnet50' would be unbound.
    if 'resnet50' in arch:
        return architectures.resnet50.Network(opt)
    if 'googlenet' in arch:
        return architectures.googlenet.Network(opt)
    if 'bninception' in arch:
        return architectures.bninception.Network(opt)


================================================
FILE: architectures/bninception.py
================================================
"""
The network architectures and weights are adapted and used from the great
https://github.com/Cadene/pretrained-models.pytorch.
"""
import torch, torch.nn as nn, torch.nn.functional as F
import pretrainedmodels as ptm

"""============================================================="""
class Network(torch.nn.Module):
    def __init__(self, opt, return_embed_dict=False):
        super(Network, self).__init__()

        self.pars  = opt
        self.model = ptm.__dict__['bninception'](num_classes=1000, pretrained='imagenet')
        self.model.last_linear = torch.nn.Linear(self.model.last_linear.in_features, opt.embed_dim)
        if '_he' in opt.arch:
            torch.nn.init.kaiming_normal_(self.model.last_linear.weight, mode='fan_out')
            torch.nn.init.constant_(self.model.last_linear.bias, 0)

        if 'frozen' in opt.arch:
            for module in filter(lambda m: type(m) == nn.BatchNorm2d, self.model.modules()):
                module.eval()
                module.train = lambda _: None

        self.return_embed_dict = return_embed_dict

        self.pool_base = torch.nn.AdaptiveAvgPool2d(1)
        self.pool_aux  = torch.nn.AdaptiveMaxPool2d(1) if 'double' in opt.arch else None

        self.name = opt.arch
        self.out_adjust = None

    def forward(self, x, warmup=False, **kwargs):
        x = self.model.features(x)
        y = self.pool_base(x)
        if self.pool_aux is not None:
            y += self.pool_aux(x)
        if warmup:
            y, x = y.detach(), x.detach()

        z = self.model.last_linear(y.view(len(x), -1))
        if 'normalize' in self.name:
            z = F.normalize(z, dim=-1)
        if self.out_adjust and not self.training:
            z = self.out_adjust(z)
        return z, (y, x)

    def functional_forward(self, x):
        pass


================================================
FILE: architectures/googlenet.py
================================================
"""
The network architectures and weights are adapted and used from the great
https://github.com/Cadene/pretrained-models.pytorch.
"""
import torch, torch.nn as nn
import torchvision.models as mod

"""============================================================="""
class Network(torch.nn.Module):
    def __init__(self, opt):
        super(Network, self).__init__()

        self.pars  = opt
        self.model = mod.googlenet(pretrained=True)

        self.model.last_linear = torch.nn.Linear(self.model.fc.in_features, opt.embed_dim)
        self.model.fc = self.model.last_linear

        self.name = opt.arch

    def forward(self, x):
        x = self.model(x)
        if not 'normalize' in self.pars.arch:
            return x
        return torch.nn.functional.normalize(x, dim=-1)


================================================
FILE: architectures/resnet50.py
================================================
"""
The network architectures and weights are adapted and used from the great
https://github.com/Cadene/pretrained-models.pytorch.
"""
import torch, torch.nn as nn
import pretrainedmodels as ptm

"""============================================================="""
class Network(torch.nn.Module):
    def __init__(self, opt):
        super(Network, self).__init__()

        self.pars  = opt
        self.model = ptm.__dict__['resnet50'](num_classes=1000, pretrained='imagenet' if not opt.not_pretrained else None)
        self.name = opt.arch

        if 'frozen' in opt.arch:
            for module in filter(lambda m: type(m) == nn.BatchNorm2d, self.model.modules()):
                module.eval()
                module.train = lambda _: None

        self.model.last_linear = torch.nn.Linear(self.model.last_linear.in_features, opt.embed_dim)

        self.layer_blocks = nn.ModuleList([self.model.layer1, self.model.layer2, self.model.layer3, self.model.layer4])
        self.out_adjust = None

    def forward(self, x, **kwargs):
        x = self.model.maxpool(self.model.relu(self.model.bn1(self.model.conv1(x))))
        for layerblock in self.layer_blocks:
            x = layerblock(x)
        no_avg_feat = x
        x = self.model.avgpool(x)
        enc_out = x = x.view(x.size(0), -1)

        x = self.model.last_linear(x)

        if 'normalize' in self.pars.arch:
            x = torch.nn.functional.normalize(x, dim=-1)
        # Check eval mode via self.training; 'not self.train' would test a bound
        # method, which is always truthy.
        if self.out_adjust and not self.training:
            x = self.out_adjust(x)
        return x, (enc_out, no_avg_feat)


================================================
FILE: batchminer/__init__.py
================================================
from batchminer import random_distance, intra_random
from batchminer import lifted, rho_distance, softhard, npair, parametric, random, semihard, distance

BATCHMINING_METHODS = {'random': random,
                       'semihard': semihard,
                       'softhard': softhard,
                       'distance': distance,
                       'rho_distance': rho_distance,
                       'npair': npair,
                       'parametric': parametric,
                       'lifted': lifted,
                       'random_distance': random_distance,
                       'intra_random': intra_random}

def select(batchminername, opt):
    #####
    if batchminername not in BATCHMINING_METHODS:
        raise NotImplementedError('Batchmining {} not available!'.format(batchminername))

    batchmine_lib = BATCHMINING_METHODS[batchminername]
    return batchmine_lib.BatchMiner(opt)
================================================ FILE: batchminer/distance.py ================================================ import numpy as np import torch, torch.nn as nn, torch.nn.functional as F import batchminer class BatchMiner(): def __init__(self, opt): self.par = opt self.lower_cutoff = opt.miner_distance_lower_cutoff self.upper_cutoff = opt.miner_distance_upper_cutoff self.name = 'distance' def __call__(self, batch, labels, tar_labels=None, return_distances=False, distances=None): if isinstance(labels, torch.Tensor): labels = labels.detach().cpu().numpy() bs, dim = batch.shape if distances is None: distances = self.pdist(batch.detach()).clamp(min=self.lower_cutoff) sel_d = distances.shape[-1] positives, negatives = [],[] labels_visited = [] anchors = [] tar_labels = labels if tar_labels is None else tar_labels for i in range(bs): neg = tar_labels!=labels[i]; pos = tar_labels==labels[i] anchors.append(i) q_d_inv = self.inverse_sphere_distances(dim, bs, distances[i], tar_labels, labels[i]) negatives.append(np.random.choice(sel_d,p=q_d_inv)) if np.sum(pos)>0: #Sample positives randomly if np.sum(pos)>1: pos[i] = 0 positives.append(np.random.choice(np.where(pos)[0])) #Sample negatives by distance sampled_triplets = [[a,p,n] for a,p,n in zip(anchors, positives, negatives)] if return_distances: return sampled_triplets, distances else: return sampled_triplets def inverse_sphere_distances(self, dim, bs, anchor_to_all_dists, labels, anchor_label): dists = anchor_to_all_dists #negated log-distribution of distances of unit sphere in dimension log_q_d_inv = ((2.0 - float(dim)) * torch.log(dists) - (float(dim-3) / 2) * torch.log(1.0 - 0.25 * (dists.pow(2)))) log_q_d_inv[np.where(labels==anchor_label)[0]] = 0 q_d_inv = torch.exp(log_q_d_inv - torch.max(log_q_d_inv)) # - max(log) for stability q_d_inv[np.where(labels==anchor_label)[0]] = 0 ### NOTE: Cutting of values with high distances made the results slightly worse. 
It can also lead to # errors where there are no available negatives (for high samples_per_class cases). # q_d_inv[np.where(dists.detach().cpu().numpy()>self.upper_cutoff)[0]] = 0 q_d_inv = q_d_inv/q_d_inv.sum() return q_d_inv.detach().cpu().numpy() def pdist(self, A): prod = torch.mm(A, A.t()) norm = prod.diag().unsqueeze(1).expand_as(prod) res = (norm + norm.t() - 2 * prod).clamp(min = 0) return res.sqrt() ================================================ FILE: batchminer/intra_random.py ================================================ import numpy as np, torch import itertools as it import random class BatchMiner(): def __init__(self, opt): self.par = opt self.name = 'random' def __call__(self, batch, labels): if isinstance(labels, torch.Tensor): labels = labels.detach().cpu().numpy() unique_classes = np.unique(labels) indices = np.arange(len(batch)) class_dict = {i:indices[labels==i] for i in unique_classes} sampled_triplets = [] for cls in np.random.choice(list(class_dict.keys()), len(labels), replace=True): a,p,n = np.random.choice(class_dict[cls], 3, replace=True) sampled_triplets.append((a,p,n)) return sampled_triplets ================================================ FILE: batchminer/lifted.py ================================================ import numpy as np, torch class BatchMiner(): def __init__(self, opt): self.par = opt self.name = 'lifted' def __call__(self, batch, labels): if isinstance(labels, torch.Tensor): labels = labels.detach().cpu().numpy() ### anchors, positives, negatives = [], [], [] list(range(len(batch))) for i in range(len(batch)): anchor = i pos = labels==labels[anchor] ### if np.sum(pos)>1: anchors.append(anchor) positive_set = np.where(pos)[0] positive_set = positive_set[positive_set!=anchor] positives.append(positive_set) ### negatives = [] for anchor,positive_set in zip(anchors, positives): neg_idxs = [i for i in range(len(batch)) if i not in [anchor]+list(positive_set)] negative_set = np.arange(len(batch))[neg_idxs] 
negatives.append(negative_set) return anchors, positives, negatives ================================================ FILE: batchminer/npair.py ================================================ import numpy as np, torch class BatchMiner(): def __init__(self, opt): self.par = opt self.name = 'npair' def __call__(self, batch, labels): if isinstance(labels, torch.Tensor): labels = labels.detach().cpu().numpy() anchors, positives, negatives = [],[],[] for i in range(len(batch)): anchor = i pos = labels==labels[anchor] if np.sum(pos)>1: anchors.append(anchor) avail_positive = np.where(pos)[0] avail_positive = avail_positive[avail_positive!=anchor] positive = np.random.choice(avail_positive) positives.append(positive) ### negatives = [] for anchor,positive in zip(anchors, positives): neg_idxs = [i for i in range(len(batch)) if i not in [anchor, positive] and labels[i] != labels[anchor]] # neg_idxs = [i for i in range(len(batch)) if i not in [anchor, positive]] negative_set = np.arange(len(batch))[neg_idxs] negatives.append(negative_set) return anchors, positives, negatives ================================================ FILE: batchminer/parametric.py ================================================ import numpy as np, torch class BatchMiner(): def __init__(self, opt): self.par = opt self.mode = opt.miner_parametric_mode self.n_support = opt.miner_parametric_n_support self.support_lim = opt.miner_parametric_support_lim self.name = 'parametric' ### self.set_sample_distr() def __call__(self, batch, labels): bs = batch.shape[0] sample_distr = self.sample_distr if isinstance(labels, torch.Tensor): labels = labels.detach().cpu().numpy() ### distances = self.pdist(batch.detach()) p_assigns = np.sum((distances.cpu().numpy().reshape(-1)>self.support[1:-1].reshape(-1,1)).T,axis=1).reshape(distances.shape) outside_support_lim = (distances.cpu().numpy().reshape(-1)<self.support_lim[0]) | (distances.cpu().numpy().reshape(-1)>self.support_lim[1]) outside_support_lim = outside_support_lim.reshape(distances.shape) sample_ps =
sample_distr[p_assigns] sample_ps[outside_support_lim] = 0 ### anchors, labels_visited = [], [] positives, negatives = [],[] ### for i in range(bs): neg = labels!=labels[i]; pos = labels==labels[i] if np.sum(pos)>1: anchors.append(i) #Sample positives randomly pos[i] = 0 positives.append(np.random.choice(np.where(pos)[0])) #Sample negatives by distance sample_p = sample_ps[i][neg] sample_p = sample_p/sample_p.sum() negatives.append(np.random.choice(np.arange(bs)[neg],p=sample_p)) sampled_triplets = [[a,p,n] for a,p,n in zip(anchors, positives, negatives)] return sampled_triplets def pdist(self, A, eps=1e-4): prod = torch.mm(A, A.t()) norm = prod.diag().unsqueeze(1).expand_as(prod) res = (norm + norm.t() - 2 * prod).clamp(min = 0) return res.clamp(min = eps).sqrt() def set_sample_distr(self): self.support = np.linspace(self.support_lim[0], self.support_lim[1], self.n_support) if self.mode == 'uniform': self.sample_distr = np.array([1.] * (self.n_support-1)) if self.mode == 'hards': self.sample_distr = self.support.copy() self.sample_distr[self.support<=0.5] = 1 self.sample_distr[self.support>0.5] = 0 if self.mode == 'semihards': self.sample_distr = self.support.copy() self.sample_distr[(self.support<=0.7) * (self.support>=0.3)] = 1 self.sample_distr[(self.support<0.3) | (self.support>0.7)] = 0 if self.mode == 'veryhards': self.sample_distr = self.support.copy() self.sample_distr[self.support<=0.3] = 1 self.sample_distr[self.support>0.3] = 0 self.sample_distr = np.clip(self.sample_distr, 1e-15, 1) self.sample_distr = self.sample_distr/self.sample_distr.sum() ================================================ FILE: batchminer/random.py ================================================ import numpy as np, torch import itertools as it import random class BatchMiner(): def __init__(self, opt): self.par = opt self.name = 'random' def __call__(self, batch, labels): if isinstance(labels, torch.Tensor): labels = labels.detach().cpu().numpy()
unique_classes = np.unique(labels) indices = np.arange(len(batch)) class_dict = {i:indices[labels==i] for i in unique_classes} sampled_triplets = [list(it.product([x],[x],[y for y in unique_classes if x!=y])) for x in unique_classes] sampled_triplets = [x for y in sampled_triplets for x in y] sampled_triplets = [[x for x in list(it.product(*[class_dict[j] for j in i])) if x[0]!=x[1]] for i in sampled_triplets] sampled_triplets = [x for y in sampled_triplets for x in y] #NOTE: The number of possible triplets is given by #unique_classes*(2*(samples_per_class-1)!)*(#unique_classes-1)*samples_per_class sampled_triplets = random.sample(sampled_triplets, batch.shape[0]) return sampled_triplets ================================================ FILE: batchminer/random_distance.py ================================================ import numpy as np, torch class BatchMiner(): def __init__(self, opt): self.par = opt self.lower_cutoff = opt.miner_distance_lower_cutoff self.upper_cutoff = opt.miner_distance_upper_cutoff self.name = 'distance' def __call__(self, batch, labels): if isinstance(labels, torch.Tensor): labels = labels.detach().cpu().numpy() labels = labels[np.random.choice(len(labels), len(labels), replace=False)] bs = batch.shape[0] distances = self.pdist(batch.detach()).clamp(min=self.lower_cutoff) positives, negatives = [],[] labels_visited = [] anchors = [] for i in range(bs): neg = labels!=labels[i]; pos = labels==labels[i] if np.sum(pos)>1: anchors.append(i) q_d_inv = self.inverse_sphere_distances(batch, distances[i], labels, labels[i]) #Sample positives randomly pos[i] = 0 positives.append(np.random.choice(np.where(pos)[0])) #Sample negatives by distance negatives.append(np.random.choice(bs,p=q_d_inv)) sampled_triplets = [[a,p,n] for a,p,n in zip(anchors, positives, negatives)] return sampled_triplets def inverse_sphere_distances(self, batch, anchor_to_all_dists, labels, anchor_label): dists = anchor_to_all_dists bs,dim = len(dists),batch.shape[-1] #negated 
log-distribution of distances of unit sphere in dimension log_q_d_inv = ((2.0 - float(dim)) * torch.log(dists) - (float(dim-3) / 2) * torch.log(1.0 - 0.25 * (dists.pow(2)))) log_q_d_inv[np.where(labels==anchor_label)[0]] = 0 q_d_inv = torch.exp(log_q_d_inv - torch.max(log_q_d_inv)) # - max(log) for stability q_d_inv[np.where(labels==anchor_label)[0]] = 0 ### NOTE: Cutting of values with high distances made the results slightly worse. It can also lead to # errors where there are no available negatives (for high samples_per_class cases). # q_d_inv[np.where(dists.detach().cpu().numpy()>self.upper_cutoff)[0]] = 0 q_d_inv = q_d_inv/q_d_inv.sum() return q_d_inv.detach().cpu().numpy() def pdist(self, A): prod = torch.mm(A, A.t()) norm = prod.diag().unsqueeze(1).expand_as(prod) res = (norm + norm.t() - 2 * prod).clamp(min = 0) return res.sqrt() ================================================ FILE: batchminer/rho_distance.py ================================================ import numpy as np, torch class BatchMiner(): def __init__(self, opt): self.par = opt self.lower_cutoff = opt.miner_rho_distance_lower_cutoff self.upper_cutoff = opt.miner_rho_distance_upper_cutoff self.contrastive_p = opt.miner_rho_distance_cp self.name = 'rho_distance' def __call__(self, batch, labels, return_distances=False): if isinstance(labels, torch.Tensor): labels = labels.detach().cpu().numpy() bs = batch.shape[0] distances = self.pdist(batch.detach()).clamp(min=self.lower_cutoff) positives, negatives = [],[] labels_visited = [] anchors = [] for i in range(bs): neg = labels!=labels[i]; pos = labels==labels[i] use_contr = np.random.choice(2, p=[1-self.contrastive_p, self.contrastive_p]) if np.sum(pos)>1: anchors.append(i) if use_contr: positives.append(i) #Sample negatives by distance pos[i] = 0 negatives.append(np.random.choice(np.where(pos)[0])) else: q_d_inv = self.inverse_sphere_distances(batch, distances[i], labels, labels[i]) #Sample positives randomly pos[i] = 0 
positives.append(np.random.choice(np.where(pos)[0])) #Sample negatives by distance negatives.append(np.random.choice(bs,p=q_d_inv)) sampled_triplets = [[a,p,n] for a,p,n in zip(anchors, positives, negatives)] self.push_triplets = np.sum([m[1]==m[2] for m in labels[sampled_triplets]]) if return_distances: return sampled_triplets, distances else: return sampled_triplets def inverse_sphere_distances(self, batch, anchor_to_all_dists, labels, anchor_label): dists = anchor_to_all_dists bs,dim = len(dists),batch.shape[-1] #negated log-distribution of distances of unit sphere in dimension log_q_d_inv = ((2.0 - float(dim)) * torch.log(dists) - (float(dim-3) / 2) * torch.log(1.0 - 0.25 * (dists.pow(2)))) log_q_d_inv[np.where(labels==anchor_label)[0]] = 0 q_d_inv = torch.exp(log_q_d_inv - torch.max(log_q_d_inv)) # - max(log) for stability q_d_inv[np.where(labels==anchor_label)[0]] = 0 ### NOTE: Cutting of values with high distances made the results slightly worse. It can also lead to # errors where there are no available negatives (for high samples_per_class cases). 
# q_d_inv[np.where(dists.detach().cpu().numpy()>self.upper_cutoff)[0]] = 0 q_d_inv = q_d_inv/q_d_inv.sum() return q_d_inv.detach().cpu().numpy() def pdist(self, A, eps=1e-4): prod = torch.mm(A, A.t()) norm = prod.diag().unsqueeze(1).expand_as(prod) res = (norm + norm.t() - 2 * prod).clamp(min = 0) return res.clamp(min = eps).sqrt() ================================================ FILE: batchminer/semihard.py ================================================ import numpy as np, torch class BatchMiner(): def __init__(self, opt): self.par = opt self.name = 'semihard' self.margin = vars(opt)['loss_'+opt.loss+'_margin'] def __call__(self, batch, labels, return_distances=False): if isinstance(labels, torch.Tensor): labels = labels.detach().cpu().numpy() bs = batch.size(0) #Return distance matrix for all elements in batch (BSxBS) distances = self.pdist(batch.detach()).detach().cpu().numpy() positives, negatives = [], [] anchors = [] for i in range(bs): l, d = labels[i], distances[i] neg = labels!=l; pos = labels==l anchors.append(i) pos[i] = 0 p = np.random.choice(np.where(pos)[0]) positives.append(p) #Find semihard negatives: farther away than the positive, but still within the margin neg_mask = np.logical_and(neg,d>d[p]) neg_mask = np.logical_and(neg_mask,d<self.margin+d[p]) if neg_mask.sum()>0: negatives.append(np.random.choice(np.where(neg_mask)[0])) else: negatives.append(np.random.choice(np.where(neg)[0])) sampled_triplets = [[a, p, n] for a, p, n in zip(anchors, positives, negatives)] if return_distances: return sampled_triplets, distances else: return sampled_triplets def pdist(self, A): prod = torch.mm(A, A.t()) norm = prod.diag().unsqueeze(1).expand_as(prod) res = (norm + norm.t() - 2 * prod).clamp(min = 0) return res.clamp(min = 0).sqrt() ================================================ FILE: batchminer/softhard.py ================================================ import numpy as np, torch class BatchMiner(): def __init__(self, opt): self.par = opt self.name = 'softhard' def __call__(self, batch, labels, return_distances=False):
if isinstance(labels, torch.Tensor): labels = labels.detach().cpu().numpy() bs = batch.size(0) #Return distance matrix for all elements in batch (BSxBS) distances = self.pdist(batch.detach()).detach().cpu().numpy() positives, negatives = [], [] anchors = [] for i in range(bs): l, d = labels[i], distances[i] neg = labels!=l; pos = labels==l if np.sum(pos)>1: anchors.append(i) #1 for batch elements with label l, 0 for the current anchor pos[i] = False #Find negatives that violate the triplet constraint in a hard fashion neg_mask = np.logical_and(neg,d<d[np.where(pos)[0]].max()) #Find positives that lie further away than the closest negative pos_mask = np.logical_and(pos,d>d[np.where(neg)[0]].min()) if pos_mask.sum()>0: positives.append(np.random.choice(np.where(pos_mask)[0])) else: positives.append(np.random.choice(np.where(pos)[0])) if neg_mask.sum()>0: negatives.append(np.random.choice(np.where(neg_mask)[0])) else: negatives.append(np.random.choice(np.where(neg)[0])) sampled_triplets = [[a, p, n] for a, p, n in zip(anchors, positives, negatives)] if return_distances: return sampled_triplets, distances else: return sampled_triplets def pdist(self, A): prod = torch.mm(A, A.t()) norm = prod.diag().unsqueeze(1).expand_as(prod) res = (norm + norm.t() - 2 * prod).clamp(min = 0) return res.clamp(min = 0).sqrt() ================================================ FILE: criteria/__init__.py ================================================ ### Standard DML criteria from criteria import triplet, margin, proxynca, npair from criteria import lifted, contrastive, softmax from criteria import angular, snr, histogram, arcface from criteria import softtriplet, multisimilarity, quadruplet ### Non-Standard Criteria from criteria import adversarial_separation ### Basic Libs import copy """=================================================================================================""" def select(loss, opt, to_optim, batchminer=None): ##### losses = {'triplet': triplet, 'margin':margin, 'proxynca':proxynca, 'npair':npair, 'angular':angular, 'contrastive':contrastive, 'lifted':lifted, 'snr':snr,
'multisimilarity':multisimilarity, 'histogram':histogram, 'softmax':softmax, 'softtriplet':softtriplet, 'arcface':arcface, 'quadruplet':quadruplet, 'adversarial_separation':adversarial_separation} if loss not in losses: raise NotImplementedError('Loss {} not implemented!'.format(loss)) loss_lib = losses[loss] if loss_lib.REQUIRES_BATCHMINER: if batchminer is None: raise Exception('Loss {} requires one of the following batch mining methods: {}'.format(loss, loss_lib.ALLOWED_MINING_OPS)) else: if batchminer.name not in loss_lib.ALLOWED_MINING_OPS: raise Exception('{}-mining not allowed for {}-loss!'.format(batchminer.name, loss)) loss_par_dict = {'opt':opt} if loss_lib.REQUIRES_BATCHMINER: loss_par_dict['batchminer'] = batchminer criterion = loss_lib.Criterion(**loss_par_dict) if loss_lib.REQUIRES_OPTIM: if hasattr(criterion,'optim_dict_list') and criterion.optim_dict_list is not None: to_optim += criterion.optim_dict_list else: to_optim += [{'params':criterion.parameters(), 'lr':criterion.lr}] return criterion, to_optim ================================================ FILE: criteria/adversarial_separation.py ================================================ import numpy as np import torch, torch.nn as nn, torch.nn.functional as F import batchminer """=================================================================================================""" ALLOWED_MINING_OPS = list(batchminer.BATCHMINING_METHODS.keys()) REQUIRES_BATCHMINER = False REQUIRES_OPTIM = True ### MarginLoss with trainable class separation margin beta. Runs on Mini-batches as well. class Criterion(torch.nn.Module): def __init__(self, opt): """ Args: margin: Triplet Margin. nu: Regularisation Parameter for beta values if they are learned. beta: Class-Margin values. n_classes: Number of different classes during training. 
""" super().__init__() #### self.embed_dim = opt.embed_dim self.proj_dim = opt.diva_decorrnet_dim self.directions = opt.diva_decorrelations self.weights = opt.diva_rho_decorrelation self.name = 'adversarial_separation' #Projection network self.regressors = nn.ModuleDict() for direction in self.directions: self.regressors[direction] = torch.nn.Sequential(torch.nn.Linear(self.embed_dim, self.proj_dim), torch.nn.ReLU(), torch.nn.Linear(self.proj_dim, self.embed_dim)).to(torch.float).to(opt.device) #Learning Rate for Projection Network self.lr = opt.diva_decorrnet_lr #### self.ALLOWED_MINING_OPS = ALLOWED_MINING_OPS self.REQUIRES_BATCHMINER = REQUIRES_BATCHMINER self.REQUIRES_OPTIM = REQUIRES_OPTIM def forward(self, feature_dict): #Apply gradient reversal on input embeddings. adj_feature_dict = {key:torch.nn.functional.normalize(grad_reverse(features),dim=-1) for key, features in feature_dict.items()} #Project one embedding to the space of the other (with normalization), then compute the correlation. sim_loss = 0 for weight, direction in zip(self.weights, self.directions): source, target = direction.split('-') sim_loss += -1.*weight*torch.mean(torch.mean((adj_feature_dict[target]*torch.nn.functional.normalize(self.regressors[direction](adj_feature_dict[source]),dim=-1))**2,dim=-1)) return sim_loss ### Gradient Reversal Layer class GradRev(torch.autograd.Function): """ Implements an autograd class to flip gradients during backward pass. """ def forward(self, x): """ Container which applies a simple identity function. Input: x: any torch tensor input. """ return x.view_as(x) def backward(self, grad_output): """ Container to reverse gradient signal during backward pass. Input: grad_output: any computed gradient. """ return (grad_output * -1.) ### Gradient reverse function def grad_reverse(x): """ Applies gradient reversal on input. Input: x: any torch tensor input. 
""" return GradRev()(x) ================================================ FILE: criteria/angular.py ================================================ import numpy as np import torch, torch.nn as nn, torch.nn.functional as F import batchminer """=================================================================================================""" ALLOWED_MINING_OPS = ['npair'] REQUIRES_BATCHMINER = True REQUIRES_OPTIM = False class Criterion(torch.nn.Module): def __init__(self, opt, batchminer): super(Criterion, self).__init__() self.tan_angular_margin = np.tan(np.pi/180*opt.loss_angular_alpha) self.lam = opt.loss_angular_npair_ang_weight self.l2_weight = opt.loss_angular_npair_l2 self.batchminer = batchminer self.name = 'angular' #### self.ALLOWED_MINING_OPS = ALLOWED_MINING_OPS self.REQUIRES_BATCHMINER = REQUIRES_BATCHMINER self.REQUIRES_OPTIM = REQUIRES_OPTIM def forward(self, batch, labels, **kwargs): ####NOTE: Normalize Angular Loss, but dont normalize npair loss! anchors, positives, negatives = self.batchminer(batch, labels) anchors, positives, negatives = batch[anchors], batch[positives], batch[negatives] n_anchors, n_positives, n_negatives = F.normalize(anchors, dim=1), F.normalize(positives, dim=1), F.normalize(negatives, dim=-1) is_term1 = 4*self.tan_angular_margin**2*(n_anchors + n_positives)[:,None,:].bmm(n_negatives.permute(0,2,1)) is_term2 = 2*(1+self.tan_angular_margin**2)*n_anchors[:,None,:].bmm(n_positives[:,None,:].permute(0,2,1)) is_term1 = is_term1.view(is_term1.shape[0], is_term1.shape[-1]) is_term2 = is_term2.view(-1, 1) inner_sum_ang = is_term1 - is_term2 angular_loss = torch.mean(torch.log(torch.sum(torch.exp(inner_sum_ang), dim=1) + 1)) inner_sum_npair = anchors[:,None,:].bmm((negatives - positives[:,None,:]).permute(0,2,1)) inner_sum_npair = inner_sum_npair.view(inner_sum_npair.shape[0], inner_sum_npair.shape[-1]) npair_loss = torch.mean(torch.log(torch.sum(torch.exp(inner_sum_npair.clamp(max=50,min=-50)), dim=1) + 1)) loss = npair_loss + 
self.lam*angular_loss + self.l2_weight*torch.mean(torch.norm(batch, p=2, dim=1)) return loss ================================================ FILE: criteria/arcface.py ================================================ import numpy as np import torch, torch.nn as nn, torch.nn.functional as F import batchminer """=================================================================================================""" ALLOWED_MINING_OPS = None REQUIRES_BATCHMINER = False REQUIRES_OPTIM = True ### This implementation follows the pseudocode provided in the original paper. class Criterion(torch.nn.Module): def __init__(self, opt): super(Criterion, self).__init__() self.par = opt #### self.angular_margin = opt.loss_arcface_angular_margin self.feature_scale = opt.loss_arcface_feature_scale self.class_map = torch.nn.Parameter(torch.Tensor(opt.n_classes, opt.embed_dim)) stdv = 1. / np.sqrt(self.class_map.size(1)) self.class_map.data.uniform_(-stdv, stdv) self.name = 'arcface' self.lr = opt.loss_arcface_lr #### self.ALLOWED_MINING_OPS = ALLOWED_MINING_OPS self.REQUIRES_BATCHMINER = REQUIRES_BATCHMINER self.REQUIRES_OPTIM = REQUIRES_OPTIM def forward(self, batch, labels, **kwargs): bs, labels = len(batch), labels.to(self.par.device) class_map = torch.nn.functional.normalize(self.class_map, dim=1) #Note that the similarity becomes the cosine for normalized embeddings. Denoted as 'fc7' in the paper pseudocode. 
cos_similarity = batch.mm(class_map.T).clamp(min=1e-10, max=1-1e-10) pick = torch.zeros(bs, self.par.n_classes).bool().to(self.par.device) pick[torch.arange(bs), labels] = 1 original_target_logit = cos_similarity[pick] theta = torch.acos(original_target_logit) marginal_target_logit = torch.cos(theta + self.angular_margin) class_pred = self.feature_scale * (cos_similarity + pick * (marginal_target_logit-original_target_logit).unsqueeze(1)) loss = torch.nn.CrossEntropyLoss()(class_pred, labels) return loss ================================================ FILE: criteria/contrastive.py ================================================ import numpy as np import torch, torch.nn as nn, torch.nn.functional as F import batchminer """=================================================================================================""" ALLOWED_MINING_OPS = list(batchminer.BATCHMINING_METHODS.keys()) REQUIRES_BATCHMINER = True REQUIRES_OPTIM = False class Criterion(torch.nn.Module): def __init__(self, opt, batchminer): super(Criterion, self).__init__() self.pos_margin = opt.loss_contrastive_pos_margin self.neg_margin = opt.loss_contrastive_neg_margin self.batchminer = batchminer self.name = 'contrastive' #### self.ALLOWED_MINING_OPS = ALLOWED_MINING_OPS self.REQUIRES_BATCHMINER = REQUIRES_BATCHMINER self.REQUIRES_OPTIM = REQUIRES_OPTIM def forward(self, batch, labels, **kwargs): sampled_triplets = self.batchminer(batch, labels) anchors = [triplet[0] for triplet in sampled_triplets] positives = [triplet[1] for triplet in sampled_triplets] negatives = [triplet[2] for triplet in sampled_triplets] pos_dists = torch.mean(F.relu(nn.PairwiseDistance(p=2)(batch[anchors,:], batch[positives,:]) - self.pos_margin)) neg_dists = torch.mean(F.relu(self.neg_margin - nn.PairwiseDistance(p=2)(batch[anchors,:], batch[negatives,:]))) loss = pos_dists + neg_dists return loss ================================================ FILE: criteria/histogram.py ================================================ 
import numpy as np import torch, torch.nn as nn, torch.nn.functional as F import batchminer """=================================================================================================""" ALLOWED_MINING_OPS = None REQUIRES_BATCHMINER = False REQUIRES_OPTIM = False #NOTE: This implementation follows: https://github.com/valerystrizh/pytorch-histogram-loss class Criterion(torch.nn.Module): def __init__(self, opt): """ Args: opt: Namespace containing all relevant training hyperparameters. """ super(Criterion, self).__init__() self.par = opt self.nbins = opt.loss_histogram_nbins self.bin_width = 2/(self.nbins - 1) # We require numpy and torch versions of the support, as parts of the computation require numpy. self.support = np.linspace(-1,1,self.nbins).reshape(-1,1) self.support_torch = torch.linspace(-1,1,self.nbins).reshape(-1,1).to(opt.device) self.name = 'histogram' #### self.ALLOWED_MINING_OPS = ALLOWED_MINING_OPS self.REQUIRES_BATCHMINER = REQUIRES_BATCHMINER self.REQUIRES_OPTIM = REQUIRES_OPTIM def forward(self, batch, labels, **kwargs): #The original paper utilizes similarities instead of distances. similarity = batch.mm(batch.T) bs = labels.size()[0] ### We create an equality matrix for labels occurring in the batch label_eqs = (labels.repeat(bs, 1) == labels.view(-1, 1).repeat(1, bs)) ### Because the similarity matrix is symmetric, we will only utilise the upper triangular. ### These values are indexed by sim_inds sim_inds = torch.triu(torch.ones(similarity.size()), 1).bool().to(self.par.device) ### For the upper triangular similarity matrix, we want to know where our positives/anchors and negatives are: pos_inds = label_eqs[sim_inds].repeat(self.nbins, 1) neg_inds = ~label_eqs[sim_inds].repeat(self.nbins, 1) ### n_pos = pos_inds[0].sum() n_neg = neg_inds[0].sum() ### Extract upper triangular from the similarity matrix. (produces a one-dim vector) unique_sim = similarity[sim_inds].view(1, -1) ### We broadcast this vector to each histogram bin.
Each bin entry requires a different summation in self.histogram() unique_sim_rep = unique_sim.repeat(self.nbins, 1) ### This assigns bin-values for float-similarities. The conversion to numpy is important to avoid rounding errors in torch. assigned_bin_values = ((unique_sim_rep.detach().cpu().numpy() + 1) / self.bin_width).astype(int) * self.bin_width - 1 ### We now compute the histogram over distances hist_pos_sim = self.histogram(unique_sim_rep, assigned_bin_values, pos_inds, n_pos) hist_neg_sim = self.histogram(unique_sim_rep, assigned_bin_values, neg_inds, n_neg) ### Compute the CDF for the positive similarity histogram hist_pos_rep = hist_pos_sim.view(-1, 1).repeat(1, hist_pos_sim.size()[0]) hist_pos_inds = torch.tril(torch.ones(hist_pos_rep.size()), -1).bool() hist_pos_rep[hist_pos_inds] = 0 hist_pos_cdf = hist_pos_rep.sum(0) loss = torch.sum(hist_neg_sim * hist_pos_cdf) return loss def histogram(self, unique_sim_rep, assigned_bin_values, idxs, n_elem): """ Compute the histogram over similarities. Args: unique_sim_rep: torch tensor of shape nbins x n_unique_neg_similarities. assigned_bin_values: Bin value for each similarity value in unique_sim_rep. idxs: positive/negative entry indices in unique_sim_rep n_elem: number of elements in unique_sim_rep. """ # Cloning is required because we change the similarity matrix in-place, but need it for the # positive AND negative histogram. Note that clone() allows for backprop. usr = unique_sim_rep.clone() # For each bin (and its lower neighbour bin) we find the distance values that belong. 
indsa = torch.tensor((assigned_bin_values==(self.support-self.bin_width) ) & idxs.detach().cpu().numpy()) indsb = torch.tensor((assigned_bin_values==self.support) & idxs.detach().cpu().numpy()) # Set all irrelevant similarities to 0 usr[~(indsb|indsa)]=0 # usr[indsa] = (usr - self.support_torch + self.bin_width)[indsa] / self.bin_width usr[indsb] = (-usr + self.support_torch + self.bin_width)[indsb] / self.bin_width return usr.sum(1)/n_elem ================================================ FILE: criteria/lifted.py ================================================ import numpy as np import torch, torch.nn as nn, torch.nn.functional as F import batchminer """=================================================================================================""" ALLOWED_MINING_OPS = ['lifted'] REQUIRES_BATCHMINER = True REQUIRES_OPTIM = False class Criterion(torch.nn.Module): def __init__(self, opt, batchminer): super(Criterion, self).__init__() self.margin = opt.loss_lifted_neg_margin self.l2_weight = opt.loss_lifted_l2 self.batchminer = batchminer self.name = 'lifted' #### self.ALLOWED_MINING_OPS = ALLOWED_MINING_OPS self.REQUIRES_BATCHMINER = REQUIRES_BATCHMINER self.REQUIRES_OPTIM = REQUIRES_OPTIM def forward(self, batch, labels, **kwargs): anchors, positives, negatives = self.batchminer(batch, labels) loss = [] for anchor, positive_set, negative_set in zip(anchors, positives, negatives): anchor, positive_set, negative_set = batch[anchor, :].view(1,-1), batch[positive_set, :].view(1,len(positive_set),-1), batch[negative_set, :].view(1,len(negative_set),-1) pos_term = torch.logsumexp(nn.PairwiseDistance(p=2)(anchor[:,:,None], positive_set.permute(0,2,1)), dim=1) neg_term = torch.logsumexp(self.margin - nn.PairwiseDistance(p=2)(anchor[:,:,None], negative_set.permute(0,2,1)), dim=1) loss.append(F.relu(pos_term + neg_term)) loss = torch.mean(torch.stack(loss)) + self.l2_weight*torch.mean(torch.norm(batch, p=2, dim=1)) return loss 
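The per-anchor term computed in the `forward` of `criteria/lifted.py` above can be sketched in isolation. Below is a minimal, self-contained toy example; the function name `lifted_anchor_loss` and the random tensors are illustrative assumptions, not part of the repository:

```python
import torch
import torch.nn.functional as F

def lifted_anchor_loss(anchor, positives, negatives, neg_margin=1.0):
    # log-sum-exp over anchor-positive L2 distances
    pos_term = torch.logsumexp(torch.norm(positives - anchor, p=2, dim=1), dim=0)
    # log-sum-exp over (margin - anchor-negative L2 distances)
    neg_term = torch.logsumexp(neg_margin - torch.norm(negatives - anchor, p=2, dim=1), dim=0)
    # hinge on the combined term, as in the lifted-structure formulation
    return F.relu(pos_term + neg_term)

torch.manual_seed(0)
anchor    = F.normalize(torch.randn(1, 8), dim=1)   # one anchor embedding
positives = F.normalize(torch.randn(3, 8), dim=1)   # its positive set
negatives = F.normalize(torch.randn(5, 8), dim=1)   # its negative set
loss = lifted_anchor_loss(anchor, positives, negatives)
print(float(loss))  # non-negative scalar
```

In the repository version, the anchor/positive/negative index sets come from the `lifted` batchminer, and an L2 penalty on the embedding norms (weighted by `opt.loss_lifted_l2`) is added on top.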
================================================ FILE: criteria/margin.py ================================================ import numpy as np import torch, torch.nn as nn, torch.nn.functional as F import batchminer """=================================================================================================""" ALLOWED_MINING_OPS = list(batchminer.BATCHMINING_METHODS.keys()) REQUIRES_BATCHMINER = True REQUIRES_OPTIM = True ### MarginLoss with trainable class separation margin beta. Runs on Mini-batches as well. class Criterion(torch.nn.Module): def __init__(self, opt, batchminer): super(Criterion, self).__init__() self.n_classes = opt.n_classes self.margin = opt.loss_margin_margin self.nu = opt.loss_margin_nu self.beta_constant = opt.loss_margin_beta_constant self.beta_val = opt.loss_margin_beta if opt.loss_margin_beta_constant: self.beta = opt.loss_margin_beta else: self.beta = torch.nn.Parameter(torch.ones(opt.n_classes)*opt.loss_margin_beta) self.batchminer = batchminer self.name = 'margin' self.lr = opt.loss_margin_beta_lr #### self.ALLOWED_MINING_OPS = ALLOWED_MINING_OPS self.REQUIRES_BATCHMINER = REQUIRES_BATCHMINER self.REQUIRES_OPTIM = REQUIRES_OPTIM def forward(self, batch, labels, **kwargs): sampled_triplets = self.batchminer(batch, labels) if len(sampled_triplets): d_ap, d_an = [],[] for triplet in sampled_triplets: train_triplet = {'Anchor': batch[triplet[0],:], 'Positive':batch[triplet[1],:], 'Negative':batch[triplet[2]]} pos_dist = ((train_triplet['Anchor']-train_triplet['Positive']).pow(2).sum()+1e-8).pow(1/2) neg_dist = ((train_triplet['Anchor']-train_triplet['Negative']).pow(2).sum()+1e-8).pow(1/2) d_ap.append(pos_dist) d_an.append(neg_dist) d_ap, d_an = torch.stack(d_ap), torch.stack(d_an) if self.beta_constant: beta = self.beta else: beta = torch.stack([self.beta[labels[triplet[0]]] for triplet in sampled_triplets]).to(torch.float).to(d_ap.device) pos_loss = torch.nn.functional.relu(d_ap-beta+self.margin) neg_loss = 
torch.nn.functional.relu(beta-d_an+self.margin) pair_count = torch.sum((pos_loss>0.)+(neg_loss>0.)).to(torch.float).to(d_ap.device) if pair_count == 0.: loss = torch.sum(pos_loss+neg_loss) else: loss = torch.sum(pos_loss+neg_loss)/pair_count if self.nu: beta_regularization_loss = torch.sum(beta) loss += self.nu * beta_regularization_loss.to(torch.float).to(d_ap.device) else: loss = torch.tensor(0.).to(torch.float).to(batch.device) return loss ================================================ FILE: criteria/multisimilarity.py ================================================ import torch, torch.nn as nn """=================================================================================================""" ALLOWED_MINING_OPS = None REQUIRES_BATCHMINER = False REQUIRES_OPTIM = False class Criterion(torch.nn.Module): def __init__(self, opt): super(Criterion, self).__init__() self.n_classes = opt.n_classes self.pos_weight = opt.loss_multisimilarity_pos_weight self.neg_weight = opt.loss_multisimilarity_neg_weight self.margin = opt.loss_multisimilarity_margin self.thresh = opt.loss_multisimilarity_thresh self.name = 'multisimilarity' #### self.ALLOWED_MINING_OPS = ALLOWED_MINING_OPS self.REQUIRES_BATCHMINER = REQUIRES_BATCHMINER self.REQUIRES_OPTIM = REQUIRES_OPTIM def forward(self, batch, labels, **kwargs): similarity = batch.mm(batch.T) loss = [] for i in range(len(batch)): pos_idxs = labels==labels[i] pos_idxs[i] = 0 neg_idxs = labels!=labels[i] anchor_pos_sim = similarity[i][pos_idxs] anchor_neg_sim = similarity[i][neg_idxs] ### This part doesn't really work, especially when you don't have a lot of positives in the batch...
neg_idxs = (anchor_neg_sim + self.margin) > torch.min(anchor_pos_sim) pos_idxs = (anchor_pos_sim - self.margin) < torch.max(anchor_neg_sim) if not torch.sum(neg_idxs) or not torch.sum(pos_idxs): continue anchor_neg_sim = anchor_neg_sim[neg_idxs] anchor_pos_sim = anchor_pos_sim[pos_idxs] pos_term = 1./self.pos_weight * torch.log(1+torch.sum(torch.exp(-self.pos_weight* (anchor_pos_sim - self.thresh)))) neg_term = 1./self.neg_weight * torch.log(1+torch.sum(torch.exp(self.neg_weight * (anchor_neg_sim - self.thresh)))) loss.append(pos_term + neg_term) loss = torch.mean(torch.stack(loss)) return loss ================================================ FILE: criteria/npair.py ================================================ import numpy as np import torch, torch.nn as nn, torch.nn.functional as F import batchminer """=================================================================================================""" ALLOWED_MINING_OPS = ['npair'] REQUIRES_BATCHMINER = True REQUIRES_OPTIM = False class Criterion(torch.nn.Module): def __init__(self, opt, batchminer): """ Args: """ super(Criterion, self).__init__() self.pars = opt self.l2_weight = opt.loss_npair_l2 self.batchminer = batchminer self.name = 'npair' #### self.ALLOWED_MINING_OPS = ALLOWED_MINING_OPS self.REQUIRES_BATCHMINER = REQUIRES_BATCHMINER self.REQUIRES_OPTIM = REQUIRES_OPTIM def forward(self, batch, labels, **kwargs): anchors, positives, negatives = self.batchminer(batch, labels) ## loss = 0 if 'bninception' in self.pars.arch: ### clamping/value reduction to avoid initial overflow for high embedding dimensions! 
batch = batch/4 for anchor, positive, negative_set in zip(anchors, positives, negatives): a_embs, p_embs, n_embs = batch[anchor:anchor+1], batch[positive:positive+1], batch[negative_set] inner_sum = a_embs[:,None,:].bmm((n_embs - p_embs[:,None,:]).permute(0,2,1)) inner_sum = inner_sum.view(inner_sum.shape[0], inner_sum.shape[-1]) loss = loss + torch.mean(torch.log(torch.sum(torch.exp(inner_sum), dim=1) + 1))/len(anchors) loss = loss + self.l2_weight*torch.mean(torch.norm(batch, p=2, dim=1))/len(anchors) return loss ================================================ FILE: criteria/proxynca.py ================================================ import numpy as np import torch, torch.nn as nn, torch.nn.functional as F import batchminer """=================================================================================================""" ALLOWED_MINING_OPS = None REQUIRES_BATCHMINER = False REQUIRES_OPTIM = True class Criterion(torch.nn.Module): def __init__(self, opt): """ Args: opt: Namespace containing all relevant parameters. """ super(Criterion, self).__init__() #### self.num_proxies = opt.n_classes self.embed_dim = opt.embed_dim self.proxies = torch.nn.Parameter(torch.randn(self.num_proxies, self.embed_dim)/8) self.class_idxs = torch.arange(self.num_proxies) self.name = 'proxynca' self.optim_dict_list = [{'params':self.proxies, 'lr':opt.lr * opt.loss_proxynca_lrmulti}] #### self.ALLOWED_MINING_OPS = ALLOWED_MINING_OPS self.REQUIRES_BATCHMINER = REQUIRES_BATCHMINER self.REQUIRES_OPTIM = REQUIRES_OPTIM def forward(self, batch, labels, **kwargs): #Empirically, multiplying the embeddings during the computation of the loss seems to allow for more stable training; #Acts as a temperature in the NCA objective.
batch = 3*torch.nn.functional.normalize(batch, dim=1) proxies = 3*torch.nn.functional.normalize(self.proxies, dim=1) #Group required proxies pos_proxies = torch.stack([proxies[pos_label:pos_label+1,:] for pos_label in labels]) neg_proxies = torch.stack([torch.cat([self.class_idxs[:class_label],self.class_idxs[class_label+1:]]) for class_label in labels]) neg_proxies = torch.stack([proxies[neg_labels,:] for neg_labels in neg_proxies]) #Compute Proxy-distances dist_to_neg_proxies = torch.sum((batch[:,None,:]-neg_proxies).pow(2),dim=-1) dist_to_pos_proxies = torch.sum((batch[:,None,:]-pos_proxies).pow(2),dim=-1) #Compute final proxy-based NCA loss loss = torch.mean(dist_to_pos_proxies[:,0] + torch.logsumexp(-dist_to_neg_proxies, dim=1)) return loss ================================================ FILE: criteria/quadruplet.py ================================================ import numpy as np import torch, torch.nn as nn, torch.nn.functional as F import batchminer """=================================================================================================""" ALLOWED_MINING_OPS = list(batchminer.BATCHMINING_METHODS.keys()) REQUIRES_BATCHMINER = True REQUIRES_OPTIM = False class Criterion(torch.nn.Module): def __init__(self, opt, batchminer): super(Criterion, self).__init__() self.batchminer = batchminer self.name = 'quadruplet' self.margin_alpha_1 = opt.loss_quadruplet_margin_alpha_1 self.margin_alpha_2 = opt.loss_quadruplet_margin_alpha_2 #### self.ALLOWED_MINING_OPS = ALLOWED_MINING_OPS self.REQUIRES_BATCHMINER = REQUIRES_BATCHMINER self.REQUIRES_OPTIM = REQUIRES_OPTIM def triplet_distance(self, anchor, positive, negative): return torch.nn.functional.relu(torch.norm(anchor-positive, p=2, dim=-1)-torch.norm(anchor-negative, p=2, dim=-1)+self.margin_alpha_1) def quadruplet_distance(self, anchor, positive, negative, fourth_negative): return torch.nn.functional.relu(torch.norm(anchor-positive, p=2, dim=-1)-torch.norm(negative-fourth_negative, p=2, 
dim=-1)+self.margin_alpha_2) def forward(self, batch, labels, **kwargs): sampled_triplets = self.batchminer(batch, labels) anchors = np.array([triplet[0] for triplet in sampled_triplets]).reshape(-1,1) positives = np.array([triplet[1] for triplet in sampled_triplets]).reshape(-1,1) negatives = np.array([triplet[2] for triplet in sampled_triplets]).reshape(-1,1) fourth_negatives = negatives!=negatives.T fourth_negatives = [np.random.choice(np.arange(len(batch))[idxs]) for idxs in fourth_negatives] triplet_loss = self.triplet_distance(batch[anchors,:],batch[positives,:],batch[negatives,:]) quadruplet_loss = self.quadruplet_distance(batch[anchors,:],batch[positives,:],batch[negatives,:],batch[fourth_negatives,:]) return torch.mean(triplet_loss) + torch.mean(quadruplet_loss) ================================================ FILE: criteria/snr.py ================================================ import numpy as np import torch, torch.nn as nn, torch.nn.functional as F import batchminer """=================================================================================================""" ALLOWED_MINING_OPS = list(batchminer.BATCHMINING_METHODS.keys()) REQUIRES_BATCHMINER = True REQUIRES_OPTIM = False ### This implements the Signal-To-Noise Ratio Triplet Loss class Criterion(torch.nn.Module): def __init__(self, opt, batchminer): super(Criterion, self).__init__() self.margin = opt.loss_snr_margin self.reg_lambda = opt.loss_snr_reg_lambda self.batchminer = batchminer if self.batchminer.name=='distance': self.reg_lambda = 0 self.name = 'snr' #### self.ALLOWED_MINING_OPS = ALLOWED_MINING_OPS self.REQUIRES_BATCHMINER = REQUIRES_BATCHMINER self.REQUIRES_OPTIM = REQUIRES_OPTIM def forward(self, batch, labels, **kwargs): sampled_triplets = self.batchminer(batch, labels) anchors = [triplet[0] for triplet in sampled_triplets] positives = [triplet[1] for triplet in sampled_triplets] negatives = [triplet[2] for triplet in sampled_triplets] pos_snr = 
torch.var(batch[anchors,:]-batch[positives,:], dim=1)/torch.var(batch[anchors,:], dim=1) neg_snr = torch.var(batch[anchors,:]-batch[negatives,:], dim=1)/torch.var(batch[anchors,:], dim=1) reg_loss = torch.mean(torch.abs(torch.sum(batch[anchors,:],dim=1))) snr_loss = torch.nn.functional.relu(pos_snr - neg_snr + self.margin) snr_loss = torch.sum(snr_loss)/torch.sum(snr_loss>0) loss = snr_loss + self.reg_lambda * reg_loss return loss ================================================ FILE: criteria/softmax.py ================================================ import numpy as np import torch, torch.nn as nn, torch.nn.functional as F import batchminer """=================================================================================================""" ALLOWED_MINING_OPS = None REQUIRES_BATCHMINER = False REQUIRES_OPTIM = True ### This Implementation follows: https://github.com/azgo14/classification_metric_learning class Criterion(torch.nn.Module): def __init__(self, opt): super(Criterion, self).__init__() self.par = opt self.temperature = opt.loss_softmax_temperature self.class_map = torch.nn.Parameter(torch.Tensor(opt.n_classes, opt.embed_dim)) stdv = 1. 
/ np.sqrt(self.class_map.size(1)) self.class_map.data.uniform_(-stdv, stdv) self.name = 'softmax' self.lr = opt.loss_softmax_lr #### self.ALLOWED_MINING_OPS = ALLOWED_MINING_OPS self.REQUIRES_BATCHMINER = REQUIRES_BATCHMINER self.REQUIRES_OPTIM = REQUIRES_OPTIM def forward(self, batch, labels, **kwargs): class_mapped_batch = torch.nn.functional.linear(batch, torch.nn.functional.normalize(self.class_map, dim=1)) loss = torch.nn.CrossEntropyLoss()(class_mapped_batch/self.temperature, labels.to(torch.long).to(self.par.device)) return loss ================================================ FILE: criteria/softtriplet.py ================================================ import numpy as np import torch, torch.nn as nn, torch.nn.functional as F import batchminer """=================================================================================================""" ALLOWED_MINING_OPS = None REQUIRES_BATCHMINER = False REQUIRES_OPTIM = True ### This implementation follows https://github.com/idstcv/SoftTriple class Criterion(torch.nn.Module): def __init__(self, opt): super(Criterion, self).__init__() #### self.par = opt self.n_classes = opt.n_classes #### self.n_centroids = opt.loss_softtriplet_n_centroids self.margin_delta = opt.loss_softtriplet_margin_delta self.gamma = opt.loss_softtriplet_gamma self.lam = opt.loss_softtriplet_lambda self.reg_weight = opt.loss_softtriplet_reg_weight #### self.reg_norm = self.n_classes*self.n_centroids*(self.n_centroids-1) self.reg_indices = torch.zeros((self.n_classes*self.n_centroids, self.n_classes*self.n_centroids), dtype=torch.bool).to(opt.device) for i in range(0, self.n_classes): for j in range(0, self.n_centroids): self.reg_indices[i*self.n_centroids+j, i*self.n_centroids+j+1:(i+1)*self.n_centroids] = 1 #### self.intra_class_centroids = torch.nn.Parameter(torch.Tensor(opt.embed_dim, self.n_classes*self.n_centroids)) stdv = 1. 
/ np.sqrt(self.intra_class_centroids.size(1)) self.intra_class_centroids.data.uniform_(-stdv, stdv) self.name = 'softtriplet' self.lr = opt.lr*opt.loss_softtriplet_lr #### self.ALLOWED_MINING_OPS = ALLOWED_MINING_OPS self.REQUIRES_BATCHMINER = REQUIRES_BATCHMINER self.REQUIRES_OPTIM = REQUIRES_OPTIM def forward(self, batch, labels, **kwargs): bs = batch.size(0) intra_class_centroids = torch.nn.functional.normalize(self.intra_class_centroids, dim=1) similarities_to_centroids = batch.mm(intra_class_centroids).reshape(-1, self.n_classes, self.n_centroids) soft_weight_over_centroids = torch.nn.Softmax(dim=1)(self.gamma*similarities_to_centroids) per_class_embed = torch.sum(soft_weight_over_centroids * similarities_to_centroids, dim=2) margin_delta = torch.zeros(per_class_embed.shape).to(self.par.device) margin_delta[torch.arange(0, bs), labels] = self.margin_delta centroid_classification_loss = torch.nn.CrossEntropyLoss()(self.lam*(per_class_embed-margin_delta), labels.to(torch.long).to(self.par.device)) inter_centroid_similarity = intra_class_centroids.T.mm(intra_class_centroids) regularisation_loss = torch.sum(torch.sqrt(2.00001-2*inter_centroid_similarity[self.reg_indices]))/self.reg_norm return centroid_classification_loss + self.reg_weight * regularisation_loss ================================================ FILE: criteria/triplet.py ================================================ import numpy as np import torch, torch.nn as nn, torch.nn.functional as F import batchminer """=================================================================================================""" ALLOWED_MINING_OPS = list(batchminer.BATCHMINING_METHODS.keys()) REQUIRES_BATCHMINER = True REQUIRES_OPTIM = False ### Standard Triplet Loss, finds triplets in Mini-batches. 
class Criterion(torch.nn.Module): def __init__(self, opt, batchminer): super(Criterion, self).__init__() self.margin = opt.loss_triplet_margin self.batchminer = batchminer self.name = 'triplet' #### self.ALLOWED_MINING_OPS = ALLOWED_MINING_OPS self.REQUIRES_BATCHMINER = REQUIRES_BATCHMINER self.REQUIRES_OPTIM = REQUIRES_OPTIM def triplet_distance(self, anchor, positive, negative): return torch.nn.functional.relu((anchor-positive).pow(2).sum()-(anchor-negative).pow(2).sum()+self.margin) def forward(self, batch, labels, **kwargs): if isinstance(labels, torch.Tensor): labels = labels.cpu().numpy() sampled_triplets = self.batchminer(batch, labels) loss = torch.stack([self.triplet_distance(batch[triplet[0],:],batch[triplet[1],:],batch[triplet[2],:]) for triplet in sampled_triplets]) return torch.mean(loss) ================================================ FILE: datasampler/__init__.py ================================================ import datasampler.class_random_sampler import datasampler.random_sampler import datasampler.greedy_coreset_sampler import datasampler.fid_batchmatch_sampler import datasampler.disthist_batchmatch_sampler import datasampler.d2_coreset_sampler def select(sampler, opt, image_dict, image_list=None, **kwargs): if 'batchmatch' in sampler: if sampler=='disthist_batchmatch': sampler_lib = datasampler.disthist_batchmatch_sampler elif sampler=='fid_batchmatch': sampler_lib = datasampler.fid_batchmatch_sampler elif 'random' in sampler: if 'class' in sampler: sampler_lib = datasampler.class_random_sampler elif 'full' in sampler: sampler_lib = datasampler.random_sampler elif 'coreset' in sampler: if 'greedy' in sampler: sampler_lib = datasampler.greedy_coreset_sampler elif 'd2' in sampler: sampler_lib = datasampler.d2_coreset_sampler else: raise Exception('Minibatch sampler <{}> not available!'.format(sampler)) sampler = sampler_lib.Sampler(opt,image_dict=image_dict,image_list=image_list) return sampler ================================================ FILE: datasampler/class_random_sampler.py
================================================ import numpy as np import torch, torch.nn as nn, torch.nn.functional as F from tqdm import tqdm import random """======================================================""" REQUIRES_STORAGE = False ### class Sampler(torch.utils.data.sampler.Sampler): """ Plugs into PyTorch Batchsampler Package. """ def __init__(self, opt, image_dict, image_list, **kwargs): self.pars = opt ##### self.image_dict = image_dict self.image_list = image_list ##### self.classes = list(self.image_dict.keys()) #### self.batch_size = opt.bs self.samples_per_class = opt.samples_per_class self.sampler_length = len(image_list)//opt.bs assert self.batch_size%self.samples_per_class==0, '#Samples per class must divide batchsize!' self.name = 'class_random_sampler' self.requires_storage = False def __iter__(self): for _ in range(self.sampler_length): subset = [] ### Random Subset from Random classes draws = self.batch_size//self.samples_per_class for _ in range(draws): class_key = random.choice(self.classes) class_ix_list = [random.choice(self.image_dict[class_key])[-1] for _ in range(self.samples_per_class)] subset.extend(class_ix_list) yield subset def __len__(self): return self.sampler_length ================================================ FILE: datasampler/d2_coreset_sampler.py ================================================ import numpy as np import torch, torch.nn as nn, torch.nn.functional as F from tqdm import tqdm import random from scipy import linalg from scipy.stats import multivariate_normal """======================================================""" REQUIRES_STORAGE = True ### class Sampler(torch.utils.data.sampler.Sampler): """ Plugs into PyTorch Batchsampler Package. 
""" def __init__(self, opt, image_dict, image_list): self.image_dict = image_dict self.image_list = image_list self.batch_size = opt.bs self.samples_per_class = opt.samples_per_class self.sampler_length = len(image_list)//opt.bs assert self.batch_size%self.samples_per_class==0, '#Samples per class must divide batchsize!' self.name = 'greedy_coreset_sampler' self.requires_storage = True self.bigbs = opt.data_batchmatch_bigbs self.update_storage = not opt.data_storage_no_update self.num_batch_comps = opt.data_batchmatch_ncomps self.low_proj_dim = opt.data_sampler_lowproj_dim self.lam = opt.data_d2_coreset_lambda self.n_jobs = 16 def __iter__(self): for i in range(self.sampler_length): yield self.epoch_indices[i] def precompute_indices(self): from joblib import Parallel, delayed import time ### Random Subset from Random classes bigb_idxs = np.random.choice(len(self.storage), self.bigbs, replace=True) bigbatch = self.storage[bigb_idxs] print('Precomputing Indices... ', end='') start = time.time() def batchfinder(n_calls, pos): idx_sets = self.d2_coreset(n_calls, pos) structured_batches = [list(bigb_idxs[idx_set]) for idx_set in idx_sets] # structured_batch = list(bigb_idxs[self.fid_match(bigbatch, batch_size=self.batch_size//self.samples_per_class)]) #Add random per-class fillers to ensure that the batch is build up correctly. 
for i in range(len(structured_batches)): class_idxs = [self.image_list[idx][-1] for idx in structured_batches[i]] for class_idx in class_idxs: structured_batches[i].extend([random.choice(self.image_dict[class_idx])[-1] for _ in range(self.samples_per_class-1)]) return structured_batches n_calls = int(np.ceil(self.sampler_length/self.n_jobs)) # self.epoch_indices = batchfinder(n_calls, 0) self.epoch_indices = Parallel(n_jobs = self.n_jobs)(delayed(batchfinder)(n_calls, i) for i in range(self.n_jobs)) self.epoch_indices = [x for y in self.epoch_indices for x in y] # self.epoch_indices = Parallel(n_jobs = self.n_jobs)(delayed(batchfinder)(self.storage[np.random.choice(len(self.storage), self.bigbs, replace=True)]) for _ in tqdm(range(self.sampler_length), desc='Precomputing Indices...')) print('Done in {0:3.4f}s.'.format(time.time()-start)) def replace_storage_entries(self, embeddings, indices): self.storage[indices] = embeddings def create_storage(self, dataloader, model, device): with torch.no_grad(): _ = model.eval() _ = model.to(device) embed_collect = [] for i,input_tuple in enumerate(tqdm(dataloader, 'Creating data storage...')): embed = model(input_tuple[1].type(torch.FloatTensor).to(device)) if isinstance(embed, tuple): embed = embed[0] embed = embed.cpu() embed_collect.append(embed) embed_collect = torch.cat(embed_collect, dim=0) self.storage = embed_collect def d2_coreset(self, calls, pos): """ """ coll = [] for _ in range(calls): bigbatch = self.storage[np.random.choice(len(self.storage), self.bigbs, replace=False)] batch_size = self.batch_size//self.samples_per_class if self.low_proj_dim>0: low_dim_proj = nn.Linear(bigbatch.shape[-1],self.low_proj_dim,bias=False) with torch.no_grad(): bigbatch = low_dim_proj(bigbatch) bigbatch = bigbatch.numpy() # emp_mean, emp_std = np.mean(bigbatch, axis=0), np.std(bigbatch, axis=0) emp_mean, emp_cov = np.mean(bigbatch, axis=0), np.cov(bigbatch.T) prod = np.matmul(bigbatch, bigbatch.T) sq = 
prod.diagonal().reshape(bigbatch.shape[0], 1) dist_matrix = np.clip(-2*prod + sq + sq.T, 0, None) start_anchor = np.random.multivariate_normal(emp_mean, emp_cov, 1).reshape(-1) start_dists = np.linalg.norm(bigbatch-start_anchor,axis=1) start_point = np.argmin(start_dists, axis=0) idxs = list(range(len(bigbatch))) del idxs[start_point] k, sampled_indices = 1, [start_point] dist_weights = dist_matrix[:,start_point] normal_weights = multivariate_normal.pdf(bigbatch,emp_mean,emp_cov) while k0: low_dim_proj = nn.Linear(bigbatch.shape[-1],self.low_proj_dim,bias=False) with torch.no_grad(): bigbatch = low_dim_proj(bigbatch) bigbatch = bigbatch.numpy() bigb_distmat_triu_idxs = np.triu_indices(len(bigbatch),1) bigb_distvals = self.get_distmat(bigbatch)[bigb_distmat_triu_idxs] bigb_disthist_range, bigb_disthist_bins = (np.min(bigb_distvals), np.max(bigb_distvals)), 50 bigb_disthist, _ = np.histogram(bigb_distvals, bins=bigb_disthist_bins, range=bigb_disthist_range) bigb_disthist = bigb_disthist/np.sum(bigb_disthist) bigb_mu = np.mean(bigbatch, axis=0) bigb_std = np.std(bigbatch, axis=0) cost_collect, bigb_idxs = [], [] for _ in range(self.num_batch_comps): subset_idxs = [np.random.choice(bigb_dict[np.random.choice(list(bigb_dict.keys()))], self.samples_per_class, replace=False) for _ in range(self.batch_size//self.samples_per_class)] subset_idxs = [x for y in subset_idxs for x in y] # subset_idxs = sorted(np.random.choice(len(bigbatch), batch_size, replace=False)) bigb_idxs.append(subset_idxs) subset = bigbatch[subset_idxs,:] subset_distmat = self.get_distmat(subset) subset_distmat_triu_idxs = np.triu_indices(len(subset_distmat),1) subset_distvals = self.get_distmat(subset)[subset_distmat_triu_idxs] subset_disthist_range, subset_disthist_bins = (np.min(subset_distvals), np.max(subset_distvals)), 50 subset_disthist, _ = np.histogram(subset_distvals, bins=bigb_disthist_bins, range=bigb_disthist_range) subset_disthist = subset_disthist/np.sum(subset_disthist) subset_mu = 
np.mean(subset, axis=0) subset_std = np.std(subset, axis=0) dist_wd = wasserstein_distance(bigb_disthist, subset_disthist)+wasserstein_distance(subset_disthist, bigb_disthist) cost = np.linalg.norm(bigb_mu - subset_mu) + np.linalg.norm(bigb_std - subset_std) + 75*dist_wd cost_collect.append(cost) bigb_ix = bigb_idxs[np.argmin(cost_collect)] bigb_data_ix = bigb_data_idxs[bigb_ix] coll.append(bigb_data_ix) return coll def __len__(self): return self.sampler_length ================================================ FILE: datasampler/fid_batchmatch_sampler.py ================================================ import numpy as np import torch, torch.nn as nn, torch.nn.functional as F from tqdm import tqdm import random from scipy import linalg """======================================================""" REQUIRES_STORAGE = True ### class Sampler(torch.utils.data.sampler.Sampler): """ Plugs into PyTorch Batchsampler Package. """ def __init__(self, opt, image_dict, image_list): self.image_dict = image_dict self.image_list = image_list self.batch_size = opt.bs self.samples_per_class = opt.samples_per_class self.sampler_length = len(image_list)//opt.bs assert self.batch_size%self.samples_per_class==0, '#Samples per class must divide batchsize!' 
self.name = 'spc_fid_batchmatch_sampler' self.requires_storage = True self.bigbs = opt.data_batchmatch_bigbs self.update_storage = not opt.data_storage_no_update self.num_batch_comps = opt.data_batchmatch_ncomps self.low_proj_dim = opt.data_sampler_lowproj_dim self.n_jobs = 16 self.internal_image_dict = {self.image_list[i]:i for i in range(len(self.image_list))} def __iter__(self): for i in range(self.sampler_length): # ### Random Subset from Random classes # bigb_idxs = np.random.choice(len(self.storage), self.bigbs, replace=True) # bigbatch = self.storage[bigb_idxs] # # structured_batch = list(bigb_idxs[self.fid_match(bigbatch, batch_size=self.batch_size//self.samples_per_class)]) # #Add random per-class fillers to ensure that the batch is build up correctly. # # class_idxs = [self.image_list[idx][-1] for idx in structured_batch] # for class_idx in class_idxs: # structured_batch.extend([random.choice(self.image_dict[class_idx])[-1] for _ in range(self.samples_per_class-1)]) yield self.epoch_indices[i] def precompute_indices(self): from joblib import Parallel, delayed import time ### Random Subset from Random classes # self.disthist_match() print('Precomputing Indices... 
', end='') start = time.time() n_calls = int(np.ceil(self.sampler_length/self.n_jobs)) self.epoch_indices = Parallel(n_jobs = self.n_jobs)(delayed(self.spc_fid_match)(n_calls, i) for i in range(self.n_jobs)) self.epoch_indices = [x for y in self.epoch_indices for x in y] print('Done in {0:3.4f}s.'.format(time.time()-start)) def replace_storage_entries(self, embeddings, indices): self.storage[indices] = embeddings def create_storage(self, dataloader, model, device): with torch.no_grad(): _ = model.eval() _ = model.to(device) embed_collect = [] for i,input_tuple in enumerate(tqdm(dataloader, 'Creating data storage...')): embed = model(input_tuple[1].type(torch.FloatTensor).to(device)) if isinstance(embed, tuple): embed = embed[0] embed = embed.cpu() embed_collect.append(embed) embed_collect = torch.cat(embed_collect, dim=0) self.storage = embed_collect def spc_batchfinder(self, n_samples): ### SpC-Sample big batch: subset, classes = [], [] ### Random Subset from Random classes for _ in range(n_samples//self.samples_per_class): class_key = random.choice(list(self.image_dict.keys())) # subset.extend([(class_key, random.choice(len(self.image_dict[class_key])) for _ in range(self.samples_per_class)]) subset.extend([random.choice(self.image_dict[class_key])[-1] for _ in range(self.samples_per_class)]) classes.extend([class_key]*self.samples_per_class) return np.array(subset), np.array(classes) def spc_fid_match(self, calls, pos): """ """ coll = [] for _ in range(calls): bigb_data_idxs, bigb_data_classes = self.spc_batchfinder(self.bigbs) bigb_dict = {} for i, bigb_cls in enumerate(bigb_data_classes): if bigb_cls not in bigb_dict: bigb_dict[bigb_cls] = [] bigb_dict[bigb_cls].append(i) bigbatch = self.storage[bigb_data_idxs] if self.low_proj_dim>0: low_dim_proj = nn.Linear(bigbatch.shape[-1],self.low_proj_dim,bias=False) with torch.no_grad(): bigbatch = low_dim_proj(bigbatch) bigbatch = bigbatch.numpy() bigbatch_mean = np.mean(bigbatch, axis=0).reshape(-1,1) bigbatch_cov = 
np.cov(bigbatch.T) fid_collect, bigb_idxs = [], [] for _ in range(self.num_batch_comps): subset_idxs = [np.random.choice(bigb_dict[np.random.choice(list(bigb_dict.keys()))], self.samples_per_class, replace=False) for _ in range(self.batch_size//self.samples_per_class)] subset_idxs = [x for y in subset_idxs for x in y] # subset_idxs = sorted(np.random.choice(len(bigbatch), batch_size, replace=False)) bigb_idxs.append(subset_idxs) subset = bigbatch[subset_idxs,:] subset_mean = np.mean(subset, axis=0).reshape(-1,1) subset_cov = np.cov(subset.T) diag_offset = np.eye(subset_cov.shape[0])*1e-8 cov_sqrt = linalg.sqrtm((bigbatch_cov+diag_offset).dot((subset_cov+diag_offset)), disp=False)[0].real diff = bigbatch_mean-subset_mean fid = diff.T.dot(diff) + np.trace(bigbatch_cov) + np.trace(subset_cov) - 2*np.trace(cov_sqrt) fid_collect.append(fid) bigb_ix = bigb_idxs[np.argmin(fid_collect)] bigb_data_ix = bigb_data_idxs[bigb_ix] coll.append(bigb_data_ix) return coll def __len__(self): return self.sampler_length ================================================ FILE: datasampler/greedy_coreset_sampler.py ================================================ import numpy as np import torch, torch.nn as nn, torch.nn.functional as F from tqdm import tqdm import random from scipy import linalg """======================================================""" REQUIRES_STORAGE = True ### class Sampler(torch.utils.data.sampler.Sampler): """ Plugs into PyTorch Batchsampler Package. """ def __init__(self, opt, image_dict, image_list): self.image_dict = image_dict self.image_list = image_list self.batch_size = opt.bs self.samples_per_class = opt.samples_per_class self.sampler_length = len(image_list)//opt.bs assert self.batch_size%self.samples_per_class==0, '#Samples per class must divide batchsize!' 
self.name = 'greedy_coreset_sampler' self.requires_storage = True self.bigbs = opt.data_batchmatch_bigbs self.update_storage = not opt.data_storage_no_update self.num_batch_comps = opt.data_batchmatch_ncomps self.dist_lim = opt.data_gc_coreset_lim self.low_proj_dim = opt.data_sampler_lowproj_dim self.softened = opt.data_gc_softened self.n_jobs = 16 def __iter__(self): for i in range(self.sampler_length): yield self.epoch_indices[i] def precompute_indices(self): from joblib import Parallel, delayed import time ### Random Subset from Random classes bigb_idxs = np.random.choice(len(self.storage), self.bigbs, replace=True) bigbatch = self.storage[bigb_idxs] print('Precomputing Indices... ', end='') start = time.time() def batchfinder(n_calls, pos): idx_sets = self.greedy_coreset(n_calls, pos) structured_batches = [list(bigb_idxs[idx_set]) for idx_set in idx_sets] # structured_batch = list(bigb_idxs[self.fid_match(bigbatch, batch_size=self.batch_size//self.samples_per_class)]) #Add random per-class fillers to ensure that the batch is built up correctly.
for i in range(len(structured_batches)): class_idxs = [self.image_list[idx][-1] for idx in structured_batches[i]] for class_idx in class_idxs: structured_batches[i].extend([random.choice(self.image_dict[class_idx])[-1] for _ in range(self.samples_per_class-1)]) return structured_batches n_calls = int(np.ceil(self.sampler_length/self.n_jobs)) # self.epoch_indices = batchfinder(n_calls, 0) self.epoch_indices = Parallel(n_jobs = self.n_jobs)(delayed(batchfinder)(n_calls, i) for i in range(self.n_jobs)) self.epoch_indices = [x for y in self.epoch_indices for x in y] # self.epoch_indices = Parallel(n_jobs = self.n_jobs)(delayed(batchfinder)(self.storage[np.random.choice(len(self.storage), self.bigbs, replace=True)]) for _ in tqdm(range(self.sampler_length), desc='Precomputing Indices...')) print('Done in {0:3.4f}s.'.format(time.time()-start)) def replace_storage_entries(self, embeddings, indices): self.storage[indices] = embeddings def create_storage(self, dataloader, model, device): with torch.no_grad(): _ = model.eval() _ = model.to(device) embed_collect = [] for i,input_tuple in enumerate(tqdm(dataloader, 'Creating data storage...')): embed = model(input_tuple[1].type(torch.FloatTensor).to(device)) if isinstance(embed, tuple): embed = embed[0] embed = embed.cpu() embed_collect.append(embed) embed_collect = torch.cat(embed_collect, dim=0) self.storage = embed_collect def full_storage_update(self, dataloader, model, device): with torch.no_grad(): _ = model.eval() _ = model.to(device) embed_collect = [] for i,input_tuple in enumerate(tqdm(dataloader, 'Creating data storage...')): embed = model(input_tuple[1].type(torch.FloatTensor).to(device)) if isinstance(embed, tuple): embed = embed[0] embed = embed.cpu() embed_collect.append(embed) embed_collect = torch.cat(embed_collect, dim=0) if self.mb_mom>0: self.delta_storage = self.mb_mom*self.delta_storage + (1-self.mb_mom)*(embed_collect-self.storage) self.storage = embed_collect + self.mb_lr*self.delta_storage else: 
self.storage = embed_collect def greedy_coreset(self, calls, pos): """ """ coll = [] for _ in range(calls): bigbatch = self.storage[np.random.choice(len(self.storage), self.bigbs, replace=False)] batch_size = self.batch_size//self.samples_per_class if self.low_proj_dim>0: low_dim_proj = nn.Linear(bigbatch.shape[-1],self.low_proj_dim,bias=False) with torch.no_grad(): bigbatch = low_dim_proj(bigbatch) bigbatch = bigbatch.numpy() prod = np.matmul(bigbatch, bigbatch.T) sq = prod.diagonal().reshape(bigbatch.shape[0], 1) dist_matrix = np.clip(-2*prod + sq + sq.T, 0, None) coreset_anchor_dists = np.linalg.norm(dist_matrix, axis=1) k, sampled_indices = 0, [] while k<batch_size: if self.softened: no = random.choice(np.where(coreset_anchor_dists>=np.percentile(coreset_anchor_dists,97))[0]) else: no = np.argmax(coreset_anchor_dists) sampled_indices.append(no) add_d = dist_matrix[:, no:no+1] #If it's closer to the remaining points than the new addition/additions, sample it. new_dj = np.concatenate([np.expand_dims(coreset_anchor_dists,-1), add_d], axis=1) coreset_anchor_dists = np.min(new_dj, axis=1) k += 1 coll.append(sampled_indices) return coll def __len__(self): return self.sampler_length ================================================ FILE: datasampler/random_sampler.py ================================================ import numpy as np import torch, torch.nn as nn, torch.nn.functional as F from tqdm import tqdm import random """======================================================""" REQUIRES_STORAGE = False ### class Sampler(torch.utils.data.sampler.Sampler): """ Plugs into PyTorch Batchsampler Package. """ def __init__(self, opt, image_dict, image_list=None): self.image_dict = image_dict self.image_list = image_list self.batch_size = opt.bs self.samples_per_class = opt.samples_per_class self.sampler_length = len(image_list)//opt.bs assert self.batch_size%self.samples_per_class==0, '#Samples per class must divide batchsize!'
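The `greedy_coreset` method above is farthest-point sampling over the squared-distance matrix computed with the `-2*X.X^T + |x_i|^2 + |x_j|^2` identity. A condensed NumPy sketch of its non-softened branch (hypothetical helper name; assumes a 2-D float array and `k <= len(X)`):

```python
import numpy as np

def greedy_coreset_indices(X, k):
    # Squared pairwise distances via -2*X.X^T + ||x_i||^2 + ||x_j||^2,
    # clipped at 0 against numerical noise (same trick as in the sampler).
    prod = X @ X.T
    sq = prod.diagonal().reshape(-1, 1)
    dist_matrix = np.clip(-2*prod + sq + sq.T, 0, None)
    # Start from the aggregate row norms; first pick is the "farthest-overall" point.
    anchor_dists = np.linalg.norm(dist_matrix, axis=1)
    picked = []
    for _ in range(k):
        no = int(np.argmax(anchor_dists))
        picked.append(no)
        # Each point's anchor distance becomes its distance to the nearest picked point.
        anchor_dists = np.minimum(anchor_dists, dist_matrix[:, no])
    return picked
```

In the sampler, each selected anchor is then padded with `samples_per_class-1` random same-class samples so the batch respects the class-balanced layout.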
self.name = 'random_sampler' self.requires_storage = False def __iter__(self): for _ in range(self.sampler_length): subset = [] ### Random Subset from Random classes for _ in range(self.batch_size-1): class_key = random.choice(list(self.image_dict.keys())) sample_idx = np.random.choice(len(self.image_dict[class_key])) subset.append(self.image_dict[class_key][sample_idx][-1]) # subset.append(random.choice(self.image_dict[self.image_list[random.choice(subset)][-1]])[-1]) yield subset def __len__(self): return self.sampler_length ================================================ FILE: datasampler/samplers.py ================================================ import numpy as np import torch, torch.nn as nn, torch.nn.functional as F from tqdm import tqdm import random """======================================================""" def sampler_parse_args(parser): parser.add_argument('--batch_selection', default='class_random', type=str, help='Selection of the data batch: Modes of Selection: random, greedy_coreset') parser.add_argument('--primary_subset_perc', default=0.1, type=float, help='Size of the randomly selected subset before application of coreset selection.') return parser """======================================================""" ### # Methods: Full random, Per-Class-Random, CoreSet class AdvancedSampler(torch.utils.data.sampler.Sampler): """ Plugs into PyTorch Batchsampler Package. 
""" def __init__(self, method='class_random', random_subset_perc=0.1, batch_size=128, samples_per_class=4): self.random_subset_perc = random_subset_perc self.batch_size = batch_size self.samples_per_class = samples_per_class self.method = method self.storage = None self.sampler_length = None self.methods_requiring_storage = ['greedy_class_coreset', 'greedy_semi_class_coreset', 'presampled_infobatch'] def create_storage(self, dataloader, model, device): self.image_dict = dataloader.dataset.image_dict self.image_list = dataloader.dataset.image_list self.sampler_length = len(dataloader.dataset)//self.batch_size if self.method in self.methods_requiring_storage: with torch.no_grad(): _ = model.eval() _ = model.to(device) embed_collect = [] for i,input_tuple in enumerate(tqdm(dataloader, 'Creating data storage...')): embed = model(input_tuple[1].type(torch.FloatTensor).to(device)).cpu() embed_collect.append(embed) embed_collect = torch.cat(embed_collect, dim=0) self.storage = embed_collect self.random_subset_len = int(self.random_subset_perc*len(self.storage)) def update_storage(self, embeddings, indices): if 'coreset' in self.method: self.storage[indices] = embeddings def __iter__(self): for _ in range(self.sampler_length): subset = [] if self.method=='greedy_class_coreset': for _ in range(self.batch_size//self.samples_per_class): class_key = random.choice(list(self.image_dict.keys())) class_indices = np.array([x[1] for x in self.image_dict[class_key]]) # print(class_indices) ### Coreset subset of subset subset.extend(class_indices[self.greedy_coreset(self.storage[class_indices], self.samples_per_class)]) # print([self.image_list[x][1] for x in subset]) elif self.method=='greedy_semi_class_coreset': ### Big random subset subset = np.random.randint(0,len(self.storage),self.random_subset_len) ### Coreset subset of subset of half the batch size subset = subset[self.greedy_coreset(self.storage[subset], self.batch_size//2)] ### Fill the rest of the batch with random samples 
from each coreset member class subset = list(subset)+[random.choice(self.image_dict[self.image_list[idx][-1]])[-1] for idx in subset] elif self.method=='presampled_infobatch': ### Big random subset subset = np.random.randint(0,len(self.storage),self.random_subset_len) classes = torch.tensor([self.image_list[idx][-1] for idx in subset]) ### Presampled Infobatch for subset of data. subset = subset[self.presample_infobatch(classes, self.storage[subset], self.batch_size//2)] ### Fill the rest of the batch with random samples from each member class subset = list(subset)+[random.choice(self.image_dict[self.image_list[idx][-1]])[-1] for idx in subset] elif self.method=='class_random': ### Random Subset from Random classes for _ in range(self.batch_size//self.samples_per_class): class_key = random.choice(list(self.image_dict.keys())) subset.extend([random.choice(self.image_dict[class_key])[-1] for _ in range(self.samples_per_class)]) elif self.method=='semi_class_random': ### Select half of the indices completely at random, and the other half corresponding to the classes. 
for _ in range(self.batch_size//2): rand_idx = np.random.randint(len(self.image_list)) class_idx = self.image_list[rand_idx][-1] rand_class_idx = random.choice(self.image_dict[class_idx])[-1] subset.extend([rand_idx, rand_class_idx]) else: raise NotImplementedError('Batch selection method {} not available!'.format(self.method)) yield subset def __len__(self): return self.sampler_length def pdistsq(self, A): prod = torch.mm(A, A.t()) diag = prod.diag().unsqueeze(1).expand_as(prod) return (-2*prod + diag + diag.T) def greedy_coreset(self, A, samples): dist_matrix = self.pdistsq(A) coreset_anchor_dists = torch.norm(dist_matrix, dim=1) sampled_indices, i = [], 0 while i0 else '',metricname, metricval) full_result_str += '\n' print(full_result_str) ### for evaltype in evaltypes: for storage_metric in opt.storage_metrics: parent_metric = evaltype+'_{}'.format(storage_metric.split('@')[0]) if parent_metric not in LOG.progress_saver[log_key].groups.keys() or \ numeric_metrics[evaltype][storage_metric]>np.max(LOG.progress_saver[log_key].groups[parent_metric][storage_metric]['content']): print('Saved weights for best {}: {}\n'.format(log_key, parent_metric)) set_checkpoint(model, opt, LOG.progress_saver, LOG.prop.save_path+'/checkpoint_{}_{}_{}.pth.tar'.format(log_key, evaltype, storage_metric), aux=aux_store) ### if opt.log_online: for evaltype in histogr_metrics.keys(): for eval_metric, hist in histogr_metrics[evaltype].items(): import wandb, numpy wandb.log({log_key+': '+evaltype+'_{}'.format(eval_metric): wandb.Histogram(np_histogram=(list(hist),list(np.arange(len(hist)+1))))}, step=opt.epoch) wandb.log({log_key+': '+evaltype+'_LOG-{}'.format(eval_metric): wandb.Histogram(np_histogram=(list(np.log(hist)+20),list(np.arange(len(hist)+1))))}, step=opt.epoch) ### for evaltype in numeric_metrics.keys(): for eval_metric in numeric_metrics[evaltype].keys(): parent_metric = evaltype+'_{}'.format(eval_metric.split('@')[0]) LOG.progress_saver[log_key].log(eval_metric, 
numeric_metrics[evaltype][eval_metric], group=parent_metric) ### if make_recall_plot: recover_closest_standard(extra_infos[evaltype]['features'], extra_infos[evaltype]['image_paths'], LOG.prop.save_path+'/sample_recoveries.png') ########################### def set_checkpoint(model, opt, progress_saver, savepath, aux=None): if 'experiment' in vars(opt): import argparse save_opt = {key:item for key,item in vars(opt).items() if key!='experiment'} save_opt = argparse.Namespace(**save_opt) else: save_opt = opt torch.save({'state_dict':model.state_dict(), 'opt':save_opt, 'progress':progress_saver, 'aux':aux}, savepath) ########################## def recover_closest_standard(feature_matrix_all, image_paths, save_path, n_image_samples=10, n_closest=3): image_paths = np.array([x[0] for x in image_paths]) sample_idxs = np.random.choice(np.arange(len(feature_matrix_all)), n_image_samples) faiss_search_index = faiss.IndexFlatL2(feature_matrix_all.shape[-1]) faiss_search_index.add(feature_matrix_all) _, closest_feature_idxs = faiss_search_index.search(feature_matrix_all, n_closest+1) sample_paths = image_paths[closest_feature_idxs][sample_idxs] f,axes = plt.subplots(n_image_samples, n_closest+1) for i,(ax,plot_path) in enumerate(zip(axes.reshape(-1), sample_paths.reshape(-1))): ax.imshow(np.array(Image.open(plot_path))) ax.set_xticks([]) ax.set_yticks([]) if i%(n_closest+1): ax.axvline(x=0, color='g', linewidth=13) else: ax.axvline(x=0, color='r', linewidth=13) f.set_size_inches(10,20) f.tight_layout() f.savefig(save_path) plt.close() ================================================ FILE: main.py ================================================ """===================================================================================================""" ################### LIBRARIES ################### ### Basic Libraries import warnings warnings.filterwarnings("ignore") import os, sys, numpy as np, argparse, imp, datetime, pandas as pd, copy import time, pickle as pkl, random, 
json, collections import matplotlib matplotlib.use('agg') import matplotlib.pyplot as plt from tqdm import tqdm import parameters as par """===================================================================================================""" ################### INPUT ARGUMENTS ################### parser = argparse.ArgumentParser() parser = par.basic_training_parameters(parser) parser = par.batch_creation_parameters(parser) parser = par.batchmining_specific_parameters(parser) parser = par.loss_specific_parameters(parser) parser = par.wandb_parameters(parser) ##### Read in parameters opt = parser.parse_args() """===================================================================================================""" ### The following setting is useful when logging to wandb and running multiple seeds per setup: ### By setting the savename to 'group_plus_seed', the savename will instead comprise the group and the seed! if opt.savename=='group_plus_seed': if opt.log_online: opt.savename = opt.group+'_s{}'.format(opt.seed) else: opt.savename = '' ### If wandb-logging is turned on, initialize the wandb-run here: if opt.log_online: import wandb _ = os.system('wandb login {}'.format(opt.wandb_key)) os.environ['WANDB_API_KEY'] = opt.wandb_key wandb.init(project=opt.project, group=opt.group, name=opt.savename, dir=opt.save_path) wandb.config.update(opt) """===================================================================================================""" ### Load remaining libraries that need to be loaded after wandb is initialized import torch, torch.nn as nn import torch.multiprocessing torch.multiprocessing.set_sharing_strategy('file_system') import architectures as archs import datasampler as dsamplers import datasets as datasets import criteria as criteria import metrics as metrics import batchminer as bmine import evaluation as eval from utilities import misc from utilities import logger """==================================================================================================="""
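The savename convention handled above can be sketched as a small pure function (hypothetical helper, not part of the repository):

```python
def resolve_savename(savename, group, seed, log_online):
    # 'group_plus_seed' expands to '<group>_s<seed>' when logging online,
    # and to an empty savename otherwise (mirroring the branch in main.py).
    if savename == 'group_plus_seed':
        return '{}_s{}'.format(group, seed) if log_online else ''
    return savename
```

This keeps wandb run names unique across seeds within one experiment group.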
full_training_start_time = time.time() """===================================================================================================""" opt.source_path += '/'+opt.dataset opt.save_path += '/'+opt.dataset #Assert that the construction of the batch makes sense, i.e. the division into class-subclusters. assert not opt.bs%opt.samples_per_class, 'Batchsize needs to fit number of samples per class for distance sampling and margin/triplet loss!' opt.pretrained = not opt.not_pretrained """===================================================================================================""" ################### GPU SETTINGS ########################### os.environ["CUDA_DEVICE_ORDER"] ="PCI_BUS_ID" # if not opt.use_data_parallel: os.environ["CUDA_VISIBLE_DEVICES"]= str(opt.gpu[0]) """===================================================================================================""" #################### SEEDS FOR REPROD. ##################### torch.backends.cudnn.deterministic=True; np.random.seed(opt.seed); random.seed(opt.seed) torch.manual_seed(opt.seed); torch.cuda.manual_seed(opt.seed); torch.cuda.manual_seed_all(opt.seed) """===================================================================================================""" ##################### NETWORK SETUP ################## opt.device = torch.device('cuda') model = archs.select(opt.arch, opt) if opt.fc_lr<0: to_optim = [{'params':model.parameters(),'lr':opt.lr,'weight_decay':opt.decay}] else: all_but_fc_params = [x[-1] for x in list(filter(lambda x: 'last_linear' not in x[0], model.named_parameters()))] fc_params = model.model.last_linear.parameters() to_optim = [{'params':all_but_fc_params,'lr':opt.lr,'weight_decay':opt.decay}, {'params':fc_params,'lr':opt.fc_lr,'weight_decay':opt.decay}] _ = model.to(opt.device) """============================================================================""" #################### DATALOADER SETUPS ################## dataloaders = {} datasets = 
datasets.select(opt.dataset, opt, opt.source_path) dataloaders['evaluation'] = torch.utils.data.DataLoader(datasets['evaluation'], num_workers=opt.kernels, batch_size=opt.bs, shuffle=False) dataloaders['testing'] = torch.utils.data.DataLoader(datasets['testing'], num_workers=opt.kernels, batch_size=opt.bs, shuffle=False) if opt.use_tv_split: dataloaders['validation'] = torch.utils.data.DataLoader(datasets['validation'], num_workers=opt.kernels, batch_size=opt.bs,shuffle=False) train_data_sampler = dsamplers.select(opt.data_sampler, opt, datasets['training'].image_dict, datasets['training'].image_list) if train_data_sampler.requires_storage: train_data_sampler.create_storage(dataloaders['evaluation'], model, opt.device) dataloaders['training'] = torch.utils.data.DataLoader(datasets['training'], num_workers=opt.kernels, batch_sampler=train_data_sampler) opt.n_classes = len(dataloaders['training'].dataset.avail_classes) """============================================================================""" #################### CREATE LOGGING FILES ############### sub_loggers = ['Train', 'Test', 'Model Grad'] if opt.use_tv_split: sub_loggers.append('Val') LOG = logger.LOGGER(opt, sub_loggers=sub_loggers, start_new=True, log_online=opt.log_online) """============================================================================""" #################### LOSS SETUP #################### batchminer = bmine.select(opt.batch_mining, opt) criterion, to_optim = criteria.select(opt.loss, opt, to_optim, batchminer) _ = criterion.to(opt.device) if 'criterion' in train_data_sampler.name: train_data_sampler.internal_criterion = criterion """============================================================================""" #################### OPTIM SETUP #################### if opt.optim == 'adam': optimizer = torch.optim.Adam(to_optim) elif opt.optim == 'sgd': optimizer = torch.optim.SGD(to_optim, momentum=0.9) else: raise Exception('Optimizer <{}> not available!'.format(opt.optim)) scheduler 
= torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=opt.tau, gamma=opt.gamma) """============================================================================""" #################### METRIC COMPUTER #################### opt.rho_spectrum_embed_dim = opt.embed_dim metric_computer = metrics.MetricComputer(opt.evaluation_metrics, opt) """============================================================================""" ################### Summary #########################3 data_text = 'Dataset:\t {}'.format(opt.dataset.upper()) setup_text = 'Objective:\t {}'.format(opt.loss.upper()) miner_text = 'Batchminer:\t {}'.format(opt.batch_mining if criterion.REQUIRES_BATCHMINER else 'N/A') arch_text = 'Backbone:\t {} (#weights: {})'.format(opt.arch.upper(), misc.gimme_params(model)) summary = data_text+'\n'+setup_text+'\n'+miner_text+'\n'+arch_text print(summary) """============================================================================""" ################### SCRIPT MAIN ########################## print('\n-----\n') iter_count = 0 loss_args = {'batch':None, 'labels':None, 'batch_features':None, 'f_embed':None} for epoch in range(opt.n_epochs): epoch_start_time = time.time() if epoch>0 and opt.data_idx_full_prec and train_data_sampler.requires_storage: train_data_sampler.full_storage_update(dataloaders['evaluation'], model, opt.device) opt.epoch = epoch ### Scheduling Changes specifically for cosine scheduling if opt.scheduler!='none': print('Running with learning rates {}...'.format(' | '.join('{}'.format(x) for x in scheduler.get_lr()))) """=======================================""" if train_data_sampler.requires_storage: train_data_sampler.precompute_indices() """=======================================""" ### Train one epoch start = time.time() _ = model.train() loss_collect = [] data_iterator = tqdm(dataloaders['training'], desc='Epoch {} Training...'.format(epoch)) for i,out in enumerate(data_iterator): class_labels, input, input_indices = out ### Compute 
Embedding input = input.to(opt.device) model_args = {'x':input.to(opt.device)} # Needed for MixManifold settings. if 'mix' in opt.arch: model_args['labels'] = class_labels embeds = model(**model_args) if isinstance(embeds, tuple): embeds, (avg_features, features) = embeds ### Compute Loss loss_args['batch'] = embeds loss_args['labels'] = class_labels loss_args['f_embed'] = model.model.last_linear loss_args['batch_features'] = features loss = criterion(**loss_args) ### optimizer.zero_grad() loss.backward() ### Compute Model Gradients and log them! grads = np.concatenate([p.grad.detach().cpu().numpy().flatten() for p in model.parameters() if p.grad is not None]) grad_l2, grad_max = np.mean(np.sqrt(np.mean(np.square(grads)))), np.mean(np.max(np.abs(grads))) LOG.progress_saver['Model Grad'].log('Grad L2', grad_l2, group='L2') LOG.progress_saver['Model Grad'].log('Grad Max', grad_max, group='Max') ### Update network weights! optimizer.step() ### loss_collect.append(loss.item()) ### iter_count += 1 if i==len(dataloaders['training'])-1: data_iterator.set_description('Epoch (Train) {0}: Mean Loss [{1:.4f}]'.format(epoch, np.mean(loss_collect))) """=======================================""" if train_data_sampler.requires_storage and train_data_sampler.update_storage: train_data_sampler.replace_storage_entries(embeds.detach().cpu(), input_indices) result_metrics = {'loss': np.mean(loss_collect)} #### LOG.progress_saver['Train'].log('epochs', epoch) for metricname, metricval in result_metrics.items(): LOG.progress_saver['Train'].log(metricname, metricval) LOG.progress_saver['Train'].log('time', np.round(time.time()-start, 4)) """=======================================""" ### Evaluate Metric for Training & Test (& Validation) _ = model.eval() print('\nComputing Testing Metrics...') eval.evaluate(opt.dataset, LOG, metric_computer, [dataloaders['testing']], model, opt, opt.evaltypes, opt.device, log_key='Test') if opt.use_tv_split: print('\nComputing Validation Metrics...') 
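The per-iteration gradient logging in the training loop above reduces all parameter gradients to two scalars. A standalone sketch of that reduction (hypothetical helper name; input is a list of gradient arrays, one per parameter):

```python
import numpy as np

def grad_stats(grad_arrays):
    # Flatten every parameter gradient into one vector, then compute the
    # RMS ('Grad L2') and the maximum absolute entry ('Grad Max').
    flat = np.concatenate([g.flatten() for g in grad_arrays])
    grad_l2 = np.sqrt(np.mean(np.square(flat)))
    grad_max = np.max(np.abs(flat))
    return grad_l2, grad_max
```

Note that the extra `np.mean(...)` wrappers in the training loop act on scalars and leave the values unchanged.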
eval.evaluate(opt.dataset, LOG, metric_computer, [dataloaders['validation']], model, opt, opt.evaltypes, opt.device, log_key='Val') print('\nComputing Training Metrics...') eval.evaluate(opt.dataset, LOG, metric_computer, [dataloaders['evaluation']], model, opt, opt.evaltypes, opt.device, log_key='Train') LOG.update(all=True) """=======================================""" ### Learning Rate Scheduling Step if opt.scheduler != 'none': scheduler.step() print('Total Epoch Runtime: {0:4.2f}s'.format(time.time()-epoch_start_time)) print('\n-----\n') """=======================================================""" ### CREATE A SUMMARY TEXT FILE summary_text = '' full_training_time = time.time()-full_training_start_time summary_text += 'Training Time: {} min.\n'.format(np.round(full_training_time/60,2)) summary_text += '---------------\n' for sub_logger in LOG.sub_loggers: metrics = LOG.graph_writer[sub_logger].ov_title summary_text += '{} metrics: {}\n'.format(sub_logger.upper(), metrics) with open(opt.save_path+'/training_summary.txt','w') as summary_file: summary_file.write(summary_text) ================================================ FILE: metrics/__init__.py ================================================ from metrics import e_recall, nmi, f1, mAP, mAP_c, mAP_1000, mAP_lim from metrics import dists, rho_spectrum from metrics import c_recall, c_nmi, c_f1, c_mAP_c, c_mAP_1000, c_mAP_lim import numpy as np import faiss import torch from sklearn.preprocessing import normalize from tqdm import tqdm import copy def select(metricname, opt): #### Metrics based on euclidean distances if 'e_recall' in metricname: k = int(metricname.split('@')[-1]) return e_recall.Metric(k) elif metricname=='nmi': return nmi.Metric() elif metricname=='mAP': return mAP.Metric() elif metricname=='mAP_c': return mAP_c.Metric() elif metricname=='mAP_lim': return mAP_lim.Metric() elif metricname=='mAP_1000': return mAP_1000.Metric() elif metricname=='f1': return f1.Metric() #### Metrics based on cosine 
similarity elif 'c_recall' in metricname: k = int(metricname.split('@')[-1]) return c_recall.Metric(k) elif metricname=='c_nmi': return c_nmi.Metric() elif metricname=='c_mAP': return c_mAP.Metric() elif metricname=='c_mAP_c': return c_mAP_c.Metric() elif metricname=='c_mAP_lim': return c_mAP_lim.Metric() elif metricname=='c_mAP_1000': return c_mAP_1000.Metric() elif metricname=='c_f1': return c_f1.Metric() #### Generic Embedding space metrics elif 'dists' in metricname: mode = metricname.split('@')[-1] return dists.Metric(mode) elif 'rho_spectrum' in metricname: mode = int(metricname.split('@')[-1]) embed_dim = opt.rho_spectrum_embed_dim return rho_spectrum.Metric(embed_dim, mode=mode, opt=opt) else: raise NotImplementedError("Metric {} not available!".format(metricname)) class MetricComputer(): def __init__(self, metric_names, opt): self.pars = opt self.metric_names = metric_names self.list_of_metrics = [select(metricname, opt) for metricname in metric_names] self.requires = [metric.requires for metric in self.list_of_metrics] self.requires = list(set([x for y in self.requires for x in y])) def compute_standard(self, opt, model, dataloader, evaltypes, device, **kwargs): evaltypes = copy.deepcopy(evaltypes) n_classes = opt.n_classes image_paths = np.array([x[0] for x in dataloader.dataset.image_list]) _ = model.eval() ### feature_colls = {key:[] for key in evaltypes} ### with torch.no_grad(): target_labels = [] final_iter = tqdm(dataloader, desc='Embedding Data...'.format(len(evaltypes))) image_paths= [x[0] for x in dataloader.dataset.image_list] for idx,inp in enumerate(final_iter): input_img,target = inp[1], inp[0] target_labels.extend(target.numpy().tolist()) out = model(input_img.to(device)) if isinstance(out, tuple): out, aux_f = out ### Include embeddings of all output features for evaltype in evaltypes: if isinstance(out, dict): feature_colls[evaltype].extend(out[evaltype].cpu().detach().numpy().tolist()) else: 
feature_colls[evaltype].extend(out.cpu().detach().numpy().tolist()) target_labels = np.hstack(target_labels).reshape(-1,1) computed_metrics = {evaltype:{} for evaltype in evaltypes} extra_infos = {evaltype:{} for evaltype in evaltypes} ### faiss.omp_set_num_threads(self.pars.kernels) # faiss.omp_set_num_threads(self.pars.kernels) res = None torch.cuda.empty_cache() if self.pars.evaluate_on_gpu: res = faiss.StandardGpuResources() import time for evaltype in evaltypes: features = np.vstack(feature_colls[evaltype]).astype('float32') features_cosine = normalize(features, axis=1) start = time.time() """============ Compute k-Means ===============""" if 'kmeans' in self.requires: ### Set CPU Cluster index cluster_idx = faiss.IndexFlatL2(features.shape[-1]) if res is not None: cluster_idx = faiss.index_cpu_to_gpu(res, 0, cluster_idx) kmeans = faiss.Clustering(features.shape[-1], n_classes) kmeans.niter = 20 kmeans.min_points_per_centroid = 1 kmeans.max_points_per_centroid = 1000000000 ### Train Kmeans kmeans.train(features, cluster_idx) centroids = faiss.vector_float_to_array(kmeans.centroids).reshape(n_classes, features.shape[-1]) if 'kmeans_cosine' in self.requires: ### Set CPU Cluster index cluster_idx = faiss.IndexFlatL2(features_cosine.shape[-1]) if res is not None: cluster_idx = faiss.index_cpu_to_gpu(res, 0, cluster_idx) kmeans = faiss.Clustering(features_cosine.shape[-1], n_classes) kmeans.niter = 20 kmeans.min_points_per_centroid = 1 kmeans.max_points_per_centroid = 1000000000 ### Train Kmeans kmeans.train(features_cosine, cluster_idx) centroids_cosine = faiss.vector_float_to_array(kmeans.centroids).reshape(n_classes, features_cosine.shape[-1]) centroids_cosine = normalize(centroids,axis=1) """============ Compute Cluster Labels ===============""" if 'kmeans_nearest' in self.requires: faiss_search_index = faiss.IndexFlatL2(centroids.shape[-1]) if res is not None: faiss_search_index = faiss.index_cpu_to_gpu(res, 0, faiss_search_index) 
faiss_search_index.add(centroids) _, computed_cluster_labels = faiss_search_index.search(features, 1) if 'kmeans_nearest_cosine' in self.requires: faiss_search_index = faiss.IndexFlatIP(centroids_cosine.shape[-1]) if res is not None: faiss_search_index = faiss.index_cpu_to_gpu(res, 0, faiss_search_index) faiss_search_index.add(centroids_cosine) _, computed_cluster_labels_cosine = faiss_search_index.search(features_cosine, 1) """============ Compute Nearest Neighbours ===============""" if 'nearest_features' in self.requires: faiss_search_index = faiss.IndexFlatL2(features.shape[-1]) if res is not None: faiss_search_index = faiss.index_cpu_to_gpu(res, 0, faiss_search_index) faiss_search_index.add(features) max_kval = np.max([int(x.split('@')[-1]) for x in self.metric_names if 'recall' in x]) _, k_closest_points = faiss_search_index.search(features, int(max_kval+1)) k_closest_classes = target_labels.reshape(-1)[k_closest_points[:,1:]] if 'nearest_features_cosine' in self.requires: faiss_search_index = faiss.IndexFlatIP(features_cosine.shape[-1]) if res is not None: faiss_search_index = faiss.index_cpu_to_gpu(res, 0, faiss_search_index) faiss_search_index.add(normalize(features_cosine,axis=1)) max_kval = np.max([int(x.split('@')[-1]) for x in self.metric_names if 'recall' in x]) _, k_closest_points_cosine = faiss_search_index.search(normalize(features_cosine,axis=1), int(max_kval+1)) k_closest_classes_cosine = target_labels.reshape(-1)[k_closest_points_cosine[:,1:]] ### if self.pars.evaluate_on_gpu: features = torch.from_numpy(features).to(self.pars.device) features_cosine = torch.from_numpy(features_cosine).to(self.pars.device) start = time.time() for metric in self.list_of_metrics: input_dict = {} if 'features' in metric.requires: input_dict['features'] = features if 'target_labels' in metric.requires: input_dict['target_labels'] = target_labels if 'kmeans' in metric.requires: input_dict['centroids'] = centroids if 'kmeans_nearest' in metric.requires: 
input_dict['computed_cluster_labels'] = computed_cluster_labels if 'nearest_features' in metric.requires: input_dict['k_closest_classes'] = k_closest_classes if 'features_cosine' in metric.requires: input_dict['features_cosine'] = features_cosine if 'kmeans_cosine' in metric.requires: input_dict['centroids_cosine'] = centroids_cosine if 'kmeans_nearest_cosine' in metric.requires: input_dict['computed_cluster_labels_cosine'] = computed_cluster_labels_cosine if 'nearest_features_cosine' in metric.requires: input_dict['k_closest_classes_cosine'] = k_closest_classes_cosine computed_metrics[evaltype][metric.name] = metric(**input_dict) extra_infos[evaltype] = {'features':features, 'target_labels':target_labels, 'image_paths': dataloader.dataset.image_paths, 'query_image_paths':None, 'gallery_image_paths':None} torch.cuda.empty_cache() return computed_metrics, extra_infos ================================================ FILE: metrics/c_f1.py ================================================ import numpy as np from scipy.special import comb, binom import torch class Metric(): def __init__(self, **kwargs): self.requires = ['kmeans_cosine', 'kmeans_nearest_cosine', 'features_cosine', 'target_labels'] self.name = 'c_f1' def __call__(self, target_labels, computed_cluster_labels_cosine, features_cosine, centroids_cosine): import time start = time.time() if isinstance(features_cosine, torch.Tensor): features_cosine = features_cosine.detach().cpu().numpy() d = np.zeros(len(features_cosine)) for i in range(len(features_cosine)): d[i] = np.linalg.norm(features_cosine[i,:] - centroids_cosine[computed_cluster_labels_cosine[i],:]) start = time.time() labels_pred = np.zeros(len(features_cosine)) for i in np.unique(computed_cluster_labels_cosine): index = np.where(computed_cluster_labels_cosine == i)[0] ind = np.argmin(d[index]) cid = index[ind] labels_pred[index] = cid start = time.time() N = len(target_labels) # cluster n_labels avail_labels = np.unique(target_labels) n_labels = 
        # Count the number of objects in each ground-truth class.
        count_cluster = np.zeros(n_labels)
        for i in range(n_labels):
            count_cluster[i] = len(np.where(target_labels == avail_labels[i])[0])

        # Build a mapping from predicted cluster id to a dense index.
        keys = np.unique(labels_pred)
        num_item = len(keys)
        values = range(num_item)
        item_map = dict()
        for i in range(len(keys)):
            item_map.update([(keys[i], values[i])])

        # Count the number of objects in each predicted cluster.
        count_item = np.zeros(num_item)
        for i in range(N):
            index = item_map[labels_pred[i]]
            count_item[index] = count_item[index] + 1

        # True Positives (TP) plus False Positives (FP): pairs sharing a ground-truth class.
        tp_fp = comb(count_cluster, 2).sum()

        # True Positives (TP): same-class pairs that also share a predicted cluster.
        tp = 0
        for k in range(n_labels):
            member = np.where(target_labels == avail_labels[k])[0]
            member_ids = labels_pred[member]

            count = np.zeros(num_item)
            for j in range(len(member)):
                index = item_map[member_ids[j]]
                count[index] = count[index] + 1
            tp += comb(count, 2).sum()

        # False Positives (FP).
        fp = tp_fp - tp

        # False Negatives (FN): same-cluster pairs minus TP.
        count = comb(count_item, 2).sum()
        fn = count - tp

        # Compute F-measure (beta = 1).
        P = tp / (tp + fp)
        R = tp / (tp + fn)
        beta = 1
        F = (beta*beta + 1) * P * R / (beta*beta * P + R)
        return F


================================================
FILE: metrics/c_mAP_1000.py
================================================
import torch
import numpy as np
import faiss


class Metric():
    def __init__(self, **kwargs):
        self.requires = ['features_cosine', 'target_labels']
        self.name = 'c_mAP_1000'

    def __call__(self, target_labels, features_cosine):
        labels, freqs = np.unique(target_labels, return_counts=True)
        R = 1000

        faiss_search_index = faiss.IndexFlatIP(features_cosine.shape[-1])
        if isinstance(features_cosine, torch.Tensor):
            features_cosine = features_cosine.detach().cpu().numpy()
        res = faiss.StandardGpuResources()
        faiss_search_index = faiss.index_cpu_to_gpu(res, 0, faiss_search_index)
        faiss_search_index.add(features_cosine)
        nearest_neighbours = faiss_search_index.search(features_cosine, int(R+1))[1][:,1:]

        target_labels = target_labels.reshape(-1)
        nn_labels = target_labels[nearest_neighbours]

        avg_r_precisions = []
        for label, freq in zip(labels, freqs):
            rows_with_label = np.where(target_labels == label)[0]
            for row in rows_with_label:
                n_recalled_samples = np.arange(1, R+1)
                target_label_occ_in_row = nn_labels[row,:] == label
                cumsum_target_label_freq_row = np.cumsum(target_label_occ_in_row)
                avg_r_pr_row = np.sum(cumsum_target_label_freq_row * target_label_occ_in_row / n_recalled_samples) / freq
                avg_r_precisions.append(avg_r_pr_row)

        return np.mean(avg_r_precisions)


================================================
FILE: metrics/c_mAP_c.py
================================================
import torch
import numpy as np
import faiss


class Metric():
    def __init__(self, **kwargs):
        self.requires = ['features_cosine', 'target_labels']
        self.name = 'c_mAP_c'

    def __call__(self, target_labels, features_cosine):
        labels, freqs = np.unique(target_labels, return_counts=True)
        R = np.max(freqs)

        faiss_search_index = faiss.IndexFlatIP(features_cosine.shape[-1])
        if isinstance(features_cosine, torch.Tensor):
            features_cosine = features_cosine.detach().cpu().numpy()
        res = faiss.StandardGpuResources()
        faiss_search_index = faiss.index_cpu_to_gpu(res, 0, faiss_search_index)
        faiss_search_index.add(features_cosine)
        nearest_neighbours = faiss_search_index.search(features_cosine, int(R+1))[1][:,1:]

        target_labels = target_labels.reshape(-1)
        nn_labels = target_labels[nearest_neighbours]

        avg_r_precisions = []
        for label, freq in zip(labels, freqs):
            rows_with_label = np.where(target_labels == label)[0]
            for row in rows_with_label:
                n_recalled_samples = np.arange(1, freq+1)
                target_label_occ_in_row = nn_labels[row,:freq] == label
                cumsum_target_label_freq_row = np.cumsum(target_label_occ_in_row)
                avg_r_pr_row = np.sum(cumsum_target_label_freq_row * target_label_occ_in_row / n_recalled_samples) / freq
                avg_r_precisions.append(avg_r_pr_row)

        return np.mean(avg_r_precisions)


================================================
FILE: metrics/c_mAP_lim.py
================================================
import torch
import numpy as np
import faiss


class Metric():
    def __init__(self, **kwargs):
        self.requires = ['features_cosine', 'target_labels']
        self.name = 'c_mAP_lim'

    def __call__(self, target_labels, features_cosine):
        labels, freqs = np.unique(target_labels, return_counts=True)
        ## Account for the faiss search limit at k=1023.
        R = min(1023, len(features_cosine))

        faiss_search_index = faiss.IndexFlatIP(features_cosine.shape[-1])
        if isinstance(features_cosine, torch.Tensor):
            features_cosine = features_cosine.detach().cpu().numpy()
        res = faiss.StandardGpuResources()
        faiss_search_index = faiss.index_cpu_to_gpu(res, 0, faiss_search_index)
        faiss_search_index.add(features_cosine)
        nearest_neighbours = faiss_search_index.search(features_cosine, int(R+1))[1][:,1:]

        target_labels = target_labels.reshape(-1)
        nn_labels = target_labels[nearest_neighbours]

        avg_r_precisions = []
        for label, freq in zip(labels, freqs):
            rows_with_label = np.where(target_labels == label)[0]
            for row in rows_with_label:
                n_recalled_samples = np.arange(1, R+1)
                target_label_occ_in_row = nn_labels[row,:] == label
                cumsum_target_label_freq_row = np.cumsum(target_label_occ_in_row)
                avg_r_pr_row = np.sum(cumsum_target_label_freq_row * target_label_occ_in_row / n_recalled_samples) / freq
                avg_r_precisions.append(avg_r_pr_row)

        return np.mean(avg_r_precisions)


================================================
FILE: metrics/c_nmi.py
================================================
from sklearn import metrics


class Metric():
    def __init__(self, **kwargs):
        self.requires = ['kmeans_nearest_cosine', 'target_labels']
        self.name = 'c_nmi'

    def __call__(self, target_labels, computed_cluster_labels_cosine):
        NMI = metrics.cluster.normalized_mutual_info_score(computed_cluster_labels_cosine.reshape(-1), target_labels.reshape(-1))
        return NMI


================================================
FILE: metrics/c_recall.py
================================================
import numpy as np


class Metric():
    def __init__(self, k, **kwargs):
        self.k = k
        self.requires = ['nearest_features_cosine', 'target_labels']
        self.name = 'c_recall@{}'.format(k)

    def __call__(self, target_labels, k_closest_classes_cosine, **kwargs):
        recall_at_k = np.sum([1 for target, recalled_predictions in zip(target_labels, k_closest_classes_cosine)
                              if target in recalled_predictions[:self.k]]) / len(target_labels)
        return recall_at_k


================================================
FILE: metrics/compute_stack.py
================================================


================================================
FILE: metrics/dists.py
================================================
from scipy.spatial import distance
from sklearn.preprocessing import normalize
import numpy as np
import torch


class Metric():
    def __init__(self, mode, **kwargs):
        self.mode = mode
        self.requires = ['features', 'target_labels']
        self.name = 'dists@{}'.format(mode)

    def __call__(self, features, target_labels):
        features_locs = []
        for lab in np.unique(target_labels):
            features_locs.append(np.where(target_labels == lab)[0])

        if 'intra' in self.mode:
            if isinstance(features, torch.Tensor):
                intrafeatures = features.detach().cpu().numpy()
            else:
                intrafeatures = features

            intra_dists = []
            for loc in features_locs:
                c_dists = distance.cdist(intrafeatures[loc], intrafeatures[loc], 'cosine')
                c_dists = np.sum(c_dists) / (len(c_dists)**2 - len(c_dists))
                intra_dists.append(c_dists)
            intra_dists = np.array(intra_dists)
            maxval = np.max(intra_dists[~np.isnan(intra_dists)])
            intra_dists[np.isnan(intra_dists)] = maxval
            intra_dists[np.isinf(intra_dists)] = maxval
            dist_metric = dist_metric_intra = np.mean(intra_dists)

        if 'inter' in self.mode:
            if not isinstance(features, torch.Tensor):
                coms = []
                for loc in features_locs:
                    com = normalize(np.mean(features[loc], axis=0).reshape(1,-1)).reshape(-1)
                    coms.append(com)
                mean_inter_dist = distance.cdist(np.array(coms), np.array(coms), 'cosine')
                dist_metric = dist_metric_inter = np.sum(mean_inter_dist) / (len(mean_inter_dist)**2 - len(mean_inter_dist))
            else:
                coms = []
                for loc in features_locs:
                    com = torch.nn.functional.normalize(torch.mean(features[loc], dim=0).reshape(1,-1), dim=-1).reshape(1,-1)
                    coms.append(com)
                mean_inter_dist = 1 - torch.cat(coms, dim=0).mm(torch.cat(coms, dim=0).T).detach().cpu().numpy()
                dist_metric = dist_metric_inter = np.sum(mean_inter_dist) / (len(mean_inter_dist)**2 - len(mean_inter_dist))

        if self.mode == 'intra_over_inter':
            dist_metric = dist_metric_intra / np.clip(dist_metric_inter, 1e-8, None)

        return dist_metric


================================================
FILE: metrics/e_recall.py
================================================
import numpy as np


class Metric():
    def __init__(self, k, **kwargs):
        self.k = k
        self.requires = ['nearest_features', 'target_labels']
        self.name = 'e_recall@{}'.format(k)

    def __call__(self, target_labels, k_closest_classes, **kwargs):
        recall_at_k = np.sum([1 for target, recalled_predictions in zip(target_labels, k_closest_classes)
                              if target in recalled_predictions[:self.k]]) / len(target_labels)
        return recall_at_k


================================================
FILE: metrics/f1.py
================================================
import numpy as np
from scipy.special import comb, binom
import torch


class Metric():
    def __init__(self, **kwargs):
        self.requires = ['kmeans', 'kmeans_nearest', 'features', 'target_labels']
        self.name = 'f1'

    def __call__(self, target_labels, computed_cluster_labels, features, centroids):
        if isinstance(features, torch.Tensor):
            features = features.detach().cpu().numpy()

        # Distance of each sample to its assigned cluster centroid.
        d = np.zeros(len(features))
        for i in range(len(features)):
            d[i] = np.linalg.norm(features[i,:] - centroids[computed_cluster_labels[i],:])

        # Relabel each cluster by the index of its closest-to-centroid sample.
        labels_pred = np.zeros(len(features))
        for i in np.unique(computed_cluster_labels):
            index = np.where(computed_cluster_labels == i)[0]
            ind = np.argmin(d[index])
            cid = index[ind]
            labels_pred[index] = cid

        N = len(target_labels)

        # Number of ground-truth classes.
        avail_labels = np.unique(target_labels)
        n_labels = len(avail_labels)

        # Count the number of objects in each ground-truth class.
        count_cluster = np.zeros(n_labels)
        for i in range(n_labels):
            count_cluster[i] = len(np.where(target_labels == avail_labels[i])[0])

        # Build a mapping from predicted cluster id to a dense index.
        keys = np.unique(labels_pred)
        num_item = len(keys)
        values = range(num_item)
        item_map = dict()
        for i in range(len(keys)):
            item_map.update([(keys[i], values[i])])

        # Count the number of objects in each predicted cluster.
        count_item = np.zeros(num_item)
        for i in range(N):
            index = item_map[labels_pred[i]]
            count_item[index] = count_item[index] + 1

        # True Positives (TP) plus False Positives (FP): pairs sharing a ground-truth class.
        tp_fp = comb(count_cluster, 2).sum()

        # True Positives (TP): same-class pairs that also share a predicted cluster.
        tp = 0
        for k in range(n_labels):
            member = np.where(target_labels == avail_labels[k])[0]
            member_ids = labels_pred[member]

            count = np.zeros(num_item)
            for j in range(len(member)):
                index = item_map[member_ids[j]]
                count[index] = count[index] + 1
            tp += comb(count, 2).sum()

        # False Positives (FP).
        fp = tp_fp - tp

        # False Negatives (FN): same-cluster pairs minus TP.
        count = comb(count_item, 2).sum()
        fn = count - tp

        # Compute F-measure (beta = 1).
        P = tp / (tp + fp)
        R = tp / (tp + fn)
        beta = 1
        F = (beta*beta + 1) * P * R / (beta*beta * P + R)
        return F
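The mAP-style metrics that follow all share one cumulative-precision pattern: for each query, mark which ranked neighbours carry the query's label, accumulate precision at every hit, and average. A minimal standalone sketch of that idea (the helper name `mean_average_precision` and the toy arrays are illustrative, not from the repo; unlike the repo variants, this sketch normalizes by the number of relevant samples actually retrieved rather than the full class frequency, and does not cap the retrieval depth):

```python
import numpy as np

def mean_average_precision(nn_labels, query_labels):
    """Hypothetical helper: nn_labels[i] holds the labels of query i's
    retrieved neighbours (self-match removed), ranked by distance."""
    aps = []
    for row_labels, q in zip(nn_labels, query_labels):
        hits = row_labels == q                         # binary relevance per rank
        ranks = np.arange(1, len(row_labels) + 1)
        precision_at_hit = np.cumsum(hits) * hits / ranks
        aps.append(precision_at_hit.sum() / max(hits.sum(), 1))
    return float(np.mean(aps))

# Two queries, three retrieved neighbours each.
nn_labels = np.array([[0, 1, 0], [1, 1, 0]])
query_labels = np.array([0, 1])
```

The repo's variants differ only in the choice of R (full gallery, 1000, class frequency, or the faiss limit of 1023) and in using the class frequency as the AP denominator.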
================================================
FILE: metrics/mAP.py
================================================
import torch
import numpy as np
import faiss


class Metric():
    def __init__(self, **kwargs):
        self.requires = ['features', 'target_labels']
        self.name = 'mAP'

    def __call__(self, target_labels, features):
        labels, freqs = np.unique(target_labels, return_counts=True)
        R = len(features)

        faiss_search_index = faiss.IndexFlatL2(features.shape[-1])
        if isinstance(features, torch.Tensor):
            features = features.detach().cpu().numpy()
        res = faiss.StandardGpuResources()
        faiss_search_index = faiss.index_cpu_to_gpu(res, 0, faiss_search_index)
        faiss_search_index.add(features)
        nearest_neighbours = faiss_search_index.search(features, int(R+1))[1][:,1:]

        target_labels = target_labels.reshape(-1)
        nn_labels = target_labels[nearest_neighbours]

        avg_r_precisions = []
        for label, freq in zip(labels, freqs):
            rows_with_label = np.where(target_labels == label)[0]
            for row in rows_with_label:
                n_recalled_samples = np.arange(1, R+1)
                target_label_occ_in_row = nn_labels[row,:] == label
                cumsum_target_label_freq_row = np.cumsum(target_label_occ_in_row)
                avg_r_pr_row = np.sum(cumsum_target_label_freq_row * target_label_occ_in_row / n_recalled_samples) / freq
                avg_r_precisions.append(avg_r_pr_row)

        return np.mean(avg_r_precisions)


================================================
FILE: metrics/mAP_1000.py
================================================
import torch
import numpy as np
import faiss


class Metric():
    def __init__(self, **kwargs):
        self.requires = ['features', 'target_labels']
        self.name = 'mAP_1000'

    def __call__(self, target_labels, features):
        labels, freqs = np.unique(target_labels, return_counts=True)
        R = 1000

        faiss_search_index = faiss.IndexFlatL2(features.shape[-1])
        if isinstance(features, torch.Tensor):
            features = features.detach().cpu().numpy()
        res = faiss.StandardGpuResources()
        faiss_search_index = faiss.index_cpu_to_gpu(res, 0, faiss_search_index)
        faiss_search_index.add(features)
        nearest_neighbours = faiss_search_index.search(features, int(R+1))[1][:,1:]

        target_labels = target_labels.reshape(-1)
        nn_labels = target_labels[nearest_neighbours]

        avg_r_precisions = []
        for label, freq in zip(labels, freqs):
            rows_with_label = np.where(target_labels == label)[0]
            for row in rows_with_label:
                n_recalled_samples = np.arange(1, R+1)
                target_label_occ_in_row = nn_labels[row,:] == label
                cumsum_target_label_freq_row = np.cumsum(target_label_occ_in_row)
                avg_r_pr_row = np.sum(cumsum_target_label_freq_row * target_label_occ_in_row / n_recalled_samples) / freq
                avg_r_precisions.append(avg_r_pr_row)

        return np.mean(avg_r_precisions)


================================================
FILE: metrics/mAP_c.py
================================================
import torch
import numpy as np
import faiss


class Metric():
    def __init__(self, **kwargs):
        self.requires = ['features', 'target_labels']
        self.name = 'mAP_c'

    def __call__(self, target_labels, features):
        labels, freqs = np.unique(target_labels, return_counts=True)
        R = np.max(freqs)

        faiss_search_index = faiss.IndexFlatL2(features.shape[-1])
        if isinstance(features, torch.Tensor):
            features = features.detach().cpu().numpy()
        res = faiss.StandardGpuResources()
        faiss_search_index = faiss.index_cpu_to_gpu(res, 0, faiss_search_index)
        faiss_search_index.add(features)
        nearest_neighbours = faiss_search_index.search(features, int(R+1))[1][:,1:]

        target_labels = target_labels.reshape(-1)
        nn_labels = target_labels[nearest_neighbours]

        avg_r_precisions = []
        for label, freq in zip(labels, freqs):
            rows_with_label = np.where(target_labels == label)[0]
            for row in rows_with_label:
                n_recalled_samples = np.arange(1, freq+1)
                target_label_occ_in_row = nn_labels[row,:freq] == label
                cumsum_target_label_freq_row = np.cumsum(target_label_occ_in_row)
                avg_r_pr_row = np.sum(cumsum_target_label_freq_row * target_label_occ_in_row / n_recalled_samples) / freq
                avg_r_precisions.append(avg_r_pr_row)

        return np.mean(avg_r_precisions)


================================================
FILE: metrics/mAP_lim.py
================================================
import torch
import numpy as np
import faiss


class Metric():
    def __init__(self, **kwargs):
        self.requires = ['features', 'target_labels']
        self.name = 'mAP_lim'

    def __call__(self, target_labels, features):
        labels, freqs = np.unique(target_labels, return_counts=True)
        ## Account for the faiss search limit at k=1023.
        R = min(1023, len(features))

        faiss_search_index = faiss.IndexFlatL2(features.shape[-1])
        if isinstance(features, torch.Tensor):
            features = features.detach().cpu().numpy()
        res = faiss.StandardGpuResources()
        faiss_search_index = faiss.index_cpu_to_gpu(res, 0, faiss_search_index)
        faiss_search_index.add(features)
        nearest_neighbours = faiss_search_index.search(features, int(R+1))[1][:,1:]

        target_labels = target_labels.reshape(-1)
        nn_labels = target_labels[nearest_neighbours]

        avg_r_precisions = []
        for label, freq in zip(labels, freqs):
            rows_with_label = np.where(target_labels == label)[0]
            for row in rows_with_label:
                n_recalled_samples = np.arange(1, R+1)
                target_label_occ_in_row = nn_labels[row,:] == label
                cumsum_target_label_freq_row = np.cumsum(target_label_occ_in_row)
                avg_r_pr_row = np.sum(cumsum_target_label_freq_row * target_label_occ_in_row / n_recalled_samples) / freq
                avg_r_precisions.append(avg_r_pr_row)

        return np.mean(avg_r_precisions)


================================================
FILE: metrics/nmi.py
================================================
from sklearn import metrics


class Metric():
    def __init__(self, **kwargs):
        self.requires = ['kmeans_nearest', 'target_labels']
        self.name = 'nmi'

    def __call__(self, target_labels, computed_cluster_labels):
        NMI = metrics.cluster.normalized_mutual_info_score(computed_cluster_labels.reshape(-1), target_labels.reshape(-1))
        return NMI


================================================
FILE: metrics/rho_spectrum.py
================================================
from scipy.spatial import distance
from sklearn.preprocessing import normalize
import numpy as np


class Metric():
    def __init__(self, embed_dim, mode, **kwargs):
        self.mode = mode
        self.embed_dim = embed_dim
        self.requires = ['features']
        self.name = 'rho_spectrum@' + str(mode)

    def __call__(self, features):
        from sklearn.decomposition import TruncatedSVD
        from scipy.stats import entropy
        import torch

        if isinstance(features, torch.Tensor):
            _, s, _ = torch.svd(features)
            s = s.cpu().numpy()
        else:
            svd = TruncatedSVD(n_components=self.embed_dim-1, n_iter=7, random_state=42)
            svd.fit(features)
            s = svd.singular_values_

        if self.mode != 0:
            s = s[np.abs(self.mode)-1:]
        s_norm = s / np.sum(s)
        uniform = np.ones(len(s)) / (len(s))

        if self.mode < 0:
            kl = entropy(s_norm, uniform)
        if self.mode > 0:
            kl = entropy(uniform, s_norm)
        if self.mode == 0:
            kl = s_norm

        return kl


================================================
FILE: parameters.py
================================================
import argparse, os


#######################################
def basic_training_parameters(parser):
    ##### Dataset-related Parameters
    parser.add_argument('--dataset', default='cub200', type=str,
                        help='Dataset to use. Currently supported: cub200, cars196, online_products.')
    parser.add_argument('--use_tv_split', action='store_true',
                        help='Flag. If set, split the training set into a training/validation set.')
    parser.add_argument('--tv_split_by_samples', action='store_true',
                        help='Flag. If set, create the validation set by taking a percentage of samples PER class. '
                             'Otherwise, the validation set is created by taking a percentage of classes.')
    parser.add_argument('--tv_split_perc', default=0.8, type=float,
                        help='Percentage with which the training dataset is split into training/validation.')
    parser.add_argument('--augmentation', default='base', type=str,
                        help='Type of preprocessing/augmentation to use on the data. '
                             'Available: base (standard), adv (with color/brightness changes), big (images of size 256x256), red (no RandomResizedCrop).')
    ### General Training Parameters
    parser.add_argument('--lr', default=0.00001, type=float,
                        help='Learning rate for network parameters.')
    parser.add_argument('--fc_lr', default=-1, type=float,
                        help='Optional. If not -1, sets the learning rate for the final linear embedding layer.')
    parser.add_argument('--decay', default=0.0004, type=float,
                        help='Weight decay placed on network weights.')
    parser.add_argument('--n_epochs', default=150, type=int,
                        help='Number of training epochs.')
    parser.add_argument('--kernels', default=6, type=int,
                        help='Number of workers for the PyTorch dataloader.')
    parser.add_argument('--bs', default=112, type=int,
                        help='Mini-batchsize to use.')
    parser.add_argument('--seed', default=1, type=int,
                        help='Random seed for reproducibility.')
    parser.add_argument('--scheduler', default='step', type=str,
                        help='Type of learning rate scheduling. Currently supported: step.')
    parser.add_argument('--gamma', default=0.3, type=float,
                        help='Learning rate reduction after tau epochs.')
    parser.add_argument('--tau', default=[1000], nargs='+', type=int,
                        help='Stepsize before reducing learning rate.')

    ##### Loss-specific Settings
    parser.add_argument('--optim', default='adam', type=str,
                        help='Optimization method to use. Currently supported: adam & sgd.')
    parser.add_argument('--loss', default='margin', type=str,
                        help='Training criterion: for supported methods, please check criteria/__init__.py.')
    parser.add_argument('--batch_mining', default='distance', type=str,
                        help='Batchminer for tuple-based losses: for supported methods, please check batchminer/__init__.py.')

    ##### Network-related Flags
    parser.add_argument('--embed_dim', default=128, type=int,
                        help='Embedding dimensionality of the network. Note: dim = 64, 128 or 512 is used in most papers, depending on the architecture.')
    parser.add_argument('--not_pretrained', action='store_true',
                        help='Flag. If set, no ImageNet pretraining is used to initialize the network.')
    parser.add_argument('--arch', default='resnet50_frozen_normalize', type=str,
                        help='Underlying network architecture. "frozen" denotes that existing pretrained batchnorm layers are frozen, '
                             'and "normalize" denotes normalization of the output embedding.')

    ##### Evaluation Parameters
    parser.add_argument('--no_train_metrics', action='store_true',
                        help='Flag. If set, evaluation metrics are not computed for the training data. Saves a forward pass over the full training dataset.')
    parser.add_argument('--evaluate_on_gpu', action='store_true',
                        help='Flag. If set, all metrics, when possible, are computed on the GPU (requires Faiss-GPU).')
    parser.add_argument('--evaluation_metrics', nargs='+', type=str,
                        default=['e_recall@1', 'e_recall@2', 'e_recall@4', 'nmi', 'f1', 'mAP_1000', 'mAP_lim', 'mAP_c',
                                 'dists@intra', 'dists@inter', 'dists@intra_over_inter', 'rho_spectrum@0',
                                 'rho_spectrum@-1', 'rho_spectrum@1', 'rho_spectrum@2', 'rho_spectrum@10'],
                        help='Metrics to evaluate performance by.')
    parser.add_argument('--storage_metrics', nargs='+', default=['e_recall@1'], type=str,
                        help='Improvement in these metrics on a dataset triggers checkpointing.')
    parser.add_argument('--evaltypes', nargs='+', default=['discriminative'], type=str,
                        help='The network may produce multiple embeddings (ModuleDict, relevant for e.g. DiVA). '
                             'If the key is listed here, the entry will be evaluated on the evaluation metrics. '
                             'Note: one may use Combined_embed1_embed2_..._embedn-w1-w2-...-wn to compute evaluation metrics on weighted (normalized) combinations.')

    ##### Setup Parameters
    parser.add_argument('--gpu', default=[0], nargs='+', type=int,
                        help='GPU(s) to use.')
    parser.add_argument('--savename', default='group_plus_seed', type=str,
                        help='Run savename - if default, the savename will comprise the project and group name (see wandb_parameters()).')
    parser.add_argument('--source_path', default=os.getcwd()+'/../../Datasets', type=str,
                        help='Path to training data.')
    parser.add_argument('--save_path', default=os.getcwd()+'/Training_Results', type=str,
                        help='Where to save everything.')

    return parser


#######################################
def wandb_parameters(parser):
    ### Online Logging/Wandb Log Arguments
    parser.add_argument('--log_online', action='store_true',
                        help='Flag. If set, run metrics are stored online in addition to offline logging. Should generally be set.')
    parser.add_argument('--wandb_key', default='', type=str,
                        help='API key for W&B.')
    parser.add_argument('--project', default='Sample_Project', type=str,
                        help='Name of the project - relates to W&B project names. In the --savename default setting, part of the savename.')
    parser.add_argument('--group', default='Sample_Group', type=str,
                        help='Name of the group - relates to W&B group names - all runs with the same setup but different seeds are logged into one group. '
                             'In the --savename default setting, part of the savename.')

    return parser
#######################################
def loss_specific_parameters(parser):
    ### Contrastive Loss
    parser.add_argument('--loss_contrastive_pos_margin', default=0, type=float,
                        help='Positive margin for contrastive pairs.')
    parser.add_argument('--loss_contrastive_neg_margin', default=1, type=float,
                        help='Negative margin for contrastive pairs.')

    ### Triplet-based Losses
    parser.add_argument('--loss_triplet_margin', default=0.2, type=float,
                        help='Margin for Triplet Loss.')

    ### MarginLoss
    parser.add_argument('--loss_margin_margin', default=0.2, type=float,
                        help='Triplet margin.')
    parser.add_argument('--loss_margin_beta_lr', default=0.0005, type=float,
                        help='Learning rate for learnable class margin parameters in MarginLoss.')
    parser.add_argument('--loss_margin_beta', default=1.2, type=float,
                        help='Initial class margin parameter in MarginLoss.')
    parser.add_argument('--loss_margin_nu', default=0, type=float,
                        help='Regularisation value on betas in MarginLoss. Generally not needed.')
    parser.add_argument('--loss_margin_beta_constant', action='store_true',
                        help='Flag. If set, beta-values are left untrained.')

    ### ProxyNCA
    parser.add_argument('--loss_proxynca_lrmulti', default=50, type=float,
                        help='Learning rate multiplier for proxies in ProxyNCA.')
    # NOTE: The number of proxies is determined by the number of data classes.

    ### NPair
    parser.add_argument('--loss_npair_l2', default=0.005, type=float,
                        help='L2 weight in NPair. Note: set to 0.02 in the paper, but multiplied with 0.25 in their implementation.')

    ### Angular Loss
    parser.add_argument('--loss_angular_alpha', default=45, type=float,
                        help='Angular margin in degrees.')
    parser.add_argument('--loss_angular_npair_ang_weight', default=2, type=float,
                        help='Relative weighting between angular and npair contribution.')
    parser.add_argument('--loss_angular_npair_l2', default=0.005, type=float,
                        help='L2 weight on NPair (as embeddings are not normalized).')

    ### Multisimilarity Loss
    parser.add_argument('--loss_multisimilarity_pos_weight', default=2, type=float,
                        help='Weighting on positive similarities.')
    parser.add_argument('--loss_multisimilarity_neg_weight', default=40, type=float,
                        help='Weighting on negative similarities.')
    parser.add_argument('--loss_multisimilarity_margin', default=0.1, type=float,
                        help='Distance margin for both positive and negative similarities.')
    parser.add_argument('--loss_multisimilarity_thresh', default=0.5, type=float,
                        help='Exponential thresholding.')

    ### Lifted Structure Loss
    parser.add_argument('--loss_lifted_neg_margin', default=1, type=float,
                        help='Margin placed on similarities.')
    parser.add_argument('--loss_lifted_l2', default=0.005, type=float,
                        help='As embeddings are not normalized, they need to be placed under penalty.')

    ### Quadruplet Loss
    parser.add_argument('--loss_quadruplet_margin_alpha_1', default=0.2, type=float,
                        help='Quadruplet Loss requires two margins. This is the first one.')
    parser.add_argument('--loss_quadruplet_margin_alpha_2', default=0.2, type=float,
                        help='This is the second one.')

    ### Soft-Triple Loss
    parser.add_argument('--loss_softtriplet_n_centroids', default=2, type=int,
                        help='Number of proxies per class.')
    parser.add_argument('--loss_softtriplet_margin_delta', default=0.01, type=float,
                        help='Margin placed on sample-proxy similarities.')
    parser.add_argument('--loss_softtriplet_gamma', default=0.1, type=float,
                        help='Weight over sample-proxies within a class.')
    parser.add_argument('--loss_softtriplet_lambda', default=8, type=float,
                        help='Serves as a temperature.')
    parser.add_argument('--loss_softtriplet_reg_weight', default=0.2, type=float,
                        help='Regularization weight on the number of proxies.')
    parser.add_argument('--loss_softtriplet_lrmulti', default=1, type=float,
                        help='Learning rate multiplier for proxies.')

    ### Normalized Softmax Loss
    parser.add_argument('--loss_softmax_lr', default=0.00001, type=float,
                        help='Learning rate on class proxies.')
    parser.add_argument('--loss_softmax_temperature', default=0.05, type=float,
                        help='Temperature for NCA objective.')

    ### Histogram Loss
    parser.add_argument('--loss_histogram_nbins', default=65, type=int,
                        help='Number of bins for histogram discretization.')

    ### SNR Triplet (with learnable margin) Loss
    parser.add_argument('--loss_snr_margin', default=0.2, type=float,
                        help='Triplet margin.')
    parser.add_argument('--loss_snr_reg_lambda', default=0.005, type=float,
                        help='Regularization of in-batch element sum.')

    ### ArcFace
    parser.add_argument('--loss_arcface_lr', default=0.0005, type=float,
                        help='Learning rate on class proxies.')
    parser.add_argument('--loss_arcface_angular_margin', default=0.5, type=float,
                        help='Angular margin in radians.')
    parser.add_argument('--loss_arcface_feature_scale', default=16, type=float,
                        help='Inverse temperature for NCA objective.')

    return parser


#######################################
def batchmining_specific_parameters(parser):
    ### Distance-based Batchminer
    parser.add_argument('--miner_distance_lower_cutoff', default=0.5, type=float,
                        help='Lower cutoff on distances - values below are sampled with equal probability.')
    parser.add_argument('--miner_distance_upper_cutoff', default=1.4, type=float,
                        help='Upper cutoff on distances - values above are IGNORED.')

    ### Spectrum-Regularized Miner (as proposed in our paper) - utilizes a distance-based sampler that is regularized.
    parser.add_argument('--miner_rho_distance_lower_cutoff', default=0.5, type=float,
                        help='Lower cutoff on distances - values below are sampled with equal probability.')
    parser.add_argument('--miner_rho_distance_upper_cutoff', default=1.4, type=float,
                        help='Upper cutoff on distances - values above are IGNORED.')
    parser.add_argument('--miner_rho_distance_cp', default=0.2, type=float,
                        help='Probability to replace a negative with a positive.')

    return parser


#######################################
def batch_creation_parameters(parser):
    parser.add_argument('--data_sampler', default='class_random', type=str,
                        help='How the batch is created. Available options: see datasampler/__init__.py.')
    parser.add_argument('--samples_per_class', default=2, type=int,
                        help='Number of samples drawn from one class before choosing the next class. Set to >1 for tuple-based losses.')

    ### Batch-Sample Flags - Have no relevance to default SPC-N sampling
    parser.add_argument('--data_batchmatch_bigbs', default=512, type=int,
                        help='Size of batch to be summarized into a smaller batch. For distillation/coreset-based methods.')
    parser.add_argument('--data_batchmatch_ncomps', default=10, type=int,
                        help='Number of batch candidates that are evaluated, from which the best one is chosen.')
    parser.add_argument('--data_storage_no_update', action='store_true',
                        help='Flag for methods that need a sample storage. If set, storage entries are NOT updated.')
    parser.add_argument('--data_d2_coreset_lambda', default=1, type=float,
                        help='Regularisation for D2-coreset.')
    parser.add_argument('--data_gc_coreset_lim', default=1e-9, type=float,
                        help='D2-coreset value limit.')
    parser.add_argument('--data_sampler_lowproj_dim', default=-1, type=int,
                        help='Optionally project embeddings into a lower dimension to ensure that greedy coreset works better. Only makes a difference for large embedding dims.')
    parser.add_argument('--data_sim_measure', default='euclidean', type=str,
                        help='Distance measure to use for batch selection.')
    parser.add_argument('--data_gc_softened', action='store_true',
                        help='Flag. If set, use a soft version of greedy coreset.')
    parser.add_argument('--data_idx_full_prec', action='store_true',
                        help='Deprecated.')
    parser.add_argument('--data_mb_mom', default=-1, type=float,
                        help='For memory-bank based samplers - momentum term on storage entry updates.')
    parser.add_argument('--data_mb_lr', default=1, type=float,
                        help='Deprecated.')

    return parser


================================================
FILE: toy_experiments/toy_example_diagonal_lines.py
================================================
import os, numpy as np, matplotlib.pyplot as plt
import torch, torch.nn as nn, torchvision as tv
import random

"""==================================================="""
seed = 1
torch.backends.cudnn.deterministic = True
np.random.seed(seed)
random.seed(seed)
torch.manual_seed(seed)
torch.cuda.manual_seed(seed)
torch.cuda.manual_seed_all(seed)

"""==================================================="""
ppline = 100
n_lines = 4
noise_perc = 0.15
intervals = [(0.1,0.3), (0.35,0.55), (0.6,0.8), (0.85,1.05)]
lines = [np.stack([np.linspace(intv[0],intv[1],ppline), np.linspace(intv[0],intv[1],ppline)])[:,np.random.choice(ppline, int(ppline*noise_perc), replace=False)] for intv in intervals]
cls = [x*np.ones(int(ppline*noise_perc)) for x in range(n_lines)]
train_lines = np.concatenate(lines, axis=1).T
train_lines = np.concatenate(lines, axis=1).T
train_cls   = np.concatenate(cls)

x_test_line1 = np.stack([0.2*np.ones(ppline), np.linspace(0.2,0.4,ppline)])[:,np.random.choice(ppline, int(ppline*noise_perc), replace=False)]
x_test_line2 = np.stack([0.2*np.ones(ppline), np.linspace(0.55,0.85,ppline)])[:,np.random.choice(ppline, int(ppline*noise_perc), replace=False)]
y_test_line1 = np.stack([np.linspace(0.4,0.6,ppline), 0.2*np.ones(ppline)])[:,np.random.choice(ppline, int(ppline*noise_perc), replace=False)]
y_test_line2 = np.stack([np.linspace(0.7,0.9,ppline), 0.2*np.ones(ppline)])[:,np.random.choice(ppline, int(ppline*noise_perc), replace=False)]

# for line in lines:
#     plt.plot(line[0,:], line[1,:], '.', markersize=6)
# plt.plot(x_test_line1[0,:], x_test_line1[1,:])
# plt.plot(x_test_line2[0,:], x_test_line2[1,:])
# plt.plot(y_test_line1[0,:], y_test_line1[1,:])
# plt.plot(y_test_line2[0,:], y_test_line2[1,:])

###############
os.environ["CUDA_DEVICE_ORDER"]    = "PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"] = '1'

###############
import itertools as it
from tqdm import tqdm
import torch.nn.functional as F

bs         = 24
lr         = 0.03
neg_margin = 0.1
train_iter = 200
device     = torch.device('cpu')

###############
class Backbone(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(2,30), nn.ReLU(), nn.Linear(30,30), nn.ReLU(), nn.Linear(30,2))

    def forward(self, x):
        return torch.nn.functional.normalize(self.backbone(x), dim=1)

###############
base_net     = Backbone()
main_reg_net = Backbone()

###############
def train(net2train, p_switch=0):
    device = torch.device('cpu')
    _ = net2train.train()
    _ = net2train.to(device)

    optim        = torch.optim.Adam(net2train.parameters(), lr=lr)
    loss_collect = []

    for i in range(train_iter):
        idxs         = np.random.choice(len(train_lines), bs, replace=False)
        batch        = torch.from_numpy(train_lines[idxs,:]).to(torch.float).to(device)
        train_labels = train_cls[idxs]
        embed        = net2train(batch)

        ### Exhaustively enumerate all valid (anchor, positive, negative) triplets in the batch.
        unique_cls = np.unique(train_labels)
        indices    = np.arange(len(batch))
        class_dict = {i:indices[train_labels==i] for i in unique_cls}

        sampled_triplets = [list(it.product([x],[x],[y for y in unique_cls if x!=y])) for x in unique_cls]
        sampled_triplets = [x for y in sampled_triplets for x in y]
        sampled_triplets = [[x for x in list(it.product(*[class_dict[j] for j in i])) if x[0]!=x[1]] for i in sampled_triplets]
        sampled_triplets = [x for y in sampled_triplets for x in y]

        anchors   = [triplet[0] for triplet in sampled_triplets]
        positives = [triplet[1] for triplet in sampled_triplets]
        negatives = [triplet[2] for triplet in sampled_triplets]

        if p_switch>0:
            ### With probability p_switch, replace a negative with its positive counterpart (spectrum regularization).
            negatives = [p if np.random.choice(2, p=[1-p_switch, p_switch]) else n for n,p in zip(negatives, positives)]
            neg_dists = torch.mean(F.relu(neg_margin - nn.PairwiseDistance(p=2)(embed[anchors,:], embed[negatives,:])))
            loss      = neg_dists
        else:
            pos_dists = torch.mean(F.relu(nn.PairwiseDistance(p=2)(embed[anchors,:], embed[positives,:])))
            neg_dists = torch.mean(F.relu(neg_margin - nn.PairwiseDistance(p=2)(embed[anchors,:], embed[negatives,:])))
            loss      = pos_dists + neg_dists

        optim.zero_grad()
        loss.backward()
        optim.step()

        loss_collect.append(loss.item())

    return loss_collect

###############
base_loss = train(base_net)
_         = train(main_reg_net, p_switch=0.001)

###############
def get_embeds(net):
    _ = net.eval()
    with torch.no_grad():
        train_embed        = net(torch.from_numpy(train_lines).to(torch.float).to(device)).cpu().detach().numpy()
        x_embed_test_line1 = net(torch.from_numpy(x_test_line1.T).to(torch.float).to(device)).cpu().detach().numpy()
        x_embed_test_line2 = net(torch.from_numpy(x_test_line2.T).to(torch.float).to(device)).cpu().detach().numpy()
        y_embed_test_line1 = net(torch.from_numpy(y_test_line1.T).to(torch.float).to(device)).cpu().detach().numpy()
        y_embed_test_line2 = net(torch.from_numpy(y_test_line2.T).to(torch.float).to(device)).cpu().detach().numpy()
    _, s, _ = np.linalg.svd(train_embed)
    s = s/np.sum(s)
    return train_embed, x_embed_test_line1, x_embed_test_line2, y_embed_test_line1, y_embed_test_line2, s

###############
base_embed, x_base_t1, x_base_t2, y_base_t1, y_base_t2, base_s = get_embeds(base_net)
sp = get_embeds(main_reg_net)

### Unit circle for reference (embeddings are L2-normalized).
theta = np.radians(np.linspace(0,360,300))
x_2   = np.cos(theta)
y_2   = np.sin(theta)

###
plt.style.use('default')
import matplotlib.cm as cm
colors = cm.rainbow(np.linspace(0, 0.5, n_lines))
colors = np.array([colors[int(sample_cls)] for sample_cls in train_cls][::-1])

f,ax = plt.subplots(1,4)

for i in range(len(lines)):
    loc = np.where(train_cls==i)[0]
    ax[0].scatter(train_lines[loc,0], train_lines[loc,1], color=list(colors[loc,:]), label='Train Cls {}'.format(i), s=40)
ax[0].scatter(x_test_line1[0,:], x_test_line1[1,:], marker='x', label='Test Cls 1', color='r', s=40)
ax[0].scatter(x_test_line2[0,:], x_test_line2[1,:], marker='x', label='Test Cls 2', color='black', s=60)
ax[0].scatter(y_test_line1[0,:], y_test_line1[1,:], marker='^', label='Test Cls 3', color='brown', s=40)
ax[0].scatter(y_test_line2[0,:], y_test_line2[1,:], marker='^', label='Test Cls 4', color='magenta', s=60)

ax[1].plot(x_2, y_2, '--', color='gray', label='Unit Circle')
for i in range(len(lines)):
    loc = np.where(train_cls==i)[0]
    ax[1].scatter(base_embed[loc,0], base_embed[loc,1], color=list(colors[loc,:]), s=60)
ax[1].scatter(x_base_t1[:,0], x_base_t1[:,1], marker='x', color='r', s=60)
ax[1].scatter(x_base_t2[:,0], x_base_t2[:,1], marker='x', color='black', s=60)
ax[1].scatter(y_base_t1[:,0], y_base_t1[:,1], marker='^', color='brown', s=60)
ax[1].scatter(y_base_t2[:,0], y_base_t2[:,1], marker='^', color='magenta', s=60)
ax[1].set_xlim([np.min(base_embed[:,0])*0.85, np.max(base_embed[:,0])*1.15])
ax[1].set_ylim([np.min(base_embed[:,1])*1.15, np.max(base_embed[:,1])*0.85])

ax[2].plot(x_2, y_2, '--', color='gray', label='Unit Circle')
for i in range(len(lines)):
    loc = np.where(train_cls==i)[0]
    ax[2].scatter(sp[0][loc,0], sp[0][loc,1], color=list(colors[loc,:]), alpha=0.4, s=60)
ax[2].scatter(sp[1][:,0], sp[1][:,1], marker='x', color='r', s=60)
ax[2].scatter(sp[2][:,0], sp[2][:,1], marker='x', color='black', s=60)
ax[2].scatter(sp[3][:,0], sp[3][:,1], marker='^', color='brown', s=60)
ax[2].scatter(sp[4][:,0], sp[4][:,1], marker='^', color='magenta', s=60)
ax[2].set_xlim([np.min(sp[0][:,0])*1.15, np.max(sp[0][:,0])*1.15])
ax[2].set_ylim([np.min(sp[0][:,1])*1.15, np.max(sp[0][:,1])*1.15])

ax[3].bar(np.array([0,0.25]),   base_s, width=0.25, alpha=0.5, edgecolor='k', label=r'$Base$')
ax[3].bar(np.array([0.6,0.85]), sp[5],  width=0.25, alpha=0.5, edgecolor='k', label=r'$Reg.~Emb.$')
ax[3].set_xticks([0,0.25,0.6,0.85])
ax[3].set_xticklabels([1,2,1,2])

# ax[1].text(0.5, -1.005, r'$Test~Classes$', fontsize=17, bbox=dict(boxstyle='round', facecolor='white', alpha=0.5))
# ax[2].text(-0.45, -0.05, r'$Test~Classes$', fontsize=17, bbox=dict(boxstyle='round', facecolor='white', alpha=0.5))
# ax[1].annotate("", xy=(0.5, -0.9),   xytext=(0.55, -0.98), arrowprops=dict(facecolor='k', headwidth=5, width=2, shrink=0, alpha=1))
# ax[1].annotate("", xy=(0.55, -0.85), xytext=(0.55, -0.98), arrowprops=dict(facecolor='k', headwidth=5, width=2, shrink=0, alpha=1))
# ax[1].annotate("", xy=(0.56, -0.86), xytext=(0.55, -0.98), arrowprops=dict(facecolor='k', headwidth=5, width=2, shrink=0, alpha=1))
# ax[1].annotate("", xy=(0.59, -0.82), xytext=(0.55, -0.98), arrowprops=dict(facecolor='k', headwidth=5, width=2, shrink=0, alpha=1))
# ax[2].annotate("", xy=(0.2, 0.8),   xytext=(0.2, -0.02), arrowprops=dict(facecolor='k', headwidth=5, width=2, shrink=0, alpha=1))
# ax[2].annotate("", xy=(0.9, 0.1),   xytext=(0.2, -0.02), arrowprops=dict(facecolor='k', headwidth=5, width=2, shrink=0, alpha=1))
# ax[2].annotate("", xy=(0.8, -0.5),  xytext=(0.2, -0.02), arrowprops=dict(facecolor='k', headwidth=5, width=2, shrink=0, alpha=1))
# ax[2].annotate("", xy=(0.7, -0.52), xytext=(0.2, -0.02), arrowprops=dict(facecolor='k', headwidth=5, width=2, shrink=0, alpha=1))

ax[0].set_title(r'$Train/Test~Data$', fontsize=22)
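The spectrum bars plotted above in `ax[3]` compare the normalized singular values that `get_embeds` computes for the base and the regularized embedding; a flatter spectrum means variance is spread over more embedding directions. A self-contained sketch of that spectrum computation (plain NumPy, matching the `np.linalg.svd` call in this script):

```python
import numpy as np

def sv_spectrum(embeds):
    # Normalized singular value spectrum of an (N, d) embedding matrix,
    # as at the end of get_embeds(); the values sum to one, and a flat
    # spectrum indicates variance spread across many directions.
    s = np.linalg.svd(embeds, compute_uv=False)
    return s / np.sum(s)
```

For the 2-D toy embeddings this yields two values per network, which is exactly what the bar chart shows.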
ax[1].set_title(r'$Base~Embed.$', fontsize=22)
ax[2].set_title(r'$Regularized~Embed.$', fontsize=22)
ax[3].set_title(r'$SV~Spectrum$', fontsize=22)

ax[0].legend(loc='upper center', fontsize=16)
ax[1].legend(fontsize=16)
ax[2].legend(loc='center left', fontsize=16)
# ax[1].legend(fontsize=16)
# ax[2].legend(loc=4, fontsize=16)
ax[3].legend(loc=1, fontsize=16)

for a in ax.reshape(-1):
    a.tick_params(axis='both', which='major', labelsize=20)
    a.tick_params(axis='both', which='minor', labelsize=20)

f.set_size_inches(22,6)
f.tight_layout()
f.savefig('diag_line_toy_ex_save.png')
f.savefig('diag_line_toy_ex_save.pdf')
plt.close()



================================================
FILE: utilities/__init__.py
================================================



================================================
FILE: utilities/logger.py
================================================
import datetime, csv, os, numpy as np
from matplotlib import pyplot as plt
import pickle as pkl
from utilities.misc import gimme_save_string

"""============================================================================================================="""
################## WRITE TO CSV FILE #####################
class CSV_Writer():
    def __init__(self, save_path):
        self.save_path       = save_path
        self.written         = []
        self.n_written_lines = {}

    def log(self, group, segments, content):
        if group not in self.n_written_lines.keys():
            self.n_written_lines[group] = 0

        with open(self.save_path+'_'+group+'.csv', "a") as csv_file:
            writer = csv.writer(csv_file, delimiter=",")
            if group not in self.written:
                writer.writerow(segments)
            for line in content:
                writer.writerow(line)
                self.n_written_lines[group] += 1

        self.written.append(group)


################## PLOT SUMMARY IMAGE #####################
class InfoPlotter():
    def __init__(self, save_path, title='Training Log', figsize=(25,19)):
        self.save_path = save_path
        self.title     = title
        self.figsize   = figsize
        self.colors    = ['r','g','b','y','m','c','orange','darkgreen','lightblue']
    def make_plot(self, base_title, title_append, sub_plots, sub_plots_data):
        sub_plots = list(sub_plots)
        if 'epochs' not in sub_plots:
            x_data = range(len(sub_plots_data[0]))
        else:
            x_data = range(sub_plots_data[np.where(np.array(sub_plots)=='epochs')[0][0]][-1]+1)

        self.ov_title = [(sub_plot, sub_plot_data) for sub_plot, sub_plot_data in zip(sub_plots, sub_plots_data) if sub_plot not in ['epoch','epochs','time']]
        self.ov_title = [(x[0],np.max(x[1])) if 'loss' not in x[0] else (x[0],np.min(x[1])) for x in self.ov_title]
        self.ov_title = title_append+': '+' | '.join('{0}: {1:.4f}'.format(x[0],x[1]) for x in self.ov_title)
        sub_plots_data = [x for x,y in zip(sub_plots_data, sub_plots)]
        sub_plots      = [x for x in sub_plots]

        plt.style.use('ggplot')
        f,ax = plt.subplots(1)
        ax.set_title(self.ov_title, fontsize=22)
        for i,(data, title) in enumerate(zip(sub_plots_data, sub_plots)):
            ax.plot(x_data, data, '-{}'.format(self.colors[i]), linewidth=1.7, label=base_title+' '+title)
        ax.tick_params(axis='both', which='major', labelsize=18)
        ax.tick_params(axis='both', which='minor', labelsize=18)
        ax.legend(loc=2, prop={'size': 16})
        f.set_size_inches(self.figsize[0], self.figsize[1])
        f.savefig(self.save_path+'_'+title_append+'.svg')
        plt.close()


################## GENERATE LOGGING FOLDER/FILES #######################
def set_logging(opt):
    checkfolder = opt.save_path+'/'+opt.savename
    if opt.savename == '':
        date = datetime.datetime.now()
        time_string = '{}-{}-{}-{}-{}-{}'.format(date.year, date.month, date.day, date.hour, date.minute, date.second)
        checkfolder = opt.save_path+'/{}_{}_'.format(opt.dataset.upper(), opt.arch.upper())+time_string
    counter = 1
    while os.path.exists(checkfolder):
        checkfolder = opt.save_path+'/'+opt.savename+'_'+str(counter)
        counter += 1
    os.makedirs(checkfolder)
    opt.save_path = checkfolder

    if 'experiment' in vars(opt):
        import argparse
        save_opt = {key:item for key,item in vars(opt).items() if key!='experiment'}
        save_opt = argparse.Namespace(**save_opt)
    else:
        save_opt = opt

    with open(save_opt.save_path+'/Parameter_Info.txt','w') as f:
        f.write(gimme_save_string(save_opt))
    pkl.dump(save_opt, open(save_opt.save_path+"/hypa.pkl","wb"))


class Progress_Saver():
    def __init__(self):
        self.groups = {}

    def log(self, segment, content, group=None):
        if group is None:
            group = segment
        if group not in self.groups.keys():
            self.groups[group] = {}
        if segment not in self.groups[group].keys():
            self.groups[group][segment] = {'content':[], 'saved_idx':0}
        self.groups[group][segment]['content'].append(content)


class LOGGER():
    def __init__(self, opt, sub_loggers=[], prefix=None, start_new=True, log_online=False):
        """
        LOGGER Internal Structure:

        self.progress_saver: Contains multiple Progress_Saver instances to log metrics for main metric subsets
                             (e.g. "Train" for training metrics).
            ['main_subset_name']: Name of each main subset (-> e.g. "Train")
                .groups: Dictionary of subsets belonging to one of the main subsets, e.g. ["Recall", "NMI", ...]
                    ['specific_metric_name']: Specific name of the metric of interest, e.g. Recall@1.
        """
        self.prop        = opt
        self.prefix      = '{}_'.format(prefix) if prefix is not None else ''
        self.sub_loggers = sub_loggers

        ### Make Logging Directories
        if start_new:
            set_logging(opt)

        ### Set Graph and CSV writer
        self.csv_writer, self.graph_writer, self.progress_saver = {},{},{}
        for sub_logger in sub_loggers:
            csv_savepath = opt.save_path+'/CSV_Logs'
            if not os.path.exists(csv_savepath):
                os.makedirs(csv_savepath)
            self.csv_writer[sub_logger] = CSV_Writer(csv_savepath+'/Data_{}{}'.format(self.prefix, sub_logger))

            prgs_savepath = opt.save_path+'/Progression_Plots'
            if not os.path.exists(prgs_savepath):
                os.makedirs(prgs_savepath)
            self.graph_writer[sub_logger] = InfoPlotter(prgs_savepath+'/Graph_{}{}'.format(self.prefix, sub_logger))

            self.progress_saver[sub_logger] = Progress_Saver()

        ### WandB Init
        self.save_path  = opt.save_path
        self.log_online = log_online

    def update(self, *sub_loggers, all=False):
        online_content = []

        if all:
            sub_loggers = self.sub_loggers

        for sub_logger in list(sub_loggers):
            for group in self.progress_saver[sub_logger].groups.keys():
                pgs      = self.progress_saver[sub_logger].groups[group]
                segments = pgs.keys()
                per_seg_saved_idxs   = [pgs[segment]['saved_idx'] for segment in segments]
                per_seg_contents     = [pgs[segment]['content'][idx:] for segment,idx in zip(segments, per_seg_saved_idxs)]
                per_seg_contents_all = [pgs[segment]['content'] for segment,idx in zip(segments, per_seg_saved_idxs)]

                # Adjust indexes
                for content,segment in zip(per_seg_contents, segments):
                    self.progress_saver[sub_logger].groups[group][segment]['saved_idx'] += len(content)

                tupled_seg_content = [list(seg_content_slice) for seg_content_slice in zip(*per_seg_contents)]

                self.csv_writer[sub_logger].log(group, segments, tupled_seg_content)
                self.graph_writer[sub_logger].make_plot(sub_logger, group, segments, per_seg_contents_all)

                for i,segment in enumerate(segments):
                    if group == segment:
                        name = sub_logger+': '+group
                    else:
                        name = sub_logger+': '+group+': '+segment
                    online_content.append((name, per_seg_contents[i]))

        if self.log_online:
            if self.prop.online_backend=='wandb':
                import wandb
                for i,item in enumerate(online_content):
                    if isinstance(item[1], list):
                        wandb.log({item[0]:np.mean(item[1])}, step=self.prop.epoch)
                    else:
                        wandb.log({item[0]:item[1]}, step=self.prop.epoch)
            elif self.prop.online_backend=='comet_ml':
                for i,item in enumerate(online_content):
                    if isinstance(item[1], list):
                        self.prop.experiment.log_metric(item[0], np.mean(item[1]), self.prop.epoch)
                    else:
                        self.prop.experiment.log_metric(item[0], item[1], self.prop.epoch)



================================================
FILE: utilities/misc.py
================================================
"""============================================================================================================="""
######## LIBRARIES #####################
import numpy as np

"""============================================================================================================="""
################# ACQUIRE NUMBER OF WEIGHTS #################
def gimme_params(model):
    model_parameters = filter(lambda p: p.requires_grad, model.parameters())
    params = sum([np.prod(p.size()) for p in model_parameters])
    return params


################# SAVE TRAINING PARAMETERS IN NICE STRING #################
def gimme_save_string(opt):
    varx = vars(opt)
    base_str = ''
    for key in varx:
        base_str += str(key)
        if isinstance(varx[key], dict):
            for sub_key, sub_item in varx[key].items():
                base_str += '\n\t'+str(sub_key)+': '+str(sub_item)
        else:
            base_str += '\n\t'+str(varx[key])
        base_str += '\n\n'
    return base_str


#############################################################################
import torch, torch.nn as nn

class DataParallel(nn.Module):
    def __init__(self, model, device_ids, dim):
        super().__init__()
        self.model   = model.model
        self.network = nn.DataParallel(model, device_ids, dim)

    def forward(self, x):
        return self.network(x)
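`gimme_save_string` in `utilities/misc.py` flattens an options namespace into the text that `set_logging` writes to `Parameter_Info.txt`: one `key`, a tab-indented value line, then a blank line, with dict-valued options expanded per sub-key. A quick standalone check of that format — re-implemented minimally here with hypothetical option values, not an import of the repo module:

```python
import argparse

def save_string(opt):
    # Minimal re-implementation of utilities/misc.gimme_save_string:
    # one "key\n\tvalue\n\n" block per option; dicts expand per sub-key.
    out = ''
    for key, val in vars(opt).items():
        out += str(key)
        if isinstance(val, dict):
            for sub_key, sub_val in val.items():
                out += '\n\t' + str(sub_key) + ': ' + str(sub_val)
        else:
            out += '\n\t' + str(val)
        out += '\n\n'
    return out

# Hypothetical options namespace, standing in for the parsed parameters.py flags.
opt = argparse.Namespace(lr=0.03, dataset='cub200')
```

Since `argparse.Namespace` preserves insertion order, the resulting file lists options in the order they were defined.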