[
  {
    "path": "LICENSE.md",
    "content": "MIT License\n\nCopyright (c) 2023 ZHAO, BO\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the \"Software\"), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n"
  },
  {
    "path": "README.md",
    "content": "# DeepCore: A Comprehensive Library for Coreset Selection in Deep Learning [PDF](https://arxiv.org/pdf/2204.08499.pdf)\n\n\n### Introduction\nTo advance the research of coreset selection in deep learning, we contribute a code library named **DeepCore**, an extensive and extendable code library, for coreset selection in deep learning, reproducing dozens of popular and advanced coreset selection methods and enabling a fair comparison of different methods in the same experimental settings. **DeepCore** is highly modular, allowing to add new architectures, datasets, methods and learning scenarios easily. It is built on PyTorch.   \n\n### Coreset Methods\nWe list the methods in DeepCore according to the categories in our original paper, they are 1) geometry based methods Contextual Diversity (CD), Herding  and k-Center Greedy; 2) uncertainty scores; 3) error based methods Forgetting  and GraNd score ; 4) decision boundary based methods Cal  and DeepFool ; 5) gradient matching based methods Craig  and GradMatch ; 6) bilevel optimiza- tion methods Glister ; and 7) Submodularity based Methods (GC) and Facility Location (FL) functions. 
We also include Random selection as the baseline.\n\n### Datasets\nDeepCore supports a series of popular computer vision datasets, namely MNIST, QMNIST, FashionMNIST, SVHN, CIFAR10, CIFAR100, TinyImageNet and ImageNet.\n\n### Models\nAvailable models are a two-layer fully connected MLP, LeNet, AlexNet, VGG, Inception-v3, ResNet, WideResNet and MobileNet-v3.\n\n### Example\nSelect with Glister and train on the coreset with fraction 0.1.\n```sh\nCUDA_VISIBLE_DEVICES=0 python -u main.py --fraction 0.1 --dataset CIFAR10 --data_path ~/datasets --num_exp 5 --workers 10 --optimizer SGD -se 10 --selection Glister --model InceptionV3 --lr 0.1 -sp ./result --batch 128\n```\n\nResume interrupted training with the argument ```--resume```.\n```sh\nCUDA_VISIBLE_DEVICES=0 python -u main.py --fraction 0.1 --dataset CIFAR10 --data_path ~/datasets --num_exp 5 --workers 10 --optimizer SGD -se 10 --selection Glister --model InceptionV3 --lr 0.1 -sp ./result --batch 128 --resume \"CIFAR10_InceptionV3_Glister_exp0_epoch200_2022-02-05 21:31:53.762903_0.1_unknown.ckpt\"\n```\n\nBatch sizes can be assigned separately for selection and training.\n```sh\nCUDA_VISIBLE_DEVICES=0 python -u main.py --fraction 0.5 --dataset ImageNet --data_path ~/datasets --num_exp 5 --workers 10 --optimizer SGD -se 10 --selection Cal --model MobileNetV3Large --lr 0.1 -sp ./result -tb 256 -sb 128\n```\n\nUse the argument ```--uncertainty``` to choose an uncertainty score.\n```sh\nCUDA_VISIBLE_DEVICES=0 python -u main.py --fraction 0.1 --dataset CIFAR10 --data_path ~/datasets --num_exp 5 --workers 10 --optimizer SGD -se 10 --selection Uncertainty --model ResNet18 --lr 0.1 -sp ./result --batch 128 --uncertainty Entropy\n```\n\nUse the argument ```--submodular``` to choose a submodular function, e.g. ```GraphCut```, ```FacilityLocation``` or ```LogDeterminant```. 
You may also specify the type of greedy algorithm to use when maximizing the function with the argument ```--submodular_greedy```, for example ```NaiveGreedy```, ```LazyGreedy```, ```StochasticGreedy```, etc.\n```sh\nCUDA_VISIBLE_DEVICES=0 python -u main.py --fraction 0.1 --dataset CIFAR10 --data_path ~/datasets --num_exp 5 --workers 10 --optimizer SGD -se 10 --selection Submodular --model ResNet18 --lr 0.1 -sp ./result --batch 128 --submodular GraphCut --submodular_greedy NaiveGreedy\n```\n\n### Extend\n\nDeepCore is highly modular and scalable. It allows new architectures, datasets and selection methods to be added easily, so that coreset methods can be evaluated in a richer set of scenarios and new methods can be compared conveniently. Here is an example for datasets. To add a new dataset, implement a function whose input is the data path and whose outputs are the number of channels, image size, number of classes, class names, mean, std, and the training and test datasets inherited from ```torch.utils.data.Dataset```.\n\n```python\nfrom torchvision import datasets, transforms\n\n\ndef MNIST(data_path):\n    channel = 1\n    im_size = (28, 28)\n    num_classes = 10\n    mean = [0.1307]\n    std = [0.3081]\n    transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize(mean=mean, std=std)])\n    dst_train = datasets.MNIST(data_path, train=True, download=True, transform=transform)\n    dst_test = datasets.MNIST(data_path, train=False, download=True, transform=transform)\n    class_names = [str(c) for c in range(num_classes)]\n    return channel, im_size, num_classes, class_names, mean, std, dst_train, dst_test\n```\n\nThis is an example of implementing a network architecture.\n```python\nimport torch.nn as nn\nimport torch.nn.functional as F\nfrom torch import set_grad_enabled\nfrom .nets_utils import EmbeddingRecorder\n\n\nclass MLP(nn.Module):\n    def __init__(self, channel, num_classes, im_size, record_embedding: bool = False, no_grad: bool = 
False,\n                 pretrained: bool = False):\n        if pretrained:\n            raise NotImplementedError(\"torchvision pretrained models not available.\")\n        super(MLP, self).__init__()\n        self.fc_1 = nn.Linear(im_size[0] * im_size[1] * channel, 128)\n        self.fc_2 = nn.Linear(128, 128)\n        self.fc_3 = nn.Linear(128, num_classes)\n\n        self.embedding_recorder = EmbeddingRecorder(record_embedding)\n        self.no_grad = no_grad\n\n    def get_last_layer(self):\n        return self.fc_3\n\n    def forward(self, x):\n        with set_grad_enabled(not self.no_grad):\n            out = x.view(x.size(0), -1)\n            out = F.relu(self.fc_1(out))\n            out = F.relu(self.fc_2(out))\n            out = self.embedding_recorder(out)\n            out = self.fc_3(out)\n        return out\n```\n\nTo implement a new coreset method, inherit from the ```CoresetMethod``` class and return the selected indices via the ```select``` method.\n\n```python\nclass CoresetMethod(object):\n    def __init__(self, dst_train, args, fraction=0.5, random_seed=None, **kwargs):\n        if fraction <= 0.0 or fraction > 1.0:\n            raise ValueError(\"Illegal Coreset Size.\")\n        self.dst_train = dst_train\n        self.num_classes = len(dst_train.classes)\n        self.fraction = fraction\n        self.random_seed = random_seed\n        self.index = []\n        self.args = args\n\n        self.n_train = len(dst_train)\n        self.coreset_size = round(self.n_train * fraction)\n\n    def select(self, **kwargs):\n        return\n```\n\n### References\n\n1. Agarwal, S., Arora, H., Anand, S., Arora, C.: Contextual diversity for active learning. In: ECCV. pp. 137–153. Springer (2020)\n2. Coleman, C., Yeh, C., Mussmann, S., Mirzasoleiman, B., Bailis, P., Liang, P., Leskovec, J., Zaharia, M.: Selection via proxy: Efficient data selection for deep learning. In: ICLR (2019)\n3. 
Ducoffe, M., Precioso, F.: Adversarial active learning for deep networks: a margin based approach. arXiv preprint arXiv:1802.09841 (2018)\n4. Iyer, R., Khargoankar, N., Bilmes, J., Asanani, H.: Submodular combinatorial information measures with applications in machine learning. In: Algorithmic Learning Theory. pp. 722–754. PMLR (2021)\n5. Killamsetty, K., Durga, S., Ramakrishnan, G., De, A., Iyer, R.: Grad-match: Gradient matching based data subset selection for efficient deep model training. In: ICML. pp. 5464–5474 (2021)\n6. Killamsetty, K., Sivasubramanian, D., Ramakrishnan, G., Iyer, R.: Glister: Generalization based data subset selection for efficient and robust learning. In: Proceedings of the AAAI Conference on Artificial Intelligence (2021)\n7. Margatina, K., Vernikos, G., Barrault, L., Aletras, N.: Active learning by acquiring contrastive examples. arXiv preprint arXiv:2109.03764 (2021)\n8. Mirzasoleiman, B., Bilmes, J., Leskovec, J.: Coresets for data-efficient training of machine learning models. In: ICML. PMLR (2020)\n9. Paul, M., Ganguli, S., Dziugaite, G.K.: Deep learning on a data diet: Finding important examples early in training. arXiv preprint arXiv:2107.07075 (2021)\n10. Sener, O., Savarese, S.: Active learning for convolutional neural networks: A coreset approach. In: ICLR (2018)\n11. Toneva, M., Sordoni, A., des Combes, R.T., Trischler, A., Bengio, Y., Gordon, G.J.: An empirical study of example forgetting during deep neural network learning. In: ICLR (2018)\n12. Welling, M.: Herding dynamical weights to learn. In: Proceedings of the 26th Annual International Conference on Machine Learning. pp. 1121–1128 (2009)\n\n\n"
  },
  {
    "path": "deepcore/__init__.py",
    "content": "# __init__.py"
  },
  {
    "path": "deepcore/datasets/__init__.py",
    "content": "from .cifar10 import *\nfrom .cifar100 import *\nfrom .fashionmnist import *\nfrom .imagenet import *\nfrom .mnist import *\nfrom .qmnist import *\nfrom .svhn import *\nfrom .tinyimagenet import *\n"
  },
  {
    "path": "deepcore/datasets/cifar10.py",
    "content": "from torchvision import datasets, transforms\nfrom torch import tensor, long\n\n\ndef CIFAR10(data_path):\n    channel = 3\n    im_size = (32, 32)\n    num_classes = 10\n    mean = [0.4914, 0.4822, 0.4465]\n    std = [0.2470, 0.2435, 0.2616]\n\n    transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize(mean=mean, std=std)])\n    dst_train = datasets.CIFAR10(data_path, train=True, download=True, transform=transform)\n    dst_test = datasets.CIFAR10(data_path, train=False, download=True, transform=transform)\n    class_names = dst_train.classes\n    dst_train.targets = tensor(dst_train.targets, dtype=long)\n    dst_test.targets = tensor(dst_test.targets, dtype=long)\n    return channel, im_size, num_classes, class_names, mean, std, dst_train, dst_test\n"
  },
  {
    "path": "deepcore/datasets/cifar100.py",
    "content": "from torchvision import datasets, transforms\nfrom torch import tensor, long\n\n\ndef CIFAR100(data_path):\n    channel = 3\n    im_size = (32, 32)\n    num_classes = 100\n    mean = [0.5071, 0.4865, 0.4409]\n    std = [0.2673, 0.2564, 0.2762]\n    transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize(mean=mean, std=std)])\n    dst_train = datasets.CIFAR100(data_path, train=True, download=True, transform=transform)\n    dst_test = datasets.CIFAR100(data_path, train=False, download=True, transform=transform)\n    class_names = dst_train.classes\n    dst_train.targets = tensor(dst_train.targets, dtype=long)\n    dst_test.targets = tensor(dst_test.targets, dtype=long)\n    return channel, im_size, num_classes, class_names, mean, std, dst_train, dst_test\n"
  },
  {
    "path": "deepcore/datasets/fashionmnist.py",
    "content": "from torchvision import datasets, transforms\n\n\ndef FashionMNIST(data_path):\n    channel = 1\n    im_size = (28, 28)\n    num_classes = 10\n    mean = [0.2861]\n    std = [0.3530]\n    transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize(mean=mean, std=std)])\n    dst_train = datasets.FashionMNIST(data_path, train=True, download=True, transform=transform)\n    dst_test = datasets.FashionMNIST(data_path, train=False, download=True, transform=transform)\n    class_names = dst_train.classes\n    return channel, im_size, num_classes, class_names, mean, std, dst_train, dst_test\n"
  },
  {
    "path": "deepcore/datasets/imagenet.py",
    "content": "from torchvision import datasets, transforms\nfrom torch import tensor, long\n\n\ndef ImageNet(data_path):\n    channel = 3\n    im_size = (224, 224)\n    num_classes = 1000\n    mean = [0.485, 0.456, 0.406]\n    std = [0.229, 0.224, 0.225]\n    normalize = transforms.Normalize(mean, std)\n    dst_train = datasets.ImageNet(data_path, split=\"train\", transform=transforms.Compose([\n            transforms.Resize(256),\n            transforms.CenterCrop(224),\n            transforms.ToTensor(),\n            normalize,\n        ]))\n    dst_test = datasets.ImageNet(data_path, split=\"val\", transform=transforms.Compose([\n            transforms.Resize(256),\n            transforms.CenterCrop(224),\n            transforms.ToTensor(),\n            normalize,\n        ]))\n    class_names = dst_train.classes\n    dst_train.targets = tensor(dst_train.targets, dtype=long)\n    dst_test.targets = tensor(dst_test.targets, dtype=long)\n    return channel, im_size, num_classes, class_names, mean, std, dst_train, dst_test\n"
  },
  {
    "path": "deepcore/datasets/mnist.py",
    "content": "from torchvision import datasets, transforms\nimport numpy as np\n\n\ndef MNIST(data_path, permuted=False, permutation_seed=None):\n    channel = 1\n    im_size = (28, 28)\n    num_classes = 10\n    mean = [0.1307]\n    std = [0.3081]\n    transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize(mean=mean, std=std)])\n    if permuted:\n        np.random.seed(permutation_seed)\n        pixel_permutation = np.random.permutation(28 * 28)\n        transform = transforms.Compose(\n            [transform, transforms.Lambda(lambda x: x.view(-1, 1)[pixel_permutation].view(1, 28, 28))])\n\n    dst_train = datasets.MNIST(data_path, train=True, download=True, transform=transform)\n    dst_test = datasets.MNIST(data_path, train=False, download=True, transform=transform)\n    class_names = [str(c) for c in range(num_classes)]\n    return channel, im_size, num_classes, class_names, mean, std, dst_train, dst_test\n\n\ndef permutedMNIST(data_path, permutation_seed=None):\n    return MNIST(data_path, True, permutation_seed)\n"
  },
  {
    "path": "deepcore/datasets/qmnist.py",
    "content": "from torchvision import datasets, transforms\n\n\ndef QMNIST(data_path):\n    channel = 1\n    im_size = (28, 28)\n    num_classes = 10\n    mean = [0.1308]\n    std = [0.3088]\n    transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize(mean=mean, std=std)])\n    dst_train = datasets.QMNIST(data_path, train=True, download=True, transform=transform)\n    dst_test = datasets.QMNIST(data_path, train=False, download=True, transform=transform)\n    class_names = [str(c) for c in range(num_classes)]\n    dst_train.targets = dst_train.targets[:, 0]\n    dst_test.targets = dst_test.targets[:, 0]\n    dst_train.compat = False\n    dst_test.compat = False\n    return channel, im_size, num_classes, class_names, mean, std, dst_train, dst_test\n"
  },
  {
    "path": "deepcore/datasets/svhn.py",
    "content": "from torchvision import datasets, transforms\nfrom torch import tensor, long\n\n\ndef SVHN(data_path):\n    channel = 3\n    im_size = (32, 32)\n    num_classes = 10\n    mean = [0.4377, 0.4438, 0.4728]\n    std = [0.1980, 0.2010, 0.1970]\n    transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize(mean=mean, std=std)])\n    dst_train = datasets.SVHN(data_path, split='train', download=True, transform=transform)\n    dst_test = datasets.SVHN(data_path, split='test', download=True, transform=transform)\n    class_names = [str(c) for c in range(num_classes)]\n    dst_train.classes = list(class_names)\n    dst_test.classes = list(class_names)\n    dst_train.targets = tensor(dst_train.labels, dtype=long)\n    dst_test.targets = tensor(dst_test.labels, dtype=long)\n    return channel, im_size, num_classes, class_names, mean, std, dst_train, dst_test\n"
  },
  {
    "path": "deepcore/datasets/tinyimagenet.py",
    "content": "from torchvision import datasets, transforms\nimport os\nimport requests\nimport zipfile\n\n\ndef TinyImageNet(data_path, downsize=True):\n    if not os.path.exists(os.path.join(data_path, \"tiny-imagenet-200\")):\n        url = \"http://cs231n.stanford.edu/tiny-imagenet-200.zip\"  # 248MB\n        print(\"Downloading Tiny-ImageNet\")\n        r = requests.get(url, stream=True)\n        with open(os.path.join(data_path, \"tiny-imagenet-200.zip\"), \"wb\") as f:\n            for chunk in r.iter_content(chunk_size=1024):\n                if chunk:\n                    f.write(chunk)\n\n        print(\"Unziping Tiny-ImageNet\")\n        with zipfile.ZipFile(os.path.join(data_path, \"tiny-imagenet-200.zip\")) as zf:\n            zf.extractall(path=data_path)\n\n    channel = 3\n    im_size = (32, 32) if downsize else (64, 64)\n    num_classes = 200\n    mean = (0.4802, 0.4481, 0.3975)\n    std = (0.2770, 0.2691, 0.2821)\n\n    transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize(mean=mean, std=std)])\n    if downsize:\n        transform = transforms.Compose([transforms.Resize(32), transform])\n\n    dst_train = datasets.ImageFolder(root=os.path.join(data_path, 'tiny-imagenet-200/train'), transform=transform)\n    dst_test = datasets.ImageFolder(root=os.path.join(data_path, 'tiny-imagenet-200/test'), transform=transform)\n\n    class_names = dst_train.classes\n    return channel, im_size, num_classes, class_names, mean, std, dst_train, dst_test\n"
  },
  {
    "path": "deepcore/methods/__init__.py",
    "content": "from .cal import *\nfrom .contextualdiversity import *\nfrom .coresetmethod import *\nfrom .craig import *\nfrom .deepfool import *\nfrom .earlytrain import *\nfrom .forgetting import *\nfrom .full import *\nfrom .glister import *\nfrom .grand import *\nfrom .gradmatch import *\nfrom .herding import *\nfrom .kcentergreedy import *\nfrom .submodular import *\nfrom .uncertainty import *\nfrom .uniform import *\n\n"
  },
  {
    "path": "deepcore/methods/cal.py",
    "content": "from .earlytrain import EarlyTrain\nfrom .methods_utils.euclidean import euclidean_dist_pair_np\nfrom .methods_utils.cossim import cossim_pair_np\nimport numpy as np\nimport torch\nfrom .. import nets\nfrom copy import deepcopy\nfrom torchvision import transforms\n\n\nclass Cal(EarlyTrain):\n    def __init__(self, dst_train, args, fraction=0.5, random_seed=None, epochs=200, specific_model=None,\n                 balance=True, metric=\"euclidean\", neighbors: int = 10, pretrain_model: str = \"ResNet18\", **kwargs):\n        super().__init__(dst_train, args, fraction, random_seed, epochs, specific_model, **kwargs)\n\n        self.balance = balance\n\n        assert neighbors > 0 and neighbors < 100\n        self.neighbors = neighbors\n\n        if metric == \"euclidean\":\n            self.metric = euclidean_dist_pair_np\n        elif metric == \"cossim\":\n            self.metric = lambda a, b: -1. * cossim_pair_np(a, b)\n        elif callable(metric):\n            self.metric = metric\n        else:\n            self.metric = euclidean_dist_pair_np\n\n        self.pretrain_model = pretrain_model\n\n    def num_classes_mismatch(self):\n        raise ValueError(\"num_classes of pretrain dataset does not match that of the training dataset.\")\n\n    def while_update(self, outputs, loss, targets, epoch, batch_idx, batch_size):\n        if batch_idx % self.args.print_freq == 0:\n            print('| Epoch [%3d/%3d] Iter[%3d/%3d]\\t\\tLoss: %.4f' % (\n                epoch, self.epochs, batch_idx + 1, (self.n_pretrain_size // batch_size) + 1, loss.item()))\n\n    def find_knn(self):\n        \"\"\"\n        Find k-nearest-neighbor data points with the pretrained embedding model\n        :return: knn matrix\n        \"\"\"\n\n        # Initialize pretrained model\n        model = nets.__dict__[self.pretrain_model](channel=self.args.channel, num_classes=self.args.num_classes,\n                                                   im_size=(224, 224), 
record_embedding=True, no_grad=True,\n                                                   pretrained=True).to(self.args.device)\n        model.eval()\n\n        # Resize dst_train to 224x224\n        if self.args.im_size[0] != 224 or self.args.im_size[1] != 224:\n            dst_train = deepcopy(self.dst_train)\n            dst_train.transform = transforms.Compose([dst_train.transform, transforms.Resize(224)])\n        else:\n            dst_train = self.dst_train\n\n        # Calculate the distance matrix and return knn results\n        if self.balance:\n            knn = []\n            for c in range(self.args.num_classes):\n                class_index = np.arange(self.n_train)[self.dst_train.targets == c]\n\n                # Start recording embedding vectors\n                embeddings = []\n                batch_loader = torch.utils.data.DataLoader(torch.utils.data.Subset(dst_train, class_index),\n                                                           batch_size=self.args.selection_batch,\n                                                           num_workers=self.args.workers)\n                batch_num = len(batch_loader)\n                for i, (aa, _) in enumerate(batch_loader):\n                    if i % self.args.print_freq == 0:\n                        print(\"| Calculating embeddings for batch [%3d/%3d]\" % (i + 1, batch_num))\n                    model(aa.to(self.args.device))\n                    embeddings.append(model.embedding_recorder.embedding.flatten(1).cpu().numpy())\n\n                embeddings = np.concatenate(embeddings, axis=0)\n\n                knn.append(np.argsort(self.metric(embeddings), axis=1)[:, 1:(self.neighbors + 1)])\n            return knn\n        else:\n            # Start recording embedding vectors\n            embeddings = []\n            batch_loader = torch.utils.data.DataLoader(dst_train, batch_size=self.args.selection_batch,\n                                                       num_workers=self.args.workers)\n            batch_num = len(batch_loader)\n\n            for i, (aa, _) in enumerate(batch_loader):\n                if i % self.args.print_freq == 0:\n                    print(\"| Calculating embeddings for batch [%3d/%3d]\" % (i + 1, batch_num))\n                model(aa.to(self.args.device))\n                embeddings.append(model.embedding_recorder.embedding.flatten(1).cpu().numpy())\n            embeddings = np.concatenate(embeddings, axis=0)\n\n            return np.argsort(self.metric(embeddings), axis=1)[:, 1:(self.neighbors + 1)]\n\n    def calc_kl(self, knn, index=None):\n        self.model.eval()\n        self.model.no_grad = True\n        sample_num = self.n_train if index is None else len(index)\n        probs = np.zeros([sample_num, self.args.num_classes])\n\n        batch_loader = torch.utils.data.DataLoader(\n            self.dst_train if index is None else torch.utils.data.Subset(self.dst_train, index),\n            batch_size=self.args.selection_batch, num_workers=self.args.workers)\n        batch_num = len(batch_loader)\n\n        for i, (inputs, _) in enumerate(batch_loader):\n            probs[i * self.args.selection_batch:(i + 1) * self.args.selection_batch] = torch.nn.functional.softmax(\n                self.model(inputs.to(self.args.device)), dim=1).detach().cpu()\n\n        s = np.zeros(sample_num)\n        for i in range(0, sample_num, self.args.selection_batch):\n            if i % self.args.print_freq == 0:\n                print(\"| Calculating KL-divergence for batch [%3d/%3d]\" % (i // self.args.selection_batch + 1, batch_num))\n            aa = np.expand_dims(probs[i:(i + self.args.selection_batch)], 1).repeat(self.neighbors, 1)\n            bb = probs[knn[i:(i + self.args.selection_batch)], :]\n            s[i:(i + self.args.selection_batch)] = np.mean(\n                np.sum(0.5 * aa * np.log(aa / bb) + 0.5 * bb * np.log(bb / aa), axis=2), axis=1)\n        self.model.no_grad = False\n        
return s\n\n    def finish_run(self):\n        scores = []\n        if self.balance:\n            selection_result = np.array([], dtype=np.int32)\n            for c, knn in zip(range(self.args.num_classes), self.knn):\n                class_index = np.arange(self.n_train)[self.dst_train.targets == c]\n                scores.append(self.calc_kl(knn, class_index))\n                selection_result = np.append(selection_result, class_index[\n                    np.argsort(scores[-1])[:round(self.fraction * len(class_index))]])\n        else:\n            selection_result = np.argsort(self.calc_kl(self.knn))[:self.coreset_size]\n        return {\"indices\": selection_result, \"scores\": scores}\n\n    def select(self, **kwargs):\n        self.knn = self.find_knn()\n        selection_result = self.run()\n        return selection_result\n"
  },
  {
    "path": "deepcore/methods/contextualdiversity.py",
    "content": "from .kcentergreedy import kCenterGreedy\nimport torch\n\n\n# Acknowlegement to:\n# https://github.com/sharat29ag/CDAL\n\n\nclass ContextualDiversity(kCenterGreedy):\n    def __init__(self, dst_train, args, fraction=0.5, random_seed=None, epochs=200,\n                 specific_model=None, balance=True, already_selected=[], torchvision_pretrain: bool = False, **kwargs):\n        super(ContextualDiversity, self).__init__(dst_train, args, fraction, random_seed, epochs=epochs, specific_model=specific_model, balance=balance, already_selected=already_selected, torchvision_pretrain=torchvision_pretrain, **kwargs)\n        self.metric = self._metric\n\n    def _metric(self, a_output, b_output):\n        with torch.no_grad():\n            # Overload self.metric function for kCenterGreedy Algorithm\n            aa = a_output.view(a_output.shape[0], 1, a_output.shape[1]).repeat(1, b_output.shape[0], 1)\n            bb = b_output.view(1, b_output.shape[0], b_output.shape[1]).repeat(a_output.shape[0], 1, 1)\n            return torch.sum(0.5 * aa * torch.log(aa / bb) + 0.5 * bb * torch.log(bb / aa), dim=2)\n\n    def construct_matrix(self, index=None):\n        self.model.eval()\n        self.model.no_grad = True\n        sample_num = self.n_train if index is None else len(index)\n        matrix = torch.zeros([sample_num, self.args.num_classes], requires_grad=False).to(self.args.device)\n        batch_loader = torch.utils.data.DataLoader(self.dst_train if index is None else\n                            torch.utils.data.Subset(self.dst_train, index), batch_size=self.args.selection_batch\n                                                   ,num_workers=self.args.workers)\n        for i, (inputs, _) in enumerate(batch_loader):\n            matrix[i * self.args.selection_batch:min((i + 1) * self.args.selection_batch, sample_num)] = torch.nn.functional.softmax(self.model(inputs.to(self.args.device)), dim=1)\n        self.model.no_grad = False\n        return matrix\n"
  },
  {
    "path": "deepcore/methods/coresetmethod.py",
    "content": "class CoresetMethod(object):\n    def __init__(self, dst_train, args, fraction=0.5, random_seed=None, **kwargs):\n        if fraction <= 0.0 or fraction > 1.0:\n            raise ValueError(\"Illegal Coreset Size.\")\n        self.dst_train = dst_train\n        self.num_classes = len(dst_train.classes)\n        self.fraction = fraction\n        self.random_seed = random_seed\n        self.index = []\n        self.args = args\n\n        self.n_train = len(dst_train)\n        self.coreset_size = round(self.n_train * fraction)\n\n    def select(self, **kwargs):\n        return\n\n"
  },
  {
    "path": "deepcore/methods/craig.py",
    "content": "from .earlytrain import EarlyTrain\nimport torch\nfrom .methods_utils import FacilityLocation, submodular_optimizer\nimport numpy as np\nfrom .methods_utils.euclidean import euclidean_dist_pair_np\nfrom ..nets.nets_utils import MyDataParallel\n\n\nclass Craig(EarlyTrain):\n    def __init__(self, dst_train, args, fraction=0.5, random_seed=None, epochs=200, specific_model=None,\n                 balance=True, greedy=\"LazyGreedy\", **kwargs):\n        super().__init__(dst_train, args, fraction, random_seed, epochs, specific_model, **kwargs)\n\n        if greedy not in submodular_optimizer.optimizer_choices:\n            raise ModuleNotFoundError(\"Greedy optimizer not found.\")\n        self._greedy = greedy\n        self.balance = balance\n\n    def before_train(self):\n        pass\n\n    def after_loss(self, outputs, loss, targets, batch_inds, epoch):\n        pass\n\n    def before_epoch(self):\n        pass\n\n    def after_epoch(self):\n        pass\n\n    def before_run(self):\n        pass\n\n    def num_classes_mismatch(self):\n        raise ValueError(\"num_classes of pretrain dataset does not match that of the training dataset.\")\n\n    def while_update(self, outputs, loss, targets, epoch, batch_idx, batch_size):\n        if batch_idx % self.args.print_freq == 0:\n            print('| Epoch [%3d/%3d] Iter[%3d/%3d]\\t\\tLoss: %.4f' % (\n                epoch, self.epochs, batch_idx + 1, (self.n_pretrain_size // batch_size) + 1, loss.item()))\n\n    def calc_gradient(self, index=None):\n        self.model.eval()\n\n        batch_loader = torch.utils.data.DataLoader(\n            self.dst_train if index is None else torch.utils.data.Subset(self.dst_train, index),\n            batch_size=self.args.selection_batch, num_workers=self.args.workers)\n        sample_num = len(self.dst_val.targets) if index is None else len(index)\n        self.embedding_dim = self.model.get_last_layer().in_features\n\n        gradients = []\n\n        for i, (input, 
targets) in enumerate(batch_loader):\n            self.model_optimizer.zero_grad()\n            outputs = self.model(input.to(self.args.device))\n            loss = self.criterion(outputs.requires_grad_(True),\n                                  targets.to(self.args.device)).sum()\n            batch_num = targets.shape[0]\n            with torch.no_grad():\n                bias_parameters_grads = torch.autograd.grad(loss, outputs)[0]\n                weight_parameters_grads = self.model.embedding_recorder.embedding.view(batch_num, 1,\n                                                                                       self.embedding_dim).repeat(1,\n                                                                                                                  self.args.num_classes,\n                                                                                                                  1) * bias_parameters_grads.view(\n                    batch_num, self.args.num_classes, 1).repeat(1, 1, self.embedding_dim)\n                gradients.append(\n                    torch.cat([bias_parameters_grads, weight_parameters_grads.flatten(1)], dim=1).cpu().numpy())\n\n        gradients = np.concatenate(gradients, axis=0)\n\n        self.model.train()\n        return euclidean_dist_pair_np(gradients)\n\n    def calc_weights(self, matrix, result):\n        min_sample = np.argmax(matrix[result], axis=0)\n        weights = np.ones(np.sum(result) if result.dtype == bool else len(result))\n        for i in min_sample:\n            weights[i] = weights[i] + 1\n        return weights\n\n    def finish_run(self):\n        if isinstance(self.model, MyDataParallel):\n            self.model = self.model.module\n\n        self.model.no_grad = True\n        with self.model.embedding_recorder:\n            if self.balance:\n                # Do selection by class\n                selection_result = np.array([], dtype=np.int32)\n                weights = np.array([])\n              
  for c in range(self.args.num_classes):\n                    class_index = np.arange(self.n_train)[self.dst_train.targets == c]\n                    matrix = -1. * self.calc_gradient(class_index)\n                    matrix -= np.min(matrix) - 1e-3\n                    submod_function = FacilityLocation(index=class_index, similarity_matrix=matrix)\n                    submod_optimizer = submodular_optimizer.__dict__[self._greedy](args=self.args, index=class_index,\n                                                                                   budget=round(self.fraction * len(\n                                                                                       class_index)))\n                    class_result = submod_optimizer.select(gain_function=submod_function.calc_gain,\n                                                           update_state=submod_function.update_state)\n                    selection_result = np.append(selection_result, class_result)\n                    weights = np.append(weights, self.calc_weights(matrix, np.isin(class_index, class_result)))\n            else:\n                matrix = np.zeros([self.n_train, self.n_train])\n                all_index = np.arange(self.n_train)\n                for c in range(self.args.num_classes):  # Sparse Matrix\n                    class_index = np.arange(self.n_train)[self.dst_train.targets == c]\n                    matrix[np.ix_(class_index, class_index)] = -1. 
* self.calc_gradient(class_index)\n                    matrix[np.ix_(class_index, class_index)] -= np.min(matrix[np.ix_(class_index, class_index)]) - 1e-3\n                submod_function = FacilityLocation(index=all_index, similarity_matrix=matrix)\n                submod_optimizer = submodular_optimizer.__dict__[self._greedy](args=self.args, index=all_index,\n                                                                               budget=self.coreset_size)\n                selection_result = submod_optimizer.select(gain_function=submod_function.calc_gain_batch,\n                                                           update_state=submod_function.update_state,\n                                                           batch=self.args.selection_batch)\n                weights = self.calc_weights(matrix, selection_result)\n        self.model.no_grad = False\n        return {\"indices\": selection_result, \"weights\": weights}\n\n    def select(self, **kwargs):\n        selection_result = self.run()\n        return selection_result\n"
  },
  {
    "path": "deepcore/methods/deepfool.py",
    "content": "from .earlytrain import EarlyTrain\nimport torch\nimport numpy as np\n\n\nclass DeepFool(EarlyTrain):\n    def __init__(self, dst_train, args, fraction=0.5, random_seed=None, epochs=200,\n                 specific_model=None, balance: bool = False, max_iter: int = 50, **kwargs):\n        super().__init__(dst_train, args, fraction, random_seed, epochs, specific_model, **kwargs)\n\n        self.balance = balance\n        self.max_iter = max_iter\n\n    def num_classes_mismatch(self):\n        raise ValueError(\"num_classes of pretrain dataset does not match that of the training dataset.\")\n\n    def while_update(self, outputs, loss, targets, epoch, batch_idx, batch_size):\n        if batch_idx % self.args.print_freq == 0:\n            print('| Epoch [%3d/%3d] Iter[%3d/%3d]\\t\\tLoss: %.4f' % (\n                epoch, self.epochs, batch_idx + 1, (self.n_pretrain_size // batch_size) + 1, loss.item()))\n\n    def finish_run(self):\n        self.model.no_grad = False\n\n        # Create a data loader for self.dst_train with batch size self.args.selection_batch\n        batch_loader = torch.utils.data.DataLoader(self.dst_train, batch_size=self.args.selection_batch\n                                                   , num_workers=self.args.workers)\n\n        r = np.zeros(self.n_train, dtype=np.float32)\n        batch_num = len(batch_loader)\n        for i, (inputs, targets) in enumerate(batch_loader):\n            if i % self.args.print_freq == 0:\n                print('| Selecting Batch [%3d/%3d]' % (i + 1, batch_num))\n            r[(i * self.args.selection_batch):(i * self.args.selection_batch + targets.shape[0])] = self.deep_fool(\n                inputs)\n\n        if self.balance:\n            selection_result = np.array([], dtype=np.int64)\n            for c in range(self.args.num_classes):\n                class_index = np.arange(self.n_train)[self.dst_train.targets == c]\n                selection_result = np.append(selection_result, 
class_index[\n                    r[class_index].argsort()[:round(len(class_index) * self.fraction)]])\n        else:\n            selection_result = r.argsort()[:self.coreset_size]\n        return {\"indices\": selection_result, \"scores\": r}\n\n    def deep_fool(self, inputs):\n        # Here, start running DeepFool algorithm.\n        self.model.eval()\n\n        # Initialize a boolean mask indicating if selection has been stopped at corresponding positions.\n        sample_size = inputs.shape[0]\n        boolean_mask = np.ones(sample_size, dtype=bool)\n        all_idx = np.arange(sample_size)\n\n        # A matrix to store total perturbations.\n        r_tot = np.zeros([sample_size, inputs.shape[1] * inputs.shape[2] * inputs.shape[3]])\n\n        # Set requires_grad for inputs.\n        cur_inputs = inputs.requires_grad_(True).to(self.args.device)\n\n        original_shape = inputs.shape[1:]\n\n        # Set requires_grad to False for all network parameters to accelerate autograd.\n        for p in self.model.parameters():\n            p.requires_grad_(False)\n\n        self.model.no_grad = True\n        first_preds = self.model(cur_inputs).argmax(dim=1)\n        self.model.no_grad = False\n\n        for i in range(self.max_iter):\n            f_all = self.model(cur_inputs)\n\n            w_k = []\n            for c in range(self.args.num_classes):\n                w_k.append(torch.autograd.grad(f_all[:, c].sum(), cur_inputs,\n                                               retain_graph=False if c + 1 == self.args.num_classes else True)[\n                               0].flatten(1))\n            w_k = torch.stack(w_k, dim=0)\n            w_k = w_k - w_k[first_preds, boolean_mask[boolean_mask]].unsqueeze(0)\n            w_k_norm = w_k.norm(dim=2)\n\n            w_k_norm[first_preds, boolean_mask[\n                boolean_mask]] = 1.  # Set w_k_norm for preds positions to 1. 
to avoid division by zero.\n\n            l_all = (f_all - f_all[boolean_mask[boolean_mask], first_preds].unsqueeze(1)).detach().abs() / w_k_norm.T\n            l_all[boolean_mask[\n                      boolean_mask], first_preds] = np.inf  # Set l_k for preds positions to inf, as the argmin for each\n                                                            # row will be calculated soon.\n\n            l_hat = l_all.argmin(dim=1)\n            r_i = l_all[boolean_mask[boolean_mask], l_hat].unsqueeze(1) / w_k_norm[\n                l_hat, boolean_mask[boolean_mask]].T.unsqueeze(1) * w_k[l_hat, boolean_mask[boolean_mask]]\n\n            # Update r_tot values.\n            r_tot[boolean_mask] += r_i.cpu().numpy()\n\n            cur_inputs += r_i.reshape([r_i.shape[0]] + list(original_shape))\n\n            # Re-input the updated sample into the network and get new predictions.\n            self.model.no_grad = True\n            preds = self.model(cur_inputs).argmax(dim=1)\n            self.model.no_grad = False\n\n            # In DeepFool algorithm, the iteration stops when the updated sample produces a different prediction\n            # in the model.\n            index_unfinished = (preds == first_preds)\n            if torch.all(~index_unfinished):\n                break\n\n            cur_inputs = cur_inputs[index_unfinished]\n            first_preds = first_preds[index_unfinished]\n            boolean_mask[all_idx[boolean_mask][~index_unfinished.cpu().numpy()]] = False\n\n        return (r_tot * r_tot).sum(axis=1)\n\n    def select(self, **kwargs):\n        selection_result = self.run()\n        return selection_result\n"
  },
  {
    "path": "deepcore/methods/earlytrain.py",
    "content": "from .coresetmethod import CoresetMethod\nimport torch, time\nfrom torch import nn\nimport numpy as np\nfrom copy import deepcopy\nfrom .. import nets\nfrom torchvision import transforms\n\n\nclass EarlyTrain(CoresetMethod):\n    '''\n    Core code for training related to coreset selection methods when pre-training is required.\n    '''\n\n    def __init__(self, dst_train, args, fraction=0.5, random_seed=None, epochs=200, specific_model=None,\n                 torchvision_pretrain: bool = False, dst_pretrain_dict: dict = {}, fraction_pretrain=1., dst_test=None,\n                 **kwargs):\n        super().__init__(dst_train, args, fraction, random_seed)\n        self.epochs = epochs\n        self.n_train = len(dst_train)\n        self.coreset_size = round(self.n_train * fraction)\n        self.specific_model = specific_model\n\n        if fraction_pretrain <= 0. or fraction_pretrain > 1.:\n            raise ValueError(\"Illegal pretrain fraction value.\")\n        self.fraction_pretrain = fraction_pretrain\n\n        if dst_pretrain_dict.__len__() != 0:\n            dict_keys = dst_pretrain_dict.keys()\n            if 'im_size' not in dict_keys or 'channel' not in dict_keys or 'dst_train' not in dict_keys or \\\n                    'num_classes' not in dict_keys:\n                raise AttributeError(\n                    'Argument dst_pretrain_dict must contain imszie, channel, dst_train and num_classes.')\n            if dst_pretrain_dict['im_size'][0] != args.im_size[0] or dst_pretrain_dict['im_size'][0] != args.im_size[0]:\n                raise ValueError(\"im_size of pretrain dataset does not match that of the training dataset.\")\n            if dst_pretrain_dict['channel'] != args.channel:\n                raise ValueError(\"channel of pretrain dataset does not match that of the training dataset.\")\n            if dst_pretrain_dict['num_classes'] != args.num_classes:\n                self.num_classes_mismatch()\n\n        
self.dst_pretrain_dict = dst_pretrain_dict\n        self.torchvision_pretrain = torchvision_pretrain\n        self.if_dst_pretrain = (len(self.dst_pretrain_dict) != 0)\n\n        if torchvision_pretrain:\n            # Pretrained models in torchvision only accept 224*224 inputs, therefore we resize current\n            # datasets to 224*224.\n            if args.im_size[0] != 224 or args.im_size[1] != 224:\n                self.dst_train = deepcopy(dst_train)\n                self.dst_train.transform = transforms.Compose([self.dst_train.transform, transforms.Resize(224)])\n                if self.if_dst_pretrain:\n                    self.dst_pretrain_dict['dst_train'] = deepcopy(dst_pretrain_dict['dst_train'])\n                    self.dst_pretrain_dict['dst_train'].transform = transforms.Compose(\n                        [self.dst_pretrain_dict['dst_train'].transform, transforms.Resize(224)])\n        if self.if_dst_pretrain:\n            self.n_pretrain = len(self.dst_pretrain_dict['dst_train'])\n        self.n_pretrain_size = round(\n            self.fraction_pretrain * (self.n_pretrain if self.if_dst_pretrain else self.n_train))\n        self.dst_test = dst_test\n\n    def train(self, epoch, list_of_train_idx, **kwargs):\n        \"\"\" Train model for one epoch \"\"\"\n\n        self.before_train()\n        self.model.train()\n\n        print('\\n=> Training Epoch #%d' % epoch)\n        trainset_permutation_inds = np.random.permutation(list_of_train_idx)\n        batch_sampler = torch.utils.data.BatchSampler(trainset_permutation_inds, batch_size=self.args.selection_batch,\n                                                      drop_last=False)\n        trainset_permutation_inds = list(batch_sampler)\n\n        train_loader = torch.utils.data.DataLoader(self.dst_pretrain_dict['dst_train'] if self.if_dst_pretrain\n                                                   else self.dst_train, shuffle=False, batch_sampler=batch_sampler,\n                                  
                 num_workers=self.args.workers, pin_memory=True)\n\n        for i, (inputs, targets) in enumerate(train_loader):\n            inputs, targets = inputs.to(self.args.device), targets.to(self.args.device)\n\n            # Forward propagation, compute loss, get predictions\n            self.model_optimizer.zero_grad()\n            outputs = self.model(inputs)\n            loss = self.criterion(outputs, targets)\n\n            self.after_loss(outputs, loss, targets, trainset_permutation_inds[i], epoch)\n\n            # Update loss, backward propagate, update optimizer\n            loss = loss.mean()\n\n            self.while_update(outputs, loss, targets, epoch, i, self.args.selection_batch)\n\n            loss.backward()\n            self.model_optimizer.step()\n        return self.finish_train()\n\n    def run(self):\n        torch.manual_seed(self.random_seed)\n        np.random.seed(self.random_seed)\n        self.train_indx = np.arange(self.n_train)\n\n        # Setup model and loss\n        self.model = nets.__dict__[self.args.model if self.specific_model is None else self.specific_model](\n            self.args.channel, self.dst_pretrain_dict[\"num_classes\"] if self.if_dst_pretrain else self.num_classes,\n            pretrained=self.torchvision_pretrain,\n            im_size=(224, 224) if self.torchvision_pretrain else self.args.im_size).to(self.args.device)\n\n        if self.args.device == \"cpu\":\n            print(\"Using CPU.\")\n        elif self.args.gpu is not None:\n            torch.cuda.set_device(self.args.gpu[0])\n            self.model = nets.nets_utils.MyDataParallel(self.model, device_ids=self.args.gpu)\n        elif torch.cuda.device_count() > 1:\n            self.model = nets.nets_utils.MyDataParallel(self.model).cuda()\n\n        self.criterion = nn.CrossEntropyLoss().to(self.args.device)\n        self.criterion.__init__()\n\n        # Setup optimizer\n        if self.args.selection_optimizer == \"SGD\":\n            
self.model_optimizer = torch.optim.SGD(self.model.parameters(), lr=self.args.selection_lr,\n                                                   momentum=self.args.selection_momentum,\n                                                   weight_decay=self.args.selection_weight_decay,\n                                                   nesterov=self.args.selection_nesterov)\n        elif self.args.selection_optimizer == \"Adam\":\n            self.model_optimizer = torch.optim.Adam(self.model.parameters(), lr=self.args.selection_lr,\n                                                    weight_decay=self.args.selection_weight_decay)\n        else:\n            self.model_optimizer = torch.optim.__dict__[self.args.selection_optimizer](self.model.parameters(),\n                                                                       lr=self.args.selection_lr,\n                                                                       momentum=self.args.selection_momentum,\n                                                                       weight_decay=self.args.selection_weight_decay,\n                                                                       nesterov=self.args.selection_nesterov)\n\n        self.before_run()\n\n        for epoch in range(self.epochs):\n            list_of_train_idx = np.random.choice(np.arange(self.n_pretrain if self.if_dst_pretrain else self.n_train),\n                                                 self.n_pretrain_size, replace=False)\n            self.before_epoch()\n            self.train(epoch, list_of_train_idx)\n            if self.dst_test is not None and self.args.selection_test_interval > 0 and (\n                    epoch + 1) % self.args.selection_test_interval == 0:\n                self.test(epoch)\n            self.after_epoch()\n\n        return self.finish_run()\n\n    def test(self, epoch):\n        self.model.no_grad = True\n        self.model.eval()\n\n        test_loader = torch.utils.data.DataLoader(self.dst_test if 
self.args.selection_test_fraction == 1. else\n                                                  torch.utils.data.Subset(self.dst_test, np.random.choice(\n                                                      np.arange(len(self.dst_test)),\n                                                      round(len(self.dst_test) * self.args.selection_test_fraction),\n                                                      replace=False)),\n                                                  batch_size=self.args.selection_batch, shuffle=False,\n                                                  num_workers=self.args.workers, pin_memory=True)\n        correct = 0.\n        total = 0.\n\n        print('\\n=> Testing Epoch #%d' % epoch)\n\n        for batch_idx, (input, target) in enumerate(test_loader):\n            output = self.model(input.to(self.args.device))\n            loss = self.criterion(output, target.to(self.args.device)).sum()\n\n            predicted = torch.max(output.data, 1).indices.cpu()\n            correct += predicted.eq(target).sum().item()\n            total += target.size(0)\n\n            if batch_idx % self.args.print_freq == 0:\n                print('| Test Epoch [%3d/%3d] Iter[%3d/%3d]\\t\\tTest Loss: %.4f Test Acc: %.3f%%' % (\n                    epoch, self.epochs, batch_idx + 1, (round(len(self.dst_test) * self.args.selection_test_fraction) //\n                                                        self.args.selection_batch) + 1, loss.item(),\n                    100. 
* correct / total))\n\n        self.model.no_grad = False\n\n    def num_classes_mismatch(self):\n        pass\n\n    def before_train(self):\n        pass\n\n    def after_loss(self, outputs, loss, targets, batch_inds, epoch):\n        pass\n\n    def while_update(self, outputs, loss, targets, epoch, batch_idx, batch_size):\n        pass\n\n    def finish_train(self):\n        pass\n\n    def before_epoch(self):\n        pass\n\n    def after_epoch(self):\n        pass\n\n    def before_run(self):\n        pass\n\n    def finish_run(self):\n        pass\n\n    def select(self, **kwargs):\n        selection_result = self.run()\n        return selection_result\n"
  },
  {
    "path": "deepcore/methods/forgetting.py",
    "content": "from .earlytrain import EarlyTrain\nimport torch, time\nfrom torch import nn\nimport numpy as np\n\n\n# Acknowledgement to\n# https://github.com/mtoneva/example_forgetting\n\nclass Forgetting(EarlyTrain):\n    def __init__(self, dst_train, args, fraction=0.5, random_seed=None, epochs=200, specific_model=None, balance=True,\n                 dst_test=None, **kwargs):\n        super().__init__(dst_train, args, fraction, random_seed, epochs, specific_model=specific_model,\n                         dst_test=dst_test)\n\n        self.balance = balance\n\n    def get_hms(self, seconds):\n        # Format time for printing purposes\n\n        m, s = divmod(seconds, 60)\n        h, m = divmod(m, 60)\n\n        return h, m, s\n\n    def before_train(self):\n        self.train_loss = 0.\n        self.correct = 0.\n        self.total = 0.\n\n    def after_loss(self, outputs, loss, targets, batch_inds, epoch):\n        with torch.no_grad():\n            _, predicted = torch.max(outputs.data, 1)\n\n            cur_acc = (predicted == targets).clone().detach().requires_grad_(False).type(torch.float32)\n            self.forgetting_events[torch.tensor(batch_inds)[(self.last_acc[batch_inds]-cur_acc)>0.01]]+=1.\n            self.last_acc[batch_inds] = cur_acc\n\n    def while_update(self, outputs, loss, targets, epoch, batch_idx, batch_size):\n        self.train_loss += loss.item()\n        self.total += targets.size(0)\n        _, predicted = torch.max(outputs.data, 1)\n        self.correct += predicted.eq(targets.data).cpu().sum()\n\n        if batch_idx % self.args.print_freq == 0:\n            print('| Epoch [%3d/%3d] Iter[%3d/%3d]\\t\\tLoss: %.4f Acc@1: %.3f%%' % (\n            epoch, self.epochs, batch_idx + 1, (self.n_train // batch_size) + 1, loss.item(),\n            100. 
* self.correct.item() / self.total))\n\n    def before_epoch(self):\n        self.start_time = time.time()\n\n    def after_epoch(self):\n        epoch_time = time.time() - self.start_time\n        self.elapsed_time += epoch_time\n        print('| Elapsed time : %d:%02d:%02d' % (self.get_hms(self.elapsed_time)))\n\n    def before_run(self):\n        self.elapsed_time = 0\n\n        self.forgetting_events = torch.zeros(self.n_train, requires_grad=False).to(self.args.device)\n        self.last_acc = torch.zeros(self.n_train, requires_grad=False).to(self.args.device)\n\n    def finish_run(self):\n        pass\n\n    def select(self, **kwargs):\n        self.run()\n\n        if not self.balance:\n            top_examples = self.train_indx[np.argsort(self.forgetting_events.cpu().numpy())][::-1][:self.coreset_size]\n        else:\n            top_examples = np.array([], dtype=np.int64)\n            for c in range(self.num_classes):\n                c_indx = self.train_indx[self.dst_train.targets == c]\n                budget = round(self.fraction * len(c_indx))\n                top_examples = np.append(top_examples,\n                                    c_indx[np.argsort(self.forgetting_events[c_indx].cpu().numpy())[::-1][:budget]])\n\n        return {\"indices\": top_examples, \"scores\": self.forgetting_events}\n"
  },
  {
    "path": "deepcore/methods/full.py",
    "content": "import numpy as np\nfrom .coresetmethod import CoresetMethod\n\n\nclass Full(CoresetMethod):\n    def __init__(self, dst_train, args, fraction, random_seed, **kwargs):\n        self.n_train = len(dst_train)\n\n    def select(self, **kwargs):\n        return {\"indices\": np.arange(self.n_train)}\n"
  },
  {
    "path": "deepcore/methods/glister.py",
    "content": "from .earlytrain import EarlyTrain\nfrom .methods_utils import submodular_optimizer\nimport torch\nimport numpy as np\nfrom ..nets.nets_utils import MyDataParallel\n\n\nclass Glister(EarlyTrain):\n    def __init__(self, dst_train, args, fraction=0.5, random_seed=None, epochs=200, specific_model=None,\n                 balance: bool = True, greedy=\"LazyGreedy\", eta=None, dst_val=None, **kwargs):\n        super().__init__(dst_train, args, fraction, random_seed, epochs, specific_model, **kwargs)\n\n        self.balance = balance\n        self.eta = args.lr if eta is None else eta\n\n        self.dst_val = dst_train if dst_val is None else dst_val\n        self.n_val = len(self.dst_val)\n\n        if greedy not in submodular_optimizer.optimizer_choices:\n            raise ModuleNotFoundError(\"Greedy optimizer not found.\")\n        self._greedy = greedy\n\n    def calc_gradient(self, index=None, val=False, record_val_detail=False):\n        '''\n        Calculate gradients matrix on current network for training or validation dataset.\n        '''\n\n        self.model.eval()\n\n        if val:\n            batch_loader = torch.utils.data.DataLoader(\n                self.dst_val if index is None else torch.utils.data.Subset(self.dst_val, index),\n                batch_size=self.args.selection_batch, num_workers=self.args.workers)\n        else:\n            batch_loader = torch.utils.data.DataLoader(\n                self.dst_train if index is None else torch.utils.data.Subset(self.dst_train, index),\n                batch_size=self.args.selection_batch, num_workers=self.args.workers)\n\n        self.embedding_dim = self.model.get_last_layer().in_features\n        gradients = []\n        if val and record_val_detail:\n            self.init_out = []\n            self.init_emb = []\n            self.init_y = []\n\n        for i, (input, targets) in enumerate(batch_loader):\n            self.model_optimizer.zero_grad()\n            outputs = 
self.model(input.to(self.args.device))\n            loss = self.criterion(outputs.requires_grad_(True), targets.to(self.args.device)).sum()\n            batch_num = targets.shape[0]\n            with torch.no_grad():\n                bias_parameters_grads = torch.autograd.grad(loss, outputs)[0]\n                weight_parameters_grads = self.model.embedding_recorder.embedding.view(batch_num, 1,\n                                                self.embedding_dim).repeat(1, self.args.num_classes, 1) *\\\n                                                bias_parameters_grads.view(\n                                                batch_num, self.args.num_classes, 1).repeat(1, 1, self.embedding_dim)\n                gradients.append(torch.cat(\n                    [bias_parameters_grads, weight_parameters_grads.flatten(1)], dim=1).cpu())\n\n                if val and record_val_detail:\n                    self.init_out.append(outputs.cpu())\n                    self.init_emb.append(self.model.embedding_recorder.embedding.cpu())\n                    self.init_y.append(targets)\n\n        gradients = torch.cat(gradients, dim=0)\n        if val:\n            self.val_grads = torch.mean(gradients, dim=0)\n            if self.dst_val == self.dst_train:\n                # No validation set was provided while instantiating Glister, so self.dst_val == self.dst_train\n                self.train_grads = gradients\n        else:\n            self.train_grads = gradients\n        if val and record_val_detail:\n            with torch.no_grad():\n                self.init_out = torch.cat(self.init_out, dim=0)\n                self.init_emb = torch.cat(self.init_emb, dim=0)\n                self.init_y = torch.cat(self.init_y)\n\n        self.model.train()\n\n    def update_val_gradients(self, new_selection, selected_for_train):\n\n        sum_selected_train_gradients = torch.mean(self.train_grads[selected_for_train], dim=0)\n\n        new_outputs = self.init_out - self.eta * 
sum_selected_train_gradients[:self.args.num_classes].view(1,\n                      -1).repeat(self.init_out.shape[0], 1) - self.eta * torch.matmul(self.init_emb,\n                      sum_selected_train_gradients[self.args.num_classes:].view(self.args.num_classes, -1).T)\n\n        sample_num = new_outputs.shape[0]\n        gradients = torch.zeros([sample_num, self.args.num_classes * (self.embedding_dim + 1)], requires_grad=False)\n        i = 0\n        while i * self.args.selection_batch < sample_num:\n            batch_indx = np.arange(sample_num)[i * self.args.selection_batch:min((i + 1) * self.args.selection_batch,\n                                                                                 sample_num)]\n            new_out_puts_batch = new_outputs[batch_indx].clone().detach().requires_grad_(True)\n            loss = self.criterion(new_out_puts_batch, self.init_y[batch_indx])\n            batch_num = len(batch_indx)\n            bias_parameters_grads = torch.autograd.grad(loss.sum(), new_out_puts_batch, retain_graph=True)[0]\n\n            weight_parameters_grads = self.init_emb[batch_indx].view(batch_num, 1, self.embedding_dim).repeat(1,\n                                      self.args.num_classes, 1) * bias_parameters_grads.view(batch_num,\n                                      self.args.num_classes, 1).repeat(1, 1, self.embedding_dim)\n            gradients[batch_indx] = torch.cat([bias_parameters_grads, weight_parameters_grads.flatten(1)], dim=1).cpu()\n            i += 1\n\n        self.val_grads = torch.mean(gradients, dim=0)\n\n    def finish_run(self):\n        if isinstance(self.model, MyDataParallel):\n            self.model = self.model.module\n\n        self.model.embedding_recorder.record_embedding = True\n        self.model.no_grad = True\n\n        self.train_indx = np.arange(self.n_train)\n        self.val_indx = np.arange(self.n_val)\n        if self.balance:\n            selection_result = np.array([], dtype=np.int64)\n            
#weights = np.array([], dtype=np.float32)\n            for c in range(self.num_classes):\n                c_indx = self.train_indx[self.dst_train.targets == c]\n                c_val_inx = self.val_indx[self.dst_val.targets == c]\n                self.calc_gradient(index=c_val_inx, val=True, record_val_detail=True)\n                if self.dst_val != self.dst_train:\n                    self.calc_gradient(index=c_indx)\n                submod_optimizer = submodular_optimizer.__dict__[self._greedy](args=self.args, index=c_indx,\n                                                            budget=round(self.fraction * len(c_indx)))\n                c_selection_result = submod_optimizer.select(gain_function=lambda idx_gain, selected,\n                                                             **kwargs: torch.matmul(self.train_grads[idx_gain],\n                                                             self.val_grads.view(-1, 1)).detach().cpu().numpy().\n                                                             flatten(), update_state=self.update_val_gradients)\n                selection_result = np.append(selection_result, c_selection_result)\n\n        else:\n            self.calc_gradient(val=True, record_val_detail=True)\n            if self.dst_val != self.dst_train:\n                self.calc_gradient()\n\n            submod_optimizer = submodular_optimizer.__dict__[self._greedy](args=self.args,\n                                  index=np.arange(self.n_train), budget=self.coreset_size)\n            selection_result = submod_optimizer.select(gain_function=lambda idx_gain, selected,\n                                                       **kwargs: torch.matmul(self.train_grads[idx_gain],\n                                                       self.val_grads.view(-1, 1)).detach().cpu().numpy().flatten(),\n                                                       update_state=self.update_val_gradients)\n\n        self.model.embedding_recorder.record_embedding = 
False\n        self.model.no_grad = False\n        return {\"indices\": selection_result}\n\n    def num_classes_mismatch(self):\n        raise ValueError(\"num_classes of pretrain dataset does not match that of the training dataset.\")\n\n    def while_update(self, outputs, loss, targets, epoch, batch_idx, batch_size):\n        if batch_idx % self.args.print_freq == 0:\n            print('| Epoch [%3d/%3d] Iter[%3d/%3d]\\t\\tLoss: %.4f' % (\n                epoch, self.epochs, batch_idx + 1, (self.n_pretrain_size // batch_size) + 1, loss.item()))\n\n"
  },
  {
    "path": "deepcore/methods/gradmatch.py",
    "content": "import torch\nimport numpy as np\nfrom scipy.linalg import lstsq\nfrom scipy.optimize import nnls\nfrom .earlytrain import EarlyTrain\nfrom ..nets.nets_utils import MyDataParallel\n\n\n# https://github.com/krishnatejakk/GradMatch\n\nclass GradMatch(EarlyTrain):\n    def __init__(self, dst_train, args, fraction=0.5, random_seed=None, epochs=200, specific_model=None,\n                 balance=True, dst_val=None, lam: float = 1., **kwargs):\n        super().__init__(dst_train, args, fraction, random_seed, epochs, specific_model, **kwargs)\n        self.balance = balance\n        self.dst_val = dst_val\n\n    def num_classes_mismatch(self):\n        raise ValueError(\"num_classes of pretrain dataset does not match that of the training dataset.\")\n\n    def while_update(self, outputs, loss, targets, epoch, batch_idx, batch_size):\n        if batch_idx % self.args.print_freq == 0:\n            print('| Epoch [%3d/%3d] Iter[%3d/%3d]\\t\\tLoss: %.4f' % (\n                epoch, self.epochs, batch_idx + 1, (self.n_pretrain_size // batch_size) + 1, loss.item()))\n\n    def orthogonal_matching_pursuit(self, A, b, budget: int, lam: float = 1.):\n        '''approximately solves min_x |x|_0 s.t. Ax=b using Orthogonal Matching Pursuit\n        Acknowlegement to:\n        https://github.com/krishnatejakk/GradMatch/blob/main/GradMatch/selectionstrategies/helpers/omp_solvers.py\n        Args:\n          A: design matrix of size (d, n)\n          b: measurement vector of length d\n          budget: selection budget\n          lam: regularization coef. 
for the final output vector\n        Returns:\n           vector of length n\n        '''\n        with torch.no_grad():\n            d, n = A.shape\n            if budget <= 0:\n                budget = 0\n            elif budget > n:\n                budget = n\n\n            x = np.zeros(n, dtype=np.float32)\n            resid = b.clone()\n            indices = []\n            boolean_mask = torch.ones(n, dtype=bool, device=A.device)\n            all_idx = torch.arange(n, device=A.device)\n\n            for i in range(budget):\n                if i % self.args.print_freq == 0:\n                    print(\"| Selecting [%3d/%3d]\" % (i + 1, budget))\n                projections = torch.matmul(A.T, resid)\n                index = torch.argmax(projections[boolean_mask])\n                index = all_idx[boolean_mask][index]\n\n                indices.append(index.item())\n                boolean_mask[index] = False\n\n                if len(indices) == 1:\n                    A_i = A[:, index]\n                    x_i = projections[index] / torch.dot(A_i, A_i).view(-1)\n                    A_i = A[:, index].view(1, -1)\n                else:\n                    A_i = torch.cat((A_i, A[:, index].view(1, -1)), dim=0)\n                    temp = torch.matmul(A_i, torch.transpose(A_i, 0, 1)) + lam * torch.eye(A_i.shape[0], device=A.device)\n                    # torch.lstsq was removed from PyTorch; torch.linalg.lstsq takes (A, B) and returns a named tuple\n                    x_i = torch.linalg.lstsq(temp, torch.matmul(A_i, b).view(-1, 1)).solution\n                resid = b - torch.matmul(torch.transpose(A_i, 0, 1), x_i).view(-1)\n            if budget > 1:\n                x_i = nnls(temp.cpu().numpy(), torch.matmul(A_i, b).view(-1).cpu().numpy())[0]\n                x[indices] = x_i\n            elif budget == 1:\n                x[indices[0]] = 1.\n        return x\n\n    def orthogonal_matching_pursuit_np(self, A, b, budget: int, lam: float = 1.):\n        '''approximately solves min_x |x|_0 s.t. 
Ax=b using Orthogonal Matching Pursuit\n        Acknowledgement to:\n        https://github.com/krishnatejakk/GradMatch/blob/main/GradMatch/selectionstrategies/helpers/omp_solvers.py\n        Args:\n          A: design matrix of size (d, n)\n          b: measurement vector of length d\n          budget: selection budget\n          lam: regularization coef. for the final output vector\n        Returns:\n           vector of length n\n        '''\n        d, n = A.shape\n        if budget <= 0:\n            budget = 0\n        elif budget > n:\n            budget = n\n\n        x = np.zeros(n, dtype=np.float32)\n        resid = np.copy(b)\n        indices = []\n        boolean_mask = np.ones(n, dtype=bool)\n        all_idx = np.arange(n)\n\n        for i in range(budget):\n            if i % self.args.print_freq == 0:\n                print(\"| Selecting [%3d/%3d]\" % (i + 1, budget))\n            projections = A.T.dot(resid)\n            index = np.argmax(projections[boolean_mask])\n            index = all_idx[boolean_mask][index]\n\n            indices.append(index.item())\n            boolean_mask[index] = False\n\n            if len(indices) == 1:\n                A_i = A[:, index]\n                x_i = projections[index] / A_i.T.dot(A_i)\n            else:\n                A_i = np.vstack([A_i, A[:, index]])\n                x_i = lstsq(A_i.dot(A_i.T) + lam * np.identity(A_i.shape[0]), A_i.dot(b))[0]\n            resid = b - A_i.T.dot(x_i)\n        if budget > 1:\n            x_i = nnls(A_i.dot(A_i.T) + lam * np.identity(A_i.shape[0]), A_i.dot(b))[0]\n            x[indices] = x_i\n        elif budget == 1:\n            x[indices[0]] = 1.\n        return x\n\n    def calc_gradient(self, index=None, val=False):\n        self.model.eval()\n        if val:\n            batch_loader = torch.utils.data.DataLoader(\n                self.dst_val if index is None else torch.utils.data.Subset(self.dst_val, index),\n                
batch_size=self.args.selection_batch, num_workers=self.args.workers)\n            sample_num = len(self.dst_val.targets) if index is None else len(index)\n        else:\n            batch_loader = torch.utils.data.DataLoader(\n                self.dst_train if index is None else torch.utils.data.Subset(self.dst_train, index),\n                batch_size=self.args.selection_batch, num_workers=self.args.workers)\n            sample_num = self.n_train if index is None else len(index)\n\n        self.embedding_dim = self.model.get_last_layer().in_features\n        gradients = torch.zeros([sample_num, self.args.num_classes * (self.embedding_dim + 1)],\n                                requires_grad=False, device=self.args.device)\n\n        for i, (input, targets) in enumerate(batch_loader):\n            self.model_optimizer.zero_grad()\n            outputs = self.model(input.to(self.args.device)).requires_grad_(True)\n            loss = self.criterion(outputs, targets.to(self.args.device)).sum()\n            batch_num = targets.shape[0]\n            with torch.no_grad():\n                bias_parameters_grads = torch.autograd.grad(loss, outputs, retain_graph=True)[0].cpu()\n                weight_parameters_grads = self.model.embedding_recorder.embedding.cpu().view(batch_num, 1,\n                                                    self.embedding_dim).repeat(1,self.args.num_classes,1) *\\\n                                                    bias_parameters_grads.view(batch_num, self.args.num_classes,\n                                                    1).repeat(1, 1, self.embedding_dim)\n                gradients[i * self.args.selection_batch:min((i + 1) * self.args.selection_batch, sample_num)] =\\\n                    torch.cat([bias_parameters_grads, weight_parameters_grads.flatten(1)], dim=1)\n\n        return gradients\n\n    def finish_run(self):\n        if isinstance(self.model, MyDataParallel):\n            self.model = self.model.module\n\n        
self.model.no_grad = True\n        with self.model.embedding_recorder:\n            if self.dst_val is not None:\n                val_num = len(self.dst_val.targets)\n\n            if self.balance:\n                selection_result = np.array([], dtype=np.int64)\n                weights = np.array([], dtype=np.float32)\n                for c in range(self.args.num_classes):\n                    class_index = np.arange(self.n_train)[self.dst_train.targets == c]\n                    cur_gradients = self.calc_gradient(class_index)\n                    if self.dst_val is not None:\n                        # Also calculate gradients of the validation set.\n                        val_class_index = np.arange(val_num)[self.dst_val.targets == c]\n                        cur_val_gradients = torch.mean(self.calc_gradient(val_class_index, val=True), dim=0)\n                    else:\n                        cur_val_gradients = torch.mean(cur_gradients, dim=0)\n                    if self.args.device == \"cpu\":\n                        # Compute OMP on numpy\n                        cur_weights = self.orthogonal_matching_pursuit_np(cur_gradients.numpy().T,\n                                                                          cur_val_gradients.numpy(),\n                                                                        budget=round(len(class_index) * self.fraction))\n                    else:\n                        cur_weights = self.orthogonal_matching_pursuit(cur_gradients.to(self.args.device).T,\n                                                                       cur_val_gradients.to(self.args.device),\n                                                                       budget=round(len(class_index) * self.fraction))\n                    selection_result = np.append(selection_result, class_index[np.nonzero(cur_weights)[0]])\n                    weights = np.append(weights, cur_weights[np.nonzero(cur_weights)[0]])\n            else:\n                
cur_gradients = self.calc_gradient()\n                if self.dst_val is not None:\n                    # Also calculate gradients of the validation set.\n                    cur_val_gradients = torch.mean(self.calc_gradient(val=True), dim=0)\n                else:\n                    cur_val_gradients = torch.mean(cur_gradients, dim=0)\n                if self.args.device == \"cpu\":\n                    # Compute OMP on numpy\n                    cur_weights = self.orthogonal_matching_pursuit_np(cur_gradients.numpy().T,\n                                                                      cur_val_gradients.numpy(),\n                                                                      budget=self.coreset_size)\n                else:\n                    cur_weights = self.orthogonal_matching_pursuit(cur_gradients.T, cur_val_gradients,\n                                                                   budget=self.coreset_size)\n                selection_result = np.nonzero(cur_weights)[0]\n                weights = cur_weights[selection_result]\n        self.model.no_grad = False\n        return {\"indices\": selection_result, \"weights\": weights}\n\n    def select(self, **kwargs):\n        selection_result = self.run()\n        return selection_result\n\n"
  },
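The OMP solvers in `gradmatch.py` above greedily add the gradient column most correlated with the current residual, then refit the coefficients over all selected columns. A minimal, self-contained NumPy sketch of that loop (`omp` is an illustrative helper, not part of DeepCore; it is a generic OMP variant that picks by absolute correlation and omits the ridge regularization and final non-negative refit used above):

```python
import numpy as np

def omp(A, b, budget):
    """Greedy OMP: pick the column most correlated with the residual,
    then refit least squares over all selected columns."""
    d, n = A.shape
    resid = b.copy()
    indices = []
    for _ in range(budget):
        # correlation of every column with the current residual
        projections = A.T @ resid
        projections[indices] = 0.0          # never re-select a column
        indices.append(int(np.argmax(np.abs(projections))))
        # refit coefficients over the chosen columns
        A_sel = A[:, indices]
        x_sel, *_ = np.linalg.lstsq(A_sel, b, rcond=None)
        resid = b - A_sel @ x_sel
    x = np.zeros(n)
    x[indices] = x_sel
    return x
```

With orthonormal columns (e.g. the Q factor of a QR decomposition), the greedy pick reads off the true sparse coefficients exactly, which makes the loop easy to sanity-check.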
  {
    "path": "deepcore/methods/grand.py",
    "content": "from .earlytrain import EarlyTrain\nimport torch, time\nimport numpy as np\nfrom ..nets.nets_utils import MyDataParallel\n\n\nclass GraNd(EarlyTrain):\n    def __init__(self, dst_train, args, fraction=0.5, random_seed=None, epochs=200, repeat=10,\n                 specific_model=None, balance=False, **kwargs):\n        super().__init__(dst_train, args, fraction, random_seed, epochs, specific_model)\n        self.epochs = epochs\n        self.n_train = len(dst_train)\n        self.coreset_size = round(self.n_train * fraction)\n        self.specific_model = specific_model\n        self.repeat = repeat\n\n        self.balance = balance\n\n    def while_update(self, outputs, loss, targets, epoch, batch_idx, batch_size):\n        if batch_idx % self.args.print_freq == 0:\n            print('| Epoch [%3d/%3d] Iter[%3d/%3d]\\t\\tLoss: %.4f' % (\n                epoch, self.epochs, batch_idx + 1, (self.n_train // batch_size) + 1, loss.item()))\n\n    def before_run(self):\n        if isinstance(self.model, MyDataParallel):\n            self.model = self.model.module\n\n    def finish_run(self):\n        self.model.embedding_recorder.record_embedding = True  # recording embedding vector\n\n        self.model.eval()\n\n        embedding_dim = self.model.get_last_layer().in_features\n        batch_loader = torch.utils.data.DataLoader(\n            self.dst_train, batch_size=self.args.selection_batch, num_workers=self.args.workers)\n        sample_num = self.n_train\n\n        for i, (input, targets) in enumerate(batch_loader):\n            self.model_optimizer.zero_grad()\n            outputs = self.model(input.to(self.args.device))\n            loss = self.criterion(outputs.requires_grad_(True),\n                                  targets.to(self.args.device)).sum()\n            batch_num = targets.shape[0]\n            with torch.no_grad():\n                bias_parameters_grads = torch.autograd.grad(loss, outputs)[0]\n                self.norm_matrix[i * 
self.args.selection_batch:min((i + 1) * self.args.selection_batch, sample_num),\n                self.cur_repeat] = torch.norm(torch.cat([bias_parameters_grads, (\n                        self.model.embedding_recorder.embedding.view(batch_num, 1, embedding_dim).repeat(1,\n                                             self.args.num_classes, 1) * bias_parameters_grads.view(\n                                             batch_num, self.args.num_classes, 1).repeat(1, 1, embedding_dim)).\n                                             view(batch_num, -1)], dim=1), dim=1, p=2)\n\n        self.model.train()\n\n        self.model.embedding_recorder.record_embedding = False\n\n    def select(self, **kwargs):\n        # Initialize a matrix to save norms of each sample on independent runs\n        self.norm_matrix = torch.zeros([self.n_train, self.repeat], requires_grad=False).to(self.args.device)\n\n        for self.cur_repeat in range(self.repeat):\n            self.run()\n            self.random_seed = self.random_seed + 5\n\n        self.norm_mean = torch.mean(self.norm_matrix, dim=1).cpu().detach().numpy()\n        if not self.balance:\n            top_examples = self.train_indx[np.argsort(self.norm_mean)][::-1][:self.coreset_size]\n        else:\n            top_examples = np.array([], dtype=np.int64)\n            for c in range(self.num_classes):\n                c_indx = self.train_indx[self.dst_train.targets == c]\n                budget = round(self.fraction * len(c_indx))\n                top_examples = np.append(top_examples, c_indx[np.argsort(self.norm_mean[c_indx])[::-1][:budget]])\n\n        return {\"indices\": top_examples, \"scores\": self.norm_mean}\n"
  },
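The GraNd score computed in `finish_run` above is the norm of the last-layer gradient: under cross-entropy, the gradient w.r.t. the logits is `softmax - onehot`, and the weight gradient is its outer product with the penultimate embedding, so the concatenated norm factors as `||p|| * sqrt(1 + ||h||^2)`. A small NumPy sketch of that computation (`grand_scores` is an illustrative helper under these assumptions, not DeepCore API):

```python
import numpy as np

def grand_scores(embeddings, logits, labels):
    """Per-sample gradient norm of the last linear layer under cross-entropy.
    bias grad = softmax(logits) - onehot(label); weight grad = outer(bias_grad, h),
    so the full norm factors as ||bias_grad|| * sqrt(1 + ||h||^2)."""
    z = logits - logits.max(axis=1, keepdims=True)          # stable softmax
    p = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    p[np.arange(len(labels)), labels] -= 1.0                # softmax - onehot
    bias_sq = (p ** 2).sum(axis=1)                          # ||dL/dlogits||^2
    emb_sq = (embeddings ** 2).sum(axis=1)                  # ||h||^2
    return np.sqrt(bias_sq * (1.0 + emb_sq))
```

This is why the code above never materializes per-sample weight gradients for the whole network: the last-layer norm is available in closed form from the logits and embeddings alone.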
  {
    "path": "deepcore/methods/herding.py",
    "content": "from .earlytrain import EarlyTrain\nimport torch\nimport numpy as np\nfrom .methods_utils import euclidean_dist\nfrom ..nets.nets_utils import MyDataParallel\n\n\nclass Herding(EarlyTrain):\n    def __init__(self, dst_train, args, fraction=0.5, random_seed=None, epochs=200,\n                 specific_model=\"ResNet18\", balance: bool = False, metric=\"euclidean\", **kwargs):\n        super().__init__(dst_train, args, fraction, random_seed, epochs=epochs, specific_model=specific_model, **kwargs)\n\n        if metric == \"euclidean\":\n            self.metric = euclidean_dist\n        elif callable(metric):\n            self.metric = metric\n        else:\n            self.metric = euclidean_dist\n            self.run = lambda: self.finish_run()\n\n            def _construct_matrix(index=None):\n                data_loader = torch.utils.data.DataLoader(\n                    self.dst_train if index is None else torch.utils.data.Subset(self.dst_train, index),\n                    batch_size=self.n_train if index is None else len(index), num_workers=self.args.workers)\n                inputs, _ = next(iter(data_loader))\n                return inputs.flatten(1).requires_grad_(False).to(self.args.device)\n\n            self.construct_matrix = _construct_matrix\n\n        self.balance = balance\n\n    def num_classes_mismatch(self):\n        raise ValueError(\"num_classes of pretrain dataset does not match that of the training dataset.\")\n\n    def while_update(self, outputs, loss, targets, epoch, batch_idx, batch_size):\n        if batch_idx % self.args.print_freq == 0:\n            print('| Epoch [%3d/%3d] Iter[%3d/%3d]\\t\\tLoss: %.4f' % (\n                epoch, self.epochs, batch_idx + 1, (self.n_pretrain_size // batch_size) + 1, loss.item()))\n\n    def construct_matrix(self, index=None):\n        self.model.eval()\n        self.model.no_grad = True\n        with torch.no_grad():\n            with self.model.embedding_recorder:\n                
sample_num = self.n_train if index is None else len(index)\n                matrix = torch.zeros([sample_num, self.emb_dim], requires_grad=False).to(self.args.device)\n\n                data_loader = torch.utils.data.DataLoader(self.dst_train if index is None else\n                                            torch.utils.data.Subset(self.dst_train, index),\n                                            batch_size=self.args.selection_batch,\n                                            num_workers=self.args.workers)\n\n                for i, (inputs, _) in enumerate(data_loader):\n                    self.model(inputs.to(self.args.device))\n                    matrix[i * self.args.selection_batch:min((i + 1) * self.args.selection_batch, sample_num)] = self.model.embedding_recorder.embedding\n\n        self.model.no_grad = False\n        return matrix\n\n    def before_run(self):\n        self.emb_dim = self.model.get_last_layer().in_features\n\n    def herding(self, matrix, budget: int, index=None):\n\n        sample_num = matrix.shape[0]\n\n        if budget < 0:\n            raise ValueError(\"Illegal budget size.\")\n        elif budget > sample_num:\n            budget = sample_num\n\n        indices = np.arange(sample_num)\n        with torch.no_grad():\n            mu = torch.mean(matrix, dim=0)\n            select_result = np.zeros(sample_num, dtype=bool)\n\n            for i in range(budget):\n                if i % self.args.print_freq == 0:\n                    print(\"| Selecting [%3d/%3d]\" % (i + 1, budget))\n                dist = self.metric(((i + 1) * mu - torch.sum(matrix[select_result], dim=0)).view(1, -1),\n                                   matrix[~select_result])\n                p = torch.argmax(dist).item()\n                p = indices[~select_result][p]\n                select_result[p] = True\n        if index is None:\n            index = indices\n        return index[select_result]\n\n    def finish_run(self):\n        if 
isinstance(self.model, MyDataParallel):\n            self.model = self.model.module\n\n        if self.balance:\n            selection_result = np.array([], dtype=np.int32)\n            for c in range(self.args.num_classes):\n                class_index = np.arange(self.n_train)[self.dst_train.targets == c]\n\n                selection_result = np.append(selection_result, self.herding(self.construct_matrix(class_index),\n                        budget=round(self.fraction * len(class_index)), index=class_index))\n        else:\n            selection_result = self.herding(self.construct_matrix(), budget=self.coreset_size)\n        return {\"indices\": selection_result}\n\n    def select(self, **kwargs):\n        selection_result = self.run()\n        return selection_result\n\n"
  },
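Classical herding selects, at each step, the candidate that keeps the running sum of selected features closest to the scaled mean, i.e. an argmin over distances to the target `(i + 1) * mu - sum(selected)`. A minimal NumPy sketch of that criterion (`herding_select` is a hypothetical standalone helper, not the class above, which works on model embeddings and a pluggable metric):

```python
import numpy as np

def herding_select(X, budget):
    """Classical herding: greedily keep the running sum of selected
    feature vectors as close as possible to (i+1) times the mean."""
    mu = X.mean(axis=0)
    selected = np.zeros(len(X), dtype=bool)
    running_sum = np.zeros(X.shape[1])
    order = []
    for i in range(budget):
        target = (i + 1) * mu - running_sum
        cand = np.flatnonzero(~selected)
        # distance from each remaining candidate to the target
        d = np.linalg.norm(X[cand] - target, axis=1)
        p = cand[np.argmin(d)]
        selected[p] = True
        running_sum += X[p]
        order.append(int(p))
    return np.array(order)
```

When run per class (as with `balance=True` above), this tends to produce a coreset whose feature mean tracks each class mean.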
  {
    "path": "deepcore/methods/kcentergreedy.py",
    "content": "from .earlytrain import EarlyTrain\nimport torch\nimport numpy as np\nfrom .methods_utils import euclidean_dist\nfrom ..nets.nets_utils import MyDataParallel\n\n\ndef k_center_greedy(matrix, budget: int, metric, device, random_seed=None, index=None, already_selected=None,\n                    print_freq: int = 20):\n    if type(matrix) == torch.Tensor:\n        assert matrix.dim() == 2\n    elif type(matrix) == np.ndarray:\n        assert matrix.ndim == 2\n        matrix = torch.from_numpy(matrix).requires_grad_(False).to(device)\n\n    sample_num = matrix.shape[0]\n    assert sample_num >= 1\n\n    if budget < 0:\n        raise ValueError(\"Illegal budget size.\")\n    elif budget > sample_num:\n        budget = sample_num\n\n    if index is not None:\n        assert matrix.shape[0] == len(index)\n    else:\n        index = np.arange(sample_num)\n\n    assert callable(metric)\n\n    already_selected = np.array(already_selected)\n\n    with torch.no_grad():\n        np.random.seed(random_seed)\n        if already_selected.__len__() == 0:\n            select_result = np.zeros(sample_num, dtype=bool)\n            # Randomly select one initial point.\n            already_selected = [np.random.randint(0, sample_num)]\n            budget -= 1\n            select_result[already_selected] = True\n        else:\n            select_result = np.in1d(index, already_selected)\n\n        num_of_already_selected = np.sum(select_result)\n\n        # Initialize a (num_of_already_selected+budget-1)*sample_num matrix storing distances of pool points from\n        # each clustering center.\n        dis_matrix = -1 * torch.ones([num_of_already_selected + budget - 1, sample_num], requires_grad=False).to(device)\n\n        dis_matrix[:num_of_already_selected, ~select_result] = metric(matrix[select_result], matrix[~select_result])\n\n        mins = torch.min(dis_matrix[:num_of_already_selected, :], dim=0).values\n\n        for i in range(budget):\n            if i % 
print_freq == 0:\n                print(\"| Selecting [%3d/%3d]\" % (i + 1, budget))\n            p = torch.argmax(mins).item()\n            select_result[p] = True\n\n            if i == budget - 1:\n                break\n            mins[p] = -1\n            dis_matrix[num_of_already_selected + i, ~select_result] = metric(matrix[[p]], matrix[~select_result])\n            mins = torch.min(mins, dis_matrix[num_of_already_selected + i])\n    return index[select_result]\n\n\nclass kCenterGreedy(EarlyTrain):\n    def __init__(self, dst_train, args, fraction=0.5, random_seed=None, epochs=0,\n                 specific_model=\"ResNet18\", balance: bool = False, already_selected=[], metric=\"euclidean\",\n                 torchvision_pretrain: bool = True, **kwargs):\n        super().__init__(dst_train, args, fraction, random_seed, epochs=epochs, specific_model=specific_model,\n                         torchvision_pretrain=torchvision_pretrain, **kwargs)\n\n        if already_selected.__len__() != 0:\n            if min(already_selected) < 0 or max(already_selected) >= self.n_train:\n                raise ValueError(\"List of already selected points out of the boundary.\")\n        self.already_selected = np.array(already_selected)\n\n        self.min_distances = None\n\n        if metric == \"euclidean\":\n            self.metric = euclidean_dist\n        elif callable(metric):\n            self.metric = metric\n        else:\n            self.metric = euclidean_dist\n            self.run = lambda : self.finish_run()\n            def _construct_matrix(index=None):\n                data_loader = torch.utils.data.DataLoader(\n                    self.dst_train if index is None else torch.utils.data.Subset(self.dst_train, index),\n                    batch_size=self.n_train if index is None else len(index),\n                    num_workers=self.args.workers)\n                inputs, _ = next(iter(data_loader))\n                return 
inputs.flatten(1).requires_grad_(False).to(self.args.device)\n            self.construct_matrix = _construct_matrix\n\n        self.balance = balance\n\n    def num_classes_mismatch(self):\n        raise ValueError(\"num_classes of pretrain dataset does not match that of the training dataset.\")\n\n    def while_update(self, outputs, loss, targets, epoch, batch_idx, batch_size):\n        if batch_idx % self.args.print_freq == 0:\n            print('| Epoch [%3d/%3d] Iter[%3d/%3d]\\t\\tLoss: %.4f' % (\n            epoch, self.epochs, batch_idx + 1, (self.n_pretrain_size // batch_size) + 1, loss.item()))\n\n    def old_construct_matrix(self, index=None):\n        self.model.eval()\n        self.model.no_grad = True\n        with torch.no_grad():\n            with self.model.embedding_recorder:\n                sample_num = self.n_train if index is None else len(index)\n                matrix = torch.zeros([sample_num, self.emb_dim], requires_grad=False).to(self.args.device)\n\n                data_loader = torch.utils.data.DataLoader(self.dst_train if index is None else\n                                        torch.utils.data.Subset(self.dst_train, index),\n                                                batch_size=self.args.selection_batch,\n                                                num_workers=self.args.workers)\n\n                for i, (inputs, _) in enumerate(data_loader):\n                    self.model(inputs.to(self.args.device))\n                    matrix[i * self.args.selection_batch:min((i + 1) * self.args.selection_batch,\n                                                             sample_num)] = self.model.embedding_recorder.embedding\n\n        self.model.no_grad = False\n        return matrix\n\n    def construct_matrix(self, index=None):\n        self.model.eval()\n        self.model.no_grad = True\n        with torch.no_grad():\n            with self.model.embedding_recorder:\n                sample_num = self.n_train if index is None else 
len(index)\n                matrix = []\n\n                data_loader = torch.utils.data.DataLoader(self.dst_train if index is None else\n                                    torch.utils.data.Subset(self.dst_train, index),\n                                    batch_size=self.args.selection_batch,\n                                    num_workers=self.args.workers)\n\n                for i, (inputs, _) in enumerate(data_loader):\n                    self.model(inputs.to(self.args.device))\n                    matrix.append(self.model.embedding_recorder.embedding)\n\n        self.model.no_grad = False\n        return torch.cat(matrix, dim=0)\n\n    def before_run(self):\n        self.emb_dim = self.model.get_last_layer().in_features\n\n    def finish_run(self):\n        if isinstance(self.model, MyDataParallel):\n            self.model = self.model.module\n\n    def select(self, **kwargs):\n        self.run()\n        if self.balance:\n            selection_result = np.array([], dtype=np.int32)\n            for c in range(self.args.num_classes):\n                class_index = np.arange(self.n_train)[self.dst_train.targets == c]\n\n                selection_result = np.append(selection_result, k_center_greedy(self.construct_matrix(class_index),\n                                                                               budget=round(\n                                                                                   self.fraction * len(class_index)),\n                                                                               metric=self.metric,\n                                                                               device=self.args.device,\n                                                                               random_seed=self.random_seed,\n                                                                               index=class_index,\n                                                                               
already_selected=self.already_selected[\n                                                                                   np.in1d(self.already_selected,\n                                                                                           class_index)],\n                                                                               print_freq=self.args.print_freq))\n        else:\n            matrix = self.construct_matrix()\n            del self.model_optimizer\n            del self.model\n            selection_result = k_center_greedy(matrix, budget=self.coreset_size,\n                                               metric=self.metric, device=self.args.device,\n                                               random_seed=self.random_seed,\n                                               already_selected=self.already_selected, print_freq=self.args.print_freq)\n        return {\"indices\": selection_result}\n"
  },
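The `k_center_greedy` routine above is farthest-point sampling: it maintains, for every point, the distance to its nearest selected center (`mins`) and repeatedly adds the point maximizing that distance, updating `mins` incrementally instead of recomputing all pairwise distances. A compact NumPy sketch of the same loop (`k_center_greedy_np` is an illustrative helper, without the `already_selected` and distance-matrix bookkeeping above):

```python
import numpy as np

def k_center_greedy_np(X, budget, first=0):
    """Farthest-point sampling: repeatedly add the point with the largest
    distance to its nearest already-selected center."""
    selected = [first]
    # distance of every point to its nearest selected center so far
    mins = np.linalg.norm(X - X[first], axis=1)
    for _ in range(budget - 1):
        p = int(np.argmax(mins))
        selected.append(p)
        # a new center can only shrink each point's nearest-center distance
        mins = np.minimum(mins, np.linalg.norm(X - X[p], axis=1))
    return np.array(selected)
```

The incremental `np.minimum` update is the same trick as the `dis_matrix`/`mins` bookkeeping above, and keeps each iteration O(n) in distance evaluations.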
  {
    "path": "deepcore/methods/methods_utils/__init__.py",
    "content": "from .euclidean import *\nfrom .cossim import *\nfrom .submodular_function import *\nfrom .submodular_optimizer import *\n"
  },
  {
    "path": "deepcore/methods/methods_utils/cossim.py",
    "content": "import numpy as np\nimport torch\n\n\ndef cossim_np(v1, v2):\n    num = np.dot(v1, v2.T)\n    denom = np.linalg.norm(v1, axis=1).reshape(-1, 1) * np.linalg.norm(v2, axis=1)\n    res = num / denom\n    res[np.isneginf(res)] = 0.\n    return 0.5 + 0.5 * res\n\ndef cossim_pair_np(v1):\n    num = np.dot(v1, v1.T)\n    norm = np.linalg.norm(v1, axis=1)\n    denom = norm.reshape(-1, 1) * norm\n    res = num / denom\n    res[np.isneginf(res)] = 0.\n    return 0.5 + 0.5 * res\n\ndef cossim(v1, v2):\n    num = torch.matmul(v1, v2.T)\n    denom = torch.norm(v1, dim=1).view(-1, 1) * torch.norm(v2, dim=1)\n    res = num / denom\n    res[torch.isneginf(res)] = 0.\n    return 0.5 + 0.5 * res\n\ndef cossim_pair(v1):\n    num = torch.matmul(v1, v1.T)\n    norm = torch.norm(v1, dim=1)\n    denom = norm.view(-1, 1) * norm\n    res = num / denom\n    res[torch.isneginf(res)] = 0.\n    return 0.5 + 0.5 * res"
  },
  {
    "path": "deepcore/methods/methods_utils/euclidean.py",
    "content": "import torch\nimport numpy as np\n\n\ndef euclidean_dist(x, y):\n    m, n = x.size(0), y.size(0)\n    xx = torch.pow(x, 2).sum(1, keepdim=True).expand(m, n)\n    yy = torch.pow(y, 2).sum(1, keepdim=True).expand(n, m).t()\n    dist = xx + yy\n    dist.addmm_(1, -2, x, y.t())\n    dist = dist.clamp(min=1e-12).sqrt()\n    return dist\n\n\ndef euclidean_dist_pair(x):\n    m = x.size(0)\n    xx = torch.pow(x, 2).sum(1, keepdim=True).expand(m, m)\n    dist = xx + xx.t()\n    dist.addmm_(1, -2, x, x.t())\n    dist = dist.clamp(min=1e-12).sqrt()\n    return dist\n\ndef euclidean_dist_np(x, y):\n    (rowx, colx) = x.shape\n    (rowy, coly) = y.shape\n    xy = np.dot(x, y.T)\n    x2 = np.repeat(np.reshape(np.sum(np.multiply(x, x), axis=1), (rowx, 1)), repeats=rowy, axis=1)\n    y2 = np.repeat(np.reshape(np.sum(np.multiply(y, y), axis=1), (rowy, 1)), repeats=rowx, axis=1).T\n    return np.sqrt(np.clip(x2 + y2 - 2. * xy, 1e-12, None))\n\ndef euclidean_dist_pair_np(x):\n    (rowx, colx) = x.shape\n    xy = np.dot(x, x.T)\n    x2 = np.repeat(np.reshape(np.sum(np.multiply(x, x), axis=1), (rowx, 1)), repeats=rowx, axis=1)\n    return np.sqrt(np.clip(x2 + x2.T - 2. * xy, 1e-12, None))\n"
  },
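All four helpers in `euclidean.py` use the expansion `||x - y||^2 = ||x||^2 + ||y||^2 - 2 x.y` rather than materializing explicit pairwise differences, and clamp before the square root because floating-point cancellation can leave tiny negative values. A minimal NumPy sketch of the same identity (`pairwise_euclidean` is an illustrative helper):

```python
import numpy as np

def pairwise_euclidean(x, y):
    """||x_i - y_j|| via ||x||^2 + ||y||^2 - 2 x.y, avoiding the
    (m, n, d) intermediate that explicit differences would allocate."""
    x2 = (x ** 2).sum(axis=1)[:, None]   # (m, 1)
    y2 = (y ** 2).sum(axis=1)[None, :]   # (1, n)
    sq = x2 + y2 - 2.0 * (x @ y.T)
    # cancellation can produce small negatives; clamp before the sqrt
    return np.sqrt(np.clip(sq, 1e-12, None))
```

The memory saving is what makes this form practical for the embedding matrices used by Herding and k-Center Greedy above.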
  {
    "path": "deepcore/methods/methods_utils/submodular_function.py",
    "content": "import numpy as np\n\n\nclass SubmodularFunction(object):\n    def __init__(self, index, similarity_kernel=None, similarity_matrix=None, already_selected=[]):\n        self.index = index\n        self.n = len(index)\n\n        self.already_selected = already_selected\n\n        assert similarity_kernel is not None or similarity_matrix is not None\n\n        # For the sample similarity matrix, the method supports two input modes, one is to input a pairwise similarity\n        # matrix for the whole sample, and the other case allows the input of a similarity kernel to be used to\n        # calculate similarities incrementally at a later time if required.\n        if similarity_kernel is not None:\n            assert callable(similarity_kernel)\n            self.similarity_kernel = self._similarity_kernel(similarity_kernel)\n        else:\n            assert similarity_matrix.shape[0] == self.n and similarity_matrix.shape[1] == self.n\n            self.similarity_matrix = similarity_matrix\n            self.similarity_kernel = lambda a, b: self.similarity_matrix[np.ix_(a, b)]\n\n    def _similarity_kernel(self, similarity_kernel):\n        return similarity_kernel\n\n\nclass FacilityLocation(SubmodularFunction):\n    def __init__(self, **kwargs):\n        super().__init__(**kwargs)\n\n        if self.already_selected.__len__()==0:\n            self.cur_max = np.zeros(self.n, dtype=np.float32)\n        else:\n            self.cur_max = np.max(self.similarity_kernel(np.arange(self.n), self.already_selected), axis=1)\n\n        self.all_idx = np.ones(self.n, dtype=bool)\n\n    def _similarity_kernel(self, similarity_kernel):\n        # Initialize a matrix to store similarity values of sample points.\n        self.sim_matrix = np.zeros([self.n, self.n], dtype=np.float32)\n        self.if_columns_calculated = np.zeros(self.n, dtype=bool)\n\n        def _func(a, b):\n            if not np.all(self.if_columns_calculated[b]):\n                if b.dtype != 
bool:\n                    temp = ~self.all_idx\n                    temp[b] = True\n                    b = temp\n                not_calculated = b & ~self.if_columns_calculated\n                self.sim_matrix[:, not_calculated] = similarity_kernel(self.all_idx, not_calculated)\n                self.if_columns_calculated[not_calculated] = True\n            return self.sim_matrix[np.ix_(a, b)]\n        return _func\n\n    def calc_gain(self, idx_gain, selected, **kwargs):\n        gains = np.maximum(0., self.similarity_kernel(self.all_idx, idx_gain) - self.cur_max.reshape(-1, 1)).sum(axis=0)\n        return gains\n\n    def calc_gain_batch(self, idx_gain, selected, **kwargs):\n        batch_idx = ~self.all_idx\n        batch_idx[0:kwargs[\"batch\"]] = True\n        gains = np.maximum(0., self.similarity_kernel(batch_idx, idx_gain) - self.cur_max[batch_idx].reshape(-1, 1)).sum(axis=0)\n        for i in range(kwargs[\"batch\"], self.n, kwargs[\"batch\"]):\n            batch_idx = ~self.all_idx\n            # i already advances in steps of the batch size, so slice [i:i + batch]\n            batch_idx[i:i + kwargs[\"batch\"]] = True\n            gains += np.maximum(0., self.similarity_kernel(batch_idx, idx_gain) - self.cur_max[batch_idx].reshape(-1,1)).sum(axis=0)\n        return gains\n\n    def update_state(self, new_selection, total_selected, **kwargs):\n        self.cur_max = np.maximum(self.cur_max, np.max(self.similarity_kernel(self.all_idx, new_selection), axis=1))\n        #self.cur_max = np.max(np.append(self.cur_max.reshape(-1, 1), self.similarity_kernel(self.all_idx, new_selection), axis=1), axis=1)\n\n\nclass GraphCut(SubmodularFunction):\n    def __init__(self, lam: float = 1., **kwargs):\n        super().__init__(**kwargs)\n        self.lam = lam\n\n        if 'similarity_matrix' in kwargs:\n            self.sim_matrix_cols_sum = np.sum(self.similarity_matrix, axis=0)\n        self.all_idx = np.ones(self.n, dtype=bool)\n\n    def _similarity_kernel(self, similarity_kernel):\n        # Initialize a matrix to 
store similarity values of sample points.\n        self.sim_matrix = np.zeros([self.n, self.n], dtype=np.float32)\n        self.sim_matrix_cols_sum = np.zeros(self.n, dtype=np.float32)\n        self.if_columns_calculated = np.zeros(self.n, dtype=bool)\n\n        def _func(a, b):\n            if not np.all(self.if_columns_calculated[b]):\n                if b.dtype != bool:\n                    temp = ~self.all_idx\n                    temp[b] = True\n                    b = temp\n                not_calculated = b & ~self.if_columns_calculated\n                self.sim_matrix[:, not_calculated] = similarity_kernel(self.all_idx, not_calculated)\n                self.sim_matrix_cols_sum[not_calculated] = np.sum(self.sim_matrix[:, not_calculated], axis=0)\n                self.if_columns_calculated[not_calculated] = True\n            return self.sim_matrix[np.ix_(a, b)]\n        return _func\n\n    def calc_gain(self, idx_gain, selected, **kwargs):\n\n        gain = -2. * np.sum(self.similarity_kernel(selected, idx_gain), axis=0) + self.lam * self.sim_matrix_cols_sum[idx_gain]\n\n        return gain\n\n    def update_state(self, new_selection, total_selected, **kwargs):\n        pass\n\n\nclass LogDeterminant(SubmodularFunction):\n    def __init__(self, **kwargs):\n        super().__init__(**kwargs)\n\n        self.all_idx = np.ones(self.n, dtype=bool)\n\n    def _similarity_kernel(self, similarity_kernel):\n        # Initialize a matrix to store similarity values of sample points.\n        self.sim_matrix = np.zeros([self.n, self.n], dtype=np.float32)\n        self.if_columns_calculated = np.zeros(self.n, dtype=bool)\n\n        def _func(a, b):\n            if not np.all(self.if_columns_calculated[b]):\n                if b.dtype != bool:\n                    temp = ~self.all_idx\n                    temp[b] = True\n                    b = temp\n                not_calculated = b & ~self.if_columns_calculated\n                self.sim_matrix[:, not_calculated] = 
similarity_kernel(self.all_idx, not_calculated)\n                self.if_columns_calculated[not_calculated] = True\n            return self.sim_matrix[np.ix_(a, b)]\n        return _func\n\n    def calc_gain(self, idx_gain, selected, **kwargs):\n        # The marginal gain of LogDeterminant is $f(x \\mid A) = \\log(s_{x,x} - S_{x,A}S_{A}^{-1}S_{x,A}^T)$.\n        # With normalized similarities ($s_{x,x}$ constant), maximizing this gain is equivalent to\n        # minimizing the quadratic term, so its negation is used as the ranking score.\n        sim_idx_gain = self.similarity_kernel(selected, idx_gain).T\n        sim_selected = self.similarity_kernel(selected, selected)\n        return -(np.dot(sim_idx_gain, np.linalg.pinv(sim_selected)) * sim_idx_gain).sum(-1)\n\n    def update_state(self, new_selection, total_selected, **kwargs):\n        pass\n"
  },
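The facility-location objective above rewards a subset whose members are collectively similar to every point. A minimal standalone sketch of the same greedy loop, with a running `cur_max` cache mirroring `update_state` (illustrative names, not imports from deepcore; assumes a symmetric similarity matrix):

```python
import numpy as np

def facility_location_greedy(sim, k):
    """Greedily pick k columns maximizing sum_i max_{j in S} sim[i, j]."""
    n = sim.shape[0]
    cur_max = np.zeros(n)               # best similarity each point has to the set so far
    selected = np.zeros(n, dtype=bool)
    for _ in range(k):
        # Marginal gain of adding candidate j: improvement over cur_max, summed over all points.
        gains = np.maximum(0.0, sim - cur_max[:, None]).sum(axis=0)
        gains[selected] = -np.inf       # never re-select
        j = int(gains.argmax())
        selected[j] = True
        cur_max = np.maximum(cur_max, sim[:, j])
    return np.flatnonzero(selected)

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 3))
sim = x @ x.T                           # any symmetric similarity works here
coreset = facility_location_greedy(sim, 3)
```

Caching `cur_max` is what keeps each step O(n^2) rather than re-evaluating the whole objective per candidate.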
  {
    "path": "deepcore/methods/methods_utils/submodular_optimizer.py",
    "content": "import numpy as np\n\n\noptimizer_choices = [\"NaiveGreedy\", \"LazyGreedy\", \"StochasticGreedy\", \"ApproximateLazyGreedy\"]\n\nclass optimizer(object):\n    def __init__(self, args, index, budget:int, already_selected=[]):\n        self.args = args\n        self.index = index\n\n        if budget <= 0 or budget > index.__len__():\n            raise ValueError(\"Illegal budget for optimizer.\")\n\n        self.n = len(index)\n        self.budget = budget\n        self.already_selected = already_selected\n\n\nclass NaiveGreedy(optimizer):\n    def __init__(self, args, index, budget:int, already_selected=[]):\n        super(NaiveGreedy, self).__init__(args, index, budget, already_selected)\n\n    def select(self, gain_function, update_state=None, **kwargs):\n        assert callable(gain_function)\n        if update_state is not None:\n            assert callable(update_state)\n        selected = np.zeros(self.n, dtype=bool)\n        selected[self.already_selected] = True\n\n        greedy_gain = np.zeros(len(self.index))\n        for i in range(sum(selected), self.budget):\n            if i % self.args.print_freq == 0:\n                print(\"| Selecting [%3d/%3d]\" % (i + 1, self.budget))\n            greedy_gain[~selected] = gain_function(~selected, selected, **kwargs)\n            current_selection = greedy_gain.argmax()\n            selected[current_selection] = True\n            greedy_gain[current_selection] = -np.inf\n            if update_state is not None:\n                update_state(np.array([current_selection]), selected, **kwargs)\n        return self.index[selected]\n\n\nclass LazyGreedy(optimizer):\n    def __init__(self, args, index, budget:int, already_selected=[]):\n        super(LazyGreedy, self).__init__(args, index, budget, already_selected)\n\n    def select(self, gain_function, update_state=None, **kwargs):\n        assert callable(gain_function)\n        if update_state is not None:\n            assert 
callable(update_state)\n        selected = np.zeros(self.n, dtype=bool)\n        selected[self.already_selected] = True\n\n        greedy_gain = np.zeros(len(self.index))\n        greedy_gain[~selected] = gain_function(~selected, selected, **kwargs)\n        greedy_gain[selected] = -np.inf\n\n        for i in range(sum(selected), self.budget):\n            if i % self.args.print_freq == 0:\n                print(\"| Selecting [%3d/%3d]\" % (i + 1, self.budget))\n            best_gain = -np.inf\n            last_max_element = -1\n            while True:\n                cur_max_element = greedy_gain.argmax()\n                if last_max_element == cur_max_element:\n                    # Select cur_max_element into the current subset\n                    selected[cur_max_element] = True\n                    greedy_gain[cur_max_element] = -np.inf\n\n                    if update_state is not None:\n                        update_state(np.array([cur_max_element]), selected, **kwargs)\n                    break\n                new_gain = gain_function(np.array([cur_max_element]), selected, **kwargs)[0]\n                greedy_gain[cur_max_element] = new_gain\n                if new_gain >= best_gain:\n                    best_gain = new_gain\n                    last_max_element = cur_max_element\n        return self.index[selected]\n\n\nclass StochasticGreedy(optimizer):\n    def __init__(self, args, index, budget:int, already_selected=[], epsilon: float=0.9):\n        super(StochasticGreedy, self).__init__(args, index, budget, already_selected)\n        self.epsilon = epsilon\n\n    def select(self, gain_function, update_state=None, **kwargs):\n        assert callable(gain_function)\n        if update_state is not None:\n            assert callable(update_state)\n        selected = np.zeros(self.n, dtype=bool)\n        selected[self.already_selected] = True\n\n        sample_size = max(round(-np.log(self.epsilon) * self.n / self.budget), 1)\n\n        greedy_gain = 
np.zeros(len(self.index))\n        all_idx = np.arange(self.n)\n        for i in range(sum(selected), self.budget):\n            if i % self.args.print_freq == 0:\n                print(\"| Selecting [%3d/%3d]\" % (i + 1, self.budget))\n\n            # Uniformly select a subset from unselected samples with size sample_size\n            subset = np.random.choice(all_idx[~selected], replace=False, size=min(sample_size, self.n - i))\n\n            if subset.__len__() == 0:\n                break\n\n            greedy_gain[subset] = gain_function(subset, selected, **kwargs)\n            current_selection = greedy_gain[subset].argmax()\n            selected[subset[current_selection]] = True\n            greedy_gain[subset[current_selection]] = -np.inf\n            if update_state is not None:\n                update_state(np.array([subset[current_selection]]), selected, **kwargs)\n        return self.index[selected]\n\n\nclass ApproximateLazyGreedy(optimizer):\n    def __init__(self, args, index, budget:int, already_selected=[], beta: float=0.9):\n        super(ApproximateLazyGreedy, self).__init__(args, index, budget, already_selected)\n        self.beta = beta\n\n    def select(self, gain_function, update_state=None, **kwargs):\n        assert callable(gain_function)\n        if update_state is not None:\n            assert callable(update_state)\n        selected = np.zeros(self.n, dtype=bool)\n        selected[self.already_selected] = True\n\n        greedy_gain = np.zeros(len(self.index))\n        greedy_gain[~selected] = gain_function(~selected, selected, **kwargs)\n        greedy_gain[selected] = -np.inf\n\n        for i in range(sum(selected), self.budget):\n            if i % self.args.print_freq == 0:\n                print(\"| Selecting [%3d/%3d]\" % (i + 1, self.budget))\n            while True:\n                cur_max_element = greedy_gain.argmax()\n                max_gain = greedy_gain[cur_max_element]\n\n                new_gain = 
gain_function(np.array([cur_max_element]), selected, **kwargs)[0]\n\n                if new_gain >= self.beta * max_gain:\n                    # Select cur_max_element into the current subset\n                    selected[cur_max_element] = True\n                    greedy_gain[cur_max_element] = -np.inf\n\n                    if update_state is not None:\n                        update_state(np.array([cur_max_element]), selected, **kwargs)\n                    break\n                else:\n                    greedy_gain[cur_max_element] = new_gain\n        return self.index[selected]\n\n\n\n\n"
  },
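`StochasticGreedy`'s speedup comes from evaluating marginal gains on a random subsample of roughly `(n / k) * log(1 / epsilon)` candidates per step instead of all unselected points. A hedged standalone sketch with a toy modular gain function (names here are illustrative, not the library's API):

```python
import numpy as np

def stochastic_greedy(gain_fn, n, k, epsilon=0.01, seed=0):
    rng = np.random.default_rng(seed)
    # Subsample size from the stochastic-greedy bound; at least one candidate per step.
    sample_size = max(int(round(-np.log(epsilon) * n / k)), 1)
    selected = np.zeros(n, dtype=bool)
    for _ in range(k):
        pool = np.flatnonzero(~selected)
        subset = rng.choice(pool, size=min(sample_size, len(pool)), replace=False)
        gains = gain_fn(subset, selected)           # marginal gains on the subsample only
        selected[subset[int(np.argmax(gains))]] = True
    return np.flatnonzero(selected)

# Modular toy gain: each element has a fixed utility, independent of the current set.
utility = np.linspace(0.0, 1.0, 100)
picked = stochastic_greedy(lambda subset, sel: utility[subset], n=100, k=5)
```

Smaller `epsilon` means larger subsamples and a tighter approximation guarantee, at higher cost per step.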
  {
    "path": "deepcore/methods/submodular.py",
    "content": "from .earlytrain import EarlyTrain\nimport numpy as np\nimport torch\nfrom .methods_utils import cossim_np, submodular_function, submodular_optimizer\nfrom ..nets.nets_utils import MyDataParallel\n\n\nclass Submodular(EarlyTrain):\n    def __init__(self, dst_train, args, fraction=0.5, random_seed=None, epochs=200, specific_model=None, balance=False,\n                 function=\"LogDeterminant\", greedy=\"ApproximateLazyGreedy\", metric=\"cossim\", **kwargs):\n        super(Submodular, self).__init__(dst_train, args, fraction, random_seed, epochs, specific_model, **kwargs)\n\n        if greedy not in submodular_optimizer.optimizer_choices:\n            raise ModuleNotFoundError(\"Greedy optimizer not found.\")\n        self._greedy = greedy\n        self._metric = metric\n        self._function = function\n\n        self.balance = balance\n\n    def before_train(self):\n        pass\n\n    def after_loss(self, outputs, loss, targets, batch_inds, epoch):\n        pass\n\n    def before_epoch(self):\n        pass\n\n    def after_epoch(self):\n        pass\n\n    def before_run(self):\n        pass\n\n    def num_classes_mismatch(self):\n        raise ValueError(\"num_classes of pretrain dataset does not match that of the training dataset.\")\n\n    def while_update(self, outputs, loss, targets, epoch, batch_idx, batch_size):\n        if batch_idx % self.args.print_freq == 0:\n            print('| Epoch [%3d/%3d] Iter[%3d/%3d]\\t\\tLoss: %.4f' % (\n                epoch, self.epochs, batch_idx + 1, (self.n_pretrain_size // batch_size) + 1, loss.item()))\n\n    def calc_gradient(self, index=None):\n        '''\n        Calculate gradients matrix on current network for specified training dataset.\n        '''\n        self.model.eval()\n\n        batch_loader = torch.utils.data.DataLoader(\n                self.dst_train if index is None else torch.utils.data.Subset(self.dst_train, index),\n                batch_size=self.args.selection_batch,\n          
      num_workers=self.args.workers)\n        sample_num = self.n_train if index is None else len(index)\n\n        self.embedding_dim = self.model.get_last_layer().in_features\n\n        # Initialize a matrix to save gradients.\n        # (on cpu)\n        gradients = []\n\n        for i, (input, targets) in enumerate(batch_loader):\n            self.model_optimizer.zero_grad()\n            outputs = self.model(input.to(self.args.device))\n            loss = self.criterion(outputs.requires_grad_(True),\n                                  targets.to(self.args.device)).sum()\n            batch_num = targets.shape[0]\n            with torch.no_grad():\n                bias_parameters_grads = torch.autograd.grad(loss, outputs)[0]\n                weight_parameters_grads = self.model.embedding_recorder.embedding.view(batch_num, 1,\n                                        self.embedding_dim).repeat(1, self.args.num_classes, 1) *\\\n                                        bias_parameters_grads.view(batch_num, self.args.num_classes,\n                                        1).repeat(1, 1, self.embedding_dim)\n                gradients.append(torch.cat([bias_parameters_grads, weight_parameters_grads.flatten(1)],\n                                            dim=1).cpu().numpy())\n\n        gradients = np.concatenate(gradients, axis=0)\n        return gradients\n\n    def finish_run(self):\n        if isinstance(self.model, MyDataParallel):\n            self.model = self.model.module\n\n        # Turn on the embedding recorder and the no_grad flag\n        with self.model.embedding_recorder:\n            self.model.no_grad = True\n            self.train_indx = np.arange(self.n_train)\n\n            if self.balance:\n                selection_result = np.array([], dtype=np.int64)\n                for c in range(self.num_classes):\n                    c_indx = self.train_indx[self.dst_train.targets == c]\n                    # Calculate gradients into a matrix\n                 
   gradients = self.calc_gradient(index=c_indx)\n                    # Instantiate a submodular function\n                    submod_function = submodular_function.__dict__[self._function](index=c_indx,\n                                        similarity_kernel=lambda a, b:cossim_np(gradients[a], gradients[b]))\n                    submod_optimizer = submodular_optimizer.__dict__[self._greedy](args=self.args,\n                                        index=c_indx, budget=round(self.fraction * len(c_indx)), already_selected=[])\n\n                    c_selection_result = submod_optimizer.select(gain_function=submod_function.calc_gain,\n                                                                 update_state=submod_function.update_state)\n                    selection_result = np.append(selection_result, c_selection_result)\n            else:\n                # Calculate gradients into a matrix\n                gradients = self.calc_gradient()\n                # Instantiate a submodular function\n                submod_function = submodular_function.__dict__[self._function](index=self.train_indx,\n                                            similarity_kernel=lambda a, b: cossim_np(gradients[a], gradients[b]))\n                submod_optimizer = submodular_optimizer.__dict__[self._greedy](args=self.args, index=self.train_indx,\n                                                                                  budget=self.coreset_size)\n                selection_result = submod_optimizer.select(gain_function=submod_function.calc_gain,\n                                                           update_state=submod_function.update_state)\n\n            self.model.no_grad = False\n        return {\"indices\": selection_result}\n\n    def select(self, **kwargs):\n        selection_result = self.run()\n        return selection_result\n\n\n"
  },
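`calc_gradient` builds one feature vector per sample from last-layer gradients: assuming cross-entropy, the bias gradient is `softmax(logits) - onehot(target)` and the weight gradient is its outer product with the penultimate embedding, both flattened and concatenated. A numpy sketch of that construction (illustrative, torch-free; names are not deepcore's):

```python
import numpy as np

def last_layer_grads(emb, logits, targets):
    """Per-sample [bias_grad | flattened weight_grad] for a linear last layer."""
    n = logits.shape[0]
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)            # softmax probabilities
    p[np.arange(n), targets] -= 1.0              # dL/dlogits = softmax - onehot
    weight = p[:, :, None] * emb[:, None, :]     # (n, num_classes, embedding_dim) outer products
    return np.concatenate([p, weight.reshape(n, -1)], axis=1)

rng = np.random.default_rng(0)
grads = last_layer_grads(rng.normal(size=(4, 6)),   # embeddings: 4 samples, dim 6
                         rng.normal(size=(4, 3)),   # logits: 3 classes
                         np.array([0, 2, 1, 0]))
```

These per-sample gradient vectors are what the cosine-similarity kernel compares when instantiating the submodular function.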
  {
    "path": "deepcore/methods/uncertainty.py",
    "content": "from .earlytrain import EarlyTrain\nimport torch\nimport numpy as np\n\n\nclass Uncertainty(EarlyTrain):\n    def __init__(self, dst_train, args, fraction=0.5, random_seed=None, epochs=200, selection_method=\"LeastConfidence\",\n                 specific_model=None, balance=False, **kwargs):\n        super().__init__(dst_train, args, fraction, random_seed, epochs, specific_model, **kwargs)\n\n        selection_choices = [\"LeastConfidence\",\n                             \"Entropy\",\n                             \"Margin\"]\n        if selection_method not in selection_choices:\n            raise NotImplementedError(\"Selection algorithm unavailable.\")\n        self.selection_method = selection_method\n\n        self.epochs = epochs\n        self.balance = balance\n\n    def before_train(self):\n        pass\n\n    def after_loss(self, outputs, loss, targets, batch_inds, epoch):\n        pass\n\n    def before_epoch(self):\n        pass\n\n    def after_epoch(self):\n        pass\n\n    def before_run(self):\n        pass\n\n    def num_classes_mismatch(self):\n        raise ValueError(\"num_classes of pretrain dataset does not match that of the training dataset.\")\n\n    def while_update(self, outputs, loss, targets, epoch, batch_idx, batch_size):\n        if batch_idx % self.args.print_freq == 0:\n            print('| Epoch [%3d/%3d] Iter[%3d/%3d]\\t\\tLoss: %.4f' % (\n            epoch, self.epochs, batch_idx + 1, (self.n_pretrain_size // batch_size) + 1, loss.item()))\n\n    def finish_run(self):\n        if self.balance:\n            selection_result = np.array([], dtype=np.int64)\n            scores = []\n            for c in range(self.args.num_classes):\n                class_index = np.arange(self.n_train)[self.dst_train.targets == c]\n                scores.append(self.rank_uncertainty(class_index))\n                selection_result = np.append(selection_result, class_index[np.argsort(scores[-1])[\n                                       
                                :round(len(class_index) * self.fraction)]])\n        else:\n            scores = self.rank_uncertainty()\n            # Lower score means higher uncertainty; sort ascending, matching the class-balanced branch.\n            selection_result = np.argsort(scores)[:self.coreset_size]\n        return {\"indices\": selection_result, \"scores\": scores}\n\n    def rank_uncertainty(self, index=None):\n        self.model.eval()\n        with torch.no_grad():\n            train_loader = torch.utils.data.DataLoader(\n                self.dst_train if index is None else torch.utils.data.Subset(self.dst_train, index),\n                batch_size=self.args.selection_batch,\n                num_workers=self.args.workers)\n\n            scores = np.array([])\n            batch_num = len(train_loader)\n\n            for i, (input, _) in enumerate(train_loader):\n                if i % self.args.print_freq == 0:\n                    print(\"| Selecting for batch [%3d/%3d]\" % (i + 1, batch_num))\n                if self.selection_method == \"LeastConfidence\":\n                    scores = np.append(scores, self.model(input.to(self.args.device)).max(axis=1).values.cpu().numpy())\n                elif self.selection_method == \"Entropy\":\n                    preds = torch.nn.functional.softmax(self.model(input.to(self.args.device)), dim=1).cpu().numpy()\n                    scores = np.append(scores, (np.log(preds + 1e-6) * preds).sum(axis=1))\n                elif self.selection_method == 'Margin':\n                    preds = torch.nn.functional.softmax(self.model(input.to(self.args.device)), dim=1)\n                    preds_argmax = torch.argmax(preds, dim=1)\n                    row_idx = torch.arange(preds.shape[0])\n                    max_preds = preds[row_idx, preds_argmax].clone()\n                    preds[row_idx, preds_argmax] = -1.0\n                    preds_sub_argmax = torch.argmax(preds, dim=1)\n                    scores = np.append(scores, (max_preds - preds[\n                        row_idx, 
preds_sub_argmax]).cpu().numpy())\n        return scores\n\n    def select(self, **kwargs):\n        selection_result = self.run()\n        return selection_result\n"
  },
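The three uncertainty scores can be checked on a toy batch. Note the conventions in `rank_uncertainty`: `Entropy` stores negative entropy, and `Margin` stores top-1 minus top-2 probability, so in each case a smaller score means a more uncertain sample (`LeastConfidence` in the file ranks by the maximum raw logit; the sketch below uses probabilities for readability):

```python
import numpy as np

probs = np.array([[0.70, 0.20, 0.10],     # a fairly confident prediction
                  [0.40, 0.35, 0.25]])    # a near-uniform, uncertain one

least_conf = probs.max(axis=1)                          # lower = less confident
entropy = -(probs * np.log(probs + 1e-6)).sum(axis=1)   # same 1e-6 stabilizer as the file
top2 = np.sort(probs, axis=1)[:, -2:]
margin = top2[:, 1] - top2[:, 0]                        # top-1 minus top-2 probability
```

Sorting any of these scores ascending (negating entropy) ranks the second, near-uniform row as more uncertain than the first.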
  {
    "path": "deepcore/methods/uniform.py",
    "content": "import numpy as np\nfrom .coresetmethod import CoresetMethod\n\n\nclass Uniform(CoresetMethod):\n    def __init__(self, dst_train, args, fraction=0.5, random_seed=None, balance=False, replace=False, **kwargs):\n        super().__init__(dst_train, args, fraction, random_seed)\n        self.balance = balance\n        self.replace = replace\n        self.n_train = len(dst_train)\n\n    def select_balance(self):\n        \"\"\"The same sampling proportions were used in each class separately.\"\"\"\n        np.random.seed(self.random_seed)\n        self.index = np.array([], dtype=np.int64)\n        all_index = np.arange(self.n_train)\n        for c in range(self.num_classes):\n            c_index = (self.dst_train.targets == c)\n            self.index = np.append(self.index,\n                                   np.random.choice(all_index[c_index], round(self.fraction * c_index.sum().item()),\n                                                    replace=self.replace))\n        return self.index\n\n    def select_no_balance(self):\n        np.random.seed(self.random_seed)\n        self.index = np.random.choice(np.arange(self.n_train), round(self.n_train * self.fraction),\n                                      replace=self.replace)\n\n        return  self.index\n\n    def select(self, **kwargs):\n        return {\"indices\": self.select_balance() if self.balance else self.select_no_balance()}\n"
  },
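`select_balance` draws the same fraction from every class so the coreset preserves the label distribution. A standalone sketch of that branch using numpy's `Generator` API (function name is illustrative):

```python
import numpy as np

def balanced_uniform(targets, fraction, seed=0):
    """Sample round(fraction * class_size) indices from each class, without replacement."""
    rng = np.random.default_rng(seed)
    picked = []
    for c in np.unique(targets):
        class_idx = np.flatnonzero(targets == c)
        picked.append(rng.choice(class_idx, size=round(fraction * len(class_idx)),
                                 replace=False))
    return np.concatenate(picked)

targets = np.array([0] * 10 + [1] * 6)
sel = balanced_uniform(targets, fraction=0.5)   # 5 indices of class 0, 3 of class 1
```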
  {
    "path": "deepcore/nets/__init__.py",
    "content": "from .alexnet import *\nfrom .inceptionv3 import *\nfrom .lenet import *\nfrom .mlp import *\nfrom .mobilenetv3 import *\nfrom .resnet import *\nfrom .vgg import *\nfrom .wideresnet import *\n"
  },
  {
    "path": "deepcore/nets/alexnet.py",
    "content": "import torch.nn as nn\nfrom torch import set_grad_enabled\nfrom torchvision import models\nimport torch\nfrom .nets_utils import EmbeddingRecorder\n\n\n# Acknowledgement to\n# https://github.com/kuangliu/pytorch-cifar,\n# https://github.com/BIGBALLON/CIFAR-ZOO,\n\nclass AlexNet_32x32(nn.Module):\n    def __init__(self, channel, num_classes, record_embedding=False, no_grad=False):\n        super().__init__()\n        self.features = nn.Sequential(\n            nn.Conv2d(channel, 128, kernel_size=5, stride=1, padding=4 if channel == 1 else 2),\n            nn.ReLU(inplace=True),\n            nn.MaxPool2d(kernel_size=2, stride=2),\n            nn.Conv2d(128, 192, kernel_size=5, padding=2),\n            nn.ReLU(inplace=True),\n            nn.MaxPool2d(kernel_size=2, stride=2),\n            nn.Conv2d(192, 256, kernel_size=3, padding=1),\n            nn.ReLU(inplace=True),\n            nn.Conv2d(256, 192, kernel_size=3, padding=1),\n            nn.ReLU(inplace=True),\n            nn.Conv2d(192, 192, kernel_size=3, padding=1),\n            nn.ReLU(inplace=True),\n            nn.MaxPool2d(kernel_size=2, stride=2),\n        )\n        self.fc = nn.Linear(192 * 4 * 4, num_classes)\n\n        self.embedding_recorder = EmbeddingRecorder(record_embedding)\n        self.no_grad = no_grad\n\n    def get_last_layer(self):\n        return self.fc\n\n    def forward(self, x):\n        with set_grad_enabled(not self.no_grad):\n            x = self.features(x)\n            x = x.view(x.size(0), -1)\n            x = self.embedding_recorder(x)\n            x = self.fc(x)\n        return x\n\n\nclass AlexNet_224x224(models.AlexNet):\n    def __init__(self, channel: int, num_classes: int, record_embedding: bool = False,\n                 no_grad: bool = False, **kwargs):\n        super().__init__(num_classes, **kwargs)\n        self.embedding_recorder = EmbeddingRecorder(record_embedding)\n        if channel != 3:\n            self.features[0] = nn.Conv2d(channel, 64, 
kernel_size=11, stride=4, padding=2)\n        self.fc = self.classifier[-1]\n        self.classifier[-1] = self.embedding_recorder\n        self.classifier.add_module(\"fc\", self.fc)\n\n        self.no_grad = no_grad\n\n    def get_last_layer(self):\n        return self.fc\n\n    def forward(self, x: torch.Tensor) -> torch.Tensor:\n        with set_grad_enabled(not self.no_grad):\n            x = self.features(x)\n            x = self.avgpool(x)\n            x = torch.flatten(x, 1)\n            x = self.classifier(x)\n        return x\n\n\ndef AlexNet(channel: int, num_classes: int, im_size, record_embedding: bool = False, no_grad: bool = False,\n            pretrained: bool = False):\n    if pretrained:\n        if im_size[0] != 224 or im_size[1] != 224:\n            raise NotImplementedError(\"torchvision pretrained models only accept inputs of size 224*224\")\n        net = AlexNet_224x224(channel=3, num_classes=1000, record_embedding=record_embedding, no_grad=no_grad)\n\n        from torch.hub import load_state_dict_from_url\n        state_dict = load_state_dict_from_url('https://download.pytorch.org/models/alexnet-owt-7be5be79.pth',\n                                              progress=True)\n        net.load_state_dict(state_dict)\n\n        if channel != 3:\n            net.features[0] = nn.Conv2d(channel, 64, kernel_size=11, stride=4, padding=2)\n        if num_classes != 1000:\n            net.fc = nn.Linear(4096, num_classes)\n            net.classifier[-1] = net.fc\n\n    elif im_size[0] == 224 and im_size[1] == 224:\n        net = AlexNet_224x224(channel=channel, num_classes=num_classes, record_embedding=record_embedding,\n                              no_grad=no_grad)\n\n    elif (channel == 1 and im_size[0] == 28 and im_size[1] == 28) or (\n            channel == 3 and im_size[0] == 32 and im_size[1] == 32):\n        net = AlexNet_32x32(channel=channel, num_classes=num_classes, record_embedding=record_embedding,\n                            
no_grad=no_grad)\n    else:\n        raise NotImplementedError(\"Network architecture for the current dataset has not been implemented.\")\n    return net\n"
  },
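A quick arithmetic check of why `AlexNet_32x32` flattens to `nn.Linear(192 * 4 * 4, num_classes)` (an assumed sanity sketch, no torch dependency): every convolution in `features` is padded to preserve spatial size, so only the three stride-2 max-pools shrink the map, taking a 32x32 input down to 4x4 with 192 channels.

```python
size = 32
for _ in range(3):          # the three MaxPool2d(kernel_size=2, stride=2) stages
    size //= 2              # padded convs keep spatial size; each pool halves it
flat_features = 192 * size * size
```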
  {
    "path": "deepcore/nets/inceptionv3.py",
    "content": "import torch\nimport torch.nn as nn\nfrom torchvision.models import inception\nfrom .nets_utils import EmbeddingRecorder\n\n\nclass BasicConv2d(nn.Module):\n\n    def __init__(self, input_channels, output_channels, **kwargs):\n        super().__init__()\n        self.conv = nn.Conv2d(input_channels, output_channels, bias=False, **kwargs)\n        self.bn = nn.BatchNorm2d(output_channels)\n        self.relu = nn.ReLU(inplace=True)\n\n    def forward(self, x):\n        x = self.conv(x)\n        x = self.bn(x)\n        x = self.relu(x)\n\n        return x\n\n\n# same naive inception module\nclass InceptionA(nn.Module):\n\n    def __init__(self, input_channels, pool_features):\n        super().__init__()\n        self.branch1x1 = BasicConv2d(input_channels, 64, kernel_size=1)\n\n        self.branch5x5 = nn.Sequential(\n            BasicConv2d(input_channels, 48, kernel_size=1),\n            BasicConv2d(48, 64, kernel_size=5, padding=2)\n        )\n\n        self.branch3x3 = nn.Sequential(\n            BasicConv2d(input_channels, 64, kernel_size=1),\n            BasicConv2d(64, 96, kernel_size=3, padding=1),\n            BasicConv2d(96, 96, kernel_size=3, padding=1)\n        )\n\n        self.branchpool = nn.Sequential(\n            nn.AvgPool2d(kernel_size=3, stride=1, padding=1),\n            BasicConv2d(input_channels, pool_features, kernel_size=3, padding=1)\n        )\n\n    def forward(self, x):\n        # x -> 1x1(same)\n        branch1x1 = self.branch1x1(x)\n\n        # x -> 1x1 -> 5x5(same)\n        branch5x5 = self.branch5x5(x)\n        # branch5x5 = self.branch5x5_2(branch5x5)\n\n        # x -> 1x1 -> 3x3 -> 3x3(same)\n        branch3x3 = self.branch3x3(x)\n\n        # x -> pool -> 1x1(same)\n        branchpool = self.branchpool(x)\n\n        outputs = [branch1x1, branch5x5, branch3x3, branchpool]\n\n        return torch.cat(outputs, 1)\n\n\n# downsample\n# Factorization into smaller convolutions\nclass InceptionB(nn.Module):\n\n    def 
__init__(self, input_channels):\n        super().__init__()\n\n        self.branch3x3 = BasicConv2d(input_channels, 384, kernel_size=3, stride=2)\n\n        self.branch3x3stack = nn.Sequential(\n            BasicConv2d(input_channels, 64, kernel_size=1),\n            BasicConv2d(64, 96, kernel_size=3, padding=1),\n            BasicConv2d(96, 96, kernel_size=3, stride=2)\n        )\n\n        self.branchpool = nn.MaxPool2d(kernel_size=3, stride=2)\n\n    def forward(self, x):\n        # x - > 3x3(downsample)\n        branch3x3 = self.branch3x3(x)\n\n        # x -> 3x3 -> 3x3(downsample)\n        branch3x3stack = self.branch3x3stack(x)\n\n        # x -> avgpool(downsample)\n        branchpool = self.branchpool(x)\n\n        # \"\"\"We can use two parallel stride 2 blocks: P and C. P is a pooling\n        # layer (either average or maximum pooling) the activation, both of\n        # them are stride 2 the filter banks of which are concatenated as in\n        # figure 10.\"\"\"\n        outputs = [branch3x3, branch3x3stack, branchpool]\n\n        return torch.cat(outputs, 1)\n\n\n# Factorizing Convolutions with Large Filter Size\nclass InceptionC(nn.Module):\n    def __init__(self, input_channels, channels_7x7):\n        super().__init__()\n        self.branch1x1 = BasicConv2d(input_channels, 192, kernel_size=1)\n\n        c7 = channels_7x7\n\n        # In theory, we could go even further and argue that one can replace any n × n\n        # convolution by a 1 × n convolution followed by a n × 1 convolution and the\n        # computational cost saving increases dramatically as n grows (see figure 6).\n        self.branch7x7 = nn.Sequential(\n            BasicConv2d(input_channels, c7, kernel_size=1),\n            BasicConv2d(c7, c7, kernel_size=(7, 1), padding=(3, 0)),\n            BasicConv2d(c7, 192, kernel_size=(1, 7), padding=(0, 3))\n        )\n\n        self.branch7x7stack = nn.Sequential(\n            BasicConv2d(input_channels, c7, kernel_size=1),\n            
BasicConv2d(c7, c7, kernel_size=(7, 1), padding=(3, 0)),\n            BasicConv2d(c7, c7, kernel_size=(1, 7), padding=(0, 3)),\n            BasicConv2d(c7, c7, kernel_size=(7, 1), padding=(3, 0)),\n            BasicConv2d(c7, 192, kernel_size=(1, 7), padding=(0, 3))\n        )\n\n        self.branch_pool = nn.Sequential(\n            nn.AvgPool2d(kernel_size=3, stride=1, padding=1),\n            BasicConv2d(input_channels, 192, kernel_size=1),\n        )\n\n    def forward(self, x):\n        # x -> 1x1(same)\n        branch1x1 = self.branch1x1(x)\n\n        # x -> 1layer 1*7 and 7*1 (same)\n        branch7x7 = self.branch7x7(x)\n\n        # x-> 2layer 1*7 and 7*1(same)\n        branch7x7stack = self.branch7x7stack(x)\n\n        # x-> avgpool (same)\n        branchpool = self.branch_pool(x)\n\n        outputs = [branch1x1, branch7x7, branch7x7stack, branchpool]\n\n        return torch.cat(outputs, 1)\n\n\nclass InceptionD(nn.Module):\n\n    def __init__(self, input_channels):\n        super().__init__()\n\n        self.branch3x3 = nn.Sequential(\n            BasicConv2d(input_channels, 192, kernel_size=1),\n            BasicConv2d(192, 320, kernel_size=3, stride=2)\n        )\n\n        self.branch7x7 = nn.Sequential(\n            BasicConv2d(input_channels, 192, kernel_size=1),\n            BasicConv2d(192, 192, kernel_size=(1, 7), padding=(0, 3)),\n            BasicConv2d(192, 192, kernel_size=(7, 1), padding=(3, 0)),\n            BasicConv2d(192, 192, kernel_size=3, stride=2)\n        )\n\n        self.branchpool = nn.AvgPool2d(kernel_size=3, stride=2)\n\n    def forward(self, x):\n        # x -> 1x1 -> 3x3(downsample)\n        branch3x3 = self.branch3x3(x)\n\n        # x -> 1x1 -> 1x7 -> 7x1 -> 3x3 (downsample)\n        branch7x7 = self.branch7x7(x)\n\n        # x -> avgpool (downsample)\n        branchpool = self.branchpool(x)\n\n        outputs = [branch3x3, branch7x7, branchpool]\n\n        return torch.cat(outputs, 1)\n\n\n# same\nclass 
InceptionE(nn.Module):\n    def __init__(self, input_channels):\n        super().__init__()\n        self.branch1x1 = BasicConv2d(input_channels, 320, kernel_size=1)\n\n        self.branch3x3_1 = BasicConv2d(input_channels, 384, kernel_size=1)\n        self.branch3x3_2a = BasicConv2d(384, 384, kernel_size=(1, 3), padding=(0, 1))\n        self.branch3x3_2b = BasicConv2d(384, 384, kernel_size=(3, 1), padding=(1, 0))\n\n        self.branch3x3stack_1 = BasicConv2d(input_channels, 448, kernel_size=1)\n        self.branch3x3stack_2 = BasicConv2d(448, 384, kernel_size=3, padding=1)\n        self.branch3x3stack_3a = BasicConv2d(384, 384, kernel_size=(1, 3), padding=(0, 1))\n        self.branch3x3stack_3b = BasicConv2d(384, 384, kernel_size=(3, 1), padding=(1, 0))\n\n        self.branch_pool = nn.Sequential(\n            nn.AvgPool2d(kernel_size=3, stride=1, padding=1),\n            BasicConv2d(input_channels, 192, kernel_size=1)\n        )\n\n    def forward(self, x):\n        # x -> 1x1 (same)\n        branch1x1 = self.branch1x1(x)\n\n        # x -> 1x1 -> 3x1\n        # x -> 1x1 -> 1x3\n        # concatenate(3x1, 1x3)\n        # \"\"\"7. 
Inception modules with expanded the filter bank outputs.\n        # This architecture is used on the coarsest (8 × 8) grids to promote\n        # high dimensional representations, as suggested by principle\n        # 2 of Section 2.\"\"\"\n        branch3x3 = self.branch3x3_1(x)\n        branch3x3 = [\n            self.branch3x3_2a(branch3x3),\n            self.branch3x3_2b(branch3x3)\n        ]\n        branch3x3 = torch.cat(branch3x3, 1)\n\n        # x -> 1x1 -> 3x3 -> 1x3\n        # x -> 1x1 -> 3x3 -> 3x1\n        # concatenate(1x3, 3x1)\n        branch3x3stack = self.branch3x3stack_1(x)\n        branch3x3stack = self.branch3x3stack_2(branch3x3stack)\n        branch3x3stack = [\n            self.branch3x3stack_3a(branch3x3stack),\n            self.branch3x3stack_3b(branch3x3stack)\n        ]\n        branch3x3stack = torch.cat(branch3x3stack, 1)\n\n        branchpool = self.branch_pool(x)\n\n        outputs = [branch1x1, branch3x3, branch3x3stack, branchpool]\n\n        return torch.cat(outputs, 1)\n\n\nclass InceptionV3_32x32(nn.Module):\n\n    def __init__(self, channel, num_classes, record_embedding=False, no_grad=False):\n        super().__init__()\n        self.Conv2d_1a_3x3 = BasicConv2d(channel, 32, kernel_size=3, padding=3 if channel == 1 else 1)\n        self.Conv2d_2a_3x3 = BasicConv2d(32, 32, kernel_size=3, padding=1)\n        self.Conv2d_2b_3x3 = BasicConv2d(32, 64, kernel_size=3, padding=1)\n        self.Conv2d_3b_1x1 = BasicConv2d(64, 80, kernel_size=1)\n        self.Conv2d_4a_3x3 = BasicConv2d(80, 192, kernel_size=3)\n\n        # naive inception module\n        self.Mixed_5b = InceptionA(192, pool_features=32)\n        self.Mixed_5c = InceptionA(256, pool_features=64)\n        self.Mixed_5d = InceptionA(288, pool_features=64)\n\n        # downsample\n        self.Mixed_6a = InceptionB(288)\n\n        self.Mixed_6b = InceptionC(768, channels_7x7=128)\n        self.Mixed_6c = InceptionC(768, channels_7x7=160)\n        self.Mixed_6d = InceptionC(768, 
channels_7x7=160)\n        self.Mixed_6e = InceptionC(768, channels_7x7=192)\n\n        # downsample\n        self.Mixed_7a = InceptionD(768)\n\n        self.Mixed_7b = InceptionE(1280)\n        self.Mixed_7c = InceptionE(2048)\n\n        # 6*6 feature size\n        self.avgpool = nn.AdaptiveAvgPool2d((1, 1))\n        self.dropout = nn.Dropout2d()\n        self.linear = nn.Linear(2048, num_classes)\n\n        self.embedding_recorder = EmbeddingRecorder(record_embedding)\n        self.no_grad = no_grad\n\n    def get_last_layer(self):\n        return self.linear\n\n    def forward(self, x):\n        with torch.set_grad_enabled(not self.no_grad):\n            # 32 -> 30\n            x = self.Conv2d_1a_3x3(x)\n            x = self.Conv2d_2a_3x3(x)\n            x = self.Conv2d_2b_3x3(x)\n            x = self.Conv2d_3b_1x1(x)\n            x = self.Conv2d_4a_3x3(x)\n\n            # 30 -> 30\n            x = self.Mixed_5b(x)\n            x = self.Mixed_5c(x)\n            x = self.Mixed_5d(x)\n\n            # 30 -> 14\n            # Efficient Grid Size Reduction to avoid representation\n            # bottleneck\n            x = self.Mixed_6a(x)\n\n            # 14 -> 14\n            # \"\"\"In practice, we have found that employing this factorization does not\n            # work well on early layers, but it gives very good results on medium\n            # grid-sizes (On m × m feature maps, where m ranges between 12 and 20).\n            # On that level, very good results can be achieved by using 1 × 7 convolutions\n            # followed by 7 × 1 convolutions.\"\"\"\n            x = self.Mixed_6b(x)\n            x = self.Mixed_6c(x)\n            x = self.Mixed_6d(x)\n            x = self.Mixed_6e(x)\n\n            # 14 -> 6\n            # Efficient Grid Size Reduction\n            x = self.Mixed_7a(x)\n\n            # 6 -> 6\n            # We are using this solution only on the coarsest grid,\n            # since that is the place where producing high dimensional\n         
   # sparse representation is the most critical as the ratio of\n            # local processing (by 1 × 1 convolutions) is increased compared\n            # to the spatial aggregation.\"\"\"\n            x = self.Mixed_7b(x)\n            x = self.Mixed_7c(x)\n\n            # 6 -> 1\n            x = self.avgpool(x)\n            x = self.dropout(x)\n            x = x.view(x.size(0), -1)\n            x = self.embedding_recorder(x)\n            x = self.linear(x)\n        return x\n\n\nclass InceptionV3_224x224(inception.Inception3):\n    def __init__(self, channel: int, num_classes: int, record_embedding: bool = False,\n                 no_grad: bool = False, **kwargs):\n        super().__init__(num_classes=num_classes, **kwargs)\n        self.embedding_recorder = EmbeddingRecorder(record_embedding)\n        if channel != 3:\n            self.Conv2d_1a_3x3 = inception.conv_block(channel, 32, kernel_size=3, stride=2)\n        self.no_grad = no_grad\n\n    def get_last_layer(self):\n        return self.fc\n\n    def _forward(self, x):\n        with torch.set_grad_enabled(not self.no_grad):\n            # N x 3 x 299 x 299\n            x = self.Conv2d_1a_3x3(x)\n            # N x 32 x 149 x 149\n            x = self.Conv2d_2a_3x3(x)\n            # N x 32 x 147 x 147\n            x = self.Conv2d_2b_3x3(x)\n            # N x 64 x 147 x 147\n            x = self.maxpool1(x)\n            # N x 64 x 73 x 73\n            x = self.Conv2d_3b_1x1(x)\n            # N x 80 x 73 x 73\n            x = self.Conv2d_4a_3x3(x)\n            # N x 192 x 71 x 71\n            x = self.maxpool2(x)\n            # N x 192 x 35 x 35\n            x = self.Mixed_5b(x)\n            # N x 256 x 35 x 35\n            x = self.Mixed_5c(x)\n            # N x 288 x 35 x 35\n            x = self.Mixed_5d(x)\n            # N x 288 x 35 x 35\n            x = self.Mixed_6a(x)\n            # N x 768 x 17 x 17\n            x = self.Mixed_6b(x)\n            # N x 768 x 17 x 17\n            x = 
self.Mixed_6c(x)\n            # N x 768 x 17 x 17\n            x = self.Mixed_6d(x)\n            # N x 768 x 17 x 17\n            x = self.Mixed_6e(x)\n            # N x 768 x 17 x 17\n            aux = None\n            if self.AuxLogits is not None:\n                if self.training:\n                    aux = self.AuxLogits(x)\n            # N x 768 x 17 x 17\n            x = self.Mixed_7a(x)\n            # N x 1280 x 8 x 8\n            x = self.Mixed_7b(x)\n            # N x 2048 x 8 x 8\n            x = self.Mixed_7c(x)\n            # N x 2048 x 8 x 8\n            # Adaptive average pooling\n            x = self.avgpool(x)\n            # N x 2048 x 1 x 1\n            x = self.dropout(x)\n            # N x 2048 x 1 x 1\n            x = torch.flatten(x, 1)\n            # N x 2048\n            x = self.embedding_recorder(x)\n            x = self.fc(x)\n            # N x 1000 (num_classes)\n            return x, aux\n\n\ndef InceptionV3(channel: int, num_classes: int, im_size, record_embedding: bool = False, no_grad: bool = False,\n                pretrained: bool = False):\n    if pretrained:\n        if im_size[0] != 224 or im_size[1] != 224:\n            raise NotImplementedError(\"torchvision pretrained models only accept inputs of size 224*224\")\n        net = InceptionV3_224x224(channel=3, num_classes=1000, record_embedding=record_embedding, no_grad=no_grad)\n\n        from torch.hub import load_state_dict_from_url\n        state_dict = load_state_dict_from_url(inception.model_urls[\"inception_v3_google\"], progress=True)\n        net.load_state_dict(state_dict)\n\n        if channel != 3:\n            net.Conv2d_1a_3x3 = inception.conv_block(channel, 32, kernel_size=3, stride=2)\n        if num_classes != 1000:\n            net.fc = nn.Linear(net.fc.in_features, num_classes)\n\n    elif im_size[0] == 224 and im_size[1] == 224:\n        net = InceptionV3_224x224(channel=channel, num_classes=num_classes, record_embedding=record_embedding,\n                                  no_grad=no_grad)\n    elif (channel == 1 and im_size[0] == 28 and im_size[1] == 28) or (\n            channel == 3 and im_size[0] == 32 and im_size[1] == 32):\n        net = InceptionV3_32x32(channel=channel, num_classes=num_classes, record_embedding=record_embedding,\n                                no_grad=no_grad)\n    else:\n        raise NotImplementedError(\"Network Architecture for current dataset has not been implemented.\")\n\n    return net\n"
  },
  {
    "path": "deepcore/nets/lenet.py",
"content": "import torch.nn as nn\nimport torch.nn.functional as F\nfrom torch import set_grad_enabled\nfrom .nets_utils import EmbeddingRecorder\n\n\n# Acknowledgement to\n# https://github.com/kuangliu/pytorch-cifar,\n# https://github.com/BIGBALLON/CIFAR-ZOO,\n\nclass LeNet(nn.Module):\n    def __init__(self, channel, num_classes, im_size, record_embedding: bool = False, no_grad: bool = False,\n                 pretrained: bool = False):\n        if pretrained:\n            raise NotImplementedError(\"torchvision pretrained models not available.\")\n        super(LeNet, self).__init__()\n        self.features = nn.Sequential(\n            nn.Conv2d(channel, 6, kernel_size=5, padding=2 if channel == 1 else 0),\n            nn.ReLU(inplace=True),\n            nn.MaxPool2d(kernel_size=2, stride=2),\n            nn.Conv2d(6, 16, kernel_size=5),\n            nn.ReLU(inplace=True),\n            nn.MaxPool2d(kernel_size=2, stride=2),\n        )\n        self.fc_1 = nn.Linear(16 * 53 * 53 if im_size[0] == im_size[1] == 224 else 16 * 5 * 5, 120)\n        self.fc_2 = nn.Linear(120, 84)\n        self.fc_3 = nn.Linear(84, num_classes)\n\n        self.embedding_recorder = EmbeddingRecorder(record_embedding)\n        self.no_grad = no_grad\n\n    def get_last_layer(self):\n        return self.fc_3\n\n    def forward(self, x):\n        with set_grad_enabled(not self.no_grad):\n            x = self.features(x)\n            x = x.view(x.size(0), -1)\n            x = F.relu(self.fc_1(x))\n            x = F.relu(self.fc_2(x))\n            x = self.embedding_recorder(x)\n            x = self.fc_3(x)\n        return x\n"
  },
  {
    "path": "deepcore/nets/mlp.py",
"content": "import torch.nn as nn\nimport torch.nn.functional as F\nfrom torch import set_grad_enabled\nfrom .nets_utils import EmbeddingRecorder\n\n# Acknowledgement to\n# https://github.com/kuangliu/pytorch-cifar,\n# https://github.com/BIGBALLON/CIFAR-ZOO,\n\n\n''' MLP '''\n\n\nclass MLP(nn.Module):\n    def __init__(self, channel, num_classes, im_size, record_embedding: bool = False, no_grad: bool = False,\n                 pretrained: bool = False):\n        if pretrained:\n            raise NotImplementedError(\"torchvision pretrained models not available.\")\n        super(MLP, self).__init__()\n        self.fc_1 = nn.Linear(im_size[0] * im_size[1] * channel, 128)\n        self.fc_2 = nn.Linear(128, 128)\n        self.fc_3 = nn.Linear(128, num_classes)\n\n        self.embedding_recorder = EmbeddingRecorder(record_embedding)\n        self.no_grad = no_grad\n\n    def get_last_layer(self):\n        return self.fc_3\n\n    def forward(self, x):\n        with set_grad_enabled(not self.no_grad):\n            out = x.view(x.size(0), -1)\n            out = F.relu(self.fc_1(out))\n            out = F.relu(self.fc_2(out))\n            out = self.embedding_recorder(out)\n            out = self.fc_3(out)\n        return out\n"
  },
  {
    "path": "deepcore/nets/mobilenetv3.py",
"content": "import torch.nn as nn\nfrom torch import set_grad_enabled, flatten, Tensor\nfrom torchvision.models import mobilenetv3\nfrom .nets_utils import EmbeddingRecorder\nimport math\n\n'''MobileNetV3 in PyTorch.\nPaper: \"Searching for MobileNetV3\"\n\nAcknowledgement to:\nhttps://github.com/d-li14/mobilenetv3.pytorch/blob/master/mobilenetv3.py\n'''\n\n\ndef _make_divisible(v, divisor, min_value=None):\n    \"\"\"\n    This function is taken from the original tf repo.\n    It ensures that all layers have a channel number that is divisible by 8\n    It can be seen here:\n    https://github.com/tensorflow/models/blob/master/research/slim/nets/mobilenet/mobilenet.py\n    \"\"\"\n    if min_value is None:\n        min_value = divisor\n    new_v = max(min_value, int(v + divisor / 2) // divisor * divisor)\n    # Make sure that round down does not go down by more than 10%.\n    if new_v < 0.9 * v:\n        new_v += divisor\n    return new_v\n\n\nclass h_sigmoid(nn.Module):\n    def __init__(self, inplace=True):\n        super(h_sigmoid, self).__init__()\n        self.relu = nn.ReLU6(inplace=inplace)\n\n    def forward(self, x):\n        return self.relu(x + 3) / 6\n\n\nclass h_swish(nn.Module):\n    def __init__(self, inplace=True):\n        super(h_swish, self).__init__()\n        self.sigmoid = h_sigmoid(inplace=inplace)\n\n    def forward(self, x):\n        return x * self.sigmoid(x)\n\n\nclass SELayer(nn.Module):\n    def __init__(self, channel, reduction=4):\n        super(SELayer, self).__init__()\n        self.avg_pool = nn.AdaptiveAvgPool2d(1)\n        self.fc = nn.Sequential(\n            nn.Linear(channel, _make_divisible(channel // reduction, 8)),\n            nn.ReLU(inplace=True),\n            nn.Linear(_make_divisible(channel // reduction, 8), channel),\n            h_sigmoid()\n        )\n\n    def forward(self, x):\n        b, c, _, _ = x.size()\n        y = 
self.avg_pool(x).view(b, c)\n        y = self.fc(y).view(b, c, 1, 1)\n        return x * y\n\n\ndef conv_3x3_bn(inp, oup, stride, padding=1):\n    return nn.Sequential(\n        nn.Conv2d(inp, oup, 3, stride, padding, bias=False),\n        nn.BatchNorm2d(oup),\n        h_swish()\n    )\n\n\ndef conv_1x1_bn(inp, oup):\n    return nn.Sequential(\n        nn.Conv2d(inp, oup, 1, 1, 0, bias=False),\n        nn.BatchNorm2d(oup),\n        h_swish()\n    )\n\n\nclass InvertedResidual(nn.Module):\n    def __init__(self, inp, hidden_dim, oup, kernel_size, stride, use_se, use_hs):\n        super(InvertedResidual, self).__init__()\n        assert stride in [1, 2]\n\n        self.identity = stride == 1 and inp == oup\n\n        if inp == hidden_dim:\n            self.conv = nn.Sequential(\n                # dw\n                nn.Conv2d(hidden_dim, hidden_dim, kernel_size, stride, (kernel_size - 1) // 2, groups=hidden_dim,\n                          bias=False),\n                nn.BatchNorm2d(hidden_dim),\n                h_swish() if use_hs else nn.ReLU(inplace=True),\n                # Squeeze-and-Excite\n                SELayer(hidden_dim) if use_se else nn.Identity(),\n                # pw-linear\n                nn.Conv2d(hidden_dim, oup, 1, 1, 0, bias=False),\n                nn.BatchNorm2d(oup),\n            )\n        else:\n            self.conv = nn.Sequential(\n                # pw\n                nn.Conv2d(inp, hidden_dim, 1, 1, 0, bias=False),\n                nn.BatchNorm2d(hidden_dim),\n                h_swish() if use_hs else nn.ReLU(inplace=True),\n                # dw\n                nn.Conv2d(hidden_dim, hidden_dim, kernel_size, stride, (kernel_size - 1) // 2, groups=hidden_dim,\n                          bias=False),\n                nn.BatchNorm2d(hidden_dim),\n                # Squeeze-and-Excite\n                SELayer(hidden_dim) if use_se else nn.Identity(),\n                h_swish() if use_hs else nn.ReLU(inplace=True),\n                # 
pw-linear\n                nn.Conv2d(hidden_dim, oup, 1, 1, 0, bias=False),\n                nn.BatchNorm2d(oup),\n            )\n\n    def forward(self, x):\n        if self.identity:\n            return x + self.conv(x)\n        else:\n            return self.conv(x)\n\n\nclass MobileNetV3_32x32(nn.Module):\n    def __init__(self, cfgs, mode, channel=3, num_classes=1000, record_embedding=False,\n                 no_grad=False, width_mult=1.):\n        super(MobileNetV3_32x32, self).__init__()\n        # setting of inverted residual blocks\n        self.cfgs = cfgs\n        assert mode in ['mobilenet_v3_large', 'mobilenet_v3_small']\n\n        self.embedding_recorder = EmbeddingRecorder(record_embedding)\n        self.no_grad = no_grad\n\n        # building first layer\n        input_channel = _make_divisible(16 * width_mult, 8)\n        layers = [conv_3x3_bn(channel, input_channel, 2, padding=3 if channel == 1 else 1)]\n        # building inverted residual blocks\n        block = InvertedResidual\n        for k, t, c, use_se, use_hs, s in self.cfgs:\n            output_channel = _make_divisible(c * width_mult, 8)\n            exp_size = _make_divisible(input_channel * t, 8)\n            layers.append(block(input_channel, exp_size, output_channel, k, s, use_se, use_hs))\n            input_channel = output_channel\n        self.features = nn.Sequential(*layers)\n        # building last several layers\n        self.conv = conv_1x1_bn(input_channel, exp_size)\n        self.avgpool = nn.AdaptiveAvgPool2d((1, 1))\n        output_channel = {'mobilenet_v3_large': 1280, 'mobilenet_v3_small': 1024}\n        output_channel = _make_divisible(output_channel[mode] * width_mult, 8) if width_mult > 1.0 else output_channel[\n            mode]\n        self.classifier = nn.Sequential(\n            nn.Linear(exp_size, output_channel),\n            h_swish(),\n            nn.Dropout(0.2),\n            self.embedding_recorder,\n            nn.Linear(output_channel, num_classes),\n    
    )\n\n        self._initialize_weights()\n\n    def forward(self, x):\n        with set_grad_enabled(not self.no_grad):\n            x = self.features(x)\n            x = self.conv(x)\n            x = self.avgpool(x)\n            x = x.view(x.size(0), -1)\n            x = self.classifier(x)\n            return x\n\n    def _initialize_weights(self):\n        for m in self.modules():\n            if isinstance(m, nn.Conv2d):\n                n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels\n                m.weight.data.normal_(0, math.sqrt(2. / n))\n                if m.bias is not None:\n                    m.bias.data.zero_()\n            elif isinstance(m, nn.BatchNorm2d):\n                m.weight.data.fill_(1)\n                m.bias.data.zero_()\n            elif isinstance(m, nn.Linear):\n                m.weight.data.normal_(0, 0.01)\n                m.bias.data.zero_()\n\n    def get_last_layer(self):\n        return self.classifier[-1]\n\n\nclass MobileNetV3_224x224(mobilenetv3.MobileNetV3):\n    def __init__(self, inverted_residual_setting, last_channel,\n                 channel=3, num_classes=1000, record_embedding=False, no_grad=False, **kwargs):\n        super(MobileNetV3_224x224, self).__init__(inverted_residual_setting, last_channel,\n                                                  num_classes=num_classes, **kwargs)\n\n        self.embedding_recorder = EmbeddingRecorder(record_embedding)\n\n        self.fc = self.classifier[-1]\n        self.classifier[-1] = self.embedding_recorder\n        self.classifier.add_module(\"fc\", self.fc)\n\n        self.no_grad = no_grad\n\n    def get_last_layer(self):\n        return self.fc\n\n    def _forward_impl(self, x: Tensor) -> Tensor:\n        with set_grad_enabled(not self.no_grad):\n            x = self.features(x)\n            x = self.avgpool(x)\n            x = flatten(x, 1)\n            x = self.classifier(x)\n            return x\n\n\ndef MobileNetV3(arch: str, channel: int, num_classes: 
int, im_size, record_embedding: bool = False,\n                no_grad: bool = False,\n                pretrained: bool = False, **kwargs):\n    arch = arch.lower()\n    if pretrained:\n        if channel != 3:\n            raise NotImplementedError(\"Network Architecture for current dataset has not been implemented.\")\n\n        inverted_residual_setting, last_channel = mobilenetv3._mobilenet_v3_conf(arch)\n        net = MobileNetV3_224x224(inverted_residual_setting=inverted_residual_setting, last_channel=last_channel,\n                                  channel=3, num_classes=1000, record_embedding=record_embedding, no_grad=no_grad,\n                                  **kwargs)\n\n        from torch.hub import load_state_dict_from_url\n        state_dict = load_state_dict_from_url(mobilenetv3.model_urls[arch], progress=True)\n        net.load_state_dict(state_dict)\n\n        if num_classes != 1000:\n            net.fc = nn.Linear(last_channel, num_classes)\n            net.classifier[-1] = net.fc\n\n    elif im_size[0] == 224 and im_size[1] == 224:\n        if channel != 3:\n            raise NotImplementedError(\"Network Architecture for current dataset has not been implemented.\")\n        inverted_residual_setting, last_channel = mobilenetv3._mobilenet_v3_conf(arch)\n        net = MobileNetV3_224x224(inverted_residual_setting=inverted_residual_setting, last_channel=last_channel,\n                                  channel=channel, num_classes=num_classes, record_embedding=record_embedding,\n                                  no_grad=no_grad, **kwargs)\n\n    elif (channel == 1 and im_size[0] == 28 and im_size[1] == 28) or (\n            channel == 3 and im_size[0] == 32 and im_size[1] == 32):\n        if arch == \"mobilenet_v3_large\":\n            cfgs = [\n                # k, t, c, SE, HS, s\n                [3, 1, 16, 0, 0, 1],\n                [3, 4, 24, 0, 0, 2],\n                [3, 3, 24, 0, 0, 1],\n                [5, 3, 40, 1, 0, 2],\n                
[5, 3, 40, 1, 0, 1],\n                [5, 3, 40, 1, 0, 1],\n                [3, 6, 80, 0, 1, 2],\n                [3, 2.5, 80, 0, 1, 1],\n                [3, 2.3, 80, 0, 1, 1],\n                [3, 2.3, 80, 0, 1, 1],\n                [3, 6, 112, 1, 1, 1],\n                [3, 6, 112, 1, 1, 1],\n                [5, 6, 160, 1, 1, 2],\n                [5, 6, 160, 1, 1, 1],\n                [5, 6, 160, 1, 1, 1]\n            ]\n            net = MobileNetV3_32x32(cfgs, arch, channel=channel, num_classes=num_classes,\n                                    record_embedding=record_embedding, no_grad=no_grad)\n        elif arch == \"mobilenet_v3_small\":\n            cfgs = [\n                # k, t, c, SE, HS, s\n                [3, 1, 16, 1, 0, 2],\n                [3, 4.5, 24, 0, 0, 2],\n                [3, 3.67, 24, 0, 0, 1],\n                [5, 4, 40, 1, 1, 2],\n                [5, 6, 40, 1, 1, 1],\n                [5, 6, 40, 1, 1, 1],\n                [5, 3, 48, 1, 1, 1],\n                [5, 3, 48, 1, 1, 1],\n                [5, 6, 96, 1, 1, 2],\n                [5, 6, 96, 1, 1, 1],\n                [5, 6, 96, 1, 1, 1],\n            ]\n            net = MobileNetV3_32x32(cfgs, arch, channel=channel, num_classes=num_classes,\n                                    record_embedding=record_embedding, no_grad=no_grad)\n        else:\n            raise ValueError(\"Model architecture not found.\")\n    else:\n        raise NotImplementedError(\"Network Architecture for current dataset has not been implemented.\")\n    return net\n\n\ndef MobileNetV3Large(channel: int, num_classes: int, im_size, record_embedding: bool = False, no_grad: bool = False,\n                     pretrained: bool = False, **kwargs):\n    return MobileNetV3(\"mobilenet_v3_large\", channel, num_classes, im_size, record_embedding, no_grad,\n                       pretrained, **kwargs)\n\n\ndef MobileNetV3Small(channel: int, num_classes: int, im_size, record_embedding: bool = False, no_grad: bool = 
False,\n                     pretrained: bool = False, **kwargs):\n    return MobileNetV3(\"mobilenet_v3_small\", channel, num_classes, im_size, record_embedding, no_grad,\n                       pretrained, **kwargs)\n"
  },
  {
    "path": "deepcore/nets/nets_utils/__init__.py",
    "content": "from .parallel import *\nfrom .recorder import *"
  },
  {
    "path": "deepcore/nets/nets_utils/parallel.py",
"content": "from torch.nn import DataParallel\n\n\nclass MyDataParallel(DataParallel):\n    \"\"\"DataParallel wrapper that forwards attribute access to the wrapped module,\n    so model-specific attributes (e.g. embedding_recorder, no_grad) stay reachable.\"\"\"\n\n    def __getattr__(self, name):\n        try:\n            return super().__getattr__(name)\n        except AttributeError:\n            # Fall back to the wrapped module for attributes DataParallel lacks.\n            return getattr(self.module, name)\n\n    def __setattr__(self, name, value):\n        try:\n            # Always set no_grad on the wrapped module, where forward() reads it.\n            if name == \"no_grad\":\n                return setattr(self.module, name, value)\n            return super().__setattr__(name, value)\n        except AttributeError:\n            return setattr(self.module, name, value)\n"
  },
  {
    "path": "deepcore/nets/nets_utils/recorder.py",
"content": "from torch import nn\n\n\nclass EmbeddingRecorder(nn.Module):\n    \"\"\"Identity layer that optionally caches its input in self.embedding, exposing\n    the features fed to the final classifier; can also be used as a context manager\n    to toggle recording on and off.\"\"\"\n\n    def __init__(self, record_embedding: bool = False):\n        super().__init__()\n        self.record_embedding = record_embedding\n\n    def forward(self, x):\n        if self.record_embedding:\n            self.embedding = x\n        return x\n\n    def __enter__(self):\n        self.record_embedding = True\n\n    def __exit__(self, exc_type, exc_val, exc_tb):\n        self.record_embedding = False\n"
  },
  {
    "path": "deepcore/nets/resnet.py",
    "content": "import torch.nn as nn\nimport torch.nn.functional as F\nfrom torch import set_grad_enabled, flatten, Tensor\nfrom .nets_utils import EmbeddingRecorder\nfrom torchvision.models import resnet\n\n\n# Acknowledgement to\n# https://github.com/kuangliu/pytorch-cifar,\n# https://github.com/BIGBALLON/CIFAR-ZOO,\n\n\ndef conv3x3(in_planes, out_planes, stride=1):\n    return nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride, padding=1, bias=False)\n\n\nclass BasicBlock(nn.Module):\n    expansion = 1\n\n    def __init__(self, in_planes, planes, stride=1):\n        super(BasicBlock, self).__init__()\n        self.conv1 = conv3x3(in_planes, planes, stride)\n        self.bn1 = nn.BatchNorm2d(planes)\n        self.conv2 = conv3x3(planes, planes)\n        self.bn2 = nn.BatchNorm2d(planes)\n\n        self.shortcut = nn.Sequential()\n        if stride != 1 or in_planes != self.expansion * planes:\n            self.shortcut = nn.Sequential(\n                nn.Conv2d(in_planes, self.expansion * planes, kernel_size=1, stride=stride, bias=False),\n                nn.BatchNorm2d(self.expansion * planes)\n            )\n\n    def forward(self, x):\n        out = F.relu(self.bn1(self.conv1(x)))\n        out = self.bn2(self.conv2(out))\n        out += self.shortcut(x)\n        out = F.relu(out)\n        return out\n\n\nclass Bottleneck(nn.Module):\n    expansion = 4\n\n    def __init__(self, in_planes, planes, stride=1):\n        super(Bottleneck, self).__init__()\n        self.conv1 = nn.Conv2d(in_planes, planes, kernel_size=1, bias=False)\n        self.bn1 = nn.BatchNorm2d(planes)\n        self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=stride, padding=1, bias=False)\n        self.bn2 = nn.BatchNorm2d(planes)\n        self.conv3 = nn.Conv2d(planes, self.expansion * planes, kernel_size=1, bias=False)\n        self.bn3 = nn.BatchNorm2d(self.expansion * planes)\n\n        self.shortcut = nn.Sequential()\n        if stride != 1 or in_planes != 
self.expansion * planes:\n            self.shortcut = nn.Sequential(\n                nn.Conv2d(in_planes, self.expansion * planes, kernel_size=1, stride=stride, bias=False),\n                nn.BatchNorm2d(self.expansion * planes)\n            )\n\n    def forward(self, x):\n        out = F.relu(self.bn1(self.conv1(x)))\n        out = F.relu(self.bn2(self.conv2(out)))\n        out = self.bn3(self.conv3(out))\n        out += self.shortcut(x)\n        out = F.relu(out)\n        return out\n\n\nclass ResNet_32x32(nn.Module):\n    def __init__(self, block, num_blocks, channel=3, num_classes=10, record_embedding: bool = False,\n                 no_grad: bool = False):\n        super().__init__()\n        self.in_planes = 64\n\n        self.conv1 = conv3x3(channel, 64)\n        self.bn1 = nn.BatchNorm2d(64)\n        self.layer1 = self._make_layer(block, 64, num_blocks[0], stride=1)\n        self.layer2 = self._make_layer(block, 128, num_blocks[1], stride=2)\n        self.layer3 = self._make_layer(block, 256, num_blocks[2], stride=2)\n        self.layer4 = self._make_layer(block, 512, num_blocks[3], stride=2)\n        self.linear = nn.Linear(512 * block.expansion, num_classes)\n\n        self.embedding_recorder = EmbeddingRecorder(record_embedding)\n        self.no_grad = no_grad\n\n    def get_last_layer(self):\n        return self.linear\n\n    def _make_layer(self, block, planes, num_blocks, stride):\n        strides = [stride] + [1] * (num_blocks - 1)\n        layers = []\n        for stride in strides:\n            layers.append(block(self.in_planes, planes, stride))\n            self.in_planes = planes * block.expansion\n        return nn.Sequential(*layers)\n\n    def forward(self, x):\n        with set_grad_enabled(not self.no_grad):\n            out = F.relu(self.bn1(self.conv1(x)))\n            out = self.layer1(out)\n            out = self.layer2(out)\n            out = self.layer3(out)\n            out = self.layer4(out)\n            out = F.avg_pool2d(out, 
4)\n            out = out.view(out.size(0), -1)\n            out = self.embedding_recorder(out)\n            out = self.linear(out)\n        return out\n\n\nclass ResNet_224x224(resnet.ResNet):\n    def __init__(self, block, layers, channel: int, num_classes: int, record_embedding: bool = False,\n                 no_grad: bool = False, **kwargs):\n        super().__init__(block, layers, **kwargs)\n        self.embedding_recorder = EmbeddingRecorder(record_embedding)\n        if channel != 3:\n            self.conv1 = nn.Conv2d(channel, 64, kernel_size=7, stride=2, padding=3, bias=False)\n        if num_classes != 1000:\n            self.fc = nn.Linear(self.fc.in_features, num_classes)\n        self.no_grad = no_grad\n\n    def get_last_layer(self):\n        return self.fc\n\n    def _forward_impl(self, x: Tensor) -> Tensor:\n        # See note [TorchScript super()]\n        with set_grad_enabled(not self.no_grad):\n            x = self.conv1(x)\n            x = self.bn1(x)\n            x = self.relu(x)\n            x = self.maxpool(x)\n\n            x = self.layer1(x)\n            x = self.layer2(x)\n            x = self.layer3(x)\n            x = self.layer4(x)\n\n            x = self.avgpool(x)\n            x = flatten(x, 1)\n            x = self.embedding_recorder(x)\n            x = self.fc(x)\n\n        return x\n\n\ndef ResNet(arch: str, channel: int, num_classes: int, im_size, record_embedding: bool = False, no_grad: bool = False,\n           pretrained: bool = False):\n    arch = arch.lower()\n    if pretrained:\n        if arch == \"resnet18\":\n            net = ResNet_224x224(resnet.BasicBlock, [2, 2, 2, 2], channel=3, num_classes=1000,\n                                 record_embedding=record_embedding, no_grad=no_grad)\n        elif arch == \"resnet34\":\n            net = ResNet_224x224(resnet.BasicBlock, [3, 4, 6, 3], channel=3, num_classes=1000,\n                                 record_embedding=record_embedding, no_grad=no_grad)\n        elif arch 
== \"resnet50\":\n            net = ResNet_224x224(resnet.Bottleneck, [3, 4, 6, 3], channel=3, num_classes=1000,\n                                 record_embedding=record_embedding, no_grad=no_grad)\n        elif arch == \"resnet101\":\n            net = ResNet_224x224(resnet.Bottleneck, [3, 4, 23, 3], channel=3, num_classes=1000,\n                                 record_embedding=record_embedding, no_grad=no_grad)\n        elif arch == \"resnet152\":\n            net = ResNet_224x224(resnet.Bottleneck, [3, 8, 36, 3], channel=3, num_classes=1000,\n                                 record_embedding=record_embedding, no_grad=no_grad)\n        else:\n            raise ValueError(\"Model architecture not found.\")\n        from torch.hub import load_state_dict_from_url\n        state_dict = load_state_dict_from_url(resnet.model_urls[arch], progress=True)\n        net.load_state_dict(state_dict)\n\n        if channel != 3:\n            net.conv1 = nn.Conv2d(channel, 64, kernel_size=7, stride=2, padding=3, bias=False)\n        if num_classes != 1000:\n            net.fc = nn.Linear(net.fc.in_features, num_classes)\n\n    elif im_size[0] == 224 and im_size[1] == 224:\n        if arch == \"resnet18\":\n            net = ResNet_224x224(resnet.BasicBlock, [2, 2, 2, 2], channel=channel, num_classes=num_classes,\n                                 record_embedding=record_embedding, no_grad=no_grad)\n        elif arch == \"resnet34\":\n            net = ResNet_224x224(resnet.BasicBlock, [3, 4, 6, 3], channel=channel, num_classes=num_classes,\n                                 record_embedding=record_embedding, no_grad=no_grad)\n        elif arch == \"resnet50\":\n            net = ResNet_224x224(resnet.Bottleneck, [3, 4, 6, 3], channel=channel, num_classes=num_classes,\n                                 record_embedding=record_embedding, no_grad=no_grad)\n        elif arch == \"resnet101\":\n            net = ResNet_224x224(resnet.Bottleneck, [3, 4, 23, 3], channel=channel, 
num_classes=num_classes,\n                                 record_embedding=record_embedding, no_grad=no_grad)\n        elif arch == \"resnet152\":\n            net = ResNet_224x224(resnet.Bottleneck, [3, 8, 36, 3], channel=channel, num_classes=num_classes,\n                                 record_embedding=record_embedding, no_grad=no_grad)\n        else:\n            raise ValueError(\"Model architecture not found.\")\n    elif (channel == 1 and im_size[0] == 28 and im_size[1] == 28) or (\n            channel == 3 and im_size[0] == 32 and im_size[1] == 32):\n        if arch == \"resnet18\":\n            net = ResNet_32x32(BasicBlock, [2, 2, 2, 2], channel=channel, num_classes=num_classes,\n                               record_embedding=record_embedding, no_grad=no_grad)\n        elif arch == \"resnet34\":\n            net = ResNet_32x32(BasicBlock, [3, 4, 6, 3], channel=channel, num_classes=num_classes,\n                               record_embedding=record_embedding, no_grad=no_grad)\n        elif arch == \"resnet50\":\n            net = ResNet_32x32(Bottleneck, [3, 4, 6, 3], channel=channel, num_classes=num_classes,\n                               record_embedding=record_embedding, no_grad=no_grad)\n        elif arch == \"resnet101\":\n            net = ResNet_32x32(Bottleneck, [3, 4, 23, 3], channel=channel, num_classes=num_classes,\n                               record_embedding=record_embedding, no_grad=no_grad)\n        elif arch == \"resnet152\":\n            net = ResNet_32x32(Bottleneck, [3, 8, 36, 3], channel=channel, num_classes=num_classes,\n                               record_embedding=record_embedding, no_grad=no_grad)\n        else:\n            raise ValueError(\"Model architecture not found.\")\n    else:\n        raise NotImplementedError(\"Network Architecture for current dataset has not been implemented.\")\n    return net\n\n\ndef ResNet18(channel: int, num_classes: int, im_size, record_embedding: bool = False, no_grad: bool = False,\n   
          pretrained: bool = False):\n    return ResNet(\"resnet18\", channel, num_classes, im_size, record_embedding, no_grad, pretrained)\n\n\ndef ResNet34(channel: int, num_classes: int, im_size, record_embedding: bool = False, no_grad: bool = False,\n             pretrained: bool = False):\n    return ResNet(\"resnet34\", channel, num_classes, im_size, record_embedding, no_grad, pretrained)\n\n\ndef ResNet50(channel: int, num_classes: int, im_size, record_embedding: bool = False, no_grad: bool = False,\n             pretrained: bool = False):\n    return ResNet(\"resnet50\", channel, num_classes, im_size, record_embedding, no_grad, pretrained)\n\n\ndef ResNet101(channel: int, num_classes: int, im_size, record_embedding: bool = False, no_grad: bool = False,\n              pretrained: bool = False):\n    return ResNet(\"resnet101\", channel, num_classes, im_size, record_embedding, no_grad, pretrained)\n\n\ndef ResNet152(channel: int, num_classes: int, im_size, record_embedding: bool = False, no_grad: bool = False,\n              pretrained: bool = False):\n    return ResNet(\"resnet152\", channel, num_classes, im_size, record_embedding, no_grad, pretrained)\n"
  },
  {
    "path": "deepcore/nets/vgg.py",
    "content": "import torch.nn as nn\nfrom torch import set_grad_enabled, flatten, Tensor\nfrom .nets_utils import EmbeddingRecorder\nfrom torchvision.models import vgg\n\n# Acknowledgement to\n# https://github.com/kuangliu/pytorch-cifar,\n# https://github.com/BIGBALLON/CIFAR-ZOO,\n\ncfg_vgg = {\n    'vgg11': [64, 'M', 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M'],\n    'vgg13': [64, 64, 'M', 128, 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M'],\n    'vgg16': [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 'M', 512, 512, 512, 'M', 512, 512, 512, 'M'],\n    'vgg19': [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 256, 'M', 512, 512, 512, 512, 'M', 512, 512, 512, 512, 'M'],\n}\n\n\nclass VGG_32x32(nn.Module):\n    def __init__(self, vgg_name, channel, num_classes, record_embedding=False, no_grad=False):\n        super(VGG_32x32, self).__init__()\n        self.channel = channel\n        self.features = self._make_layers(cfg_vgg[vgg_name])\n        self.classifier = nn.Linear(512 if vgg_name != 'VGGS' else 128, num_classes)\n\n        self.embedding_recorder = EmbeddingRecorder(record_embedding)\n        self.no_grad = no_grad\n\n    def forward(self, x):\n        with set_grad_enabled(not self.no_grad):\n            x = self.features(x)\n            x = x.view(x.size(0), -1)\n            x = self.embedding_recorder(x)\n            x = self.classifier(x)\n        return x\n\n    def get_last_layer(self):\n        return self.classifier\n\n    def _make_layers(self, cfg):\n        layers = []\n        in_channels = self.channel\n        for ic, x in enumerate(cfg):\n            if x == 'M':\n                layers += [nn.MaxPool2d(kernel_size=2, stride=2)]\n            else:\n                layers += [nn.Conv2d(in_channels, x, kernel_size=3, padding=3 if self.channel == 1 and ic == 0 else 1),\n                           nn.BatchNorm2d(x),\n                           nn.ReLU(inplace=True)]\n                in_channels = x\n        layers += 
[nn.AvgPool2d(kernel_size=1, stride=1)]\n        return nn.Sequential(*layers)\n\n\nclass VGG_224x224(vgg.VGG):\n    def __init__(self, features: nn.Module, channel: int, num_classes: int, record_embedding: bool = False,\n                 no_grad: bool = False, **kwargs):\n        super(VGG_224x224, self).__init__(features, num_classes, **kwargs)\n        self.embedding_recorder = EmbeddingRecorder(record_embedding)\n        if channel != 3:\n            self.features[0] = nn.Conv2d(channel, 64, kernel_size=3, padding=1)\n        self.fc = self.classifier[-1]\n        self.classifier[-1] = self.embedding_recorder\n        self.classifier.add_module(\"fc\", self.fc)\n\n        self.no_grad = no_grad\n\n    def get_last_layer(self):\n        return self.fc\n\n    def forward(self, x: Tensor) -> Tensor:\n        with set_grad_enabled(not self.no_grad):\n            x = self.features(x)\n            x = self.avgpool(x)\n            x = flatten(x, 1)\n            x = self.classifier(x)\n            return x\n\n\ndef VGG(arch: str, channel: int, num_classes: int, im_size, record_embedding: bool = False, no_grad: bool = False,\n        pretrained: bool = False):\n    arch = arch.lower()\n    if pretrained:\n        if im_size[0] != 224 or im_size[1] != 224:\n            raise NotImplementedError(\"torchvision pretrained models only accept inputs of size 224*224\")\n        net = VGG_224x224(features=vgg.make_layers(cfg_vgg[arch], True), channel=3, num_classes=1000,\n                          record_embedding=record_embedding, no_grad=no_grad)\n\n        from torch.hub import load_state_dict_from_url\n        state_dict = load_state_dict_from_url(vgg.model_urls[arch], progress=True)\n        net.load_state_dict(state_dict)\n\n        if channel != 3:\n            net.features[0] = nn.Conv2d(channel, 64, kernel_size=3, padding=1)\n\n        if num_classes != 1000:\n            net.fc = nn.Linear(4096, num_classes)\n            net.classifier[-1] = net.fc\n\n    elif 
im_size[0] == 224 and im_size[1] == 224:\n        net = VGG_224x224(features=vgg.make_layers(cfg_vgg[arch], True), channel=channel, num_classes=num_classes,\n                          record_embedding=record_embedding, no_grad=no_grad)\n\n    elif (channel == 1 and im_size[0] == 28 and im_size[1] == 28) or (\n            channel == 3 and im_size[0] == 32 and im_size[1] == 32):\n        net = VGG_32x32(arch, channel, num_classes=num_classes, record_embedding=record_embedding, no_grad=no_grad)\n    else:\n        raise NotImplementedError(\"Network Architecture for current dataset has not been implemented.\")\n    return net\n\n\ndef VGG11(channel: int, num_classes: int, im_size, record_embedding: bool = False, no_grad: bool = False,\n          pretrained: bool = False):\n    return VGG(\"vgg11\", channel, num_classes, im_size, record_embedding, no_grad, pretrained)\n\n\ndef VGG13(channel: int, num_classes: int, im_size, record_embedding: bool = False, no_grad: bool = False,\n          pretrained: bool = False):\n    return VGG('vgg13', channel, num_classes, im_size, record_embedding, no_grad, pretrained)\n\n\ndef VGG16(channel: int, num_classes: int, im_size, record_embedding: bool = False, no_grad: bool = False,\n          pretrained: bool = False):\n    return VGG('vgg16', channel, num_classes, im_size, record_embedding, no_grad, pretrained)\n\n\ndef VGG19(channel: int, num_classes: int, im_size, record_embedding: bool = False, no_grad: bool = False,\n          pretrained: bool = False):\n    return VGG('vgg19', channel, num_classes, im_size, record_embedding, no_grad, pretrained)\n"
  },
  {
    "path": "deepcore/nets/wideresnet.py",
    "content": "import torch\nimport torch.nn as nn\nimport torch.nn.functional as F\nfrom .nets_utils import EmbeddingRecorder\nfrom torchvision.models import resnet\nfrom .resnet import ResNet_224x224\n\n\n# Acknowledgement to\n# https://github.com/xternalz/WideResNet-pytorch\n\nclass BasicBlock(nn.Module):\n    def __init__(self, in_planes, out_planes, stride, dropRate=0.0):\n        super(BasicBlock, self).__init__()\n        self.bn1 = nn.BatchNorm2d(in_planes)\n        self.relu1 = nn.ReLU(inplace=True)\n        self.conv1 = nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride,\n                               padding=1, bias=False)\n        self.bn2 = nn.BatchNorm2d(out_planes)\n        self.relu2 = nn.ReLU(inplace=True)\n        self.conv2 = nn.Conv2d(out_planes, out_planes, kernel_size=3, stride=1,\n                               padding=1, bias=False)\n        self.droprate = dropRate\n        self.equalInOut = (in_planes == out_planes)\n        self.convShortcut = (not self.equalInOut) and nn.Conv2d(in_planes, out_planes, kernel_size=1, stride=stride,\n                                                                padding=0, bias=False) or None\n\n    def forward(self, x):\n        if not self.equalInOut:\n            x = self.relu1(self.bn1(x))\n        else:\n            out = self.relu1(self.bn1(x))\n        out = self.relu2(self.bn2(self.conv1(out if self.equalInOut else x)))\n        if self.droprate > 0:\n            out = F.dropout(out, p=self.droprate, training=self.training)\n        out = self.conv2(out)\n        return torch.add(x if self.equalInOut else self.convShortcut(x), out)\n\n\nclass NetworkBlock(nn.Module):\n    def __init__(self, nb_layers, in_planes, out_planes, block, stride, dropRate=0.0):\n        super(NetworkBlock, self).__init__()\n        self.layer = self._make_layer(block, in_planes, out_planes, nb_layers, stride, dropRate)\n\n    def _make_layer(self, block, in_planes, out_planes, nb_layers, stride, dropRate):\n   
     layers = []\n        for i in range(int(nb_layers)):\n            layers.append(block(i == 0 and in_planes or out_planes, out_planes, i == 0 and stride or 1, dropRate))\n        return nn.Sequential(*layers)\n\n    def forward(self, x):\n        return self.layer(x)\n\n\nclass WideResNet_32x32(nn.Module):\n    def __init__(self, depth, num_classes, channel=3, widen_factor=1, drop_rate=0.0, record_embedding=False,\n                 no_grad=False):\n        super(WideResNet_32x32, self).__init__()\n        nChannels = [16, 16 * widen_factor, 32 * widen_factor, 64 * widen_factor]\n        assert ((depth - 4) % 6 == 0)\n        n = (depth - 4) / 6\n        block = BasicBlock\n        # 1st conv before any network block\n        self.conv1 = nn.Conv2d(channel, nChannels[0], kernel_size=3, stride=1,\n                               padding=3 if channel == 1 else 1, bias=False)\n        # 1st block\n        self.block1 = NetworkBlock(n, nChannels[0], nChannels[1], block, 1, drop_rate)\n        # 2nd block\n        self.block2 = NetworkBlock(n, nChannels[1], nChannels[2], block, 2, drop_rate)\n        # 3rd block\n        self.block3 = NetworkBlock(n, nChannels[2], nChannels[3], block, 2, drop_rate)\n        # global average pooling and classifier\n        self.bn1 = nn.BatchNorm2d(nChannels[3])\n        self.relu = nn.ReLU(inplace=True)\n        self.fc = nn.Linear(nChannels[3], num_classes)\n        self.nChannels = nChannels[3]\n\n        for m in self.modules():\n            if isinstance(m, nn.Conv2d):\n                nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')\n            elif isinstance(m, nn.BatchNorm2d):\n                m.weight.data.fill_(1)\n                m.bias.data.zero_()\n            elif isinstance(m, nn.Linear):\n                m.bias.data.zero_()\n\n        self.embedding_recorder = EmbeddingRecorder(record_embedding)\n        self.no_grad = no_grad\n\n    def get_last_layer(self):\n        return self.fc\n\n    def 
forward(self, x):\n        with torch.set_grad_enabled(not self.no_grad):\n            out = self.conv1(x)\n            out = self.block1(out)\n            out = self.block2(out)\n            out = self.block3(out)\n            out = self.relu(self.bn1(out))\n            out = F.avg_pool2d(out, 8)\n            out = out.view(-1, self.nChannels)\n            out = self.embedding_recorder(out)\n        return self.fc(out)\n\n\ndef WideResNet(arch: str, channel: int, num_classes: int, im_size, record_embedding: bool = False,\n               no_grad: bool = False, pretrained: bool = False):\n    arch = arch.lower()\n    if pretrained:\n        if im_size[0] != 224 or im_size[1] != 224:\n            raise NotImplementedError(\"torchvision pretrained models only accept inputs of size 224*224\")\n        if arch == \"wrn502\":\n            arch = \"wide_resnet50_2\"\n            net = ResNet_224x224(resnet.Bottleneck, [3, 4, 6, 3], channel=3, num_classes=1000,\n                                 record_embedding=record_embedding, no_grad=no_grad, width_per_group=64 * 2)\n        elif arch == \"wrn1012\":\n            arch = \"wide_resnet101_2\"\n            net = ResNet_224x224(resnet.Bottleneck, [3, 4, 23, 3], channel=3, num_classes=1000,\n                                 record_embedding=record_embedding, no_grad=no_grad, width_per_group=64 * 2)\n        else:\n            raise ValueError(\"Model architecture not found.\")\n        from torch.hub import load_state_dict_from_url\n        state_dict = load_state_dict_from_url(resnet.model_urls[arch], progress=True)\n        net.load_state_dict(state_dict)\n\n        if channel != 3:\n            net.conv1 = nn.Conv2d(channel, 64, kernel_size=7, stride=2, padding=3, bias=False)\n        if num_classes != 1000:\n            net.fc = nn.Linear(net.fc.in_features, num_classes)\n\n    elif im_size[0] == 224 and im_size[1] == 224:\n        # Use torchvision models without pretrained parameters\n        if arch == 
\"wrn502\":\n            arch = \"wide_resnet50_2\"\n            net = ResNet_224x224(resnet.Bottleneck, [3, 4, 6, 3], channel=channel, num_classes=num_classes,\n                                 record_embedding=record_embedding, no_grad=no_grad, width_per_group=64 * 2)\n        elif arch == \"wrn1012\":\n            arch = \"wide_resnet101_2\"\n            net = ResNet_224x224(resnet.Bottleneck, [3, 4, 23, 3], channel=channel, num_classes=num_classes,\n                                 record_embedding=record_embedding, no_grad=no_grad, width_per_group=64 * 2)\n        else:\n            raise ValueError(\"Model architecture not found.\")\n\n    elif (channel == 1 and im_size[0] == 28 and im_size[1] == 28) or (\n            channel == 3 and im_size[0] == 32 and im_size[1] == 32):\n        if arch == \"wrn168\":\n            net = WideResNet_32x32(16, num_classes, channel, 8)\n        elif arch == \"wrn2810\":\n            net = WideResNet_32x32(28, num_classes, channel, 10)\n        elif arch == \"wrn282\":\n            net = WideResNet_32x32(28, num_classes, channel, 2)\n        else:\n            raise ValueError(\"Model architecture not found.\")\n    else:\n        raise NotImplementedError(\"Network Architecture for current dataset has not been implemented.\")\n    return net\n\n\ndef WRN168(channel: int, num_classes: int, im_size, record_embedding: bool = False, no_grad: bool = False,\n           pretrained: bool = False):\n    return WideResNet(\"wrn168\", channel, num_classes, im_size, record_embedding, no_grad, pretrained)\n\n\ndef WRN2810(channel: int, num_classes: int, im_size, record_embedding: bool = False, no_grad: bool = False,\n            pretrained: bool = False):\n    return WideResNet(\"wrn2810\", channel, num_classes, im_size, record_embedding, no_grad, pretrained)\n\n\ndef WRN282(channel: int, num_classes: int, im_size, record_embedding: bool = False, no_grad: bool = False,\n           pretrained: bool = False):\n    return 
WideResNet('wrn282', channel, num_classes, im_size, record_embedding, no_grad, pretrained)\n\n\ndef WRN502(channel: int, num_classes: int, im_size, record_embedding: bool = False, no_grad: bool = False,\n           pretrained: bool = False):\n    return WideResNet(\"wrn502\", channel, num_classes, im_size, record_embedding, no_grad, pretrained)\n\n\ndef WRN1012(channel: int, num_classes: int, im_size, record_embedding: bool = False, no_grad: bool = False,\n            pretrained: bool = False):\n    return WideResNet(\"wrn1012\", channel, num_classes, im_size, record_embedding, no_grad, pretrained)\n"
  },
  {
    "path": "main.py",
"content": "import os\nimport torch.nn as nn\nimport argparse\nimport deepcore.nets as nets\nimport deepcore.datasets as datasets\nimport deepcore.methods as methods\nfrom torchvision import transforms\nfrom utils import *\nfrom datetime import datetime\nfrom time import sleep\n\n\ndef main():\n    parser = argparse.ArgumentParser(description='Parameter Processing')\n\n    # Basic arguments\n    parser.add_argument('--dataset', type=str, default='CIFAR10', help='dataset')\n    parser.add_argument('--model', type=str, default='ResNet18', help='model')\n    parser.add_argument('--selection', type=str, default=\"uniform\", help=\"selection method\")\n    parser.add_argument('--num_exp', type=int, default=5, help='the number of experiments')\n    parser.add_argument('--num_eval', type=int, default=10, help='the number of randomly initialized models to evaluate')\n    parser.add_argument('--epochs', default=200, type=int, help='number of total epochs to run')\n    parser.add_argument('--data_path', type=str, default='data', help='dataset path')\n    parser.add_argument('--gpu', default=None, nargs=\"+\", type=int, help='GPU ids to use')\n    parser.add_argument('--print_freq', '-p', default=20, type=int, help='print frequency (default: 20)')\n    parser.add_argument('--fraction', default=0.1, type=float, help='fraction of data to be selected (default: 0.1)')\n    parser.add_argument('--seed', default=int(time.time() * 1000) % 100000, type=int, help=\"random seed\")\n    parser.add_argument('-j', '--workers', default=4, type=int, metavar='N',\n                        help='number of data loading workers (default: 4)')\n    parser.add_argument(\"--cross\", type=str, nargs=\"+\", default=None, help=\"models for cross-architecture experiments\")\n\n    # Optimizer and scheduler\n    parser.add_argument('--optimizer', default=\"SGD\", help='optimizer to use, e.g. 
SGD, Adam')\n    parser.add_argument('--lr', type=float, default=0.1, help='learning rate for updating network parameters')\n    parser.add_argument('--min_lr', type=float, default=1e-4, help='minimum learning rate')\n    parser.add_argument('--momentum', default=0.9, type=float, metavar='M',\n                        help='momentum (default: 0.9)')\n    parser.add_argument('-wd', '--weight_decay', default=5e-4, type=float,\n                        metavar='W', help='weight decay (default: 5e-4)',\n                        dest='weight_decay')\n    parser.add_argument(\"--nesterov\", default=True, type=str_to_bool, help=\"whether to use Nesterov momentum\")\n    parser.add_argument(\"--scheduler\", default=\"CosineAnnealingLR\", type=str, help=\n    \"Learning rate scheduler\")\n    parser.add_argument(\"--gamma\", type=float, default=.5, help=\"Gamma value for StepLR\")\n    parser.add_argument(\"--step_size\", type=float, default=50, help=\"Step size for StepLR\")\n\n    # Training\n    parser.add_argument('--batch', '--batch-size', \"-b\", default=256, type=int, metavar='N',\n                        help='mini-batch size (default: 256)')\n    parser.add_argument(\"--train_batch\", \"-tb\", default=None, type=int,\n                     help=\"batch size for training; if not specified, it defaults to the batch size in argument --batch\")\n    parser.add_argument(\"--selection_batch\", \"-sb\", default=None, type=int,\n                     help=\"batch size for selection; if not specified, it defaults to the batch size in argument --batch\")\n\n    # Testing\n    parser.add_argument(\"--test_interval\", '-ti', default=1, type=int, help=\n    \"the number of training epochs to be performed between two test epochs; a value of 0 means no test will be run (default: 1)\")\n    parser.add_argument(\"--test_fraction\", '-tf', type=float, default=1.,\n                        help=\"proportion of test dataset used for evaluating the model (default: 1.)\")\n\n    # Selecting\n    
parser.add_argument(\"--selection_epochs\", \"-se\", default=40, type=int,\n                        help=\"number of epochs for performing selection on the full dataset\")\n    parser.add_argument('--selection_momentum', '-sm', default=0.9, type=float, metavar='M',\n                        help='momentum while performing selection (default: 0.9)')\n    parser.add_argument('--selection_weight_decay', '-swd', default=5e-4, type=float,\n                        metavar='W', help='weight decay while performing selection (default: 5e-4)',\n                        dest='selection_weight_decay')\n    parser.add_argument('--selection_optimizer', \"-so\", default=\"SGD\",\n                        help='optimizer to use while performing selection, e.g. SGD, Adam')\n    parser.add_argument(\"--selection_nesterov\", \"-sn\", default=True, type=str_to_bool,\n                        help=\"whether to use Nesterov momentum while performing selection\")\n    parser.add_argument('--selection_lr', '-slr', type=float, default=0.1, help='learning rate for selection')\n    parser.add_argument(\"--selection_test_interval\", '-sti', default=1, type=int, help=\n    \"the number of training epochs to be performed between two test epochs during selection (default: 1)\")\n    parser.add_argument(\"--selection_test_fraction\", '-stf', type=float, default=1.,\n             help=\"proportion of test dataset used for evaluating the model while performing selection (default: 1.)\")\n    parser.add_argument('--balance', default=True, type=str_to_bool,\n                        help=\"whether to perform balanced selection per class\")\n\n    # Algorithm\n    parser.add_argument('--submodular', default=\"GraphCut\", help=\"specify submodular function to use\")\n    parser.add_argument('--submodular_greedy', default=\"LazyGreedy\", help=\"specify greedy algorithm for submodular optimization\")\n    parser.add_argument('--uncertainty', default=\"Entropy\", help=\"specify uncertainty score to use\")\n\n    # 
Checkpoint and resumption\n    parser.add_argument('--save_path', \"-sp\", type=str, default='', help='path to save results (default: do not save)')\n    parser.add_argument('--resume', '-r', type=str, default='', help=\"path to latest checkpoint (default: do not load)\")\n\n    args = parser.parse_args()\n    args.device = 'cuda' if torch.cuda.is_available() else 'cpu'\n\n    if args.train_batch is None:\n        args.train_batch = args.batch\n    if args.selection_batch is None:\n        args.selection_batch = args.batch\n    if args.save_path != \"\" and not os.path.exists(args.save_path):\n        os.mkdir(args.save_path)\n    if not os.path.exists(args.data_path):\n        os.mkdir(args.data_path)\n\n    if args.resume != \"\":\n        # Load checkpoint\n        try:\n            print(\"=> Loading checkpoint '{}'\".format(args.resume))\n            checkpoint = torch.load(args.resume, map_location=args.device)\n            assert {\"exp\", \"epoch\", \"state_dict\", \"opt_dict\", \"best_acc1\", \"rec\", \"subset\", \"sel_args\"} <= set(\n                checkpoint.keys())\n            assert 'indices' in checkpoint[\"subset\"].keys()\n            start_exp = checkpoint['exp']\n            start_epoch = checkpoint[\"epoch\"]\n        except AssertionError:\n            try:\n                assert {\"exp\", \"subset\", \"sel_args\"} <= set(checkpoint.keys())\n                assert 'indices' in checkpoint[\"subset\"].keys()\n                print(\"=> The checkpoint only contains the subset, training will start from the beginning\")\n                start_exp = checkpoint['exp']\n                start_epoch = 0\n            except AssertionError:\n                print(\"=> Failed to load the checkpoint, an empty one will be created\")\n                checkpoint = {}\n                start_exp = 0\n                start_epoch = 0\n    else:\n        checkpoint = {}\n        start_exp = 0\n        start_epoch = 0\n\n    for exp in range(start_exp, 
args.num_exp):\n        if args.save_path != \"\":\n            # Use the current experiment index (not start_exp) so each run gets a distinct checkpoint name.\n            checkpoint_name = \"{dst}_{net}_{mtd}_exp{exp}_epoch{epc}_{dat}_{fr}_\".format(dst=args.dataset,\n                                                                                         net=args.model,\n                                                                                         mtd=args.selection,\n                                                                                         dat=datetime.now(),\n                                                                                         exp=exp,\n                                                                                         epc=args.epochs,\n                                                                                         fr=args.fraction)\n\n        print('\\n================== Exp %d ==================\\n' % exp)\n        print(\"dataset: \", args.dataset, \", model: \", args.model, \", selection: \", args.selection, \", num_ex: \",\n              args.num_exp, \", epochs: \", args.epochs, \", fraction: \", args.fraction, \", seed: \", args.seed,\n              \", lr: \", args.lr, \", save_path: \", args.save_path, \", resume: \", args.resume, \", device: \", args.device,\n              \", checkpoint_name: \" + checkpoint_name if args.save_path != \"\" else \"\", \"\\n\", sep=\"\")\n\n        channel, im_size, num_classes, class_names, mean, std, dst_train, dst_test = datasets.__dict__[args.dataset] \\\n            (args.data_path)\n        args.channel, args.im_size, args.num_classes, args.class_names = channel, im_size, num_classes, class_names\n\n        torch.random.manual_seed(args.seed)\n\n        if \"subset\" in checkpoint.keys():\n            subset = checkpoint['subset']\n            selection_args = checkpoint[\"sel_args\"]\n        else:\n            selection_args = dict(epochs=args.selection_epochs,\n                                  selection_method=args.uncertainty,\n                       
           balance=args.balance,\n                                  greedy=args.submodular_greedy,\n                                  function=args.submodular\n                                  )\n            method = methods.__dict__[args.selection](dst_train, args, args.fraction, args.seed, **selection_args)\n            subset = method.select()\n        print(len(subset[\"indices\"]))\n\n        # Augmentation\n        if args.dataset == \"CIFAR10\" or args.dataset == \"CIFAR100\":\n            dst_train.transform = transforms.Compose(\n                [transforms.RandomCrop(args.im_size, padding=4, padding_mode=\"reflect\"),\n                 transforms.RandomHorizontalFlip(), dst_train.transform])\n        elif args.dataset == \"ImageNet\":\n            dst_train.transform = transforms.Compose([\n                transforms.RandomResizedCrop(224),\n                transforms.RandomHorizontalFlip(),\n                transforms.ToTensor(),\n                transforms.Normalize(mean, std)\n            ])\n\n        # Handle weighted subset\n        if_weighted = \"weights\" in subset.keys()\n        if if_weighted:\n            dst_subset = WeightedSubset(dst_train, subset[\"indices\"], subset[\"weights\"])\n        else:\n            dst_subset = torch.utils.data.Subset(dst_train, subset[\"indices\"])\n\n        # BackgroundGenerator for ImageNet to speed up dataloaders\n        if args.dataset == \"ImageNet\":\n            train_loader = DataLoaderX(dst_subset, batch_size=args.train_batch, shuffle=True,\n                                       num_workers=args.workers, pin_memory=True)\n            test_loader = DataLoaderX(dst_test, batch_size=args.train_batch, shuffle=False,\n                                      num_workers=args.workers, pin_memory=True)\n        else:\n            train_loader = torch.utils.data.DataLoader(dst_subset, batch_size=args.train_batch, shuffle=True,\n                                                       num_workers=args.workers, 
pin_memory=True)\n            test_loader = torch.utils.data.DataLoader(dst_test, batch_size=args.train_batch, shuffle=False,\n                                                      num_workers=args.workers, pin_memory=True)\n\n        # Listing cross-architecture experiment settings if specified.\n        models = [args.model]\n        if isinstance(args.cross, list):\n            for model in args.cross:\n                if model != args.model:\n                    models.append(model)\n\n        for model in models:\n            if len(models) > 1:\n                print(\"| Training on model %s\" % model)\n\n            network = nets.__dict__[model](channel, num_classes, im_size).to(args.device)\n\n            if args.device == \"cpu\":\n                print(\"Using CPU.\")\n            elif args.gpu is not None:\n                torch.cuda.set_device(args.gpu[0])\n                network = nets.nets_utils.MyDataParallel(network, device_ids=args.gpu)\n            elif torch.cuda.device_count() > 1:\n                network = nets.nets_utils.MyDataParallel(network).cuda()\n\n            if \"state_dict\" in checkpoint.keys():\n                # Loading model state_dict\n                network.load_state_dict(checkpoint[\"state_dict\"])\n\n            criterion = nn.CrossEntropyLoss(reduction='none').to(args.device)\n\n            # Optimizer\n            if args.optimizer == \"SGD\":\n                optimizer = torch.optim.SGD(network.parameters(), args.lr, momentum=args.momentum,\n                                            weight_decay=args.weight_decay, nesterov=args.nesterov)\n            elif args.optimizer == \"Adam\":\n                optimizer = torch.optim.Adam(network.parameters(), args.lr, weight_decay=args.weight_decay)\n            else:\n                optimizer = torch.optim.__dict__[args.optimizer](network.parameters(), args.lr, momentum=args.momentum,\n                                                                 
weight_decay=args.weight_decay, nesterov=args.nesterov)\n\n            # LR scheduler\n            if args.scheduler == \"CosineAnnealingLR\":\n                scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, len(train_loader) * args.epochs,\n                                                                       eta_min=args.min_lr)\n            elif args.scheduler == \"StepLR\":\n                scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=len(train_loader) * args.step_size,\n                                                            gamma=args.gamma)\n            else:\n                scheduler = torch.optim.lr_scheduler.__dict__[args.scheduler](optimizer)\n            scheduler.last_epoch = (start_epoch - 1) * len(train_loader)\n\n            if \"opt_dict\" in checkpoint.keys():\n                optimizer.load_state_dict(checkpoint[\"opt_dict\"])\n\n            # Log recorder\n            if \"rec\" in checkpoint.keys():\n                rec = checkpoint[\"rec\"]\n            else:\n                rec = init_recorder()\n\n            best_prec1 = checkpoint[\"best_acc1\"] if \"best_acc1\" in checkpoint.keys() else 0.0\n\n            # Save the checkpoint with only the subset.\n            if args.save_path != \"\" and args.resume == \"\":\n                save_checkpoint({\"exp\": exp,\n                                 \"subset\": subset,\n                                 \"sel_args\": selection_args},\n                                os.path.join(args.save_path, checkpoint_name + (\"\" if model == args.model else model\n                                             + \"_\") + \"unknown.ckpt\"), 0, 0.)\n\n            for epoch in range(start_epoch, args.epochs):\n                # train for one epoch\n                train(train_loader, network, criterion, optimizer, scheduler, epoch, args, rec, if_weighted=if_weighted)\n\n                # evaluate on validation set\n                if args.test_interval > 0 and (epoch + 
1) % args.test_interval == 0:\n                    prec1 = test(test_loader, network, criterion, epoch, args, rec)\n\n                    # Remember best prec@1 and save checkpoint
                    is_best = prec1 > best_prec1\n\n                    if is_best:\n                        best_prec1 = prec1\n                        if args.save_path != \"\":\n                            rec = record_ckpt(rec, epoch)\n                            save_checkpoint({\"exp\": exp,\n                                             \"epoch\": epoch + 1,\n                                             \"state_dict\": network.state_dict(),\n                                             \"opt_dict\": optimizer.state_dict(),\n                                             \"best_acc1\": best_prec1,\n                                             \"rec\": rec,\n                                             \"subset\": subset,\n                                             \"sel_args\": selection_args},\n                                            os.path.join(args.save_path, checkpoint_name + (\n                                                \"\" if model == args.model else model + \"_\") + \"unknown.ckpt\"),\n                                            epoch=epoch, prec=best_prec1)\n\n            # Prepare for the next checkpoint
            if args.save_path != \"\":\n                try:\n                    os.rename(\n                        os.path.join(args.save_path, checkpoint_name + (\"\" if model == args.model else model + \"_\") +\n                                     \"unknown.ckpt\"), os.path.join(args.save_path, checkpoint_name +\n                                     (\"\" if model == args.model else model + \"_\") + \"%f.ckpt\" % best_prec1))\n                except OSError:\n                    save_checkpoint({\"exp\": exp,\n                                     \"epoch\": args.epochs,\n                                     \"state_dict\": network.state_dict(),\n                  
                   \"opt_dict\": optimizer.state_dict(),\n                                     \"best_acc1\": best_prec1,\n                                     \"rec\": rec,\n                                     \"subset\": subset,\n                                     \"sel_args\": selection_args},\n                                    os.path.join(args.save_path, checkpoint_name +\n                                                 (\"\" if model == args.model else model + \"_\") + \"%f.ckpt\" % best_prec1),\n                                    epoch=args.epochs - 1,\n                                    prec=best_prec1)\n\n            print('| Best accuracy: ', best_prec1, \", on model \" + model if len(models) > 1 else \"\", end=\"\\n\\n\")\n            start_epoch = 0\n            checkpoint = {}\n            sleep(2)\n\n\nif __name__ == '__main__':\n    main()\n"
  },
  {
    "path": "requirements.txt",
    "content": "numpy==1.22\nprefetch_generator==1.0.1\nrequests==2.25.1\nscipy==1.5.3\ntorch==1.10.1\ntorchvision==0.11.2\n"
  },
  {
    "path": "utils.py",
    "content": "import time, torch\nfrom argparse import ArgumentTypeError\nfrom prefetch_generator import BackgroundGenerator\n\n\nclass WeightedSubset(torch.utils.data.Subset):\n    def __init__(self, dataset, indices, weights) -> None:\n        self.dataset = dataset\n        assert len(indices) == len(weights)\n        self.indices = indices\n        self.weights = weights\n\n    def __getitem__(self, idx):\n        if isinstance(idx, list):\n            return self.dataset[[self.indices[i] for i in idx]], self.weights[[i for i in idx]]\n        return self.dataset[self.indices[idx]], self.weights[idx]\n\n\ndef train(train_loader, network, criterion, optimizer, scheduler, epoch, args, rec, if_weighted: bool = False):\n    \"\"\"Train for one epoch on the training set\"\"\"\n    batch_time = AverageMeter('Time', ':6.3f')\n    losses = AverageMeter('Loss', ':.4e')\n    top1 = AverageMeter('Acc@1', ':6.2f')\n\n    # switch to train mode\n    network.train()\n\n    end = time.time()\n    for i, contents in enumerate(train_loader):\n        optimizer.zero_grad()\n        if if_weighted:\n            target = contents[0][1].to(args.device)\n            input = contents[0][0].to(args.device)\n\n            # Compute output\n            output = network(input)\n            weights = contents[1].to(args.device).requires_grad_(False)\n            loss = torch.sum(criterion(output, target) * weights) / torch.sum(weights)\n        else:\n            target = contents[1].to(args.device)\n            input = contents[0].to(args.device)\n\n            # Compute output\n            output = network(input)\n            loss = criterion(output, target).mean()\n\n        # Measure accuracy and record loss\n        prec1 = accuracy(output.data, target, topk=(1,))[0]\n        losses.update(loss.data.item(), input.size(0))\n        top1.update(prec1.item(), input.size(0))\n\n        # Compute gradient and do SGD step\n        loss.backward()\n        optimizer.step()\n        
scheduler.step()\n\n        # Measure elapsed time\n        batch_time.update(time.time() - end)\n        end = time.time()\n\n        if i % args.print_freq == 0:\n            print('Epoch: [{0}][{1}/{2}]\\t'\n                  'Time {batch_time.val:.3f} ({batch_time.avg:.3f})\\t'\n                  'Loss {loss.val:.4f} ({loss.avg:.4f})\\t'\n                  'Prec@1 {top1.val:.3f} ({top1.avg:.3f})'.format(\n                epoch, i, len(train_loader), batch_time=batch_time,\n                loss=losses, top1=top1))\n\n    record_train_stats(rec, epoch, losses.avg, top1.avg, optimizer.state_dict()['param_groups'][0]['lr'])\n\n\ndef test(test_loader, network, criterion, epoch, args, rec):\n    batch_time = AverageMeter('Time', ':6.3f')\n    losses = AverageMeter('Loss', ':.4e')\n    top1 = AverageMeter('Acc@1', ':6.2f')\n\n    # Switch to evaluate mode\n    network.eval()\n    network.no_grad = True\n\n    end = time.time()\n    for i, (input, target) in enumerate(test_loader):\n        target = target.to(args.device)\n        input = input.to(args.device)\n\n        # Compute output\n        with torch.no_grad():\n            output = network(input)\n\n            loss = criterion(output, target).mean()\n\n        # Measure accuracy and record loss\n        prec1 = accuracy(output.data, target, topk=(1,))[0]\n        losses.update(loss.data.item(), input.size(0))\n        top1.update(prec1.item(), input.size(0))\n\n        # Measure elapsed time\n        batch_time.update(time.time() - end)\n        end = time.time()\n\n        if i % args.print_freq == 0:\n            print('Test: [{0}/{1}]\\t'\n                  'Time {batch_time.val:.3f} ({batch_time.avg:.3f})\\t'\n                  'Loss {loss.val:.4f} ({loss.avg:.4f})\\t'\n                  'Prec@1 {top1.val:.3f} ({top1.avg:.3f})'.format(\n                i, len(test_loader), batch_time=batch_time, loss=losses,\n                top1=top1))\n\n    print(' * Prec@1 {top1.avg:.3f}'.format(top1=top1))\n\n    
network.no_grad = False\n\n    record_test_stats(rec, epoch, losses.avg, top1.avg)\n    return top1.avg\n\n\nclass AverageMeter(object):\n    \"\"\"Computes and stores the average and current value\"\"\"\n\n    def __init__(self, name, fmt=':f'):\n        self.name = name\n        self.fmt = fmt\n        self.reset()\n\n    def reset(self):\n        self.val = 0\n        self.avg = 0\n        self.sum = 0\n        self.count = 0\n\n    def update(self, val, n=1):\n        self.val = val\n        self.sum += val * n\n        self.count += n\n        self.avg = self.sum / self.count\n\n    def __str__(self):\n        fmtstr = '{name} {val' + self.fmt + '} ({avg' + self.fmt + '})'\n        return fmtstr.format(**self.__dict__)\n\n\ndef accuracy(output, target, topk=(1,)):\n    \"\"\"Computes the accuracy over the k top predictions for the specified values of k\"\"\"\n    with torch.no_grad():\n        maxk = max(topk)\n        batch_size = target.size(0)\n\n        _, pred = output.topk(maxk, 1, True, True)\n        pred = pred.t()\n        correct = pred.eq(target.view(1, -1).expand_as(pred))\n\n        res = []\n        for k in topk:\n            correct_k = correct[:k].reshape(-1).float().sum(0, keepdim=True)\n            res.append(correct_k.mul_(100.0 / batch_size))\n        return res\n\n\ndef str_to_bool(v):\n    # Handle boolean type in arguments.\n    if isinstance(v, bool):\n        return v\n    if v.lower() in ('yes', 'true', 't', 'y', '1'):\n        return True\n    elif v.lower() in ('no', 'false', 'f', 'n', '0'):\n        return False\n    else:\n        raise ArgumentTypeError('Boolean value expected.')\n\n\ndef save_checkpoint(state, path, epoch, prec):\n    print(\"=> Saving checkpoint for epoch %d, with Prec@1 %f.\" % (epoch, prec))\n    torch.save(state, path)\n\n\ndef init_recorder():\n    from types import SimpleNamespace\n    rec = SimpleNamespace()\n    rec.train_step = []\n    rec.train_loss = []\n    rec.train_acc = []\n    rec.lr = []\n    
rec.test_step = []\n    rec.test_loss = []\n    rec.test_acc = []\n    rec.ckpts = []\n    return rec\n\n\ndef record_train_stats(rec, step, loss, acc, lr):\n    rec.train_step.append(step)\n    rec.train_loss.append(loss)\n    rec.train_acc.append(acc)\n    rec.lr.append(lr)\n    return rec\n\n\ndef record_test_stats(rec, step, loss, acc):\n    rec.test_step.append(step)\n    rec.test_loss.append(loss)\n    rec.test_acc.append(acc)\n    return rec\n\n\ndef record_ckpt(rec, step):\n    rec.ckpts.append(step)\n    return rec\n\n\nclass DataLoaderX(torch.utils.data.DataLoader):\n    def __iter__(self):\n        return BackgroundGenerator(super().__iter__())\n"
  }
]