[
  {
    "path": ".gitignore",
    "content": "*.checkpoint*\n**data/\n*__pycache__*\n*mig*\n**ttt*/\n**/*mig*/\n**log/**\n**/log/**\n*events*\n*.txt\n*.idea/\n"
  },
  {
    "path": "README.md",
    "content": "# YOPO (You Only Propagate Once: Accelerating Adversarial Training via Maximal Principle)\nCode for our  [paper](https://arxiv.org/abs/1905.00877): \"You Only Propagate Once: Accelerating Adversarial Training via Maximal Principle\" by [Dinghuai Zhang](https://zdhnarsil.github.io), [Tianyuan Zhang](http://tianyuanzhang.com), [Yiping Lu](https://web.stanford.edu/~yplu/), [Zhanxing Zhu](https://sites.google.com/view/zhanxingzhu/), [Bin Dong](http://bicmr.pku.edu.cn/~dongbin/).\n\nOur paper has been accepted by **NeurIPS2019**.\n\n![The Pipeline of YOPO](/images/pipeline.jpg)\n\n\n## Prerequisites\n* Pytorch==1.0.1, torchvision\n* Python 3.5\n* tensorboardX\n* easydict\n* tqdm\n\n## Intall\n```bash\ngit clone https://github.com/a1600012888/YOPO-You-Only-Propagate-Once.git\ncd YOPO-You-Only-Propagate-Once\npip3 install -r requirements.txt --user\n```\n\n## How to run our code\n\n### Natural training and PGD training \n* normal training: `experiments/CIFAR10/wide34.natural`\n* PGD adversarial training: `experiments/CIFAR10/wide34.pgd10`\nrun `python train.py -d <whcih_gpu>`\n\nYou can change all the hyper-parameters in `config.py`. And change network in `network.py`\nActually code in above mentioned director is very **flexible** and can be easiliy modified. It can be used as a **template**. \n\n### YOPO training\nGo to directory `experiments/CIFAR10/wide34.yopo-5-3`\nrun `python train.py -d <whcih_gpu>`\n\nYou can change all the hyper-parameters in `config.py`. And change network in `network.py`\nRuning this code for the first time will dowload the dataset in `./experiments/CIFAR10/data/`, you can modify the path in `dataset.py`\n\n<!--\n## Experiment results\n\n<center class=\"half\">\n    <img src=\"https://s2.ax1x.com/2019/05/16/EbamrT.jpg\" width=\"300\"/><img src=\"https://s2.ax1x.com/2019/05/16/EbatsK.jpg\" width=\"300\"/>\n</center>\n-->\n\n## Miscellaneous\nA C++ implementation by [Nitin Shyamkumar](https://scholar.google.com/citations?user=lF0ZyBQAAAAJ&hl=en) is provided [here](https://github.com/nitinshyamk/yopo-inference)! Thank you Nitin for your work!\n\nThe mainbody of `experiments/CIFAR10-TRADES/baseline.res-pre18.TRADES.10step` is written according to \n[TRADES official repo](https://github.com/yaodongyu/TRADES)\n\nA tensorflow implementation provided by [Runtian Zhai](http://www.runtianz.cn/) is provided\n [here](https://colab.research.google.com/drive/1hglbkT4Tzf8BOkvX185jFmAND9M67zoZ#scrollTo=OMyffsWl1b4y).\nThe implemetation of the [\"For Free\"](https://arxiv.org/abs/1904.12843) paper is also included. It turns out that our \nYOPO is faster than \"For Free\" (detailed results will come soon). \nThanks for Runtian's help!\n\n\n## Cite\n```\n@article{zhang2019you,\n  title={You Only Propagate Once: Accelerating Adversarial Training via Maximal Principle},\n  author={Zhang, Dinghuai and Zhang, Tianyuan and Lu, Yiping and Zhu, Zhanxing and Dong, Bin},\n  journal={arXiv preprint arXiv:1905.00877},\n  year={2019}\n}\n```\n"
  },
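  {
    "path": "docs/yopo_sketch.py",
    "content": "'''\nIllustrative-only sketch of the YOPO-K-n update from the paper; this file is\nnot part of the original release, and the file name, function names, and toy\nmodel are hypothetical. It mirrors the layer_one / other_layers decomposition\nand the Hamiltonian H(x, p) = <f(x), p> used in experiments/*/loss.py and\ntraining_function.py, assuming inputs live in [0, 1].\n'''\nimport torch\nimport torch.nn as nn\nimport torch.nn.functional as F\n\n\ndef yopo_perturb(layer_one, other_layers, criterion, data, label,\n                 K=5, n=3, sigma=2 / 255.0, eps=8 / 255.0):\n    '''Build an adversarial example with K outer and n inner steps.'''\n    eta = torch.empty_like(data).uniform_(-eps, eps).requires_grad_()\n    for _ in range(K):\n        # One full forward/backward pass gives the co-state p = -dL/df(x).\n        feat = layer_one(torch.clamp(data + eta, 0.0, 1.0))\n        feat.retain_grad()\n        loss = criterion(other_layers(feat), label)\n        loss.backward()  # also accumulates weight gradients, as in the repo\n        p = -1.0 * feat.grad.detach()\n\n        # n cheap inner steps: only layer_one is re-evaluated and differentiated.\n        for _ in range(n):\n            H = torch.sum(layer_one(torch.clamp(data + eta, 0.0, 1.0)) * p)\n            eta_grad = torch.autograd.grad(H, eta, only_inputs=True)[0]\n            eta = eta - sigma * eta_grad.sign()  # descending on H ascends the loss\n            eta = torch.clamp(eta, -eps, eps)\n            eta = (torch.clamp(data + eta, 0.0, 1.0) - data).detach().requires_grad_()\n    return torch.clamp(data + eta, 0.0, 1.0).detach()\n\n\nclass ToyHead(nn.Module):\n    '''Stand-in for net.other_layers, just to make the sketch runnable.'''\n\n    def __init__(self):\n        super(ToyHead, self).__init__()\n        self.fc = nn.Linear(8, 10)\n\n    def forward(self, x):\n        y = F.adaptive_avg_pool2d(F.relu(x), 1)\n        return self.fc(y.view(y.size(0), -1))\n\n\nif __name__ == '__main__':\n    layer_one = nn.Conv2d(3, 8, kernel_size=3, padding=1)\n    data, label = torch.rand(4, 3, 32, 32), torch.randint(0, 10, (4,))\n    adv = yopo_perturb(layer_one, ToyHead(), nn.CrossEntropyLoss(), data, label)\n    print('max |eta| =', (adv - data).abs().max().item())\n"
  },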
  {
    "path": "experiments/CIFAR10/pre-res18.pgd10/config.py",
    "content": "from easydict import EasyDict\nimport sys\nimport os\nimport argparse\nimport numpy as np\nimport torch\n\ndef add_path(path):\n    if path not in sys.path:\n        print('Adding {}'.format(path))\n        sys.path.append(path)\n\nabs_current_path = os.path.realpath('./')\nroot_path = os.path.join('/', *abs_current_path.split(os.path.sep)[:-3])\nlib_dir = os.path.join(root_path, 'lib')\nadd_path(lib_dir)\n\nfrom training.config import TrainingConfigBase, SGDOptimizerMaker, \\\n    PieceWiseConstantLrSchedulerMaker, IPGDAttackMethodMaker\n\nclass TrainingConfing(TrainingConfigBase):\n\n    lib_dir = lib_dir\n\n    num_epochs = 105\n    val_interval = 5\n\n    create_optimizer = SGDOptimizerMaker(lr =5e-2, momentum = 0.9, weight_decay = 5e-4)\n    create_lr_scheduler = PieceWiseConstantLrSchedulerMaker(milestones = [75, 90, 100], gamma = 0.1)\n\n    create_loss_function = torch.nn.CrossEntropyLoss\n\n    create_attack_method = \\\n        IPGDAttackMethodMaker(eps = 8/255.0, sigma = 2/255.0, nb_iters = 10, norm = np.inf,\n                              mean = torch.tensor(np.array([0]).astype(np.float32)[np.newaxis, :, np.newaxis, np.newaxis]),\n                              std = torch.tensor(np.array([1]).astype(np.float32)[np.newaxis, :, np.newaxis, np.newaxis]))\n\n    create_evaluation_attack_method = \\\n        IPGDAttackMethodMaker(eps = 8/255.0, sigma = 2/255.0, nb_iters = 20, norm = np.inf,\n                              mean=torch.tensor(\n                                  np.array([0]).astype(np.float32)[np.newaxis, :, np.newaxis, np.newaxis]),\n                              std=torch.tensor(np.array([1]).astype(np.float32)[np.newaxis, :, np.newaxis, np.newaxis]))\n\n\nconfig = TrainingConfing()\n\n\n# About data\n# C.inp_chn = 1\n# C.num_class = 10\n\nparser = argparse.ArgumentParser()\n\nparser.add_argument('--resume', default=None, type=str, metavar='PATH',\n                 help='path to latest checkpoint (default: none)')\nparser.add_argument('-b', '--batch_size', default=256, type=int,\n                 metavar='N', help='mini-batch size')\nparser.add_argument('-d', type=int, default=0, help='Which gpu to use')\nparser.add_argument('-adv_coef', default=1.0, type = float,\n                    help = 'Specify the weight for adversarial loss')\nparser.add_argument('--auto-continue', default=False, action = 'store_true',\n                    help = 'Continue from the latest checkpoint')\nargs = parser.parse_args()\n\n\nif __name__ == '__main__':\n    pass\n"
  },
  {
    "path": "experiments/CIFAR10/pre-res18.pgd10/dataset.py",
    "content": "import torch\nimport torchvision\nimport torchvision.transforms as transforms\nimport numpy as np\ndef create_train_dataset(batch_size = 128, root = '../data'):\n\n    transform_train = transforms.Compose([\n     transforms.RandomCrop(32, padding=4),\n     transforms.RandomHorizontalFlip(),\n     transforms.ToTensor(),\n     #transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010)),\n    ])\n\n    trainset = torchvision.datasets.CIFAR10(root=root, train=True, download=True, transform=transform_train)\n    #trainset = torchvision.datasets.MNIST(root=root, train=True, download=True, transform=transform_train)\n    trainloader = torch.utils.data.DataLoader(trainset, batch_size=batch_size, shuffle=True, num_workers=2)\n\n    return trainloader\ndef create_test_dataset(batch_size = 128, root = '../data'):\n    transform_test = transforms.Compose([\n     transforms.ToTensor(),\n     #transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010)),\n    ])\n    testset = torchvision.datasets.CIFAR10(root=root, train=False, download=True, transform=transform_test)\n    #testset = torchvision.datasets.MNIST(root=root, train=False, download=True, transform=transform_test)\n    testloader = torch.utils.data.DataLoader(testset, batch_size=100, shuffle=False, num_workers=2)\n    return testloader\n\n\nif __name__ == '__main__':\n    print(create_train_dataset())\n    print(create_test_dataset())\n\n"
  },
  {
    "path": "experiments/CIFAR10/pre-res18.pgd10/eval.py",
    "content": "from config import config\nfrom dataset import create_test_dataset\nfrom network import create_network\n\nfrom training.train import eval_one_epoch\nfrom utils.misc import load_checkpoint\n\nimport argparse\nimport torch\nimport numpy as np\nimport os\n\nparser = argparse.ArgumentParser()\nparser.add_argument('--resume', '--resume', default='log/models/last.checkpoint',\n                    type=str, metavar='PATH',\n                    help='path to latest checkpoint (default:log/last.checkpoint)')\nparser.add_argument('-d', type=int, default=0, help='Which gpu to use')\nargs = parser.parse_args()\n\n\nDEVICE = torch.device('cuda:{}'.format(args.d))\ntorch.backends.cudnn.benchmark = True\n\nnet = create_network()\nnet.to(DEVICE)\n\nds_val = create_test_dataset(512)\n\nAttackMethod = config.create_evaluation_attack_method(DEVICE)\n\nif os.path.isfile(args.resume):\n    load_checkpoint(args.resume, net)\n\n\nprint('Evaluating')\nclean_acc, adv_acc = eval_one_epoch(net, ds_val, DEVICE, AttackMethod)\nprint('clean acc -- {}     adv acc -- {}'.format(clean_acc, adv_acc))\n"
  },
  {
    "path": "experiments/CIFAR10/pre-res18.pgd10/network.py",
    "content": "'''Pre-activation ResNet in PyTorch.\nReference:\n[1] Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun\n    Identity Mappings in Deep Residual Networks. arXiv:1603.05027\n'''\nimport torch\nimport torch.nn as nn\nimport torch.nn.functional as F\n\n\n\nclass PreActBlock(nn.Module):\n    '''Pre-activation version of the BasicBlock.'''\n    expansion = 1\n\n    def __init__(self, in_planes, planes, stride=1):\n        super(PreActBlock, self).__init__()\n        self.bn1 = nn.BatchNorm2d(in_planes)\n        self.conv1 = nn.Conv2d(in_planes, planes, kernel_size=3, stride=stride, padding=1, bias=False)\n        self.bn2 = nn.BatchNorm2d(planes)\n        self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=1, padding=1, bias=False)\n\n        if stride != 1 or in_planes != self.expansion * planes:\n            self.shortcut = nn.Sequential(\n                nn.Conv2d(in_planes, self.expansion * planes, kernel_size=1, stride=stride, bias=False)\n            )\n\n    def forward(self, x):\n        out = F.relu(self.bn1(x))\n        shortcut = self.shortcut(out) if hasattr(self, 'shortcut') else x\n        out = self.conv1(out)\n        out = self.conv2(F.relu(self.bn2(out)))\n        out += shortcut\n        return out\n\n\nclass PreActResNet(nn.Module):\n\n    def __init__(self, block, num_blocks, num_classes=10):\n        super(PreActResNet, self).__init__()\n        self.in_planes = 64\n\n        self.other_layers = nn.ModuleList()\n\n        self.conv1 = nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1, bias=False)\n\n        self.layer_one = self.conv1\n\n\n        self.other_layer1 = self._make_layer(block, 64, num_blocks[0], stride=1)\n        self.other_layer2 = self._make_layer(block, 128, num_blocks[1], stride=2)\n        self.other_layer3 = self._make_layer(block, 256, num_blocks[2], stride=2)\n        self.other_layer4 = self._make_layer(block, 512, num_blocks[3], stride=2)\n\n        self.linear = GlobalpoolFC(512 * block.expansion, num_classes)\n        self.other_layers.append(self.linear)\n\n    def _make_layer(self, block, planes, num_blocks, stride):\n        strides = [stride] + [1] * (num_blocks - 1)\n        layers = []\n        for stride in strides:\n            layers.append(block(self.in_planes, planes, stride))\n            self.other_layers.append(layers[-1])\n\n            self.in_planes = planes * block.expansion\n        return nn.Sequential(*layers)\n\n    def forward(self, x):\n\n        x = self.layer_one(x)\n        self.layer_one_out = x\n        self.layer_one_out.requires_grad_()\n        self.layer_one_out.retain_grad()\n        x = self.layer_one_out\n\n        for layer in self.other_layers:\n            x = layer(x)\n\n\n        '''\n        out = self.conv1(x)\n        out = self.layer1(out)\n        out = self.layer2(out)\n        out = self.layer3(out)\n        out = self.layer4(out)\n        out = F.avg_pool2d(out, 4)\n        out = out.view(out.size(0), -1)\n        out = self.linear(out)\n        return out\n        '''\n        return x\n\nclass GlobalpoolFC(nn.Module):\n\n    def __init__(self, num_in, num_class):\n        super(GlobalpoolFC, self).__init__()\n        self.pool = nn.AdaptiveAvgPool2d(output_size=1)\n        self.fc = nn.Linear(num_in, num_class)\n\n    def forward(self, x):\n        y = self.pool(x)\n        y = y.reshape(y.shape[0], -1)\n        y = self.fc(y)\n        return y\n\n\ndef PreActResNet18():\n    return PreActResNet(PreActBlock, [2, 2, 2, 2])\n\n\ndef PreActResNet34():\n    return 
PreActResNet(PreActBlock, [3, 4, 6, 3])\n\n\nclass PreActBottleneck(nn.Module):\n    '''Pre-activation version of the original Bottleneck module.'''\n    expansion = 4\n\n    def __init__(self, in_planes, planes, stride=1):\n        super(PreActBottleneck, self).__init__()\n        self.bn1 = nn.BatchNorm2d(in_planes)\n        self.conv1 = nn.Conv2d(in_planes, planes, kernel_size=1, bias=False)\n        self.bn2 = nn.BatchNorm2d(planes)\n        self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=stride, padding=1, bias=False)\n        self.bn3 = nn.BatchNorm2d(planes)\n        self.conv3 = nn.Conv2d(planes, self.expansion*planes, kernel_size=1, bias=False)\n\n        if stride != 1 or in_planes != self.expansion*planes:\n            self.shortcut = nn.Sequential(\n                nn.Conv2d(in_planes, self.expansion*planes, kernel_size=1, stride=stride, bias=False)\n            )\n\n    def forward(self, x):\n        out = F.relu(self.bn1(x))\n        shortcut = self.shortcut(out) if hasattr(self, 'shortcut') else x\n        out = self.conv1(out)\n        out = self.conv2(F.relu(self.bn2(out)))\n        out = self.conv3(F.relu(self.bn3(out)))\n        out += shortcut\n        return out\n\ndef create_network():\n    return PreActResNet18()\n\n\ndef test():\n    net = PreActResNet18()\n    y = net((torch.randn(1, 3, 32, 32)))\n    print(y.size())\n"
  },
  {
    "path": "experiments/CIFAR10/pre-res18.pgd10/train.py",
    "content": "from config import config, args\nfrom dataset import create_train_dataset, create_test_dataset\nfrom network import create_network\n\nfrom utils.misc import save_args, save_checkpoint, load_checkpoint\nfrom training.train import train_one_epoch, eval_one_epoch\n\nimport torch\nimport json\nimport time\nimport numpy as np\nfrom tensorboardX import SummaryWriter\nimport argparse\n\nimport os\nfrom collections import OrderedDict\n\nDEVICE = torch.device('cuda:{}'.format(args.d))\ntorch.backends.cudnn.benchmark = True\n\nnet = create_network()\nnet.to(DEVICE)\ncriterion = config.create_loss_function().to(DEVICE)\n\noptimizer = config.create_optimizer(net.parameters())\nlr_scheduler = config.create_lr_scheduler(optimizer)\n\nds_train = create_train_dataset(args.batch_size)\nds_val = create_test_dataset(args.batch_size)\n\nTrainAttack = config.create_attack_method(DEVICE)\nEvalAttack = config.create_evaluation_attack_method(DEVICE)\n\nnow_epoch = 0\n\nif args.auto_continue:\n    args.resume = os.path.join(config.model_dir, 'last.checkpoint')\nif args.resume is not None and os.path.isfile(args.resume):\n    now_epoch = load_checkpoint(args.resume, net, optimizer,lr_scheduler)\n\nwhile True:\n    if now_epoch > config.num_epochs:\n        break\n    now_epoch = now_epoch + 1\n\n    descrip_str = 'Training epoch:{}/{} -- lr:{}'.format(now_epoch, config.num_epochs,\n                                                                       lr_scheduler.get_lr()[0])\n    train_one_epoch(net, ds_train, optimizer, criterion, DEVICE,\n                    descrip_str, TrainAttack, adv_coef = args.adv_coef)\n    if config.val_interval > 0 and now_epoch % config.val_interval == 0:\n        eval_one_epoch(net, ds_val, DEVICE, EvalAttack)\n\n    lr_scheduler.step()\n\n    save_checkpoint(now_epoch, net, optimizer, lr_scheduler,\n                    file_name = os.path.join(config.model_dir, 'epoch-{}.checkpoint'.format(now_epoch)))\n"
  },
  {
    "path": "experiments/CIFAR10/pre-res18.yopo-5-3/config.py",
    "content": "from easydict import EasyDict\nimport sys\nimport os\nimport argparse\nimport numpy as np\nimport torch\nfrom loss import CrossEntropyWithWeightPenlty\n\ndef add_path(path):\n    if path not in sys.path:\n        print('Adding {}'.format(path))\n        sys.path.append(path)\n\nabs_current_path = os.path.realpath('./')\nroot_path = os.path.join('/', *abs_current_path.split(os.path.sep)[:-3])\nlib_dir = os.path.join(root_path, 'lib')\nadd_path(lib_dir)\n\nfrom training.config import TrainingConfigBase, SGDOptimizerMaker, \\\n    PieceWiseConstantLrSchedulerMaker, IPGDAttackMethodMaker\n\nclass TrainingConfing(TrainingConfigBase):\n\n    lib_dir = lib_dir\n\n    num_epochs = 36\n    val_interval = 2\n    weight_decay = 5e-4\n\n    inner_iters = 3\n    K = 5\n    sigma = 2 / 255.0\n    eps = 8 / 255.0\n\n    create_optimizer = SGDOptimizerMaker(lr =1e-1 * 2 / K, momentum = 0.9, weight_decay = 5e-4)\n    create_lr_scheduler = PieceWiseConstantLrSchedulerMaker(milestones = [30, 34, 36], gamma = 0.1)\n\n    create_loss_function = torch.nn.CrossEntropyLoss\n\n    #create_attack_method = \\\n    #    IPGDAttackMethodMaker(eps = 8/255.0, sigma = 2/255.0, nb_iters = 10, norm = np.inf,\n    #                          mean = torch.tensor(np.array([0]).astype(np.float32)[np.newaxis, :, np.newaxis, np.newaxis]),\n    #                          std = torch.tensor(np.array([1]).astype(np.float32)[np.newaxis, :, np.newaxis, np.newaxis]))\n    create_attack_method = None\n\n    create_evaluation_attack_method = \\\n        IPGDAttackMethodMaker(eps = 8/255.0, sigma = 2/255.0, nb_iters = 20, norm = np.inf,\n                              mean=torch.tensor(\n                                  np.array([0]).astype(np.float32)[np.newaxis, :, np.newaxis, np.newaxis]),\n                              std=torch.tensor(np.array([1]).astype(np.float32)[np.newaxis, :, np.newaxis, np.newaxis]))\n\n\nconfig = TrainingConfing()\n\n\n# About data\n# C.inp_chn = 1\n# C.num_class = 10\n\nparser = argparse.ArgumentParser()\n\nparser.add_argument('--resume', default=None, type=str, metavar='PATH',\n                 help='path to latest checkpoint (default: none)')\nparser.add_argument('-b', '--batch_size', default=256, type=int,\n                 metavar='N', help='mini-batch size')\nparser.add_argument('-d', type=int, default=0, help='Which gpu to use')\nparser.add_argument('-adv_coef', default=1.0, type = float,\n                    help = 'Specify the weight for adversarial loss')\nparser.add_argument('--auto-continue', default=False, action = 'store_true',\n                    help = 'Continue from the latest checkpoint')\nargs = parser.parse_args()\n\n\nif __name__ == '__main__':\n    pass\n"
  },
  {
    "path": "experiments/CIFAR10/pre-res18.yopo-5-3/dataset.py",
    "content": "import torch\nimport torchvision\nimport torchvision.transforms as transforms\nimport numpy as np\ndef create_train_dataset(batch_size = 128, root = '../data'):\n\n    transform_train = transforms.Compose([\n     transforms.RandomCrop(32, padding=4),\n     transforms.RandomHorizontalFlip(),\n     transforms.ToTensor(),\n     #transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010)),\n    ])\n\n    trainset = torchvision.datasets.CIFAR10(root=root, train=True, download=True, transform=transform_train)\n    #trainset = torchvision.datasets.MNIST(root=root, train=True, download=True, transform=transform_train)\n    trainloader = torch.utils.data.DataLoader(trainset, batch_size=batch_size, shuffle=True, num_workers=2)\n\n    return trainloader\ndef create_test_dataset(batch_size = 128, root = '../data'):\n    transform_test = transforms.Compose([\n     transforms.ToTensor(),\n     #transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010)),\n    ])\n    testset = torchvision.datasets.CIFAR10(root=root, train=False, download=True, transform=transform_test)\n    #testset = torchvision.datasets.MNIST(root=root, train=False, download=True, transform=transform_test)\n    testloader = torch.utils.data.DataLoader(testset, batch_size=100, shuffle=False, num_workers=2)\n    return testloader\n\n\nif __name__ == '__main__':\n    print(create_train_dataset())\n    print(create_test_dataset())\n\n"
  },
  {
    "path": "experiments/CIFAR10/pre-res18.yopo-5-3/eval.py",
    "content": "from config import config\nfrom dataset import create_test_dataset\nfrom network import create_network\n\nfrom training.train import eval_one_epoch\nfrom utils.misc import load_checkpoint\n\nimport argparse\nimport torch\nimport numpy as np\nimport os\n\nparser = argparse.ArgumentParser()\nparser.add_argument('--resume', '--resume', default='log/models/last.checkpoint',\n                    type=str, metavar='PATH',\n                    help='path to latest checkpoint (default:log/last.checkpoint)')\nparser.add_argument('-d', type=int, default=0, help='Which gpu to use')\nargs = parser.parse_args()\n\n\nDEVICE = torch.device('cuda:{}'.format(args.d))\ntorch.backends.cudnn.benchmark = True\n\nnet = create_network()\nnet.to(DEVICE)\n\nds_val = create_test_dataset(512)\n\nAttackMethod = config.create_evaluation_attack_method(DEVICE)\n\nif os.path.isfile(args.resume):\n    load_checkpoint(args.resume, net)\n\n\nprint('Evaluating')\nclean_acc, adv_acc = eval_one_epoch(net, ds_val, DEVICE, AttackMethod)\nprint('clean acc -- {}     adv acc -- {}'.format(clean_acc, adv_acc))\n"
  },
  {
    "path": "experiments/CIFAR10/pre-res18.yopo-5-3/loss.py",
    "content": "import torch\nimport torch.nn as nn\nfrom torch.nn.modules.loss import _Loss\nimport torch.nn.functional as F\n\n\nclass Hamiltonian(_Loss):\n\n    def __init__(self, layer, reg_cof = 1e-4):\n        super(Hamiltonian, self).__init__()\n        self.layer = layer\n        self.reg_cof = 0\n\n\n    def forward(self, x, p):\n\n        y = self.layer(x)\n        #l2 = cal_l2_norm(self.layer)\n\n        #print(y.shape, p.shape)\n        H = torch.sum(y * p)\n\n        #H = H - self.reg_cof * l2\n        return H\n\n\n\nclass CrossEntropyWithWeightPenlty(_Loss):\n    def __init__(self, module, DEVICE, reg_cof = 1e-4):\n        super(CrossEntropyWithWeightPenlty, self).__init__()\n\n        self.reg_cof = reg_cof\n        self.criterion = nn.CrossEntropyLoss().to(DEVICE)\n        self.module = module\n        #print(modules, 'dwadaQ!')\n\n    def __call__(self, pred, label):\n        cross_loss = self.criterion(pred, label)\n        weight_loss = 0\n        #for module in self.module:\n        #    print(module)\n        #    weight_loss = weight_loss + cal_l2_norm(module)\n\n        weight_loss = cal_l2_norm(self.module)\n\n        loss = cross_loss + self.reg_cof * weight_loss\n        return loss\n\ndef cal_l2_norm(layer: torch.nn.Module):\n loss = 0.\n for name, param in layer.named_parameters():\n     if name == 'weight':\n         loss = loss + 0.5 * torch.norm(param,) ** 2\n\n return loss\n\n"
  },
  {
    "path": "experiments/CIFAR10/pre-res18.yopo-5-3/network.py",
    "content": "'''Pre-activation ResNet in PyTorch.\nReference:\n[1] Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun\n    Identity Mappings in Deep Residual Networks. arXiv:1603.05027\n'''\nimport torch\nimport torch.nn as nn\nimport torch.nn.functional as F\n\n\n\nclass PreActBlock(nn.Module):\n    '''Pre-activation version of the BasicBlock.'''\n    expansion = 1\n\n    def __init__(self, in_planes, planes, stride=1):\n        super(PreActBlock, self).__init__()\n        self.bn1 = nn.BatchNorm2d(in_planes)\n        self.conv1 = nn.Conv2d(in_planes, planes, kernel_size=3, stride=stride, padding=1, bias=False)\n        self.bn2 = nn.BatchNorm2d(planes)\n        self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=1, padding=1, bias=False)\n\n        if stride != 1 or in_planes != self.expansion * planes:\n            self.shortcut = nn.Sequential(\n                nn.Conv2d(in_planes, self.expansion * planes, kernel_size=1, stride=stride, bias=False)\n            )\n\n    def forward(self, x):\n        out = F.relu(self.bn1(x))\n        shortcut = self.shortcut(out) if hasattr(self, 'shortcut') else x\n        out = self.conv1(out)\n        out = self.conv2(F.relu(self.bn2(out)))\n        out += shortcut\n        return out\n\n\nclass PreActResNet(nn.Module):\n\n    def __init__(self, block, num_blocks, num_classes=10):\n        super(PreActResNet, self).__init__()\n        self.in_planes = 64\n\n        self.other_layers = nn.ModuleList()\n\n        self.conv1 = nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1, bias=False)\n\n        self.layer_one = self.conv1\n\n\n        self.other_layer1 = self._make_layer(block, 64, num_blocks[0], stride=1)\n        self.other_layer2 = self._make_layer(block, 128, num_blocks[1], stride=2)\n        self.other_layer3 = self._make_layer(block, 256, num_blocks[2], stride=2)\n        self.other_layer4 = self._make_layer(block, 512, num_blocks[3], stride=2)\n\n        self.linear = GlobalpoolFC(512 * block.expansion, num_classes)\n        self.other_layers.append(self.linear)\n\n    def _make_layer(self, block, planes, num_blocks, stride):\n        strides = [stride] + [1] * (num_blocks - 1)\n        layers = []\n        for stride in strides:\n            layers.append(block(self.in_planes, planes, stride))\n            self.other_layers.append(layers[-1])\n\n            self.in_planes = planes * block.expansion\n        return nn.Sequential(*layers)\n\n    def forward(self, x):\n\n        x = self.layer_one(x)\n        self.layer_one_out = x\n        self.layer_one_out.requires_grad_()\n        self.layer_one_out.retain_grad()\n        x = self.layer_one_out\n\n        for layer in self.other_layers:\n            x = layer(x)\n\n        return x\n\n\nclass GlobalpoolFC(nn.Module):\n\n    def __init__(self, num_in, num_class):\n        super(GlobalpoolFC, self).__init__()\n        self.pool = nn.AdaptiveAvgPool2d(output_size=1)\n        self.fc = nn.Linear(num_in, num_class)\n\n    def forward(self, x):\n        y = self.pool(x)\n        y = y.reshape(y.shape[0], -1)\n        y = self.fc(y)\n        return y\n\n\ndef PreActResNet18():\n    return PreActResNet(PreActBlock, [2, 2, 2, 2])\n\n\ndef PreActResNet34():\n    return PreActResNet(PreActBlock, [3, 4, 6, 3])\n\n\nclass PreActBottleneck(nn.Module):\n    '''Pre-activation version of the original Bottleneck module.'''\n    expansion = 4\n\n    def __init__(self, in_planes, planes, stride=1):\n        super(PreActBottleneck, self).__init__()\n        self.bn1 = nn.BatchNorm2d(in_planes)\n  
      self.conv1 = nn.Conv2d(in_planes, planes, kernel_size=1, bias=False)\n        self.bn2 = nn.BatchNorm2d(planes)\n        self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=stride, padding=1, bias=False)\n        self.bn3 = nn.BatchNorm2d(planes)\n        self.conv3 = nn.Conv2d(planes, self.expansion*planes, kernel_size=1, bias=False)\n\n        if stride != 1 or in_planes != self.expansion*planes:\n            self.shortcut = nn.Sequential(\n                nn.Conv2d(in_planes, self.expansion*planes, kernel_size=1, stride=stride, bias=False)\n            )\n\n    def forward(self, x):\n        out = F.relu(self.bn1(x))\n        shortcut = self.shortcut(out) if hasattr(self, 'shortcut') else x\n        out = self.conv1(out)\n        out = self.conv2(F.relu(self.bn2(out)))\n        out = self.conv3(F.relu(self.bn3(out)))\n        out += shortcut\n        return out\n\ndef create_network():\n    return PreActResNet18()\n\n\ndef test():\n    net = PreActResNet18()\n    y = net((torch.randn(1, 3, 32, 32)))\n    print(y.size())\n"
  },
  {
    "path": "experiments/CIFAR10/pre-res18.yopo-5-3/train.py",
    "content": "from config import config, args\nfrom dataset import create_train_dataset, create_test_dataset\nfrom network import create_network\n\nfrom utils.misc import save_args, save_checkpoint, load_checkpoint\nfrom training.train import eval_one_epoch\nfrom loss import  Hamiltonian, CrossEntropyWithWeightPenlty\nfrom training_function import train_one_epoch, FastGradientLayerOneTrainer\n\nimport torch\nimport json\nimport numpy as np\nfrom tensorboardX import SummaryWriter\nimport argparse\n\nimport torch.nn as nn\nimport torch.optim as optim\nimport os\nfrom collections import OrderedDict\n\nDEVICE = torch.device('cuda:{}'.format(args.d))\ntorch.backends.cudnn.benchmark = True\n\nwriter = SummaryWriter(log_dir=config.log_dir)\n\nnet = create_network()\nnet.to(DEVICE)\ncriterion = config.create_loss_function().to(DEVICE)\n#criterion = CrossEntropyWithWeightPenlty(net.other_layers, DEVICE, config.weight_decay)#.to(DEVICE)\n#ce_criterion = nn.CrossEntropyLoss().to(DEVICE)\noptimizer = config.create_optimizer(net.other_layers.parameters())\nlr_scheduler = config.create_lr_scheduler(optimizer)\n\n\n## Make Layer One trainner  This part of code should be writen in config.py\n\nHamiltonian_func = Hamiltonian(net.layer_one, config.weight_decay)\nlayer_one_optimizer = optim.SGD(net.layer_one.parameters(), lr = lr_scheduler.get_lr()[0], momentum=0.9, weight_decay=5e-4)\nlyaer_one_optimizer_lr_scheduler = optim.lr_scheduler.MultiStepLR(layer_one_optimizer,\n                                                                  milestones = [30, 34, 36], gamma = 0.1)\nLayerOneTrainer = FastGradientLayerOneTrainer(Hamiltonian_func, layer_one_optimizer,\n                                              config.inner_iters, config.sigma, config.eps)\n\n\n\nds_train = create_train_dataset(args.batch_size)\nds_val = create_test_dataset(args.batch_size)\n\n#TrainAttack = config.create_attack_method(DEVICE)\nEvalAttack = config.create_evaluation_attack_method(DEVICE)\n\nnow_epoch = 0\n\nif args.auto_continue:\n    args.resume = os.path.join(config.model_dir, 'last.checkpoint')\nif args.resume is not None and os.path.isfile(args.resume):\n    now_epoch = load_checkpoint(args.resume, net, optimizer,lr_scheduler)\n\nwhile True:\n    if now_epoch > config.num_epochs:\n        break\n    now_epoch = now_epoch + 1\n\n    descrip_str = 'Training epoch:{}/{} -- lr:{}'.format(now_epoch, config.num_epochs,\n                                                                       lr_scheduler.get_lr()[0])\n    acc, yofoacc = train_one_epoch(net, ds_train, optimizer, criterion, LayerOneTrainer, config.K,\n                    DEVICE, descrip_str)\n    tb_train_dic = {'Acc':acc, 'YofoAcc':yofoacc}\n    print(tb_train_dic)\n    writer.add_scalars('Train', tb_train_dic, now_epoch)\n    if config.val_interval > 0 and now_epoch % config.val_interval == 0:\n        acc, advacc = eval_one_epoch(net, ds_val, DEVICE, EvalAttack)\n        tb_val_dic = {'Acc': acc, 'AdvAcc': advacc}\n        writer.add_scalars('Val', tb_val_dic, now_epoch)\n\n    lr_scheduler.step()\n    lyaer_one_optimizer_lr_scheduler.step()\n    save_checkpoint(now_epoch, net, optimizer, lr_scheduler,\n                    file_name = os.path.join(config.model_dir, 'epoch-{}.checkpoint'.format(now_epoch)))\n"
  },
  {
    "path": "experiments/CIFAR10/pre-res18.yopo-5-3/training_function.py",
    "content": "import torch\nimport torch.nn as nn\nfrom config import config\n\nfrom loss import Hamiltonian, cal_l2_norm\n\nfrom utils.misc import torch_accuracy, AvgMeter\nfrom collections import OrderedDict\nimport torch\nfrom tqdm import tqdm\n\n\nclass FastGradientLayerOneTrainer(object):\n\n    def __init__(self, Hamiltonian_func, param_optimizer,\n                    inner_steps=2, sigma = 0.008, eps = 0.03):\n        self.inner_steps = inner_steps\n        self.sigma = sigma\n        self.eps = eps\n        self.Hamiltonian_func = Hamiltonian_func\n        self.param_optimizer = param_optimizer\n\n    def step(self, inp, p, eta):\n        '''\n        Perform Iterative Sign Gradient on eta\n\n        ret: inp + eta\n        '''\n\n        p = p.detach()\n\n        for i in range(self.inner_steps):\n            tmp_inp = inp + eta\n            tmp_inp = torch.clamp(tmp_inp, 0, 1)\n            H = self.Hamiltonian_func(tmp_inp, p)\n\n            eta_grad_sign = torch.autograd.grad(H, eta, only_inputs=True, retain_graph=False)[0].sign()\n\n            eta = eta - eta_grad_sign * self.sigma\n\n            eta = torch.clamp(eta, -1.0 * self.eps, self.eps)\n            eta = torch.clamp(inp + eta, 0.0, 1.0) - inp\n            eta = eta.detach()\n            eta.requires_grad_()\n            eta.retain_grad()\n\n        #self.param_optimizer.zero_grad()\n\n        yofo_inp = eta + inp\n        yofo_inp = torch.clamp(yofo_inp, 0, 1)\n\n        loss = -1.0 * self.Hamiltonian_func(yofo_inp, p)\n\n        loss.backward()\n        #self.param_optimizer.step()\n        #self.param_optimizer.zero_grad()\n\n        return yofo_inp, eta\n\n\n\n\ndef train_one_epoch(net, batch_generator, optimizer,\n                    criterion, LayerOneTrainner, K,\n                    DEVICE=torch.device('cuda:0'),descrip_str='Training'):\n    '''\n\n    :param attack_freq:  Frequencies of training with adversarial examples. 
-1 indicates natural training\n    :param AttackMethod: the attack method, None represents natural training\n    :return:  None    #(clean_acc, adv_acc)\n    '''\n    net.train()\n    pbar = tqdm(batch_generator)\n    yofoacc = -1\n    cleanacc = -1\n    cleanloss = -1\n    pbar.set_description(descrip_str)\n    for i, (data, label) in enumerate(pbar):\n        data = data.to(DEVICE)\n        label = label.to(DEVICE)\n\n        eta = torch.FloatTensor(*data.shape).uniform_(-config.eps, config.eps)\n        eta = eta.to(label.device)\n        eta.requires_grad_()\n\n        optimizer.zero_grad()\n        LayerOneTrainner.param_optimizer.zero_grad()\n\n        for j in range(K):\n            #optimizer.zero_grad()\n\n            pbar_dic = OrderedDict()\n            TotalLoss = 0\n\n            pred = net(data + eta.detach())\n\n            loss = criterion(pred, label)\n            TotalLoss = TotalLoss + loss\n#             wgrad = net.conv1.weight.grad\n            #bgrad = net.conv1.bias.grad\n            TotalLoss.backward()\n#             net.conv1.weight.grad = wgrad\n            #net.conv1.bias.grad = bgrad\n            #param = next(net.parameters())\n            #grad_mean = torch.mean(param.grad)\n\n            #optimizer.step()\n            #optimizer.zero_grad()\n\n            p = -1.0 * net.layer_one_out.grad\n            yofo_inp, eta = LayerOneTrainner.step(data, p, eta)\n\n            with torch.no_grad():\n                if j == 0:\n                    acc = torch_accuracy(pred, label, (1,))\n                    cleanacc = acc[0].item()\n                    cleanloss = loss.item()\n\n                if j == K - 1:\n                    yofo_pred = net(yofo_inp)\n                    yofoacc = torch_accuracy(yofo_pred, label, (1,))[0].item()\n            #pbar_dic['grad'] = '{}'.format(grad_mean)\n\n        optimizer.step()\n        LayerOneTrainner.param_optimizer.step()\n        optimizer.zero_grad()\n        LayerOneTrainner.param_optimizer.zero_grad()\n        pbar_dic['Acc'] = '{:.2f}'.format(cleanacc)\n        pbar_dic['loss'] = '{:.2f}'.format(cleanloss)\n        pbar_dic['YofoAcc'] = '{:.2f}'.format(yofoacc)\n        pbar.set_postfix(pbar_dic)\n\n    return cleanacc, yofoacc\n\n"
  },
  {
    "path": "experiments/CIFAR10/wide34.natural/config.py",
    "content": "from easydict import EasyDict\nimport sys\nimport os\nimport argparse\nimport numpy as np\nimport torch\n\ndef add_path(path):\n    if path not in sys.path:\n        print('Adding {}'.format(path))\n        sys.path.append(path)\n\nabs_current_path = os.path.realpath('./')\nroot_path = os.path.join('/', *abs_current_path.split(os.path.sep)[:-3])\nlib_dir = os.path.join(root_path, 'lib')\nadd_path(lib_dir)\n\nfrom training.config import TrainingConfigBase, SGDOptimizerMaker, \\\n    PieceWiseConstantLrSchedulerMaker, IPGDAttackMethodMaker\n\nclass TrainingConfing(TrainingConfigBase):\n\n    lib_dir = lib_dir\n\n    num_epochs = 105\n    val_interval = 10\n\n    create_optimizer = SGDOptimizerMaker(lr =1e-1, momentum = 0.9, weight_decay = 2e-4)\n    create_lr_scheduler = PieceWiseConstantLrSchedulerMaker(milestones = [75, 90, 100], gamma = 0.1)\n\n    create_loss_function = torch.nn.CrossEntropyLoss\n\n    #create_attack_method = \\\n    create_evaluation_attack_method = \\\n        IPGDAttackMethodMaker(eps = 8/255.0, sigma = 2/255.0, nb_iters = 20, norm = np.inf,\n                              mean=torch.tensor(\n                                  np.array([0]).astype(np.float32)[np.newaxis, :, np.newaxis, np.newaxis]),\n                              std=torch.tensor(np.array([1]).astype(np.float32)[np.newaxis, :, np.newaxis, np.newaxis]))\n\n\nconfig = TrainingConfing()\n\n\n# About data\n# C.inp_chn = 1\n# C.num_class = 10\n\nparser = argparse.ArgumentParser()\n\nparser.add_argument('--resume', default=None, type=str, metavar='PATH',\n                 help='path to latest checkpoint (default: none)')\nparser.add_argument('-b', '--batch_size', default=256, type=int,\n                 metavar='N', help='mini-batch size')\nparser.add_argument('-d', type=int, default=0, help='Which gpu to use')\nparser.add_argument('-adv_coef', default=1.0, type = float,\n                    help = 'Specify the weight for adversarial loss')\nparser.add_argument('--auto-continue', default=False, action = 'store_true',\n                    help = 'Continue from the latest checkpoint')\nargs = parser.parse_args()\n\n\nif __name__ == '__main__':\n    pass\n"
  },
  {
    "path": "experiments/CIFAR10/wide34.natural/dataset.py",
    "content": "import torch\nimport torchvision\nimport torchvision.transforms as transforms\nimport numpy as np\ndef create_train_dataset(batch_size = 128, root = '../data'):\n\n    transform_train = transforms.Compose([\n     transforms.RandomCrop(32, padding=4),\n     transforms.RandomHorizontalFlip(),\n     transforms.ToTensor(),\n     #transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010)),\n    ])\n\n    trainset = torchvision.datasets.CIFAR10(root=root, train=True, download=True, transform=transform_train)\n    #trainset = torchvision.datasets.MNIST(root=root, train=True, download=True, transform=transform_train)\n    trainloader = torch.utils.data.DataLoader(trainset, batch_size=batch_size, shuffle=True, num_workers=2)\n\n    return trainloader\ndef create_test_dataset(batch_size = 128, root = '../data'):\n    transform_test = transforms.Compose([\n     transforms.ToTensor(),\n     #transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010)),\n    ])\n    testset = torchvision.datasets.CIFAR10(root=root, train=False, download=True, transform=transform_test)\n    #testset = torchvision.datasets.MNIST(root=root, train=False, download=True, transform=transform_test)\n    testloader = torch.utils.data.DataLoader(testset, batch_size=100, shuffle=False, num_workers=2)\n    return testloader\n\n\nif __name__ == '__main__':\n    print(create_train_dataset())\n    print(create_test_dataset())\n\n"
  },
  {
    "path": "experiments/CIFAR10/wide34.natural/eval.py",
    "content": "from config import config\nfrom dataset import create_test_dataset\nfrom network import create_network\n\nfrom training.train import eval_one_epoch\nfrom utils.misc import load_checkpoint\n\nimport argparse\nimport torch\nimport numpy as np\nimport os\n\nparser = argparse.ArgumentParser()\nparser.add_argument('--resume', '--resume', default='log/models/last.checkpoint',\n                    type=str, metavar='PATH',\n                    help='path to latest checkpoint (default:log/last.checkpoint)')\nparser.add_argument('-d', type=int, default=0, help='Which gpu to use')\nargs = parser.parse_args()\n\n\nDEVICE = torch.device('cuda:{}'.format(args.d))\ntorch.backends.cudnn.benchmark = True\n\nnet = create_network()\nnet.to(DEVICE)\n\nds_val = create_test_dataset(512)\n\nAttackMethod = config.create_evaluation_attack_method(DEVICE)\n\nif os.path.isfile(args.resume):\n    load_checkpoint(args.resume, net)\n\n\nprint('Evaluating')\nclean_acc, adv_acc = eval_one_epoch(net, ds_val, DEVICE, AttackMethod)\nprint('clean acc -- {}     adv acc -- {}'.format(clean_acc, adv_acc))\n"
  },
  {
    "path": "experiments/CIFAR10/wide34.natural/network.py",
    "content": "import config\nfrom base_model.wide_resnet import WideResNet\n\ndef create_network():\n    return WideResNet(34)\n\n\ndef test():\n    net = create_network()\n    y = net((torch.randn(1, 3, 32, 32)))\n    print(y.size())\n"
  },
  {
    "path": "experiments/CIFAR10/wide34.natural/train.py",
    "content": "from config import config, args\nfrom dataset import create_train_dataset, create_test_dataset\nfrom network import create_network\n\nfrom utils.misc import save_args, save_checkpoint, load_checkpoint\nfrom training.train import train_one_epoch, eval_one_epoch\n\nimport torch\nimport json\nimport time\nimport numpy as np\nfrom tensorboardX import SummaryWriter\nimport argparse\n\nimport os\nfrom collections import OrderedDict\n\nDEVICE = torch.device('cuda:{}'.format(args.d))\ntorch.backends.cudnn.benchmark = True\n\nnet = create_network()\nnet.to(DEVICE)\ncriterion = config.create_loss_function().to(DEVICE)\n\noptimizer = config.create_optimizer(net.parameters())\nlr_scheduler = config.create_lr_scheduler(optimizer)\n\nds_train = create_train_dataset(args.batch_size)\nds_val = create_test_dataset(args.batch_size)\n\nTrainAttack = config.create_attack_method(DEVICE)\nEvalAttack = config.create_evaluation_attack_method(DEVICE)\n\nnow_epoch = 0\n\nif args.auto_continue:\n    args.resume = os.path.join(config.model_dir, 'last.checkpoint')\nif args.resume is not None and os.path.isfile(args.resume):\n    now_epoch = load_checkpoint(args.resume, net, optimizer,lr_scheduler)\n\nwhile True:\n    if now_epoch > config.num_epochs:\n        break\n    now_epoch = now_epoch + 1\n\n    descrip_str = 'Training epoch:{}/{} -- lr:{}'.format(now_epoch, config.num_epochs,\n                                                                       lr_scheduler.get_lr()[0])\n    train_one_epoch(net, ds_train, optimizer, criterion, DEVICE,\n                    descrip_str, TrainAttack, adv_coef = args.adv_coef)\n    if config.val_interval > 0 and now_epoch % config.val_interval == 0:\n        eval_one_epoch(net, ds_val, DEVICE, EvalAttack)\n\n    lr_scheduler.step()\n\n    save_checkpoint(now_epoch, net, optimizer, lr_scheduler,\n                    file_name = os.path.join(config.model_dir, 'epoch-{}.checkpoint'.format(now_epoch)))\n"
  },
  {
    "path": "experiments/CIFAR10/wide34.pgd10/config.py",
    "content": "from easydict import EasyDict\nimport sys\nimport os\nimport argparse\nimport numpy as np\nimport torch\n\ndef add_path(path):\n    if path not in sys.path:\n        print('Adding {}'.format(path))\n        sys.path.append(path)\n\nabs_current_path = os.path.realpath('./')\nroot_path = os.path.join('/', *abs_current_path.split(os.path.sep)[:-3])\nlib_dir = os.path.join(root_path, 'lib')\nadd_path(lib_dir)\n\nfrom training.config import TrainingConfigBase, SGDOptimizerMaker, \\\n    PieceWiseConstantLrSchedulerMaker, IPGDAttackMethodMaker\n\nclass TrainingConfing(TrainingConfigBase):\n\n    lib_dir = lib_dir\n\n    num_epochs = 105\n    val_interval = 10\n\n    create_optimizer = SGDOptimizerMaker(lr =1e-1, momentum = 0.9, weight_decay = 2e-4)\n    create_lr_scheduler = PieceWiseConstantLrSchedulerMaker(milestones = [75, 90, 100], gamma = 0.1)\n\n    create_loss_function = torch.nn.CrossEntropyLoss\n\n    create_attack_method = \\\n        IPGDAttackMethodMaker(eps = 8/255.0, sigma = 2/255.0, nb_iters = 10, norm = np.inf,\n                              mean = torch.tensor(np.array([0]).astype(np.float32)[np.newaxis, :, np.newaxis, np.newaxis]),\n                              std = torch.tensor(np.array([1]).astype(np.float32)[np.newaxis, :, np.newaxis, np.newaxis]))\n\n    create_evaluation_attack_method = \\\n        IPGDAttackMethodMaker(eps = 8/255.0, sigma = 2/255.0, nb_iters = 20, norm = np.inf,\n                              mean=torch.tensor(\n                                  np.array([0]).astype(np.float32)[np.newaxis, :, np.newaxis, np.newaxis]),\n                              std=torch.tensor(np.array([1]).astype(np.float32)[np.newaxis, :, np.newaxis, np.newaxis]))\n\n\nconfig = TrainingConfing()\n\n\n# About data\n# C.inp_chn = 1\n# C.num_class = 10\n\nparser = argparse.ArgumentParser()\n\nparser.add_argument('--resume', default=None, type=str, metavar='PATH',\n                 help='path to latest checkpoint (default: none)')\nparser.add_argument('-b', '--batch_size', default=256, type=int,\n                 metavar='N', help='mini-batch size')\nparser.add_argument('-d', type=int, default=0, help='Which gpu to use')\nparser.add_argument('-adv_coef', default=1.0, type = float,\n                    help = 'Specify the weight for adversarial loss')\nparser.add_argument('--auto-continue', default=False, action = 'store_true',\n                    help = 'Continue from the latest checkpoint')\nargs = parser.parse_args()\n\n\nif __name__ == '__main__':\n    pass\n"
  },
  {
    "path": "experiments/CIFAR10/wide34.pgd10/dataset.py",
    "content": "import torch\nimport torchvision\nimport torchvision.transforms as transforms\nimport numpy as np\ndef create_train_dataset(batch_size = 128, root = '../data'):\n\n    transform_train = transforms.Compose([\n     transforms.RandomCrop(32, padding=4),\n     transforms.RandomHorizontalFlip(),\n     transforms.ToTensor(),\n     #transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010)),\n    ])\n\n    trainset = torchvision.datasets.CIFAR10(root=root, train=True, download=True, transform=transform_train)\n    #trainset = torchvision.datasets.MNIST(root=root, train=True, download=True, transform=transform_train)\n    trainloader = torch.utils.data.DataLoader(trainset, batch_size=batch_size, shuffle=True, num_workers=2)\n\n    return trainloader\ndef create_test_dataset(batch_size = 128, root = '../data'):\n    transform_test = transforms.Compose([\n     transforms.ToTensor(),\n     #transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010)),\n    ])\n    testset = torchvision.datasets.CIFAR10(root=root, train=False, download=True, transform=transform_test)\n    #testset = torchvision.datasets.MNIST(root=root, train=False, download=True, transform=transform_test)\n    testloader = torch.utils.data.DataLoader(testset, batch_size=100, shuffle=False, num_workers=2)\n    return testloader\n\n\nif __name__ == '__main__':\n    print(create_train_dataset())\n    print(create_test_dataset())\n\n"
  },
  {
    "path": "experiments/CIFAR10/wide34.pgd10/eval.py",
    "content": "from config import config\nfrom dataset import create_test_dataset\nfrom network import create_network\n\nfrom training.train import eval_one_epoch\nfrom utils.misc import load_checkpoint\n\nimport argparse\nimport torch\nimport numpy as np\nimport os\n\nparser = argparse.ArgumentParser()\nparser.add_argument('--resume', '--resume', default='log/models/last.checkpoint',\n                    type=str, metavar='PATH',\n                    help='path to latest checkpoint (default:log/last.checkpoint)')\nparser.add_argument('-d', type=int, default=0, help='Which gpu to use')\nargs = parser.parse_args()\n\n\nDEVICE = torch.device('cuda:{}'.format(args.d))\ntorch.backends.cudnn.benchmark = True\n\nnet = create_network()\nnet.to(DEVICE)\n\nds_val = create_test_dataset(512)\n\nAttackMethod = config.create_evaluation_attack_method(DEVICE)\n\nif os.path.isfile(args.resume):\n    load_checkpoint(args.resume, net)\n\n\nprint('Evaluating')\nclean_acc, adv_acc = eval_one_epoch(net, ds_val, DEVICE, AttackMethod)\nprint('clean acc -- {}     adv acc -- {}'.format(clean_acc, adv_acc))\n"
  },
  {
    "path": "experiments/CIFAR10/wide34.pgd10/network.py",
    "content": "import config\nfrom base_model.wide_resnet import WideResNet\n\ndef create_network():\n    return WideResNet(34)\n\n\ndef test():\n    net = create_network()\n    y = net((torch.randn(1, 3, 32, 32)))\n    print(y.size())\n"
  },
  {
    "path": "experiments/CIFAR10/wide34.pgd10/train.py",
    "content": "from config import config, args\nfrom dataset import create_train_dataset, create_test_dataset\nfrom network import create_network\n\nfrom utils.misc import save_args, save_checkpoint, load_checkpoint\nfrom training.train import train_one_epoch, eval_one_epoch\n\nimport torch\nimport json\nimport time\nimport numpy as np\nfrom tensorboardX import SummaryWriter\nimport argparse\n\nimport os\nfrom collections import OrderedDict\n\nDEVICE = torch.device('cuda:{}'.format(args.d))\ntorch.backends.cudnn.benchmark = True\n\nnet = create_network()\nnet.to(DEVICE)\ncriterion = config.create_loss_function().to(DEVICE)\n\noptimizer = config.create_optimizer(net.parameters())\nlr_scheduler = config.create_lr_scheduler(optimizer)\n\nds_train = create_train_dataset(args.batch_size)\nds_val = create_test_dataset(args.batch_size)\n\nTrainAttack = config.create_attack_method(DEVICE)\nEvalAttack = config.create_evaluation_attack_method(DEVICE)\n\nnow_epoch = 0\n\nif args.auto_continue:\n    args.resume = os.path.join(config.model_dir, 'last.checkpoint')\nif args.resume is not None and os.path.isfile(args.resume):\n    now_epoch = load_checkpoint(args.resume, net, optimizer,lr_scheduler)\n\nwhile True:\n    if now_epoch > config.num_epochs:\n        break\n    now_epoch = now_epoch + 1\n\n    descrip_str = 'Training epoch:{}/{} -- lr:{}'.format(now_epoch, config.num_epochs,\n                                                                       lr_scheduler.get_lr()[0])\n    train_one_epoch(net, ds_train, optimizer, criterion, DEVICE,\n                    descrip_str, TrainAttack, adv_coef = args.adv_coef)\n    if config.val_interval > 0 and now_epoch % config.val_interval == 0:\n        eval_one_epoch(net, ds_val, DEVICE, EvalAttack)\n\n    lr_scheduler.step()\n\n    save_checkpoint(now_epoch, net, optimizer, lr_scheduler,\n                    file_name = os.path.join(config.model_dir, 'epoch-{}.checkpoint'.format(now_epoch)))\n"
  },
  {
    "path": "experiments/CIFAR10/wide34.yopo-5-3/config.py",
    "content": "from easydict import EasyDict\nimport sys\nimport os\nimport argparse\nimport numpy as np\nimport torch\nfrom loss import CrossEntropyWithWeightPenlty\n\ndef add_path(path):\n    if path not in sys.path:\n        print('Adding {}'.format(path))\n        sys.path.append(path)\n\nabs_current_path = os.path.realpath('./')\nroot_path = os.path.join('/', *abs_current_path.split(os.path.sep)[:-3])\nlib_dir = os.path.join(root_path, 'lib')\nadd_path(lib_dir)\n\nfrom training.config import TrainingConfigBase, SGDOptimizerMaker, \\\n    PieceWiseConstantLrSchedulerMaker, IPGDAttackMethodMaker\n\nclass TrainingConfing(TrainingConfigBase):\n\n    lib_dir = lib_dir\n\n    num_epochs = 36\n    val_interval = 2\n    weight_decay = 5e-4\n\n    inner_iters = 3\n    K = 5\n    sigma = 2 / 255.0\n    eps = 8 / 255.0\n\n    create_optimizer = SGDOptimizerMaker(lr =1e-1 * 4 / K, momentum = 0.9, weight_decay = 5e-4)\n    create_lr_scheduler = PieceWiseConstantLrSchedulerMaker(milestones = [30, 34, 36], gamma = 0.1)\n\n    create_loss_function = torch.nn.CrossEntropyLoss\n\n    #create_attack_method = \\\n    #    IPGDAttackMethodMaker(eps = 8/255.0, sigma = 2/255.0, nb_iters = 10, norm = np.inf,\n    #                          mean = torch.tensor(np.array([0]).astype(np.float32)[np.newaxis, :, np.newaxis, np.newaxis]),\n    #                          std = torch.tensor(np.array([1]).astype(np.float32)[np.newaxis, :, np.newaxis, np.newaxis]))\n    create_attack_method = None\n\n    create_evaluation_attack_method = \\\n        IPGDAttackMethodMaker(eps = 8/255.0, sigma = 2/255.0, nb_iters = 20, norm = np.inf,\n                              mean=torch.tensor(\n                                  np.array([0]).astype(np.float32)[np.newaxis, :, np.newaxis, np.newaxis]),\n                              std=torch.tensor(np.array([1]).astype(np.float32)[np.newaxis, :, np.newaxis, np.newaxis]))\n\n\nconfig = TrainingConfing()\n\n\n# About data\n# C.inp_chn = 1\n# C.num_class = 10\n\nparser = argparse.ArgumentParser()\n\nparser.add_argument('--resume', default=None, type=str, metavar='PATH',\n                 help='path to latest checkpoint (default: none)')\nparser.add_argument('-b', '--batch_size', default=256, type=int,\n                 metavar='N', help='mini-batch size')\nparser.add_argument('-d', type=int, default=0, help='Which gpu to use')\nparser.add_argument('-adv_coef', default=1.0, type = float,\n                    help = 'Specify the weight for adversarial loss')\nparser.add_argument('--auto-continue', default=False, action = 'store_true',\n                    help = 'Continue from the latest checkpoint')\nargs = parser.parse_args()\n\n\nif __name__ == '__main__':\n    pass\n"
  },
  {
    "path": "experiments/CIFAR10/wide34.yopo-5-3/dataset.py",
    "content": "import torch\nimport torchvision\nimport torchvision.transforms as transforms\nimport numpy as np\ndef create_train_dataset(batch_size = 128, root = '../data'):\n\n    transform_train = transforms.Compose([\n     transforms.RandomCrop(32, padding=4),\n     transforms.RandomHorizontalFlip(),\n     transforms.ToTensor(),\n     #transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010)),\n    ])\n\n    trainset = torchvision.datasets.CIFAR10(root=root, train=True, download=True, transform=transform_train)\n    #trainset = torchvision.datasets.MNIST(root=root, train=True, download=True, transform=transform_train)\n    trainloader = torch.utils.data.DataLoader(trainset, batch_size=batch_size, shuffle=True, num_workers=2)\n\n    return trainloader\ndef create_test_dataset(batch_size = 128, root = '../data'):\n    transform_test = transforms.Compose([\n     transforms.ToTensor(),\n     #transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010)),\n    ])\n    testset = torchvision.datasets.CIFAR10(root=root, train=False, download=True, transform=transform_test)\n    #testset = torchvision.datasets.MNIST(root=root, train=False, download=True, transform=transform_test)\n    testloader = torch.utils.data.DataLoader(testset, batch_size=100, shuffle=False, num_workers=2)\n    return testloader\n\n\nif __name__ == '__main__':\n    print(create_train_dataset())\n    print(create_test_dataset())\n\n"
  },
  {
    "path": "experiments/CIFAR10/wide34.yopo-5-3/eval.py",
    "content": "from config import config\nfrom dataset import create_test_dataset\nfrom network import create_network\n\nfrom training.train import eval_one_epoch\nfrom utils.misc import load_checkpoint\n\nimport argparse\nimport torch\nimport numpy as np\nimport os\n\nparser = argparse.ArgumentParser()\nparser.add_argument('--resume', '--resume', default='log/models/last.checkpoint',\n                    type=str, metavar='PATH',\n                    help='path to latest checkpoint (default:log/last.checkpoint)')\nparser.add_argument('-d', type=int, default=0, help='Which gpu to use')\nargs = parser.parse_args()\n\n\nDEVICE = torch.device('cuda:{}'.format(args.d))\ntorch.backends.cudnn.benchmark = True\n\nnet = create_network()\nnet.to(DEVICE)\n\nds_val = create_test_dataset(512)\n\nAttackMethod = config.create_evaluation_attack_method(DEVICE)\n\nif os.path.isfile(args.resume):\n    load_checkpoint(args.resume, net)\n\n\nprint('Evaluating')\nclean_acc, adv_acc = eval_one_epoch(net, ds_val, DEVICE, AttackMethod)\nprint('clean acc -- {}     adv acc -- {}'.format(clean_acc, adv_acc))\n"
  },
  {
    "path": "experiments/CIFAR10/wide34.yopo-5-3/loss.py",
    "content": "import torch\nimport torch.nn as nn\nfrom torch.nn.modules.loss import _Loss\nimport torch.nn.functional as F\n\n\nclass Hamiltonian(_Loss):\n\n    def __init__(self, layer, reg_cof = 1e-4):\n        super(Hamiltonian, self).__init__()\n        self.layer = layer\n        self.reg_cof = 0\n\n\n    def forward(self, x, p):\n\n        y = self.layer(x)\n        #l2 = cal_l2_norm(self.layer)\n\n        #print(y.shape, p.shape)\n        H = torch.sum(y * p)\n\n        #H = H - self.reg_cof * l2\n        return H\n\n\n\nclass CrossEntropyWithWeightPenlty(_Loss):\n    def __init__(self, module, DEVICE, reg_cof = 1e-4):\n        super(CrossEntropyWithWeightPenlty, self).__init__()\n\n        self.reg_cof = reg_cof\n        self.criterion = nn.CrossEntropyLoss().to(DEVICE)\n        self.module = module\n        #print(modules, 'dwadaQ!')\n\n    def __call__(self, pred, label):\n        cross_loss = self.criterion(pred, label)\n        weight_loss = 0\n        #for module in self.module:\n        #    print(module)\n        #    weight_loss = weight_loss + cal_l2_norm(module)\n\n        weight_loss = cal_l2_norm(self.module)\n\n        loss = cross_loss + self.reg_cof * weight_loss\n        return loss\n\ndef cal_l2_norm(layer: torch.nn.Module):\n loss = 0.\n for name, param in layer.named_parameters():\n     if name == 'weight':\n         loss = loss + 0.5 * torch.norm(param,) ** 2\n\n return loss\n\n"
  },
  {
    "path": "experiments/CIFAR10/wide34.yopo-5-3/network.py",
    "content": "import config\nfrom base_model.wide_resnet import WideResNet\n\ndef create_network():\n    return WideResNet(34)\n\n\ndef test():\n    net = create_network()\n    y = net((torch.randn(1, 3, 32, 32)))\n    print(y.size())\n"
  },
  {
    "path": "experiments/CIFAR10/wide34.yopo-5-3/train.py",
    "content": "from config import config, args\nfrom dataset import create_train_dataset, create_test_dataset\nfrom network import create_network\n\nfrom utils.misc import save_args, save_checkpoint, load_checkpoint\nfrom training.train import eval_one_epoch\nfrom loss import  Hamiltonian, CrossEntropyWithWeightPenlty\nfrom training_function import train_one_epoch, FastGradientLayerOneTrainer\n\nimport torch\nimport json\nimport numpy as np\nfrom tensorboardX import SummaryWriter\nimport argparse\n\nimport torch.nn as nn\nimport torch.optim as optim\nimport os\nfrom collections import OrderedDict\n\nDEVICE = torch.device('cuda:{}'.format(args.d))\ntorch.backends.cudnn.benchmark = True\n\nwriter = SummaryWriter(log_dir=config.log_dir)\n\nnet = create_network()\nnet.to(DEVICE)\ncriterion = config.create_loss_function().to(DEVICE)\n#criterion = CrossEntropyWithWeightPenlty(net.other_layers, DEVICE, config.weight_decay)#.to(DEVICE)\n#ce_criterion = nn.CrossEntropyLoss().to(DEVICE)\noptimizer = config.create_optimizer(net.other_layers.parameters())\nlr_scheduler = config.create_lr_scheduler(optimizer)\n\n\n## Make Layer One trainner  This part of code should be writen in config.py\n\nHamiltonian_func = Hamiltonian(net.layer_one, config.weight_decay)\nlayer_one_optimizer = optim.SGD(net.layer_one.parameters(), lr = lr_scheduler.get_lr()[0], momentum=0.9, weight_decay=5e-4)\nlyaer_one_optimizer_lr_scheduler = optim.lr_scheduler.MultiStepLR(layer_one_optimizer,\n                                                                  milestones = [30, 34, 36], gamma = 0.1)\nLayerOneTrainer = FastGradientLayerOneTrainer(Hamiltonian_func, layer_one_optimizer,\n                                              config.inner_iters, config.sigma, config.eps)\n\n\n\nds_train = create_train_dataset(args.batch_size)\nds_val = create_test_dataset(args.batch_size)\n\n#TrainAttack = config.create_attack_method(DEVICE)\nEvalAttack = config.create_evaluation_attack_method(DEVICE)\n\nnow_epoch = 0\n\nif args.auto_continue:\n    args.resume = os.path.join(config.model_dir, 'last.checkpoint')\nif args.resume is not None and os.path.isfile(args.resume):\n    now_epoch = load_checkpoint(args.resume, net, optimizer,lr_scheduler)\n\nwhile True:\n    if now_epoch > config.num_epochs:\n        break\n    now_epoch = now_epoch + 1\n\n    descrip_str = 'Training epoch:{}/{} -- lr:{}'.format(now_epoch, config.num_epochs,\n                                                                       lr_scheduler.get_lr()[0])\n    acc, yofoacc = train_one_epoch(net, ds_train, optimizer, criterion, LayerOneTrainer, config.K,\n                    DEVICE, descrip_str)\n    tb_train_dic = {'Acc':acc, 'YofoAcc':yofoacc}\n    print(tb_train_dic)\n    writer.add_scalars('Train', tb_train_dic, now_epoch)\n    if config.val_interval > 0 and now_epoch % config.val_interval == 0:\n        acc, advacc = eval_one_epoch(net, ds_val, DEVICE, EvalAttack)\n        tb_val_dic = {'Acc': acc, 'AdvAcc': advacc}\n        writer.add_scalars('Val', tb_val_dic, now_epoch)\n\n    lr_scheduler.step()\n    lyaer_one_optimizer_lr_scheduler.step()\n    save_checkpoint(now_epoch, net, optimizer, lr_scheduler,\n                    file_name = os.path.join(config.model_dir, 'epoch-{}.checkpoint'.format(now_epoch)))\n"
  },
  {
    "path": "experiments/CIFAR10/wide34.yopo-5-3/training_function.py",
    "content": "import torch\nimport torch.nn as nn\nfrom config import config\n\nfrom loss import Hamiltonian, cal_l2_norm\n\nfrom utils.misc import torch_accuracy, AvgMeter\nfrom collections import OrderedDict\nimport torch\nfrom tqdm import tqdm\n\n\nclass FastGradientLayerOneTrainer(object):\n\n    def __init__(self, Hamiltonian_func, param_optimizer,\n                    inner_steps=2, sigma = 0.008, eps = 0.03):\n        self.inner_steps = inner_steps\n        self.sigma = sigma\n        self.eps = eps\n        self.Hamiltonian_func = Hamiltonian_func\n        self.param_optimizer = param_optimizer\n\n    def step(self, inp, p, eta):\n        '''\n        Perform Iterative Sign Gradient on eta\n\n        ret: inp + eta\n        '''\n\n        p = p.detach()\n\n        for i in range(self.inner_steps):\n            tmp_inp = inp + eta\n            tmp_inp = torch.clamp(tmp_inp, 0, 1)\n            H = self.Hamiltonian_func(tmp_inp, p)\n\n            eta_grad_sign = torch.autograd.grad(H, eta, only_inputs=True, retain_graph=False)[0].sign()\n\n            eta = eta - eta_grad_sign * self.sigma\n\n            eta = torch.clamp(eta, -1.0 * self.eps, self.eps)\n            eta = torch.clamp(inp + eta, 0.0, 1.0) - inp\n            eta = eta.detach()\n            eta.requires_grad_()\n            eta.retain_grad()\n\n        #self.param_optimizer.zero_grad()\n\n        yofo_inp = eta + inp\n        yofo_inp = torch.clamp(yofo_inp, 0, 1)\n\n        loss = -1.0 * self.Hamiltonian_func(yofo_inp, p)\n\n        loss.backward()\n        #self.param_optimizer.step()\n        #self.param_optimizer.zero_grad()\n\n        return yofo_inp, eta\n\n\n\n\ndef train_one_epoch(net, batch_generator, optimizer,\n                    criterion, LayerOneTrainner, K,\n                    DEVICE=torch.device('cuda:0'),descrip_str='Training'):\n    '''\n\n    :param attack_freq:  Frequencies of training with adversarial examples. 
-1 indicates natural training\n    :param AttackMethod: the attack method, None represents natural training\n    :return:  None    #(clean_acc, adv_acc)\n    '''\n    net.train()\n    pbar = tqdm(batch_generator)\n    yofoacc = -1\n    cleanacc = -1\n    cleanloss = -1\n    pbar.set_description(descrip_str)\n    for i, (data, label) in enumerate(pbar):\n        data = data.to(DEVICE)\n        label = label.to(DEVICE)\n\n        eta = torch.FloatTensor(*data.shape).uniform_(-config.eps, config.eps)\n        eta = eta.to(label.device)\n        eta.requires_grad_()\n\n        optimizer.zero_grad()\n        LayerOneTrainner.param_optimizer.zero_grad()\n\n        for j in range(K):\n            #optimizer.zero_grad()\n\n            pbar_dic = OrderedDict()\n            TotalLoss = 0\n\n            pred = net(data + eta.detach())\n\n            loss = criterion(pred, label)\n            TotalLoss = TotalLoss + loss\n            wgrad = net.conv1.weight.grad\n            #bgrad = net.conv1.bias.grad\n            TotalLoss.backward()\n            net.conv1.weight.grad = wgrad\n            #net.conv1.bias.grad = bgrad\n            #param = next(net.parameters())\n            #grad_mean = torch.mean(param.grad)\n\n            #optimizer.step()\n            #optimizer.zero_grad()\n\n            p = -1.0 * net.layer_one_out.grad\n            yofo_inp, eta = LayerOneTrainner.step(data, p, eta)\n\n            with torch.no_grad():\n                if j == 0:\n                    acc = torch_accuracy(pred, label, (1,))\n                    cleanacc = acc[0].item()\n                    cleanloss = loss.item()\n\n                if j == K - 1:\n                    yofo_pred = net(yofo_inp)\n                    yofoacc = torch_accuracy(yofo_pred, label, (1,))[0].item()\n            #pbar_dic['grad'] = '{}'.format(grad_mean)\n\n        optimizer.step()\n        LayerOneTrainner.param_optimizer.step()\n        optimizer.zero_grad()\n        LayerOneTrainner.param_optimizer.zero_grad()\n        pbar_dic['Acc'] = '{:.2f}'.format(cleanacc)\n        pbar_dic['loss'] = '{:.2f}'.format(cleanloss)\n        pbar_dic['YofoAcc'] = '{:.2f}'.format(yofoacc)\n        pbar.set_postfix(pbar_dic)\n\n    return cleanacc, yofoacc\n\n"
  },
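  {
    "path": "experiments/CIFAR10/wide34.yopo-5-3/projection_sketch.py",
    "content": "# Illustrative sketch added for clarity; `project` is a hypothetical helper that\n# mirrors the two clamps inside FastGradientLayerOneTrainer.step: clipping eta to\n# [-eps, eps] and then re-clipping inp + eta to [0, 1] projects the perturbed\n# image onto the intersection of the L_inf ball around inp and the valid image box.\nimport torch\n\n\ndef project(inp, eta, eps):\n    eta = torch.clamp(eta, -eps, eps)                # stay inside the L_inf ball\n    return torch.clamp(inp + eta, 0.0, 1.0) - inp    # keep inp + eta a valid image\n\n\nif __name__ == '__main__':\n    torch.manual_seed(0)\n    x = torch.rand(8)\n    eta = project(x, torch.randn(8), eps=8 / 255.0)\n    assert (eta.abs() <= 8 / 255.0 + 1e-7).all()\n    assert ((x + eta >= 0).all() and (x + eta <= 1).all())\n    print('projection ok')\n"
  },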
  {
    "path": "experiments/CIFAR10-TRADES/baseline.res-pre18.TRADES.10step/config.py",
    "content": "import sys\nimport os\nimport argparse\nimport numpy as np\nimport torch\n\n\ndef add_path(path):\n    if path not in sys.path:\n        print('Adding {}'.format(path))\n        sys.path.append(path)\n\nabs_current_path = os.path.realpath('./')\nroot_path = os.path.join('/', *abs_current_path.split(os.path.sep)[:-3])\nlib_dir = os.path.join(root_path, 'lib')\nadd_path(lib_dir)\n\nfrom training.config import TrainingConfigBase, SGDOptimizerMaker, \\\n    PieceWiseConstantLrSchedulerMaker, IPGDAttackMethodMaker\n\nclass TrainingConfing(TrainingConfigBase):\n\n    create_optimizer = None #SGDOptimizerMaker(lr =1e-1 * 5 / K, momentum = 0.9, weight_decay = 5e-4)\n    create_lr_scheduler = None #PieceWiseConstantLrSchedulerMaker(milestones = [35, 40, 45], gamma = 0.1)\n    #\n    create_loss_function = None #torch.nn.CrossEntropyLoss\n    create_attack_method = None\n\n    create_evaluation_attack_method = \\\n        IPGDAttackMethodMaker(eps = 8/255.0, sigma = 2/255.0, nb_iters = 20, norm = np.inf,\n                              mean=torch.tensor(\n                                  np.array([0]).astype(np.float32)[np.newaxis, :, np.newaxis, np.newaxis]),\n                              std=torch.tensor(np.array([1]).astype(np.float32)[np.newaxis, :, np.newaxis, np.newaxis]))\n\n\nconfig = TrainingConfing()\n\n\nparser = argparse.ArgumentParser(description='PyTorch CIFAR TRADES Adversarial Training')\nparser.add_argument('--batch-size', type=int, default=200, metavar='N',\n                    help='input batch size for training (default: 128)')\nparser.add_argument('--test-batch-size', type=int, default=256, metavar='N',\n                    help='input batch size for testing (default: 128)')\nparser.add_argument('--epochs', type=int, default=100, metavar='N',\n                    help='number of epochs to train')\nparser.add_argument('--weight-decay', '--wd', default=2e-4,\n                    type=float, metavar='W')\nparser.add_argument('--lr', type=float, default=0.1, metavar='LR',\n                    help='learning rate')\nparser.add_argument('--momentum', type=float, default=0.9, metavar='M',\n                    help='SGD momentum')\nparser.add_argument('--no-cuda', action='store_true', default=False,\n                    help='disables CUDA training')\nparser.add_argument('--epsilon', default=0.031,\n                    help='perturbation')\nparser.add_argument('--num-steps', default=10,\n                    help='perturb number of steps')\nparser.add_argument('--step-size', default=0.007,\n                    help='perturb step size')\nparser.add_argument('--beta', default=1.0,\n                    help='regularization, i.e., 1/lambda in TRADES')\nparser.add_argument('--seed', type=int, default=1, metavar='S',\n                    help='random seed (default: 1)')\nparser.add_argument('--log-interval', type=int, default=100, metavar='N',\n                    help='how many batches to wait before logging training status')\nparser.add_argument('--model-dir', default='./model-cifar-wideResNet',\n                    help='directory of model for saving checkpoint')\nparser.add_argument('--save-freq', '-s', default=5, type=int, metavar='N',\n                    help='save frequency')\nparser.add_argument('-d', default=0, type=int, help='which gpu to use')\n\nargs = parser.parse_args()\n\nif __name__ == '__main__':\n    pass"
  },
  {
    "path": "experiments/CIFAR10-TRADES/baseline.res-pre18.TRADES.10step/network.py",
    "content": "import config\nfrom base_model.wide_resnet import WideResNet\nfrom base_model.preact_resnet import PreActResNet18\n\ndef create_network():\n    # return WideResNet(34)\n    return PreActResNet18()\n\n\ndef test():\n    net = create_network()\n    y = net((torch.randn(1, 3, 32, 32)))\n    print(y.size())\n"
  },
  {
    "path": "experiments/CIFAR10-TRADES/baseline.res-pre18.TRADES.10step/trades.py",
    "content": "import torch\nimport torch.nn as nn\nimport torch.nn.functional as F\nfrom torch.autograd import Variable\n\nfrom config import config\nfrom utils.misc import torch_accuracy, AvgMeter\n\ndef squared_l2_norm(x):\n    flattened = x.view(x.shape[0], -1)\n    return (flattened ** 2).sum(1)\n\n\ndef l2_norm(x):\n    return squared_l2_norm(x).sqrt()\n\n\ndef trades_loss(model,\n                x_natural,\n                y,\n                optimizer,\n                device,\n                step_size=0.003,\n                epsilon=0.031,\n                perturb_steps=10,\n                beta=1.0,\n                distance='l_inf'):\n    # define KL-loss\n    criterion_kl = nn.KLDivLoss(size_average=False)\n    model.eval()\n    batch_size = len(x_natural)\n    # generate adversarial example\n    x_adv = x_natural.detach() + 0.001 * torch.randn(x_natural.shape).cuda().detach().to(device)\n    if distance == 'l_inf':\n        # logits_natural = model(x_natural).detach()\n\n        for _ in range(perturb_steps):\n            x_adv.requires_grad_()\n            with torch.enable_grad():\n                loss_kl = criterion_kl(F.log_softmax(model(x_adv), dim=1),\n                                       F.softmax(model(x_natural), dim=1))\n                # loss_kl = criterion_kl(F.log_softmax(model(x_adv), dim=1),\n                #                        F.softmax(logits_natural, dim=1))\n\n            grad = torch.autograd.grad(loss_kl, [x_adv])[0]\n            x_adv = x_adv.detach() + step_size * torch.sign(grad.detach())\n            x_adv = torch.min(torch.max(x_adv, x_natural - epsilon), x_natural + epsilon)\n            x_adv = torch.clamp(x_adv, 0.0, 1.0)\n\n    elif distance == 'l_2':\n        for _ in range(perturb_steps):\n            x_adv.requires_grad_()\n            with torch.enable_grad():\n                loss_kl = criterion_kl(F.log_softmax(model(x_adv), dim=1),\n                                       F.softmax(model(x_natural), dim=1))\n            grad = torch.autograd.grad(loss_kl, [x_adv])[0]\n            for idx_batch in range(batch_size):\n                grad_idx = grad[idx_batch]\n                grad_idx_norm = l2_norm(grad_idx)\n                grad_idx /= (grad_idx_norm + 1e-8)\n                x_adv[idx_batch] = x_adv[idx_batch].detach() + step_size * grad_idx\n                eta_x_adv = x_adv[idx_batch] - x_natural[idx_batch]\n                norm_eta = l2_norm(eta_x_adv)\n                if norm_eta > epsilon:\n                    eta_x_adv = eta_x_adv * epsilon / l2_norm(eta_x_adv)\n                x_adv[idx_batch] = x_natural[idx_batch] + eta_x_adv\n            x_adv = torch.clamp(x_adv, 0.0, 1.0)\n    else:\n        x_adv = torch.clamp(x_adv, 0.0, 1.0)\n\n    model.train()\n\n    x_adv = Variable(torch.clamp(x_adv, 0.0, 1.0), requires_grad=False)\n    # zero gradient\n    optimizer.zero_grad()\n    # calculate robust loss\n    logits = model(x_natural)\n    adv_logits = model(x_adv)\n    loss_natural = F.cross_entropy(logits, y)\n    loss_robust = (1.0 / batch_size) * criterion_kl(F.log_softmax(adv_logits, dim=1),\n                                                    F.softmax(logits, dim=1))\n    loss = loss_natural + beta * loss_robust\n\n    cleanacc = torch_accuracy(logits, y, (1,))[0].item()\n    tradesacc = torch_accuracy(adv_logits, y, (1,))[0].item()\n    return loss, loss_natural.item(), loss_robust.item(), cleanacc, tradesacc\n"
  },
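  {
    "path": "experiments/CIFAR10-TRADES/baseline.res-pre18.TRADES.10step/kl_normalization_sketch.py",
    "content": "# Illustrative sketch added for clarity (not part of the original TRADES code):\n# with reduction='sum', nn.KLDivLoss sums KL(natural || adv) over the whole batch,\n# which is why trades_loss multiplies by 1/batch_size to get the per-example KL\n# term that beta scales. The check below compares it against KL computed by hand.\nimport torch\nimport torch.nn.functional as F\n\ntorch.manual_seed(0)\nbatch = 5\nlogits_adv = torch.randn(batch, 10)\nlogits_nat = torch.randn(batch, 10)\n\nkl_sum = torch.nn.KLDivLoss(reduction='sum')(F.log_softmax(logits_adv, dim=1),\n                                             F.softmax(logits_nat, dim=1))\n\n# Per-example KL(p || q) = sum_i p_i * (log p_i - log q_i), averaged over the batch.\np = F.softmax(logits_nat, dim=1)\nkl_manual = (p * (p.log() - F.log_softmax(logits_adv, dim=1))).sum(dim=1).mean()\n\nprint(torch.allclose(kl_sum / batch, kl_manual, atol=1e-6))  # expected: True\n"
  },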
  {
    "path": "experiments/CIFAR10-TRADES/baseline.res-pre18.TRADES.10step/train_trades_cifar10.py",
    "content": "from __future__ import print_function\nimport os\nfrom tqdm import tqdm\nfrom collections import OrderedDict\nfrom time import time\nimport json\n\nimport argparse\nimport torch\nimport torch.nn as nn\nimport torch.nn.functional as F\nimport torchvision\nimport torch.optim as optim\nfrom torchvision import datasets, transforms\n\nfrom config import config, args\nfrom network import create_network\nfrom trades import trades_loss\n\nfrom training.train import eval_one_epoch\nfrom utils.misc import torch_accuracy, AvgMeter\n\n# settings\nmodel_dir = args.model_dir\nif not os.path.exists(model_dir):\n    os.makedirs(model_dir)\nuse_cuda = not args.no_cuda and torch.cuda.is_available()\ntorch.manual_seed(args.seed)\ndevice = torch.device('cuda:{}'.format(args.d) if use_cuda else \"cpu\")\nkwargs = {'num_workers': 1, 'pin_memory': True} if use_cuda else {}\n\n# setup data loader\ntransform_train = transforms.Compose([\n    transforms.RandomCrop(32, padding=4),\n    transforms.RandomHorizontalFlip(),\n    transforms.ToTensor(),\n])\ntransform_test = transforms.Compose([\n    transforms.ToTensor(),\n])\ntrainset = torchvision.datasets.CIFAR10(root='../data', train=True, download=True, transform=transform_train)\ntrain_loader = torch.utils.data.DataLoader(trainset, batch_size=args.batch_size, shuffle=True, **kwargs)\ntestset = torchvision.datasets.CIFAR10(root='../data', train=False, download=True, transform=transform_test)\ntest_loader = torch.utils.data.DataLoader(testset, batch_size=args.test_batch_size, shuffle=False, **kwargs)\n\n\ndef train(args, model, device, train_loader, optimizer, epoch, descrip_str='Training'):\n    model.train()\n    pbar = tqdm(train_loader)\n    pbar.set_description(descrip_str)\n\n    CleanAccMeter = AvgMeter()\n    TradesAccMeter = AvgMeter()\n    for batch_idx, (data, target) in enumerate(pbar):\n        data, target = data.to(device), target.to(device)\n\n        optimizer.zero_grad()\n\n        # calculate robust loss\n        loss, cleanloss, klloss, cleanacc, tradesacc = trades_loss(model=model,\n                                                                   x_natural=data,\n                                                                   y=target,\n                                                                   optimizer=optimizer,\n                                                                   device=device,\n                                                                   step_size=args.step_size,\n                                                                   epsilon=args.epsilon,\n                                                                   perturb_steps=args.num_steps,\n                                                                   beta=args.beta,)\n        loss.backward()\n        optimizer.step()\n\n        CleanAccMeter.update(cleanacc)\n        TradesAccMeter.update(tradesacc)\n\n        pbar_dic = OrderedDict()\n        pbar_dic['cleanloss'] = '{:.3f}'.format(cleanloss)\n        pbar_dic['klloss'] = '{:.3f}'.format(klloss)\n        pbar_dic['CleanAcc'] = '{:.2f}'.format(CleanAccMeter.mean)\n        pbar_dic['TradesAcc'] = '{:.2f}'.format(TradesAccMeter.mean)\n        pbar.set_postfix(pbar_dic)\n\n\ndef adjust_learning_rate(optimizer, epoch):\n    \"\"\"decrease the learning rate\"\"\"\n    lr = args.lr\n    if epoch >= 75:\n        lr = args.lr * 0.1\n    elif epoch >= 90:\n        lr = args.lr * 0.01\n    elif epoch >= 100:\n        lr = args.lr * 0.001\n    for param_group in optimizer.param_groups:\n   
     param_group['lr'] = lr\n\n\ndef main():\n    model = create_network().to(device)\n\n    optimizer = optim.SGD(model.parameters(), lr=args.lr, momentum=args.momentum, weight_decay=args.weight_decay)\n\n    EvalAttack = config.create_evaluation_attack_method(device)\n\n    now_train_time = 0\n    for epoch in range(1, args.epochs + 1):\n        # adjust learning rate for SGD\n        adjust_learning_rate(optimizer, epoch)\n\n        s_time = time()\n        descrip_str = 'Training epoch: {}/{}'.format(epoch, args.epochs)\n        # adversarial training\n        train(args, model, device, train_loader, optimizer, epoch, descrip_str)\n        now_train_time += time() - s_time\n\n        acc, advacc = eval_one_epoch(model, test_loader, device, EvalAttack)\n\n        # save checkpoint\n        if epoch % args.save_freq == 0:\n            torch.save(model.state_dict(),\n                       os.path.join(config.model_dir, 'model-wideres-epoch{}.pt'.format(epoch)))\n\n\nif __name__ == '__main__':\n    main()"
  },
  {
    "path": "experiments/CIFAR10-TRADES/pre-res18.TRADES-YOPO-2-5/config.py",
    "content": "from easydict import EasyDict\nimport sys\nimport os\nimport argparse\nimport numpy as np\nimport torch\n\ndef add_path(path):\n    if path not in sys.path:\n        print('Adding {}'.format(path))\n        sys.path.append(path)\n\nabs_current_path = os.path.realpath('./')\nroot_path = os.path.join('/', *abs_current_path.split(os.path.sep)[:-3])\nlib_dir = os.path.join(root_path, 'lib')\nadd_path(lib_dir)\n\nfrom training.config import TrainingConfigBase, SGDOptimizerMaker, \\\n    PieceWiseConstantLrSchedulerMaker, IPGDAttackMethodMaker\n\nclass TrainingConfing(TrainingConfigBase):\n\n    lib_dir = lib_dir\n\n    num_epochs = 105\n    val_interval = 10\n    weight_decay = 5e-4\n\n    inner_iters = 5\n    K = 2\n    sigma = 0.007\n    eps = 0.031\n\n    create_optimizer = SGDOptimizerMaker(lr=2e-1, momentum = 0.9, weight_decay = weight_decay)\n    create_lr_scheduler = PieceWiseConstantLrSchedulerMaker(milestones = [70, 90, 100], gamma = 0.1)\n\n    create_loss_function = torch.nn.CrossEntropyLoss\n\n    create_attack_method = None\n\n    create_evaluation_attack_method = \\\n        IPGDAttackMethodMaker(eps = 8/255.0, sigma = 2/255.0, nb_iters = 20, norm = np.inf,\n                              mean=torch.tensor(\n                                  np.array([0]).astype(np.float32)[np.newaxis, :, np.newaxis, np.newaxis]),\n                              std=torch.tensor(np.array([1]).astype(np.float32)[np.newaxis, :, np.newaxis, np.newaxis]))\n\n\nconfig = TrainingConfing()\n\n\n\nparser = argparse.ArgumentParser()\n\nparser.add_argument('--resume', default=None, type=str, metavar='PATH',\n                 help='path to latest checkpoint (default: none)')\nparser.add_argument('-b', '--batch_size', default=256, type=int,\n                 metavar='N', help='mini-batch size')\nparser.add_argument('-d', type=int, default=0, help='Which gpu to use')\nparser.add_argument('-adv_coef', default=1.0, type = float,\n                    help = 'Specify the weight for adversarial loss')\nparser.add_argument('--auto-continue', default=False, action = 'store_true',\n                    help = 'Continue from the latest checkpoint')\nargs = parser.parse_args()\n\n\nif __name__ == '__main__':\n    pass"
  },
  {
    "path": "experiments/CIFAR10-TRADES/pre-res18.TRADES-YOPO-2-5/dataset.py",
    "content": "import torch\nimport torchvision\nimport torchvision.transforms as transforms\nimport numpy as np\ndef create_train_dataset(batch_size = 128, root = '../data'):\n\n    transform_train = transforms.Compose([\n     transforms.RandomCrop(32, padding=4),\n     transforms.RandomHorizontalFlip(),\n     transforms.ToTensor(),\n    ])\n\n    trainset = torchvision.datasets.CIFAR10(root=root, train=True, download=True, transform=transform_train)\n    trainloader = torch.utils.data.DataLoader(trainset, batch_size=batch_size, shuffle=True, num_workers=0)\n\n    return trainloader\ndef create_test_dataset(batch_size = 128, root = '../data'):\n    transform_test = transforms.Compose([\n     transforms.ToTensor(),\n    ])\n    testset = torchvision.datasets.CIFAR10(root=root, train=False, download=True, transform=transform_test)\n    testloader = torch.utils.data.DataLoader(testset, batch_size=batch_size, shuffle=False, num_workers=0)\n    return testloader\n\n\nif __name__ == '__main__':\n    print(create_train_dataset())\n    print(create_test_dataset())\n\n"
  },
  {
    "path": "experiments/CIFAR10-TRADES/pre-res18.TRADES-YOPO-2-5/loss.py",
    "content": "import torch\nimport torch.nn as nn\nfrom torch.nn.modules.loss import _Loss\nimport torch.nn.functional as F\n\n\nclass Hamiltonian(_Loss):\n\n    def __init__(self, layer, reg_cof = 1e-4):\n        super(Hamiltonian, self).__init__()\n        self.layer = layer\n        self.reg_cof = 0\n\n    def forward(self, x, p):\n\n        y = self.layer(x)\n        H = torch.sum(y * p)\n        return H\n\n\nclass CrossEntropyWithWeightPenlty(_Loss):\n    def __init__(self, module, DEVICE, reg_cof = 1e-4):\n        super(CrossEntropyWithWeightPenlty, self).__init__()\n\n        self.reg_cof = reg_cof\n        self.criterion = nn.CrossEntropyLoss().to(DEVICE)\n        self.module = module\n\n    def __call__(self, pred, label):\n        cross_loss = self.criterion(pred, label)\n        weight_loss = cal_l2_norm(self.module)\n\n        loss = cross_loss + self.reg_cof * weight_loss\n        return loss\n\n\ndef cal_l2_norm(layer: torch.nn.Module):\n loss = 0.\n for name, param in layer.named_parameters():\n     if name == 'weight':\n         loss = loss + 0.5 * torch.norm(param,) ** 2\n\n return loss\n\n"
  },
  {
    "path": "experiments/CIFAR10-TRADES/pre-res18.TRADES-YOPO-2-5/network.py",
    "content": "'''Pre-activation ResNet in PyTorch.\nReference:\n[1] Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun\n    Identity Mappings in Deep Residual Networks. arXiv:1603.05027\n'''\nimport torch\nimport torch.nn as nn\nimport torch.nn.functional as F\n\n\n\nclass PreActBlock(nn.Module):\n    '''Pre-activation version of the BasicBlock.'''\n    expansion = 1\n\n    def __init__(self, in_planes, planes, stride=1):\n        super(PreActBlock, self).__init__()\n        self.bn1 = nn.BatchNorm2d(in_planes)\n        self.conv1 = nn.Conv2d(in_planes, planes, kernel_size=3, stride=stride, padding=1, bias=False)\n        self.bn2 = nn.BatchNorm2d(planes)\n        self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=1, padding=1, bias=False)\n\n        if stride != 1 or in_planes != self.expansion * planes:\n            self.shortcut = nn.Sequential(\n                nn.Conv2d(in_planes, self.expansion * planes, kernel_size=1, stride=stride, bias=False)\n            )\n\n    def forward(self, x):\n        out = F.relu(self.bn1(x))\n        shortcut = self.shortcut(out) if hasattr(self, 'shortcut') else x\n        out = self.conv1(out)\n        out = self.conv2(F.relu(self.bn2(out)))\n        out += shortcut\n        return out\n\n\nclass PreActResNet(nn.Module):\n\n    def __init__(self, block, num_blocks, num_classes=10):\n        super(PreActResNet, self).__init__()\n        self.in_planes = 64\n\n        self.other_layers = nn.ModuleList()\n\n        self.conv1 = nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1, bias=False)\n\n        self.layer_one = self.conv1\n\n\n        self.other_layer1 = self._make_layer(block, 64, num_blocks[0], stride=1)\n        self.other_layer2 = self._make_layer(block, 128, num_blocks[1], stride=2)\n        self.other_layer3 = self._make_layer(block, 256, num_blocks[2], stride=2)\n        self.other_layer4 = self._make_layer(block, 512, num_blocks[3], stride=2)\n\n        self.linear = GlobalpoolFC(512 * block.expansion, num_classes)\n        self.other_layers.append(self.linear)\n\n    def _make_layer(self, block, planes, num_blocks, stride):\n        strides = [stride] + [1] * (num_blocks - 1)\n        layers = []\n        for stride in strides:\n            layers.append(block(self.in_planes, planes, stride))\n            self.other_layers.append(layers[-1])\n\n            self.in_planes = planes * block.expansion\n        return nn.Sequential(*layers)\n\n    def forward(self, x):\n\n        x = self.layer_one(x)\n        self.layer_one_out = x\n        self.layer_one_out.requires_grad_()\n        self.layer_one_out.retain_grad()\n        x = self.layer_one_out\n\n        for layer in self.other_layers:\n            x = layer(x)\n\n        return x\n\n\nclass GlobalpoolFC(nn.Module):\n\n    def __init__(self, num_in, num_class):\n        super(GlobalpoolFC, self).__init__()\n        self.pool = nn.AdaptiveAvgPool2d(output_size=1)\n        self.fc = nn.Linear(num_in, num_class)\n\n    def forward(self, x):\n        y = self.pool(x)\n        y = y.reshape(y.shape[0], -1)\n        y = self.fc(y)\n        return y\n\n\ndef PreActResNet18():\n    return PreActResNet(PreActBlock, [2, 2, 2, 2])\n\n\ndef PreActResNet34():\n    return PreActResNet(PreActBlock, [3, 4, 6, 3])\n\n\nclass PreActBottleneck(nn.Module):\n    '''Pre-activation version of the original Bottleneck module.'''\n    expansion = 4\n\n    def __init__(self, in_planes, planes, stride=1):\n        super(PreActBottleneck, self).__init__()\n        self.bn1 = nn.BatchNorm2d(in_planes)\n  
      self.conv1 = nn.Conv2d(in_planes, planes, kernel_size=1, bias=False)\n        self.bn2 = nn.BatchNorm2d(planes)\n        self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=stride, padding=1, bias=False)\n        self.bn3 = nn.BatchNorm2d(planes)\n        self.conv3 = nn.Conv2d(planes, self.expansion*planes, kernel_size=1, bias=False)\n\n        if stride != 1 or in_planes != self.expansion*planes:\n            self.shortcut = nn.Sequential(\n                nn.Conv2d(in_planes, self.expansion*planes, kernel_size=1, stride=stride, bias=False)\n            )\n\n    def forward(self, x):\n        out = F.relu(self.bn1(x))\n        shortcut = self.shortcut(out) if hasattr(self, 'shortcut') else x\n        out = self.conv1(out)\n        out = self.conv2(F.relu(self.bn2(out)))\n        out = self.conv3(F.relu(self.bn3(out)))\n        out += shortcut\n        return out\n\ndef create_network():\n    return PreActResNet18()\n\n\ndef test():\n    net = PreActResNet18()\n    y = net((torch.randn(1, 3, 32, 32)))\n    print(y.size())\n"
  },
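  {
    "path": "experiments/CIFAR10-TRADES/pre-res18.TRADES-YOPO-2-5/retain_grad_sketch.py",
    "content": "# Illustrative sketch added for clarity (not part of the training pipeline): it\n# demonstrates the retain_grad trick used in PreActResNet.forward. By default\n# autograd frees gradients of non-leaf tensors; retain_grad() keeps them, so the\n# trainer can read d(loss)/d(layer-one output) after backward() as the co-state p.\nimport torch\n\nx = torch.randn(3, requires_grad=True)\nh = x * 2.0          # a non-leaf tensor, like net.layer_one_out\nh.retain_grad()      # without this line, h.grad would be None after backward()\nloss = (h ** 2).sum()\nloss.backward()\nprint(h.grad)        # d(loss)/dh = 2 * h\nprint(torch.allclose(h.grad, 2 * h.detach()))  # expected: True\n"
  },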
  {
    "path": "experiments/CIFAR10-TRADES/pre-res18.TRADES-YOPO-2-5/train.py",
    "content": "from config import config, args\nfrom dataset import create_train_dataset, create_test_dataset\nfrom network import create_network\n\nfrom utils.misc import save_args, save_checkpoint, load_checkpoint\nfrom training.train import eval_one_epoch\nfrom loss import  Hamiltonian, CrossEntropyWithWeightPenlty\nfrom training_function import train_one_epoch, FastGradientLayerOneTrainer\n\nimport torch\nimport torch.optim as optim\nimport os\n\n\nDEVICE = torch.device('cuda:{}'.format(args.d))\n\nnet = create_network()\nnet.to(DEVICE)\n\ncriterion = config.create_loss_function().to(DEVICE)\noptimizer = config.create_optimizer(net.other_layers.parameters())\nlr_scheduler = config.create_lr_scheduler(optimizer)\n\n\nHamiltonian_func = Hamiltonian(net.layer_one, config.weight_decay)\nlayer_one_optimizer = optim.SGD(net.layer_one.parameters(), lr = lr_scheduler.get_lr()[0], momentum=0.9, weight_decay=5e-4)\nlyaer_one_optimizer_lr_scheduler = optim.lr_scheduler.MultiStepLR(layer_one_optimizer,\n                                                                  milestones = [70, 90, 100], gamma = 0.1)\nLayerOneTrainer = FastGradientLayerOneTrainer(Hamiltonian_func, layer_one_optimizer,\n                                              config.inner_iters, config.sigma, config.eps)\n\n\n\nds_train = create_train_dataset(args.batch_size)\nds_val = create_test_dataset(args.batch_size)\n\nEvalAttack = config.create_evaluation_attack_method(DEVICE)\n\nnow_epoch = 0\nif args.auto_continue:\n    args.resume = os.path.join(config.model_dir, 'last.checkpoint')\nif args.resume is not None and os.path.isfile(args.resume):\n    now_epoch = load_checkpoint(args.resume, net, optimizer,lr_scheduler)\n\nwhile True:\n    if now_epoch > config.num_epochs:\n        break\n    now_epoch = now_epoch + 1\n\n    descrip_str = 'Training epoch:{}/{} -- lr:{}'.format(now_epoch, config.num_epochs,\n                                                                       lr_scheduler.get_lr()[0])\n    acc, yofoacc = train_one_epoch(net, ds_train, optimizer, criterion, LayerOneTrainer, config.K,\n                    DEVICE, descrip_str)\n    acc, advacc = eval_one_epoch(net, ds_val, DEVICE, EvalAttack)\n\n    lr_scheduler.step()\n    lyaer_one_optimizer_lr_scheduler.step()\n    save_checkpoint(now_epoch, net, optimizer, lr_scheduler,\n                    file_name = os.path.join(config.model_dir, 'epoch-{}.checkpoint'.format(now_epoch)))\n"
  },
  {
    "path": "experiments/CIFAR10-TRADES/pre-res18.TRADES-YOPO-2-5/training_function.py",
    "content": "import torch\nimport torch.nn as nn\nfrom config import config\n\nfrom loss import Hamiltonian, cal_l2_norm\nimport torch.nn.functional as F\n\nfrom utils.misc import torch_accuracy, AvgMeter\nfrom collections import OrderedDict\nimport torch\nfrom tqdm import tqdm\n\n\nclass FastGradientLayerOneTrainer(object):\n\n    def __init__(self, Hamiltonian_func, param_optimizer,\n                    inner_steps=2, sigma = 0.008, eps = 0.03):\n        self.inner_steps = inner_steps\n        self.sigma = sigma\n        self.eps = eps\n        self.Hamiltonian_func = Hamiltonian_func\n        self.param_optimizer = param_optimizer\n\n    def step(self, inp, p, eta):\n        '''\n        Perform Iterative Sign Gradient on eta\n        ret: inp + eta\n        '''\n\n        p = p.detach()\n\n        for i in range(self.inner_steps):\n            tmp_inp = inp + eta\n            tmp_inp = torch.clamp(tmp_inp, 0, 1)\n            H = self.Hamiltonian_func(tmp_inp, p)\n\n            eta_grad = torch.autograd.grad(H, eta, only_inputs=True, retain_graph=False)[0]\n            eta_grad_sign = eta_grad.sign()\n            eta = eta - eta_grad_sign * self.sigma\n\n            eta = torch.clamp(eta, -1.0 * self.eps, self.eps)\n            eta = torch.clamp(inp + eta, 0.0, 1.0) - inp\n            eta = eta.detach()\n            eta.requires_grad_()\n            eta.retain_grad()\n\n\n        yofo_inp = eta + inp\n        yofo_inp = torch.clamp(yofo_inp, 0, 1)\n        loss = -1.0 * (self.Hamiltonian_func(yofo_inp, p) -\n                       config.weight_decay * cal_l2_norm(self.Hamiltonian_func.layer))\n\n        loss.backward()\n\n\n        return yofo_inp, eta\n\n\ndef train_one_epoch(net, batch_generator, optimizer,\n                    criterion, LayerOneTrainner, K,\n                    DEVICE=torch.device('cuda:0'),descrip_str='Training'):\n\n    net.train()\n    pbar = tqdm(batch_generator)\n    yofoacc = -1\n    pbar.set_description(descrip_str)\n\n    trades_criterion = torch.nn.KLDivLoss(size_average=False) #.to(DEVICE)\n\n    for i, (data, label) in enumerate(pbar):\n        data = data.to(DEVICE)\n        label = label.to(DEVICE)\n\n        net.eval()\n        eta = 0.001 * torch.randn(data.shape).cuda().detach().to(DEVICE)\n\n        eta.requires_grad_()\n\n\n        raw_soft_label = F.softmax(net(data), dim=1).detach()\n        for j in range(K):\n            pred = net(data + eta.detach())\n\n            with torch.enable_grad():\n                loss = trades_criterion(F.log_softmax(pred, dim = 1), raw_soft_label)#raw_soft_label.detach())\n\n            p = -1.0 * torch.autograd.grad(loss, [net.layer_one_out, ])[0]\n\n            yofo_inp, eta = LayerOneTrainner.step(data, p, eta)\n\n            with torch.no_grad():\n\n                if j == K - 1:\n                    yofo_pred = net(yofo_inp)\n                    yofo_loss = criterion(yofo_pred, label)\n                    yofoacc = torch_accuracy(yofo_pred, label, (1,))[0].item()\n\n\n        net.train()\n\n        optimizer.zero_grad()\n        LayerOneTrainner.param_optimizer.zero_grad()\n\n        raw_pred = net(data)\n        acc = torch_accuracy(raw_pred, label, (1,))\n        clean_acc = acc[0].item()\n        clean_loss = criterion(raw_pred, label)\n\n\n        adv_pred = net(torch.clamp(data + eta.detach(), 0.0, 1.0))\n        kl_loss = trades_criterion(F.log_softmax(adv_pred, dim=1),\n                                    F.softmax(raw_pred, dim=1)) / data.shape[0]\n\n        loss = clean_loss + kl_loss\n        
loss.backward()\n\n        optimizer.step()\n        LayerOneTrainner.param_optimizer.step()\n\n        optimizer.zero_grad()\n        LayerOneTrainner.param_optimizer.zero_grad()\n\n        pbar_dic = OrderedDict()\n        pbar_dic['Acc'] = '{:.2f}'.format(clean_acc)\n        pbar_dic['cleanloss'] = '{:.3f}'.format(clean_loss.item())\n        pbar_dic['klloss'] = '{:.3f}'.format(kl_loss.item())\n        pbar_dic['YofoAcc'] = '{:.2f}'.format(yofoacc)\n        pbar_dic['Yofoloss'] = '{:.3f}'.format(yofo_loss.item())\n        pbar.set_postfix(pbar_dic)\n\n    return clean_acc, yofoacc"
  },
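  {
    "path": "experiments/CIFAR10-TRADES/pre-res18.TRADES-YOPO-2-5/costate_sketch.py",
    "content": "# Illustrative sketch added for clarity (not part of the training pipeline): in\n# train_one_epoch the co-state is taken as p = -d(KL loss)/d(layer_one_out) via\n# torch.autograd.grad on the retained intermediate activation, instead of reading\n# .grad after a full backward(). The toy check below shows both routes agree.\nimport torch\n\nx = torch.randn(4, requires_grad=True)\nh = 3.0 * x            # stands in for net.layer_one_out\nh.retain_grad()\nloss = (h ** 2).sum()\n\n# Route 1: autograd.grad on the intermediate tensor (keep the graph for route 2).\ng1 = torch.autograd.grad(loss, [h], retain_graph=True)[0]\n\n# Route 2: full backward(), then read the retained .grad.\nloss.backward()\nprint(torch.allclose(g1, h.grad))  # expected: True\n"
  },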
  {
    "path": "experiments/CIFAR10-TRADES/pre-res18.TRADES-YOPO-3-4/config.py",
    "content": "from easydict import EasyDict\nimport sys\nimport os\nimport argparse\nimport numpy as np\nimport torch\nfrom loss import CrossEntropyWithWeightPenlty\n\ndef add_path(path):\n    if path not in sys.path:\n        print('Adding {}'.format(path))\n        sys.path.append(path)\n\nabs_current_path = os.path.realpath('./')\nroot_path = os.path.join('/', *abs_current_path.split(os.path.sep)[:-3])\nlib_dir = os.path.join(root_path, 'lib')\nadd_path(lib_dir)\n\nfrom training.config import TrainingConfigBase, SGDOptimizerMaker, \\\n    PieceWiseConstantLrSchedulerMaker, IPGDAttackMethodMaker\n\nclass TrainingConfing(TrainingConfigBase):\n\n    lib_dir = lib_dir\n\n    num_epochs = 105\n    val_interval = 10\n    weight_decay = 5e-4\n\n    inner_iters = 4\n    K = 3\n    sigma = 0.007\n    eps = 0.031\n\n    create_optimizer = SGDOptimizerMaker(lr =2e-1, momentum = 0.9, weight_decay = weight_decay)\n    create_lr_scheduler = PieceWiseConstantLrSchedulerMaker(milestones = [70, 90, 100], gamma = 0.1)\n\n    create_loss_function = torch.nn.CrossEntropyLoss\n    create_attack_method = None\n\n    create_evaluation_attack_method = \\\n        IPGDAttackMethodMaker(eps = 8/255.0, sigma = 2/255.0, nb_iters = 20, norm = np.inf,\n                              mean=torch.tensor(\n                                  np.array([0]).astype(np.float32)[np.newaxis, :, np.newaxis, np.newaxis]),\n                              std=torch.tensor(np.array([1]).astype(np.float32)[np.newaxis, :, np.newaxis, np.newaxis]))\n\n\nconfig = TrainingConfing()\n\n\nparser = argparse.ArgumentParser()\n\nparser.add_argument('--resume', default=None, type=str, metavar='PATH',\n                 help='path to latest checkpoint (default: none)')\nparser.add_argument('-b', '--batch_size', default=256, type=int,\n                 metavar='N', help='mini-batch size')\nparser.add_argument('-d', type=int, default=0, help='Which gpu to use')\nparser.add_argument('-adv_coef', default=1.0, type = float,\n                    help = 'Specify the weight for adversarial loss')\nparser.add_argument('--auto-continue', default=False, action = 'store_true',\n                    help = 'Continue from the latest checkpoint')\nargs = parser.parse_args()\n\n\nif __name__ == '__main__':\n    pass"
  },
  {
    "path": "experiments/CIFAR10-TRADES/pre-res18.TRADES-YOPO-3-4/dataset.py",
    "content": "import torch\nimport torchvision\nimport torchvision.transforms as transforms\nimport numpy as np\ndef create_train_dataset(batch_size = 128, root = '../data'):\n\n    transform_train = transforms.Compose([\n     transforms.RandomCrop(32, padding=4),\n     transforms.RandomHorizontalFlip(),\n     transforms.ToTensor(),\n    ])\n\n    trainset = torchvision.datasets.CIFAR10(root=root, train=True, download=True, transform=transform_train)\n    trainloader = torch.utils.data.DataLoader(trainset, batch_size=batch_size, shuffle=True, num_workers=2)\n\n    return trainloader\ndef create_test_dataset(batch_size = 128, root = '../data'):\n    transform_test = transforms.Compose([\n     transforms.ToTensor(),\n    ])\n    testset = torchvision.datasets.CIFAR10(root=root, train=False, download=True, transform=transform_test)\n    testloader = torch.utils.data.DataLoader(testset, batch_size=batch_size, shuffle=False, num_workers=2)\n    return testloader\n\n\nif __name__ == '__main__':\n    print(create_train_dataset())\n    print(create_test_dataset())\n\n"
  },
  {
    "path": "experiments/CIFAR10-TRADES/pre-res18.TRADES-YOPO-3-4/loss.py",
    "content": "import torch\nimport torch.nn as nn\nfrom torch.nn.modules.loss import _Loss\nimport torch.nn.functional as F\n\n\nclass Hamiltonian(_Loss):\n\n    def __init__(self, layer, reg_cof = 1e-4):\n        super(Hamiltonian, self).__init__()\n        self.layer = layer\n        self.reg_cof = 0\n\n\n    def forward(self, x, p):\n\n        y = self.layer(x)\n        H = torch.sum(y * p)\n        return H\n\n\n\nclass CrossEntropyWithWeightPenlty(_Loss):\n    def __init__(self, module, DEVICE, reg_cof = 1e-4):\n        super(CrossEntropyWithWeightPenlty, self).__init__()\n\n        self.reg_cof = reg_cof\n        self.criterion = nn.CrossEntropyLoss().to(DEVICE)\n        self.module = module\n\n    def __call__(self, pred, label):\n        cross_loss = self.criterion(pred, label)\n\n\n        weight_loss = cal_l2_norm(self.module)\n\n        loss = cross_loss + self.reg_cof * weight_loss\n        return loss\n\ndef cal_l2_norm(layer: torch.nn.Module):\n loss = 0.\n for name, param in layer.named_parameters():\n     if name == 'weight':\n         loss = loss + 0.5 * torch.norm(param,) ** 2\n\n return loss\n\n"
  },
  {
    "path": "experiments/CIFAR10-TRADES/pre-res18.TRADES-YOPO-3-4/network.py",
    "content": "'''Pre-activation ResNet in PyTorch.\nReference:\n[1] Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun\n    Identity Mappings in Deep Residual Networks. arXiv:1603.05027\n'''\nimport torch\nimport torch.nn as nn\nimport torch.nn.functional as F\n\n\n\nclass PreActBlock(nn.Module):\n    '''Pre-activation version of the BasicBlock.'''\n    expansion = 1\n\n    def __init__(self, in_planes, planes, stride=1):\n        super(PreActBlock, self).__init__()\n        self.bn1 = nn.BatchNorm2d(in_planes)\n        self.conv1 = nn.Conv2d(in_planes, planes, kernel_size=3, stride=stride, padding=1, bias=False)\n        self.bn2 = nn.BatchNorm2d(planes)\n        self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=1, padding=1, bias=False)\n\n        if stride != 1 or in_planes != self.expansion * planes:\n            self.shortcut = nn.Sequential(\n                nn.Conv2d(in_planes, self.expansion * planes, kernel_size=1, stride=stride, bias=False)\n            )\n\n    def forward(self, x):\n        out = F.relu(self.bn1(x))\n        shortcut = self.shortcut(out) if hasattr(self, 'shortcut') else x\n        out = self.conv1(out)\n        out = self.conv2(F.relu(self.bn2(out)))\n        out += shortcut\n        return out\n\n\nclass PreActResNet(nn.Module):\n\n    def __init__(self, block, num_blocks, num_classes=10):\n        super(PreActResNet, self).__init__()\n        self.in_planes = 64\n\n        self.other_layers = nn.ModuleList()\n\n        self.conv1 = nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1, bias=False)\n\n        self.layer_one = self.conv1\n\n\n        self.other_layer1 = self._make_layer(block, 64, num_blocks[0], stride=1)\n        self.other_layer2 = self._make_layer(block, 128, num_blocks[1], stride=2)\n        self.other_layer3 = self._make_layer(block, 256, num_blocks[2], stride=2)\n        self.other_layer4 = self._make_layer(block, 512, num_blocks[3], stride=2)\n\n        self.linear = GlobalpoolFC(512 * block.expansion, num_classes)\n        self.other_layers.append(self.linear)\n\n    def _make_layer(self, block, planes, num_blocks, stride):\n        strides = [stride] + [1] * (num_blocks - 1)\n        layers = []\n        for stride in strides:\n            layers.append(block(self.in_planes, planes, stride))\n            self.other_layers.append(layers[-1])\n\n            self.in_planes = planes * block.expansion\n        return nn.Sequential(*layers)\n\n    def forward(self, x):\n\n        x = self.layer_one(x)\n        self.layer_one_out = x\n        self.layer_one_out.requires_grad_()\n        self.layer_one_out.retain_grad()\n        x = self.layer_one_out\n\n        for layer in self.other_layers:\n            x = layer(x)\n\n        return x\n\n\nclass GlobalpoolFC(nn.Module):\n\n    def __init__(self, num_in, num_class):\n        super(GlobalpoolFC, self).__init__()\n        self.pool = nn.AdaptiveAvgPool2d(output_size=1)\n        self.fc = nn.Linear(num_in, num_class)\n\n    def forward(self, x):\n        y = self.pool(x)\n        y = y.reshape(y.shape[0], -1)\n        y = self.fc(y)\n        return y\n\n\ndef PreActResNet18():\n    return PreActResNet(PreActBlock, [2, 2, 2, 2])\n\n\ndef PreActResNet34():\n    return PreActResNet(PreActBlock, [3, 4, 6, 3])\n\n\nclass PreActBottleneck(nn.Module):\n    '''Pre-activation version of the original Bottleneck module.'''\n    expansion = 4\n\n    def __init__(self, in_planes, planes, stride=1):\n        super(PreActBottleneck, self).__init__()\n        self.bn1 = nn.BatchNorm2d(in_planes)\n  
      self.conv1 = nn.Conv2d(in_planes, planes, kernel_size=1, bias=False)\n        self.bn2 = nn.BatchNorm2d(planes)\n        self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=stride, padding=1, bias=False)\n        self.bn3 = nn.BatchNorm2d(planes)\n        self.conv3 = nn.Conv2d(planes, self.expansion*planes, kernel_size=1, bias=False)\n\n        if stride != 1 or in_planes != self.expansion*planes:\n            self.shortcut = nn.Sequential(\n                nn.Conv2d(in_planes, self.expansion*planes, kernel_size=1, stride=stride, bias=False)\n            )\n\n    def forward(self, x):\n        out = F.relu(self.bn1(x))\n        shortcut = self.shortcut(out) if hasattr(self, 'shortcut') else x\n        out = self.conv1(out)\n        out = self.conv2(F.relu(self.bn2(out)))\n        out = self.conv3(F.relu(self.bn3(out)))\n        out += shortcut\n        return out\n\ndef create_network():\n    return PreActResNet18()\n\n\ndef test():\n    net = PreActResNet18()\n    y = net((torch.randn(1, 3, 32, 32)))\n    print(y.size())\n"
  },
  {
    "path": "experiments/CIFAR10-TRADES/pre-res18.TRADES-YOPO-3-4/train.py",
    "content": "from config import config, args\nfrom dataset import create_train_dataset, create_test_dataset\nfrom network import create_network\n\nfrom utils.misc import save_args, save_checkpoint, load_checkpoint\nfrom training.train import eval_one_epoch\nfrom loss import  Hamiltonian, CrossEntropyWithWeightPenlty\nfrom training_function import train_one_epoch, FastGradientLayerOneTrainer\n\nimport torch\nimport torch.optim as optim\nimport os\n\n\nDEVICE = torch.device('cuda:{}'.format(args.d))\n\nnet = create_network()\nnet.to(DEVICE)\n\ncriterion = config.create_loss_function().to(DEVICE)\noptimizer = config.create_optimizer(net.other_layers.parameters())\nlr_scheduler = config.create_lr_scheduler(optimizer)\n\nHamiltonian_func = Hamiltonian(net.layer_one, config.weight_decay)\nlayer_one_optimizer = optim.SGD(net.layer_one.parameters(), lr = lr_scheduler.get_lr()[0], momentum=0.9, weight_decay=5e-4)\nlyaer_one_optimizer_lr_scheduler = optim.lr_scheduler.MultiStepLR(layer_one_optimizer,\n                                                                  milestones = [70, 90, 100], gamma = 0.1)\nLayerOneTrainer = FastGradientLayerOneTrainer(Hamiltonian_func, layer_one_optimizer,\n                                              config.inner_iters, config.sigma, config.eps)\n\n\nds_train = create_train_dataset(args.batch_size)\nds_val = create_test_dataset(args.batch_size)\n\nEvalAttack = config.create_evaluation_attack_method(DEVICE)\n\nnow_epoch = 0\n\nif args.auto_continue:\n    args.resume = os.path.join(config.model_dir, 'last.checkpoint')\nif args.resume is not None and os.path.isfile(args.resume):\n    now_epoch = load_checkpoint(args.resume, net, optimizer,lr_scheduler)\n\nwhile True:\n    if now_epoch > config.num_epochs:\n        break\n    now_epoch = now_epoch + 1\n\n    descrip_str = 'Training epoch:{}/{} -- lr:{}'.format(now_epoch, config.num_epochs,\n                                                                       lr_scheduler.get_lr()[0])\n    acc, yofoacc = train_one_epoch(net, ds_train, optimizer, criterion, LayerOneTrainer, config.K,\n                    DEVICE, descrip_str)\n    acc, advacc = eval_one_epoch(net, ds_val, DEVICE, EvalAttack)\n\n    lr_scheduler.step()\n    lyaer_one_optimizer_lr_scheduler.step()\n    save_checkpoint(now_epoch, net, optimizer, lr_scheduler,\n                    file_name = os.path.join(config.model_dir, 'epoch-{}.checkpoint'.format(now_epoch)))\n"
  },
  {
    "path": "experiments/CIFAR10-TRADES/pre-res18.TRADES-YOPO-3-4/training_function.py",
    "content": "import torch\nimport torch.nn as nn\nfrom config import config\n\nfrom loss import Hamiltonian, cal_l2_norm\nimport torch.nn.functional as F\n\nfrom utils.misc import torch_accuracy, AvgMeter\nfrom collections import OrderedDict\nimport torch\nfrom tqdm import tqdm\n\n\nclass FastGradientLayerOneTrainer(object):\n\n    def __init__(self, Hamiltonian_func, param_optimizer,\n                    inner_steps=2, sigma = 0.008, eps = 0.03):\n        self.inner_steps = inner_steps\n        self.sigma = sigma\n        self.eps = eps\n        self.Hamiltonian_func = Hamiltonian_func\n        self.param_optimizer = param_optimizer\n\n    def step(self, inp, p, eta):\n\n        p = p.detach()\n\n        for i in range(self.inner_steps):\n            tmp_inp = inp + eta\n            tmp_inp = torch.clamp(tmp_inp, 0, 1)\n            H = self.Hamiltonian_func(tmp_inp, p)\n\n            eta_grad = torch.autograd.grad(H, eta, only_inputs=True, retain_graph=False)[0]\n            eta_grad_sign = eta_grad.sign()\n            eta = eta - eta_grad_sign * self.sigma\n\n            eta = torch.clamp(eta, -1.0 * self.eps, self.eps)\n            eta = torch.clamp(inp + eta, 0.0, 1.0) - inp\n            eta = eta.detach()\n            eta.requires_grad_()\n            eta.retain_grad()\n\n        yofo_inp = eta + inp\n        yofo_inp = torch.clamp(yofo_inp, 0, 1)\n\n        loss = -1.0 * (self.Hamiltonian_func(yofo_inp, p) -\n                       config.weight_decay * cal_l2_norm(self.Hamiltonian_func.layer))\n\n        loss.backward()\n\n        return yofo_inp, eta\n\n\ndef train_one_epoch(net, batch_generator, optimizer,\n                    criterion, LayerOneTrainner, K,\n                    DEVICE=torch.device('cuda:0'),descrip_str='Training'):\n\n    net.train()\n    pbar = tqdm(batch_generator)\n    yofoacc = -1\n    pbar.set_description(descrip_str)\n\n    trades_criterion = torch.nn.KLDivLoss(size_average=False) #.to(DEVICE)\n\n    for i, (data, label) in enumerate(pbar):\n        data = data.to(DEVICE)\n        label = label.to(DEVICE)\n\n        net.eval()\n        eta = 0.001 * torch.randn(data.shape).cuda().detach().to(DEVICE)\n        eta.requires_grad_()\n\n        raw_soft_label = F.softmax(net(data), dim=1).detach()\n        for j in range(K):\n            pred = net(data + eta.detach())\n            with torch.enable_grad():\n                loss = trades_criterion(F.log_softmax(pred, dim = 1), raw_soft_label)#raw_soft_label.detach())\n\n            p = -1.0 * torch.autograd.grad(loss, [net.layer_one_out, ])[0]\n\n            yofo_inp, eta = LayerOneTrainner.step(data, p, eta)\n\n            with torch.no_grad():\n\n                if j == K - 1:\n                    yofo_pred = net(yofo_inp)\n                    yofo_loss = criterion(yofo_pred, label)\n                    yofoacc = torch_accuracy(yofo_pred, label, (1,))[0].item()\n\n        net.train()\n\n        optimizer.zero_grad()\n        LayerOneTrainner.param_optimizer.zero_grad()\n\n        raw_pred = net(data)\n        acc = torch_accuracy(raw_pred, label, (1,))\n        clean_acc = acc[0].item()\n        clean_loss = criterion(raw_pred, label)\n\n        adv_pred = net(torch.clamp(data + eta.detach(), 0.0, 1.0))\n        kl_loss = trades_criterion(F.log_softmax(adv_pred, dim=1),\n                                    F.softmax(raw_pred, dim=1)) / data.shape[0]\n\n        loss = clean_loss + kl_loss\n        loss.backward()\n\n        optimizer.step()\n        LayerOneTrainner.param_optimizer.step()\n\n        
optimizer.zero_grad()\n        LayerOneTrainner.param_optimizer.zero_grad()\n\n        pbar_dic = OrderedDict()\n        pbar_dic['Acc'] = '{:.2f}'.format(clean_acc)\n        pbar_dic['cleanloss'] = '{:.3f}'.format(clean_loss.item())\n        pbar_dic['klloss'] = '{:.3f}'.format(kl_loss.item())\n        pbar_dic['YofoAcc'] = '{:.2f}'.format(yofoacc)\n        pbar_dic['Yofoloss'] = '{:.3f}'.format(yofo_loss.item())\n        pbar.set_postfix(pbar_dic)\n\n    return clean_acc, yofoacc"
  },
  {
    "path": "experiments/MNIST/YOPO-5-10/config.py",
    "content": "import sys\nimport os\nimport argparse\nimport numpy as np\nimport torch\n\n\ndef add_path(path):\n    if path not in sys.path:\n        print('Adding {}'.format(path))\n        sys.path.append(path)\n\n\nabs_current_path = os.path.realpath('./')\nroot_path = os.path.join('/', *abs_current_path.split(os.path.sep)[:-3])\nlib_dir = os.path.join(root_path, 'lib')\nadd_path(lib_dir)\n\n\nfrom training.config import TrainingConfigBase, SGDOptimizerMaker, \\\n    PieceWiseConstantLrSchedulerMaker, IPGDAttackMethodMaker\n\n\nclass TrainingConfing(TrainingConfigBase):\n\n    lib_dir = lib_dir\n\n    num_epochs = 40\n    val_interval = 1\n    weight_decay = 5e-4\n\n    inner_iters = 10\n    K = 5\n    sigma = 0.01\n    eps = 0.3\n\n    create_optimizer = SGDOptimizerMaker(lr =1e-2 / K, momentum = 0.9, weight_decay = weight_decay)\n    create_lr_scheduler = PieceWiseConstantLrSchedulerMaker(milestones = [30, 35, 39], gamma = 0.1)\n\n    create_loss_function = None\n\n    create_attack_method = None\n\n    create_evaluation_attack_method = \\\n        IPGDAttackMethodMaker(eps = 0.3, sigma = 0.01, nb_iters = 40, norm = np.inf,\n                              mean=torch.tensor(\n                                  np.array([0]).astype(np.float32)[np.newaxis, :, np.newaxis, np.newaxis]),\n                              std=torch.tensor(np.array([1]).astype(np.float32)[np.newaxis, :, np.newaxis, np.newaxis]))\n\n\nconfig = TrainingConfing()\n\n\nparser = argparse.ArgumentParser()\n\nparser.add_argument('--resume', default=None, type=str, metavar='PATH',\n                 help='path to latest checkpoint (default: none)')\nparser.add_argument('-b', '--batch_size', default=256, type=int,\n                 metavar='N', help='mini-batch size')\nparser.add_argument('-d', type=int, default=0, help='Which gpu to use')\nparser.add_argument('-adv_coef', default=1.0, type = float,\n                    help = 'Specify the weight for adversarial loss')\nparser.add_argument('--auto-continue', default=False, action = 'store_true',\n                    help = 'Continue from the latest checkpoint')\nargs = parser.parse_args()\n\n\nif __name__ == '__main__':\n    pass\n"
  },
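  {
    "path": "experiments/MNIST/YOPO-5-10/lr_scaling_sketch.py",
    "content": "# Illustrative sketch added for clarity; the 1/K factor is our reading of the\n# `lr = 1e-2 / K` line in config.py. In YOPO's train_one_epoch the loss is\n# backpropagated K times per optimizer.step() without zeroing gradients in\n# between, so gradients accumulate roughly K-fold; dividing the base lr by K\n# keeps the effective step size comparable to a single backward pass.\nimport torch\n\nK = 5\nw = torch.zeros(1, requires_grad=True)\nfor _ in range(K):            # K accumulated backward passes\n    loss = 2.0 * w.sum()      # d(loss)/dw = 2 on every pass\n    loss.backward()\nprint(w.grad)                 # tensor([10.]) == K * the per-pass gradient\n"
  },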
  {
    "path": "experiments/MNIST/YOPO-5-10/dataset.py",
    "content": "import torch\nimport torchvision\nimport torchvision.transforms as transforms\nimport numpy as np\ndef create_train_dataset(batch_size = 128, root = '../data'):\n\n    transform_train = transforms.Compose([\n     transforms.ToTensor(),\n    ])\n    trainset = torchvision.datasets.MNIST(root=root, train=True, download=True, transform=transform_train)\n    trainloader = torch.utils.data.DataLoader(trainset, batch_size=batch_size, shuffle=True, num_workers=2)\n\n    return trainloader\ndef create_test_dataset(batch_size = 128, root = '../data'):\n    transform_test = transforms.Compose([\n     transforms.ToTensor(),\n    ])\n    testset = torchvision.datasets.MNIST(root=root, train=False, download=True, transform=transform_test)\n    testloader = torch.utils.data.DataLoader(testset, batch_size=batch_size, shuffle=False, num_workers=2)\n    return testloader\n\n\nif __name__ == '__main__':\n    print(create_train_dataset())\n    print(create_test_dataset())\n\n"
  },
  {
    "path": "experiments/MNIST/YOPO-5-10/eval.py",
    "content": "from config import config\nfrom dataset import create_test_dataset\nfrom network import create_network\n\nfrom training.train import eval_one_epoch\nfrom utils.misc import load_checkpoint\n\nimport argparse\nimport torch\nimport numpy as np\nimport os\n\nparser = argparse.ArgumentParser()\nparser.add_argument('--resume', '--resume', default='log/models/last.checkpoint',\n                    type=str, metavar='PATH',\n                    help='path to latest checkpoint (default:log/last.checkpoint)')\nparser.add_argument('-d', type=int, default=0, help='Which gpu to use')\nargs = parser.parse_args()\n\n\nDEVICE = torch.device('cuda:{}'.format(args.d))\ntorch.backends.cudnn.benchmark = True\n\nnet = create_network()\nnet.to(DEVICE)\n\nds_val = create_test_dataset(512)\n\nAttackMethod = config.create_evaluation_attack_method(DEVICE)\n\nif os.path.isfile(args.resume):\n    load_checkpoint(args.resume, net)\n\n\nprint('Evaluating')\nclean_acc, adv_acc = eval_one_epoch(net, ds_val, DEVICE, AttackMethod)\nprint('clean acc -- {}     adv acc -- {}'.format(clean_acc, adv_acc))\n"
  },
  {
    "path": "experiments/MNIST/YOPO-5-10/loss.py",
    "content": "import torch\nimport torch.nn as nn\nfrom torch.nn.modules.loss import _Loss\nimport torch.nn.functional as F\n\n\nclass Hamiltonian(_Loss):\n\n    def __init__(self, layer, reg_cof = 1e-4):\n        super(Hamiltonian, self).__init__()\n        self.layer = layer\n        self.reg_cof = 0\n\n\n    def forward(self, x, p):\n\n        y = self.layer(x)\n        H = torch.sum(y * p)\n        return H\n\n\n\nclass CrossEntropyWithWeightPenlty(_Loss):\n    def __init__(self, module, DEVICE, reg_cof = 1e-4):\n        super(CrossEntropyWithWeightPenlty, self).__init__()\n\n        self.reg_cof = reg_cof\n        self.criterion = nn.CrossEntropyLoss().to(DEVICE)\n        self.module = module\n\n    def __call__(self, pred, label):\n        cross_loss = self.criterion(pred, label)\n        weight_loss = cal_l2_norm(self.module)\n\n        loss = cross_loss + self.reg_cof * weight_loss\n        return loss\n\ndef cal_l2_norm(layer: torch.nn.Module):\n loss = 0.\n for name, param in layer.named_parameters():\n     if name == 'weight':\n         loss = loss + 0.5 * torch.norm(param,) ** 2\n\n return loss\n\n"
  },
  {
    "path": "experiments/MNIST/YOPO-5-10/network.py",
    "content": "import config\nfrom collections import OrderedDict\nimport torch.nn as nn\n\n\nclass SmallCNN(nn.Module):\n    def __init__(self, drop=0.5):\n        super(SmallCNN, self).__init__()\n\n        self.num_channels = 1\n        self.num_labels = 10\n\n        activ = nn.ReLU(True)\n        self.conv1 = nn.Conv2d(self.num_channels, 32, 3)\n        self.layer_one = nn.Sequential(OrderedDict([\n            ('conv1', self.conv1),\n            ('relu1', activ),]))\n\n\n        self.feature_extractor = nn.Sequential(OrderedDict([\n            ('conv2', nn.Conv2d(32, 32, 3)),\n            ('relu2', activ),\n            ('maxpool1', nn.MaxPool2d(2, 2)),\n            ('conv3', nn.Conv2d(32, 64, 3)),\n            ('relu3', activ),\n            ('conv4', nn.Conv2d(64, 64, 3)),\n            ('relu4', activ),\n            ('maxpool2', nn.MaxPool2d(2, 2)),\n        ]))\n\n        self.classifier = nn.Sequential(OrderedDict([\n            ('fc1', nn.Linear(64 * 4 * 4, 200)),\n            ('relu1', activ),\n            ('drop', nn.Dropout(drop)),\n            ('fc2', nn.Linear(200, 200)),\n            ('relu2', activ),\n            ('fc3', nn.Linear(200, self.num_labels)),\n        ]))\n        self.other_layers = nn.ModuleList()\n        self.other_layers.append(self.feature_extractor)\n        self.other_layers.append(self.classifier)\n\n        for m in self.modules():\n            if isinstance(m, (nn.Conv2d)):\n                nn.init.kaiming_normal_(m.weight)\n                if m.bias is not None:\n                    nn.init.constant_(m.bias, 0)\n            elif isinstance(m, nn.BatchNorm2d):\n                nn.init.constant_(m.weight, 1)\n                nn.init.constant_(m.bias, 0)\n        nn.init.constant_(self.classifier.fc3.weight, 0)\n        nn.init.constant_(self.classifier.fc3.bias, 0)\n\n    def forward(self, input):\n        y = self.layer_one(input)\n        self.layer_one_out = y\n        self.layer_one_out.requires_grad_()\n        self.layer_one_out.retain_grad()\n        features = self.feature_extractor(y)\n        logits = self.classifier(features.view(-1, 64 * 4 * 4))\n        return logits\n\ndef create_network():\n    return SmallCNN()\n\n\ndef test():\n    net = create_network()\n    y = net((torch.randn(1, 1, 28, 28)))\n    print(y.size())\n"
  },
  {
    "path": "experiments/MNIST/YOPO-5-10/train.py",
    "content": "from config import config, args\nfrom dataset import create_train_dataset, create_test_dataset\nfrom network import create_network\n\nfrom utils.misc import save_args, save_checkpoint, load_checkpoint\nfrom training.train import eval_one_epoch\nfrom loss import  Hamiltonian, CrossEntropyWithWeightPenlty\nfrom training_function import train_one_epoch, FastGradientLayerOneTrainer\n\nimport torch\nimport json\nimport numpy as np\n# from tensorboardX import SummaryWriter\n\n\nimport torch.nn as nn\nimport torch.optim as optim\nimport os\nimport time\n\nDEVICE = torch.device('cuda:{}'.format(args.d))\ntorch.backends.cudnn.benchmark = True\n\n# writer = SummaryWriter(log_dir=config.log_dir)\n\nnet = create_network()\nnet.to(DEVICE)\ncriterion = CrossEntropyWithWeightPenlty(net.other_layers, DEVICE, config.weight_decay)#.to(DEVICE)\noptimizer = config.create_optimizer(net.other_layers.parameters())\nlr_scheduler = config.create_lr_scheduler(optimizer)\n\n\n\nHamiltonian_func = Hamiltonian(net.layer_one, config.weight_decay)\nlayer_one_optimizer = optim.SGD(net.layer_one.parameters(), lr = lr_scheduler.get_lr()[0], momentum=0.9, weight_decay=5e-4)\nlyaer_one_optimizer_lr_scheduler = optim.lr_scheduler.MultiStepLR(layer_one_optimizer,\n                                                                  milestones = [15, 19], gamma = 0.1)\nLayerOneTrainer = FastGradientLayerOneTrainer(Hamiltonian_func, layer_one_optimizer,\n                                              config.inner_iters, config.sigma, config.eps)\n\n\n\nds_train = create_train_dataset(args.batch_size)\nds_val = create_test_dataset(args.batch_size)\n\nEvalAttack = config.create_evaluation_attack_method(DEVICE)\n\nnow_epoch = 0\n\nif args.auto_continue:\n    args.resume = os.path.join(config.model_dir, 'last.checkpoint')\nif args.resume is not None and os.path.isfile(args.resume):\n    now_epoch = load_checkpoint(args.resume, net, optimizer,lr_scheduler)\n\nnow_train_time = 0\nwhile True:\n    if now_epoch > config.num_epochs:\n        break\n    now_epoch = now_epoch + 1\n\n    descrip_str = 'Training epoch:{}/{} -- lr:{}'.format(now_epoch, config.num_epochs,\n                                                                       lr_scheduler.get_lr()[0])\n    s_time = time.time()\n    acc, yofoacc = train_one_epoch(net, ds_train, optimizer, criterion, LayerOneTrainer, config.K,\n                    DEVICE, descrip_str)\n    now_train_time = now_train_time + time.time() - s_time\n    tb_train_dic = {'Acc':acc, 'YofoAcc':yofoacc}\n    print(tb_train_dic)\n\n    # writer.add_scalars('Train', tb_train_dic, now_epoch)\n    if config.val_interval > 0 and now_epoch % config.val_interval == 0:\n        acc, advacc = eval_one_epoch(net, ds_val, DEVICE, EvalAttack)\n        tb_val_dic = {'Acc': acc, 'AdvAcc': advacc}\n        # writer.add_scalars('Val', tb_val_dic, now_epoch)\n        tb_val_dic['time'] = now_train_time\n        log_str = json.dumps(tb_val_dic)\n        with open('time.log', 'a') as f:\n            f.write(log_str+ '\\n')\n\n\n    lr_scheduler.step()\n    lyaer_one_optimizer_lr_scheduler.step()\n    save_checkpoint(now_epoch, net, optimizer, lr_scheduler,\n                    file_name = os.path.join(config.model_dir, 'epoch-{}.checkpoint'.format(now_epoch)))\n"
  },
  {
    "path": "experiments/MNIST/YOPO-5-10/training_function.py",
    "content": "import torch\nimport torch.nn as nn\nfrom config import config\n\nfrom loss import Hamiltonian, cal_l2_norm\n\nfrom utils.misc import torch_accuracy, AvgMeter\nfrom collections import OrderedDict\nimport torch\nfrom tqdm import tqdm\n\n\nclass FastGradientLayerOneTrainer(object):\n\n    def __init__(self, Hamiltonian_func, param_optimizer,\n                    inner_steps=2, sigma = 0.008, eps = 0.03):\n        self.inner_steps = inner_steps\n        self.sigma = sigma\n        self.eps = eps\n        self.Hamiltonian_func = Hamiltonian_func\n        self.param_optimizer = param_optimizer\n\n    def step(self, inp, p, eta):\n        '''\n        Perform Iterative Sign Gradient on eta\n\n        ret: inp + eta\n        '''\n\n        p = p.detach()\n\n        for i in range(self.inner_steps):\n            tmp_inp = inp + eta\n            tmp_inp = torch.clamp(tmp_inp, 0, 1)\n            H = self.Hamiltonian_func(tmp_inp, p)\n\n            eta_grad_sign = torch.autograd.grad(H, eta, only_inputs=True, retain_graph=False)[0].sign()\n\n            eta = eta - eta_grad_sign * self.sigma\n\n            eta = torch.clamp(eta, -1.0 * self.eps, self.eps)\n            eta = torch.clamp(inp + eta, 0.0, 1.0) - inp\n            eta = eta.detach()\n            eta.requires_grad_()\n            eta.retain_grad()\n\n        #self.param_optimizer.zero_grad()\n\n        yofo_inp = eta + inp\n        yofo_inp = torch.clamp(yofo_inp, 0, 1)\n\n        loss = -1.0 * self.Hamiltonian_func(yofo_inp, p)\n\n        loss.backward()\n        #self.param_optimizer.step()\n        #self.param_optimizer.zero_grad()\n\n        return yofo_inp, eta\n\n\n\n\ndef train_one_epoch(net, batch_generator, optimizer,\n                    criterion, LayerOneTrainner, K,\n                    DEVICE=torch.device('cuda:0'),descrip_str='Training'):\n    '''\n\n    :param attack_freq:  Frequencies of training with adversarial examples. 
-1 indicates natural training\n    :param AttackMethod: the attack method, None represents natural training\n    :return:  None    #(clean_acc, adv_acc)\n    '''\n    net.train()\n    pbar = tqdm(batch_generator)\n    yofoacc = -1\n    cleanacc = -1\n    cleanloss = -1\n    pbar.set_description(descrip_str)\n    for i, (data, label) in enumerate(pbar):\n        data = data.to(DEVICE)\n        label = label.to(DEVICE)\n\n        eta = torch.FloatTensor(*data.shape).uniform_(-config.eps, config.eps)\n        eta = eta.to(label.device)\n        eta.requires_grad_()\n\n        optimizer.zero_grad()\n        LayerOneTrainner.param_optimizer.zero_grad()\n\n        for j in range(K):\n            pbar_dic = OrderedDict()\n            TotalLoss = 0\n\n            pred = net(data + eta.detach())\n\n            loss = criterion(pred, label)\n            TotalLoss = TotalLoss + loss\n            wgrad = net.conv1.weight.grad\n            TotalLoss.backward()\n            net.conv1.weight.grad = wgrad\n\n\n            p = -1.0 * net.layer_one_out.grad\n            yofo_inp, eta = LayerOneTrainner.step(data, p, eta)\n\n            with torch.no_grad():\n                if j == 0:\n                    acc = torch_accuracy(pred, label, (1,))\n                    cleanacc = acc[0].item()\n                    cleanloss = loss.item()\n\n                if j == K - 1:\n                    yofo_pred = net(yofo_inp)\n                    yofoacc = torch_accuracy(yofo_pred, label, (1,))[0].item()\n\n        optimizer.step()\n        LayerOneTrainner.param_optimizer.step()\n        optimizer.zero_grad()\n        LayerOneTrainner.param_optimizer.zero_grad()\n        pbar_dic['Acc'] = '{:.2f}'.format(cleanacc)\n        pbar_dic['loss'] = '{:.2f}'.format(cleanloss)\n        pbar_dic['YofoAcc'] = '{:.2f}'.format(yofoacc)\n        pbar.set_postfix(pbar_dic)\n\n    return cleanacc, yofoacc\n\n"
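if __name__ == '__main__':\n    # Smoke test (CPU, illustrative shapes -- not the experiment's real model):\n    # one call of the inner n-step update on eta with a toy first layer.\n    toy_layer = nn.Conv2d(1, 8, 3)\n    trainer = FastGradientLayerOneTrainer(Hamiltonian(toy_layer),\n                                          torch.optim.SGD(toy_layer.parameters(), lr=0.1),\n                                          inner_steps=2, sigma=0.01, eps=0.3)\n    inp = torch.rand(2, 1, 28, 28)\n    eta = torch.zeros_like(inp, requires_grad=True)\n    p = torch.randn(2, 8, 26, 26)\n    yofo_inp, eta = trainer.step(inp, p, eta)\n    print(yofo_inp.shape, eta.abs().max().item())\n"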
  },
  {
    "path": "experiments/MNIST/pgd40/config.py",
    "content": "import sys\nimport os\nimport argparse\nimport numpy as np\nimport torch\n\ndef add_path(path):\n    if path not in sys.path:\n        print('Adding {}'.format(path))\n        sys.path.append(path)\n\nabs_current_path = os.path.realpath('./')\nroot_path = os.path.join('/', *abs_current_path.split(os.path.sep)[:-3])\nlib_dir = os.path.join(root_path, 'lib')\nadd_path(lib_dir)\n\nfrom training.config import TrainingConfigBase, SGDOptimizerMaker, \\\n    PieceWiseConstantLrSchedulerMaker, IPGDAttackMethodMaker\n\nclass TrainingConfing(TrainingConfigBase):\n\n    lib_dir = lib_dir\n\n    num_epochs = 56\n    val_interval = 1\n\n    create_optimizer = SGDOptimizerMaker(lr =1e-1, momentum = 0.9, weight_decay = 5e-4)\n    create_lr_scheduler = PieceWiseConstantLrSchedulerMaker(milestones = [50, 55], gamma = 0.1)\n\n    create_loss_function = torch.nn.CrossEntropyLoss\n\n    create_attack_method = \\\n        IPGDAttackMethodMaker(eps = 0.3, sigma = 0.01, nb_iters = 40, norm = np.inf,\n                              mean = torch.tensor(np.array([0]).astype(np.float32)[np.newaxis, :, np.newaxis, np.newaxis]),\n                              std = torch.tensor(np.array([1]).astype(np.float32)[np.newaxis, :, np.newaxis, np.newaxis]))\n\n    create_evaluation_attack_method = \\\n        IPGDAttackMethodMaker(eps = 0.3, sigma = 0.01, nb_iters = 40, norm = np.inf,\n                              mean=torch.tensor(\n                                  np.array([0]).astype(np.float32)[np.newaxis, :, np.newaxis, np.newaxis]),\n                              std=torch.tensor(np.array([1]).astype(np.float32)[np.newaxis, :, np.newaxis, np.newaxis]))\n\n\nconfig = TrainingConfing()\n\nparser = argparse.ArgumentParser()\n\nparser.add_argument('--resume', default=None, type=str, metavar='PATH',\n                 help='path to latest checkpoint (default: none)')\nparser.add_argument('-b', '--batch_size', default=256, type=int,\n                 metavar='N', help='mini-batch size')\nparser.add_argument('-d', type=int, default=0, help='Which gpu to use')\nparser.add_argument('-adv_coef', default=1.0, type = float,\n                    help = 'Specify the weight for adversarial loss')\nparser.add_argument('--auto-continue', default=False, action = 'store_true',\n                    help = 'Continue from the latest checkpoint')\nargs = parser.parse_args()\n\n\nif __name__ == '__main__':\n    pass\n"
  },
  {
    "path": "experiments/MNIST/pgd40/dataset.py",
    "content": "import torch\nimport torchvision\nimport torchvision.transforms as transforms\nimport numpy as np\ndef create_train_dataset(batch_size = 128, root = '../data'):\n\n    transform_train = transforms.Compose([\n     transforms.ToTensor(),\n    ])\n    trainset = torchvision.datasets.MNIST(root=root, train=True, download=True, transform=transform_train)\n    trainloader = torch.utils.data.DataLoader(trainset, batch_size=batch_size, shuffle=True, num_workers=2)\n\n    return trainloader\ndef create_test_dataset(batch_size = 128, root = '../data'):\n    transform_test = transforms.Compose([\n     transforms.ToTensor(),\n    ])\n    testset = torchvision.datasets.MNIST(root=root, train=False, download=True, transform=transform_test)\n    testloader = torch.utils.data.DataLoader(testset, batch_size=batch_size, shuffle=False, num_workers=2)\n    return testloader\n\n\nif __name__ == '__main__':\n    print(create_train_dataset())\n    print(create_test_dataset())\n\n"
  },
  {
    "path": "experiments/MNIST/pgd40/eval.py",
    "content": "from config import config\nfrom dataset import create_test_dataset\nfrom network import create_network\n\nfrom training.train import eval_one_epoch\nfrom utils.misc import load_checkpoint\n\nimport argparse\nimport torch\nimport numpy as np\nimport os\n\nparser = argparse.ArgumentParser()\nparser.add_argument('--resume', '--resume', default='log/models/last.checkpoint',\n                    type=str, metavar='PATH',\n                    help='path to latest checkpoint (default:log/last.checkpoint)')\nparser.add_argument('-d', type=int, default=0, help='Which gpu to use')\nargs = parser.parse_args()\n\n\nDEVICE = torch.device('cuda:{}'.format(args.d))\ntorch.backends.cudnn.benchmark = True\n\nnet = create_network()\nnet.to(DEVICE)\n\nds_val = create_test_dataset(512)\n\nAttackMethod = config.create_evaluation_attack_method(DEVICE)\n\nif os.path.isfile(args.resume):\n    load_checkpoint(args.resume, net)\n\n\nprint('Evaluating')\nclean_acc, adv_acc = eval_one_epoch(net, ds_val, DEVICE, AttackMethod)\nprint('clean acc -- {}     adv acc -- {}'.format(clean_acc, adv_acc))\n"
  },
  {
    "path": "experiments/MNIST/pgd40/network.py",
    "content": "import config\nfrom base_model.small_cnn import SmallCNN\n\n\ndef create_network():\n    return SmallCNN()\n\n\ndef test():\n    net = create_network()\n    y = net((torch.randn(1, 1, 28, 28)))\n    print(y.size())\n"
  },
  {
    "path": "experiments/MNIST/pgd40/train.py",
    "content": "from config import config, args\nfrom dataset import create_train_dataset, create_test_dataset\nfrom network import create_network\n\nfrom utils.misc import save_args, save_checkpoint, load_checkpoint\nfrom training.train import train_one_epoch, eval_one_epoch\n\nimport torch\n\nimport os\n\nDEVICE = torch.device('cuda:{}'.format(args.d))\ntorch.backends.cudnn.benchmark = True\n\nnet = create_network()\nnet.to(DEVICE)\ncriterion = config.create_loss_function().to(DEVICE)\n\noptimizer = config.create_optimizer(net.parameters())\nlr_scheduler = config.create_lr_scheduler(optimizer)\n\nds_train = create_train_dataset(args.batch_size)\nds_val = create_test_dataset(args.batch_size)\n\nTrainAttack = config.create_attack_method(DEVICE)\nEvalAttack = config.create_evaluation_attack_method(DEVICE)\n\nnow_epoch = 0\n\nif args.auto_continue:\n    args.resume = os.path.join(config.model_dir, 'last.checkpoint')\nif args.resume is not None and os.path.isfile(args.resume):\n    now_epoch = load_checkpoint(args.resume, net, optimizer,lr_scheduler)\n\nwhile True:\n    if now_epoch > config.num_epochs:\n        break\n    now_epoch = now_epoch + 1\n\n    descrip_str = 'Training epoch:{}/{} -- lr:{}'.format(now_epoch, config.num_epochs,\n                                                                       lr_scheduler.get_lr()[0])\n    train_one_epoch(net, ds_train, optimizer, criterion, DEVICE,\n                    descrip_str, TrainAttack, adv_coef = args.adv_coef)\n    if config.val_interval > 0 and now_epoch % config.val_interval == 0:\n        eval_one_epoch(net, ds_val, DEVICE, EvalAttack)\n\n    lr_scheduler.step()\n\n    save_checkpoint(now_epoch, net, optimizer, lr_scheduler,\n                    file_name = os.path.join(config.model_dir, 'epoch-{}.checkpoint'.format(now_epoch)))\n"
  },
  {
    "path": "lib/__init__.py",
    "content": ""
  },
  {
    "path": "lib/attack/__init__.py",
    "content": "from .attack_base import clip_eta\n"
  },
  {
    "path": "lib/attack/attack_base.py",
    "content": "import torch\nimport numpy as np\nfrom abc import ABCMeta, abstractmethod, abstractproperty\n\nclass AttackBase(metaclass=ABCMeta):\n    @abstractmethod\n    def attack(self, net, inp, label, target = None):\n        '''\n\n        :param inp: batched images\n        :param target: specify the indexes of target class, None represents untargeted attack\n        :return: batched adversaril images\n        '''\n        pass\n\n    @abstractmethod\n    def to(self, device):\n        pass\n\n\n\ndef clip_eta(eta, norm, eps, DEVICE = torch.device('cuda:0')):\n    '''\n    helper functions to project eta into epsilon norm ball\n    :param eta: Perturbation tensor (should be of size(N, C, H, W))\n    :param norm: which norm. should be in [1, 2, np.inf]\n    :param eps: epsilon, bound of the perturbation\n    :return: Projected perturbation\n    '''\n\n    assert norm in [1, 2, np.inf], \"norm should be in [1, 2, np.inf]\"\n\n    with torch.no_grad():\n        avoid_zero_div = torch.tensor(1e-12).to(DEVICE)\n        eps = torch.tensor(eps).to(DEVICE)\n        one = torch.tensor(1.0).to(DEVICE)\n\n        if norm == np.inf:\n            eta = torch.clamp(eta, -eps, eps)\n        else:\n            normalize = torch.norm(eta.reshape(eta.size(0), -1), p = norm, dim = -1, keepdim = False)\n            normalize = torch.max(normalize, avoid_zero_div)\n\n            normalize.unsqueeze_(dim = -1)\n            normalize.unsqueeze_(dim=-1)\n            normalize.unsqueeze_(dim=-1)\n\n            factor = torch.min(one, eps / normalize)\n            eta = eta * factor\n    return eta\n\ndef test_clip():\n\n    a = torch.rand((10, 3, 28, 28)).cuda()\n\n    epss = [0.1, 0.5, 1]\n\n    norms = [1, 2, np.inf]\n    for e, n in zip(epss, norms):\n        print(e, n)\n        c = clip_eta(a, n, e, True)\n\n        print(c)\n\nif __name__ == '__main__':\n    test_clip()\n"
  },
  {
    "path": "lib/attack/pgd.py",
    "content": "'''\nReference:\n[1] Towards Deep Learning Models Resistant to Adversarial Attacks\nAleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, Adrian Vladu\narXiv:1706.06083v3\n'''\nimport torch\nimport numpy as np\nimport os\nimport sys\nfather_dir = os.path.join('/', *os.path.realpath(__file__).split(os.path.sep)[:-2])\nif not father_dir in sys.path:\n    sys.path.append(father_dir)\nfrom attack.attack_base import AttackBase, clip_eta\n\nclass IPGD(AttackBase):\n    # ImageNet pre-trained mean and std\n    # _mean = torch.tensor(np.array([0.485, 0.456, 0.406]).astype(np.float32)[np.newaxis, :, np.newaxis, np.newaxis])\n    # _std = torch.tensor(np.array([0.229, 0.224, 0.225]).astype(np.float32)[np.newaxis, :, np.newaxis, np.newaxis])\n\n    # _mean = torch.tensor(np.array([0]).astype(np.float32)[np.newaxis, :, np.newaxis, np.newaxis])\n    # _std = torch.tensor(np.array([1.0]).astype(np.float32)[np.newaxis, :, np.newaxis, np.newaxis])\n    def __init__(self, eps = 6 / 255.0, sigma = 3 / 255.0, nb_iter = 20,\n                 norm = np.inf, DEVICE = torch.device('cpu'),\n                 mean = torch.tensor(np.array([0]).astype(np.float32)[np.newaxis, :, np.newaxis, np.newaxis]),\n                 std = torch.tensor(np.array([1.0]).astype(np.float32)[np.newaxis, :, np.newaxis, np.newaxis]), random_start = True):\n        '''\n        :param eps: maximum distortion of adversarial examples\n        :param sigma: single step size\n        :param nb_iter: number of attack iterations\n        :param norm: which norm to bound the perturbations\n        '''\n        self.eps = eps\n        self.sigma = sigma\n        self.nb_iter = nb_iter\n        self.norm = norm\n        self.criterion = torch.nn.CrossEntropyLoss().to(DEVICE)\n        self.DEVICE = DEVICE\n        self._mean = mean.to(DEVICE)\n        self._std = std.to(DEVICE)\n        self.random_start = random_start\n\n    def single_attack(self, net, inp, label, eta, target = None):\n        '''\n        Given the original image and the perturbation computed so far, computes\n        a new perturbation.\n        :param net:\n        :param inp: original image\n        :param label:\n        :param eta: perturbation computed so far\n        :return: a new perturbation\n        '''\n\n        adv_inp = inp + eta\n\n        #net.zero_grad()\n\n        pred = net(adv_inp)\n        if target is not None:\n            targets = torch.sum(pred[:, target])\n            grad_sign = torch.autograd.grad(targets, adv_in, only_inputs=True, retain_graph = False)[0].sign()\n\n        else:\n            loss = self.criterion(pred, label)\n            grad_sign = torch.autograd.grad(loss, adv_inp,\n                                            only_inputs=True, retain_graph = False)[0].sign()\n\n        adv_inp = adv_inp + grad_sign * (self.sigma / self._std)\n        tmp_adv_inp = adv_inp * self._std +  self._mean\n\n        tmp_inp = inp * self._std + self._mean\n        tmp_adv_inp = torch.clamp(tmp_adv_inp, 0, 1) ## clip into 0-1\n        #tmp_adv_inp = (tmp_adv_inp - self._mean) / self._std\n        tmp_eta = tmp_adv_inp - tmp_inp\n        tmp_eta = clip_eta(tmp_eta, norm=self.norm, eps=self.eps, DEVICE=self.DEVICE)\n\n        eta = tmp_eta/ self._std\n\n        return eta\n\n    def attack(self, net, inp, label, target = None):\n\n        if self.random_start:\n            eta = torch.FloatTensor(*inp.shape).uniform_(-self.eps, self.eps)\n        else:\n            eta = torch.zeros_like(inp)\n        eta = 
eta.to(self.DEVICE)\n        eta = (eta - self._mean) / self._std\n        net.eval()\n\n        inp.requires_grad = True\n        eta.requires_grad = True\n        for i in range(self.nb_iter):\n            eta = self.single_attack(net, inp, label, eta, target)\n            #print(i)\n\n        #print(eta.max())\n        adv_inp = inp + eta\n        tmp_adv_inp = adv_inp * self._std +  self._mean\n        tmp_adv_inp = torch.clamp(tmp_adv_inp, 0, 1)\n        adv_inp = (tmp_adv_inp - self._mean) / self._std\n\n        return adv_inp\n\n    def to(self, device):\n        self.DEVICE = device\n        self._mean = self._mean.to(device)\n        self._std = self._std.to(device)\n        self.criterion = self.criterion.to(device)\n\ndef test_IPGD():\n    pass\nif __name__ == '__main__':\n    test_IPGD()\n"
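def test_IPGD():\n    # Minimal sketch (toy classifier, illustrative hyper-parameters): check\n    # that the crafted perturbation respects the L-inf budget on [0, 1] inputs.\n    class ToyNet(torch.nn.Module):\n        def __init__(self):\n            super(ToyNet, self).__init__()\n            self.fc = torch.nn.Linear(28 * 28, 10)\n\n        def forward(self, x):\n            return self.fc(x.view(x.size(0), -1))\n\n    net = ToyNet()\n    attack = IPGD(eps = 0.3, sigma = 0.01, nb_iter = 5, norm = np.inf)\n    inp = torch.rand(4, 1, 28, 28)\n    label = torch.randint(0, 10, (4,))\n    adv_inp = attack.attack(net, inp, label)\n    print((adv_inp - inp).abs().max().item())  # <= eps, up to the [0, 1] clipping\n\n\nif __name__ == '__main__':\n    test_IPGD()\n"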
  },
  {
    "path": "lib/base_model/__init__.py",
    "content": ""
  },
  {
    "path": "lib/base_model/cifar_resnet18.py",
    "content": "'''\nResNet in PyTorch.absFor Pre-activation ResNet, see 'preact_resnet.py'.\nReference:\n    [1] Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun\n        Deep Residual Learning for Image Recognition. arXiv:1512.03385\n\nNote: cifar_resnet18 constructs the same model with that from\nhttps://github.com/kuangliu/pytorch-cifar/blob/master/models/resnet.py\n'''\n\nimport torch\nimport torch.nn as nn\nimport torch.nn.functional as F\n\n\nclass BasicBlock(nn.Module):\n    expansion = 1\n\n    def __init__(self, in_planes, planes, stride=1):\n        super(BasicBlock, self).__init__()\n        self.conv1 = nn.Conv2d(in_planes, planes, kernel_size=3, stride=stride, padding=1, bias=False)\n        self.bn1 = nn.BatchNorm2d(planes)\n        self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=1, padding=1, bias=False)\n        self.bn2 = nn.BatchNorm2d(planes)\n\n        self.shortcut = nn.Sequential()\n        if stride != 1 or in_planes != self.expansion*planes:\n            self.shortcut = nn.Sequential(\n                nn.Conv2d(in_planes, self.expansion*planes, kernel_size=1, stride=stride, bias=False),\n                nn.BatchNorm2d(self.expansion*planes)\n            )\n\n    def forward(self, x):\n        out = F.relu(self.bn1(self.conv1(x)))\n        out = self.bn2(self.conv2(out))\n        out += self.shortcut(x)\n        out = F.relu(out)\n        return out\n\n\nclass Bottleneck(nn.Module):\n    expansion = 4\n\n    def __init__(self, in_planes, planes, stride=1):\n        super(Bottleneck, self).__init__()\n        self.conv1 = nn.Conv2d(in_planes, planes, kernel_size=1, bias=False)\n        self.bn1 = nn.BatchNorm2d(planes)\n        self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=stride, padding=1, bias=False)\n        self.bn2 = nn.BatchNorm2d(planes)\n        self.conv3 = nn.Conv2d(planes, self.expansion*planes, kernel_size=1, bias=False)\n        self.bn3 = nn.BatchNorm2d(self.expansion*planes)\n\n        self.shortcut = nn.Sequential()\n        if stride != 1 or in_planes != self.expansion*planes:\n            self.shortcut = nn.Sequential(\n                nn.Conv2d(in_planes, self.expansion*planes, kernel_size=1, stride=stride, bias=False),\n                nn.BatchNorm2d(self.expansion*planes)\n            )\n\n    def forward(self, x):\n        out = F.relu(self.bn1(self.conv1(x)))\n        out = F.relu(self.bn2(self.conv2(out)))\n        out = self.bn3(self.conv3(out))\n        out += self.shortcut(x)\n        out = F.relu(out)\n        return out\n\n\nclass ResNet(nn.Module):\n    def __init__(self, block, num_blocks, num_classes=10):\n        super(ResNet, self).__init__()\n        self.in_planes = 64\n\n        self.conv1 = nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1, bias=False)\n        self.bn1 = nn.BatchNorm2d(64)\n        self.layer1 = self._make_layer(block, 64, num_blocks[0], stride=1)\n        self.layer2 = self._make_layer(block, 128, num_blocks[1], stride=2)\n        self.layer3 = self._make_layer(block, 256, num_blocks[2], stride=2)\n        self.layer4 = self._make_layer(block, 512, num_blocks[3], stride=2)\n        self.linear = nn.Linear(512*block.expansion, num_classes)\n\n    def _make_layer(self, block, planes, num_blocks, stride):\n        strides = [stride] + [1]*(num_blocks-1)\n        layers = []\n        for stride in strides:\n            layers.append(block(self.in_planes, planes, stride))\n            self.in_planes = planes * block.expansion\n        return nn.Sequential(*layers)\n\n    def 
forward(self, x):\n        out = F.relu(self.bn1(self.conv1(x)))\n        out = self.layer1(out)\n        out = self.layer2(out)\n        out = self.layer3(out)\n        out = self.layer4(out)\n        out = F.avg_pool2d(out, 4)\n        out = out.view(out.size(0), -1)\n        out = self.linear(out)\n        return out\n\n\ndef cifar_resnet18(*args, **kargs):\n    return ResNet(BasicBlock, [2,2,2,2])\n\ndef ResNet34():\n    return ResNet(BasicBlock, [3,4,6,3])\n\ndef ResNet50():\n    return ResNet(Bottleneck, [3,4,6,3])\n\ndef ResNet101():\n    return ResNet(Bottleneck, [3,4,23,3])\n\ndef ResNet152():\n    return ResNet(Bottleneck, [3,8,36,3])\n\n\ndef test():\n    net = ResNet18()\n    y = net(torch.randn(1,3,32,32))\n    print(y.size())\n\nif __name__ == '__main__':\n\n    test()\n"
  },
  {
    "path": "lib/base_model/network.py",
    "content": "'''Pre-activation ResNet in PyTorch.\nReference:\n[1] Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun\n    Identity Mappings in Deep Residual Networks. arXiv:1603.05027\n'''\nimport torch\nimport torch.nn as nn\nimport torch.nn.functional as F\n\n\n\nclass PreActBlock(nn.Module):\n    '''Pre-activation version of the BasicBlock.'''\n    expansion = 1\n\n    def __init__(self, in_planes, planes, stride=1):\n        super(PreActBlock, self).__init__()\n        self.bn1 = nn.BatchNorm2d(in_planes)\n        self.conv1 = nn.Conv2d(in_planes, planes, kernel_size=3, stride=stride, padding=1, bias=False)\n        self.bn2 = nn.BatchNorm2d(planes)\n        self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=1, padding=1, bias=False)\n\n        if stride != 1 or in_planes != self.expansion * planes:\n            self.shortcut = nn.Sequential(\n                nn.Conv2d(in_planes, self.expansion * planes, kernel_size=1, stride=stride, bias=False)\n            )\n\n    def forward(self, x):\n        out = F.relu(self.bn1(x))\n        shortcut = self.shortcut(out) if hasattr(self, 'shortcut') else x\n        out = self.conv1(out)\n        out = self.conv2(F.relu(self.bn2(out)))\n        out += shortcut\n        return out\n\n\nclass PreActResNet(nn.Module):\n\n    def __init__(self, block, num_blocks, num_classes=10):\n        super(PreActResNet, self).__init__()\n        self.in_planes = 64\n\n        self.other_layers = nn.ModuleList()\n\n        self.conv1 = nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1, bias=False)\n\n        self.layer_one = self.conv1\n\n\n        self.other_layer1 = self._make_layer(block, 64, num_blocks[0], stride=1)\n        self.other_layer2 = self._make_layer(block, 128, num_blocks[1], stride=2)\n        self.other_layer3 = self._make_layer(block, 256, num_blocks[2], stride=2)\n        self.other_layer4 = self._make_layer(block, 512, num_blocks[3], stride=2)\n\n        self.linear = GlobalpoolFC(512 * block.expansion, num_classes)\n        self.other_layers.append(self.linear)\n\n    def _make_layer(self, block, planes, num_blocks, stride):\n        strides = [stride] + [1] * (num_blocks - 1)\n        layers = []\n        for stride in strides:\n            layers.append(block(self.in_planes, planes, stride))\n            self.other_layers.append(layers[-1])\n\n            self.in_planes = planes * block.expansion\n        return nn.Sequential(*layers)\n\n    def forward(self, x):\n\n        x = self.layer_one(x)\n        self.layer_one_out = x\n        self.layer_one_out.requires_grad_()\n        self.layer_one_out.retain_grad()\n        x = self.layer_one_out\n\n        for layer in self.other_layers:\n            x = layer(x)\n\n\n        '''\n        out = self.conv1(x)\n        out = self.layer1(out)\n        out = self.layer2(out)\n        out = self.layer3(out)\n        out = self.layer4(out)\n        out = F.avg_pool2d(out, 4)\n        out = out.view(out.size(0), -1)\n        out = self.linear(out)\n        return out\n        '''\n        return x\n\nclass GlobalpoolFC(nn.Module):\n\n    def __init__(self, num_in, num_class):\n        super(GlobalpoolFC, self).__init__()\n        self.pool = nn.AdaptiveAvgPool2d(output_size=1)\n        self.fc = nn.Linear(num_in, num_class)\n\n    def forward(self, x):\n        y = self.pool(x)\n        y = y.reshape(y.shape[0], -1)\n        y = self.fc(y)\n        return y\n\n\ndef PreActResNet18():\n    return PreActResNet(PreActBlock, [2, 2, 2, 2])\n\n\ndef PreActResNet34():\n    return 
PreActResNet(PreActBlock, [3, 4, 6, 3])\n\n\nclass PreActBottleneck(nn.Module):\n    '''Pre-activation version of the original Bottleneck module.'''\n    expansion = 4\n\n    def __init__(self, in_planes, planes, stride=1):\n        super(PreActBottleneck, self).__init__()\n        self.bn1 = nn.BatchNorm2d(in_planes)\n        self.conv1 = nn.Conv2d(in_planes, planes, kernel_size=1, bias=False)\n        self.bn2 = nn.BatchNorm2d(planes)\n        self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=stride, padding=1, bias=False)\n        self.bn3 = nn.BatchNorm2d(planes)\n        self.conv3 = nn.Conv2d(planes, self.expansion*planes, kernel_size=1, bias=False)\n\n        if stride != 1 or in_planes != self.expansion*planes:\n            self.shortcut = nn.Sequential(\n                nn.Conv2d(in_planes, self.expansion*planes, kernel_size=1, stride=stride, bias=False)\n            )\n\n    def forward(self, x):\n        out = F.relu(self.bn1(x))\n        shortcut = self.shortcut(out) if hasattr(self, 'shortcut') else x\n        out = self.conv1(out)\n        out = self.conv2(F.relu(self.bn2(out)))\n        out = self.conv3(F.relu(self.bn3(out)))\n        out += shortcut\n        return out\n\ndef create_network():\n    return PreActResNet18()\n\n\ndef test():\n    net = PreActResNet18()\n    y = net((torch.randn(1, 3, 32, 32)))\n    print(y.size())\n"
  },
  {
    "path": "lib/base_model/preact_resnet.py",
    "content": "'''Pre-activation ResNet in PyTorch.\nReference:\n[1] Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun\n    Identity Mappings in Deep Residual Networks. arXiv:1603.05027\n'''\nimport torch\nimport torch.nn as nn\nimport torch.nn.functional as F\n\n\nclass PreActBlock(nn.Module):\n    '''Pre-activation version of the BasicBlock.'''\n    expansion = 1\n\n    def __init__(self, in_planes, planes, stride=1):\n        super(PreActBlock, self).__init__()\n        self.bn1 = nn.BatchNorm2d(in_planes)\n        self.conv1 = nn.Conv2d(in_planes, planes, kernel_size=3, stride=stride, padding=1, bias=False)\n        self.bn2 = nn.BatchNorm2d(planes)\n        self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=1, padding=1, bias=False)\n\n        if stride != 1 or in_planes != self.expansion * planes:\n            self.shortcut = nn.Sequential(\n                nn.Conv2d(in_planes, self.expansion * planes, kernel_size=1, stride=stride, bias=False)\n            )\n\n    def forward(self, x):\n        out = F.relu(self.bn1(x))\n        shortcut = self.shortcut(out) if hasattr(self, 'shortcut') else x\n        out = self.conv1(out)\n        out = self.conv2(F.relu(self.bn2(out)))\n        out += shortcut\n        return out\n\n\nclass PreActResNet(nn.Module):\n\n    def __init__(self, block, num_blocks, num_classes=10):\n        super(PreActResNet, self).__init__()\n        self.in_planes = 64\n\n        self.layers = nn.ModuleList()\n\n        self.conv1 = nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1, bias=False)\n\n        self.layers.append(self.conv1)\n        self.is_trainable = [True]\n\n        self.layer1 = self._make_layer(block, 64, num_blocks[0], stride=1)\n        self.layer2 = self._make_layer(block, 128, num_blocks[1], stride=2)\n        self.layer3 = self._make_layer(block, 256, num_blocks[2], stride=2)\n        self.layer4 = self._make_layer(block, 512, num_blocks[3], stride=2)\n\n        self.linear = GlobalpoolFC(512 * block.expansion, num_classes)\n        self.layers.append(self.linear)\n        self.is_trainable.append(True)\n\n    def _make_layer(self, block, planes, num_blocks, stride):\n        strides = [stride] + [1] * (num_blocks - 1)\n        layers = []\n        for stride in strides:\n            layers.append(block(self.in_planes, planes, stride))\n            self.layers.append(layers[-1])\n            self.is_trainable.append(True)\n            self.in_planes = planes * block.expansion\n        return nn.Sequential(*layers)\n\n    def forward(self, x):\n        self.inputs = []\n        self.inputs.append(x)\n\n        for layer in self.layers:\n            x = layer(x)\n            self.inputs.append(x)\n\n        '''\n        out = self.conv1(x)\n        out = self.layer1(out)\n        out = self.layer2(out)\n        out = self.layer3(out)\n        out = self.layer4(out)\n        out = F.avg_pool2d(out, 4)\n        out = out.view(out.size(0), -1)\n        out = self.linear(out)\n        return out\n        '''\n        return x\n\n\nclass GlobalpoolFC(nn.Module):\n\n    def __init__(self, num_in, num_class):\n        super(GlobalpoolFC, self).__init__()\n        self.pool = nn.AdaptiveAvgPool2d(output_size=1)\n        self.fc = nn.Linear(num_in, num_class)\n\n    def forward(self, x):\n        y = self.pool(x)\n        y = y.reshape(y.shape[0], -1)\n        y = self.fc(y)\n        return y\n\n\ndef PreActResNet18():\n    return PreActResNet(PreActBlock, [2, 2, 2, 2])\n\n\ndef PreActResNet34():\n    return PreActResNet(PreActBlock, [3, 4, 
6, 3])\n\n\nclass PreActBottleneck(nn.Module):\n    '''Pre-activation version of the original Bottleneck module.'''\n    expansion = 4\n\n    def __init__(self, in_planes, planes, stride=1):\n        super(PreActBottleneck, self).__init__()\n        self.bn1 = nn.BatchNorm2d(in_planes)\n        self.conv1 = nn.Conv2d(in_planes, planes, kernel_size=1, bias=False)\n        self.bn2 = nn.BatchNorm2d(planes)\n        self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=stride, padding=1, bias=False)\n        self.bn3 = nn.BatchNorm2d(planes)\n        self.conv3 = nn.Conv2d(planes, self.expansion*planes, kernel_size=1, bias=False)\n\n        if stride != 1 or in_planes != self.expansion*planes:\n            self.shortcut = nn.Sequential(\n                nn.Conv2d(in_planes, self.expansion*planes, kernel_size=1, stride=stride, bias=False)\n            )\n\n    def forward(self, x):\n        out = F.relu(self.bn1(x))\n        shortcut = self.shortcut(out) if hasattr(self, 'shortcut') else x\n        out = self.conv1(out)\n        out = self.conv2(F.relu(self.bn2(out)))\n        out = self.conv3(F.relu(self.bn3(out)))\n        out += shortcut\n        return out\n\n\ndef test():\n    net = PreActResNet18()\n    y = net((torch.randn(1, 3, 32, 32)))\n    print(y.size())\n"
  },
  {
    "path": "lib/base_model/small_cnn.py",
    "content": "'''\nthis code is from https://github.com/yaodongyu/TRADES/blob/master/models/small_cnn.py\n@article{Zhang2019theoretically,\n\tauthor = {Hongyang Zhang and Yaodong Yu and Jiantao Jiao and Eric P. Xing and Laurent El Ghaoui and Michael I. Jordan},\n\ttitle = {Theoretically Principled Trade-off between Robustness and Accuracy},\n\tjournal = {arXiv preprint arXiv:1901.08573},\n\tyear = {2019}\n}\n'''\nfrom collections import OrderedDict\nimport torch.nn as nn\n\n\nclass SmallCNN(nn.Module):\n    def __init__(self, drop=0.5):\n        super(SmallCNN, self).__init__()\n\n        self.num_channels = 1\n        self.num_labels = 10\n\n        activ = nn.ReLU(True)\n        self.conv1 = nn.Conv2d(self.num_channels, 32, 3)\n        self.layer_one = nn.Sequential(OrderedDict([\n            ('conv1', self.conv1),\n            ('relu1', activ),]))\n\n\n        self.feature_extractor = nn.Sequential(OrderedDict([\n            ('conv2', nn.Conv2d(32, 32, 3)),\n            ('relu2', activ),\n            ('maxpool1', nn.MaxPool2d(2, 2)),\n            ('conv3', nn.Conv2d(32, 64, 3)),\n            ('relu3', activ),\n            ('conv4', nn.Conv2d(64, 64, 3)),\n            ('relu4', activ),\n            ('maxpool2', nn.MaxPool2d(2, 2)),\n        ]))\n\n        self.classifier = nn.Sequential(OrderedDict([\n            ('fc1', nn.Linear(64 * 4 * 4, 200)),\n            ('relu1', activ),\n            ('drop', nn.Dropout(drop)),\n            ('fc2', nn.Linear(200, 200)),\n            ('relu2', activ),\n            ('fc3', nn.Linear(200, self.num_labels)),\n        ]))\n        self.other_layers = nn.ModuleList()\n        self.other_layers.append(self.feature_extractor)\n        self.other_layers.append(self.classifier)\n\n        for m in self.modules():\n            if isinstance(m, (nn.Conv2d)):\n                nn.init.kaiming_normal_(m.weight)\n                if m.bias is not None:\n                    nn.init.constant_(m.bias, 0)\n            elif isinstance(m, nn.BatchNorm2d):\n                nn.init.constant_(m.weight, 1)\n                nn.init.constant_(m.bias, 0)\n        nn.init.constant_(self.classifier.fc3.weight, 0)\n        nn.init.constant_(self.classifier.fc3.bias, 0)\n\n    def forward(self, input):\n        y = self.layer_one(input)\n        self.layer_one_out = y\n        self.layer_one_out.requires_grad_()\n        self.layer_one_out.retain_grad()\n        features = self.feature_extractor(y)\n        logits = self.classifier(features.view(-1, 64 * 4 * 4))\n        return logits\n\ndef create_network():\n    return SmallCNN()\n\n\ndef test():\n    net = create_network()\n    y = net((torch.randn(1, 1, 28, 28)))\n    print(y.size())\n"
  },
  {
    "path": "lib/base_model/wide_resnet.py",
    "content": "'''\nThis code is from https://github.com/yaodongyu/TRADES/blob/master/models/wideresnet.py/\n@article{Zhang2019theoretically,\n\tauthor = {Hongyang Zhang and Yaodong Yu and Jiantao Jiao and Eric P. Xing and Laurent El Ghaoui and Michael I. Jordan},\n\ttitle = {Theoretically Principled Trade-off between Robustness and Accuracy},\n\tjournal = {arXiv preprint arXiv:1901.08573},\n\tyear = {2019}\n}\n'''\nimport torch\nimport torch.nn as nn\nimport torch.nn.functional as F\nimport math\n\n\nclass BasicBlock(nn.Module):\n\n    def __init__(self, in_planes, out_planes, stride, dropRate=0.0):\n        super(BasicBlock, self).__init__()\n        self.bn1 = nn.BatchNorm2d(in_planes)\n        self.relu1 = nn.ReLU(inplace=True)\n        self.conv1 = nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride,\n                               padding=1, bias=False)\n        self.bn2 = nn.BatchNorm2d(out_planes)\n        self.relu2 = nn.ReLU(inplace=True)\n        self.conv2 = nn.Conv2d(out_planes, out_planes, kernel_size=3, stride=1,\n                               padding=1, bias=False)\n        self.droprate = dropRate\n        self.equalInOut = (in_planes == out_planes)\n        self.convShortcut = (not self.equalInOut) and nn.Conv2d(in_planes, out_planes, kernel_size=1, stride=stride,\n                                                                padding=0, bias=False) or None\n\n    def forward(self, x):\n        if not self.equalInOut:\n            x = self.relu1(self.bn1(x))\n        else:\n            out = self.relu1(self.bn1(x))\n        out = self.relu2(self.bn2(self.conv1(out if self.equalInOut else x)))\n        if self.droprate > 0:\n            out = F.dropout(out, p=self.droprate, training=self.training)\n        out = self.conv2(out)\n        return torch.add(x if self.equalInOut else self.convShortcut(x), out)\n\n\nclass NetworkBlock(nn.Module):\n    def __init__(self, nb_layers, in_planes, out_planes, block, stride, dropRate=0.0):\n        super(NetworkBlock, self).__init__()\n        self.layer = self._make_layer(block, in_planes, out_planes, nb_layers, stride, dropRate)\n\n    def _make_layer(self, block, in_planes, out_planes, nb_layers, stride, dropRate):\n        layers = []\n        for i in range(int(nb_layers)):\n            layers.append(block(i == 0 and in_planes or out_planes, out_planes, i == 0 and stride or 1, dropRate))\n        return nn.Sequential(*layers)\n\n    def forward(self, x):\n        return self.layer(x)\n\n\nclass WideResNet(nn.Module):\n    def __init__(self, depth=28, num_classes=10, widen_factor=10, dropRate=0.0):\n        super(WideResNet, self).__init__()\n        nChannels = [16, 16 * widen_factor, 32 * widen_factor, 64 * widen_factor]\n        assert ((depth - 4) % 6 == 0)\n        n = (depth - 4) / 6\n        block = BasicBlock\n        # 1st conv before any network block\n        self.conv1 = nn.Conv2d(3, nChannels[0], kernel_size=3, stride=1,\n                               padding=1, bias=False)\n        self.layer_one = self.conv1\n\n        self.other_layers = nn.ModuleList()\n        # 1st block\n        self.block1 = NetworkBlock(n, nChannels[0], nChannels[1], block, 1, dropRate)\n        self.other_layers.append(self.block1)\n        # 1st sub-block\n        self.sub_block1 = NetworkBlock(n, nChannels[0], nChannels[1], block, 1, dropRate)\n        self.other_layers.append(self.sub_block1)\n        # 2nd block\n        self.block2 = NetworkBlock(n, nChannels[1], nChannels[2], block, 2, dropRate)\n        
self.other_layers.append(self.block2)\n        # 3rd block\n        self.block3 = NetworkBlock(n, nChannels[2], nChannels[3], block, 2, dropRate)\n        self.other_layers.append(self.block3)\n        # global average pooling and classifier\n        self.bn1 = nn.BatchNorm2d(nChannels[3])\n        self.other_layers.append(self.bn1)\n        self.relu = nn.ReLU(inplace=True)\n        self.fc = nn.Linear(nChannels[3], num_classes)\n        self.other_layers.append(self.fc)\n        self.nChannels = nChannels[3]\n\n        '''\n        for m in self.modules():\n            if isinstance(m, nn.Conv2d):\n                n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels\n                m.weight.data.normal_(0, math.sqrt(2. / n))\n            elif isinstance(m, nn.BatchNorm2d):\n                m.weight.data.fill_(1)\n                m.bias.data.zero_()\n            elif isinstance(m, nn.Linear):\n                m.bias.data.zero_()\n        '''\n\n    def forward(self, x, ret_cls1=True):\n        out = self.conv1(x)\n        self.layer_one_out = out\n        self.layer_one_out.requires_grad_()\n        self.layer_one_out.retain_grad()\n        out = self.block1(out)\n        out = self.block2(out)\n        out = self.block3(out)\n        out = self.relu(self.bn1(out))\n        out = F.avg_pool2d(out, 8)\n        out = out.view(-1, self.nChannels)\n\n        y = self.fc(out)\n        return y\n\n\ndef create_network():\n    net = WideResNet()\n\n    return net\n\n\nif __name__ == '__main__':\n    net = create_network()\n    print(net)\n"
  },
  {
    "path": "lib/training/__init__.py",
    "content": ""
  },
  {
    "path": "lib/training/config.py",
    "content": "from abc import ABCMeta, abstractproperty, abstractmethod\nfrom typing import Tuple, List, Dict\nimport os\nimport sys\nimport torch\n\n\nclass TrainingConfigBase(metaclass=ABCMeta):\n    '''\n    Base class for training\n    '''\n\n    # directory handling\n    @property\n    def abs_current_dir(self):\n        return os.path.realpath('./')\n\n    @property\n    def log_dir(self):\n        if not os.path.exists('./log'):\n            os.mkdir('./log')\n        return os.path.join(self.abs_current_dir, 'log')\n\n    @property\n    def model_dir(self):\n        log_dir = self.log_dir\n        model_dir = os.path.join(log_dir, 'models')\n        #print(model_dir)\n        if not os.path.exists(model_dir):\n            os.mkdir(model_dir)\n        return model_dir\n\n    @abstractproperty\n    def lib_dir(self):\n        pass\n\n    # training setting\n    @abstractproperty\n    def num_epochs(self):\n        pass\n\n    @property\n    def val_interval(self):\n        '''\n        Specify how many epochs between two validation steps\n        Return <= 0 means no validation phase\n        '''\n        return 0\n\n    @abstractmethod\n    def create_optimizer(self, params) -> torch.optim.Optimizer:\n        '''\n        params (iterable): iterable of parameters to optimize or dicts defining\n                           parameter groups\n        '''\n        pass\n\n    @abstractmethod\n    def create_lr_scheduler(self, optimizer:torch.optim.Optimizer) -> torch.optim.lr_scheduler._LRScheduler:\n        pass\n\n    @abstractmethod\n    def create_loss_function(self) -> torch.nn.modules.loss._Loss:\n        pass\n\n\n    def create_attack_method(self, *inputs):\n        '''\n        Perform adversarial training against xxx adversary\n        Return None means natural training\n        '''\n        return None\n\n    # Evaluation Setting\n\n    def create_evaluation_attack_method(self, *inputs):\n        '''\n        evaluating the robustness of model against xxx adversary\n        Return None means only measuring clean accuracy\n        '''\n        return None\n\n\n\n\nclass SGDOptimizerMaker(object):\n\n    def __init__(self, lr = 0.1, momentum = 0.9, weight_decay = 1e-4):\n        self.lr = lr\n        self.momentum = momentum\n        self.weight_decay = weight_decay\n\n    def __call__(self, params):\n        return torch.optim.SGD(params, lr=self.lr, momentum=self.momentum, weight_decay=self.weight_decay)\n\n\nclass PieceWiseConstantLrSchedulerMaker(object):\n\n    def __init__(self, milestones:List[int], gamma:float = 0.1):\n        self.milestones = milestones\n        self.gamma = gamma\n\n    def __call__(self, optimizer):\n        return torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=self.milestones, gamma=self.gamma)\n\nclass IPGDAttackMethodMaker(object):\n\n    def __init__(self, eps, sigma, nb_iters, norm, mean, std):\n        self.eps = eps\n        self.sigma = sigma\n        self.nb_iters = nb_iters\n        self.norm = norm\n        self.mean = mean\n        self.std = std\n\n    def __call__(self, DEVICE):\n        father_dir = os.path.join('/', *os.path.realpath(__file__).split(os.path.sep)[:-2])\n        # print(father_dir)\n        if not father_dir in sys.path:\n            sys.path.append(father_dir)\n        from attack.pgd import IPGD\n        return IPGD(self.eps, self.sigma, self.nb_iters, self.norm, DEVICE, self.mean, self.std)\n\nclass LambdaLrSchedulerMaker(object):\n\n\n    def __init__(self, func, last_epoch = -1):\n        assert 
callable(func)\n\n        self.func = func\n        self.last_epoch = last_epoch\n\n    def __call__(self, parameters):\n        from torch.optim.lr_scheduler import LambdaLR\n        lr_schduler = LambdaLR(parameters, self.func, self.last_epoch)\n        return lr_schduler\n"
  },
  {
    "path": "lib/training/train.py",
    "content": "import os\nimport sys\nfather_dir = os.path.join('/',  *os.path.realpath(__file__).split(os.path.sep)[:-2])\n#print(father_dir)\nif not father_dir in sys.path:\n    sys.path.append(father_dir)\nfrom utils.misc import torch_accuracy, AvgMeter\nfrom collections import OrderedDict\nimport torch\nfrom tqdm import tqdm\n\ndef train_one_epoch(net, batch_generator, optimizer,\n                    criterion, DEVICE=torch.device('cuda:0'),\n                    descrip_str='Training', AttackMethod = None, adv_coef = 1.0):\n    '''\n\n    :param attack_freq:  Frequencies of training with adversarial examples. -1 indicates natural training\n    :param AttackMethod: the attack method, None represents natural training\n    :return:  None    #(clean_acc, adv_acc)\n    '''\n    net.train()\n    pbar = tqdm(batch_generator)\n    advacc = -1\n    advloss = -1\n    cleanacc = -1\n    cleanloss = -1\n    pbar.set_description(descrip_str)\n    for i, (data, label) in enumerate(pbar):\n        data = data.to(DEVICE)\n        label = label.to(DEVICE)\n\n        optimizer.zero_grad()\n\n        pbar_dic = OrderedDict()\n        TotalLoss = 0\n\n        if AttackMethod is not None:\n            adv_inp = AttackMethod.attack(net, data, label)\n            optimizer.zero_grad()\n            net.train()\n            pred = net(adv_inp)\n            loss = criterion(pred, label)\n\n            acc = torch_accuracy(pred, label, (1,))\n            advacc = acc[0].item()\n            advloss = loss.item()\n            #TotalLoss = TotalLoss + loss * adv_coef\n            (loss * adv_coef).backward()\n\n\n        pred = net(data)\n\n        loss = criterion(pred, label)\n        #TotalLoss = TotalLoss + loss\n        loss.backward()\n        #TotalLoss.backward()\n        #param = next(net.parameters())\n        #grad_mean = torch.mean(param.grad)\n\n        optimizer.step()\n        acc = torch_accuracy(pred, label, (1,))\n        cleanacc = acc[0].item()\n        cleanloss = loss.item()\n        #pbar_dic['grad'] = '{}'.format(grad_mean)\n        pbar_dic['Acc'] = '{:.2f}'.format(cleanacc)\n        pbar_dic['loss'] = '{:.2f}'.format(cleanloss)\n        pbar_dic['AdvAcc'] = '{:.2f}'.format(advacc)\n        pbar_dic['Advloss'] = '{:.2f}'.format(advloss)\n        pbar.set_postfix(pbar_dic)\n\n\ndef eval_one_epoch(net, batch_generator,  DEVICE=torch.device('cuda:0'), AttackMethod = None):\n    net.eval()\n    pbar = tqdm(batch_generator)\n    clean_accuracy = AvgMeter()\n    adv_accuracy = AvgMeter()\n\n    pbar.set_description('Evaluating')\n    for (data, label) in pbar:\n        data = data.to(DEVICE)\n        label = label.to(DEVICE)\n\n        with torch.no_grad():\n            pred = net(data)\n            acc = torch_accuracy(pred, label, (1,))\n            clean_accuracy.update(acc[0].item())\n\n        if AttackMethod is not None:\n            adv_inp = AttackMethod.attack(net, data, label)\n\n            with torch.no_grad():\n                pred = net(adv_inp)\n                acc = torch_accuracy(pred, label, (1,))\n                adv_accuracy.update(acc[0].item())\n\n        pbar_dic = OrderedDict()\n        pbar_dic['CleanAcc'] = '{:.2f}'.format(clean_accuracy.mean)\n        pbar_dic['AdvAcc'] = '{:.2f}'.format(adv_accuracy.mean)\n\n        pbar.set_postfix(pbar_dic)\n\n        adv_acc = adv_accuracy.mean if AttackMethod is not None else 0\n    return clean_accuracy.mean, adv_acc\n"
  },
  {
    "path": "lib/utils/__init__.py",
    "content": ""
  },
  {
    "path": "lib/utils/misc.py",
    "content": "import math\nimport os\nfrom typing import Tuple, List, Dict\nimport torch\n\ndef torch_accuracy(output, target, topk=(1,)) -> List[torch.Tensor]:\n    '''\n    param output, target: should be torch Variable\n    '''\n    # assert isinstance(output, torch.cuda.Tensor), 'expecting Torch Tensor'\n    # assert isinstance(target, torch.Tensor), 'expecting Torch Tensor'\n    # print(type(output))\n\n    topn = max(topk)\n    batch_size = output.size(0)\n\n    _, pred = output.topk(topn, 1, True, True)\n    pred = pred.t()\n\n    is_correct = pred.eq(target.view(1, -1).expand_as(pred))\n\n    ans = []\n    for i in topk:\n        is_correct_i = is_correct[:i].view(-1).float().sum(0, keepdim=True)\n        ans.append(is_correct_i.mul_(100.0 / batch_size))\n\n    return ans\n\nclass AvgMeter(object):\n    '''\n    Computing mean\n    '''\n    name = 'No name'\n\n    def __init__(self, name='No name'):\n        self.name = name\n        self.reset()\n\n    def reset(self):\n        self.sum = 0\n        self.mean = 0\n        self.num = 0\n        self.now = 0\n\n    def update(self, mean_var, count=1):\n        if math.isnan(mean_var):\n            mean_var = 1e6\n            print('Avgmeter getting Nan!')\n        self.now = mean_var\n        self.num += count\n\n        self.sum += mean_var * count\n        self.mean = float(self.sum) / self.num\n\ndef save_args(args, save_dir = None):\n    if save_dir == None:\n        param_path = os.path.join(args.resume, \"params.json\")\n    else:\n        param_path = os.path.join(save_dir, 'params.json')\n\n    #logger.info(\"[*] MODEL dir: %s\" % args.resume)\n    #logger.info(\"[*] PARAM path: %s\" % param_path)\n\n    with open(param_path, 'w') as fp:\n        json.dump(args.__dict__, fp, indent=4, sort_keys=True)\n\n\ndef mkdir(path):\n    if not os.path.exists(path):\n        print('creating dir {}'.format(path))\n        os.mkdir(path)\n\ndef save_checkpoint(now_epoch, net, optimizer, lr_scheduler, file_name):\n    checkpoint = {'epoch': now_epoch,\n                  'state_dict': net.state_dict(),\n                  'optimizer_state_dict': optimizer.state_dict(),\n                  'lr_scheduler_state_dict':lr_scheduler.state_dict()}\n    if os.path.exists(file_name):\n        print('Overwriting {}'.format(file_name))\n    torch.save(checkpoint, file_name)\n    link_name = os.path.join('/', *file_name.split(os.path.sep)[:-1], 'last.checkpoint')\n    #print(link_name)\n    make_symlink(source = file_name, link_name=link_name)\n\ndef load_checkpoint(file_name, net = None, optimizer = None, lr_scheduler = None):\n    if os.path.isfile(file_name):\n        print(\"=> loading checkpoint '{}'\".format(file_name))\n        check_point = torch.load(file_name)\n        if net is not None:\n            print('Loading network state dict')\n            net.load_state_dict(check_point['state_dict'])\n        if optimizer is not None:\n            print('Loading optimizer state dict')\n            optimizer.load_state_dict(check_point['optimizer_state_dict'])\n        if lr_scheduler is not None:\n            print('Loading lr_scheduler state dict')\n            lr_scheduler.load_state_dict(check_point['lr_scheduler_state_dict'])\n\n        return check_point['epoch']\n    else:\n        print(\"=> no checkpoint found at '{}'\".format(file_name))\n\n\ndef make_symlink(source, link_name):\n    '''\n    Note: overwriting enabled!\n    '''\n    if os.path.exists(link_name):\n        #print(\"Link name already exist! 
Removing '{}' and overwriting\".format(link_name))\n        os.remove(link_name)\n    if os.path.exists(source):\n        os.symlink(source, link_name)\n        return\n    else:\n        print('Source path not exists')\n    #print('SymLink Wrong!')\n\ndef add_path(path):\n    if path not in sys.path:\n        print('Adding {}'.format(path))\n        sys.path.append(path)\n"
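if __name__ == '__main__':\n    # Sanity check for torch_accuracy (illustrative): top-1/top-5 accuracy of\n    # random logits should hover around 10% / 50% for 10 classes.\n    logits = torch.randn(128, 10)\n    target = torch.randint(0, 10, (128,))\n    top1, top5 = torch_accuracy(logits, target, topk=(1, 5))\n    print(top1.item(), top5.item())\n"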
  }
]