[
  {
    "path": ".gitignore",
    "content": "__pycache__/\n.vscode/\nids/\nmodel/"
  },
  {
    "path": "LICENSE",
    "content": "MIT License\n\nCopyright (c) 2017 habrman\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the \"Software\"), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n"
  },
  {
    "path": "README.md",
    "content": "# FaceRecognition\nWebcam face recognition using TensorFlow and OpenCV.\nThe application tries to find faces in the webcam image and match them against images in an id folder using deep neural networks.\n\n## Dependencies\n*   OpenCV\n*   TensorFlow\n*   Scikit-learn\n*   easygui\n\n## Inspiration\nModels, training code and inspiration can be found in the [facenet](https://github.com/davidsandberg/facenet) repository.\n[Multi-task Cascaded Convolutional Networks](https://kpzhang93.github.io/MTCNN_face_detection_alignment/index.html) are used for face and facial landmark detection while an [Inception Resnet](https://arxiv.org/abs/1602.07261) is used for ID classification.\nA direct link to the pretrained Inception Resnet model can be found [here](https://drive.google.com/file/d/0B5MzpY9kBtDVZ2RpVDYwWmxoSUk).\n\n## How to\nGet the [model from facenet](https://drive.google.com/file/d/0B5MzpY9kBtDVZ2RpVDYwWmxoSUk) and set up your id folder.\nThe id folder should contain subfolders, each containing at least one image of one person. Name each subfolder after the person it contains, since this name is used as output when a match is found.\n\nE.g. id folder named `ids` containing subfolders `Adam` and `Eve`, each containing images of the respective person.\n\n```bash\n├── ids\n│   ├── Adam\n│   │   ├── Adam0.png\n│   │   ├── Adam1.png\n│   ├── Eve\n│   │   ├── Eve0.png\n```\nDownload and unpack the [model](https://drive.google.com/file/d/0B5MzpY9kBtDVZ2RpVDYwWmxoSUk) to a folder and run `python3 main.py ./folder/model.pb ./ids/` to start the program. Make sure to replace `./folder/model.pb` with the path to the downloaded model.\n\nVisualization hotkeys:\n*   l - toggle facial landmarks\n*   b - toggle bounding box\n*   i - toggle id\n*   f - toggle frames per second\n*   s - save the face detections in the current image to the id folder\n*   q - quit\n\n![alt text](https://github.com/habrman/FaceRecognition/blob/master/example.png)"
  },
  {
    "path": "detect_and_align.py",
    "content": "from six import string_types, iteritems\r\nimport tensorflow as tf\r\nimport numpy as np\r\nimport cv2\r\nimport os\r\n\r\n\r\ndef detect_faces(img, mtcnn):\r\n    margin = 44\r\n    image_size = 160\r\n\r\n    img_size = np.asarray(img.shape)[0:2]\r\n    bounding_boxes, landmarks = detect_face(img, mtcnn[\"pnet\"], mtcnn[\"rnet\"], mtcnn[\"onet\"])\r\n    nrof_bb = bounding_boxes.shape[0]\r\n    padded_bounding_boxes = []\r\n    face_patches = []\r\n\r\n    if nrof_bb > 0:\r\n        landmarks = np.stack(landmarks)\r\n        landmarks = np.transpose(landmarks, (1, 0))\r\n        for i in range(nrof_bb):\r\n            det = np.squeeze(bounding_boxes[i, 0:4])\r\n            bb = np.zeros(4, dtype=np.int32)\r\n            bb[0] = np.maximum(det[0] - margin / 2, 0)\r\n            bb[1] = np.maximum(det[1] - margin / 2, 0)\r\n            bb[2] = np.minimum(det[2] + margin / 2, img_size[1])\r\n            bb[3] = np.minimum(det[3] + margin / 2, img_size[0])\r\n            cropped = img[bb[1] : bb[3], bb[0] : bb[2], :]\r\n            aligned = cv2.resize(cropped, (image_size, image_size))\r\n            prewhitened = prewhiten(aligned)\r\n            padded_bounding_boxes.append(bb)\r\n            face_patches.append(prewhitened)\r\n\r\n    return face_patches, padded_bounding_boxes, landmarks\r\n\r\n\r\ndef prewhiten(x):\r\n    mean = np.mean(x)\r\n    std = np.std(x)\r\n    std_adj = np.maximum(std, 1.0 / np.sqrt(x.size))\r\n    y = np.multiply(np.subtract(x, mean), 1 / std_adj)\r\n    return y\r\n\r\n\r\ndef imresample(img, sz):\r\n    im_data = cv2.resize(img, (sz[1], sz[0]), interpolation=cv2.INTER_AREA)\r\n    return im_data\r\n\r\n\r\ndef generateBoundingBox(imap, reg, scale, t):\r\n    # use heatmap to generate bounding boxes\r\n    stride = 2\r\n    cellsize = 12\r\n\r\n    imap = np.transpose(imap)\r\n    dx1 = np.transpose(reg[:, :, 0])\r\n    dy1 = np.transpose(reg[:, :, 1])\r\n    dx2 = np.transpose(reg[:, :, 2])\r\n    dy2 = 
np.transpose(reg[:, :, 3])\r\n    y, x = np.where(imap >= t)\r\n    if y.shape[0] == 1:\r\n        dx1 = np.flipud(dx1)\r\n        dy1 = np.flipud(dy1)\r\n        dx2 = np.flipud(dx2)\r\n        dy2 = np.flipud(dy2)\r\n    score = imap[(y, x)]\r\n    reg = np.transpose(np.vstack([dx1[(y, x)], dy1[(y, x)], dx2[(y, x)], dy2[(y, x)]]))\r\n    if reg.size == 0:\r\n        reg = np.empty((0, 3))\r\n    bb = np.transpose(np.vstack([y, x]))\r\n    q1 = np.fix((stride * bb + 1) / scale)\r\n    q2 = np.fix((stride * bb + cellsize - 1 + 1) / scale)\r\n    boundingbox = np.hstack([q1, q2, np.expand_dims(score, 1), reg])\r\n    return boundingbox, reg\r\n\r\n\r\ndef nms(boxes, threshold, method):\r\n    if boxes.size == 0:\r\n        return np.empty((0, 3))\r\n    x1 = boxes[:, 0]\r\n    y1 = boxes[:, 1]\r\n    x2 = boxes[:, 2]\r\n    y2 = boxes[:, 3]\r\n    s = boxes[:, 4]\r\n    area = (x2 - x1 + 1) * (y2 - y1 + 1)\r\n    I = np.argsort(s)\r\n    pick = np.zeros_like(s, dtype=np.int16)\r\n    counter = 0\r\n    while I.size > 0:\r\n        i = I[-1]\r\n        pick[counter] = i\r\n        counter += 1\r\n        idx = I[0:-1]\r\n        xx1 = np.maximum(x1[i], x1[idx])\r\n        yy1 = np.maximum(y1[i], y1[idx])\r\n        xx2 = np.minimum(x2[i], x2[idx])\r\n        yy2 = np.minimum(y2[i], y2[idx])\r\n        w = np.maximum(0.0, xx2 - xx1 + 1)\r\n        h = np.maximum(0.0, yy2 - yy1 + 1)\r\n        inter = w * h\r\n        # Compare strings by value; `is` tests object identity and is unreliable here\r\n        if method == \"Min\":\r\n            o = inter / np.minimum(area[i], area[idx])\r\n        else:\r\n            o = inter / (area[i] + area[idx] - inter)\r\n        I = I[np.where(o <= threshold)]\r\n    pick = pick[0:counter]\r\n    return pick\r\n\r\n\r\ndef rerec(bboxA):\r\n    # convert bboxA to square\r\n    h = bboxA[:, 3] - bboxA[:, 1]\r\n    w = bboxA[:, 2] - bboxA[:, 0]\r\n    l = np.maximum(w, h)\r\n    bboxA[:, 0] = bboxA[:, 0] + w * 0.5 - l * 0.5\r\n    bboxA[:, 1] = bboxA[:, 1] + h * 0.5 - l * 0.5\r\n    bboxA[:, 2:4] = bboxA[:, 0:2] + 
np.transpose(np.tile(l, (2, 1)))\r\n    return bboxA\r\n\r\n\r\ndef pad(total_boxes, w, h):\r\n    # compute the padding coordinates (pad the bounding boxes to square)\r\n    tmpw = (total_boxes[:, 2] - total_boxes[:, 0] + 1).astype(np.int32)\r\n    tmph = (total_boxes[:, 3] - total_boxes[:, 1] + 1).astype(np.int32)\r\n    numbox = total_boxes.shape[0]\r\n\r\n    dx = np.ones((numbox), dtype=np.int32)\r\n    dy = np.ones((numbox), dtype=np.int32)\r\n    edx = tmpw.copy().astype(np.int32)\r\n    edy = tmph.copy().astype(np.int32)\r\n\r\n    x = total_boxes[:, 0].copy().astype(np.int32)\r\n    y = total_boxes[:, 1].copy().astype(np.int32)\r\n    ex = total_boxes[:, 2].copy().astype(np.int32)\r\n    ey = total_boxes[:, 3].copy().astype(np.int32)\r\n\r\n    tmp = np.where(ex > w)\r\n    edx.flat[tmp] = np.expand_dims(-ex[tmp] + w + tmpw[tmp], 1)\r\n    ex[tmp] = w\r\n\r\n    tmp = np.where(ey > h)\r\n    edy.flat[tmp] = np.expand_dims(-ey[tmp] + h + tmph[tmp], 1)\r\n    ey[tmp] = h\r\n\r\n    tmp = np.where(x < 1)\r\n    dx.flat[tmp] = np.expand_dims(2 - x[tmp], 1)\r\n    x[tmp] = 1\r\n\r\n    tmp = np.where(y < 1)\r\n    dy.flat[tmp] = np.expand_dims(2 - y[tmp], 1)\r\n    y[tmp] = 1\r\n\r\n    return dy, edy, dx, edx, y, ey, x, ex, tmpw, tmph\r\n\r\n\r\ndef bbreg(boundingbox, reg):\r\n    # calibrate bounding boxes\r\n    if reg.shape[1] == 1:\r\n        reg = np.reshape(reg, (reg.shape[2], reg.shape[3]))\r\n\r\n    w = boundingbox[:, 2] - boundingbox[:, 0] + 1\r\n    h = boundingbox[:, 3] - boundingbox[:, 1] + 1\r\n    b1 = boundingbox[:, 0] + reg[:, 0] * w\r\n    b2 = boundingbox[:, 1] + reg[:, 1] * h\r\n    b3 = boundingbox[:, 2] + reg[:, 2] * w\r\n    b4 = boundingbox[:, 3] + reg[:, 3] * h\r\n    boundingbox[:, 0:4] = np.transpose(np.vstack([b1, b2, b3, b4]))\r\n    return boundingbox\r\n\r\n\r\ndef layer(op):\r\n    def layer_decorated(self, *args, **kwargs):\r\n        # Automatically set a name if not provided.\r\n        name = kwargs.setdefault(\"name\", 
self.get_unique_name(op.__name__))\r\n        # Figure out the layer inputs.\r\n        if len(self.terminals) == 0:\r\n            raise RuntimeError(\"No input variables found for layer %s.\" % name)\r\n        elif len(self.terminals) == 1:\r\n            layer_input = self.terminals[0]\r\n        else:\r\n            layer_input = list(self.terminals)\r\n        # Perform the operation and get the output.\r\n        layer_output = op(self, layer_input, *args, **kwargs)\r\n        # Add to layer LUT.\r\n        self.layers[name] = layer_output\r\n        # This output is now the input for the next layer.\r\n        self.feed(layer_output)\r\n        # Return self for chained calls.\r\n        return self\r\n\r\n    return layer_decorated\r\n\r\n\r\nclass Network(object):\r\n    def __init__(self, inputs, trainable=True):\r\n        # The input nodes for this network\r\n        self.inputs = inputs\r\n        # The current list of terminal nodes\r\n        self.terminals = []\r\n        # Mapping from layer names to layers\r\n        self.layers = dict(inputs)\r\n        # If true, the resulting variables are set as trainable\r\n        self.trainable = trainable\r\n\r\n        self.setup()\r\n\r\n    def setup(self):\r\n        \"\"\"Construct the network. 
\"\"\"\r\n        raise NotImplementedError(\"Must be implemented by the subclass.\")\r\n\r\n    def load(self, data_path, session, ignore_missing=False):\r\n        \"\"\"Load network weights.\r\n        data_path: The path to the numpy-serialized network weights\r\n        session: The current TensorFlow session\r\n        ignore_missing: If true, serialized weights for missing layers are ignored.\r\n        \"\"\"\r\n        data_dict = np.load(data_path, encoding=\"latin1\", allow_pickle=True).item()\r\n\r\n        for op_name in data_dict:\r\n            with tf.variable_scope(op_name, reuse=True):\r\n                for param_name, data in iteritems(data_dict[op_name]):\r\n                    try:\r\n                        var = tf.get_variable(param_name)\r\n                        session.run(var.assign(data))\r\n                    except ValueError:\r\n                        if not ignore_missing:\r\n                            raise\r\n\r\n    def feed(self, *args):\r\n        \"\"\"Set the input(s) for the next operation by replacing the terminal nodes.\r\n        The arguments can be either layer names or the actual layers.\r\n        \"\"\"\r\n        assert len(args) != 0\r\n        self.terminals = []\r\n        for fed_layer in args:\r\n            if isinstance(fed_layer, string_types):\r\n                try:\r\n                    fed_layer = self.layers[fed_layer]\r\n                except KeyError:\r\n                    raise KeyError(\"Unknown layer name fed: %s\" % fed_layer)\r\n            self.terminals.append(fed_layer)\r\n        return self\r\n\r\n    def get_output(self):\r\n        \"\"\"Returns the current network output.\"\"\"\r\n        return self.terminals[-1]\r\n\r\n    def get_unique_name(self, prefix):\r\n        \"\"\"Returns an index-suffixed unique name for the given prefix.\r\n        This is used for auto-generating layer names based on the type-prefix.\r\n        \"\"\"\r\n        ident = sum(t.startswith(prefix) for 
t, _ in self.layers.items()) + 1\r\n        return \"%s_%d\" % (prefix, ident)\r\n\r\n    def make_var(self, name, shape):\r\n        \"\"\"Creates a new TensorFlow variable.\"\"\"\r\n        return tf.get_variable(name, shape, trainable=self.trainable)\r\n\r\n    def validate_padding(self, padding):\r\n        \"\"\"Verifies that the padding is one of the supported ones.\"\"\"\r\n        assert padding in (\"SAME\", \"VALID\")\r\n\r\n    @layer\r\n    def conv(self, inp, k_h, k_w, c_o, s_h, s_w, name, relu=True, padding=\"SAME\", group=1, biased=True):\r\n        # Verify that the padding is acceptable\r\n        self.validate_padding(padding)\r\n        # Get the number of channels in the input\r\n        c_i = int(inp.get_shape()[-1])\r\n        # Verify that the grouping parameter is valid\r\n        assert c_i % group == 0\r\n        assert c_o % group == 0\r\n\r\n        # Convolution for a given input and kernel\r\n        def convolve(i, k):\r\n            return tf.nn.conv2d(i, k, [1, s_h, s_w, 1], padding=padding)\r\n\r\n        with tf.variable_scope(name) as scope:\r\n            kernel = self.make_var(\"weights\", shape=[k_h, k_w, c_i // group, c_o])\r\n            # This is the common-case. 
Convolve the input without any further complications.\r\n            output = convolve(inp, kernel)\r\n            # Add the biases\r\n            if biased:\r\n                biases = self.make_var(\"biases\", [c_o])\r\n                output = tf.nn.bias_add(output, biases)\r\n            if relu:\r\n                # ReLU non-linearity\r\n                output = tf.nn.relu(output, name=scope.name)\r\n            return output\r\n\r\n    @layer\r\n    def prelu(self, inp, name):\r\n        with tf.variable_scope(name):\r\n            i = int(inp.get_shape()[-1])\r\n            alpha = self.make_var(\"alpha\", shape=(i,))\r\n            output = tf.nn.relu(inp) + tf.multiply(alpha, -tf.nn.relu(-inp))\r\n        return output\r\n\r\n    @layer\r\n    def max_pool(self, inp, k_h, k_w, s_h, s_w, name, padding=\"SAME\"):\r\n        self.validate_padding(padding)\r\n        return tf.nn.max_pool(inp, ksize=[1, k_h, k_w, 1], strides=[1, s_h, s_w, 1], padding=padding, name=name)\r\n\r\n    @layer\r\n    def fc(self, inp, num_out, name, relu=True):\r\n        with tf.variable_scope(name):\r\n            input_shape = inp.get_shape()\r\n            if input_shape.ndims == 4:\r\n                # The input is spatial. 
Vectorize it first.\r\n                dim = 1\r\n                for d in input_shape[1:].as_list():\r\n                    dim *= int(d)\r\n                feed_in = tf.reshape(inp, [-1, dim])\r\n            else:\r\n                feed_in, dim = (inp, input_shape[-1].value)\r\n            weights = self.make_var(\"weights\", shape=[dim, num_out])\r\n            biases = self.make_var(\"biases\", [num_out])\r\n            op = tf.nn.relu_layer if relu else tf.nn.xw_plus_b\r\n            fc = op(feed_in, weights, biases, name=name)\r\n            return fc\r\n\r\n    @layer\r\n    def softmax(self, target, axis, name=None):\r\n        max_axis = tf.reduce_max(target, axis, keep_dims=True)\r\n        target_exp = tf.exp(target - max_axis)\r\n        normalize = tf.reduce_sum(target_exp, axis, keep_dims=True)\r\n        softmax = tf.div(target_exp, normalize, name)\r\n        return softmax\r\n\r\n\r\nclass PNet(Network):\r\n    def setup(self):\r\n        (\r\n            self.feed(\"data\")\r\n            .conv(3, 3, 10, 1, 1, padding=\"VALID\", relu=False, name=\"conv1\")\r\n            .prelu(name=\"PReLU1\")\r\n            .max_pool(2, 2, 2, 2, name=\"pool1\")\r\n            .conv(3, 3, 16, 1, 1, padding=\"VALID\", relu=False, name=\"conv2\")\r\n            .prelu(name=\"PReLU2\")\r\n            .conv(3, 3, 32, 1, 1, padding=\"VALID\", relu=False, name=\"conv3\")\r\n            .prelu(name=\"PReLU3\")\r\n            .conv(1, 1, 2, 1, 1, relu=False, name=\"conv4-1\")\r\n            .softmax(3, name=\"prob1\")\r\n        )\r\n\r\n        (self.feed(\"PReLU3\").conv(1, 1, 4, 1, 1, relu=False, name=\"conv4-2\"))\r\n\r\n\r\nclass RNet(Network):\r\n    def setup(self):\r\n        (\r\n            self.feed(\"data\")\r\n            .conv(3, 3, 28, 1, 1, padding=\"VALID\", relu=False, name=\"conv1\")\r\n            .prelu(name=\"prelu1\")\r\n            .max_pool(3, 3, 2, 2, name=\"pool1\")\r\n            .conv(3, 3, 48, 1, 1, padding=\"VALID\", relu=False, 
name=\"conv2\")\r\n            .prelu(name=\"prelu2\")\r\n            .max_pool(3, 3, 2, 2, padding=\"VALID\", name=\"pool2\")\r\n            .conv(2, 2, 64, 1, 1, padding=\"VALID\", relu=False, name=\"conv3\")\r\n            .prelu(name=\"prelu3\")\r\n            .fc(128, relu=False, name=\"conv4\")\r\n            .prelu(name=\"prelu4\")\r\n            .fc(2, relu=False, name=\"conv5-1\")\r\n            .softmax(1, name=\"prob1\")\r\n        )\r\n\r\n        (self.feed(\"prelu4\").fc(4, relu=False, name=\"conv5-2\"))\r\n\r\n\r\nclass ONet(Network):\r\n    def setup(self):\r\n        (\r\n            self.feed(\"data\")\r\n            .conv(3, 3, 32, 1, 1, padding=\"VALID\", relu=False, name=\"conv1\")\r\n            .prelu(name=\"prelu1\")\r\n            .max_pool(3, 3, 2, 2, name=\"pool1\")\r\n            .conv(3, 3, 64, 1, 1, padding=\"VALID\", relu=False, name=\"conv2\")\r\n            .prelu(name=\"prelu2\")\r\n            .max_pool(3, 3, 2, 2, padding=\"VALID\", name=\"pool2\")\r\n            .conv(3, 3, 64, 1, 1, padding=\"VALID\", relu=False, name=\"conv3\")\r\n            .prelu(name=\"prelu3\")\r\n            .max_pool(2, 2, 2, 2, name=\"pool3\")\r\n            .conv(2, 2, 128, 1, 1, padding=\"VALID\", relu=False, name=\"conv4\")\r\n            .prelu(name=\"prelu4\")\r\n            .fc(256, relu=False, name=\"conv5\")\r\n            .prelu(name=\"prelu5\")\r\n            .fc(2, relu=False, name=\"conv6-1\")\r\n            .softmax(1, name=\"prob1\")\r\n        )\r\n\r\n        (self.feed(\"prelu5\").fc(4, relu=False, name=\"conv6-2\"))\r\n\r\n        (self.feed(\"prelu5\").fc(10, relu=False, name=\"conv6-3\"))\r\n\r\n\r\ndef create_mtcnn(sess, model_path):\r\n    if not model_path:\r\n        model_path, _ = os.path.split(os.path.realpath(__file__))\r\n\r\n    with tf.variable_scope(\"pnet\"):\r\n        data = tf.placeholder(tf.float32, (None, None, None, 3), \"input\")\r\n        pnet = PNet({\"data\": data})\r\n        
        pnet.load(os.path.join(model_path, \"det1.npy\"), sess)\r\n    with tf.variable_scope(\"rnet\"):\r\n        data = tf.placeholder(tf.float32, (None, 24, 24, 3), \"input\")\r\n        rnet = RNet({\"data\": data})\r\n        rnet.load(os.path.join(model_path, \"det2.npy\"), sess)\r\n    with tf.variable_scope(\"onet\"):\r\n        data = tf.placeholder(tf.float32, (None, 48, 48, 3), \"input\")\r\n        onet = ONet({\"data\": data})\r\n        onet.load(os.path.join(model_path, \"det3.npy\"), sess)\r\n\r\n    def pnet_fun(img):\r\n        return sess.run((\"pnet/conv4-2/BiasAdd:0\", \"pnet/prob1:0\"), feed_dict={\"pnet/input:0\": img})\r\n\r\n    def rnet_fun(img):\r\n        return sess.run((\"rnet/conv5-2/conv5-2:0\", \"rnet/prob1:0\"), feed_dict={\"rnet/input:0\": img})\r\n\r\n    def onet_fun(img):\r\n        return sess.run(\r\n            (\"onet/conv6-2/conv6-2:0\", \"onet/conv6-3/conv6-3:0\", \"onet/prob1:0\"), feed_dict={\"onet/input:0\": img}\r\n        )\r\n\r\n    return {\"pnet\": pnet_fun, \"rnet\": rnet_fun, \"onet\": onet_fun}\r\n\r\n\r\ndef detect_face(img, pnet, rnet, onet):\r\n\r\n    minsize = 20  # minimum size of face\r\n    threshold = [0.6, 0.7, 0.7]  # score thresholds for the three stages\r\n    factor = 0.709  # scale factor\r\n\r\n    factor_count = 0\r\n    total_boxes = np.empty((0, 9))\r\n    points = []\r\n    h = img.shape[0]\r\n    w = img.shape[1]\r\n    minl = np.amin([h, w])\r\n    m = 12.0 / minsize\r\n    minl = minl * m\r\n    # create scale pyramid\r\n    scales = []\r\n    while minl >= 12:\r\n        scales += [m * np.power(factor, factor_count)]\r\n        minl = minl * factor\r\n        factor_count += 1\r\n\r\n    # first stage\r\n    for j in range(len(scales)):\r\n        scale = scales[j]\r\n        hs = int(np.ceil(h * scale))\r\n        ws = int(np.ceil(w * scale))\r\n        im_data = imresample(img, (hs, ws))\r\n        im_data = (im_data - 127.5) * 0.0078125\r\n        img_x = np.expand_dims(im_data, 0)\r\n        img_y = 
np.transpose(img_x, (0, 2, 1, 3))\r\n        out = pnet(img_y)\r\n        out0 = np.transpose(out[0], (0, 2, 1, 3))\r\n        out1 = np.transpose(out[1], (0, 2, 1, 3))\r\n\r\n        boxes, _ = generateBoundingBox(out1[0, :, :, 1].copy(), out0[0, :, :, :].copy(), scale, threshold[0])\r\n\r\n        # inter-scale nms\r\n        pick = nms(boxes.copy(), 0.5, \"Union\")\r\n        if boxes.size > 0 and pick.size > 0:\r\n            boxes = boxes[pick, :]\r\n            total_boxes = np.append(total_boxes, boxes, axis=0)\r\n\r\n    numbox = total_boxes.shape[0]\r\n    if numbox > 0:\r\n        pick = nms(total_boxes.copy(), 0.7, \"Union\")\r\n        total_boxes = total_boxes[pick, :]\r\n        regw = total_boxes[:, 2] - total_boxes[:, 0]\r\n        regh = total_boxes[:, 3] - total_boxes[:, 1]\r\n        qq1 = total_boxes[:, 0] + total_boxes[:, 5] * regw\r\n        qq2 = total_boxes[:, 1] + total_boxes[:, 6] * regh\r\n        qq3 = total_boxes[:, 2] + total_boxes[:, 7] * regw\r\n        qq4 = total_boxes[:, 3] + total_boxes[:, 8] * regh\r\n        total_boxes = np.transpose(np.vstack([qq1, qq2, qq3, qq4, total_boxes[:, 4]]))\r\n        total_boxes = rerec(total_boxes.copy())\r\n        total_boxes[:, 0:4] = np.fix(total_boxes[:, 0:4]).astype(np.int32)\r\n        dy, edy, dx, edx, y, ey, x, ex, tmpw, tmph = pad(total_boxes.copy(), w, h)\r\n\r\n    numbox = total_boxes.shape[0]\r\n    if numbox > 0:\r\n        # second stage\r\n        tempimg = np.zeros((24, 24, 3, numbox))\r\n        for k in range(0, numbox):\r\n            tmp = np.zeros((int(tmph[k]), int(tmpw[k]), 3))\r\n            tmp[dy[k] - 1 : edy[k], dx[k] - 1 : edx[k], :] = img[y[k] - 1 : ey[k], x[k] - 1 : ex[k], :]\r\n            if tmp.shape[0] > 0 and tmp.shape[1] > 0 or tmp.shape[0] == 0 and tmp.shape[1] == 0:\r\n                tempimg[:, :, :, k] = imresample(tmp, (24, 24))\r\n            else:\r\n                # np.empty() requires a shape; return an empty (boxes, points) pair as the caller expects\r\n                return np.empty((0, 5)), []\r\n        tempimg = (tempimg - 127.5) * 0.0078125\r\n        tempimg1 
= np.transpose(tempimg, (3, 1, 0, 2))\r\n        out = rnet(tempimg1)\r\n        out0 = np.transpose(out[0])\r\n        out1 = np.transpose(out[1])\r\n        score = out1[1, :]\r\n        ipass = np.where(score > threshold[1])\r\n        total_boxes = np.hstack([total_boxes[ipass[0], 0:4].copy(), np.expand_dims(score[ipass].copy(), 1)])\r\n        mv = out0[:, ipass[0]]\r\n        if total_boxes.shape[0] > 0:\r\n            pick = nms(total_boxes, 0.7, \"Union\")\r\n            total_boxes = total_boxes[pick, :]\r\n            total_boxes = bbreg(total_boxes.copy(), np.transpose(mv[:, pick]))\r\n            total_boxes = rerec(total_boxes.copy())\r\n\r\n    numbox = total_boxes.shape[0]\r\n    if numbox > 0:\r\n        # third stage\r\n        total_boxes = np.fix(total_boxes).astype(np.int32)\r\n        dy, edy, dx, edx, y, ey, x, ex, tmpw, tmph = pad(total_boxes.copy(), w, h)\r\n        tempimg = np.zeros((48, 48, 3, numbox))\r\n        for k in range(0, numbox):\r\n            tmp = np.zeros((int(tmph[k]), int(tmpw[k]), 3))\r\n            tmp[dy[k] - 1 : edy[k], dx[k] - 1 : edx[k], :] = img[y[k] - 1 : ey[k], x[k] - 1 : ex[k], :]\r\n            if tmp.shape[0] > 0 and tmp.shape[1] > 0 or tmp.shape[0] == 0 and tmp.shape[1] == 0:\r\n                tempimg[:, :, :, k] = imresample(tmp, (48, 48))\r\n            else:\r\n                # np.empty() requires a shape; return an empty (boxes, points) pair as the caller expects\r\n                return np.empty((0, 5)), []\r\n        tempimg = (tempimg - 127.5) * 0.0078125\r\n        tempimg1 = np.transpose(tempimg, (3, 1, 0, 2))\r\n        out = onet(tempimg1)\r\n        out0 = np.transpose(out[0])\r\n        out1 = np.transpose(out[1])\r\n        out2 = np.transpose(out[2])\r\n        score = out2[1, :]\r\n        points = out1\r\n        ipass = np.where(score > threshold[2])\r\n        points = points[:, ipass[0]]\r\n        total_boxes = np.hstack([total_boxes[ipass[0], 0:4].copy(), np.expand_dims(score[ipass].copy(), 1)])\r\n        mv = out0[:, ipass[0]]\r\n\r\n        w = total_boxes[:, 2] - total_boxes[:, 0] + 1\r\n        h 
= total_boxes[:, 3] - total_boxes[:, 1] + 1\r\n        points[0:5, :] = np.tile(w, (5, 1)) * points[0:5, :] + np.tile(total_boxes[:, 0], (5, 1)) - 1\r\n        points[5:10, :] = np.tile(h, (5, 1)) * points[5:10, :] + np.tile(total_boxes[:, 1], (5, 1)) - 1\r\n        if total_boxes.shape[0] > 0:\r\n            total_boxes = bbreg(total_boxes.copy(), np.transpose(mv))\r\n            pick = nms(total_boxes.copy(), 0.7, \"Min\")\r\n            total_boxes = total_boxes[pick, :]\r\n            points = points[:, pick]\r\n\r\n    return total_boxes, points\r\n"
  },
  {
    "path": "main.py",
    "content": "from sklearn.metrics.pairwise import pairwise_distances\r\nfrom tensorflow.python.platform import gfile\r\nimport tensorflow as tf\r\nimport numpy as np\r\nimport detect_and_align\r\nimport argparse\r\nimport easygui\r\nimport time\r\nimport cv2\r\nimport os\r\n\r\n\r\nclass IdData:\r\n    \"\"\"Keeps track of known identities and calculates id matches\"\"\"\r\n\r\n    def __init__(\r\n        self, id_folder, mtcnn, sess, embeddings, images_placeholder, phase_train_placeholder, distance_treshold\r\n    ):\r\n        print(\"Loading known identities: \", end=\"\")\r\n        self.distance_treshold = distance_treshold\r\n        self.id_folder = id_folder\r\n        self.mtcnn = mtcnn\r\n        self.id_names = []\r\n        self.embeddings = None\r\n\r\n        image_paths = []\r\n        os.makedirs(id_folder, exist_ok=True)\r\n        ids = os.listdir(os.path.expanduser(id_folder))\r\n        if not ids:\r\n            return\r\n\r\n        for id_name in ids:\r\n            id_dir = os.path.join(id_folder, id_name)\r\n            image_paths = image_paths + [os.path.join(id_dir, img) for img in os.listdir(id_dir)]\r\n\r\n        print(\"Found %d images in id folder\" % len(image_paths))\r\n        aligned_images, id_image_paths = self.detect_id_faces(image_paths)\r\n        feed_dict = {images_placeholder: aligned_images, phase_train_placeholder: False}\r\n        self.embeddings = sess.run(embeddings, feed_dict=feed_dict)\r\n\r\n        if len(id_image_paths) < 5:\r\n            self.print_distance_table(id_image_paths)\r\n\r\n    def add_id(self, embedding, new_id, face_patch):\r\n        if self.embeddings is None:\r\n            self.embeddings = np.atleast_2d(embedding)\r\n        else:\r\n            self.embeddings = np.vstack([self.embeddings, embedding])\r\n        self.id_names.append(new_id)\r\n        id_folder = os.path.join(self.id_folder, new_id)\r\n        os.makedirs(id_folder, exist_ok=True)\r\n        filenames = 
[s.split(\".\")[0] for s in os.listdir(id_folder)]\r\n        numbered_filenames = [int(f) for f in filenames if f.isdigit()]\r\n        img_number = max(numbered_filenames) + 1 if numbered_filenames else 0\r\n        cv2.imwrite(os.path.join(id_folder, f\"{img_number}.jpg\"), face_patch)\r\n\r\n    def detect_id_faces(self, image_paths):\r\n        aligned_images = []\r\n        id_image_paths = []\r\n        for image_path in image_paths:\r\n            image = cv2.imread(os.path.expanduser(image_path), cv2.IMREAD_COLOR)\r\n            image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)\r\n            face_patches, _, _ = detect_and_align.detect_faces(image, self.mtcnn)\r\n            if len(face_patches) > 1:\r\n                print(\r\n                    \"Warning: Found multiple faces in id image: %s\" % image_path\r\n                    + \"\\nMake sure to only have one face in the id images. \"\r\n                    + \"If that's the case then it's a false positive detection and\"\r\n                    + \" you can solve it by increasing the thresholds of the cascade network\"\r\n                )\r\n            aligned_images = aligned_images + face_patches\r\n            id_image_paths += [image_path] * len(face_patches)\r\n            path = os.path.dirname(image_path)\r\n            self.id_names += [os.path.basename(path)] * len(face_patches)\r\n\r\n        return np.stack(aligned_images), id_image_paths\r\n\r\n    def print_distance_table(self, id_image_paths):\r\n        \"\"\"Prints distances between id embeddings\"\"\"\r\n        distance_matrix = pairwise_distances(self.embeddings, self.embeddings)\r\n        image_names = [path.split(\"/\")[-1] for path in id_image_paths]\r\n        print(\"Distance matrix:\\n{:20}\".format(\"\"), end=\"\")\r\n        [print(\"{:20}\".format(name), end=\"\") for name in image_names]\r\n        for path, distance_row in zip(image_names, distance_matrix):\r\n            print(\"\\n{:20}\".format(path), end=\"\")\r\n   
         for distance in distance_row:\r\n                print(\"{:20}\".format(\"%0.3f\" % distance), end=\"\")\r\n        print()\r\n\r\n    def find_matching_ids(self, embs):\r\n        if self.id_names:\r\n            matching_ids = []\r\n            matching_distances = []\r\n            distance_matrix = pairwise_distances(embs, self.embeddings)\r\n            for distance_row in distance_matrix:\r\n                min_index = np.argmin(distance_row)\r\n                if distance_row[min_index] < self.distance_treshold:\r\n                    matching_ids.append(self.id_names[min_index])\r\n                    matching_distances.append(distance_row[min_index])\r\n                else:\r\n                    matching_ids.append(None)\r\n                    matching_distances.append(None)\r\n        else:\r\n            matching_ids = [None] * len(embs)\r\n            matching_distances = [np.inf] * len(embs)\r\n        return matching_ids, matching_distances\r\n\r\n\r\ndef load_model(model):\r\n    model_exp = os.path.expanduser(model)\r\n    if os.path.isfile(model_exp):\r\n        print(\"Loading model filename: %s\" % model_exp)\r\n        with gfile.FastGFile(model_exp, \"rb\") as f:\r\n            graph_def = tf.GraphDef()\r\n            graph_def.ParseFromString(f.read())\r\n            tf.import_graph_def(graph_def, name=\"\")\r\n    else:\r\n        raise ValueError(\"Specify model file, not directory!\")\r\n\r\n\r\ndef main(args):\r\n    with tf.Graph().as_default():\r\n        with tf.Session() as sess:\r\n\r\n            # Setup models\r\n            mtcnn = detect_and_align.create_mtcnn(sess, None)\r\n\r\n            load_model(args.model)\r\n            images_placeholder = tf.get_default_graph().get_tensor_by_name(\"input:0\")\r\n            embeddings = tf.get_default_graph().get_tensor_by_name(\"embeddings:0\")\r\n            phase_train_placeholder = tf.get_default_graph().get_tensor_by_name(\"phase_train:0\")\r\n\r\n            # Load 
anchor IDs\r\n            id_data = IdData(\r\n                args.id_folder[0], mtcnn, sess, embeddings, images_placeholder, phase_train_placeholder, args.threshold\r\n            )\r\n\r\n            cap = cv2.VideoCapture(0)\r\n            frame_height = cap.get(cv2.CAP_PROP_FRAME_HEIGHT)\r\n\r\n            show_landmarks = False\r\n            show_bb = False\r\n            show_id = True\r\n            show_fps = False\r\n            frame_detections = None\r\n            while True:\r\n                start = time.time()\r\n                _, frame = cap.read()\r\n\r\n                # Locate faces and landmarks in frame\r\n                face_patches, padded_bounding_boxes, landmarks = detect_and_align.detect_faces(frame, mtcnn)\r\n\r\n                if len(face_patches) > 0:\r\n                    face_patches = np.stack(face_patches)\r\n                    feed_dict = {images_placeholder: face_patches, phase_train_placeholder: False}\r\n                    embs = sess.run(embeddings, feed_dict=feed_dict)\r\n\r\n                    matching_ids, matching_distances = id_data.find_matching_ids(embs)\r\n                    frame_detections = {\"embs\": embs, \"bbs\": padded_bounding_boxes, \"frame\": frame.copy()}\r\n\r\n                    print(\"Matches in frame:\")\r\n                    for bb, landmark, matching_id, dist in zip(\r\n                        padded_bounding_boxes, landmarks, matching_ids, matching_distances\r\n                    ):\r\n                        if matching_id is None:\r\n                            matching_id = \"Unknown\"\r\n                            print(\"Unknown! Couldn't find a match.\")\r\n                        else:\r\n                            print(\"Hi %s! 
Distance: %1.4f\" % (matching_id, dist))\r\n\r\n                        if show_id:\r\n                            font = cv2.FONT_HERSHEY_SIMPLEX\r\n                            cv2.putText(frame, matching_id, (bb[0], bb[3]), font, 1, (255, 255, 255), 1, cv2.LINE_AA)\r\n                        if show_bb:\r\n                            cv2.rectangle(frame, (bb[0], bb[1]), (bb[2], bb[3]), (255, 0, 0), 2)\r\n                        if show_landmarks:\r\n                            for j in range(5):\r\n                                size = 1\r\n                                top_left = (int(landmark[j]) - size, int(landmark[j + 5]) - size)\r\n                                bottom_right = (int(landmark[j]) + size, int(landmark[j + 5]) + size)\r\n                                cv2.rectangle(frame, top_left, bottom_right, (255, 0, 255), 2)\r\n                else:\r\n                    print(\"Couldn't find a face\")\r\n\r\n                end = time.time()\r\n\r\n                seconds = end - start\r\n                fps = round(1 / seconds, 2)\r\n\r\n                if show_fps:\r\n                    font = cv2.FONT_HERSHEY_SIMPLEX\r\n                    cv2.putText(frame, str(fps), (0, int(frame_height) - 5), font, 1, (255, 255, 255), 1, cv2.LINE_AA)\r\n\r\n                cv2.imshow(\"frame\", frame)\r\n\r\n                key = cv2.waitKey(1)\r\n                if key == ord(\"q\"):\r\n                    break\r\n                elif key == ord(\"l\"):\r\n                    show_landmarks = not show_landmarks\r\n                elif key == ord(\"b\"):\r\n                    show_bb = not show_bb\r\n                elif key == ord(\"i\"):\r\n                    show_id = not show_id\r\n                elif key == ord(\"f\"):\r\n                    show_fps = not show_fps\r\n                elif key == ord(\"s\") and frame_detections is not None:\r\n                    for emb, bb in zip(frame_detections[\"embs\"], frame_detections[\"bbs\"]):\r\n            
            patch = frame_detections[\"frame\"][bb[1] : bb[3], bb[0] : bb[2], :]\r\n                        cv2.imshow(\"frame\", patch)\r\n                        cv2.waitKey(1)\r\n                        new_id = easygui.enterbox(\"Who's in the image? Leave empty for non-valid\")\r\n                        # enterbox returns None when the dialog is cancelled, so check truthiness\r\n                        if new_id:\r\n                            id_data.add_id(emb, new_id, patch)\r\n\r\n            cap.release()\r\n            cv2.destroyAllWindows()\r\n\r\n\r\nif __name__ == \"__main__\":\r\n    parser = argparse.ArgumentParser()\r\n\r\n    parser.add_argument(\"model\", type=str, help=\"Path to model protobuf (.pb) file\")\r\n    parser.add_argument(\"id_folder\", type=str, nargs=\"+\", help=\"Folder containing ID folders\")\r\n    parser.add_argument(\"-t\", \"--threshold\", type=float, help=\"Distance threshold defining an id match\", default=1.0)\r\n    main(parser.parse_args())\r\n"
  }
]