[
  {
    "path": "LICENSE",
    "content": "MIT License\n\nCopyright (c) 2018 Ganesh Iyer\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the \"Software\"), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n"
  },
  {
    "path": "README.md",
    "content": "# CalibNet\n\n### [DEPRECATED] This repository is no longer actively supported. \n\nWhile the authors work on an update, please check out this unofficial implementation: [CalibNet_pytorch](https://github.com/gitouni/CalibNet_pytorch) :fire: :slightly_smiling_face:\n___\n\nCode for our paper:\n[CalibNet: Self-Supervised Extrinsic Calibration using 3D Spatial Transformer Networks](https://arxiv.org/pdf/1803.08181.pdf)\n\nCheck out our [project page](https://epiception.github.io/CalibNet/)!\n\n![CalibNet_gif1](https://media.giphy.com/media/1zjOgLf7j4lHmeMubG/giphy.gif)\n\n### Prerequisites\nCalibNet is trained on Tensorflow 1.3, CUDA 8.0, CUDNN 7.0.1\n\n\n##### Installation\n\nThe code for point cloud distance loss is modified from [PU-NET](https://github.com/yulequan/PU-Net), PointNet++, PointSetGeneration.\n\nThis repository, thus, is based on Tensorflow and the TF operators from PointNet++ and PU-NET.\n\nFor installing tensorflow, please follow the official instructions in here. The code is tested under TF1.3 and Python 2.7 on Ubuntu 16.04.\n\nFor compiling TF operators, please check tf_xxx_compile.sh under each op subfolder in code/tf_ops folder, and change the path correctly to ../path/to/tensorflow/include. Note that you need to update nvcc, python and tensoflow include library if necessary. You also need to remove -D_GLIBCXX_USE_CXX11_ABI=0 flag in g++ command in order to compile correctly if necessary.\n\nWe are working to update the code and installation steps for the latest tensorflow versions.\n\n### Dataset Preparation\n\nTo prepare the dataset, run /dataset_files/dataset_builder_parallel.sh in the directory where you wish to store. 
We will also create a parser `parsed_set.txt` for the dataset, that contains the file names for training.\n\n```\ngit clone https://github.com/epiception/CalibNet.git\nor\nsvn checkout https://github.com/CalibNet/trunk/code (for the code)\ncd ../path/to/dataset_directory\nbash ../path/to/code_folder/dataset_files/dataset_builder_parallel.sh\ncd ../path/to/CalibNet/code\npython dataset_files/parser.py ../dataset_directory/2011_09_26/\n```\n##### Resnet-18\nPretrained Resnet-18 parameters can be found [here](https://drive.google.com/open?id=1XGqdBH3A88m1LgUIe5tS7VjKjtQc1A6V).\n\n\n### Training\n\nBefore training, be sure to make requisite changes to the paths and training parameters in the config file `config_res.py`.\nWe trained using 2 GPUs. The base code is written to support ops for the same device configuration. \n\nTo begin training:\n```\nCUDA_VISIBLE_DEVICES=<device_id1>,<device_id2> python -B train_model_combined.py\n```\n\n##### Trained Weights\nTrained weights for the base variant (non-iterative) model is available [here](https://drive.google.com/drive/folders/138hq7OgTEBmG-wK52h7gchg5ob1WqARn?usp=sharing). This model was trained for 44 epochs. As mentioned in the paper, the iterative realignment model for better translation outputs will uploaded soon.\n\n### Evaluation/Test\nCode for Direct Evaluation/Testing pipeline for point cloud calibration will be uploaded soon.\n"
  },
  {
    "path": "code/common/Lie_functions.py",
    "content": "import numpy as np\nimport tensorflow as tf\n\ndef for_translation(Old_transform,t):\n\n    R = Old_transform[:3,:3]\n    t = tf.expand_dims(t,1)\n    trans_T = tf.concat([R,t], 1)\n\n    return trans_T\n\ndef for_rotation(Old_transform):\n\n    R = Old_transform[:3,:3]\n    t = tf.expand_dims(tf.constant(np.array([0.0, 0.0, 0.0], dtype = np.float32)), 1)\n    rots_T = tf.concat([R,t], 1)\n\n    return rots_T\n\n\ndef exponential_map_single(vec):\n\n    \"Exponential Map Operation. Decoupled for SO(3) and translation t\"\n\n    with tf.name_scope(\"Exponential_map\"):\n\n        u = vec[:3]\n        omega = vec[3:]\n\n        theta = tf.sqrt(omega[0]*omega[0] + omega[1]*omega[1] + omega[2]*omega[2])\n\n        omega_cross = tf.stack([0.0, -omega[2], omega[1], omega[2], 0.0, -omega[0], -omega[1], omega[0], 0.0])\n        omega_cross = tf.reshape(omega_cross, [3,3])\n\n        #Taylor's approximation for A,B and C not being used currently, approximations preferable for low values of theta\n\n        # A = 1.0 - (tf.pow(theta,2)/factorial(3.0)) + (tf.pow(theta, 4)/factorial(5.0))\n        # B = 1.0/factorial(2.0) - (tf.pow(theta,2)/factorial(4.0)) + (tf.pow(theta, 4)/factorial(6.0))\n        # C = 1.0/factorial(3.0) - (tf.pow(theta,2)/factorial(5.0)) + (tf.pow(theta, 4)/factorial(7.0))\n\n        A = tf.sin(theta)/theta\n\n        B = (1.0 - tf.cos(theta))/(tf.pow(theta,2))\n\n        C = (1.0 - A)/(tf.pow(theta,2))\n\n        omega_cross_square = tf.matmul(omega_cross, omega_cross)\n\n        R = tf.eye(3,3) + A*omega_cross + B*omega_cross_square\n\n        V = tf.eye(3,3) + B*omega_cross + C*omega_cross_square\n        Vu = tf.matmul(V,tf.expand_dims(u,1))\n\n        T = tf.concat([R, Vu], 1)\n\n        return T\n\n\ndef transforms_mul(T1, T2):\n\n\n    T1 = tf.concat ([T1, tf.constant(np.array([[0.0, 0.0, 0.0, 1.0]]), dtype = tf.float32)], 0)\n    T2 = tf.concat ([T2, tf.constant(np.array([[0.0, 0.0, 0.0, 1.0]]), dtype = tf.float32)], 0)\n\n    
product = tf.matmul(T1, T2)\n\n    return product\n"
  },
  {
    "path": "code/common/__init__.py",
    "content": ""
  },
  {
    "path": "code/common/all_transformer.py",
    "content": "import tensorflow as tf\nimport numpy as np\nimport matplotlib.pyplot as plt\nimport scipy.misc as smc\n\nimport config_res as config\n\nIMG_HT = config.depth_img_params['IMG_HT']\nIMG_WDT = config.depth_img_params['IMG_WDT']\nbatch_size = 1\n\nshape = (IMG_HT, IMG_WDT)\n\ndef _simple_transformer(depth_map, t_mat, k_mat, small_transform):\n\n    batch_grids, transformed_depth_map, sparse_cloud  = _3D_meshgrid_batchwise_diff(IMG_HT, IMG_WDT, depth_map, batch_size, t_mat, k_mat, small_transform)\n\n    x_all = tf.reshape(batch_grids[:,0], (IMG_HT, IMG_WDT))\n    y_all = tf.reshape(batch_grids[:,1], (IMG_HT, IMG_WDT))\n\n    return _bilinear_sampling(transformed_depth_map, x_all, y_all), sparse_cloud\n\n\ndef sparsify_cloud(S):\n\n    \"\"\"\n    Cluster centers of point clouds used to sparsify cloud for Earth Mover's Distance. Using 4096 centroids\n    \"\"\"\n\n    with tf.device('/cpu:0'):\n\n        point_limit = 4096\n        no_points = tf.shape(S)[0]\n        no_partitions = no_points/tf.constant(point_limit, dtype=tf.int32)\n        saved_points = tf.gather_nd(S, [tf.expand_dims(tf.range(0, no_partitions*point_limit), 1)])\n        saved_points = tf.reshape(saved_points, [point_limit, no_partitions, 3])\n        saved_points_sparse = tf.reduce_mean(saved_points, 1)\n\n        return saved_points_sparse\n\ndef _3D_meshgrid_batchwise_diff(height, width, depth_img, num_batch, transformation_matrix, tf_K_mat, small_transform):\n\n    \"\"\"\n    Creates 3d sampling meshgrid\n    \"\"\"\n\n    x_index = tf.linspace(-1.0, 1.0, width)\n    y_index = tf.linspace(-1.0, 1.0, height)\n    z_index = tf.range(0, width*height)\n\n    x_t, y_t = tf.meshgrid(x_index, y_index)\n\n    # flatten\n    x_t_flat = tf.reshape(x_t, [1,-1])\n    y_t_flat = tf.reshape(y_t, [1,-1])\n    ZZ = tf.reshape(depth_img, [-1])\n\n    zeros_target = tf.zeros_like(ZZ)\n    mask = tf.not_equal(ZZ, zeros_target)\n    ones = tf.ones_like(x_t_flat)\n\n    sampling_grid_2d = 
tf.concat([x_t_flat, y_t_flat, ones], 0)\n    sampling_grid_2d_sparse = tf.transpose(tf.boolean_mask(tf.transpose(sampling_grid_2d), mask))\n    ZZ_saved = tf.boolean_mask(ZZ, mask)\n    ones_saved = tf.expand_dims(tf.ones_like(ZZ_saved), 0)\n\n    projection_grid_3d = tf.matmul(tf.matrix_inverse(tf_K_mat), sampling_grid_2d_sparse*ZZ_saved)\n\n    homog_points_3d = tf.concat([projection_grid_3d, ones_saved], 0)\n\n    final_transformation_matrix = tf.matmul(transformation_matrix, small_transform)[:3,:]\n    warped_sampling_grid = tf.matmul(final_transformation_matrix, homog_points_3d)\n\n    points_2d = tf.matmul(tf_K_mat, warped_sampling_grid[:3,:])\n\n    Z = points_2d[2,:]\n\n    x_dash_pred = points_2d[0,:]\n    y_dash_pred = points_2d[1,:]\n    point_cloud = tf.stack([x_dash_pred, y_dash_pred, Z], 1)\n\n    sparse_point_cloud = sparsify_cloud(point_cloud)\n\n    x = tf.transpose(points_2d[0,:]/Z)\n    y = tf.transpose(points_2d[1,:]/Z)\n\n    mask_int = tf.cast(mask, 'int32')\n\n    updated_indices = tf.expand_dims(tf.boolean_mask(mask_int*z_index, mask), 1)\n\n    updated_Z = tf.scatter_nd(updated_indices, Z, tf.constant([width*height]))\n    updated_x = tf.scatter_nd(updated_indices, x, tf.constant([width*height]))\n    neg_ones = tf.ones_like(updated_x)*-1.0\n    updated_x_fin = tf.where(tf.equal(updated_Z, zeros_target), neg_ones, updated_x)\n\n    updated_y = tf.scatter_nd(updated_indices, y, tf.constant([width*height]))\n    updated_y_fin = tf.where(tf.equal(updated_Z, zeros_target), neg_ones, updated_y)\n\n    reprojected_grid = tf.stack([updated_x_fin, updated_y_fin], 1)\n\n    transformed_depth = tf.reshape(updated_Z, (IMG_HT, IMG_WDT))\n\n    return reprojected_grid, transformed_depth, sparse_point_cloud\n\ndef reverse_all(z):\n\n    \"\"\"Reversing from cantor function indices to correct indices\"\"\"\n\n    z = tf.cast(z, 'float32')\n    w = tf.floor((tf.sqrt(8.*z + 1.) 
- 1.)/2.0)\n    t = (w**2 + w)/2.0\n    y = tf.clip_by_value(tf.expand_dims(z - t, 1), 0.0, IMG_HT - 1)\n    x = tf.clip_by_value(tf.expand_dims(w - y[:,0], 1), 0.0, IMG_WDT - 1)\n\n    return tf.concat([y,x], 1)\n\ndef get_pixel_value(img, x, y):\n\n    \"\"\"Cantor pairing for removing non-unique updates and indices. At the time of implementation, an unfixed issue with scatter_nd caused problems with int32 update values. Until resolved, this is implemented on the CPU.\"\"\"\n\n    with tf.device('/cpu:0'):\n        indices = tf.stack([y, x], 2)\n        indices = tf.reshape(indices, (IMG_HT*IMG_WDT, 2))\n        values = tf.reshape(img, [-1])\n\n        Y = indices[:,0]\n        X = indices[:,1]\n        Z = (X + Y)*(X + Y + 1)/2 + Y\n\n        filtered, idx = tf.unique(tf.squeeze(Z))\n        updated_values = tf.unsorted_segment_max(values, idx, tf.shape(filtered)[0])\n\n        # updated_indices = tf.map_fn(fn=lambda i: reverse(i), elems=filtered, dtype=tf.float32)\n        updated_indices = reverse_all(filtered)\n        updated_indices = tf.cast(updated_indices, 'int32')\n        resolved_map = tf.scatter_nd(updated_indices, updated_values, img.shape)\n\n        return resolved_map\n\ndef _bilinear_sampling(img, x_func, y_func):\n\n    \"\"\"\n    Sampling from input image and performing bilinear interpolation\n    \"\"\"\n\n    max_y = tf.constant(IMG_HT - 1, dtype=tf.int32)\n    max_x = tf.constant(IMG_WDT - 1, dtype=tf.int32)\n\n    zero = tf.zeros([], dtype='int32')\n\n    # rescale x to [0, W-1] and y to [0, H-1]\n    x = 0.5 * ((x_func + 1.0) * tf.cast(IMG_WDT - 1, 'float32'))\n    y = 0.5 * ((y_func + 1.0) * tf.cast(IMG_HT - 1, 'float32'))\n\n    x = tf.clip_by_value(x, 0.0, tf.cast(max_x, 'float32'))\n    y = tf.clip_by_value(y, 0.0, tf.cast(max_y, 'float32'))\n\n    # grab 4 nearest corner points for each (x_i, y_i)\n    # i.e. 
we need a rectangle around the point of interest\n    x0 = tf.cast(tf.floor(x), 'int32')\n    x1 = x0 + 1\n    y0 = tf.cast(tf.floor(y), 'int32')\n    y1 = y0 + 1\n\n    # clip to range [0, H/W] to not violate img boundaries\n    x0 = tf.clip_by_value(x0, 0, max_x)\n    x1 = tf.clip_by_value(x1, 0, max_x)\n    y0 = tf.clip_by_value(y0, 0, max_y)\n    y1 = tf.clip_by_value(y1, 0, max_y)\n\n    # find Ia, Ib, Ic, Id\n\n    Ia = get_pixel_value(img, x0, y0)\n    Ib = get_pixel_value(img, x0, y1)\n    Ic = get_pixel_value(img, x1, y0)\n    Id = get_pixel_value(img, x1, y1)\n\n    x0 = tf.cast(x0, 'float32')\n    x1 = tf.cast(x1, 'float32')\n    y0 = tf.cast(y0, 'float32')\n    y1 = tf.cast(y1, 'float32')\n\n    # calculate deltas\n    wa = (x1-x) * (y1-y)\n    wb = (x1-x) * (y-y0)\n    wc = (x-x0) * (y1-y)\n    wd = (x-x0) * (y-y0)\n\n    loc = wa*Ia + wb*Ib + wc*Ic + wd*Id\n\n    return loc\n"
  },
  {
    "path": "code/common/cnn_utils_res.py",
    "content": "import numpy as np\nimport tensorflow as tf\n\ndef weight_variable(shape, str):\n    '''\n    Helper function to create a weight variable initialized with\n    a normal distribution (truncated to two standard devs)\n    Parameters\n    ----------\n    shape : list\n        Size of weight variable\n    '''\n\n    W = tf.get_variable(\"weight\" + str, shape=shape, initializer=tf.contrib.layers.xavier_initializer())\n    return W\n\ndef weight_variable_fc(shape, str):\n    '''\n    Helper function to create a weight variable initialized with\n    a normal distribution (truncated to two standard devs)\n    Parameters\n    ----------\n    shape : list\n        Size of weight variable\n    '''\n\n    W = 0.01*tf.get_variable(\"weight\" + str, shape=shape, initializer=tf.contrib.layers.xavier_initializer())\n    return W\n\ndef bias_variable(shape, str):\n\n    B = tf.Variable(tf.constant(0.0, shape= shape, dtype=tf.float32), name=\"bias\" + str)\n    return B\n\n\ndef init_weights(W, str, to_train):\n    init = tf.constant_initializer(W)\n    weight = tf.get_variable('weight'+str, shape=W.shape, dtype=tf.float32, initializer= init, trainable = to_train)\n    return weight\n\ndef init_bias(B, layerno, to_train):\n    init = tf.constant_initializer(B)\n    bias = tf.get_variable('bias_%d'%layerno, shape=B.shape, dtype=tf.float32, initializer= init, trainable = to_train)\n    return bias\n\ndef conv2d_batchnorm(x, W, name, phase, beta_r, gamma_r, mean_r, variance_r, stride = [1,1,1,1], relu = True):\n\n    beta = tf.constant_initializer(beta_r)\n    gamma = tf.constant_initializer(gamma_r)\n    moving_mean = tf.constant_initializer(mean_r)\n    moving_variance = tf.constant_initializer(variance_r)\n\n    with tf.name_scope(name):\n        mid1 =  tf.nn.conv2d(x, W, strides = stride, padding = \"SAME\")\n        with tf.name_scope('batch_norm'):\n            mid2 = tf.contrib.layers.batch_norm(mid1, param_initializers={'beta': beta, 'gamma': gamma, 
'moving_mean': moving_mean,'moving_variance': moving_variance,}, is_training = phase, updates_collections = None, scale = True, decay = 0.9)\n            if(relu == True):\n                return tf.nn.relu(mid2)\n            else:\n                return mid2\n\ndef conv2d_batchnorm_init(x, W, name, phase, stride = [1,1,1,1], relu = True):\n\n    with tf.name_scope(name):\n        mid1 =  tf.nn.conv2d(x, W, strides = stride, padding =\"SAME\")\n        with tf.name_scope('batch_norm'):\n            mid2 = tf.contrib.layers.batch_norm(mid1, is_training = phase, updates_collections = None)\n            if(relu == True):\n                return tf.nn.relu(mid2)\n            else:\n                return mid2\n\ndef conv2d_init(x, W, name, phase, stride, padding):\n    with tf.name_scope(name):\n        mid1 =  tf.nn.conv2d(x, W, strides = stride, padding = padding)\n        mid2 = tf.nn.relu(mid1)\n\n        return mid2\n\n\ndef conv2d_bias_init(x, W, b, name):\n    with tf.name_scope(name):\n        mid1 =  tf.nn.conv2d(x, W, strides = [1,1,1,1], padding = \"SAME\") + b\n        mid2 = tf.nn.relu(mid1)\n\n        return mid2\n\ndef max_pool(x, name):\n    return tf.nn.max_pool(x, ksize = [1,2,2,1], strides = [1,2,2,1], padding = \"SAME\", name=name)\n\ndef variable_summaries(var):\n  \"\"\"Attach a lot of summaries to a Tensor (for TensorBoard visualization).\"\"\"\n\n  with tf.name_scope('summaries'):\n    mean = tf.reduce_mean(var)\n    sum_mean = tf.summary.scalar('mean', mean)\n    with tf.name_scope('stddev'):\n      stddev = tf.sqrt(tf.reduce_mean(tf.square(var - mean)))\n    sum_stddev = tf.summary.scalar('stddev', stddev)\n    #tf.summary.scalar('max', tf.reduce_max(var))\n    #tf.summary.scalar('min', tf.reduce_min(var))\n    sum_hist = tf.summary.histogram('histogram', var)\n    return [sum_mean, sum_hist, sum_stddev]\n"
  },
  {
    "path": "code/common/global_agg_net.py",
    "content": "import numpy as np\nimport tensorflow as tf\nimport scipy.misc as smc\nimport matplotlib.pyplot as plt\n\nimport config_res as config\nfrom cnn_utils_res import *\n\nimport resnet_rgb_model as model\nimport resnet_depth_model as model_depth\n\nbatch_size = config.net_params['batch_size']\ncurrent_epoch = config.net_params['load_epoch']\n\ndef End_Net_weights_init():\n\n    \"\"\"\n    Initialize Aggregation Network Weights and Summaries\n    \"\"\"\n\n    W_ext1 = weight_variable([3,3,768,384], \"_8\")\n    W_ext2 = weight_variable([3,3,384,384], \"_9\")\n    W_ext3 = weight_variable([1,2,384,384], \"_10\")\n\n    W_ext4_rot = weight_variable([1,1,384,384], \"_11\")\n    W_fc_rot = weight_variable_fc([3840,3], \"_12\")\n\n    W_ext4_tr = weight_variable([1,1,384,384], \"_13\")\n    W_fc_tr = weight_variable_fc([3840,3], \"_14\")\n\n    end_weights = [W_ext1, W_ext2, W_ext3, W_ext4_rot, W_fc_rot, W_ext4_tr, W_fc_tr]\n\n    weight_summaries = []\n\n    for weight_index in range(len(end_weights)):\n        with tf.name_scope('weight_%d'%weight_index):\n            weight_summaries += variable_summaries(end_weights[weight_index])\n\n    return end_weights, weight_summaries\n\ndef End_Net(input_x, phase_depth, keep_prob):\n\n    \"\"\"\n    Define Aggregation Network\n    \"\"\"\n\n    weights, summaries = End_Net_weights_init()\n\n    layer8 = conv2d_batchnorm_init(input_x, weights[0], name=\"conv_9\", phase= phase_depth, stride=[1,2,2,1])\n    layer9 = conv2d_batchnorm_init(layer8, weights[1], name=\"conv_10\", phase= phase_depth, stride=[1,2,2,1])\n    layer10 = conv2d_batchnorm_init(layer9, weights[2], name=\"conv_11\", phase= phase_depth, stride=[1,1,1,1])\n\n    layer11_rot = conv2d_batchnorm_init(layer10, weights[3], name=\"conv_12\", phase= phase_depth, stride=[1,1,1,1])\n    layer11_m_rot = tf.reshape(layer11_rot, [batch_size, 3840])\n    layer11_drop_rot = tf.nn.dropout(layer11_m_rot, keep_prob)\n    layer11_vec_rot = 
(tf.matmul(layer11_drop_rot, weights[4]))\n\n    layer11_tr = conv2d_batchnorm_init(layer10, weights[5], name=\"conv_13\", phase= phase_depth, stride=[1,1,1,1])\n    layer11_m_tr = tf.reshape(layer11_tr, [batch_size, 3840])\n    layer11_drop_tr = tf.nn.dropout(layer11_m_tr, keep_prob)\n    layer11_vec_tr = (tf.matmul(layer11_drop_tr, weights[6]))\n\n    output_vectors = tf.concat([layer11_vec_tr, layer11_vec_rot], 1)\n    return output_vectors, summaries\n\n\ndef End_Net_Out(X1, phase_rgb, pooled_input2, phase, keep_prob):\n\n    \"\"\"\n    Computation Graph\n    \"\"\"\n\n    RGB_Net_obj = model.Resnet(X1, phase_rgb)\n    Depth_Net_obj = model_depth.Depthnet(pooled_input2, phase)\n\n    with tf.variable_scope('ResNet'):\n        with tf.device('/device:GPU:0'):\n            output_rgb = RGB_Net_obj.Net()\n    with tf.variable_scope('DepthNet'):\n        with tf.device('/device:GPU:1'):\n            output_depth = Depth_Net_obj.Net()\n\n    layer_next = tf.concat([output_rgb, output_depth], 3)\n\n    end_net_op = End_Net(layer_next, phase, keep_prob)\n\n    return end_net_op\n"
  },
  {
    "path": "code/common/resnet_depth_model.py",
    "content": "import numpy as np\nimport tensorflow as tf\nimport json\nfrom cnn_utils_res import *\n\nimport config_res\n\nwith open(config_res.paths['resnet_params_path']) as f_in:\n    parameters = json.load(f_in)\n\nclass Depthnet:\n\n    def __init__(self, input_y, phase, parameters = parameters):\n\n        self.input_y = input_y\n        self.phase = phase\n        self.parameters = parameters\n        self.layer_zero\n        self.layer\n        self.Net\n\n    def Net(self):\n        layer_zero_out = self.layer_zero(self.input_y)\n\n        current_output = layer_zero_out\n\n        for layer_idx in range(1,5):\n            layer_out = self.layer(current_output, layer_idx)\n            current_output = layer_out\n\n        return current_output\n\n    def layer_zero(self, layer_input):\n\n        layer_dict =self.parameters['layer0']\n        bl_str = \"block_1\"\n\n        W = np.array(layer_dict[bl_str]['conv1']['weight'], dtype = np.float32)\n        bn_mov_mean = np.array(layer_dict[bl_str]['bn1']['running_mean'], dtype = np.float32)\n        bn_mov_var = np.array(layer_dict[bl_str]['bn1']['running_var'], dtype = np.float32)\n        bn_gamma = np.array(layer_dict[bl_str]['bn1']['weight'], dtype = np.float32)\n        bn_beta = np.array(layer_dict[bl_str]['bn1']['bias'], dtype = np.float32)\n\n        shapex = W.shape\n        W_conv = weight_variable([shapex[0], shapex[1], 1, shapex[3]/2], \"_depth_0\")\n        # out = conv2d_batchnorm(layer_input, W_conv, \"layer_depth_0\", self.phase, bn_beta, bn_gamma, bn_mov_mean, bn_mov_var, [1,2,2,1], True)\n        out = conv2d_batchnorm_init(layer_input, W_conv, \"layer_depth_0\", self.phase, [1,2,2,1], True)\n\n        out = tf.nn.max_pool(out, [1,3,3,1], strides=[1,2,2,1], padding=\"SAME\")\n\n        print('layer0', out.shape)\n        return out\n\n    def layer(self, layer_input, layer_no):\n        layer_dict = self.parameters['layer%d'%layer_no]\n\n        cur = layer_input\n        res = 
layer_input\n\n        for b_no in range(1,3):\n            bl_str = \"block_%d\"%b_no\n\n            stride = [0,0]\n            if(b_no == 1):\n                stride = [2,1]\n            else:\n                stride = [1,1]\n\n            # for in_bno in range(1,3):\n\n            W1 = np.array(layer_dict[bl_str]['conv1']['weight'], dtype = np.float32)\n            bn_mov_mean1 = np.array(layer_dict[bl_str]['bn1']['running_mean'], dtype = np.float32)\n            bn_mov_var1 = np.array(layer_dict[bl_str]['bn1']['running_var'], dtype = np.float32)\n            bn_gamma1 = np.array(layer_dict[bl_str]['bn1']['weight'], dtype = np.float32)\n            bn_beta1 = np.array(layer_dict[bl_str]['bn1']['bias'], dtype = np.float32)\n\n            W2 = np.array(layer_dict[bl_str]['conv2']['weight'], dtype = np.float32)\n            bn_mov_mean2 = np.array(layer_dict[bl_str]['bn2']['running_mean'], dtype = np.float32)\n            bn_mov_var2 = np.array(layer_dict[bl_str]['bn2']['running_var'], dtype = np.float32)\n            bn_gamma2 = np.array(layer_dict[bl_str]['bn2']['weight'], dtype = np.float32)\n            bn_beta2 = np.array(layer_dict[bl_str]['bn2']['bias'], dtype = np.float32)\n\n            # W_conv1 = init_weights(W1, \"_l_%d_bl_%d_no_%d\"%(layer_no,b_no, 1), False)\n            # W_conv2 = init_weights(W2, \"_l_%d_bl_%d_no_%d\"%(layer_no,b_no, 2), False)\n\n            shapex1 = W1.shape\n            shapex2 = W2.shape\n\n            W_conv1 = weight_variable([shapex1[0], shapex1[1], shapex1[2]/2, shapex1[3]/2], \"dep_l_%d_bl_%d_no_%d\"%(layer_no,b_no, 1))\n            W_conv2 = weight_variable([shapex2[0], shapex2[1], shapex2[2]/2, shapex2[3]/2], \"dep_l_%d_bl_%d_no_%d\"%(layer_no,b_no, 2))\n\n            # out1 = conv2d_batchnorm(cur, W_conv1, \"layer_%d_%d_1\"%(layer_no,b_no), self.phase, bn_beta1, bn_gamma1, bn_mov_mean1, bn_mov_var1, [1,stride[0],stride[0],1], False)\n\n            out1 = conv2d_batchnorm_init(cur, W_conv1, 
\"dep_layer_%d_%d_1\"%(layer_no,b_no), self.phase, [1,stride[0],stride[0],1], False)\n\n            print(\"layer_%d_%d_1\"%(layer_no,b_no), out1.shape)\n\n            \"\"\" if layer1 no downsample, so stride 2,1 then 1,1 \"\"\"\n            \"\"\"else stride 2,1 then downsample then 1,1 \"\"\"\n\n            if(layer_no > 1 and b_no == 1):\n                downsample_dict = self.parameters['layer%d_downsample'%layer_no]\n                W_dn = np.array(downsample_dict['block_1']['conv']['weight'], dtype = np.float32)\n                bn_mov_mean_dn = np.array(downsample_dict['block_1']['bn']['running_mean'], dtype = np.float32)\n                bn_mov_var_dn = np.array(downsample_dict['block_1']['bn']['running_var'], dtype = np.float32)\n                bn_gamma_dn = np.array(downsample_dict['block_1']['bn']['weight'], dtype = np.float32)\n                bn_beta_dn = np.array(downsample_dict['block_1']['bn']['bias'], dtype = np.float32)\n\n                # W_conv_dn = init_weights(W_dn, \"downsample_%d\"%(layer_no), False)\n                # res = conv2d_batchnorm(res, W_conv_dn, \"layer_dn_%d\"%(layer_no), self.phase, bn_beta_dn, bn_gamma_dn, bn_mov_mean_dn, bn_mov_var_dn, [1,2,2,1], False)\n\n                shapex = W_dn.shape\n\n                W_conv_dn = weight_variable([shapex[0], shapex[1], shapex[2]/2, shapex[3]/2], \"dep_downsample_%d\"%(layer_no))\n                res = conv2d_batchnorm_init(res, W_conv_dn, \"dep_layer_dn_%d\"%(layer_no), self.phase, [1,2,2,1], False)\n\n                print(\"downsample_layer_%d_%d_1\"%(layer_no,b_no), res.shape)\n\n                out1 = tf.nn.relu(out1 + res)\n\n            else:\n                out1 = tf.nn.relu(out1)\n\n            out2 = conv2d_batchnorm_init(out1, W_conv2, \"dep_layer_%d_%d_2\"%(layer_no,b_no), self.phase, [1,stride[1],stride[1],1], True)\n            print(\"layer_%d_%d_2\"%(layer_no,b_no), out2.shape)\n            cur = out2\n\n        return cur\n"
  },
  {
    "path": "code/common/resnet_rgb_model.py",
    "content": "import numpy as np\nimport tensorflow as tf\nimport json\nfrom cnn_utils_res import *\nimport config_res\n\nwith open(config_res.paths['resnet_params_path']) as f_in:\n    parameters = json.load(f_in)\n\n\nclass Resnet:\n\n    def __init__(self, input_x, phase, parameters = parameters):\n\n        self.input_x = input_x\n        self.phase = phase\n        self.parameters = parameters\n        self.layer_zero\n        self.layer\n        self.Net\n\n    def Net(self):\n        layer_zero_out = self.layer_zero(self.input_x)\n\n        current_output = layer_zero_out\n\n        for layer_idx in range(1,5):\n            layer_out = self.layer(current_output, layer_idx)\n            current_output = layer_out\n\n        return current_output\n\n    def layer_zero(self, layer_input):\n\n        layer_dict =self.parameters['layer0']\n        bl_str = \"block_1\"\n\n        W = np.array(layer_dict[bl_str]['conv1']['weight'], dtype = np.float32)\n        bn_mov_mean = np.array(layer_dict[bl_str]['bn1']['running_mean'], dtype = np.float32)\n        bn_mov_var = np.array(layer_dict[bl_str]['bn1']['running_var'], dtype = np.float32)\n        bn_gamma = np.array(layer_dict[bl_str]['bn1']['weight'], dtype = np.float32)\n        bn_beta = np.array(layer_dict[bl_str]['bn1']['bias'], dtype = np.float32)\n\n        W_conv = init_weights(W, \"_0\", False)\n\n        # with tf.name_scope('layer_0'):\n        out = conv2d_batchnorm(layer_input, W_conv, \"layer_0\", self.phase, bn_beta, bn_gamma, bn_mov_mean, bn_mov_var, [1,2,2,1], True)\n        out = tf.nn.max_pool(out, [1,3,3,1], strides=[1,2,2,1], padding=\"SAME\")\n\n        print('layer0', out.shape)\n        return out\n\n    def layer(self, layer_input, layer_no):\n        layer_dict = self.parameters['layer%d'%layer_no]\n\n        cur = layer_input\n        res = layer_input\n\n        for b_no in range(1,3):\n            bl_str = \"block_%d\"%b_no\n\n            stride = [0,0]\n            if(b_no == 1):\n     
           stride = [2,1]\n            else:\n                stride = [1,1]\n\n            # for in_bno in range(1,3):\n\n            W1 = np.array(layer_dict[bl_str]['conv1']['weight'], dtype = np.float32)\n            bn_mov_mean1 = np.array(layer_dict[bl_str]['bn1']['running_mean'], dtype = np.float32)\n            bn_mov_var1 = np.array(layer_dict[bl_str]['bn1']['running_var'], dtype = np.float32)\n            bn_gamma1 = np.array(layer_dict[bl_str]['bn1']['weight'], dtype = np.float32)\n            bn_beta1 = np.array(layer_dict[bl_str]['bn1']['bias'], dtype = np.float32)\n\n            W2 = np.array(layer_dict[bl_str]['conv2']['weight'], dtype = np.float32)\n            bn_mov_mean2 = np.array(layer_dict[bl_str]['bn2']['running_mean'], dtype = np.float32)\n            bn_mov_var2 = np.array(layer_dict[bl_str]['bn2']['running_var'], dtype = np.float32)\n            bn_gamma2 = np.array(layer_dict[bl_str]['bn2']['weight'], dtype = np.float32)\n            bn_beta2 = np.array(layer_dict[bl_str]['bn2']['bias'], dtype = np.float32)\n\n            W_conv1 = init_weights(W1, \"_l_%d_bl_%d_no_%d\"%(layer_no,b_no, 1), False)\n            W_conv2 = init_weights(W2, \"_l_%d_bl_%d_no_%d\"%(layer_no,b_no, 2), False)\n\n            # with tf.name_scope(\"layer_%d_%d_1\"%(layer_no,b_no)):\n            out1 = conv2d_batchnorm(cur, W_conv1, \"layer_%d_%d_1\"%(layer_no,b_no), self.phase, bn_beta1, bn_gamma1, bn_mov_mean1, bn_mov_var1, [1,stride[0],stride[0],1], False)\n\n            print(\"layer_%d_%d_1\"%(layer_no,b_no), out1.shape)\n\n            \"\"\" if layer1 no downsample, so stride 2,1 then 1,1 \"\"\"\n            \"\"\"else stride 2,1 then downsample then 1,1 \"\"\"\n\n            if(layer_no > 1 and b_no == 1):\n                downsample_dict = self.parameters['layer%d_downsample'%layer_no]\n                W_dn = np.array(downsample_dict['block_1']['conv']['weight'], dtype = np.float32)\n                bn_mov_mean_dn = 
np.array(downsample_dict['block_1']['bn']['running_mean'], dtype = np.float32)\n                bn_mov_var_dn = np.array(downsample_dict['block_1']['bn']['running_var'], dtype = np.float32)\n                bn_gamma_dn = np.array(downsample_dict['block_1']['bn']['weight'], dtype = np.float32)\n                bn_beta_dn = np.array(downsample_dict['block_1']['bn']['bias'], dtype = np.float32)\n\n                W_conv_dn = init_weights(W_dn, \"downsample_%d\"%(layer_no), False)\n                # with tf.name_scope(\"downsample_layer_%d_%d\"%(layer_no,b_no)):\n                res = conv2d_batchnorm(res, W_conv_dn, \"layer_dn_%d\"%(layer_no), self.phase, bn_beta_dn, bn_gamma_dn, bn_mov_mean_dn, bn_mov_var_dn, [1,2,2,1], False)\n\n                print(\"downsample_layer_%d_%d_1\"%(layer_no,b_no), res.shape)\n\n                out1 = tf.nn.relu(out1 + res)\n\n            else:\n                out1 = tf.nn.relu(out1)\n\n            # with tf.name_scope(\"layer_%d_%d_2\"%(layer_no,b_no)):\n            out2 = conv2d_batchnorm(out1, W_conv2, \"layer_%d_%d_2\"%(layer_no,b_no), self.phase, bn_beta2, bn_gamma2, bn_mov_mean2, bn_mov_var2, [1,stride[1],stride[1],1], True)\n            print(\"layer_%d_%d_2\"%(layer_no,b_no), out2.shape)\n            cur = out2\n\n        return cur\n"
  },
  {
    "path": "code/config_res.py",
    "content": "import numpy as np\n\n# Important paths\n\n# resnet_params_path: path containing .pkl file of RESNET18_BN weights\n# dataset_path_full: path to parser file\n# checkpoint_path: dir where checkpoints are saved and loaded from\n# training_imgs_path: dir to save training progress frames (spatial transformer outputs during training)\n# validation_imgs_path: dir to save validation progress frames (spatial transformer outputs during validation)\npaths = dict(\n\tresnet_params_path = \"../Extrinsic_Calibration_2/parameters.json\",\n\tdataset_path_full = \"/home/ganeshiyer/Extrinsic_Calibration_1.5/model_with_emd/rotation_results/parsed_set.txt\",\n\tcheckpoint_path = \"/tmp/ganesh_saved_Weights/Checkpoint_simple_transformer\",\n\ttraining_imgs_path = \"/tmp/ganesh_saved_Weights/training_imgs\",\n\tvalidation_imgs_path = \"/tmp/ganesh_saved_Weights/validation_imgs\"\n)\n\n# Depth Map parameters\n\n# IMG_HT: input image height\n# IMG_WDT: input image width\ndepth_img_params = dict(\n\tIMG_HT = 375,\n\tIMG_WDT = 1242\n)\n\n# Camera parameters\n# The current parameters are for the training set created for the all raw sequences: http://www.cvlibs.net/datasets/kitti/raw_data.php on  2011_09_26. 
Camera parameters change with the recording date of the sequence, hence they are kept in the config\n\n# fx: focal length x\n# fy: focal length y\n# cx: camera center cx\n# cy: camera center cy\n# cam_transform_02: Transform to color camera 02 (applied after the extrinsic transform)\n# cam_transform_02_inv: Inverse of the above translation\ncamera_params = dict(\n\tfx = 7.215377e+02,\n\tfy = 7.215377e+02,\n\tcx = 6.095593e+02,\n\tcy = 1.728540e+02,\n\n\tcam_transform_02 = np.array([1.0, 0.0, 0.0, (-4.485728e+01)/7.215377e+02,\n\t                             0.0, 1.0, 0.0, (-2.163791e-01)/7.215377e+02,\n\t                             0.0, 0.0, 1.0, -2.745884e-03,\n\t                             0.0, 0.0, 0.0, 1.0]).reshape(4,4),\n\n\tcam_transform_02_inv = np.array([1.0, 0.0, 0.0, (4.485728e+01)/7.215377e+02,\n\t                                 0.0, 1.0, 0.0, (2.163791e-01)/7.215377e+02,\n\t                                 0.0, 0.0, 1.0, 2.745884e-03,\n\t                                 0.0, 0.0, 0.0, 1.0]).reshape(4,4)\n)\n\n\n# Network and Training Parameters\n\n# batch_size: batch size used during training\n# total_frames: total instances (check parsed_set.txt for the total number of lines)\n# total_frames_train: total training instances\n# total_frames_validation: total validation instances\n# partition_limit: partition size of the total dataset loaded into memory during training\n# epochs: total number of epochs\n# learning_rate: learning rate for the Adam optimizer\n# beta1: momentum term for the Adam optimizer\n# load_epoch: checkpoint number loaded at the start of training (change to resume training)\nnet_params = dict(\n\tbatch_size = 20,\n\ttotal_frames = 30000,\n\ttotal_frames_train = 24000,\n\ttotal_frames_validation = 6000,\n\tpartition_limit = 1200,\n\tepochs = 40,\n\tlearning_rate = 5e-4,\n\tbeta1 = 0.9,\n\tload_epoch = 0\n)\n"
  },
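The two 4x4 matrices in `camera_params` are pure translations, so one is the inverse of the other. A quick sketch that assembles the intrinsic matrix from the config's `fx`, `fy`, `cx`, `cy` and checks the pair; the values are copied from the config above (in the repo you would read them via `import config_res as config` instead):

```python
import numpy as np

fx, fy = 7.215377e+02, 7.215377e+02
cx, cy = 6.095593e+02, 1.728540e+02

# 3x3 pinhole intrinsic matrix built from the config entries
K = np.array([[fx, 0.0, cx],
              [0.0, fy, cy],
              [0.0, 0.0, 1.0]])

cam_transform_02 = np.array([1.0, 0.0, 0.0, (-4.485728e+01)/fx,
                             0.0, 1.0, 0.0, (-2.163791e-01)/fy,
                             0.0, 0.0, 1.0, -2.745884e-03,
                             0.0, 0.0, 0.0, 1.0]).reshape(4, 4)

# a pure translation is inverted by negating the translation column
cam_transform_02_inv = cam_transform_02.copy()
cam_transform_02_inv[:3, 3] *= -1.0

print(np.allclose(cam_transform_02 @ cam_transform_02_inv, np.eye(4)))  # True
```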
  {
    "path": "code/dataset_files/dataset_build_color.py",
    "content": "\"\"\"\n\nFor sequence: 2011_09_26\n\n\"\"\"\n\nimport numpy as np\nimport matplotlib.pyplot as plt\nimport os\nimport glob\nimport argparse\nfrom natsort import natsorted as ns\nfrom skimage import io\n\nimport scipy.misc as smc\nplt.ion()\n\n\nIMG_HT = 375\nIMG_WDT = 1242\n\nfx = 7.215377e+02\nfy = 7.215377e+02\ncx = 6.095593e+02\ncy = 1.728540e+02\n\nK = np.array([7.215377e+02, 0.000000e+00, 6.095593e+02, 0.000000e+00, 7.215377e+02, 1.728540e+02, 0.000000e+00, 0.000000e+00, 1.000000e+00]).reshape(3,3)\n\nvelo_to_cam_R = np.array([7.533745e-03, -9.999714e-01, -6.166020e-04, 1.480249e-02, 7.280733e-04, -9.998902e-01, 9.998621e-01, 7.523790e-03, 1.480755e-02]).reshape(3,3)\nvelo_to_cam_T = np.array([-4.069766e-03, -7.631618e-02, -2.717806e-01]).reshape(3,1)\n\nvelo_to_cam = np.vstack((np.hstack((velo_to_cam_R, velo_to_cam_T)), np.array([[0,0,0,1]])))\n\nR_rect_00 =  np.array([9.999239e-01, 9.837760e-03, -7.445048e-03, 0.0,\n                      -9.869795e-03, 9.999421e-01, -4.278459e-03, 0.0,\n                       7.402527e-03, 4.351614e-03, 9.999631e-01,  0.0,\n                       0.0,          0.0,          0.0,           1.0]).reshape(4,4)\n\ncam_02_transform = np.array([1.0, 0.0, 0.0, 4.485728e+01/fx,\n                             0.0, 1.0, 0.0, 2.163791e-01/fy,\n                             0.0, 0.0, 1.0, 2.745884e-03,\n                             0.0, 0.0, 0.0, 1.0]).reshape(4,4)\n\n\nparser = argparse.ArgumentParser(description=\"Create Lidar Dataset\")\nparser.add_argument(\"path\", help = \"path_to_folder, end with number\", type = str)\nargs = parser.parse_args()\n\n#\n# main_path = \"/tmp/lidar_calibration_dataset/2011_09_26/2011_09_26_drive_0104\"\n\nmain_path = args.path\n\ndef timestamp_sync(path):\n    txt1 = np.loadtxt(path + \"_extract/velodyne_points/timestamps.txt\", dtype = str)\n    txt2 = np.loadtxt(path + \"_sync/velodyne_points/timestamps.txt\", dtype = str)\n    file_list = ns(glob.glob(path + 
\"_extract/velodyne_points/data/*.txt\"))\n\n    times1 = txt1[:,1]\n    times2 = txt2[:,1]\n\n    for idx in range(times1.shape[0]):\n        times1[idx] = times1[idx].split(\":\")[2]\n\n    for idx in range(times2.shape[0]):\n        times2[idx] = times2[idx].split(\":\")[2]\n\n    start_pt = times2[0]\n    end_pt = times2[-1]\n\n    index_start = np.where(times1 == start_pt)[0][0]\n    index_end = np.where(times1 == end_pt)[0][0]\n\n    return file_list[index_start:index_end+1]\n\nif not os.path.exists(main_path + \"_sync/depth_maps\"):\n    os.makedirs(main_path + \"_sync/depth_maps\")\n\nif not os.path.exists(main_path + \"_sync/target_imgs\"):\n    os.makedirs(main_path + \"_sync/target_imgs\")\n\nif not os.path.exists(main_path + \"_sync/depth_maps_transformed\"):\n    os.makedirs(main_path + \"_sync/depth_maps_transformed\")\n\ndepth_maps_folder = main_path + \"_sync/depth_maps\"\ntarget_img_folder = main_path + \"_sync/target_imgs\"\ndepth_maps_transformed_folder = main_path + \"_sync/depth_maps_transformed\"\n\npoint_files = timestamp_sync(main_path)\nimgs_files = ns(glob.glob(main_path + \"_sync/image_02/data/*.png\"))\n\nangle_limit = 0.34722965035593395/1.25\ntr_limit = 0.34722965035593395/1.25\n\nangle_list = np.zeros((1,16), dtype = np.float32)\n\nfor img_name, cloud_name in zip(imgs_files, point_files):\n\n    print(img_name, cloud_name)\n\n    omega_x = angle_limit*np.random.random_sample() - (angle_limit/2.0)\n    omega_y = angle_limit*np.random.random_sample() - (angle_limit/2.0)\n    omega_z = angle_limit*np.random.random_sample() - (angle_limit/2.0)\n    tr_x = tr_limit*np.random.random_sample() - (tr_limit/2.0)\n    tr_y = tr_limit*np.random.random_sample() - (tr_limit/2.0)\n    tr_z = tr_limit*np.random.random_sample() - (tr_limit/2.0)\n\n    theta = np.sqrt(omega_x**2 + omega_y**2 + omega_z**2)\n    omega_cross = np.array([0.0, -omega_z, omega_y, omega_z, 0.0, -omega_x, -omega_y, omega_x, 0.0]).reshape(3,3)\n\n    A = np.sin(theta)/theta\n   
 B = (1.0 - np.cos(theta))/(theta**2)\n\n    R = np.eye(3,3) + A*omega_cross + B*np.matmul(omega_cross, omega_cross)\n\n    T = np.array([tr_x, tr_y, tr_z]).reshape(3,1)\n\n    random_transform = np.vstack((np.hstack((R, T)), np.array([[0.0, 0.0, 0.0, 1.0]])))\n\n    to_write_tr = np.expand_dims(np.ndarray.flatten(random_transform), 0)\n    angle_list = np.vstack((angle_list, to_write_tr))\n\n    points = np.loadtxt(cloud_name)\n    points = points[:,:3]\n    ones_col = np.ones(shape=(points.shape[0],1))\n    points = np.hstack((points,ones_col))\n    current_img = smc.imread(img_name)\n    # current_img_color = smc.imread(color_name)\n\n    img = smc.imread(img_name)\n    img_ht = img.shape[0]\n    img_wdt = img.shape[1]\n\n    points_in_cam_axis = np.matmul(R_rect_00, (np.matmul(velo_to_cam, points.T)))\n    transformed_points = np.matmul(random_transform, points_in_cam_axis)\n    # transformed_points = transformed_points[:-1,:]\n\n    points_2d = np.matmul(K, np.matmul(cam_02_transform, transformed_points)[:-1,:])\n\n    # points_2d = np.matmul(K, points_in_cam_axis[:-1,:])\n\n    Z = points_2d[2,:]\n    x = (points_2d[0,:]/Z).T\n    y = (points_2d[1,:]/Z).T\n\n    x = np.clip(x, 0.0, img_wdt - 1)\n    y = np.clip(y, 0.0, img_ht - 1)\n\n    reprojected_img = np.zeros_like(img)\n    for x_idx, y_idx,z_idx in zip(x,y,Z):\n        if(z_idx>0):\n            reprojected_img[int(y_idx), int(x_idx)] = z_idx\n\n    smc.imsave(depth_maps_transformed_folder + \"/\" + img_name[-14:], reprojected_img)\n\n    points_2d = np.matmul(K, np.matmul(cam_02_transform, points_in_cam_axis)[:-1,:])\n\n    Z = points_2d[2,:]\n    x = (points_2d[0,:]/Z).T\n    y = (points_2d[1,:]/Z).T\n\n    x = np.clip(x, 0.0, img_wdt - 1)\n    y = np.clip(y, 0.0, img_ht - 1)\n\n    reprojected_img = np.zeros_like(img)\n    for x_idx, y_idx,z_idx in zip(x,y,Z):\n        if(z_idx>0):\n            reprojected_img[int(y_idx), int(x_idx)] = z_idx\n    pooled_img = reprojected_img\n\n    
print(img_name[-14:])\n\n    reconstructed_img = current_img*(pooled_img>0.)\n    smc.imsave(depth_maps_folder + \"/\" + img_name[-14:], pooled_img)\n    smc.imsave(target_img_folder + \"/\" + img_name[-14:], reconstructed_img)\n\nnp.savetxt(depth_maps_transformed_folder + \"/../angle_list.txt\", angle_list[1:], fmt = \"%.4f\")\n"
  },
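The random perturbation in the builder script is built with the Rodrigues formula, R = I + (sin θ/θ)[ω]× + ((1−cos θ)/θ²)[ω]×², where [ω]× is the skew-symmetric matrix of the axis-angle vector ω and θ = |ω|. A standalone version of that computation; unlike the script, it guards the θ → 0 limit (the script would produce NaNs if all three omegas happened to draw exactly zero):

```python
import numpy as np

def axis_angle_to_rotation(omega):
    """Rodrigues formula: axis-angle vector -> 3x3 rotation matrix."""
    omega = np.asarray(omega, dtype=np.float64)
    theta = np.linalg.norm(omega)
    if theta < 1e-12:          # theta -> 0 limit: identity rotation
        return np.eye(3)
    wx, wy, wz = omega
    omega_cross = np.array([[0.0, -wz,  wy],
                            [ wz, 0.0, -wx],
                            [-wy,  wx, 0.0]])
    A = np.sin(theta) / theta
    B = (1.0 - np.cos(theta)) / theta**2
    return np.eye(3) + A * omega_cross + B * (omega_cross @ omega_cross)

R = axis_angle_to_rotation([0.1, -0.2, 0.05])
print(np.allclose(R @ R.T, np.eye(3)))  # True: R is orthonormal
```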
  {
    "path": "code/dataset_files/dataset_build_color_2.py",
    "content": "\"\"\"\n\nFor sequence: 2011_09_26\n\n\"\"\"\n\nimport numpy as np\nimport matplotlib.pyplot as plt\nimport os\nimport glob\nimport argparse\nfrom natsort import natsorted as ns\nfrom skimage import io\n\nimport scipy.misc as smc\nplt.ion()\n\n\nIMG_HT = 375\nIMG_WDT = 1242\n\nfx = 7.215377e+02\nfy = 7.215377e+02\ncx = 6.095593e+02\ncy = 1.728540e+02\n\nK = np.array([7.215377e+02, 0.000000e+00, 6.095593e+02, 0.000000e+00, 7.215377e+02, 1.728540e+02, 0.000000e+00, 0.000000e+00, 1.000000e+00]).reshape(3,3)\n\nvelo_to_cam_R = np.array([7.533745e-03, -9.999714e-01, -6.166020e-04, 1.480249e-02, 7.280733e-04, -9.998902e-01, 9.998621e-01, 7.523790e-03, 1.480755e-02]).reshape(3,3)\nvelo_to_cam_T = np.array([-4.069766e-03, -7.631618e-02, -2.717806e-01]).reshape(3,1)\n\nvelo_to_cam = np.vstack((np.hstack((velo_to_cam_R, velo_to_cam_T)), np.array([[0,0,0,1]])))\n\nR_rect_00 =  np.array([9.999239e-01, 9.837760e-03, -7.445048e-03, 0.0,\n                      -9.869795e-03, 9.999421e-01, -4.278459e-03, 0.0,\n                       7.402527e-03, 4.351614e-03, 9.999631e-01,  0.0,\n                       0.0,          0.0,          0.0,           1.0]).reshape(4,4)\n\ncam_02_transform = np.array([1.0, 0.0, 0.0, 4.485728e+01/fx,\n                             0.0, 1.0, 0.0, 2.163791e-01/fy,\n                             0.0, 0.0, 1.0, 2.745884e-03,\n                             0.0, 0.0, 0.0, 1.0]).reshape(4,4)\n\n\nparser = argparse.ArgumentParser(description=\"Create Lidar Dataset\")\nparser.add_argument(\"path\", help = \"path_to_folder, end with number\", type = str)\nargs = parser.parse_args()\n\n#\n# main_path = \"/tmp/lidar_calibration_dataset/2011_09_26/2011_09_26_drive_0104\"\n\nmain_path = args.path\n\ndef timestamp_sync(path):\n    txt1 = np.loadtxt(path + \"_extract/velodyne_points/timestamps.txt\", dtype = str)\n    txt2 = np.loadtxt(path + \"_sync/velodyne_points/timestamps.txt\", dtype = str)\n    file_list = ns(glob.glob(path + 
\"_extract/velodyne_points/data/*.txt\"))\n\n    times1 = txt1[:,1]\n    times2 = txt2[:,1]\n\n    for idx in range(times1.shape[0]):\n        times1[idx] = times1[idx].split(\":\")[2]\n\n    for idx in range(times2.shape[0]):\n        times2[idx] = times2[idx].split(\":\")[2]\n\n    start_pt = times2[0]\n    end_pt = times2[-1]\n\n    index_start = np.where(times1 == start_pt)[0][0]\n    index_end = np.where(times1 == end_pt)[0][0]\n\n    return file_list[index_start:index_end+1]\n\nif not os.path.exists(main_path + \"_sync/depth_maps_2\"):\n    os.makedirs(main_path + \"_sync/depth_maps_2\")\n\nif not os.path.exists(main_path + \"_sync/target_imgs_2\"):\n    os.makedirs(main_path + \"_sync/target_imgs_2\")\n\nif not os.path.exists(main_path + \"_sync/depth_maps_transformed_2\"):\n    os.makedirs(main_path + \"_sync/depth_maps_transformed_2\")\n\ndepth_maps_folder = main_path + \"_sync/depth_maps_2\"\ntarget_img_folder = main_path + \"_sync/target_imgs_2\"\ndepth_maps_transformed_folder = main_path + \"_sync/depth_maps_transformed_2\"\n\npoint_files = timestamp_sync(main_path)\nimgs_files = ns(glob.glob(main_path + \"_sync/image_02/data/*.png\"))\n\nangle_limit = 0.34722965035593395/2.50\ntr_limit = 0.34722965035593395*1.5\n\nangle_list = np.zeros((1,16), dtype = np.float32)\n\nfor img_name, cloud_name in zip(imgs_files, point_files):\n\n    print(img_name, cloud_name)\n\n    omega_x = angle_limit*np.random.random_sample() - (angle_limit/2.0)\n    omega_y = angle_limit*np.random.random_sample() - (angle_limit/2.0)\n    omega_z = angle_limit*np.random.random_sample() - (angle_limit/2.0)\n    tr_x = tr_limit*np.random.random_sample() - (tr_limit/2.0)\n    tr_y = tr_limit*np.random.random_sample() - (tr_limit/2.0)\n    tr_z = tr_limit*np.random.random_sample() - (tr_limit/2.0)\n\n    theta = np.sqrt(omega_x**2 + omega_y**2 + omega_z**2)\n    omega_cross = np.array([0.0, -omega_z, omega_y, omega_z, 0.0, -omega_x, -omega_y, omega_x, 0.0]).reshape(3,3)\n\n    A = 
np.sin(theta)/theta\n    B = (1.0 - np.cos(theta))/(theta**2)\n\n    R = np.eye(3,3) + A*omega_cross + B*np.matmul(omega_cross, omega_cross)\n\n    T = np.array([tr_x, tr_y, tr_z]).reshape(3,1)\n\n    random_transform = np.vstack((np.hstack((R, T)), np.array([[0.0, 0.0, 0.0, 1.0]])))\n\n    to_write_tr = np.expand_dims(np.ndarray.flatten(random_transform), 0)\n    angle_list = np.vstack((angle_list, to_write_tr))\n\n    points = np.loadtxt(cloud_name)\n    points = points[:,:3]\n    ones_col = np.ones(shape=(points.shape[0],1))\n    points = np.hstack((points,ones_col))\n    current_img = smc.imread(img_name)\n    # current_img_color = smc.imread(color_name)\n\n    img = smc.imread(img_name)\n    img_ht = img.shape[0]\n    img_wdt = img.shape[1]\n\n    points_in_cam_axis = np.matmul(R_rect_00, (np.matmul(velo_to_cam, points.T)))\n    transformed_points = np.matmul(random_transform, points_in_cam_axis)\n    # transformed_points = transformed_points[:-1,:]\n\n    points_2d = np.matmul(K, np.matmul(cam_02_transform, transformed_points)[:-1,:])\n\n    # points_2d = np.matmul(K, points_in_cam_axis[:-1,:])\n\n    Z = points_2d[2,:]\n    x = (points_2d[0,:]/Z).T\n    y = (points_2d[1,:]/Z).T\n\n    x = np.clip(x, 0.0, img_wdt - 1)\n    y = np.clip(y, 0.0, img_ht - 1)\n\n    reprojected_img = np.zeros_like(img)\n    for x_idx, y_idx,z_idx in zip(x,y,Z):\n        if(z_idx>0):\n            reprojected_img[int(y_idx), int(x_idx)] = z_idx\n\n    smc.imsave(depth_maps_transformed_folder + \"/\" + img_name[-14:], reprojected_img)\n\n    points_2d = np.matmul(K, np.matmul(cam_02_transform, points_in_cam_axis)[:-1,:])\n\n    Z = points_2d[2,:]\n    x = (points_2d[0,:]/Z).T\n    y = (points_2d[1,:]/Z).T\n\n    x = np.clip(x, 0.0, img_wdt - 1)\n    y = np.clip(y, 0.0, img_ht - 1)\n\n    reprojected_img = np.zeros_like(img)\n    for x_idx, y_idx,z_idx in zip(x,y,Z):\n        if(z_idx>0):\n            reprojected_img[int(y_idx), int(x_idx)] = z_idx\n    pooled_img = 
reprojected_img\n\n    print(img_name[-14:])\n    print(np.max(pooled_img), np.min(pooled_img))\n\n    reconstructed_img = current_img*(pooled_img>0.)\n    smc.imsave(depth_maps_folder + \"/\" + img_name[-14:], pooled_img)\n    smc.imsave(target_img_folder + \"/\" + img_name[-14:], reconstructed_img)\n\nnp.savetxt(depth_maps_transformed_folder + \"/../angle_list_2.txt\", angle_list[1:], fmt = \"%.4f\")\n"
  },
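Both builder scripts take homogeneous LiDAR points through the extrinsics and K, divide by depth, and scatter the depths into an image; later points simply overwrite earlier ones at the same pixel. A numpy sketch of that projection step under stated assumptions: K, the image size, and the points below are made up, and this version keeps the nearest depth per pixel (a z-buffer) rather than the scripts' last-write-wins:

```python
import numpy as np

def project_to_depth_map(points_h, K, ht, wdt):
    """points_h: (4, N) homogeneous camera-frame points -> sparse depth map."""
    uvw = K @ points_h[:3, :]
    z = uvw[2, :]
    u = np.clip(uvw[0, :] / z, 0, wdt - 1).astype(int)   # pixel column
    v = np.clip(uvw[1, :] / z, 0, ht - 1).astype(int)    # pixel row
    depth = np.zeros((ht, wdt), dtype=np.float32)
    for ui, vi, zi in zip(u, v, z):
        if zi > 0 and (depth[vi, ui] == 0 or zi < depth[vi, ui]):
            depth[vi, ui] = zi           # keep the nearest point per pixel
    return depth

K = np.array([[100.0, 0.0, 32.0], [0.0, 100.0, 24.0], [0.0, 0.0, 1.0]])
pts = np.array([[0.0, 0.1],          # x of two points
                [0.0, 0.0],          # y
                [5.0, 5.0],          # z (depth)
                [1.0, 1.0]])         # homogeneous 1
d = project_to_depth_map(pts, K, 48, 64)
print(d[24, 32], d[24, 34])  # 5.0 5.0
```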
  {
    "path": "code/dataset_files/dataset_builder_parallel.sh",
    "content": "SCRIPTPATH=\"$( cd \"$(dirname \"$0\")\" ; pwd -P )\"\n\narray=( 0001 0002 0005 0009 0011 0013 0014 0015 0017 0018 0019 0020 0022 0023 0027 0028 0029 0032 0035 0036 0039 0046 0048 0051 0052 0056 0057 0059 0060 0061 0064 0070 0079 0084 0086 0087 0091 0093 0095 0096 0101 0104 0106 0113 0117 )\n\ntask()\n{\n    wget -k -c https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_26_drive_$1/2011_09_26_drive_$1_extract.zip\n    wget -k -c https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_26_drive_$1/2011_09_26_drive_$1_sync.zip\n    echo \"Downloading $1\"\n\n    unzip 2011_09_26_drive_$1_extract.zip\n    unzip 2011_09_26_drive_$1_sync.zip\n    echo \"Extracting $1\"\n\n    python -B $SCRIPTPATH/dataset_build_color.py $PWD/2011_09_26/2011_09_26_drive_$1\n    python -B $SCRIPTPATH/dataset_build_color_2.py $PWD/2011_09_26/2011_09_26_drive_$1\n    echo \"Perturbing Depth Maps for $1\"\n}\n\n\nN=4\n(\nfor id in {0..44}\ndo\n   ((i=i%N)); ((i++==0)) && wait\n   task ${array[$id]} &\ndone\nwait\n)\n\necho \"All processes done!\"\n"
  },
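`dataset_builder_parallel.sh` throttles itself to N=4 concurrent jobs by waiting after each batch of four background tasks (`((i=i%N)); ((i++==0)) && wait`). A Python analogue of that bound using a worker pool; `process_drive` is a hypothetical stand-in for the wget/unzip/build steps, not code from the repo:

```python
from multiprocessing import Pool

def process_drive(drive_id):
    # stand-in for: download both zips, unzip, run dataset_build_color*.py
    return "done %s" % drive_id

drive_ids = ["0001", "0002", "0005", "0009", "0011", "0013"]

if __name__ == "__main__":
    # at most 4 drives are processed concurrently, like N=4 in the script
    with Pool(processes=4) as pool:
        results = pool.map(process_drive, drive_ids)
    print(results[:2])  # ['done 0001', 'done 0002']
```

Unlike the batch-wise `wait` in the shell script, a pool starts a new job as soon as a worker frees up, so one slow download does not stall the other three.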
  {
    "path": "code/dataset_files/parser.py",
    "content": "import numpy as np\nimport scipy.misc as smc\nfrom natsort import natsorted as ns\nimport glob, os\nimport argparse\n\nparser = argparse.ArgumentParser(description=\"Create Lidar Dataset Parser file\")\nparser.add_argument(\"path\", help = \"path_to_folder\", type = str)\nargs = parser.parse_args()\n\ndataset_path = args.path\n\n#Picking up all sync folders\nfolder_names = ns(glob.glob(dataset_path +\"*_sync\" + os.path.sep))\n\ndataset_array = np.zeros(dtype = str, shape = (1,20))\ndataset_array_2 = np.zeros(dtype = str, shape = (1,20))\n\nfor fn in folder_names:\n    print fn\n    file_names_source = ns(glob.glob(fn + \"depth_maps_transformed/*.png\"))\n    file_names_target = ns(glob.glob(fn + \"depth_maps/*.png\"))\n    img_source = ns(glob.glob(fn + \"image_02/data/*.png\"))\n    img_target = ns(glob.glob(fn + \"image_03/data/*.png\"))\n    transforms_list = np.loadtxt(fn + \"angle_list.txt\", dtype = str)\n\n    file_names_source = np.array(file_names_source, dtype=str).reshape(-1,1)\n    file_names_target = np.array(file_names_target, dtype=str).reshape(-1,1)\n    img_source = np.array(img_source, dtype=str).reshape(-1,1)\n    img_target = np.array(img_target, dtype=str).reshape(-1,1)\n\n    dataset = np.hstack((file_names_source, file_names_target, img_source, img_target, transforms_list))\n    print(dataset.shape)\n\n    dataset_array = np.vstack((dataset_array, dataset))\n\n    #######################################################################################\n\n    file_names_source_2 = ns(glob.glob(fn + \"depth_maps_transformed_2/*.png\"))\n    file_names_target_2 = ns(glob.glob(fn + \"depth_maps_2/*.png\"))\n\n    transforms_list_2 = np.loadtxt(fn + \"angle_list_2.txt\", dtype = str)\n\n    file_names_source_2 = np.array(file_names_source_2, dtype=str).reshape(-1,1)\n    file_names_target_2 = np.array(file_names_target_2, dtype=str).reshape(-1,1)\n\n    dataset_2 = np.hstack((file_names_source_2, file_names_target_2, img_source, 
img_target, transforms_list_2))\n    print(dataset_2.shape)\n\n    dataset_array_2 = np.vstack((dataset_array_2, dataset_2))\n\n\n\n\ndataset_array = dataset_array[1:]\ndataset_array_2 = dataset_array_2[1:]\n\nfinal_array = np.vstack((dataset_array, dataset_array_2))\n\nnp.random.shuffle(final_array)\nnp.savetxt(\"parsed_set.txt\", final_array, fmt = \"%s\", delimiter=' ')\n"
  },
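Each line that `parser.py` writes to `parsed_set.txt` has 20 whitespace-separated fields: four paths (perturbed depth map, unperturbed depth map, image_02 frame, image_03 frame) followed by the 16 entries of the flattened 4x4 perturbation transform. A sketch of reading one row back; the paths below are made up:

```python
import numpy as np

# a fabricated parsed_set.txt row: 4 paths + 16 transform entries (identity)
row = ("dm_t/0000000000.png dm/0000000000.png im2/0000000000.png "
       "im3/0000000000.png "
       + " ".join(["1.0000", "0.0000", "0.0000", "0.0000",
                   "0.0000", "1.0000", "0.0000", "0.0000",
                   "0.0000", "0.0000", "1.0000", "0.0000",
                   "0.0000", "0.0000", "0.0000", "1.0000"]))

fields = row.split()
source_depth, target_depth, source_img, target_img = fields[:4]
transform = np.array(fields[4:], dtype=np.float32).reshape(4, 4)
print(transform.shape, np.allclose(transform, np.eye(4)))  # (4, 4) True
```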
  {
    "path": "code/model_utils.py",
    "content": "import os\nimport tensorflow as tf\nfrom tf_ops.emd import tf_auctionmatch\nfrom tf_ops.CD import tf_nndistance\nfrom tf_ops.sampling import tf_sampling\nfrom tf_ops.grouping.tf_grouping import query_ball_point, group_point\n\ndef pre_load_checkpoint(checkpoint_dir):\n    ckpt = tf.train.get_checkpoint_state(checkpoint_dir)\n    if ckpt and ckpt.model_checkpoint_path:\n        # print(\" [*] Reading checkpoint from {}\".format(ckpt.model_checkpoint_path))\n        epoch_step = int(os.path.basename(ckpt.model_checkpoint_path).split('-')[1])\n        return epoch_step,ckpt.model_checkpoint_path\n    else:\n        return 0,None\n\n\ndef get_repulsion_loss4(pred, nsample=20, radius=0.07):\n    # pred: (batch_size, npoint,3)\n    idx, pts_cnt = query_ball_point(radius, nsample, pred, pred)\n    tf.summary.histogram('smooth/unque_index', pts_cnt)\n\n    grouped_pred = group_point(pred, idx)  # (batch_size, npoint, nsample, 3)\n    grouped_pred -= tf.expand_dims(pred, 2)\n\n    ##get the uniform loss\n    h = 0.03\n    dist_square = tf.reduce_sum(grouped_pred ** 2, axis=-1)\n    dist_square, idx = tf.nn.top_k(-dist_square, 5)\n    dist_square = -dist_square[:, :, 1:]  # remove the first one\n    dist_square = tf.maximum(1e-12,dist_square)\n    dist = tf.sqrt(dist_square)\n    weight = tf.exp(-dist_square/h**2)\n    uniform_loss = tf.reduce_mean(radius-dist*weight)\n    return uniform_loss\n\n\ndef get_emd_loss(pred, gt):\n    \"\"\" pred: BxNxC,\n        label: BxN, \"\"\"\n    batch_size = pred.get_shape()[0].value\n    matchl_out, matchr_out = tf_auctionmatch.auction_match(pred, gt)\n    matched_out = tf_sampling.gather_point(gt, matchl_out)\n    dist = tf.reshape((pred - matched_out) ** 2, shape=(batch_size, -1))\n    emd_loss = tf.reduce_sum(dist)\n    return emd_loss\n\ndef get_cd_loss(pred, gt):\n    \"\"\" pred: BxNxC,\n        label: BxN, \"\"\"\n    dists_forward, _, dists_backward, _ = tf_nndistance.nn_distance(gt, pred)\n\n    #dists_forward is 
for each element in gt, the closest distance to this element\n    CD_dist = tf.reduce_sum(dists_forward) + tf.reduce_sum(dists_backward)\n    return CD_dist\n\n\nif __name__ == '__main__':\n    gt = tf.constant([[[1,0,0],[2,0,0],[3,0,0],[4,0,0]]],tf.float32)\n    pred = tf.constant([[[-10,0,0], [1,0, 0], [2,0, 0], [3,0,0]]],tf.float32)\n\n    dists_forward, idx1, dists_backward, idx2 = tf_nndistance.nn_distance(gt, pred)\n    with tf.Session() as sess:\n        print idx1.eval() # for each element in gt, the idx of the closest pred\n        print idx2.eval() # for each element in pred, the idx of the closest gt\n        print dists_forward.eval()\n        print dists_backward.eval()\n"
  },
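`get_cd_loss` sums, for every point in `gt`, the squared distance to its nearest neighbour in `pred`, plus the same in the other direction. A brute-force numpy reference for what the `tf_nndistance` op computes on the GPU, reusing the toy clouds from the `__main__` block above:

```python
import numpy as np

def chamfer_distance(gt, pred):
    """gt: (B, N, 3), pred: (B, M, 3) -> sum of squared NN distances."""
    # pairwise squared distances, shape (B, N, M)
    d2 = np.sum((gt[:, :, None, :] - pred[:, None, :, :]) ** 2, axis=-1)
    dists_forward = d2.min(axis=2)    # each gt point -> nearest pred point
    dists_backward = d2.min(axis=1)   # each pred point -> nearest gt point
    return dists_forward.sum() + dists_backward.sum()

gt = np.array([[[1, 0, 0], [2, 0, 0], [3, 0, 0], [4, 0, 0]]], np.float32)
pred = np.array([[[-10, 0, 0], [1, 0, 0], [2, 0, 0], [3, 0, 0]]], np.float32)
print(chamfer_distance(gt, pred))  # 1.0 forward + 121.0 backward = 122.0
```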
  {
    "path": "code/nw_loader_color.py",
    "content": "import numpy as np\nimport glob, os, argparse\nimport scipy.misc as smc\nfrom tqdm import tqdm\nimport matplotlib.pyplot as plt\n\nimport config_res as config\n\ntotal = config.net_params['total_frames']\ntotal_train = config.net_params['total_frames_train']\ntotal_validation = config.net_params['total_frames_validation']\npartition_limit = config.net_params['partition_limit']\n\nIMG_HT = config.depth_img_params['IMG_HT']\nIMG_WDT = config.depth_img_params['IMG_WDT']\nbatch_size = config.net_params['batch_size']\n\ndataset = np.loadtxt(config.paths['dataset_path_full'], dtype = str)\n\ndataset_train = dataset[:total_train]\ndataset_validation = dataset[total_train:total]\n\ndef shuffle():\n    np.random.shuffle(dataset_train)\n    np.random.shuffle(dataset_validation)\n\ndef load(p_no, mode):\n\n    if(mode == \"train\"):\n        dataset_part = dataset_train[p_no*partition_limit:(p_no + 1)*partition_limit]\n    elif(mode == \"validation\"):\n        dataset_part = dataset_validation[p_no*partition_limit:(p_no + 1)*partition_limit]\n    source_file_names = dataset_part[:,0]\n    target_file_names = dataset_part[:,1]\n    source_image_names = dataset_part[:,2]\n    target_image_names = dataset_part[:,3]\n    transforms = np.float32(dataset_part[:,4:])\n\n    target_container = np.zeros((partition_limit, IMG_HT, IMG_WDT, 1), dtype = np.float32)\n    source_container = np.zeros((partition_limit, IMG_HT, IMG_WDT, 1), dtype = np.float32)\n    source_img_container = np.zeros((partition_limit, IMG_HT, IMG_WDT, 3), dtype = np.float32)\n    target_img_container = np.zeros((partition_limit, IMG_HT, IMG_WDT, 3), dtype = np.float32)\n    transforms_container = np.zeros((partition_limit, 4, 4), dtype = np.float32)\n\n    c_idx = 0\n    for s_name, t_name, img_source_name, img_target_name, transform in tqdm(zip(source_file_names, target_file_names, source_image_names, target_image_names, transforms)):\n\n        warped_ip = np.float32(smc.imread(s_name, True))\n  
      warped_ip[0:5,:] = 0.0 ; warped_ip[:,0:5] = 0.0 ; warped_ip[IMG_HT - 5:,:] = 0.0 ; warped_ip[:,IMG_WDT-5:] = 0.0 ;\n        warped_ip = (warped_ip - 40.0)/40.0\n        source_container[c_idx, :, :, 0] = warped_ip\n\n        target_ip = np.float32(smc.imread(t_name, True))\n        target_ip[0:5,:] = 0.0 ; target_ip[:,0:5] = 0.0 ; target_ip[IMG_HT - 5:,:] = 0.0 ; target_ip[:,IMG_WDT-5:] = 0.0 ;\n        target_ip = (target_ip - 40.0)/40.0\n        target_container[c_idx, :, :, 0] = target_ip\n\n        source_img = np.float32(smc.imread(img_source_name))\n        source_img[0:5,:,:] = 0.0 ; source_img[:,0:5,:] = 0.0 ; source_img[IMG_HT - 5:,:,:] = 0.0 ; source_img[:,IMG_WDT-5:,:] = 0.0 ;\n        source_img = (source_img - 127.5)/127.5\n        source_img_container[c_idx, :, :, :] = source_img\n\n        target_img = np.float32(smc.imread(img_target_name))\n        target_img[0:5,:,:] = 0.0 ; target_img[:,0:5,:] = 0.0 ; target_img[IMG_HT - 5:,:,:] = 0.0 ; target_img[:,IMG_WDT-5:,:] = 0.0 ;\n        target_img = (target_img - 127.5)/127.5\n        target_img_container[c_idx, :, :, :] = target_img\n\n        transforms_container[c_idx, :, :] = np.linalg.inv(transform.reshape(4,4))\n        c_idx+=1\n\n    source_container = source_container.reshape(partition_limit/batch_size, batch_size, IMG_HT, IMG_WDT , 1)\n    target_container = target_container.reshape(partition_limit/batch_size, batch_size, IMG_HT, IMG_WDT , 1)\n    source_img_container = source_img_container.reshape(partition_limit/batch_size, batch_size, IMG_HT, IMG_WDT, 3)\n    target_img_container = target_img_container.reshape(partition_limit/batch_size, batch_size, IMG_HT, IMG_WDT, 3)\n    transforms_container = transforms_container.reshape(partition_limit/batch_size, batch_size, 4, 4)\n\n    return source_container, target_container, source_img_container, target_img_container, transforms_container\n"
  },
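The loader zeroes a 5-pixel border, normalizes depth maps as (d − 40)/40 and RGB as (img − 127.5)/127.5, then reshapes each partition into partition_limit/batch_size mini-batches (an exact integer division in the config: 1200/20 = 60). A sketch of the normalize-and-batch step with made-up sizes, not the repo's 375x1242 KITTI frames:

```python
import numpy as np

partition_limit, batch_size = 12, 4      # illustrative, config uses 1200/20
IMG_HT, IMG_WDT = 8, 16                  # illustrative image size

depth = np.random.uniform(0, 80, (partition_limit, IMG_HT, IMG_WDT, 1))
depth_norm = (depth - 40.0) / 40.0       # maps [0, 80] depth to [-1, 1]

# group the partition into (partitions_per_batch, batch, H, W, C)
batched = depth_norm.reshape(partition_limit // batch_size,
                             batch_size, IMG_HT, IMG_WDT, 1)
print(batched.shape)  # (3, 4, 8, 16, 1)
```

Note the `//`: the repo's `partition_limit/batch_size` is integer division on Python 2.7; on Python 3 the explicit floor division is required for `reshape`.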
  {
    "path": "code/tf_ops/CD/__init__.py",
    "content": ""
  },
  {
    "path": "code/tf_ops/CD/makefile",
    "content": "nvcc = /usr/local/cuda-8.0/bin/nvcc\ncudalib = /usr/local/cuda-8.0/lib64/\ntensorflow = /home/ganeshiyer/tensorflow/lib/python2.7/site-packages/tensorflow/include\n\nall: tf_nndistance_so.so render_balls_so.so\n.PHONY : all\n\ntf_nndistance_so.so: tf_nndistance_g.cu.o tf_nndistance.cpp\n\tg++ -std=c++11 tf_nndistance.cpp tf_nndistance_g.cu.o -o tf_nndistance_so.so -shared -fPIC -I $(tensorflow) -lcudart -L $(cudalib) -O2 -D_GLIBCXX_USE_CXX11_ABI=0\n\ntf_nndistance_g.cu.o: tf_nndistance_g.cu\n\t$(nvcc) -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++11 -c -o tf_nndistance_g.cu.o tf_nndistance_g.cu -I $(tensorflow) -DGOOGLE_CUDA=1 -x cu -Xcompiler -fPIC -O2\n\nrender_balls_so.so: render_balls_so.cpp\n\tg++ -std=c++11 render_balls_so.cpp -o render_balls_so.so -shared -fPIC -O2 -D_GLIBCXX_USE_CXX11_ABI=0\n\n\n"
  },
  {
    "path": "code/tf_ops/CD/render_balls_so.cpp",
    "content": "#include <cstdio>\n#include <vector>\n#include <algorithm>\n#include <math.h>\nusing namespace std;\n\nstruct PointInfo{\n\tint x,y,z;\n\tfloat r,g,b;\n};\n\nextern \"C\"{\n\nvoid render_ball(int h,int w,unsigned char * show,int n,int * xyzs,float * c0,float * c1,float * c2,int r){\n\tr=max(r,1);\n\tvector<int> depth(h*w,-2100000000);\n\tvector<PointInfo> pattern;\n\tfor (int dx=-r;dx<=r;dx++)\n\t\tfor (int dy=-r;dy<=r;dy++)\n\t\t\tif (dx*dx+dy*dy<r*r){\n\t\t\t\tdouble dz=sqrt(double(r*r-dx*dx-dy*dy));\n\t\t\t\tPointInfo pinfo;\n\t\t\t\tpinfo.x=dx;\n\t\t\t\tpinfo.y=dy;\n\t\t\t\tpinfo.z=dz;\n\t\t\t\tpinfo.r=dz/r;\n\t\t\t\tpinfo.g=dz/r;\n\t\t\t\tpinfo.b=dz/r;\n\t\t\t\tpattern.push_back(pinfo);\n\t\t\t}\n\tdouble zmin=0,zmax=0;\n\tfor (int i=0;i<n;i++){\n\t\tif (i==0){\n\t\t\tzmin=xyzs[i*3+2]-r;\n\t\t\tzmax=xyzs[i*3+2]+r;\n\t\t}else{\n\t\t\tzmin=min(zmin,double(xyzs[i*3+2]-r));\n\t\t\tzmax=max(zmax,double(xyzs[i*3+2]+r));\n\t\t}\n\t}\n\tfor (int i=0;i<n;i++){\n\t\tint x=xyzs[i*3+0],y=xyzs[i*3+1],z=xyzs[i*3+2];\n\t\tfor (int j=0;j<int(pattern.size());j++){\n\t\t\tint x2=x+pattern[j].x;\n\t\t\tint y2=y+pattern[j].y;\n\t\t\tint z2=z+pattern[j].z;\n\t\t\tif (!(x2<0 || x2>=h || y2<0 || y2>=w) && depth[x2*w+y2]<z2){\n\t\t\t\tdepth[x2*w+y2]=z2;\n\t\t\t\tdouble intensity=min(1.0,(z2-zmin)/(zmax-zmin)*0.7+0.3);\n\t\t\t\tshow[(x2*w+y2)*3+0]=pattern[j].b*c2[i]*intensity;\n\t\t\t\tshow[(x2*w+y2)*3+1]=pattern[j].g*c0[i]*intensity;\n\t\t\t\tshow[(x2*w+y2)*3+2]=pattern[j].r*c1[i]*intensity;\n\t\t\t}\n\t\t}\n\t}\n}\n\n}//extern \"C\"\n"
  },
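`render_ball` splats each projected point as a shaded disc: every integer offset (dx, dy) with dx² + dy² < r² gets a pseudo-depth dz = sqrt(r² − dx² − dy²), which also drives the shading, and a per-pixel depth test keeps the frontmost ball. A numpy sketch of that pattern generation; the function name is illustrative:

```python
import numpy as np

def ball_pattern(r):
    """Offsets, pseudo-depths, and shades for a radius-r sphere impostor."""
    r = max(r, 1)
    dx, dy = np.meshgrid(np.arange(-r, r + 1), np.arange(-r, r + 1))
    inside = dx * dx + dy * dy < r * r              # pixels on the disc
    dz = np.sqrt((r * r - dx * dx - dy * dy)[inside].astype(float))
    return dx[inside], dy[inside], dz, dz / r       # shade peaks at center

dx, dy, dz, shade = ball_pattern(3)
print(dx.size)  # 25 lattice points satisfy dx^2 + dy^2 < 9
```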
  {
    "path": "code/tf_ops/CD/tf_nndistance.cpp",
    "content": "#include \"tensorflow/core/framework/op.h\"\n#include \"tensorflow/core/framework/op_kernel.h\"\nREGISTER_OP(\"NnDistance\")\n\t.Input(\"xyz1: float32\")\n\t.Input(\"xyz2: float32\")\n\t.Output(\"dist1: float32\")\n\t.Output(\"idx1: int32\")\n\t.Output(\"dist2: float32\")\n\t.Output(\"idx2: int32\");\nREGISTER_OP(\"NnDistanceGrad\")\n\t.Input(\"xyz1: float32\")\n\t.Input(\"xyz2: float32\")\n\t.Input(\"grad_dist1: float32\")\n\t.Input(\"idx1: int32\")\n\t.Input(\"grad_dist2: float32\")\n\t.Input(\"idx2: int32\")\n\t.Output(\"grad_xyz1: float32\")\n\t.Output(\"grad_xyz2: float32\");\nusing namespace tensorflow;\n\nstatic void nnsearch(int b,int n,int m,const float * xyz1,const float * xyz2,float * dist,int * idx){\n\tfor (int i=0;i<b;i++){\n\t\tfor (int j=0;j<n;j++){\n\t\t\tfloat x1=xyz1[(i*n+j)*3+0];\n\t\t\tfloat y1=xyz1[(i*n+j)*3+1];\n\t\t\tfloat z1=xyz1[(i*n+j)*3+2];\n\t\t\tdouble best=0;\n\t\t\tint besti=0;\n\t\t\tfor (int k=0;k<m;k++){\n\t\t\t\tfloat x2=xyz2[(i*m+k)*3+0]-x1;\n\t\t\t\tfloat y2=xyz2[(i*m+k)*3+1]-y1;\n\t\t\t\tfloat z2=xyz2[(i*m+k)*3+2]-z1;\n\t\t\t\tdouble d=x2*x2+y2*y2+z2*z2;\n\t\t\t\tif (k==0 || d<best){\n\t\t\t\t\tbest=d;\n\t\t\t\t\tbesti=k;\n\t\t\t\t}\n\t\t\t}\n\t\t\tdist[i*n+j]=best;\n\t\t\tidx[i*n+j]=besti;\n\t\t}\n\t}\n}\n\nclass NnDistanceOp : public OpKernel{\n\tpublic:\n\t\texplicit NnDistanceOp(OpKernelConstruction* context):OpKernel(context){}\n\t\tvoid Compute(OpKernelContext * context)override{\n\t\t\tconst Tensor& xyz1_tensor=context->input(0);\n\t\t\tconst Tensor& xyz2_tensor=context->input(1);\n\t\t\tOP_REQUIRES(context,xyz1_tensor.dims()==3,errors::InvalidArgument(\"NnDistance requires xyz1 be of shape (batch,#points,3)\"));\n\t\t\tOP_REQUIRES(context,xyz1_tensor.shape().dim_size(2)==3,errors::InvalidArgument(\"NnDistance only accepts 3d point set xyz1\"));\n\t\t\tint b=xyz1_tensor.shape().dim_size(0);\n\t\t\tint 
n=xyz1_tensor.shape().dim_size(1);\n\t\t\tOP_REQUIRES(context,xyz2_tensor.dims()==3,errors::InvalidArgument(\"NnDistance requires xyz2 be of shape (batch,#points,3)\"));\n\t\t\tOP_REQUIRES(context,xyz2_tensor.shape().dim_size(2)==3,errors::InvalidArgument(\"NnDistance only accepts 3d point set xyz2\"));\n\t\t\tint m=xyz2_tensor.shape().dim_size(1);\n\t\t\tOP_REQUIRES(context,xyz2_tensor.shape().dim_size(0)==b,errors::InvalidArgument(\"NnDistance expects xyz1 and xyz2 have same batch size\"));\n\t\t\tauto xyz1_flat=xyz1_tensor.flat<float>();\n\t\t\tconst float * xyz1=&xyz1_flat(0);\n\t\t\tauto xyz2_flat=xyz2_tensor.flat<float>();\n\t\t\tconst float * xyz2=&xyz2_flat(0);\n\t\t\tTensor * dist1_tensor=NULL;\n\t\t\tTensor * idx1_tensor=NULL;\n\t\t\tTensor * dist2_tensor=NULL;\n\t\t\tTensor * idx2_tensor=NULL;\n\t\t\tOP_REQUIRES_OK(context,context->allocate_output(0,TensorShape{b,n},&dist1_tensor));\n\t\t\tOP_REQUIRES_OK(context,context->allocate_output(1,TensorShape{b,n},&idx1_tensor));\n\t\t\tauto dist1_flat=dist1_tensor->flat<float>();\n\t\t\tauto idx1_flat=idx1_tensor->flat<int>();\n\t\t\tOP_REQUIRES_OK(context,context->allocate_output(2,TensorShape{b,m},&dist2_tensor));\n\t\t\tOP_REQUIRES_OK(context,context->allocate_output(3,TensorShape{b,m},&idx2_tensor));\n\t\t\tauto dist2_flat=dist2_tensor->flat<float>();\n\t\t\tauto idx2_flat=idx2_tensor->flat<int>();\n\t\t\tfloat * dist1=&(dist1_flat(0));\n\t\t\tint * idx1=&(idx1_flat(0));\n\t\t\tfloat * dist2=&(dist2_flat(0));\n\t\t\tint * idx2=&(idx2_flat(0));\n\t\t\tnnsearch(b,n,m,xyz1,xyz2,dist1,idx1);\n\t\t\tnnsearch(b,m,n,xyz2,xyz1,dist2,idx2);\n\t\t}\n};\nREGISTER_KERNEL_BUILDER(Name(\"NnDistance\").Device(DEVICE_CPU), NnDistanceOp);\nclass NnDistanceGradOp : public OpKernel{\n\tpublic:\n\t\texplicit NnDistanceGradOp(OpKernelConstruction* context):OpKernel(context){}\n\t\tvoid Compute(OpKernelContext * context)override{\n\t\t\tconst Tensor& xyz1_tensor=context->input(0);\n\t\t\tconst Tensor& 
xyz2_tensor=context->input(1);\n\t\t\tconst Tensor& grad_dist1_tensor=context->input(2);\n\t\t\tconst Tensor& idx1_tensor=context->input(3);\n\t\t\tconst Tensor& grad_dist2_tensor=context->input(4);\n\t\t\tconst Tensor& idx2_tensor=context->input(5);\n\t\t\tOP_REQUIRES(context,xyz1_tensor.dims()==3,errors::InvalidArgument(\"NnDistanceGrad requires xyz1 be of shape (batch,#points,3)\"));\n\t\t\tOP_REQUIRES(context,xyz1_tensor.shape().dim_size(2)==3,errors::InvalidArgument(\"NnDistanceGrad only accepts 3d point set xyz1\"));\n\t\t\tint b=xyz1_tensor.shape().dim_size(0);\n\t\t\tint n=xyz1_tensor.shape().dim_size(1);\n\t\t\tOP_REQUIRES(context,xyz2_tensor.dims()==3,errors::InvalidArgument(\"NnDistanceGrad requires xyz2 be of shape (batch,#points,3)\"));\n\t\t\tOP_REQUIRES(context,xyz2_tensor.shape().dim_size(2)==3,errors::InvalidArgument(\"NnDistanceGrad only accepts 3d point set xyz2\"));\n\t\t\tint m=xyz2_tensor.shape().dim_size(1);\n\t\t\tOP_REQUIRES(context,xyz2_tensor.shape().dim_size(0)==b,errors::InvalidArgument(\"NnDistanceGrad expects xyz1 and xyz2 have same batch size\"));\n\t\t\tOP_REQUIRES(context,grad_dist1_tensor.shape()==(TensorShape{b,n}),errors::InvalidArgument(\"NnDistanceGrad requires grad_dist1 be of shape(batch,#points)\"));\n\t\t\tOP_REQUIRES(context,idx1_tensor.shape()==(TensorShape{b,n}),errors::InvalidArgument(\"NnDistanceGrad requires idx1 be of shape(batch,#points)\"));\n\t\t\tOP_REQUIRES(context,grad_dist2_tensor.shape()==(TensorShape{b,m}),errors::InvalidArgument(\"NnDistanceGrad requires grad_dist2 be of shape(batch,#points)\"));\n\t\t\tOP_REQUIRES(context,idx2_tensor.shape()==(TensorShape{b,m}),errors::InvalidArgument(\"NnDistanceGrad requires idx2 be of shape(batch,#points)\"));\n\t\t\tauto xyz1_flat=xyz1_tensor.flat<float>();\n\t\t\tconst float * xyz1=&xyz1_flat(0);\n\t\t\tauto xyz2_flat=xyz2_tensor.flat<float>();\n\t\t\tconst float * xyz2=&xyz2_flat(0);\n\t\t\tauto idx1_flat=idx1_tensor.flat<int>();\n\t\t\tconst int * 
idx1=&idx1_flat(0);\n\t\t\tauto idx2_flat=idx2_tensor.flat<int>();\n\t\t\tconst int * idx2=&idx2_flat(0);\n\t\t\tauto grad_dist1_flat=grad_dist1_tensor.flat<float>();\n\t\t\tconst float * grad_dist1=&grad_dist1_flat(0);\n\t\t\tauto grad_dist2_flat=grad_dist2_tensor.flat<float>();\n\t\t\tconst float * grad_dist2=&grad_dist2_flat(0);\n\t\t\tTensor * grad_xyz1_tensor=NULL;\n\t\t\tOP_REQUIRES_OK(context,context->allocate_output(0,TensorShape{b,n,3},&grad_xyz1_tensor));\n\t\t\tTensor * grad_xyz2_tensor=NULL;\n\t\t\tOP_REQUIRES_OK(context,context->allocate_output(1,TensorShape{b,m,3},&grad_xyz2_tensor));\n\t\t\tauto grad_xyz1_flat=grad_xyz1_tensor->flat<float>();\n\t\t\tfloat * grad_xyz1=&grad_xyz1_flat(0);\n\t\t\tauto grad_xyz2_flat=grad_xyz2_tensor->flat<float>();\n\t\t\tfloat * grad_xyz2=&grad_xyz2_flat(0);\n\t\t\tfor (int i=0;i<b*n*3;i++)\n\t\t\t\tgrad_xyz1[i]=0;\n\t\t\tfor (int i=0;i<b*m*3;i++)\n\t\t\t\tgrad_xyz2[i]=0;\n\t\t\tfor (int i=0;i<b;i++){\n\t\t\t\tfor (int j=0;j<n;j++){\n\t\t\t\t\tfloat x1=xyz1[(i*n+j)*3+0];\n\t\t\t\t\tfloat y1=xyz1[(i*n+j)*3+1];\n\t\t\t\t\tfloat z1=xyz1[(i*n+j)*3+2];\n\t\t\t\t\tint j2=idx1[i*n+j];\n\t\t\t\t\tfloat x2=xyz2[(i*m+j2)*3+0];\n\t\t\t\t\tfloat y2=xyz2[(i*m+j2)*3+1];\n\t\t\t\t\tfloat z2=xyz2[(i*m+j2)*3+2];\n\t\t\t\t\tfloat g=grad_dist1[i*n+j]*2;\n\t\t\t\t\tgrad_xyz1[(i*n+j)*3+0]+=g*(x1-x2);\n\t\t\t\t\tgrad_xyz1[(i*n+j)*3+1]+=g*(y1-y2);\n\t\t\t\t\tgrad_xyz1[(i*n+j)*3+2]+=g*(z1-z2);\n\t\t\t\t\tgrad_xyz2[(i*m+j2)*3+0]-=(g*(x1-x2));\n\t\t\t\t\tgrad_xyz2[(i*m+j2)*3+1]-=(g*(y1-y2));\n\t\t\t\t\tgrad_xyz2[(i*m+j2)*3+2]-=(g*(z1-z2));\n\t\t\t\t}\n\t\t\t\tfor (int j=0;j<m;j++){\n\t\t\t\t\tfloat x1=xyz2[(i*m+j)*3+0];\n\t\t\t\t\tfloat y1=xyz2[(i*m+j)*3+1];\n\t\t\t\t\tfloat z1=xyz2[(i*m+j)*3+2];\n\t\t\t\t\tint j2=idx2[i*m+j];\n\t\t\t\t\tfloat x2=xyz1[(i*n+j2)*3+0];\n\t\t\t\t\tfloat y2=xyz1[(i*n+j2)*3+1];\n\t\t\t\t\tfloat z2=xyz1[(i*n+j2)*3+2];\n\t\t\t\t\tfloat 
g=grad_dist2[i*m+j]*2;\n\t\t\t\t\tgrad_xyz2[(i*m+j)*3+0]+=g*(x1-x2);\n\t\t\t\t\tgrad_xyz2[(i*m+j)*3+1]+=g*(y1-y2);\n\t\t\t\t\tgrad_xyz2[(i*m+j)*3+2]+=g*(z1-z2);\n\t\t\t\t\tgrad_xyz1[(i*n+j2)*3+0]-=(g*(x1-x2));\n\t\t\t\t\tgrad_xyz1[(i*n+j2)*3+1]-=(g*(y1-y2));\n\t\t\t\t\tgrad_xyz1[(i*n+j2)*3+2]-=(g*(z1-z2));\n\t\t\t\t}\n\t\t\t}\n\t\t}\n};\nREGISTER_KERNEL_BUILDER(Name(\"NnDistanceGrad\").Device(DEVICE_CPU), NnDistanceGradOp);\n\nvoid NmDistanceKernelLauncher(int b,int n,const float * xyz,int m,const float * xyz2,float * result,int * result_i,float * result2,int * result2_i);\nclass NnDistanceGpuOp : public OpKernel{\n\tpublic:\n\t\texplicit NnDistanceGpuOp(OpKernelConstruction* context):OpKernel(context){}\n\t\tvoid Compute(OpKernelContext * context)override{\n\t\t\tconst Tensor& xyz1_tensor=context->input(0);\n\t\t\tconst Tensor& xyz2_tensor=context->input(1);\n\t\t\tOP_REQUIRES(context,xyz1_tensor.dims()==3,errors::InvalidArgument(\"NnDistance requires xyz1 be of shape (batch,#points,3)\"));\n\t\t\tOP_REQUIRES(context,xyz1_tensor.shape().dim_size(2)==3,errors::InvalidArgument(\"NnDistance only accepts 3d point set xyz1\"));\n\t\t\tint b=xyz1_tensor.shape().dim_size(0);\n\t\t\tint n=xyz1_tensor.shape().dim_size(1);\n\t\t\tOP_REQUIRES(context,xyz2_tensor.dims()==3,errors::InvalidArgument(\"NnDistance requires xyz2 be of shape (batch,#points,3)\"));\n\t\t\tOP_REQUIRES(context,xyz2_tensor.shape().dim_size(2)==3,errors::InvalidArgument(\"NnDistance only accepts 3d point set xyz2\"));\n\t\t\tint m=xyz2_tensor.shape().dim_size(1);\n\t\t\tOP_REQUIRES(context,xyz2_tensor.shape().dim_size(0)==b,errors::InvalidArgument(\"NnDistance expects xyz1 and xyz2 have same batch size\"));\n\t\t\tauto xyz1_flat=xyz1_tensor.flat<float>();\n\t\t\tconst float * xyz1=&xyz1_flat(0);\n\t\t\tauto xyz2_flat=xyz2_tensor.flat<float>();\n\t\t\tconst float * xyz2=&xyz2_flat(0);\n\t\t\tTensor * dist1_tensor=NULL;\n\t\t\tTensor * idx1_tensor=NULL;\n\t\t\tTensor * dist2_tensor=NULL;\n\t\t\tTensor * 
idx2_tensor=NULL;\n\t\t\tOP_REQUIRES_OK(context,context->allocate_output(0,TensorShape{b,n},&dist1_tensor));\n\t\t\tOP_REQUIRES_OK(context,context->allocate_output(1,TensorShape{b,n},&idx1_tensor));\n\t\t\tauto dist1_flat=dist1_tensor->flat<float>();\n\t\t\tauto idx1_flat=idx1_tensor->flat<int>();\n\t\t\tOP_REQUIRES_OK(context,context->allocate_output(2,TensorShape{b,m},&dist2_tensor));\n\t\t\tOP_REQUIRES_OK(context,context->allocate_output(3,TensorShape{b,m},&idx2_tensor));\n\t\t\tauto dist2_flat=dist2_tensor->flat<float>();\n\t\t\tauto idx2_flat=idx2_tensor->flat<int>();\n\t\t\tfloat * dist1=&(dist1_flat(0));\n\t\t\tint * idx1=&(idx1_flat(0));\n\t\t\tfloat * dist2=&(dist2_flat(0));\n\t\t\tint * idx2=&(idx2_flat(0));\n\t\t\tNmDistanceKernelLauncher(b,n,xyz1,m,xyz2,dist1,idx1,dist2,idx2);\n\t\t}\n};\nREGISTER_KERNEL_BUILDER(Name(\"NnDistance\").Device(DEVICE_GPU), NnDistanceGpuOp);\n\nvoid NmDistanceGradKernelLauncher(int b,int n,const float * xyz1,int m,const float * xyz2,const float * grad_dist1,const int * idx1,const float * grad_dist2,const int * idx2,float * grad_xyz1,float * grad_xyz2);\nclass NnDistanceGradGpuOp : public OpKernel{\n\tpublic:\n\t\texplicit NnDistanceGradGpuOp(OpKernelConstruction* context):OpKernel(context){}\n\t\tvoid Compute(OpKernelContext * context)override{\n\t\t\tconst Tensor& xyz1_tensor=context->input(0);\n\t\t\tconst Tensor& xyz2_tensor=context->input(1);\n\t\t\tconst Tensor& grad_dist1_tensor=context->input(2);\n\t\t\tconst Tensor& idx1_tensor=context->input(3);\n\t\t\tconst Tensor& grad_dist2_tensor=context->input(4);\n\t\t\tconst Tensor& idx2_tensor=context->input(5);\n\t\t\tOP_REQUIRES(context,xyz1_tensor.dims()==3,errors::InvalidArgument(\"NnDistanceGrad requires xyz1 be of shape (batch,#points,3)\"));\n\t\t\tOP_REQUIRES(context,xyz1_tensor.shape().dim_size(2)==3,errors::InvalidArgument(\"NnDistanceGrad only accepts 3d point set xyz1\"));\n\t\t\tint b=xyz1_tensor.shape().dim_size(0);\n\t\t\tint 
n=xyz1_tensor.shape().dim_size(1);\n\t\t\tOP_REQUIRES(context,xyz2_tensor.dims()==3,errors::InvalidArgument(\"NnDistanceGrad requires xyz2 be of shape (batch,#points,3)\"));\n\t\t\tOP_REQUIRES(context,xyz2_tensor.shape().dim_size(2)==3,errors::InvalidArgument(\"NnDistanceGrad only accepts 3d point set xyz2\"));\n\t\t\tint m=xyz2_tensor.shape().dim_size(1);\n\t\t\tOP_REQUIRES(context,xyz2_tensor.shape().dim_size(0)==b,errors::InvalidArgument(\"NnDistanceGrad expects xyz1 and xyz2 have same batch size\"));\n\t\t\tOP_REQUIRES(context,grad_dist1_tensor.shape()==(TensorShape{b,n}),errors::InvalidArgument(\"NnDistanceGrad requires grad_dist1 be of shape(batch,#points)\"));\n\t\t\tOP_REQUIRES(context,idx1_tensor.shape()==(TensorShape{b,n}),errors::InvalidArgument(\"NnDistanceGrad requires idx1 be of shape(batch,#points)\"));\n\t\t\tOP_REQUIRES(context,grad_dist2_tensor.shape()==(TensorShape{b,m}),errors::InvalidArgument(\"NnDistanceGrad requires grad_dist2 be of shape(batch,#points)\"));\n\t\t\tOP_REQUIRES(context,idx2_tensor.shape()==(TensorShape{b,m}),errors::InvalidArgument(\"NnDistanceGrad requires idx2 be of shape(batch,#points)\"));\n\t\t\tauto xyz1_flat=xyz1_tensor.flat<float>();\n\t\t\tconst float * xyz1=&xyz1_flat(0);\n\t\t\tauto xyz2_flat=xyz2_tensor.flat<float>();\n\t\t\tconst float * xyz2=&xyz2_flat(0);\n\t\t\tauto idx1_flat=idx1_tensor.flat<int>();\n\t\t\tconst int * idx1=&idx1_flat(0);\n\t\t\tauto idx2_flat=idx2_tensor.flat<int>();\n\t\t\tconst int * idx2=&idx2_flat(0);\n\t\t\tauto grad_dist1_flat=grad_dist1_tensor.flat<float>();\n\t\t\tconst float * grad_dist1=&grad_dist1_flat(0);\n\t\t\tauto grad_dist2_flat=grad_dist2_tensor.flat<float>();\n\t\t\tconst float * grad_dist2=&grad_dist2_flat(0);\n\t\t\tTensor * grad_xyz1_tensor=NULL;\n\t\t\tOP_REQUIRES_OK(context,context->allocate_output(0,TensorShape{b,n,3},&grad_xyz1_tensor));\n\t\t\tTensor * 
grad_xyz2_tensor=NULL;\n\t\t\tOP_REQUIRES_OK(context,context->allocate_output(1,TensorShape{b,m,3},&grad_xyz2_tensor));\n\t\t\tauto grad_xyz1_flat=grad_xyz1_tensor->flat<float>();\n\t\t\tfloat * grad_xyz1=&grad_xyz1_flat(0);\n\t\t\tauto grad_xyz2_flat=grad_xyz2_tensor->flat<float>();\n\t\t\tfloat * grad_xyz2=&grad_xyz2_flat(0);\n\t\t\tNmDistanceGradKernelLauncher(b,n,xyz1,m,xyz2,grad_dist1,idx1,grad_dist2,idx2,grad_xyz1,grad_xyz2);\n\t\t}\n};\nREGISTER_KERNEL_BUILDER(Name(\"NnDistanceGrad\").Device(DEVICE_GPU), NnDistanceGradGpuOp);\n"
  },
  {
    "path": "code/tf_ops/CD/tf_nndistance.py",
    "content": "import os\nimport tensorflow as tf\nfrom tensorflow.python.framework import ops\n\nBASE_DIR = os.path.dirname(os.path.abspath(__file__))\n\nnn_distance_module=tf.load_op_library(os.path.join(BASE_DIR, 'tf_nndistance_so.so'))\n\ndef nn_distance(xyz1,xyz2):\n\t'''\nComputes the distance of nearest neighbors for a pair of point clouds\ninput: xyz1: (batch_size,#points_1,3)  the first point cloud\ninput: xyz2: (batch_size,#points_2,3)  the second point cloud\noutput: dist1: (batch_size,#point_1)   distance from first to second\noutput: idx1:  (batch_size,#point_1)   nearest neighbor from first to second\noutput: dist2: (batch_size,#point_2)   distance from second to first\noutput: idx2:  (batch_size,#point_2)   nearest neighbor from second to first\n\t'''\n\treturn nn_distance_module.nn_distance(xyz1,xyz2)\n#@tf.RegisterShape('NnDistance')\n#def _nn_distance_shape(op):\n\t#shape1=op.inputs[0].get_shape().with_rank(3)\n\t#shape2=op.inputs[1].get_shape().with_rank(3)\n\t#return [tf.TensorShape([shape1.dims[0],shape1.dims[1]]),tf.TensorShape([shape1.dims[0],shape1.dims[1]]),\n\t\t#tf.TensorShape([shape2.dims[0],shape2.dims[1]]),tf.TensorShape([shape2.dims[0],shape2.dims[1]])]\n@ops.RegisterGradient('NnDistance')\ndef _nn_distance_grad(op,grad_dist1,grad_idx1,grad_dist2,grad_idx2):\n\txyz1=op.inputs[0]\n\txyz2=op.inputs[1]\n\tidx1=op.outputs[1]\n\tidx2=op.outputs[3]\n\treturn nn_distance_module.nn_distance_grad(xyz1,xyz2,grad_dist1,idx1,grad_dist2,idx2)\n\n\nif __name__=='__main__':\n\timport numpy as np\n\timport random\n\timport time\n\tfrom tensorflow.python.ops.gradient_checker import compute_gradient\n\trandom.seed(100)\n\tnp.random.seed(100)\n\twith tf.Session('') as sess:\n\t\txyz1=np.random.randn(32,16384,3).astype('float32')\n\t\txyz2=np.random.randn(32,1024,3).astype('float32')\n\t\t#with tf.device('/gpu:0'):\n\t\tif 
True:\n\t\t\tinp1=tf.Variable(xyz1)\n\t\t\tinp2=tf.constant(xyz2)\n\t\t\treta,retb,retc,retd=nn_distance(inp1,inp2)\n\t\t\tloss=tf.reduce_sum(reta)+tf.reduce_sum(retc)\n\t\t\ttrain=tf.train.GradientDescentOptimizer(learning_rate=0.05).minimize(loss)\n\t\tsess.run(tf.initialize_all_variables())\n\t\tt0=time.time()\n\t\tt1=t0\n\t\tbest=1e100\n\t\tfor i in xrange(100):\n\t\t\ttrainloss,_=sess.run([loss,train])\n\t\t\tnewt=time.time()\n\t\t\tbest=min(best,newt-t1)\n\t\t\tprint i,trainloss,(newt-t0)/(i+1),best\n\t\t\tt1=newt\n\t\t#print sess.run([inp1,retb,inp2,retd])\n\t\t#grads=compute_gradient([inp1,inp2],[(16,32,3),(16,32,3)],loss,(1,),[xyz1,xyz2])\n\t\t#for i,j in grads:\n\t\t\t#print i.shape,j.shape,np.mean(np.abs(i-j)),np.mean(np.abs(i)),np.mean(np.abs(j))\n\t\t#for i in xrange(10):\n\t\t\t#t0=time.time()\n\t\t\t#a,b,c,d=sess.run([reta,retb,retc,retd],feed_dict={inp1:xyz1,inp2:xyz2})\n\t\t\t#print 'time',time.time()-t0\n\t\t#print a.shape,b.shape,c.shape,d.shape\n\t\t#print a.dtype,b.dtype,c.dtype,d.dtype\n\t\t#samples=np.array(random.sample(range(xyz2.shape[1]),100),dtype='int32')\n\t\t#dist1=((xyz1[:,samples,None,:]-xyz2[:,None,:,:])**2).sum(axis=-1).min(axis=-1)\n\t\t#idx1=((xyz1[:,samples,None,:]-xyz2[:,None,:,:])**2).sum(axis=-1).argmin(axis=-1)\n\t\t#print np.abs(dist1-a[:,samples]).max()\n\t\t#print np.abs(idx1-b[:,samples]).max()\n\t\t#dist2=((xyz2[:,samples,None,:]-xyz1[:,None,:,:])**2).sum(axis=-1).min(axis=-1)\n\t\t#idx2=((xyz2[:,samples,None,:]-xyz1[:,None,:,:])**2).sum(axis=-1).argmin(axis=-1)\n\t\t#print np.abs(dist2-c[:,samples]).max()\n\t\t#print np.abs(idx2-d[:,samples]).max()\n\n"
  },
  {
    "path": "code/tf_ops/CD/tf_nndistance_g.cu",
    "content": "#if GOOGLE_CUDA\n#define EIGEN_USE_GPU\n#include \"third_party/eigen3/unsupported/Eigen/CXX11/Tensor\"\n\n__global__ void NmDistanceKernel(int b,int n,const float * xyz,int m,const float * xyz2,float * result,int * result_i){\n\tconst int batch=512;\n\t__shared__ float buf[batch*3];\n\tfor (int i=blockIdx.x;i<b;i+=gridDim.x){\n\t\tfor (int k2=0;k2<m;k2+=batch){\n\t\t\tint end_k=min(m,k2+batch)-k2;\n\t\t\tfor (int j=threadIdx.x;j<end_k*3;j+=blockDim.x){\n\t\t\t\tbuf[j]=xyz2[(i*m+k2)*3+j];\n\t\t\t}\n\t\t\t__syncthreads();\n\t\t\tfor (int j=threadIdx.x+blockIdx.y*blockDim.x;j<n;j+=blockDim.x*gridDim.y){\n\t\t\t\tfloat x1=xyz[(i*n+j)*3+0];\n\t\t\t\tfloat y1=xyz[(i*n+j)*3+1];\n\t\t\t\tfloat z1=xyz[(i*n+j)*3+2];\n\t\t\t\tint best_i=0;\n\t\t\t\tfloat best=0;\n\t\t\t\tint end_ka=end_k-(end_k&3);\n\t\t\t\tif (end_ka==batch){\n\t\t\t\t\tfor (int k=0;k<batch;k+=4){\n\t\t\t\t\t\t{\n\t\t\t\t\t\t\tfloat x2=buf[k*3+0]-x1;\n\t\t\t\t\t\t\tfloat y2=buf[k*3+1]-y1;\n\t\t\t\t\t\t\tfloat z2=buf[k*3+2]-z1;\n\t\t\t\t\t\t\tfloat d=x2*x2+y2*y2+z2*z2;\n\t\t\t\t\t\t\tif (k==0 || d<best){\n\t\t\t\t\t\t\t\tbest=d;\n\t\t\t\t\t\t\t\tbest_i=k+k2;\n\t\t\t\t\t\t\t}\n\t\t\t\t\t\t}\n\t\t\t\t\t\t{\n\t\t\t\t\t\t\tfloat x2=buf[k*3+3]-x1;\n\t\t\t\t\t\t\tfloat y2=buf[k*3+4]-y1;\n\t\t\t\t\t\t\tfloat z2=buf[k*3+5]-z1;\n\t\t\t\t\t\t\tfloat d=x2*x2+y2*y2+z2*z2;\n\t\t\t\t\t\t\tif (d<best){\n\t\t\t\t\t\t\t\tbest=d;\n\t\t\t\t\t\t\t\tbest_i=k+k2+1;\n\t\t\t\t\t\t\t}\n\t\t\t\t\t\t}\n\t\t\t\t\t\t{\n\t\t\t\t\t\t\tfloat x2=buf[k*3+6]-x1;\n\t\t\t\t\t\t\tfloat y2=buf[k*3+7]-y1;\n\t\t\t\t\t\t\tfloat z2=buf[k*3+8]-z1;\n\t\t\t\t\t\t\tfloat d=x2*x2+y2*y2+z2*z2;\n\t\t\t\t\t\t\tif (d<best){\n\t\t\t\t\t\t\t\tbest=d;\n\t\t\t\t\t\t\t\tbest_i=k+k2+2;\n\t\t\t\t\t\t\t}\n\t\t\t\t\t\t}\n\t\t\t\t\t\t{\n\t\t\t\t\t\t\tfloat x2=buf[k*3+9]-x1;\n\t\t\t\t\t\t\tfloat y2=buf[k*3+10]-y1;\n\t\t\t\t\t\t\tfloat z2=buf[k*3+11]-z1;\n\t\t\t\t\t\t\tfloat d=x2*x2+y2*y2+z2*z2;\n\t\t\t\t\t\t\tif 
(d<best){\n\t\t\t\t\t\t\t\tbest=d;\n\t\t\t\t\t\t\t\tbest_i=k+k2+3;\n\t\t\t\t\t\t\t}\n\t\t\t\t\t\t}\n\t\t\t\t\t}\n\t\t\t\t}else{\n\t\t\t\t\tfor (int k=0;k<end_ka;k+=4){\n\t\t\t\t\t\t{\n\t\t\t\t\t\t\tfloat x2=buf[k*3+0]-x1;\n\t\t\t\t\t\t\tfloat y2=buf[k*3+1]-y1;\n\t\t\t\t\t\t\tfloat z2=buf[k*3+2]-z1;\n\t\t\t\t\t\t\tfloat d=x2*x2+y2*y2+z2*z2;\n\t\t\t\t\t\t\tif (k==0 || d<best){\n\t\t\t\t\t\t\t\tbest=d;\n\t\t\t\t\t\t\t\tbest_i=k+k2;\n\t\t\t\t\t\t\t}\n\t\t\t\t\t\t}\n\t\t\t\t\t\t{\n\t\t\t\t\t\t\tfloat x2=buf[k*3+3]-x1;\n\t\t\t\t\t\t\tfloat y2=buf[k*3+4]-y1;\n\t\t\t\t\t\t\tfloat z2=buf[k*3+5]-z1;\n\t\t\t\t\t\t\tfloat d=x2*x2+y2*y2+z2*z2;\n\t\t\t\t\t\t\tif (d<best){\n\t\t\t\t\t\t\t\tbest=d;\n\t\t\t\t\t\t\t\tbest_i=k+k2+1;\n\t\t\t\t\t\t\t}\n\t\t\t\t\t\t}\n\t\t\t\t\t\t{\n\t\t\t\t\t\t\tfloat x2=buf[k*3+6]-x1;\n\t\t\t\t\t\t\tfloat y2=buf[k*3+7]-y1;\n\t\t\t\t\t\t\tfloat z2=buf[k*3+8]-z1;\n\t\t\t\t\t\t\tfloat d=x2*x2+y2*y2+z2*z2;\n\t\t\t\t\t\t\tif (d<best){\n\t\t\t\t\t\t\t\tbest=d;\n\t\t\t\t\t\t\t\tbest_i=k+k2+2;\n\t\t\t\t\t\t\t}\n\t\t\t\t\t\t}\n\t\t\t\t\t\t{\n\t\t\t\t\t\t\tfloat x2=buf[k*3+9]-x1;\n\t\t\t\t\t\t\tfloat y2=buf[k*3+10]-y1;\n\t\t\t\t\t\t\tfloat z2=buf[k*3+11]-z1;\n\t\t\t\t\t\t\tfloat d=x2*x2+y2*y2+z2*z2;\n\t\t\t\t\t\t\tif (d<best){\n\t\t\t\t\t\t\t\tbest=d;\n\t\t\t\t\t\t\t\tbest_i=k+k2+3;\n\t\t\t\t\t\t\t}\n\t\t\t\t\t\t}\n\t\t\t\t\t}\n\t\t\t\t}\n\t\t\t\tfor (int k=end_ka;k<end_k;k++){\n\t\t\t\t\tfloat x2=buf[k*3+0]-x1;\n\t\t\t\t\tfloat y2=buf[k*3+1]-y1;\n\t\t\t\t\tfloat z2=buf[k*3+2]-z1;\n\t\t\t\t\tfloat d=x2*x2+y2*y2+z2*z2;\n\t\t\t\t\tif (k==0 || d<best){\n\t\t\t\t\t\tbest=d;\n\t\t\t\t\t\tbest_i=k+k2;\n\t\t\t\t\t}\n\t\t\t\t}\n\t\t\t\tif (k2==0 || result[(i*n+j)]>best){\n\t\t\t\t\tresult[(i*n+j)]=best;\n\t\t\t\t\tresult_i[(i*n+j)]=best_i;\n\t\t\t\t}\n\t\t\t}\n\t\t\t__syncthreads();\n\t\t}\n\t}\n}\nvoid NmDistanceKernelLauncher(int b,int n,const float * xyz,int m,const float * xyz2,float * result,int * result_i,float * result2,int * 
result2_i){\n\tNmDistanceKernel<<<dim3(32,16,1),512>>>(b,n,xyz,m,xyz2,result,result_i);\n\tNmDistanceKernel<<<dim3(32,16,1),512>>>(b,m,xyz2,n,xyz,result2,result2_i);\n}\n__global__ void NmDistanceGradKernel(int b,int n,const float * xyz1,int m,const float * xyz2,const float * grad_dist1,const int * idx1,float * grad_xyz1,float * grad_xyz2){\n\tfor (int i=blockIdx.x;i<b;i+=gridDim.x){\n\t\tfor (int j=threadIdx.x+blockIdx.y*blockDim.x;j<n;j+=blockDim.x*gridDim.y){\n\t\t\tfloat x1=xyz1[(i*n+j)*3+0];\n\t\t\tfloat y1=xyz1[(i*n+j)*3+1];\n\t\t\tfloat z1=xyz1[(i*n+j)*3+2];\n\t\t\tint j2=idx1[i*n+j];\n\t\t\tfloat x2=xyz2[(i*m+j2)*3+0];\n\t\t\tfloat y2=xyz2[(i*m+j2)*3+1];\n\t\t\tfloat z2=xyz2[(i*m+j2)*3+2];\n\t\t\tfloat g=grad_dist1[i*n+j]*2;\n\t\t\tatomicAdd(&(grad_xyz1[(i*n+j)*3+0]),g*(x1-x2));\n\t\t\tatomicAdd(&(grad_xyz1[(i*n+j)*3+1]),g*(y1-y2));\n\t\t\tatomicAdd(&(grad_xyz1[(i*n+j)*3+2]),g*(z1-z2));\n\t\t\tatomicAdd(&(grad_xyz2[(i*m+j2)*3+0]),-(g*(x1-x2)));\n\t\t\tatomicAdd(&(grad_xyz2[(i*m+j2)*3+1]),-(g*(y1-y2)));\n\t\t\tatomicAdd(&(grad_xyz2[(i*m+j2)*3+2]),-(g*(z1-z2)));\n\t\t}\n\t}\n}\nvoid NmDistanceGradKernelLauncher(int b,int n,const float * xyz1,int m,const float * xyz2,const float * grad_dist1,const int * idx1,const float * grad_dist2,const int * idx2,float * grad_xyz1,float * grad_xyz2){\n\tcudaMemset(grad_xyz1,0,b*n*3*4);\n\tcudaMemset(grad_xyz2,0,b*m*3*4);\n\tNmDistanceGradKernel<<<dim3(1,16,1),256>>>(b,n,xyz1,m,xyz2,grad_dist1,idx1,grad_xyz1,grad_xyz2);\n\tNmDistanceGradKernel<<<dim3(1,16,1),256>>>(b,m,xyz2,n,xyz1,grad_dist2,idx2,grad_xyz2,grad_xyz1);\n}\n\n#endif\n"
  },
  {
    "path": "code/tf_ops/__init__.py",
    "content": ""
  },
  {
    "path": "code/tf_ops/emd/__init__.py",
    "content": ""
  },
  {
    "path": "code/tf_ops/emd/tf_auctionmatch.cpp",
    "content": "#include \"tensorflow/core/framework/op.h\"\n#include \"tensorflow/core/framework/op_kernel.h\"\n#include \"tensorflow/core/framework/shape_inference.h\"\n#include \"tensorflow/core/framework/common_shape_fns.h\"\n#include <algorithm>\n#include <vector>\n#include <math.h>\nusing namespace tensorflow;\nREGISTER_OP(\"AuctionMatch\")\n\t.Input(\"xyz1: float32\")\n\t.Input(\"xyz2: float32\")\n\t.Output(\"matchl: int32\")\n\t.Output(\"matchr: int32\")\n\t.SetShapeFn([](::tensorflow::shape_inference::InferenceContext* c){\n        ::tensorflow::shape_inference::ShapeHandle dims1;\n        c->WithRank(c->input(0), 3, &dims1);\n        ::tensorflow::shape_inference::ShapeHandle dims2;\n        c->WithRank(c->input(1), 3, &dims2);\n        ::tensorflow::shape_inference::ShapeHandle output1 = c->MakeShape({c->Dim(dims1, 0), c->Dim(dims1, 1)});\n        c->set_output(0, output1);\n        ::tensorflow::shape_inference::ShapeHandle output2 = c->MakeShape({c->Dim(dims2, 0), c->Dim(dims2, 1)});\n        c->set_output(1, output2);\n        return Status::OK();\n\t});\nvoid AuctionMatchLauncher(int b,int n,const float * xyz1,const float * xyz2,int * matchl,int * matchr,float * cost);\n\nclass AuctionMatchGpuOp: public OpKernel{\n\tpublic:\n\t\texplicit AuctionMatchGpuOp(OpKernelConstruction* context):OpKernel(context){}\n\t\tvoid Compute(OpKernelContext * context)override{\n\t\t\tconst Tensor& xyz1_tensor=context->input(0);\n\t\t\tOP_REQUIRES(context,xyz1_tensor.dims()==3 && xyz1_tensor.shape().dim_size(2)==3,errors::InvalidArgument(\"ApproxMatch expects (batch_size,num_points,3) xyz1 shape\"));\n\t\t\tauto xyz1_flat=xyz1_tensor.flat<float>();\n\t\t\tconst float * xyz1=&(xyz1_flat(0));\n\t\t\tint b=xyz1_tensor.shape().dim_size(0);\n\t\t\tint n=xyz1_tensor.shape().dim_size(1);\n\t\t\tOP_REQUIRES(context,n<=4096,errors::InvalidArgument(\"AuctionMatch handles at most 4096 dataset points\"));\n\n\t\t\tconst Tensor& 
xyz2_tensor=context->input(1);\n\t\t\tOP_REQUIRES(context,xyz2_tensor.dims()==3 && xyz2_tensor.shape().dim_size(2)==3 && xyz2_tensor.shape().dim_size(0)==b && xyz2_tensor.shape().dim_size(1)==n,errors::InvalidArgument(\"AuctionMatch expects (batch_size,num_points,3) xyz2 shape, and shape must match with xyz1\"));\n\t\t\tauto xyz2_flat=xyz2_tensor.flat<float>();\n\t\t\tconst float * xyz2=&(xyz2_flat(0));\n\n\t\t\tTensor * matchl_tensor=NULL;\n\t\t\tOP_REQUIRES_OK(context,context->allocate_output(0,TensorShape{b,n},&matchl_tensor));\n\t\t\tauto matchl_flat=matchl_tensor->flat<int>();\n\t\t\tint * matchl=&(matchl_flat(0));\n\t\t\tTensor * matchr_tensor=NULL;\n\t\t\tOP_REQUIRES_OK(context,context->allocate_output(1,TensorShape{b,n},&matchr_tensor));\n\t\t\tauto matchr_flat=matchr_tensor->flat<int>();\n\t\t\tint * matchr=&(matchr_flat(0));\n\n\t\t\tTensor temp_tensor;\n\t\t\tOP_REQUIRES_OK(context,context->allocate_temp(DataTypeToEnum<float>::value,TensorShape{b,n,n},&temp_tensor));\n\t\t\tauto temp_flat=temp_tensor.flat<float>();\n\t\t\tfloat * temp=&(temp_flat(0));\n\n\t\t\tAuctionMatchLauncher(b,n,xyz1,xyz2,matchl,matchr,temp);\n\t\t}\n};\nREGISTER_KERNEL_BUILDER(Name(\"AuctionMatch\").Device(DEVICE_GPU), AuctionMatchGpuOp);\n"
  },
  {
    "path": "code/tf_ops/emd/tf_auctionmatch.py",
    "content": "import tensorflow as tf\nfrom tensorflow.python.framework import ops\nimport sys\nimport os\n\nBASE_DIR = os.path.dirname(os.path.abspath(__file__))\nsys.path.append(BASE_DIR)\nauctionmatch_module = tf.load_op_library(os.path.join(BASE_DIR, 'tf_auctionmatch_so.so'))\n\ndef auction_match(xyz1,xyz2):\n\t'''\ninput:\n\txyz1 : batch_size * #points * 3\n\txyz2 : batch_size * #points * 3\nreturns:\n\tmatchl : batch_size * #npoints\n\tmatchr : batch_size * #npoints\n\t'''\n\tprint (\"Here\")\n\treturn auctionmatch_module.auction_match(xyz1,xyz2)\nops.NoGradient('AuctionMatch')\n\n# TF1.0 API requires set shape in C++\n# @tf.RegisterShape('AuctionMatch')\n# def _auction_match_shape(op):\n# \tshape1=op.inputs[0].get_shape().with_rank(3)\n# \tshape2=op.inputs[1].get_shape().with_rank(3)\n# \treturn [\n# \t\ttf.TensorShape([shape1.dims[0],shape1.dims[1]]),\n# \t\ttf.TensorShape([shape2.dims[0],shape2.dims[1]])\n# \t]\n\nif __name__=='__main__':\n    from tf_ops.grouping import tf_grouping\n    from tf_ops.sampling import tf_sampling\n\n    npoint=4096\n    xyz1_in=tf.placeholder(tf.float32,shape=(32,npoint,3))\n    xyz2_in=tf.placeholder(tf.float32,shape=(32,npoint,3))\n    matchl_out,matchr_out=auction_match(xyz1_in,xyz2_in)\n    matched_out=tf_sampling.gather_point(xyz2_in,matchl_out)\n    import numpy as np\n    np.random.seed(100)\n    xyz1=np.random.randn(32,npoint,3).astype('float32')\n    xyz2=xyz1.copy()+np.random.randn(32,npoint,3)*0.01\n    for i in xrange(len(xyz2)):\n        xyz2[i]=np.roll(xyz2[i],i,axis=0)\n    with tf.Session('') as sess:\n        ret=sess.run(matched_out,feed_dict={xyz1_in:xyz1,xyz2_in:xyz2})\n    print ((xyz1-ret)**2).mean()\n"
  },
  {
    "path": "code/tf_ops/emd/tf_auctionmatch_compile.sh",
    "content": "echo 'nvcc'\n/usr/local/cuda-8.0/bin/nvcc tf_auctionmatch_g.cu -o tf_auctionmatch_g.cu.o -c -O2 -DGOOGLE_CUDA=1 -x cu -Xcompiler -fPIC -arch=sm_30 \necho 'g++'\n\ng++ -std=c++11 tf_auctionmatch.cpp tf_auctionmatch_g.cu.o -o tf_auctionmatch_so.so -shared -fPIC -D_GLIBCXX_USE_CXX11_ABI=0 -I /home/ganeshiyer/tensorflow/lib/python2.7/site-packages/tensorflow/include  -I /usr/local/cuda-8.0/include -lcudart -L /usr/local/cuda-8.0/lib64/ -O2\n"
  },
  {
    "path": "code/tf_ops/emd/tf_auctionmatch_g.cu",
    "content": "#include <cstdio>\n__global__ void AuctionMatchKernel(int b,int n,const float * __restrict__ xyz1,const float * __restrict__ xyz2,int * matchl,int * matchr,float * cost){\n\t//this kernel handles up to 4096 points\n\tconst int NMax=4096;\n\t__shared__ short Queue[NMax];\n\t__shared__ short matchrbuf[NMax];\n\t__shared__ float pricer[NMax];\n\t__shared__ float bests[32][3];\n\t__shared__ int qhead,qlen;\n\tconst int BufLen=2048;\n\t__shared__ float buf[BufLen];\n\tfor (int bno=blockIdx.x;bno<b;bno+=gridDim.x){\n\t\tint cnt=0;\n\t\tfloat tolerance=1e-4;\n\t\tfor (int j=threadIdx.x;j<n;j+=blockDim.x)\n\t\t\tmatchl[bno*n+j]=-1;\n\t\tfor (int j=threadIdx.x;j<n;j+=blockDim.x)\n\t\t\tmatchrbuf[j]=-1;\n\t\tfor (int j=threadIdx.x;j<n;j+=blockDim.x)\n\t\t\tQueue[j]=j;\n\t\tfor (int j=threadIdx.x;j<n;j+=blockDim.x)\n\t\t\tpricer[j]=0;\n\t\tconst int Block=512;\n\t\tfor (int k0=0;k0<n;k0+=Block){\n\t\t\tint k1=min(n,k0+Block);\n\t\t\tfor (int k=threadIdx.x;k<(k1-k0)*3;k+=blockDim.x)\n\t\t\t\tbuf[k]=xyz1[bno*n*3+k0*3+k];\n\t\t\t__syncthreads();\n\t\t\tfor (int j=threadIdx.x;j<n;j+=blockDim.x){\n\t\t\t\tfloat x2=xyz2[bno*n*3+j*3+0];\n\t\t\t\tfloat y2=xyz2[bno*n*3+j*3+1];\n\t\t\t\tfloat z2=xyz2[bno*n*3+j*3+2];\n\t\t\t\tfor (int k=k0;k<k1;k++){\n\t\t\t\t\tfloat x1=buf[(k-k0)*3+0];\n\t\t\t\t\tfloat y1=buf[(k-k0)*3+1];\n\t\t\t\t\tfloat z1=buf[(k-k0)*3+2];\n\t\t\t\t\tfloat d=sqrtf((x1-x2)*(x1-x2)+(y1-y2)*(y1-y2)+(z1-z2)*(z1-z2));\n\t\t\t\t\tcost[blockIdx.x*n*n+k*n+j]=d;\n\t\t\t\t}\n\t\t\t}\n\t\t\t__syncthreads();\n\t\t}\n\t\t//calculate the distacne\n\t\tif (threadIdx.x==0){\n\t\t\tqhead=0;\n\t\t\tqlen=n;\n\t\t}\n\t\t__syncthreads();\n\t\tint loaded=0;\n\t\tfloat value9,value10,value11,value12,value13,value14,value15,value16;\n\t\twhile (qlen){\n\t\t\tint i=Queue[qhead];\n\t\t\tint i2;\n\t\t\tif (qhead+1<n)\n\t\t\t\ti2=Queue[qhead+1];\n\t\t\telse\n\t\t\t\ti2=Queue[0];\n\t\t\tfloat best=1e38f,best2=1e38f;\n\t\t\tint bestj=0;\n\t\t\tif (n==blockDim.x*8){\n\t\t\t\tint 
j=threadIdx.x;\n\t\t\t\tfloat value1,value2,value3,value4,value5,value6,value7,value8;\n\t\t\t\tif (loaded){\n\t\t\t\t\tvalue1=value9+pricer[j];\n\t\t\t\t\tvalue2=value10+pricer[j+blockDim.x];\n\t\t\t\t\tvalue3=value11+pricer[j+blockDim.x*2];\n\t\t\t\t\tvalue4=value12+pricer[j+blockDim.x*3];\n\t\t\t\t\tvalue5=value13+pricer[j+blockDim.x*4];\n\t\t\t\t\tvalue6=value14+pricer[j+blockDim.x*5];\n\t\t\t\t\tvalue7=value15+pricer[j+blockDim.x*6];\n\t\t\t\t\tvalue8=value16+pricer[j+blockDim.x*7];\n\t\t\t\t\tloaded=0;\n\t\t\t\t}else{\n\t\t\t\t\tvalue1=cost[blockIdx.x*n*n+i*n+j]+pricer[j];\n\t\t\t\t\tvalue2=cost[blockIdx.x*n*n+i*n+j+blockDim.x]+pricer[j+blockDim.x];\n\t\t\t\t\tvalue3=cost[blockIdx.x*n*n+i*n+j+blockDim.x*2]+pricer[j+blockDim.x*2];\n\t\t\t\t\tvalue4=cost[blockIdx.x*n*n+i*n+j+blockDim.x*3]+pricer[j+blockDim.x*3];\n\t\t\t\t\tvalue5=cost[blockIdx.x*n*n+i*n+j+blockDim.x*4]+pricer[j+blockDim.x*4];\n\t\t\t\t\tvalue6=cost[blockIdx.x*n*n+i*n+j+blockDim.x*5]+pricer[j+blockDim.x*5];\n\t\t\t\t\tvalue7=cost[blockIdx.x*n*n+i*n+j+blockDim.x*6]+pricer[j+blockDim.x*6];\n\t\t\t\t\tvalue8=cost[blockIdx.x*n*n+i*n+j+blockDim.x*7]+pricer[j+blockDim.x*7];\n\t\t\t\t\tvalue9=cost[blockIdx.x*n*n+i2*n+j];\n\t\t\t\t\tvalue10=cost[blockIdx.x*n*n+i2*n+j+blockDim.x];\n\t\t\t\t\tvalue11=cost[blockIdx.x*n*n+i2*n+j+blockDim.x*2];\n\t\t\t\t\tvalue12=cost[blockIdx.x*n*n+i2*n+j+blockDim.x*3];\n\t\t\t\t\tvalue13=cost[blockIdx.x*n*n+i2*n+j+blockDim.x*4];\n\t\t\t\t\tvalue14=cost[blockIdx.x*n*n+i2*n+j+blockDim.x*5];\n\t\t\t\t\tvalue15=cost[blockIdx.x*n*n+i2*n+j+blockDim.x*6];\n\t\t\t\t\tvalue16=cost[blockIdx.x*n*n+i2*n+j+blockDim.x*7];\n\t\t\t\t\tloaded=qlen>1;\n\t\t\t\t}\n\t\t\t\tint vj,vj2,vj3,vj4;\n\t\t\t\tif (value1<value2){\n\t\t\t\t\tvj=j;\n\t\t\t\t}else{\n\t\t\t\t\tvj=j+blockDim.x;\n\t\t\t\t\tfloat t=value1;\n\t\t\t\t\tvalue1=value2;\n\t\t\t\t\tvalue2=t;\n\t\t\t\t}\n\t\t\t\tif (value3<value4){\n\t\t\t\t\tvj2=j+blockDim.x*2;\n\t\t\t\t}else{\n\t\t\t\t\tvj2=j+blockDim.x*3;\n\t\t\t\t\tfloat 
t=value3;\n\t\t\t\t\tvalue3=value4;\n\t\t\t\t\tvalue4=t;\n\t\t\t\t}\n\t\t\t\tif (value5<value6){\n\t\t\t\t\tvj3=j+blockDim.x*4;\n\t\t\t\t}else{\n\t\t\t\t\tvj3=j+blockDim.x*5;\n\t\t\t\t\tfloat t=value5;\n\t\t\t\t\tvalue5=value6;\n\t\t\t\t\tvalue6=t;\n\t\t\t\t}\n\t\t\t\tif (value7<value8){\n\t\t\t\t\tvj4=j+blockDim.x*6;\n\t\t\t\t}else{\n\t\t\t\t\tvj4=j+blockDim.x*7;\n\t\t\t\t\tfloat t=value7;\n\t\t\t\t\tvalue7=value8;\n\t\t\t\t\tvalue8=t;\n\t\t\t\t}\n\t\t\t\tif (value1<value3){\n\t\t\t\t\tvalue2=fminf(value2,value3);\n\t\t\t\t}else{\n\t\t\t\t\tvalue2=fminf(value1,value4);\n\t\t\t\t\tvalue1=value3;\n\t\t\t\t\tvj=vj2;\n\t\t\t\t}\n\t\t\t\tif (value5<value7){\n\t\t\t\t\tvalue6=fminf(value6,value7);\n\t\t\t\t}else{\n\t\t\t\t\tvalue6=fminf(value5,value8);\n\t\t\t\t\tvalue5=value7;\n\t\t\t\t\tvj3=vj4;\n\t\t\t\t}\n\t\t\t\tif (value1<value5){\n\t\t\t\t\tbest=value1;\n\t\t\t\t\tbestj=vj;\n\t\t\t\t\tbest2=fminf(value2,value5);\n\t\t\t\t}else{\n\t\t\t\t\tbest2=fminf(value1,value6);\n\t\t\t\t\tbest=value5;\n\t\t\t\t\tbestj=vj3;\n\t\t\t\t}\n\t\t\t}else if (n>=blockDim.x*4){\n\t\t\t\tfor (int j=threadIdx.x;j<n;j+=blockDim.x*4){\n\t\t\t\t\tfloat value1=cost[blockIdx.x*n*n+i*n+j]+pricer[j];\n\t\t\t\t\tfloat value2=cost[blockIdx.x*n*n+i*n+j+blockDim.x]+pricer[j+blockDim.x];\n\t\t\t\t\tfloat value3=cost[blockIdx.x*n*n+i*n+j+blockDim.x*2]+pricer[j+blockDim.x*2];\n\t\t\t\t\tfloat value4=cost[blockIdx.x*n*n+i*n+j+blockDim.x*3]+pricer[j+blockDim.x*3];\n\t\t\t\t\tint vj,vj2;\n\t\t\t\t\tif (value1<value2){\n\t\t\t\t\t\tvj=j;\n\t\t\t\t\t}else{\n\t\t\t\t\t\tvj=j+blockDim.x;\n\t\t\t\t\t\tfloat t=value1;\n\t\t\t\t\t\tvalue1=value2;\n\t\t\t\t\t\tvalue2=t;\n\t\t\t\t\t}\n\t\t\t\t\tif (value3<value4){\n\t\t\t\t\t\tvj2=j+blockDim.x*2;\n\t\t\t\t\t}else{\n\t\t\t\t\t\tvj2=j+blockDim.x*3;\n\t\t\t\t\t\tfloat t=value3;\n\t\t\t\t\t\tvalue3=value4;\n\t\t\t\t\t\tvalue4=t;\n\t\t\t\t\t}\n\t\t\t\t\tif 
(value1<value3){\n\t\t\t\t\t\tvalue2=fminf(value2,value3);\n\t\t\t\t\t}else{\n\t\t\t\t\t\tvalue2=fminf(value1,value4);\n\t\t\t\t\t\tvalue1=value3;\n\t\t\t\t\t\tvj=vj2;\n\t\t\t\t\t}\n\t\t\t\t\tif (best<value1){\n\t\t\t\t\t\tbest2=fminf(best2,value1);\n\t\t\t\t\t}else{\n\t\t\t\t\t\tbest2=fminf(best,value2);\n\t\t\t\t\t\tbest=value1;\n\t\t\t\t\t\tbestj=vj;\n\t\t\t\t\t}\n\t\t\t\t}\n\t\t\t}else if (n>=blockDim.x*2){\n\t\t\t\tfor (int j=threadIdx.x;j<n;j+=blockDim.x*2){\n\t\t\t\t\tfloat value1=cost[blockIdx.x*n*n+i*n+j]+pricer[j];\n\t\t\t\t\tfloat value2=cost[blockIdx.x*n*n+i*n+j+blockDim.x]+pricer[j+blockDim.x];\n\t\t\t\t\tint vj;\n\t\t\t\t\tif (value1<value2){\n\t\t\t\t\t\tvj=j;\n\t\t\t\t\t}else{\n\t\t\t\t\t\tvj=j+blockDim.x;\n\t\t\t\t\t\tfloat t=value1;\n\t\t\t\t\t\tvalue1=value2;\n\t\t\t\t\t\tvalue2=t;\n\t\t\t\t\t}\n\t\t\t\t\tif (best<value1){\n\t\t\t\t\t\tbest2=fminf(best2,value1);\n\t\t\t\t\t}else{\n\t\t\t\t\t\tbest2=fminf(best,value2);\n\t\t\t\t\t\tbest=value1;\n\t\t\t\t\t\tbestj=vj;\n\t\t\t\t\t}\n\t\t\t\t}\n\t\t\t}else{\n\t\t\t\tfor (int j=threadIdx.x;j<n;j+=blockDim.x){\n\t\t\t\t\tfloat value=cost[blockIdx.x*n*n+i*n+j]+pricer[j];\n\t\t\t\t\tif (best<value){\n\t\t\t\t\t\tbest2=fminf(best2,value);\n\t\t\t\t\t}else{\n\t\t\t\t\t\tbest2=best;\n\t\t\t\t\t\tbestj=j;\n\t\t\t\t\t\tbest=value;\n\t\t\t\t\t}\n\t\t\t\t}\n\t\t\t}\n\t\t\tfor (int i=16;i>0;i>>=1){\n\t\t\t\tfloat b1=__shfl_down(best,i,32);\n\t\t\t\tfloat b2=__shfl_down(best2,i,32);\n\t\t\t\tint bj=__shfl_down(bestj,i,32);\n\t\t\t\tif (best<b1){\n\t\t\t\t\tbest2=fminf(b1,best2);\n\t\t\t\t}else{\n\t\t\t\t\tbest=b1;\n\t\t\t\t\tbest2=fminf(best,b2);\n\t\t\t\t\tbestj=bj;\n\t\t\t\t}\n\t\t\t}\n\t\t\tif ((threadIdx.x&31)==0){\n\t\t\t\tbests[threadIdx.x>>5][0]=best;\n\t\t\t\tbests[threadIdx.x>>5][1]=best2;\n\t\t\t\t*(int*)&bests[threadIdx.x>>5][2]=bestj;\n\t\t\t}\n\t\t\t__syncthreads();\n\t\t\tint nn=blockDim.x>>5;\n\t\t\tif 
(threadIdx.x<nn){\n\t\t\t\tbest=bests[threadIdx.x][0];\n\t\t\t\tbest2=bests[threadIdx.x][1];\n\t\t\t\tbestj=*(int*)&bests[threadIdx.x][2];\n\t\t\t\tfor (int i=nn>>1;i>0;i>>=1){\n\t\t\t\t\tfloat b1=__shfl_down(best,i,32);\n\t\t\t\t\tfloat b2=__shfl_down(best2,i,32);\n\t\t\t\t\tint bj=__shfl_down(bestj,i,32);\n\t\t\t\t\tif (best<b1){\n\t\t\t\t\t\tbest2=fminf(b1,best2);\n\t\t\t\t\t}else{\n\t\t\t\t\t\tbest=b1;\n\t\t\t\t\t\tbest2=fminf(best,b2);\n\t\t\t\t\t\tbestj=bj;\n\t\t\t\t\t}\n\t\t\t\t}\n\t\t\t}\n\t\t\tif (threadIdx.x==0){\n\t\t\t\tfloat delta=best2-best+tolerance;\n\t\t\t\tqhead++;\n\t\t\t\tqlen--;\n\t\t\t\tif (qhead>=n)\n\t\t\t\t\tqhead-=n;\n\t\t\t\tint old=matchrbuf[bestj];\n\t\t\t\tpricer[bestj]+=delta;\n\t\t\t\tcnt++;\n\t\t\t\tif (old!=-1){\n\t\t\t\t\tint ql=qlen;\n\t\t\t\t\tint tail=qhead+ql;\n\t\t\t\t\tqlen=ql+1;\n\t\t\t\t\tif (tail>=n)\n\t\t\t\t\t\ttail-=n;\n\t\t\t\t\tQueue[tail]=old;\n\t\t\t\t}\n\t\t\t\tif (cnt==(40*n)){\n\t\t\t\t\tif (tolerance==1.0)\n\t\t\t\t\t\tqlen=0;\n\t\t\t\t\ttolerance=fminf(1.0,tolerance*100);\n\t\t\t\t\tcnt=0;\n\t\t\t\t}\n\t\t\t}\n\t\t\t__syncthreads();\n\t\t\tif (threadIdx.x==0){\n\t\t\t\tmatchrbuf[bestj]=i;\n\t\t\t}\n\t\t}\n\t\t__syncthreads();\n\t\tfor (int j=threadIdx.x;j<n;j+=blockDim.x)\n\t\t\tmatchr[bno*n+j]=matchrbuf[j];\n\t\tfor (int j=threadIdx.x;j<n;j+=blockDim.x)\n\t\t\tmatchl[bno*n+matchrbuf[j]]=j;\n\t\t__syncthreads();\n\t}\n}\nvoid AuctionMatchLauncher(int b,int n,const float * xyz1,const float * xyz2,int * matchl,int * matchr,float * cost){\n\tAuctionMatchKernel<<<32,512>>>(b,n,xyz1,xyz2,matchl,matchr,cost);\n}\n\n"
  },
  {
    "path": "code/tf_ops/grouping/__init__.py",
    "content": ""
  },
  {
    "path": "code/tf_ops/grouping/compile.sh",
    "content": "g++ query_ball_point.cpp -o query_ball_point\nnvcc query_ball_point.cu -o query_ball_point_cuda\nnvcc query_ball_point_block.cu -o query_ball_point_block\nnvcc query_ball_point_grid.cu -o query_ball_point_grid\nnvcc query_ball_point_grid_count.cu -o query_ball_point_grid_count\ng++ -Wall selection_sort.cpp -o selection_sort\nnvcc selection_sort.cu -o selection_sort_cuda\n"
  },
  {
    "path": "code/tf_ops/grouping/query_ball_point.cpp",
    "content": "#include <cstdio>\n#include <ctime>\n#include <cstring> // memset\n#include <cstdlib> // rand, RAND_MAX\n#include <cmath> // sqrtf\n#include <string>\n#include <vector>\nusing namespace std;\nfloat randomf(){\n    return (rand()+0.5)/(RAND_MAX+1.0);\n}\nstatic double get_time(){\n    timespec tp;\n    clock_gettime(CLOCK_MONOTONIC,&tp);\n    return tp.tv_sec+tp.tv_nsec*1e-9;\n}\n// input: radius (1), nsample (1), xyz1 (b,n,3), xyz2 (b,m,3)\n// output: idx (b,m,nsample)\nvoid query_ball_point_cpu(int b, int n, int m, const float* radius, int nsample, const float *xyz1, const float *xyz2, int *idx) {\n    for (int i=0;i<b;++i) {\n        for (int j=0;j<m;++j) {\n            int cnt = 0;\n            for (int k=0;k<n;++k) {\n                if (cnt == nsample)\n                    break; // only pick the FIRST nsample points in the ball\n\t        float x2=xyz2[j*3+0];\n\t        float y2=xyz2[j*3+1];\n\t        float z2=xyz2[j*3+2];\n\t        float x1=xyz1[k*3+0];\n\t        float y1=xyz1[k*3+1];\n\t        float z1=xyz1[k*3+2];\n\t\tfloat d=max(sqrtf((x2-x1)*(x2-x1)+(y2-y1)*(y2-y1)+(z2-z1)*(z2-z1)),1e-20f);\n                if (d<*radius) {\n                    if (cnt==0) { // set ALL indices to k, s.t. 
if there are less points in ball than nsample, we still have valid (repeating) indices\n                        for (int l=0;l<nsample;++l)\n                            idx[j*nsample+l] = k;\n                    }\n                    idx[j*nsample+cnt] = k;\n                    cnt+=1;\n                }\n            }\n        }\n        xyz1+=n*3;\n        xyz2+=m*3;\n        idx+=m*nsample;\n    }\n}\n\n\n// input: points (b,n,c), idx (b,m,nsample)\n// output: out (b,m,nsample,c)\nvoid group_point_cpu(int b, int n, int c, int m, int nsample, const float *points, const int *idx, float *out) {\n    for (int i=0;i<b;++i) {\n        for (int j=0;j<m;++j) {\n            for (int k=0;k<nsample;++k) {\n                int ii = idx[j*nsample+k];\n                for (int l=0;l<c;++l) {\n                    out[j*nsample*c+k*c+l] = points[ii*c+l];\n                }\n            }\n        }\n        points+=n*c;\n        idx+=m*nsample;\n        out+=m*nsample*c;\n    }\n}\n\n// input: grad_out (b,m,nsample,c), idx (b,m,nsample), \n// output: grad_points (b,n,c)\nvoid group_point_grad_cpu(int b, int n, int c, int m, int nsample, const float *grad_out, const int *idx, float *grad_points) {\n    for (int i=0;i<b;++i) {\n        for (int j=0;j<m;++j) {\n            for (int k=0;k<nsample;++k) {\n                int ii = idx[j*nsample+k];\n                for (int l=0;l<c;++l) {\n                     grad_points[ii*c+l] += grad_out[j*nsample*c+k*c+l];\n                }\n            }\n        }\n        idx+=m*nsample;\n        grad_out+=m*nsample*c;\n        grad_points+=n*c;\n    }\n}\n\nint main()\n{\n    int b=32,n=512,m=128,nsample=64,c=64;\n    float radius=0.1;\n    float *xyz1=new float[b*n*3];\n    float *xyz2=new float[b*m*3];\n    float *points=new float[b*n*c];\n    int *idx=new int[b*m*nsample];\n    memset(idx, 0, sizeof(int)*b*m*nsample);\n    float *out=new float[b*m*nsample*c];\n    float *grad_out=new float[b*m*nsample*c]; // grad to out\n    
memset(grad_out, 0, sizeof(float)*b*m*nsample*c);\n    float *grad_points=new float[b*n*c]; // grad to points\n    memset(grad_points, 0, sizeof(float)*b*n*c); // group_point_grad_cpu accumulates with +=\n    for (int i=0;i<b*n*3;i++)\n        xyz1[i]=randomf();\n    for (int i=0;i<b*m*3;i++)\n        xyz2[i]=randomf();\n    for (int i=0;i<b*n*c;i++)\n        points[i]=randomf();\n\n    double t0=get_time();\n    query_ball_point_cpu(b,n,m,&radius,nsample,xyz1,xyz2,idx); // function expects a pointer to the radius\n    printf(\"query_ball_point cpu time %f\\n\",get_time()-t0);\n\n    t0=get_time();\n    group_point_cpu(b,n,c,m,nsample,points,idx,out);\n    printf(\"group_point cpu time %f\\n\",get_time()-t0);\n\n    t0=get_time();\n    group_point_grad_cpu(b,n,c,m,nsample,grad_out,idx,grad_points);\n    printf(\"group_point_grad cpu time %f\\n\",get_time()-t0);\n\n    return 0;\n}\n"
  },
  {
    "path": "code/tf_ops/grouping/query_ball_point.cu",
"content": "#include <cstdio>\n#include <ctime>\n#include <cstring> // memset\n#include <cstdlib> // rand, RAND_MAX\n#include <cmath> // sqrtf\n#include <string>\n#include <vector>\n\nusing namespace std;\nfloat randomf(){\n    return (rand()+0.5)/(RAND_MAX+1.0);\n}\nstatic double get_time(){\n    timespec tp;\n    clock_gettime(CLOCK_MONOTONIC,&tp);\n    return tp.tv_sec+tp.tv_nsec*1e-9;\n}\n// input: radius (1), nsample (1), xyz1 (b,n,3), xyz2 (b,m,3)\n// output: idx (b,m,nsample)\n__global__ void query_ball_point_gpu(int b, int n, int m, const float* radius, int nsample, const float *xyz1, const float *xyz2, int *idx) {\n    for (int i=0;i<b;++i) {\n        for (int j=0;j<m;++j) {\n            int cnt = 0;\n            for (int k=0;k<n;++k) {\n                if (cnt == nsample)\n                    break; // only pick the FIRST nsample points in the ball\n                float x2=xyz2[j*3+0];\n                float y2=xyz2[j*3+1];\n                float z2=xyz2[j*3+2];\n                float x1=xyz1[k*3+0];\n                float y1=xyz1[k*3+1];\n                float z1=xyz1[k*3+2];\n                float d=max(sqrtf((x2-x1)*(x2-x1)+(y2-y1)*(y2-y1)+(z2-z1)*(z2-z1)),1e-20f);\n                if (d<radius[0]) {\n                    if (cnt==0) { // set ALL indices to k, s.t. 
if there are less points in ball than nsample, we still have valid (repeating) indices\n                        for (int l=0;l<nsample;++l)\n                            idx[j*nsample+l] = k;\n                    }\n                    idx[j*nsample+cnt] = k;\n                    cnt+=1;\n                }\n            }\n        }\n        xyz1+=n*3;\n        xyz2+=m*3;\n        idx+=m*nsample;\n    }\n}\n\n\n// input: points (b,n,c), idx (b,m,nsample)\n// output: out (b,m,nsample,c)\n__global__ void group_point_gpu(int b, int n, int c, int m, int nsample, const float *points, const int *idx, float *out) {\n    for (int i=0;i<b;++i) {\n        for (int j=0;j<m;++j) {\n            for (int k=0;k<nsample;++k) {\n                int ii = idx[j*nsample+k];\n                for (int l=0;l<c;++l) {\n                    out[j*nsample*c+k*c+l] = points[ii*c+l];\n                }\n            }\n        }\n        points+=n*c;\n        idx+=m*nsample;\n        out+=m*nsample*c;\n    }\n}\n\n// input: grad_out (b,m,nsample,c), idx (b,m,nsample), \n// output: grad_points (b,n,c)\n__global__ void group_point_grad_gpu(int b, int n, int c, int m, int nsample, const float *grad_out, const int *idx, float *grad_points) {\n    for (int i=0;i<b;++i) {\n        for (int j=0;j<m;++j) {\n            for (int k=0;k<nsample;++k) {\n                int ii = idx[j*nsample+k];\n                for (int l=0;l<c;++l) {\n                     grad_points[ii*c+l] += grad_out[j*nsample*c+k*c+l];\n                }\n            }\n        }\n        idx+=m*nsample;\n        grad_out+=m*nsample*c;\n        grad_points+=n*c;\n    }\n}\n\nint main()\n{\n    int b=32,n=512,m=128,nsample=64,c=64;\n    float radius=0.1;\n    float *xyz1, *xyz2, *points;\n    cudaMallocManaged(&xyz1, b*n*3*sizeof(float));\n    cudaMallocManaged(&xyz2, b*m*3*sizeof(float));\n    cudaMallocManaged(&points, b*n*c*sizeof(float));\n    int *idx;\n    cudaMallocManaged(&idx, b*m*nsample*sizeof(int));\n    memset(idx, 0, 
sizeof(int)*b*m*nsample);\n    float *out, *grad_out;\n    cudaMallocManaged(&out, b*m*nsample*c*sizeof(float));\n    cudaMallocManaged(&grad_out, b*m*nsample*c*sizeof(float));\n    memset(grad_out, 0, sizeof(float)*b*m*nsample*c);\n    float *grad_points;\n    cudaMallocManaged(&grad_points, b*n*c*sizeof(float));\n    memset(grad_points, 0, sizeof(float)*b*n*c); // group_point_grad_gpu accumulates with +=\n    float *radius_ptr;\n    cudaMallocManaged(&radius_ptr, sizeof(float)); // kernel takes const float*, so give it device-visible memory\n    *radius_ptr = radius;\n\n    for (int i=0;i<b*n*3;i++)\n        xyz1[i]=randomf();\n    for (int i=0;i<b*m*3;i++)\n        xyz2[i]=randomf();\n    for (int i=0;i<b*n*c;i++)\n        points[i]=randomf();\n\n    double t0=get_time();\n    query_ball_point_gpu<<<1,1>>>(b,n,m,radius_ptr,nsample,xyz1,xyz2,idx);\n    cudaDeviceSynchronize();\n    printf(\"query_ball_point gpu time %f\\n\",get_time()-t0);\n\n    t0=get_time();\n    group_point_gpu<<<1,1>>>(b,n,c,m,nsample,points,idx,out);\n    cudaDeviceSynchronize();\n    printf(\"group_point gpu time %f\\n\",get_time()-t0);\n\n    t0=get_time();\n    group_point_grad_gpu<<<1,1>>>(b,n,c,m,nsample,grad_out,idx,grad_points);\n    cudaDeviceSynchronize();\n    printf(\"group_point_grad gpu time %f\\n\",get_time()-t0);\n\n    cudaFree(xyz1);\n    cudaFree(xyz2);\n    cudaFree(points);\n    cudaFree(idx);\n    cudaFree(out);\n    cudaFree(grad_out);\n    cudaFree(grad_points);\n    cudaFree(radius_ptr);\n    return 0;\n}\n"
  },
  {
    "path": "code/tf_ops/grouping/query_ball_point_block.cu",
    "content": "#include <cstdio>\n#include <ctime>\n#include <cstring> // memset\n#include <cstdlib> // rand, RAND_MAX\n#include <cmath> // sqrtf\n#include <string>\n#include <vector>\nusing namespace std;\nfloat randomf(){\n    return (rand()+0.5)/(RAND_MAX+1.0);\n}\nstatic double get_time(){\n    timespec tp;\n    clock_gettime(CLOCK_MONOTONIC,&tp);\n    return tp.tv_sec+tp.tv_nsec*1e-9;\n}\n// input: radius (1), nsample (1), xyz1 (b,n,3), xyz2 (b,m,3)\n// output: idx (b,m,nsample)\n__global__ void query_ball_point_gpu(int b, int n, int m, const float *radius, int nsample, const float *xyz1, const float *xyz2, int *idx) {\n    int index = threadIdx.x;\n    xyz1 += n*3*index;\n    xyz2 += m*3*index;\n    idx += m*nsample*index;\n\n    for (int j=0;j<m;++j) {\n        int cnt = 0;\n        for (int k=0;k<n;++k) {\n            if (cnt == nsample)\n                break; // only pick the FIRST nsample points in the ball\n            float x2=xyz2[j*3+0];\n            float y2=xyz2[j*3+1];\n            float z2=xyz2[j*3+2];\n            float x1=xyz1[k*3+0];\n            float y1=xyz1[k*3+1];\n            float z1=xyz1[k*3+2];\n    \tfloat d=max(sqrtf((x2-x1)*(x2-x1)+(y2-y1)*(y2-y1)+(z2-z1)*(z2-z1)),1e-20f);\n            if (d<radius[0]) {\n                if (cnt==0) { // set ALL indices to k, s.t. 
if there are less points in ball than nsample, we still have valid (repeating) indices\n                    for (int l=0;l<nsample;++l)\n                        idx[j*nsample+l] = k;\n                }\n                idx[j*nsample+cnt] = k;\n                cnt+=1;\n            }\n        }\n    }\n}\n\n\n// input: points (b,n,c), idx (b,m,nsample)\n// output: out (b,m,nsample,c)\n__global__ void group_point_gpu(int b, int n, int c, int m, int nsample, const float *points, const int *idx, float *out) {\n    int index = threadIdx.x;\n    points += n*c*index;\n    idx += m*nsample*index;\n    out += m*nsample*c*index;\n\n    for (int j=0;j<m;++j) {\n        for (int k=0;k<nsample;++k) {\n            int ii = idx[j*nsample+k];\n            for (int l=0;l<c;++l) {\n                out[j*nsample*c+k*c+l] = points[ii*c+l];\n            }\n        }\n    }\n}\n\n// input: grad_out (b,m,nsample,c), idx (b,m,nsample), \n// output: grad_points (b,n,c)\n__global__ void group_point_grad_gpu(int b, int n, int c, int m, int nsample, const float *grad_out, const int *idx, float *grad_points) {\n    int index = threadIdx.x;\n    idx += m*nsample*index;\n    grad_out += m*nsample*c*index;\n    grad_points += n*c*index;\n\n    for (int j=0;j<m;++j) {\n        for (int k=0;k<nsample;++k) {\n            int ii = idx[j*nsample+k];\n            for (int l=0;l<c;++l) {\n                 grad_points[ii*c+l] += grad_out[j*nsample*c+k*c+l];\n            }\n        }\n    }\n}\n\nint main()\n{\n    int b=32,n=512,m=128,nsample=64,c=64;\n    float radius=0.1;\n    float *xyz1, *xyz2, *points;\n    cudaMallocManaged(&xyz1, b*n*3*sizeof(float));\n    cudaMallocManaged(&xyz2, b*m*3*sizeof(float));\n    cudaMallocManaged(&points, b*n*c*sizeof(float));\n    int *idx;\n    cudaMallocManaged(&idx, b*m*nsample*sizeof(int));\n    memset(idx, 0, sizeof(int)*b*m*nsample);\n    float *out, *grad_out;\n    cudaMallocManaged(&out, b*m*nsample*c*sizeof(float));\n    cudaMallocManaged(&grad_out, 
b*m*nsample*c*sizeof(float));\n    memset(grad_out, 0, sizeof(float)*b*m*nsample*c);\n    float *grad_points;\n    cudaMallocManaged(&grad_points, b*n*c*sizeof(float));\n    memset(grad_points, 0, sizeof(float)*b*n*c); // group_point_grad_gpu accumulates with +=\n    float *radius_ptr;\n    cudaMallocManaged(&radius_ptr, sizeof(float)); // kernel takes const float*, so give it device-visible memory\n    *radius_ptr = radius;\n\n    for (int i=0;i<b*n*3;i++)\n        xyz1[i]=randomf();\n    for (int i=0;i<b*m*3;i++)\n        xyz2[i]=randomf();\n    for (int i=0;i<b*n*c;i++)\n        points[i]=randomf();\n\n    double t0=get_time();\n    query_ball_point_gpu<<<1,b>>>(b,n,m,radius_ptr,nsample,xyz1,xyz2,idx);\n    cudaDeviceSynchronize();\n    printf(\"query_ball_point gpu time %f\\n\",get_time()-t0);\n\n    t0=get_time();\n    group_point_gpu<<<1,b>>>(b,n,c,m,nsample,points,idx,out);\n    cudaDeviceSynchronize();\n    printf(\"group_point gpu time %f\\n\",get_time()-t0);\n\n    t0=get_time();\n    group_point_grad_gpu<<<1,b>>>(b,n,c,m,nsample,grad_out,idx,grad_points);\n    cudaDeviceSynchronize();\n    printf(\"group_point_grad gpu time %f\\n\",get_time()-t0);\n\n    cudaFree(xyz1);\n    cudaFree(xyz2);\n    cudaFree(points);\n    cudaFree(idx);\n    cudaFree(out);\n    cudaFree(grad_out);\n    cudaFree(grad_points);\n    cudaFree(radius_ptr);\n    return 0;\n}\n"
  },
  {
    "path": "code/tf_ops/grouping/query_ball_point_grid.cu",
    "content": "#include <cstdio>\n#include <ctime>\n#include <cstring> // memset\n#include <cstdlib> // rand, RAND_MAX\n#include <cmath> // sqrtf\n#include <string>\n#include <vector>\nusing namespace std;\nfloat randomf(){\n    return (rand()+0.5)/(RAND_MAX+1.0);\n}\nstatic double get_time(){\n    timespec tp;\n    clock_gettime(CLOCK_MONOTONIC,&tp);\n    return tp.tv_sec+tp.tv_nsec*1e-9;\n}\n// input: radius (1), nsample (1), xyz1 (b,n,3), xyz2 (b,m,3)\n// output: idx (b,m,nsample)\n__global__ void query_ball_point_gpu(int b, int n, int m, const float *radius, int nsample, const float *xyz1, const float *xyz2, int *idx) {\n    int batch_index = blockIdx.x;\n    xyz1 += n*3*batch_index;\n    xyz2 += m*3*batch_index;\n    idx += m*nsample*batch_index;\n\n    int index = threadIdx.x;\n    int stride = blockDim.x;\n    \n    for (int j=index;j<m;j+=stride) {\n        int cnt = 0;\n        for (int k=0;k<n;++k) {\n            if (cnt == nsample)\n                break; // only pick the FIRST nsample points in the ball\n            float x2=xyz2[j*3+0];\n            float y2=xyz2[j*3+1];\n            float z2=xyz2[j*3+2];\n            float x1=xyz1[k*3+0];\n            float y1=xyz1[k*3+1];\n            float z1=xyz1[k*3+2];\n    \tfloat d=max(sqrtf((x2-x1)*(x2-x1)+(y2-y1)*(y2-y1)+(z2-z1)*(z2-z1)),1e-20f);\n            if (d<radius[0]) {\n                if (cnt==0) { // set ALL indices to k, s.t. 
if there are less points in ball than nsample, we still have valid (repeating) indices\n                    for (int l=0;l<nsample;++l)\n                        idx[j*nsample+l] = k;\n                }\n                idx[j*nsample+cnt] = k;\n                cnt+=1;\n            }\n        }\n    }\n}\n\n\n// input: points (b,n,c), idx (b,m,nsample)\n// output: out (b,m,nsample,c)\n__global__ void group_point_gpu(int b, int n, int c, int m, int nsample, const float *points, const int *idx, float *out) {\n    int batch_index = blockIdx.x;\n    points += n*c*batch_index;\n    idx += m*nsample*batch_index;\n    out += m*nsample*c*batch_index;\n\n    int index = threadIdx.x;\n    int stride = blockDim.x;\n    \n    for (int j=index;j<m;j+=stride) {\n        for (int k=0;k<nsample;++k) {\n            int ii = idx[j*nsample+k];\n            for (int l=0;l<c;++l) {\n                out[j*nsample*c+k*c+l] = points[ii*c+l];\n            }\n        }\n    }\n}\n\n// input: grad_out (b,m,nsample,c), idx (b,m,nsample), \n// output: grad_points (b,n,c)\n__global__ void group_point_grad_gpu(int b, int n, int c, int m, int nsample, const float *grad_out, const int *idx, float *grad_points) {\n    int batch_index = blockIdx.x;\n    idx += m*nsample*batch_index;\n    grad_out += m*nsample*c*batch_index;\n    grad_points += n*c*batch_index;\n\n    int index = threadIdx.x;\n    int stride = blockDim.x;\n\n    for (int j=index;j<m;j+=stride) {\n        for (int k=0;k<nsample;++k) {\n            int ii = idx[j*nsample+k];\n            for (int l=0;l<c;++l) {\n                 // Use atomic add to avoid race condition\n                 atomicAdd(&grad_points[ii*c+l], grad_out[j*nsample*c+k*c+l]);\n            }\n        }\n    }\n}\n\nint main()\n{\n    int b=32,n=512,m=128,nsample=64,c=64;\n    float radius=0.1;\n    float *xyz1, *xyz2, *points;\n    cudaMallocManaged(&xyz1, b*n*3*sizeof(float));\n    cudaMallocManaged(&xyz2, b*m*3*sizeof(float));\n    cudaMallocManaged(&points, 
b*n*c*sizeof(float));\n    int *idx;\n    cudaMallocManaged(&idx, b*m*nsample*sizeof(int));\n    memset(idx, 0, sizeof(int)*b*m*nsample);\n    float *out, *grad_out;\n    cudaMallocManaged(&out, b*m*nsample*c*sizeof(float));\n    cudaMallocManaged(&grad_out, b*m*nsample*c*sizeof(float));\n    memset(grad_out, 0, sizeof(float)*b*m*nsample*c);\n    float *grad_points;\n    cudaMallocManaged(&grad_points, b*n*c*sizeof(float));\n    memset(grad_points, 0, sizeof(float)*b*n*c); // group_point_grad_gpu accumulates via atomicAdd\n    float *radius_ptr;\n    cudaMallocManaged(&radius_ptr, sizeof(float)); // kernel takes const float*, so give it device-visible memory\n    *radius_ptr = radius;\n\n    for (int i=0;i<b*n*3;i++)\n        xyz1[i]=randomf();\n    for (int i=0;i<b*m*3;i++)\n        xyz2[i]=randomf();\n    for (int i=0;i<b*n*c;i++)\n        points[i]=randomf();\n\n    double t0=get_time();\n    query_ball_point_gpu<<<b,256>>>(b,n,m,radius_ptr,nsample,xyz1,xyz2,idx);\n    cudaDeviceSynchronize();\n    printf(\"query_ball_point gpu time %f\\n\",get_time()-t0);\n\n    t0=get_time();\n    group_point_gpu<<<b,256>>>(b,n,c,m,nsample,points,idx,out);\n    cudaDeviceSynchronize();\n    printf(\"group_point gpu time %f\\n\",get_time()-t0);\n\n    t0=get_time();\n    group_point_grad_gpu<<<b,256>>>(b,n,c,m,nsample,grad_out,idx,grad_points);\n    cudaDeviceSynchronize();\n    printf(\"group_point_grad gpu time %f\\n\",get_time()-t0);\n\n    cudaFree(xyz1);\n    cudaFree(xyz2);\n    cudaFree(points);\n    cudaFree(idx);\n    cudaFree(out);\n    cudaFree(grad_out);\n    cudaFree(grad_points);\n    cudaFree(radius_ptr);\n    return 0;\n}\n"
  },
  {
    "path": "code/tf_ops/grouping/selection_sort.cpp",
    "content": "#include <cstdio>\n#include <ctime>\n#include <cstring> // memset\n#include <cstdlib> // rand, RAND_MAX\n#include <cmath> // sqrtf\n#include <string>\n#include <vector>\nusing namespace std;\nfloat randomf(){\n    return (rand()+0.5)/(RAND_MAX+1.0);\n}\nstatic double get_time(){\n    timespec tp;\n    clock_gettime(CLOCK_MONOTONIC,&tp);\n    return tp.tv_sec+tp.tv_nsec*1e-9;\n}\n\n// input: k (1), distance matrix dist (b,m,n)\n// output: idx (b,m,n), val (b,m,n)\nvoid selection_sort_cpu(int b, int n, int m, int k, const float *dist, int *idx, float *val) {\n    float *p_dist;\n    float tmp;\n    int tmpi;\n    for (int i=0;i<b;++i) {\n        for (int j=0;j<m;++j) {\n            for (int s=0;s<n;++s) {\n                val[i*m*n+j*n+s] = dist[i*m*n+j*n+s];\n                idx[i*m*n+j*n+s] = s;\n            }\n        }\n    }\n\n    for (int i=0;i<b;++i) {\n        for (int j=0;j<m;++j) {\n            for (int s=0;s<n;++s)\n                printf(\"%f \", dist[i*m*n+j*n+s]);\n            printf(\"\\n\");\n            p_dist = val+j*n;\n            // selection sort for the first k elements\n            for (int s=0;s<k;++s) {\n                int min=s; \n                // find the min\n                for (int t=s+1;t<n;++t) {\n                    if (p_dist[t]<p_dist[min]) {\n                        min = t;\n                    }\n                }\n                printf(\"%d\\n\", min);\n                // swap min-th and i-th element\n                if (min!=s) {\n                    tmp = p_dist[min];\n                    p_dist[min] = p_dist[s];\n                    p_dist[s] = tmp;\n                    tmpi = idx[j*n+min];\n                    idx[j*n+min] = idx[j*n+s];\n                    idx[j*n+s] = tmpi;\n                }       \n            }\n        }\n        idx+=m*n;\n        val+=m*n;\n    }\n}\n\nint main()\n{\n    //int b=32,n=10000,m=1000,k=128;\n    int b=2,n=4,m=2,k=3;\n    float *dist=new float[b*m*n];\n    int 
*idx=new int[b*m*n];\n    float *val=new float[b*m*n];\n    memset(idx, 0, sizeof(int)*b*m*n);\n    //for (int i=0;i<b*n*m;i++)\n    //    dist[i]=randomf();\n    for (int i=0;i<b*n*m;i++) {\n        dist[i] = float(10-i);\n        printf(\"%f \", dist[i]);\n    }\n    printf(\"\\n\");\n\n\n\n    double t0=get_time();\n    selection_sort_cpu(b,n,m,k,dist,idx,val);\n    printf(\"selection sort cpu time %f\\n\",get_time()-t0);\n\n    for (int i=0;i<b*n*m;++i)\n        printf(\"%d \", idx[i]);\n    printf(\"\\n\");\n    for (int i=0;i<b*n*m;++i)\n        printf(\"%f \", val[i]);\n    printf(\"\\n\");\n    return 0;\n}\n"
  },
  {
    "path": "code/tf_ops/grouping/selection_sort.cu",
"content": "#include <cstdio>\n#include <ctime>\n#include <cstring> // memset\n#include <cstdlib> // rand, RAND_MAX\n#include <cmath> // sqrtf\n#include <string>\n#include <vector>\nusing namespace std;\nfloat randomf(){\n    return (rand()+0.5)/(RAND_MAX+1.0);\n}\nstatic double get_time(){\n    timespec tp;\n    clock_gettime(CLOCK_MONOTONIC,&tp);\n    return tp.tv_sec+tp.tv_nsec*1e-9;\n}\n\n// input: k (1), distance matrix dist (b,m,n)\n// output: idx (b,m,k), val (b,m,k)\n__global__ void selection_sort_gpu(int b, int n, int m, int k, float *dist, int *idx, float *val) {\n    int batch_index = blockIdx.x;\n    dist+=m*n*batch_index;\n    idx+=m*k*batch_index;\n    val+=m*k*batch_index;\n\n    int index = threadIdx.x;\n    int stride = blockDim.x;\n\n    float *p_dist;\n    for (int j=index;j<m;j+=stride) {\n        p_dist = dist+j*n;\n        // selection sort for the first k elements\n        for (int s=0;s<k;++s) {\n            int min=s;\n            // find the min\n            for (int t=s+1;t<n;++t) {\n                if (p_dist[t]<p_dist[min]) {\n                    min = t;\n                }\n            }\n            // update idx and val; outputs are (b,m,k), so stride by k, not n\n            idx[j*k+s] = min;\n            val[j*k+s] = p_dist[min];\n            // swap min-th and s-th element\n            float tmp = p_dist[min];\n            p_dist[min] = p_dist[s];\n            p_dist[s] = tmp;\n        }\n    }\n}\n\nint main()\n{\n    //int b=32,n=10000,m=1000,k=128;\n    int b=32,n=2048,m=512,k=128;\n    float *dist;\n    int *idx;\n    float *val;\n    cudaMallocManaged(&dist, b*m*n*sizeof(float));\n    cudaMallocManaged(&idx, b*m*k*sizeof(int));\n    cudaMallocManaged(&val, b*m*k*sizeof(float));\n    cudaMemset(idx, 0, sizeof(int)*b*m*k);\n    for (int i=0;i<b*n*m;i++)\n        dist[i]=randomf();\n\n    double t0=get_time();\n    selection_sort_gpu<<<b,256>>>(b,n,m,k,dist,idx,val);\n    cudaDeviceSynchronize();\n    printf(\"selection sort gpu time %f\\n\",get_time()-t0);\n\n    return 0;\n}\n"
  },
  {
    "path": "code/tf_ops/grouping/selection_sort_const.cu",
    "content": "#include <cstdio>\n#include <ctime>\n#include <cstring> // memset\n#include <cstdlib> // rand, RAND_MAX\n#include <cmath> // sqrtf\n#include <string>\n#include <vector>\nusing namespace std;\nfloat randomf(){\n    return (rand()+0.5)/(RAND_MAX+1.0);\n}\nstatic double get_time(){\n    timespec tp;\n    clock_gettime(CLOCK_MONOTONIC,&tp);\n    return tp.tv_sec+tp.tv_nsec*1e-9;\n}\n\n// input: k (1), distance matrix dist (b,m,n)\n// output: idx (b,m,n), dist_out (b,m,n)\n__global__ void selection_sort_gpu(int b, int n, int m, int k, const float *dist, int *outi, float *out) {\n    int batch_index = blockIdx.x;\n    dist+=m*n*batch_index;\n    outi+=m*n*batch_index;\n    out+=m*n*batch_index;\n\n    int index = threadIdx.x;\n    int stride = blockDim.x;\n\n    // copy from dist to dist_out\n    for (int j=index;j<m;j+=stride) {\n        for (int s=0;s<n;++s) {\n            out[j*n+s] = dist[j*n+s];\n            outi[j*n+s] = s;\n        }\n    }\n\n    float *p_dist;\n    for (int j=index;j<m;j+=stride) {\n        p_dist = out+j*n;\n        // selection sort for the first k elements\n        for (int s=0;s<k;++s) {\n            int min=s; \n            // find the min\n            for (int t=s+1;t<n;++t) {\n                if (p_dist[t]<p_dist[min]) {\n                    min = t;\n                }\n            }\n            // swap min-th and i-th element\n            if (min!=s) {\n                float tmp = p_dist[min];\n                p_dist[min] = p_dist[s];\n                p_dist[s] = tmp;\n                int tmpi = outi[j*n+min];\n                outi[j*n+min] = outi[j*n+s];\n                outi[j*n+s] = tmpi;\n            }\n        }\n    }\n}\n\nint main()\n{\n    //int b=32,n=10000,m=1000,k=128;\n    int b=32,n=2048,m=512,k=128;\n    //int b=2,n=4,m=2,k=3;\n    float *dist;\n    int *idx;\n    float *dist_out;\n    cudaMallocManaged(&dist, b*m*n*sizeof(float));\n    cudaMallocManaged(&idx, b*m*n*sizeof(int));\n    
cudaMallocManaged(&dist_out, b*m*n*sizeof(float));\n    cudaMemset(idx, 0, sizeof(int)*b*m*n);\n    for (int i=0;i<b*n*m;i++)\n        dist[i]=randomf();\n    //for (int i=0;i<b*n*m;i++) {\n    //    dist[i] = float(10-i);\n    //    printf(\"%f \", dist[i]);\n    //}\n    //printf(\"\\n\");\n\n    double t0=get_time();\n    selection_sort_gpu<<<b,256>>>(b,n,m,k,dist,idx,dist_out);\n    cudaDeviceSynchronize();\n    printf(\"selection sort gpu time %f\\n\",get_time()-t0);\n\n    //for (int i=0;i<b*n*m;++i)\n    //    printf(\"%d \", idx[i]);\n    //printf(\"\\n\");\n\n    return 0;\n}\n"
  },
  {
    "path": "code/tf_ops/grouping/test_knn.py",
    "content": "import tensorflow as tf\nimport numpy as np\n\nnp.random.seed(0)\n\n\na_val = np.random.random((2,5,3))\nb_val = np.random.random((2,2,3))\nfor b in range(2):\n    print '--- ', b\n    t1 = a_val[b,:,:]\n    t2 = b_val[b,:,:]\n    for i in range(2): #npoint in b\n        print '-- point b: ', i\n        for j in range(5): # npoint in a\n            d = np.sum((t2[i,:]-t1[j,:])**2)\n            print d\n            \n\n\na = tf.constant(a_val)\nb = tf.constant(b_val)\nprint a.get_shape()\nk = 3\n\na = tf.tile(tf.reshape(a, (2,1,5,3)), [1,2,1,1])\nb = tf.tile(tf.reshape(b, (2,2,1,3)), [1,1,5,1])\n\ndist = -tf.reduce_sum((a-b)**2, -1)\nprint dist\n\nval, idx = tf.nn.top_k(dist, k=k)\nprint val, idx\nsess = tf.Session()\nprint sess.run(a)\nprint sess.run(b)\nprint sess.run(dist)\nprint sess.run(val)\nprint sess.run(idx)\nprint sess.run(idx).shape\n"
  },
  {
    "path": "code/tf_ops/grouping/tf_grouping.cpp",
    "content": "#include <cstdio>\n#include <ctime>\n#include <cstring> // memset\n#include <cstdlib> // rand, RAND_MAX\n#include <cmath> // sqrtf\n#include <iostream>\n#include \"tensorflow/core/framework/op.h\"\n#include \"tensorflow/core/framework/op_kernel.h\"\n#include \"tensorflow/core/framework/shape_inference.h\"\n#include \"tensorflow/core/framework/common_shape_fns.h\"\n#include <cuda_runtime.h>\nusing namespace tensorflow;\n\nREGISTER_OP(\"QueryBallPoint\")\n    .Attr(\"nsample: int\")\n    .Input(\"xyz1: float32\")\n    .Input(\"xyz2: float32\")\n    .Input(\"radius: float32\")\n    .Output(\"idx: int32\")\n    .Output(\"pts_cnt: int32\")\n    .SetShapeFn([](::tensorflow::shape_inference::InferenceContext* c) {\n        ::tensorflow::shape_inference::ShapeHandle dims2; // batch_size * npoint * 3\n        c->WithRank(c->input(1), 3, &dims2);\n        int nsample;\n        TF_RETURN_IF_ERROR(c->GetAttr(\"nsample\", &nsample));\n        ::tensorflow::shape_inference::ShapeHandle output1 = c->MakeShape({c->Dim(dims2, 0), c->Dim(dims2, 1), nsample});\n        c->set_output(0, output1);\n        ::tensorflow::shape_inference::ShapeHandle output2 = c->MakeShape({c->Dim(dims2, 0), c->Dim(dims2, 1)});\n        c->set_output(1, output2);\n        return Status::OK();\n    });\nREGISTER_OP(\"SelectionSort\")\n    .Attr(\"k: int\")\n    .Input(\"dist: float32\")\n    .Output(\"outi: int32\")\n    .Output(\"out: float32\")\n    .SetShapeFn([](::tensorflow::shape_inference::InferenceContext* c) {\n        c->set_output(0, c->input(0));\n        c->set_output(1, c->input(0));\n        return Status::OK();\n    });\nREGISTER_OP(\"GroupPoint\")\n    .Input(\"points: float32\")\n    .Input(\"idx: int32\")\n    .Output(\"out: float32\")\n    .SetShapeFn([](::tensorflow::shape_inference::InferenceContext* c) {\n        ::tensorflow::shape_inference::ShapeHandle dims1; // batch_size * ndataset * channels\n        c->WithRank(c->input(0), 3, &dims1);\n        
::tensorflow::shape_inference::ShapeHandle dims2; // batch_size * npoints * nsample\n        c->WithRank(c->input(1), 3, &dims2);\n        // batch_size * npoints * nsample * channels\n        ::tensorflow::shape_inference::ShapeHandle output = c->MakeShape({c->Dim(dims2, 0), c->Dim(dims2, 1), c->Dim(dims2, 2), c->Dim(dims1, 2)});\n        c->set_output(0, output);\n        return Status::OK();\n    });\nREGISTER_OP(\"GroupPointGrad\")\n    .Input(\"points: float32\")\n    .Input(\"idx: int32\")\n    .Input(\"grad_out: float32\")\n    .Output(\"grad_points: float32\")\n    .SetShapeFn([](::tensorflow::shape_inference::InferenceContext* c) {\n        c->set_output(0, c->input(0));\n        return Status::OK();\n    });\n\n\nvoid queryBallPointLauncher(int b, int n, int m, const float* radius, int nsample, const float *xyz1, const float *xyz2, int *idx, int *pts_cnt);\nclass QueryBallPointGpuOp : public OpKernel {\n    public:\n        explicit QueryBallPointGpuOp(OpKernelConstruction* context) : OpKernel(context) {\n            //OP_REQUIRES_OK(context, context->GetAttr(\"radius\", &radius_));\n            //OP_REQUIRES(context, radius_ > 0, errors::InvalidArgument(\"QueryBallPoint expects positive radius\"));\n\n            OP_REQUIRES_OK(context, context->GetAttr(\"nsample\", &nsample_));\n            OP_REQUIRES(context, nsample_ > 0, errors::InvalidArgument(\"QueryBallPoint expects positive nsample\"));\n        }\n\n        void Compute(OpKernelContext* context) override {\n            const Tensor& xyz1_tensor = context->input(0);\n            OP_REQUIRES(context, xyz1_tensor.dims()==3 && xyz1_tensor.shape().dim_size(2)==3, errors::InvalidArgument(\"QueryBallPoint expects (batch_size, ndataset, 3) xyz1 shape.\"));\n            int b = xyz1_tensor.shape().dim_size(0);\n            int n = xyz1_tensor.shape().dim_size(1);\n\n            const Tensor& xyz2_tensor = context->input(1);\n            OP_REQUIRES(context, xyz2_tensor.dims()==3 && 
xyz2_tensor.shape().dim_size(2)==3, errors::InvalidArgument(\"QueryBallPoint expects (batch_size, npoint, 3) xyz2 shape.\"));\n            int m = xyz2_tensor.shape().dim_size(1);\n\n            Tensor *idx_tensor = nullptr;\n            OP_REQUIRES_OK(context, context->allocate_output(0, TensorShape{b,m,nsample_}, &idx_tensor));\n            Tensor *pts_cnt_tensor = nullptr;\n            OP_REQUIRES_OK(context, context->allocate_output(1, TensorShape{b,m}, &pts_cnt_tensor));\n\n            const Tensor& radius_tensor = context->input(2);\n            auto radius_flat = radius_tensor.flat<float>();\n            const float *radius = &(radius_flat(0));\n\n            auto xyz1_flat = xyz1_tensor.flat<float>();\n            const float *xyz1 = &(xyz1_flat(0));\n            auto xyz2_flat = xyz2_tensor.flat<float>();\n            const float *xyz2 = &(xyz2_flat(0));\n            auto idx_flat = idx_tensor->flat<int>();\n            int *idx = &(idx_flat(0));\n            auto pts_cnt_flat = pts_cnt_tensor->flat<int>();\n            int *pts_cnt = &(pts_cnt_flat(0));\n            queryBallPointLauncher(b,n,m,radius,nsample_,xyz1,xyz2,idx,pts_cnt);\n        }\n    private:\n        int nsample_;\n};\nREGISTER_KERNEL_BUILDER(Name(\"QueryBallPoint\").Device(DEVICE_GPU), QueryBallPointGpuOp);\n\nvoid selectionSortLauncher(int b, int n, int m, int k, const float *dist, int *outi, float *out);\nclass SelectionSortGpuOp : public OpKernel {\n    public:\n        explicit SelectionSortGpuOp(OpKernelConstruction* context) : OpKernel(context) {\n            OP_REQUIRES_OK(context, context->GetAttr(\"k\", &k_));\n            OP_REQUIRES(context, k_ > 0, errors::InvalidArgument(\"SelectionSort expects positive k\"));\n        }\n\n        void Compute(OpKernelContext* context) override {\n            const Tensor& dist_tensor = context->input(0);\n            OP_REQUIRES(context, dist_tensor.dims()==3, errors::InvalidArgument(\"SelectionSort expects (b,m,n) dist shape.\"));\n       
     int b = dist_tensor.shape().dim_size(0);\n            int m = dist_tensor.shape().dim_size(1);\n            int n = dist_tensor.shape().dim_size(2);\n\n            Tensor *outi_tensor = nullptr;\n            OP_REQUIRES_OK(context, context->allocate_output(0, TensorShape{b,m,n}, &outi_tensor));\n            Tensor *out_tensor = nullptr;\n            OP_REQUIRES_OK(context, context->allocate_output(1, TensorShape{b,m,n}, &out_tensor));\n\n            auto dist_flat = dist_tensor.flat<float>();\n            const float *dist = &(dist_flat(0));\n            auto outi_flat = outi_tensor->flat<int>();\n            int *outi = &(outi_flat(0));\n            auto out_flat = out_tensor->flat<float>();\n            float *out = &(out_flat(0));\n            selectionSortLauncher(b,n,m,k_,dist,outi,out);\n        }\n    private:\n        int k_;\n};\nREGISTER_KERNEL_BUILDER(Name(\"SelectionSort\").Device(DEVICE_GPU), SelectionSortGpuOp);\n\n\nvoid groupPointLauncher(int b, int n, int c, int m, int nsample, const float *points, const int *idx, float *out);\nclass GroupPointGpuOp: public OpKernel{\n    public:\n        explicit GroupPointGpuOp(OpKernelConstruction * context):OpKernel(context){}\n\n        void Compute(OpKernelContext * context) override {\n            const Tensor& points_tensor=context->input(0);\n            OP_REQUIRES(context, points_tensor.dims()==3, errors::InvalidArgument(\"GroupPoint expects (batch_size, num_points, channel) points shape\"));\n            int b = points_tensor.shape().dim_size(0);\n            int n = points_tensor.shape().dim_size(1);\n            int c = points_tensor.shape().dim_size(2);\n\n            const Tensor& idx_tensor=context->input(1);\n            OP_REQUIRES(context,idx_tensor.dims()==3 && idx_tensor.shape().dim_size(0)==b, errors::InvalidArgument(\"GroupPoint expects (batch_size, npoints, nsample) idx shape\"));\n            int m = idx_tensor.shape().dim_size(1);\n            int nsample = 
idx_tensor.shape().dim_size(2);\n\n            Tensor * out_tensor = nullptr;\n            OP_REQUIRES_OK(context, context->allocate_output(0,TensorShape{b,m,nsample,c}, &out_tensor));\n\n            auto points_flat = points_tensor.flat<float>();\n            const float *points = &(points_flat(0));\n            auto idx_flat = idx_tensor.flat<int>();\n            const int *idx = &(idx_flat(0));\n            auto out_flat = out_tensor->flat<float>();\n            float *out = &(out_flat(0));\n            groupPointLauncher(b,n,c,m,nsample,points,idx,out);\n        }\n};\nREGISTER_KERNEL_BUILDER(Name(\"GroupPoint\").Device(DEVICE_GPU),GroupPointGpuOp);\n\nvoid groupPointGradLauncher(int b, int n, int c, int m, int nsample, const float *grad_out, const int *idx, float *grad_points);\nclass GroupPointGradGpuOp: public OpKernel{\n    public:\n        explicit GroupPointGradGpuOp(OpKernelConstruction * context):OpKernel(context){}\n\n        void Compute(OpKernelContext * context) override {\n            const Tensor& points_tensor=context->input(0);\n            OP_REQUIRES(context, points_tensor.dims()==3, errors::InvalidArgument(\"GroupPointGrad expects (batch_size, num_points, channel) points shape\"));\n            int b = points_tensor.shape().dim_size(0);\n            int n = points_tensor.shape().dim_size(1);\n            int c = points_tensor.shape().dim_size(2);\n\n            const Tensor& idx_tensor=context->input(1);\n            OP_REQUIRES(context,idx_tensor.dims()==3 && idx_tensor.shape().dim_size(0)==b, errors::InvalidArgument(\"GroupPointGrad expects (batch_size, npoints, nsample) idx shape\"));\n            int m = idx_tensor.shape().dim_size(1);\n            int nsample = idx_tensor.shape().dim_size(2);\n\n            const Tensor& grad_out_tensor=context->input(2);\n            OP_REQUIRES(context,grad_out_tensor.dims()==4 && grad_out_tensor.shape().dim_size(0)==b && grad_out_tensor.shape().dim_size(1)==m && 
grad_out_tensor.shape().dim_size(2)==nsample && grad_out_tensor.shape().dim_size(3)==c, errors::InvalidArgument(\"GroupPointGrad expects (batch_size, npoints, nsample, channel) grad_out shape\"));\n\n            Tensor * grad_points_tensor = nullptr;\n            OP_REQUIRES_OK(context, context->allocate_output(0,TensorShape{b,n,c}, &grad_points_tensor));\n\n            auto points_flat = points_tensor.flat<float>();\n            const float *points = &(points_flat(0));\n            auto idx_flat = idx_tensor.flat<int>();\n            const int *idx = &(idx_flat(0));\n            auto grad_out_flat = grad_out_tensor.flat<float>();\n            const float *grad_out = &(grad_out_flat(0));\n            auto grad_points_flat = grad_points_tensor->flat<float>();\n            float *grad_points = &(grad_points_flat(0));\n            cudaMemset(grad_points, 0, sizeof(float)*b*n*c);\n            groupPointGradLauncher(b,n,c,m,nsample,grad_out,idx,grad_points);\n        }\n};\nREGISTER_KERNEL_BUILDER(Name(\"GroupPointGrad\").Device(DEVICE_GPU),GroupPointGradGpuOp);\n\n\n"
  },
  {
    "path": "code/tf_ops/grouping/tf_grouping.py",
    "content": "import tensorflow as tf\nfrom tensorflow.python.framework import ops\nimport sys\nimport os\nBASE_DIR = os.path.dirname(os.path.abspath(__file__))\nsys.path.append(BASE_DIR)\ngrouping_module=tf.load_op_library(os.path.join(BASE_DIR, 'tf_grouping_so.so'))\ndef query_ball_point(radius, nsample, xyz1, xyz2):\n    '''\n    Input:\n        radius: float32, ball search radius\n        nsample: int32, number of points selected in each ball region\n        xyz1: (batch_size, ndataset, 3) float32 array, input points\n        xyz2: (batch_size, npoint, 3) float32 array, query points\n    Output:\n        idx: (batch_size, npoint, nsample) int32 array, indices to input points\n        pts_cnt: (batch_size, npoint) int32 array, number of unique points in each local region\n    '''\n    #return grouping_module.query_ball_point(radius, nsample, xyz1, xyz2)\n    return grouping_module.query_ball_point(xyz1, xyz2, radius, nsample)\nops.NoGradient('QueryBallPoint')\ndef select_top_k(k, dist):\n    '''\n    Input:\n        k: int32, number of k SMALLEST elements selected\n        dist: (b,m,n) float32 array, distance matrix, m query points, n dataset points\n    Output:\n        idx: (b,m,n) int32 array, first k in n are indices to the top k\n        dist_out: (b,m,n) float32 array, first k in n are the top k\n    '''\n    return grouping_module.selection_sort(dist, k)\nops.NoGradient('SelectionSort')\ndef group_point(points, idx):\n    '''\n    Input:\n        points: (batch_size, ndataset, channel) float32 array, points to sample from\n        idx: (batch_size, npoint, nsample) int32 array, indices to points\n    Output:\n        out: (batch_size, npoint, nsample, channel) float32 array, values sampled from points\n    '''\n    return grouping_module.group_point(points, idx)\n@tf.RegisterGradient('GroupPoint')\ndef _group_point_grad(op, grad_out):\n    points = op.inputs[0]\n    idx = op.inputs[1]\n    return [grouping_module.group_point_grad(points, idx, 
grad_out), None]\n\ndef knn_point(k, xyz1, xyz2):\n    '''\n    Input:\n        k: int32, number of k in k-nn search\n        xyz1: (batch_size, ndataset, c) float32 array, input points\n        xyz2: (batch_size, npoint, c) float32 array, query points\n    Output:\n        val: (batch_size, npoint, k) float32 array, L2 distances\n        idx: (batch_size, npoint, k) int32 array, indices to input points\n    '''\n    # b = xyz1.get_shape()[0].value\n    # n = xyz1.get_shape()[1].value\n    # c = xyz1.get_shape()[2].value\n    # m = xyz2.get_shape()[1].value\n    # xyz1 = tf.tile(tf.reshape(xyz1, (b,1,n,c)), [1,m,1,1])\n    # xyz2 = tf.tile(tf.reshape(xyz2, (b,m,1,c)), [1,1,n,1])\n    xyz1 = tf.expand_dims(xyz1,axis=1)\n    xyz2 = tf.expand_dims(xyz2,axis=2)\n    dist = tf.reduce_sum((xyz1-xyz2)**2, -1)\n\n    # outi, out = select_top_k(k, dist)\n    # idx = tf.slice(outi, [0,0,0], [-1,-1,k])\n    # val = tf.slice(out, [0,0,0], [-1,-1,k])\n\n    val, idx = tf.nn.top_k(-dist, k=k) # ONLY SUPPORT CPU\n    return val, idx\n\nif __name__=='__main__':\n    knn=True\n    import numpy as np\n    import time\n    np.random.seed(100)\n    pts = np.random.random((32,512,64)).astype('float32')\n    tmp1 = np.random.random((32,512,3)).astype('float32')\n    tmp2 = np.random.random((32,128,3)).astype('float32')\n    with tf.device('/gpu:1'):\n        points = tf.constant(pts)\n        xyz1 = tf.constant(tmp1)\n        xyz2 = tf.constant(tmp2)\n        radius = 0.1 \n        nsample = 64\n        if knn:\n            _, idx = knn_point(nsample, xyz1, xyz2)\n            grouped_points = group_point(points, idx)\n        else:\n            idx, _ = query_ball_point(radius, nsample, xyz1, xyz2)\n            grouped_points = group_point(points, idx)\n            #grouped_points_grad = tf.ones_like(grouped_points)\n            #points_grad = tf.gradients(grouped_points, points, grouped_points_grad)\n    with tf.Session('') as sess:\n        now = time.time() \n        for _ in 
range(100):\n            ret = sess.run(grouped_points)\n        print time.time() - now\n        print ret.shape, ret.dtype\n        print ret\n    \n    \n"
  },
  {
    "path": "code/tf_ops/grouping/tf_grouping_compile.sh",
    "content": "#/bin/bash\n/usr/local/cuda-8.0/bin/nvcc tf_grouping_g.cu -o tf_grouping_g.cu.o -c -O2 -DGOOGLE_CUDA=1 -x cu -Xcompiler -fPIC\n\ng++ -std=c++11 tf_grouping.cpp tf_grouping_g.cu.o -o tf_grouping_so.so -shared -fPIC -I /home/ganeshiyer/tensorflow/lib/python2.7/site-packages/tensorflow/include  -I /usr/local/cuda-8.0/include -lcudart -L /usr/local/cuda-8.0/lib64/ -O2 -D_GLIBCXX_USE_CXX11_ABI=0\n"
  },
  {
    "path": "code/tf_ops/grouping/tf_grouping_g.cu",
    "content": "// input: radius (1), nsample (1), xyz1 (b,n,3), xyz2 (b,m,3)\n// output: idx (b,m,nsample), pts_cnt (b,m)\n__global__ void query_ball_point_gpu(int b, int n, int m, const float *radius, int nsample, const float *xyz1, const float *xyz2, int *idx, int *pts_cnt) {\n    int batch_index = blockIdx.x;\n    xyz1 += n*3*batch_index;\n    xyz2 += m*3*batch_index;\n    idx += m*nsample*batch_index;\n    pts_cnt += m*batch_index; // counting how many unique points selected in local region\n\n    int index = threadIdx.x;\n    int stride = blockDim.x;\n    \n    for (int j=index;j<m;j+=stride) {\n        int cnt = 0;\n        for (int k=0;k<n;++k) {\n            if (cnt == nsample)\n                break; // only pick the FIRST nsample points in the ball\n            float x2=xyz2[j*3+0];\n            float y2=xyz2[j*3+1];\n            float z2=xyz2[j*3+2];\n            float x1=xyz1[k*3+0];\n            float y1=xyz1[k*3+1];\n            float z1=xyz1[k*3+2];\n    \t    float d=max(sqrtf((x2-x1)*(x2-x1)+(y2-y1)*(y2-y1)+(z2-z1)*(z2-z1)),1e-20f);\n            if (d<radius[0]) {\n                if (cnt==0) { // set ALL indices to k, s.t. 
if there are less points in ball than nsample, we still have valid (repeating) indices\n                    for (int l=0;l<nsample;++l)\n                        idx[j*nsample+l] = k;\n                }\n                idx[j*nsample+cnt] = k;\n                cnt+=1;\n            }\n        }\n        pts_cnt[j] = cnt;\n    }\n}\n\n// input: points (b,n,c), idx (b,m,nsample)\n// output: out (b,m,nsample,c)\n__global__ void group_point_gpu(int b, int n, int c, int m, int nsample, const float *points, const int *idx, float *out) {\n    int batch_index = blockIdx.x;\n    points += n*c*batch_index;\n    idx += m*nsample*batch_index;\n    out += m*nsample*c*batch_index;\n\n    int index = threadIdx.x;\n    int stride = blockDim.x;\n    \n    for (int j=index;j<m;j+=stride) {\n        for (int k=0;k<nsample;++k) {\n            int ii = idx[j*nsample+k];\n            for (int l=0;l<c;++l) {\n                out[j*nsample*c+k*c+l] = points[ii*c+l];\n            }\n        }\n    }\n}\n\n// input: grad_out (b,m,nsample,c), idx (b,m,nsample), \n// output: grad_points (b,n,c)\n__global__ void group_point_grad_gpu(int b, int n, int c, int m, int nsample, const float *grad_out, const int *idx, float *grad_points) {\n    int batch_index = blockIdx.x;\n    idx += m*nsample*batch_index;\n    grad_out += m*nsample*c*batch_index;\n    grad_points += n*c*batch_index;\n\n    int index = threadIdx.x;\n    int stride = blockDim.x;\n\n    for (int j=index;j<m;j+=stride) {\n        for (int k=0;k<nsample;++k) {\n            int ii = idx[j*nsample+k];\n            for (int l=0;l<c;++l) {\n                 atomicAdd(&grad_points[ii*c+l], grad_out[j*nsample*c+k*c+l]);\n            }\n        }\n    }\n}\n\n// input: k (1), distance matrix dist (b,m,n)\n// output: idx (b,m,n), dist_out (b,m,n)\n// only the top k results within n are useful\n__global__ void selection_sort_gpu(int b, int n, int m, int k, const float *dist, int *outi, float *out) {\n    int batch_index = blockIdx.x;\n    
dist+=m*n*batch_index;\n    outi+=m*n*batch_index;\n    out+=m*n*batch_index;\n\n    int index = threadIdx.x;\n    int stride = blockDim.x;\n\n    // copy from dist to dist_out\n    for (int j=index;j<m;j+=stride) {\n        for (int s=0;s<n;++s) {\n            out[j*n+s] = dist[j*n+s];\n            outi[j*n+s] = s;\n        }\n    }\n\n    float *p_dist;\n    for (int j=index;j<m;j+=stride) {\n        p_dist = out+j*n;\n        // selection sort for the first k elements\n        for (int s=0;s<k;++s) {\n            int min=s; \n            // find the min\n            for (int t=s+1;t<n;++t) {\n                if (p_dist[t]<p_dist[min]) {\n                    min = t;\n                }\n            }\n            // swap min-th and i-th element\n            if (min!=s) {\n                float tmp = p_dist[min];\n                p_dist[min] = p_dist[s];\n                p_dist[s] = tmp;\n                int tmpi = outi[j*n+min];\n                outi[j*n+min] = outi[j*n+s];\n                outi[j*n+s] = tmpi;\n            }\n        }\n    }\n}\n\nvoid queryBallPointLauncher(int b, int n, int m, const float *radius, int nsample, const float *xyz1, const float *xyz2, int *idx, int *pts_cnt) {\n    query_ball_point_gpu<<<b,256>>>(b,n,m,radius,nsample,xyz1,xyz2,idx,pts_cnt);\n    //cudaDeviceSynchronize();\n}\nvoid selectionSortLauncher(int b, int n, int m, int k, const float *dist, int *outi, float *out) {\n    selection_sort_gpu<<<b,256>>>(b,n,m,k,dist,outi,out); \n    //cudaDeviceSynchronize();\n}\nvoid groupPointLauncher(int b, int n, int c, int m, int nsample, const float *points, const int *idx, float *out){\n    group_point_gpu<<<b,256>>>(b,n,c,m,nsample,points,idx,out);\n    //cudaDeviceSynchronize();\n}\nvoid groupPointGradLauncher(int b, int n, int c, int m, int nsample, const float *grad_out, const int *idx, float *grad_points){\n    group_point_grad_gpu<<<b,256>>>(b,n,c,m,nsample,grad_out,idx,grad_points);\n    
//group_point_grad_gpu<<<1,1>>>(b,n,c,m,nsample,grad_out,idx,grad_points);\n    //cudaDeviceSynchronize();\n}\n"
  },
  {
    "path": "code/tf_ops/grouping/tf_grouping_op_test.py",
    "content": "import tensorflow as tf\nimport numpy as np\nfrom tf_grouping import query_ball_point, group_point\n\nclass GroupPointTest(tf.test.TestCase):\n  def test(self):\n    pass\n\n  def test_grad(self):\n    with tf.device('/gpu:0'):\n      points = tf.constant(np.random.random((1,128,16)).astype('float32'))\n      print points\n      xyz1 = tf.constant(np.random.random((1,128,3)).astype('float32'))\n      xyz2 = tf.constant(np.random.random((1,8,3)).astype('float32'))\n      radius = 0.3 \n      nsample = 32\n      idx, pts_cnt = query_ball_point(radius, nsample, xyz1, xyz2)\n      grouped_points = group_point(points, idx)\n      print grouped_points\n\n    with self.test_session():\n      print \"---- Going to compute gradient error\"\n      err = tf.test.compute_gradient_error(points, (1,128,16), grouped_points, (1,8,32,16))\n      print err\n      self.assertLess(err, 1e-4) \n\nif __name__=='__main__':\n  tf.test.main() \n"
  },
  {
    "path": "code/tf_ops/interpolation/__init__.py",
    "content": ""
  },
  {
    "path": "code/tf_ops/interpolation/interpolate.cpp",
    "content": "#include <cstdio>\n#include <ctime>\n#include <cstring> // memset\n#include <cstdlib> // rand, RAND_MAX\n#include <cmath> // sqrtf\n#include <string>\n#include <vector>\nusing namespace std;\nfloat randomf(){\n    return (rand()+0.5)/(RAND_MAX+1.0);\n}\nstatic double get_time(){\n    timespec tp;\n    clock_gettime(CLOCK_MONOTONIC,&tp);\n    return tp.tv_sec+tp.tv_nsec*1e-9;\n}\n\n// Find three nearest neigbors with square distance\n// input: xyz1 (b,n,3), xyz2(b,m,3)\n// output: dist (b,n,3), idx (b,n,3)\nvoid threenn_cpu(int b, int n, int m, const float *xyz1, const float *xyz2, float *dist, int *idx) {\n     for (int i=0;i<b;++i) {\n        for (int j=0;j<n;++j) {\n\t    float x1=xyz1[j*3+0];\n\t    float y1=xyz1[j*3+1];\n\t    float z1=xyz1[j*3+2];\n            double best1=1e40; double best2=1e40; double best3=1e40;\n            int besti1=0; int besti2=0; int besti3=0;\n            for (int k=0;k<m;++k) {\n                float x2=xyz2[k*3+0];\n\t        float y2=xyz2[k*3+1];\n\t        float z2=xyz2[k*3+2];\n\t\t//float d=max(sqrtf((x2-x1)*(x2-x1)+(y2-y1)*(y2-y1)+(z2-z1)*(z2-z1)),1e-20f);\n\t\tdouble d=x2*x2+y2*y2+z2*z2;\n                if (d<best1) {\n                    best3=best2;\n                    besti3=besti2;\n                    best2=best1;\n                    besti2=besti1;\n                    best1=d;\n                    besti1=k;\n                } else if (d<best2) {\n                    best3=best2;\n                    besti3=besti2;\n                    best2=d;\n                    besti2=k;\n                } else if (d<best3) {\n                    best3=d;\n                    besti3=k;\n                }\n            } \n            dist[j*3]=best1;\n            idx[j*3]=besti1;\n            dist[j*3+1]=best2;\n            idx[j*3+1]=besti2;\n            dist[j*3+2]=best3;\n            idx[j*3+2]=besti3;\n        } \n        xyz1+=n*3;\n        xyz2+=m*3;\n        dist+=n*3;\n        idx+=n*3;\n    }\n} \n\n// 
CONSTANT WEIGHT TODO\n// input: dist (b,n,3)\n// output: weight (b,n,3)\nvoid get_weights_cpu(int b, int n, const float *dist, float *weight) {\n    const float w = 1.0/3.0;\n    for (int i=0;i<b;++i) {\n        for (int j=0;j<n;++j) {\n            weight[j*3]=w;\n            weight[j*3+1]=w;\n            weight[j*3+2]=w;\n        } \n        dist+=n*3;\n        weight+=n*3;\n    }\n}\n\n// input: points (b,m,c), idx (b,n,3), weight (b,n,3)\n// output: out (b,n,c)\nvoid interpolate_cpu(int b, int m, int c, int n, const float *points, const int *idx, const float *weight, float *out) {\n     float w1,w2,w3;\n     int i1,i2,i3;\n     for (int i=0;i<b;++i) {\n        for (int j=0;j<n;++j) {\n            w1=weight[j*3];\n            w2=weight[j*3+1];\n            w3=weight[j*3+2]; \n            i1=idx[j*3];\n            i2=idx[j*3+1];\n            i3=idx[j*3+2];\n            for (int l=0;l<c;++l) {\n                out[j*c+l] = points[i1*c+l]*w1 + points[i2*c+l]*w2 + points[i3*c+l]*w3;\n            }\n        } \n        points+=m*c;\n        idx+=n*3;\n        weight+=n*3;\n        out+=n*c;\n    }\n}\n\n// input: grad_out (b,n,c), idx (b,n,3), weight (b,n,3)\n// output: grad_points (b,m,c)\nvoid interpolate_grad_cpu(int b, int n, int c, int m, const float *grad_out, const int *idx, const float *weight, float *grad_points) {\n     float w1,w2,w3;\n     int i1,i2,i3;\n     for (int i=0;i<b;++i) {\n        for (int j=0;j<n;++j) {\n            w1=weight[j*3];\n            w2=weight[j*3+1];\n            w3=weight[j*3+2]; \n            i1=idx[j*3];\n            i2=idx[j*3+1];\n            i3=idx[j*3+2];\n            for (int l=0;l<c;++l) {\n                grad_points[i1*c+l] += grad_out[j*c+l]*w1;\n                grad_points[i2*c+l] += grad_out[j*c+l]*w2;\n                grad_points[i3*c+l] += grad_out[j*c+l]*w3;\n            }\n        } \n        grad_out+=n*c;\n        idx+=n*3;\n        weight+=n*3;\n        grad_points+=m*c;\n    }\n}\n\nint main()\n{\n    int 
b=32,n=512,m=128,c=64;\n    float *xyz1=new float[b*n*3];\n    float *xyz2=new float[b*m*3];\n    float *dist=new float[b*n*3];\n    int *idx=new int[b*n*3];\n    memset(idx, 0, sizeof(int)*b*n*3);\n    float *weight=new float[b*n*3];\n    float *points=new float[b*m*c];\n    float *out=new float[b*n*c];\n    float *grad_out=new float[b*n*c]; // grad to out\n    memset(grad_out, 0.0, sizeof(float)*b*n*c);\n    float *grad_points=new float[b*m*c]; // grad to points\n    for (int i=0;i<b*n*3;i++)\n        xyz1[i]=randomf();\n    for (int i=0;i<b*m*3;i++)\n        xyz2[i]=randomf();\n    for (int i=0;i<b*m*c;i++)\n        points[i]=randomf();\n\n    double t0=get_time();\n    threenn_cpu(b,n,m,xyz1,xyz2,dist,idx);\n    printf(\"threenn cpu time %f\\n\",get_time()-t0);\n    \n    t0=get_time();\n    get_weights_cpu(b,n,dist,weight);\n    printf(\"get_weights_cpu cpu time %f\\n\",get_time()-t0);\n\n    t0=get_time();\n    interpolate_cpu(b,m,c,n,points,idx,weight,out);\n    printf(\"interpolate_cpu cpu time %f\\n\",get_time()-t0);\n\n    t0=get_time();\n    interpolate_grad_cpu(b,n,c,m,grad_out,idx,weight,grad_points);\n    printf(\"interpolate_grad_cpu cpu time %f\\n\",get_time()-t0);\n    return 0;\n}\n"
  },
  {
    "path": "code/tf_ops/interpolation/tf_interpolate.cpp",
    "content": "#include <cstdio>\n#include <ctime>\n#include <cstring> // memset\n#include <cstdlib> // rand, RAND_MAX\n#include <cmath> // sqrtf\n#include \"tensorflow/core/framework/op.h\"\n#include \"tensorflow/core/framework/op_kernel.h\"\n#include \"tensorflow/core/framework/shape_inference.h\"\n#include \"tensorflow/core/framework/common_shape_fns.h\"\nusing namespace tensorflow;\n\nREGISTER_OP(\"ThreeNN\")\n    .Input(\"xyz1: float32\")\n    .Input(\"xyz2: float32\")\n    .Output(\"dist: float32\")\n    .Output(\"idx: int32\")\n    .SetShapeFn([](::tensorflow::shape_inference::InferenceContext* c) {\n        c->set_output(0, c->input(0));\n        c->set_output(1, c->input(0));\n        return Status::OK();\n    });\nREGISTER_OP(\"ThreeInterpolate\")\n    .Input(\"points: float32\")\n    .Input(\"idx: int32\")\n    .Input(\"weight: float32\")\n    .Output(\"out: float32\")\n    .SetShapeFn([](::tensorflow::shape_inference::InferenceContext* c) {\n        ::tensorflow::shape_inference::ShapeHandle dims1; // (b,m,c)\n        c->WithRank(c->input(0), 3, &dims1);\n        ::tensorflow::shape_inference::ShapeHandle dims2; // (b,n,3)\n        c->WithRank(c->input(1), 3, &dims2);\n        // (b,n,c)\n        ::tensorflow::shape_inference::ShapeHandle output = c->MakeShape({c->Dim(dims1, 0), c->Dim(dims2, 1), c->Dim(dims1, 2)});\n        c->set_output(0, output);\n        return Status::OK();\n    });\nREGISTER_OP(\"ThreeInterpolateGrad\")\n    .Input(\"points: float32\")\n    .Input(\"idx: int32\")\n    .Input(\"weight: float32\")\n    .Input(\"grad_out: float32\")\n    .Output(\"grad_points: float32\")\n    .SetShapeFn([](::tensorflow::shape_inference::InferenceContext* c) {\n        c->set_output(0, c->input(0));\n        return Status::OK();\n    });\n\nfloat randomf(){\n    return (rand()+0.5)/(RAND_MAX+1.0);\n}\nstatic double get_time(){\n    timespec tp;\n    clock_gettime(CLOCK_MONOTONIC,&tp);\n    return tp.tv_sec+tp.tv_nsec*1e-9;\n}\n\n// Find three 
nearest neigbors with square distance\n// input: xyz1 (b,n,3), xyz2(b,m,3)\n// output: dist (b,n,3), idx (b,n,3)\nvoid threenn_cpu(int b, int n, int m, const float *xyz1, const float *xyz2, float *dist, int *idx) {\n     for (int i=0;i<b;++i) {\n        for (int j=0;j<n;++j) {\n\t    float x1=xyz1[j*3+0];\n\t    float y1=xyz1[j*3+1];\n\t    float z1=xyz1[j*3+2];\n            double best1=1e40; double best2=1e40; double best3=1e40;\n            int besti1=0; int besti2=0; int besti3=0;\n            for (int k=0;k<m;++k) {\n                float x2=xyz2[k*3+0];\n\t        float y2=xyz2[k*3+1];\n\t        float z2=xyz2[k*3+2];\n\t\t//float d=max(sqrtf((x2-x1)*(x2-x1)+(y2-y1)*(y2-y1)+(z2-z1)*(z2-z1)),1e-20f);\n\t\tdouble d=(x2-x1)*(x2-x1)+(y2-y1)*(y2-y1)+(z2-z1)*(z2-z1);\n                if (d<best1) {\n                    best3=best2;\n                    besti3=besti2;\n                    best2=best1;\n                    besti2=besti1;\n                    best1=d;\n                    besti1=k;\n                } else if (d<best2) {\n                    best3=best2;\n                    besti3=besti2;\n                    best2=d;\n                    besti2=k;\n                } else if (d<best3) {\n                    best3=d;\n                    besti3=k;\n                }\n            } \n            dist[j*3]=best1;\n            idx[j*3]=besti1;\n            dist[j*3+1]=best2;\n            idx[j*3+1]=besti2;\n            dist[j*3+2]=best3;\n            idx[j*3+2]=besti3;\n        } \n        xyz1+=n*3;\n        xyz2+=m*3;\n        dist+=n*3;\n        idx+=n*3;\n    }\n} \n\n// input: points (b,m,c), idx (b,n,3), weight (b,n,3)\n// output: out (b,n,c)\nvoid threeinterpolate_cpu(int b, int m, int c, int n, const float *points, const int *idx, const float *weight, float *out) {\n     float w1,w2,w3;\n     int i1,i2,i3;\n     for (int i=0;i<b;++i) {\n        for (int j=0;j<n;++j) {\n            w1=weight[j*3];\n            w2=weight[j*3+1];\n            
w3=weight[j*3+2]; \n            i1=idx[j*3];\n            i2=idx[j*3+1];\n            i3=idx[j*3+2];\n            for (int l=0;l<c;++l) {\n                out[j*c+l] = points[i1*c+l]*w1 + points[i2*c+l]*w2 + points[i3*c+l]*w3;\n            }\n        } \n        points+=m*c;\n        idx+=n*3;\n        weight+=n*3;\n        out+=n*c;\n    }\n}\n\n// input: grad_out (b,n,c), idx (b,n,3), weight (b,n,3)\n// output: grad_points (b,m,c)\nvoid threeinterpolate_grad_cpu(int b, int n, int c, int m, const float *grad_out, const int *idx, const float *weight, float *grad_points) {\n     float w1,w2,w3;\n     int i1,i2,i3;\n     for (int i=0;i<b;++i) {\n        for (int j=0;j<n;++j) {\n            w1=weight[j*3];\n            w2=weight[j*3+1];\n            w3=weight[j*3+2]; \n            i1=idx[j*3];\n            i2=idx[j*3+1];\n            i3=idx[j*3+2];\n            for (int l=0;l<c;++l) {\n                grad_points[i1*c+l] += grad_out[j*c+l]*w1;\n                grad_points[i2*c+l] += grad_out[j*c+l]*w2;\n                grad_points[i3*c+l] += grad_out[j*c+l]*w3;\n            }\n        } \n        grad_out+=n*c;\n        idx+=n*3;\n        weight+=n*3;\n        grad_points+=m*c;\n    }\n}\n\n\n\nclass ThreeNNOp : public OpKernel {\n    public:\n        explicit ThreeNNOp(OpKernelConstruction* context) : OpKernel(context) {}\n\n        void Compute(OpKernelContext* context) override {\n            const Tensor& xyz1_tensor = context->input(0);\n            OP_REQUIRES(context, xyz1_tensor.dims()==3 && xyz1_tensor.shape().dim_size(2)==3, errors::InvalidArgument(\"ThreeNN expects (b,n,3) xyz1 shape.\"));\n            int b = xyz1_tensor.shape().dim_size(0);\n            int n = xyz1_tensor.shape().dim_size(1);\n\n            const Tensor& xyz2_tensor = context->input(1);\n            OP_REQUIRES(context, xyz2_tensor.dims()==3 && xyz2_tensor.shape().dim_size(2)==3, errors::InvalidArgument(\"ThreeNN expects (b,m,3) xyz2 shape.\"));\n            int m = 
xyz2_tensor.shape().dim_size(1);\n\n            Tensor *dist_tensor = nullptr;\n            OP_REQUIRES_OK(context, context->allocate_output(0, TensorShape{b,n,3}, &dist_tensor));\n            Tensor *idx_tensor = nullptr;\n            OP_REQUIRES_OK(context, context->allocate_output(1, TensorShape{b,n,3}, &idx_tensor));\n\n            auto xyz1_flat = xyz1_tensor.flat<float>();\n            const float *xyz1 = &(xyz1_flat(0));\n            auto xyz2_flat = xyz2_tensor.flat<float>();\n            const float *xyz2 = &(xyz2_flat(0));\n            auto dist_flat = dist_tensor->flat<float>();\n            float *dist = &(dist_flat(0));\n            auto idx_flat = idx_tensor->flat<int>();\n            int *idx = &(idx_flat(0));\n            threenn_cpu(b,n,m,xyz1,xyz2,dist,idx);\n        }\n};\nREGISTER_KERNEL_BUILDER(Name(\"ThreeNN\").Device(DEVICE_CPU), ThreeNNOp);\n\n\n\nclass ThreeInterpolateOp: public OpKernel{\n    public:\n        explicit ThreeInterpolateOp(OpKernelConstruction * context):OpKernel(context){}\n\n        void Compute(OpKernelContext * context) override {\n            const Tensor& points_tensor=context->input(0);\n            OP_REQUIRES(context, points_tensor.dims()==3, errors::InvalidArgument(\"ThreeInterpolate expects (b,m,c) points shape\"));\n            int b = points_tensor.shape().dim_size(0);\n            int m = points_tensor.shape().dim_size(1);\n            int c = points_tensor.shape().dim_size(2);\n\n            const Tensor& idx_tensor=context->input(1);\n            OP_REQUIRES(context,idx_tensor.dims()==3 && idx_tensor.shape().dim_size(0)==b && idx_tensor.shape().dim_size(2)==3, errors::InvalidArgument(\"ThreeInterpolate expects (b,n,3) idx shape\"));\n            int n = idx_tensor.shape().dim_size(1);\n            const Tensor& weight_tensor=context->input(2);\n            OP_REQUIRES(context,weight_tensor.dims()==3 && weight_tensor.shape().dim_size(0)==b && weight_tensor.shape().dim_size(1)==n && 
weight_tensor.shape().dim_size(2)==3, errors::InvalidArgument(\"ThreeInterpolate expects (b,n,3) weight shape\"));\n\n            Tensor * out_tensor = nullptr;\n            OP_REQUIRES_OK(context, context->allocate_output(0,TensorShape{b,n,c}, &out_tensor));\n\n            auto points_flat = points_tensor.flat<float>();\n            const float *points = &(points_flat(0));\n            auto idx_flat = idx_tensor.flat<int>();\n            const int *idx = &(idx_flat(0));\n            auto weight_flat = weight_tensor.flat<float>();\n            const float *weight = &(weight_flat(0));\n            auto out_flat = out_tensor->flat<float>();\n            float *out = &(out_flat(0));\n            threeinterpolate_cpu(b,m,c,n,points,idx,weight,out);\n        }\n};\nREGISTER_KERNEL_BUILDER(Name(\"ThreeInterpolate\").Device(DEVICE_CPU),ThreeInterpolateOp);\n\n\nclass ThreeInterpolateGradOp: public OpKernel{\n    public:\n        explicit ThreeInterpolateGradOp(OpKernelConstruction * context):OpKernel(context){}\n\n        void Compute(OpKernelContext * context) override {\n            const Tensor& points_tensor=context->input(0);\n            OP_REQUIRES(context, points_tensor.dims()==3, errors::InvalidArgument(\"ThreeInterpolateGrad expects (b,m,c) points shape\"));\n            int b = points_tensor.shape().dim_size(0);\n            int m = points_tensor.shape().dim_size(1);\n            int c = points_tensor.shape().dim_size(2);\n\n            const Tensor& idx_tensor=context->input(1);\n            OP_REQUIRES(context,idx_tensor.dims()==3 && idx_tensor.shape().dim_size(0)==b, errors::InvalidArgument(\"ThreeInterpolateGrad expects (b,n,3) idx shape\"));\n            int n = idx_tensor.shape().dim_size(1);\n            const Tensor& weight_tensor=context->input(2);\n            OP_REQUIRES(context,weight_tensor.dims()==3 && weight_tensor.shape().dim_size(0)==b && weight_tensor.shape().dim_size(1)==n && weight_tensor.shape().dim_size(2)==3, 
errors::InvalidArgument(\"ThreeInterpolateGrad expects (b,n,3) weight shape\"));\n\n            const Tensor& grad_out_tensor=context->input(3);\n            OP_REQUIRES(context,grad_out_tensor.dims()==3 && grad_out_tensor.shape().dim_size(0)==b && grad_out_tensor.shape().dim_size(1)==n && grad_out_tensor.shape().dim_size(2)==c, errors::InvalidArgument(\"ThreeInterpolateGrad expects (b,n,c) grad_out shape\"));\n\n            Tensor * grad_points_tensor = nullptr;\n            OP_REQUIRES_OK(context, context->allocate_output(0,TensorShape{b,m,c}, &grad_points_tensor));\n\n            auto points_flat = points_tensor.flat<float>();\n            const float *points = &(points_flat(0));\n            auto idx_flat = idx_tensor.flat<int>();\n            const int *idx = &(idx_flat(0));\n            auto weight_flat = weight_tensor.flat<float>();\n            const float *weight = &(weight_flat(0));\n            auto grad_out_flat = grad_out_tensor.flat<float>();\n            const float *grad_out = &(grad_out_flat(0));\n            auto grad_points_flat = grad_points_tensor->flat<float>();\n            float *grad_points = &(grad_points_flat(0));\n            memset(grad_points, 0, sizeof(float)*b*m*c);\n            threeinterpolate_grad_cpu(b,n,c,m,grad_out,idx,weight,grad_points);\n        }\n};\nREGISTER_KERNEL_BUILDER(Name(\"ThreeInterpolateGrad\").Device(DEVICE_CPU),ThreeInterpolateGradOp);\n\n\n"
  },
  {
    "path": "code/tf_ops/interpolation/tf_interpolate.py",
    "content": "import tensorflow as tf\nfrom tensorflow.python.framework import ops\nimport sys\nimport os\nBASE_DIR = os.path.dirname(__file__)\nsys.path.append(BASE_DIR)\ninterpolate_module=tf.load_op_library(os.path.join(BASE_DIR, 'tf_interpolate_so.so'))\ndef three_nn(xyz1, xyz2):\n    '''\n    Input:\n        xyz1: (b,n,3) float32 array, unknown points\n        xyz2: (b,m,3) float32 array, known points\n    Output:\n        dist: (b,n,3) float32 array, distances to known points\n        idx: (b,n,3) int32 array, indices to known points\n    '''\n    return interpolate_module.three_nn(xyz1, xyz2)\nops.NoGradient('ThreeNN')\ndef three_interpolate(points, idx, weight):\n    '''\n    Input:\n        points: (b,m,c) float32 array, known points\n        idx: (b,n,3) int32 array, indices to known points\n        weight: (b,n,3) float32 array, weights on known points\n    Output:\n        out: (b,n,c) float32 array, interpolated point values\n    '''\n    return interpolate_module.three_interpolate(points, idx, weight)\n@tf.RegisterGradient('ThreeInterpolate')\ndef _three_interpolate_grad(op, grad_out):\n    points = op.inputs[0]\n    idx = op.inputs[1]\n    weight = op.inputs[2]\n    return [interpolate_module.three_interpolate_grad(points, idx, weight, grad_out), None, None]\n\nif __name__=='__main__':\n    import numpy as np\n    import time\n    np.random.seed(100)\n    pts = np.random.random((32,128,64)).astype('float32')\n    tmp1 = np.random.random((32,512,3)).astype('float32')\n    tmp2 = np.random.random((32,128,3)).astype('float32')\n    with tf.device('/cpu:0'):\n        points = tf.constant(pts)\n        xyz1 = tf.constant(tmp1)\n        xyz2 = tf.constant(tmp2)\n        dist, idx = three_nn(xyz1, xyz2)\n        weight = tf.ones_like(dist)/3.0\n        interpolated_points = three_interpolate(points, idx, weight)\n    with tf.Session('') as sess:\n        now = time.time() \n        for _ in range(100):\n            ret = sess.run(interpolated_points)\n    
    print time.time() - now\n        print ret.shape, ret.dtype\n        #print ret\n    \n    \n    \n"
  },
  {
    "path": "code/tf_ops/interpolation/tf_interpolate_compile.sh",
    "content": "g++ -std=c++11 tf_interpolate.cpp -o tf_interpolate_so.so -shared -fPIC -I /home/ganeshiyer/tensorflow/lib/python2.7/site-packages/tensorflow/include  -I /usr/local/cuda-8.0/include -lcudart -L /usr/local/cuda-8.0/lib64/ -O2 -D_GLIBCXX_USE_CXX11_ABI=0\n"
  },
  {
    "path": "code/tf_ops/interpolation/tf_interpolate_op_test.py",
    "content": "import tensorflow as tf\nimport numpy as np\nfrom tf_interpolate import three_nn, three_interpolate\n\nclass GroupPointTest(tf.test.TestCase):\n  def test(self):\n    pass\n\n  def test_grad(self):\n    with self.test_session():\n      points = tf.constant(np.random.random((1,8,16)).astype('float32'))\n      print points\n      xyz1 = tf.constant(np.random.random((1,128,3)).astype('float32'))\n      xyz2 = tf.constant(np.random.random((1,8,3)).astype('float32'))\n      dist, idx = three_nn(xyz1, xyz2)\n      weight = tf.ones_like(dist)/3.0\n      interpolated_points = three_interpolate(points, idx, weight)\n      print interpolated_points\n      err = tf.test.compute_gradient_error(points, (1,8,16), interpolated_points, (1,128,16))\n      print err\n      self.assertLess(err, 1e-4) \n\nif __name__=='__main__':\n  tf.test.main() \n"
  },
  {
    "path": "code/tf_ops/interpolation/visu_interpolation.py",
    "content": "''' Visualize part segmentation '''\nimport os\nimport sys\nROOT_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))\nsys.path.append('/home/rqi/Projects/toolkits/visualization')\nfrom show3d_balls import showpoints\nimport numpy as np\nfrom tf_interpolate import three_nn, three_interpolate\nimport tensorflow as tf\n\n\npts2 = np.array([[0,0,1],[1,0,0],[0,1,0],[1,1,0]]).astype('float32')\nxyz1 = np.random.random((100,3)).astype('float32')\nxyz2 = np.array([[0,0,0],[1,0,0],[0,1,0],[1,1,1]]).astype('float32')\n\ndef fun(xyz1,xyz2,pts2):\n    with tf.device('/cpu:0'):\n        points = tf.constant(np.expand_dims(pts2,0))\n        xyz1 = tf.constant(np.expand_dims(xyz1,0))\n        xyz2 = tf.constant(np.expand_dims(xyz2,0))\n        dist, idx = three_nn(xyz1, xyz2)\n        #weight = tf.ones_like(dist)/3.0\n        dist = tf.maximum(dist, 1e-10)\n        norm = tf.reduce_sum((1.0/dist),axis=2,keep_dims=True)\n        norm = tf.tile(norm, [1,1,3])\n        print norm\n        weight = (1.0/dist) / norm\n        interpolated_points = three_interpolate(points, idx, weight)\n    with tf.Session('') as sess:\n        tmp,pts1,d,w = sess.run([xyz1, interpolated_points, dist, weight])\n        #print w\n        pts1 = pts1.squeeze()\n    return pts1\n\npts1 = fun(xyz1,xyz2,pts2) \nall_pts = np.zeros((104,3))\nall_pts[0:100,:] = pts1\nall_pts[100:,:] = pts2\nall_xyz = np.zeros((104,3))\nall_xyz[0:100,:]=xyz1\nall_xyz[100:,:]=xyz2\nshowpoints(xyz2, pts2, ballradius=8)\nshowpoints(xyz1, pts1, ballradius=8)\nshowpoints(all_xyz, all_pts, ballradius=8)\n"
  },
  {
    "path": "code/tf_ops/sampling/__init__.py",
    "content": ""
  },
  {
    "path": "code/tf_ops/sampling/tf_sampling.cpp",
    "content": "/* Furthest point sampling\n * Original author: Haoqiang Fan\n * Modified by Charles R. Qi\n * All Rights Reserved. 2017. \n */\n#include \"tensorflow/core/framework/op.h\"\n#include \"tensorflow/core/framework/op_kernel.h\"\n#include \"tensorflow/core/framework/shape_inference.h\"\n#include \"tensorflow/core/framework/common_shape_fns.h\"\n#include <cuda_runtime.h>\n\nusing namespace tensorflow;\n\nREGISTER_OP(\"ProbSample\")\n  .Input(\"inp: float32\")\n  .Input(\"inpr: float32\")\n  .Output(\"out: int32\")\n  .SetShapeFn([](::tensorflow::shape_inference::InferenceContext* c) {\n    ::tensorflow::shape_inference::ShapeHandle dims1; // batch_size * ncategory\n    c->WithRank(c->input(0), 2, &dims1);\n    ::tensorflow::shape_inference::ShapeHandle dims2; // batch_size * npoints\n    c->WithRank(c->input(1), 2, &dims2);\n    // batch_size * npoints\n    ::tensorflow::shape_inference::ShapeHandle output = c->MakeShape({c->Dim(dims2, 0), c->Dim(dims2, 1)});\n    c->set_output(0, output);\n    return Status::OK();\n  });\nREGISTER_OP(\"FarthestPointSample\")\n  .Attr(\"npoint: int\")\n  .Input(\"inp: float32\")\n  .Output(\"out: int32\")\n  .SetShapeFn([](::tensorflow::shape_inference::InferenceContext* c) {\n    ::tensorflow::shape_inference::ShapeHandle dims1; // batch_size * npoint * 3\n    c->WithRank(c->input(0), 3, &dims1);\n    int npoint;\n    TF_RETURN_IF_ERROR(c->GetAttr(\"npoint\", &npoint));\n    ::tensorflow::shape_inference::ShapeHandle output = c->MakeShape({c->Dim(dims1, 0), npoint});\n    c->set_output(0, output);\n    return Status::OK();\n  });\nREGISTER_OP(\"GatherPoint\")\n  .Input(\"inp: float32\")\n  .Input(\"idx: int32\")\n  .Output(\"out: float32\")\n  .SetShapeFn([](::tensorflow::shape_inference::InferenceContext* c) {\n    ::tensorflow::shape_inference::ShapeHandle dims1; // batch_size * ndataset * 3\n    c->WithRank(c->input(0), 3, &dims1);\n    ::tensorflow::shape_inference::ShapeHandle dims2; // batch_size * npoints\n    
c->WithRank(c->input(1), 2, &dims2);\n    // batch_size * npoints * 3\n    ::tensorflow::shape_inference::ShapeHandle output = c->MakeShape({c->Dim(dims1, 0), c->Dim(dims2, 1), c->Dim(dims1, 2)});\n    c->set_output(0, output);\n    return Status::OK();\n  });\nREGISTER_OP(\"GatherPointGrad\")\n  .Input(\"inp: float32\")\n  .Input(\"idx: int32\")\n  .Input(\"out_g: float32\")\n  .Output(\"inp_g: float32\")\n  .SetShapeFn([](::tensorflow::shape_inference::InferenceContext* c) {\n    c->set_output(0, c->input(0));\n    return Status::OK();\n  });\n\nvoid probsampleLauncher(int b,int n,int m,const float * inp_p,const float * inp_r,float * temp,int * out);\nclass ProbSampleGpuOp: public OpKernel{\n  public:\n    explicit ProbSampleGpuOp(OpKernelConstruction* context):OpKernel(context){}\n    void Compute(OpKernelContext * context)override{\n      const Tensor& inp_tensor=context->input(0);\n      const Tensor& inpr_tensor=context->input(1);\n      auto inp_flat=inp_tensor.flat<float>();\n      auto inpr_flat=inpr_tensor.flat<float>();\n      const float * inp=&(inp_flat(0));\n      const float * inpr=&(inpr_flat(0));\n      OP_REQUIRES(context,inp_tensor.dims()==2,errors::InvalidArgument(\"ProbSample expects (batch_size,num_choices) inp shape\"));\n      int b=inp_tensor.shape().dim_size(0);\n      int n=inp_tensor.shape().dim_size(1);\n      OP_REQUIRES(context,inpr_tensor.dims()==2 && inpr_tensor.shape().dim_size(0)==b,errors::InvalidArgument(\"ProbSample expects (batch_size,num_points) inpr shape\"));\n      int m=inpr_tensor.shape().dim_size(1);\n      Tensor * out_tensor=NULL;\n      OP_REQUIRES_OK(context,context->allocate_output(0,TensorShape{b,m},&out_tensor));\n      auto out_flat=out_tensor->flat<int>();\n      int * out=&(out_flat(0));\n      Tensor temp_tensor;\n      OP_REQUIRES_OK(context,context->allocate_temp(DataTypeToEnum<float>::value,TensorShape{b,n},&temp_tensor));\n      auto temp_flat=temp_tensor.flat<float>();\n      float * 
temp=&(temp_flat(0));\n      probsampleLauncher(b,n,m,inp,inpr,temp,out);\n    }\n};\nREGISTER_KERNEL_BUILDER(Name(\"ProbSample\").Device(DEVICE_GPU), ProbSampleGpuOp);\n\nvoid farthestpointsamplingLauncher(int b,int n,int m,const float * inp,float * temp,int * out);\nclass FarthestPointSampleGpuOp: public OpKernel{\n  public:\n    explicit FarthestPointSampleGpuOp(OpKernelConstruction* context):OpKernel(context) {\n                    OP_REQUIRES_OK(context, context->GetAttr(\"npoint\", &npoint_));\n                    OP_REQUIRES(context, npoint_ > 0, errors::InvalidArgument(\"FarthestPointSample expects positive npoint\"));\n                }\n    void Compute(OpKernelContext * context)override{\n      int m = npoint_;\n\n      const Tensor& inp_tensor=context->input(0);\n      OP_REQUIRES(context,inp_tensor.dims()==3 && inp_tensor.shape().dim_size(2)==3,errors::InvalidArgument(\"FarthestPointSample expects (batch_size,num_points,3) inp shape\"));\n      int b=inp_tensor.shape().dim_size(0);\n      int n=inp_tensor.shape().dim_size(1);\n      auto inp_flat=inp_tensor.flat<float>();\n      const float * inp=&(inp_flat(0));\n      Tensor * out_tensor;\n      OP_REQUIRES_OK(context,context->allocate_output(0,TensorShape{b,m},&out_tensor));\n      auto out_flat=out_tensor->flat<int>();\n      int * out=&(out_flat(0));\n      Tensor temp_tensor;\n      OP_REQUIRES_OK(context,context->allocate_temp(DataTypeToEnum<float>::value,TensorShape{32,n},&temp_tensor));\n      auto temp_flat=temp_tensor.flat<float>();\n      float * temp=&(temp_flat(0));\n      farthestpointsamplingLauncher(b,n,m,inp,temp,out);\n    }\n    private:\n        int npoint_;\n};\nREGISTER_KERNEL_BUILDER(Name(\"FarthestPointSample\").Device(DEVICE_GPU),FarthestPointSampleGpuOp);\n\nvoid gatherpointLauncher(int b,int n,int m,const float * inp,const int * idx,float * out);\nclass GatherPointGpuOp: public OpKernel{\n  public:\n    explicit GatherPointGpuOp(OpKernelConstruction * 
context):OpKernel(context){}\n    void Compute(OpKernelContext * context)override{\n      const Tensor& inp_tensor=context->input(0);\n      OP_REQUIRES(context,inp_tensor.dims()==3 && inp_tensor.shape().dim_size(2)==3,errors::InvalidArgument(\"GatherPoint expects (batch_size,num_points,3) inp shape\"));\n      int b=inp_tensor.shape().dim_size(0);\n      int n=inp_tensor.shape().dim_size(1);\n      const Tensor& idx_tensor=context->input(1);\n      OP_REQUIRES(context,idx_tensor.dims()==2 && idx_tensor.shape().dim_size(0)==b,errors::InvalidArgument(\"GatherPoint expects (batch_size,num_result) idx shape\"));\n      int m=idx_tensor.shape().dim_size(1);\n      auto inp_flat=inp_tensor.flat<float>();\n      const float * inp=&(inp_flat(0));\n      auto idx_flat=idx_tensor.flat<int>();\n      const int * idx=&(idx_flat(0));\n      Tensor * out_tensor=NULL;\n      OP_REQUIRES_OK(context,context->allocate_output(0,TensorShape{b,m,3},&out_tensor));\n      auto out_flat=out_tensor->flat<float>();\n      float * out=&(out_flat(0));\n      gatherpointLauncher(b,n,m,inp,idx,out);\n    }\n};\nREGISTER_KERNEL_BUILDER(Name(\"GatherPoint\").Device(DEVICE_GPU),GatherPointGpuOp);\n\nvoid scatteraddpointLauncher(int b,int n,int m,const float * out_g,const int * idx,float * inp_g);\nclass GatherPointGradGpuOp: public OpKernel{\n  public:\n    explicit GatherPointGradGpuOp(OpKernelConstruction * context):OpKernel(context){}\n    void Compute(OpKernelContext * context)override{\n      const Tensor& inp_tensor=context->input(0);\n      OP_REQUIRES(context,inp_tensor.dims()==3 && inp_tensor.shape().dim_size(2)==3,errors::InvalidArgument(\"GatherPointGradGpuOp expects (batch_size,num_points,3) inp\"));\n      int b=inp_tensor.shape().dim_size(0);\n      int n=inp_tensor.shape().dim_size(1);\n      const Tensor& idx_tensor=context->input(1);\n      OP_REQUIRES(context,idx_tensor.dims()==2 && idx_tensor.shape().dim_size(0)==b,errors::InvalidArgument(\"GatherPointGradGpuOp expects 
(batch_size,num_result) idx shape\"));\n      int m=idx_tensor.shape().dim_size(1);\n      auto inp_flat=inp_tensor.flat<float>();\n      const float * inp=&(inp_flat(0));\n      auto idx_flat=idx_tensor.flat<int>();\n      const int * idx=&(idx_flat(0));\n      const Tensor& out_g_tensor=context->input(2);\n      OP_REQUIRES(context,out_g_tensor.dims()==3 && out_g_tensor.shape().dim_size(0)==b && out_g_tensor.shape().dim_size(1)==m && out_g_tensor.shape().dim_size(2)==3,errors::InvalidArgument(\"GatherPointGradGpuOp expects (batch_size,num_result,3) out_g shape\"));\n      auto out_g_flat=out_g_tensor.flat<float>();\n      const float * out_g=&(out_g_flat(0));\n      Tensor * inp_g_tensor=NULL;\n      OP_REQUIRES_OK(context,context->allocate_output(0,TensorShape{b,n,3},&inp_g_tensor));\n      auto inp_g_flat=inp_g_tensor->flat<float>();\n      float * inp_g=&(inp_g_flat(0));\n      cudaMemset(inp_g,0,b*n*3*4);\n      scatteraddpointLauncher(b,n,m,out_g,idx,inp_g);\n    }\n};\nREGISTER_KERNEL_BUILDER(Name(\"GatherPointGrad\").Device(DEVICE_GPU),GatherPointGradGpuOp);\n\n"
  },
  {
    "path": "code/tf_ops/sampling/tf_sampling.py",
    "content": "''' Furthest point sampling\nOriginal author: Haoqiang Fan\nModified by Charles R. Qi\nAll Rights Reserved. 2017. \n'''\nimport tensorflow as tf\nfrom tensorflow.python.framework import ops\nimport sys\nimport os\nBASE_DIR = os.path.dirname(os.path.abspath(__file__))\nsys.path.append(BASE_DIR)\nsampling_module=tf.load_op_library(os.path.join(BASE_DIR, 'tf_sampling_so.so'))\ndef prob_sample(inp,inpr):\n    '''\ninput:\n    batch_size * ncategory float32\n    batch_size * npoints   float32\nreturns:\n    batch_size * npoints   int32\n    '''\n    return sampling_module.prob_sample(inp,inpr)\nops.NoGradient('ProbSample')\n# TF1.0 API requires set shape in C++\n#@tf.RegisterShape('ProbSample')\n#def _prob_sample_shape(op):\n#    shape1=op.inputs[0].get_shape().with_rank(2)\n#    shape2=op.inputs[1].get_shape().with_rank(2)\n#    return [tf.TensorShape([shape2.dims[0],shape2.dims[1]])]\ndef gather_point(inp,idx):\n    '''\ninput:\n    batch_size * ndataset * 3   float32\n    batch_size * npoints        int32\nreturns:\n    batch_size * npoints * 3    float32\n    '''\n    return sampling_module.gather_point(inp,idx)\n#@tf.RegisterShape('GatherPoint')\n#def _gather_point_shape(op):\n#    shape1=op.inputs[0].get_shape().with_rank(3)\n#    shape2=op.inputs[1].get_shape().with_rank(2)\n#    return [tf.TensorShape([shape1.dims[0],shape2.dims[1],shape1.dims[2]])]\n@tf.RegisterGradient('GatherPoint')\ndef _gather_point_grad(op,out_g):\n    inp=op.inputs[0]\n    idx=op.inputs[1]\n    return [sampling_module.gather_point_grad(inp,idx,out_g),None]\ndef farthest_point_sample(npoint,inp):\n    '''\ninput:\n    int32\n    batch_size * ndataset * 3   float32\nreturns:\n    batch_size * npoint         int32\n    '''\n    return sampling_module.farthest_point_sample(inp, npoint)\nops.NoGradient('FarthestPointSample')\n    \n\nif __name__=='__main__':\n    import numpy as np\n    np.random.seed(100)\n    triangles=np.random.rand(1,5,3,3).astype('float32')\n    with 
tf.device('/gpu:1'):\n        inp=tf.constant(triangles)\n        tria=inp[:,:,0,:]\n        trib=inp[:,:,1,:]\n        tric=inp[:,:,2,:]\n        areas=tf.sqrt(tf.reduce_sum(tf.cross(trib-tria,tric-tria)**2,2)+1e-9)\n        randomnumbers=tf.random_uniform((1,8192))\n        triids=prob_sample(areas,randomnumbers)\n        tria_sample=gather_point(tria,triids)\n        trib_sample=gather_point(trib,triids)\n        tric_sample=gather_point(tric,triids)\n        us=tf.random_uniform((1,8192))\n        vs=tf.random_uniform((1,8192))\n        uplusv=1-tf.abs(us+vs-1)\n        uminusv=us-vs\n        us=(uplusv+uminusv)*0.5\n        vs=(uplusv-uminusv)*0.5\n        pt_sample=tria_sample+(trib_sample-tria_sample)*tf.expand_dims(us,-1)+(tric_sample-tria_sample)*tf.expand_dims(vs,-1)\n        print 'pt_sample: ', pt_sample\n        reduced_sample=gather_point(pt_sample,farthest_point_sample(1024,pt_sample))\n        print reduced_sample\n    with tf.Session('') as sess:\n        ret=sess.run(reduced_sample)\n    print ret.shape,ret.dtype\n    import cPickle as pickle\n    pickle.dump(ret,open('1.pkl','wb'),-1)\n"
  },
  {
    "path": "code/tf_ops/sampling/tf_sampling_compile.sh",
    "content": "#/bin/bash\n/usr/local/cuda-8.0/bin/nvcc tf_sampling_g.cu -o tf_sampling_g.cu.o -c -O2 -DGOOGLE_CUDA=1 -x cu -Xcompiler -fPIC\ng++ -std=c++11 tf_sampling.cpp tf_sampling_g.cu.o -o tf_sampling_so.so -shared -fPIC -I /home/ganeshiyer/tensorflow/lib/python2.7/site-packages/tensorflow/include  -I /usr/local/cuda-8.0/include -lcudart -L /usr/local/cuda-8.0/lib64/ -O2 -D_GLIBCXX_USE_CXX11_ABI=0\n"
  },
  {
    "path": "code/tf_ops/sampling/tf_sampling_g.cu",
    "content": "/* Furthest point sampling GPU implementation\n * Original author: Haoqiang Fan\n * Modified by Charles R. Qi\n * All Rights Reserved. 2017. \n */\n\n__global__ void cumsumKernel(int b,int n,const float * __restrict__ inp,float * __restrict__ out){\n  const int BlockSize=2048;\n  const int paddingLevel=5;\n  __shared__ float buffer4[BlockSize*4];\n  __shared__ float buffer[BlockSize+(BlockSize>>paddingLevel)];\n  for (int i=blockIdx.x;i<b;i+=gridDim.x){\n    float runningsum=0,runningsum2=0;\n    for (int j=0;j<n;j+=BlockSize*4){\n      int n24_i=min(n-j,BlockSize*4);\n      int n24=(n24_i+3)&~3;\n      int n2=n24>>2;\n      for (int k=threadIdx.x*4;k<n24_i;k+=blockDim.x*4){\n        if (k+3<n24_i){\n          float v1=inp[i*n+j+k];\n          float v2=inp[i*n+j+k+1];\n          v2+=v1;\n          float v3=inp[i*n+j+k+2];\n          float v4=inp[i*n+j+k+3];\n          v4+=v3;\n          v3+=v2;\n          v4+=v2;\n          buffer4[k]=v1;\n          buffer4[k+1]=v2;\n          buffer4[k+2]=v3;\n          buffer4[k+3]=v4;\n          buffer[(k>>2)+(k>>(2+paddingLevel))]=v4;\n        }else{\n          float v=0;\n          for (int k2=k;k2<n24_i;k2++){\n            v+=inp[i*n+j+k2];\n            buffer4[k2]=v;\n          }\n          for (int k2=n24_i;k2<n24;k2++){\n            buffer4[k2]=v;\n          }\n          buffer[(k>>2)+(k>>(2+paddingLevel))]=v;\n        }\n      }\n      int u=0;\n      for (;(2<<u)<=n2;u++){\n        __syncthreads();\n        for (int k=threadIdx.x;k<int(n2>>(u+1));k+=blockDim.x){\n          int i1=(((k<<1)+2)<<u)-1;\n          int i2=(((k<<1)+1)<<u)-1;\n          i1+=i1>>paddingLevel;\n          i2+=i2>>paddingLevel;\n          buffer[i1]+=buffer[i2];\n        }\n      }\n      u--;\n      for (;u>=0;u--){\n        __syncthreads();\n        for (int k=threadIdx.x;k<int((n2-(1<<u))>>(u+1));k+=blockDim.x){\n          int i1=(((k<<1)+3)<<u)-1;\n          int i2=(((k<<1)+2)<<u)-1;\n          i1+=i1>>paddingLevel;\n          
i2+=i2>>paddingLevel;\n          buffer[i1]+=buffer[i2];\n        }\n      }\n      __syncthreads();\n      for (int k=threadIdx.x*4;k<n24;k+=blockDim.x*4){\n        if (k!=0){\n          int k2=((k>>2)-1)+(((k>>2)-1)>>paddingLevel);\n          buffer4[k]+=buffer[k2];\n          buffer4[k+1]+=buffer[k2];\n          buffer4[k+2]+=buffer[k2];\n          buffer4[k+3]+=buffer[k2];\n        }\n      }\n      __syncthreads();\n      for (int k=threadIdx.x;k<n24_i;k+=blockDim.x){\n        out[i*n+j+k]=buffer4[k]+runningsum;\n      }\n      float t=buffer[(n2-1)+((n2-1)>>paddingLevel)]+runningsum2;\n      float r2=runningsum+t;\n      runningsum2=t-(r2-runningsum);\n      runningsum=r2;\n      __syncthreads();\n    }\n  }\n}\n\n__global__ void binarysearchKernel(int b,int n,int m,const float * __restrict__ dataset,const float * __restrict__ query, int * __restrict__ result){\n  int base=1;\n  while (base<n)\n    base<<=1;\n  for (int i=blockIdx.x;i<b;i+=gridDim.x){\n    for (int j=blockIdx.y*blockDim.x+threadIdx.x;j<m;j+=blockDim.x*gridDim.y){\n      float q=query[i*m+j]*dataset[i*n+n-1];\n      int r=n-1;\n      for (int k=base;k>=1;k>>=1)\n        if (r>=k && dataset[i*n+r-k]>=q)\n          r-=k;\n      result[i*m+j]=r;\n    }\n  }\n}\n__global__ void farthestpointsamplingKernel(int b,int n,int m,const float * __restrict__ dataset,float * __restrict__ temp,int * __restrict__ idxs){\n  if (m<=0)\n    return;\n  const int BlockSize=512;\n  __shared__ float dists[BlockSize];\n  __shared__ int dists_i[BlockSize];\n  const int BufferSize=3072;\n  __shared__ float buf[BufferSize*3];\n  for (int i=blockIdx.x;i<b;i+=gridDim.x){\n    int old=0;\n    if (threadIdx.x==0)\n      idxs[i*m+0]=old;\n    for (int j=threadIdx.x;j<n;j+=blockDim.x){\n      temp[blockIdx.x*n+j]=1e38;\n    }\n    for (int j=threadIdx.x;j<min(BufferSize,n)*3;j+=blockDim.x){\n      buf[j]=dataset[i*n*3+j];\n    }\n    __syncthreads();\n    for (int j=1;j<m;j++){\n      int besti=0;\n      float best=-1;\n      
float x1=dataset[i*n*3+old*3+0];\n      float y1=dataset[i*n*3+old*3+1];\n      float z1=dataset[i*n*3+old*3+2];\n      for (int k=threadIdx.x;k<n;k+=blockDim.x){\n        float td=temp[blockIdx.x*n+k];\n        float x2,y2,z2;\n        if (k<BufferSize){\n          x2=buf[k*3+0];\n          y2=buf[k*3+1];\n          z2=buf[k*3+2];\n        }else{\n          x2=dataset[i*n*3+k*3+0];\n          y2=dataset[i*n*3+k*3+1];\n          z2=dataset[i*n*3+k*3+2];\n        }\n        float d=(x2-x1)*(x2-x1)+(y2-y1)*(y2-y1)+(z2-z1)*(z2-z1);\n        float d2=min(d,td);\n        if (d2!=td)\n          temp[blockIdx.x*n+k]=d2;\n        if (d2>best){\n          best=d2;\n          besti=k;\n        }\n      }\n      dists[threadIdx.x]=best;\n      dists_i[threadIdx.x]=besti;\n      for (int u=0;(1<<u)<blockDim.x;u++){\n        __syncthreads();\n        if (threadIdx.x<(blockDim.x>>(u+1))){\n          int i1=(threadIdx.x*2)<<u;\n          int i2=(threadIdx.x*2+1)<<u;\n          if (dists[i1]<dists[i2]){\n            dists[i1]=dists[i2];\n            dists_i[i1]=dists_i[i2];\n          }\n        }\n      }\n      __syncthreads();\n      old=dists_i[0];\n      if (threadIdx.x==0)\n        idxs[i*m+j]=old;\n    }\n  }\n}\n\n__global__ void gatherpointKernel(int b,int n,int m,const float * __restrict__ inp,const int * __restrict__ idx,float * __restrict__ out){\n  for (int i=blockIdx.x;i<b;i+=gridDim.x){\n    for (int j=blockIdx.y*blockDim.x+threadIdx.x;j<m;j+=blockDim.x*gridDim.y){\n      int a=idx[i*m+j];\n      out[(i*m+j)*3+0]=inp[(i*n+a)*3+0];\n      out[(i*m+j)*3+1]=inp[(i*n+a)*3+1];\n      out[(i*m+j)*3+2]=inp[(i*n+a)*3+2];\n    }\n  }\n}\n\n__global__ void scatteraddpointKernel(int b,int n,int m,const float * __restrict__ out_g,const int * __restrict__ idx,float * __restrict__ inp_g){\n  for (int i=blockIdx.x;i<b;i+=gridDim.x){\n    for (int j=blockIdx.y*blockDim.x+threadIdx.x;j<m;j+=blockDim.x*gridDim.y){\n      int a=idx[i*m+j];\n      
atomicAdd(&inp_g[(i*n+a)*3+0],out_g[(i*m+j)*3+0]);\n      atomicAdd(&inp_g[(i*n+a)*3+1],out_g[(i*m+j)*3+1]);\n      atomicAdd(&inp_g[(i*n+a)*3+2],out_g[(i*m+j)*3+2]);\n    }\n  }\n}\n\nvoid cumsumLauncher(int b,int n,const float * inp,float * out){\n  cumsumKernel<<<32,512>>>(b,n,inp,out);\n}\n//require b*n working space\nvoid probsampleLauncher(int b,int n,int m,const float * inp_p,const float * inp_r,float * temp,int * out){\n  cumsumKernel<<<32,512>>>(b,n,inp_p,temp);\n  binarysearchKernel<<<dim3(32,8,1),512>>>(b,n,m,temp,inp_r,out);\n}\n//require 32*n working space\nvoid farthestpointsamplingLauncher(int b,int n,int m,const float * inp,float * temp,int * out){\n  farthestpointsamplingKernel<<<32,512>>>(b,n,m,inp,temp,out);\n}\nvoid gatherpointLauncher(int b,int n,int m,const float * inp,const int * idx,float * out){\n  gatherpointKernel<<<dim3(2,8,1),512>>>(b,n,m,inp,idx,out);\n}\nvoid scatteraddpointLauncher(int b,int n,int m,const float * out_g,const int * idx,float * inp_g){\n  scatteraddpointKernel<<<dim3(2,8,1),512>>>(b,n,m,out_g,idx,inp_g);\n}\n\n"
  },
  {
    "path": "code/train_model_combined.py",
    "content": "import numpy as np\nimport tensorflow as tf\nimport scipy.misc as smc\n\nimport config_res as config\n\nfrom common.cnn_utils_res import *\nfrom common import resnet_rgb_model as model\nfrom common import resnet_depth_model as model_depth\nfrom common import all_transformer as at3\nfrom common import global_agg_net\nfrom common.Lie_functions import exponential_map_single\n\nimport nw_loader_color as ldr\nimport model_utils\n\n\n_BETA_CONST = 1.0\n_ALPHA_CONST = 1.0\nIMG_HT = config.depth_img_params['IMG_HT']\nIMG_WDT = config.depth_img_params['IMG_WDT']\nbatch_size = config.net_params['batch_size']\nlearning_rate = config.net_params['learning_rate']\nn_epochs = config.net_params['epochs']\ncurrent_epoch = config.net_params['load_epoch']\n\ntf.reset_default_graph()\n\nX1 = tf.placeholder(tf.float32, shape = (batch_size, IMG_HT, IMG_WDT, 3), name = \"X1\")\nX2 = tf.placeholder(tf.float32, shape = (batch_size, IMG_HT, IMG_WDT, 1), name = \"X2\")\ndepth_maps_target = tf.placeholder(tf.float32, shape = (batch_size, IMG_HT, IMG_WDT, 1), name = \"depth_maps_target\")\nexpected_transforms = tf.placeholder(tf.float32, shape = (batch_size, 4, 4), name = \"expected_transforms\")\n\nphase = tf.placeholder(tf.bool, [], name = \"phase\")\nphase_rgb = tf.placeholder(tf.bool, [], name = \"phase_rgb\")\nkeep_prob = tf.placeholder(tf.float32, name = \"keep_prob\")\n\nfx = config.camera_params['fx']\nfy = config.camera_params['fy']\ncx = config.camera_params['cx']\ncy = config.camera_params['cy']\n\nfx_scaled = 2*(fx)/np.float32(IMG_WDT)              # focal length x scaled for -1 to 1 range\nfy_scaled = 2*(fy)/np.float32(IMG_HT)               # focal length y scaled for -1 to 1 range\ncx_scaled = -1 + 2*(cx - 1.0)/np.float32(IMG_WDT)   # optical center x scaled for -1 to 1 range\ncy_scaled = -1 + 2*(cy - 1.0)/np.float32(IMG_HT)    # optical center y scaled for -1 to 1 range\n\nK_mat_scaled = np.array([[fx_scaled,  0.0, cx_scaled],\n                         [0.0, 
fy_scaled,  cy_scaled],\n                         [0.0, 0.0, 1.0]], dtype = np.float32)\n\nK_final = tf.constant(K_mat_scaled, dtype = tf.float32)\nsmall_transform = tf.constant(config.camera_params['cam_transform_02_inv'], dtype = tf.float32)\n\n\nX2_pooled = tf.nn.max_pool(X2, ksize=[1,5,5,1], strides=[1,1,1,1], padding=\"SAME\")\ndepth_maps_target_pooled = tf.nn.max_pool(depth_maps_target, ksize=[1,5,5,1], strides=[1,1,1,1], padding=\"SAME\")\n\noutput_vectors, weight_summaries = global_agg_net.End_Net_Out(X1, phase_rgb, X2_pooled, phase, keep_prob)\n\n# se(3) -> SE(3) for the whole batch\npredicted_transforms = tf.map_fn(lambda x:exponential_map_single(output_vectors[x]), elems=tf.range(0, batch_size, 1), dtype=tf.float32)\n\n# transforms depth maps by the predicted transformation\ndepth_maps_predicted, cloud_pred = tf.map_fn(lambda x:at3._simple_transformer(X2_pooled[x,:,:,0]*40.0 + 40.0, predicted_transforms[x], K_final, small_transform), elems = tf.range(0, batch_size, 1), dtype = (tf.float32, tf.float32))\n\n# transforms depth maps by the expected transformation\ndepth_maps_expected, cloud_exp = tf.map_fn(lambda x:at3._simple_transformer(X2_pooled[x,:,:,0]*40.0 + 40.0, expected_transforms[x], K_final, small_transform), elems = tf.range(0, batch_size, 1), dtype = (tf.float32, tf.float32))\n\n# photometric loss between predicted and expected transformation\nphotometric_loss = tf.nn.l2_loss(tf.subtract((depth_maps_expected[:,10:-10,10:-10] - 40.0)/40.0, (depth_maps_predicted[:,10:-10,10:-10] - 40.0)/40.0))\n\n# earth mover's distance between point clouds\ncloud_loss = model_utils.get_emd_loss(cloud_pred, cloud_exp)\n\n# final loss term\npredicted_loss_train = _ALPHA_CONST*photometric_loss + _BETA_CONST*cloud_loss\n\ntf.add_to_collection('losses1', predicted_loss_train)\nloss1 = tf.add_n(tf.get_collection('losses1'))\n\ntrain_step = tf.train.AdamOptimizer(learning_rate = config.net_params['learning_rate'],\n                                    beta1 = 
config.net_params['beta1']).minimize(predicted_loss_train)\n\npredicted_loss_validation = tf.nn.l2_loss(tf.subtract((depth_maps_expected[:,10:-10,10:-10] - 40.0)/40.0, (depth_maps_predicted[:,10:-10,10:-10] - 40.0)/40.0))\n\ncloud_loss_validation = model_utils.get_emd_loss(cloud_pred, cloud_exp)\n\ntraining_summary_1 = tf.summary.scalar('cloud_loss', _BETA_CONST*cloud_loss)\ntraining_summary_2 = tf.summary.scalar('photometric_loss', photometric_loss)\nvalidation_summary_1 = tf.summary.scalar('Validation_loss', predicted_loss_validation)\nvalidation_summary_2 = tf.summary.scalar('Validation_cloud_loss', cloud_loss_validation)\n\nmerge_train = tf.summary.merge([training_summary_1] + [training_summary_2] + weight_summaries)\nmerge_val = tf.summary.merge([validation_summary_1] + [validation_summary_2])\n\nsaver = tf.train.Saver()\n\n# tensorflow gpu configuration. Not to be confused with network configuration file\n\nconfig_tf = tf.ConfigProto()\nconfig_tf.gpu_options.allow_growth=True\n\nwith tf.Session(config = config_tf) as sess:\n    sess.run(tf.global_variables_initializer())\n\n    writer = tf.summary.FileWriter(\"./logs_simple_transformer/\")\n\n    total_iterations_train = 0\n    total_iterations_validate = 0\n\n    if(current_epoch == 0):\n        writer.add_graph(sess.graph)\n\n    checkpoint_path = config.paths['checkpoint_path']\n\n    if(current_epoch > 0):\n        print(\"Restoring Checkpoint\")\n\n        saver.restore(sess, checkpoint_path + \"/model-%d\"%current_epoch)\n        current_epoch+=1\n        total_iterations_train = current_epoch*config.net_params['total_frames_train']/batch_size\n        total_iterations_validate = current_epoch*config.net_params['total_frames_validation']/batch_size\n\n    for epoch in range(current_epoch, n_epochs):\n        total_partitions_train = config.net_params['total_frames_train']/config.net_params['partition_limit']\n        total_partitions_validation = 
config.net_params['total_frames_validation']/config.net_params['partition_limit']\n        ldr.shuffle()\n\n        for part in range(total_partitions_train):\n            source_container, target_container, source_img_container, target_img_container, transforms_container = ldr.load(part, mode = \"train\")\n\n            for source_b, target_b, source_img_b, target_img_b, transforms_b in zip(source_container, target_container, source_img_container, target_img_container, transforms_container):\n\n                outputs= sess.run([depth_maps_predicted, depth_maps_expected, predicted_loss_train, X2_pooled, train_step, merge_train, predicted_transforms, cloud_loss, photometric_loss, loss1], feed_dict={X1: source_img_b, X2: source_b, depth_maps_target: target_b, expected_transforms: transforms_b ,phase:True, keep_prob:0.5, phase_rgb: False})\n\n                dmaps_pred = outputs[0]\n                dmaps_exp = outputs[1]\n                loss = outputs[2]\n                source = outputs[3]\n\n                if(total_iterations_train%10 == 0):\n                    writer.add_summary(outputs[5], total_iterations_train/10)\n\n                print(outputs[8], _ALPHA_CONST*outputs[8], outputs[7], _BETA_CONST*outputs[7], outputs[9],total_iterations_train)\n\n                random_disp = np.random.randint(batch_size)\n                print(outputs[6][random_disp])\n                print(transforms_b[random_disp])\n\n                if(total_iterations_train%125 == 0):\n\n                    smc.imsave(config.paths['training_imgs_path'] + \"/training_save_%d.png\"%total_iterations_train, np.vstack((source[random_disp,:,:,0]*40.0 + 40.0, dmaps_pred[random_disp], dmaps_exp[random_disp])))\n\n                total_iterations_train+=1\n\n        if (epoch%1 == 0):\n            print(\"Saving after epoch %d\"%epoch)\n            saver.save(sess, checkpoint_path + \"/model-%d\"%epoch)\n\n        for part in range(total_partitions_validation):\n            source_container, 
target_container, source_img_container, target_img_container, transforms_container = ldr.load(part, mode = \"validation\")\n\n            for source_b, target_b, source_img_b, target_img_b, transforms_b in zip(source_container, target_container, source_img_container, target_img_container, transforms_container):\n\n                outputs= sess.run([depth_maps_predicted, depth_maps_expected, predicted_loss_validation, X2_pooled, merge_val, cloud_loss_validation], feed_dict={X1: source_img_b, X2: source_b, depth_maps_target: target_b, expected_transforms: transforms_b ,phase:False, keep_prob:1.0, phase_rgb: False})\n\n                dmaps_pred = outputs[0]\n                dmaps_exp = outputs[1]\n                loss = outputs[2]\n                source = outputs[3]\n\n                writer.add_summary(outputs[4], total_iterations_validate)\n                total_iterations_validate+=1\n\n                print(loss, total_iterations_validate, outputs[5])\n\n                if(total_iterations_validate%25 == 0):\n\n                    random_disp = np.random.randint(batch_size)\n\n                    smc.imsave(config.paths['validation_imgs_path'] + \"/validation_save_%d.png\"%total_iterations_validate, np.vstack((source[random_disp,:,:,0]*40.0 + 40.0, dmaps_pred[random_disp], dmaps_exp[random_disp])))\n"
  },
  {
    "path": "code/utils/__init__.py",
    "content": ""
  },
  {
    "path": "code/utils/data_prep_util.py",
    "content": "import os\nimport sys\nBASE_DIR = os.path.dirname(os.path.abspath(__file__))\nsys.path.append(BASE_DIR)\nfrom plyfile import (PlyData, PlyElement, make2d, PlyParseError, PlyProperty)\nimport numpy as np\nimport h5py\n\nSAMPLING_BIN = os.path.join(BASE_DIR, 'third_party/mesh_sampling/build/pcsample')\n\nSAMPLING_POINT_NUM = 2048\nSAMPLING_LEAF_SIZE = 0.005\n\nMODELNET40_PATH = '../datasets/modelnet40'\ndef export_ply(pc, filename):\n\tvertex = np.zeros(pc.shape[0], dtype=[('x', 'f4'), ('y', 'f4'), ('z', 'f4')])\n\tfor i in range(pc.shape[0]):\n\t\tvertex[i] = (pc[i][0], pc[i][1], pc[i][2])\n\tply_out = PlyData([PlyElement.describe(vertex, 'vertex', comments=['vertices'])])\n\tply_out.write(filename)\n\n# Sample points on the obj shape\ndef get_sampling_command(obj_filename, ply_filename):\n    cmd = SAMPLING_BIN + ' ' + obj_filename\n    cmd += ' ' + ply_filename\n    cmd += ' -n_samples %d ' % SAMPLING_POINT_NUM\n    cmd += ' -leaf_size %f ' % SAMPLING_LEAF_SIZE\n    return cmd\n\n# --------------------------------------------------------------\n# Following are the helper functions to load MODELNET40 shapes\n# --------------------------------------------------------------\n\n# Read in the list of categories in MODELNET40\ndef get_category_names():\n    shape_names_file = os.path.join(MODELNET40_PATH, 'shape_names.txt')\n    shape_names = [line.rstrip() for line in open(shape_names_file)]\n    return shape_names\n\n# Return all the filepaths for the shapes in MODELNET40 \ndef get_obj_filenames():\n    obj_filelist_file = os.path.join(MODELNET40_PATH, 'filelist.txt')\n    obj_filenames = [os.path.join(MODELNET40_PATH, line.rstrip()) for line in open(obj_filelist_file)]\n    print('Got %d obj files in modelnet40.' 
% len(obj_filenames))\n    return obj_filenames\n\n# Helper function to create the parent folder and all subdir folders if they do not exist\ndef batch_mkdir(output_folder, subdir_list):\n    if not os.path.exists(output_folder):\n        os.mkdir(output_folder)\n    for subdir in subdir_list:\n        if not os.path.exists(os.path.join(output_folder, subdir)):\n            os.mkdir(os.path.join(output_folder, subdir))\n\n# ----------------------------------------------------------------\n# Following are the helper functions to save/load HDF5 files\n# ----------------------------------------------------------------\n\n# Write numpy array data and label to h5_filename\ndef save_h5_data_label_normal(h5_filename, data, label, normal, \n\t\tdata_dtype='float32', label_dtype='uint8', normal_dtype='float32'):\n    h5_fout = h5py.File(h5_filename)\n    h5_fout.create_dataset(\n            'data', data=data,\n            compression='gzip', compression_opts=4,\n            dtype=data_dtype)\n    h5_fout.create_dataset(\n            'normal', data=normal,\n            compression='gzip', compression_opts=4,\n            dtype=normal_dtype)\n    h5_fout.create_dataset(\n            'label', data=label,\n            compression='gzip', compression_opts=1,\n            dtype=label_dtype)\n    h5_fout.close()\n\n\n# Write numpy array data and label to h5_filename\ndef save_h5(h5_filename, data, label, data_dtype='uint8', label_dtype='uint8'):\n    h5_fout = h5py.File(h5_filename)\n    h5_fout.create_dataset(\n            'data', data=data,\n            compression='gzip', compression_opts=4,\n            dtype=data_dtype)\n    h5_fout.create_dataset(\n            'label', data=label,\n            compression='gzip', compression_opts=1,\n            dtype=label_dtype)\n    h5_fout.close()\n\n# Read numpy array data and label from h5_filename\ndef load_h5_data_label_normal(h5_filename):\n    f = h5py.File(h5_filename)\n    data = f['data'][:]\n    label = f['label'][:]\n    normal = 
f['normal'][:]\n    return (data, label, normal)\n\n# Read numpy array data and label from h5_filename\ndef load_h5_data_label_seg(h5_filename):\n    f = h5py.File(h5_filename)\n    data = f['data'][:]\n    label = f['label'][:]\n    seg = f['pid'][:]\n    return (data, label, seg)\n\n# Read numpy array data and label from h5_filename\ndef load_h5(h5_filename):\n    f = h5py.File(h5_filename)\n    data = f['data'][:]\n    label = f['label'][:]\n    return (data, label)\n\n# ----------------------------------------------------------------\n# Following are the helper functions to save/load PLY files\n# ----------------------------------------------------------------\n\n# Load PLY file\ndef load_ply_data(filename, point_num):\n    plydata = PlyData.read(filename)\n    pc = plydata['vertex'].data[:point_num]\n    pc_array = np.array([[x, y, z] for x,y,z in pc])\n    return pc_array\n\n# Load PLY file\ndef load_ply_normal(filename, point_num):\n    plydata = PlyData.read(filename)\n    pc = plydata['normal'].data[:point_num]\n    pc_array = np.array([[x, y, z] for x,y,z in pc])\n    return pc_array\n\n# Pad an Nxk array up to `row` rows\n# `pad` is 'edge' or 'constant'\ndef pad_arr_rows(arr, row, pad='edge'):\n    assert(len(arr.shape) == 2)\n    assert(arr.shape[0] <= row)\n    assert(pad == 'edge' or pad == 'constant')\n    if arr.shape[0] == row:\n        return arr\n    if pad == 'edge':\n        return np.lib.pad(arr, ((0, row-arr.shape[0]), (0, 0)), 'edge')\n    if pad == 'constant':\n        return np.lib.pad(arr, ((0, row-arr.shape[0]), (0, 0)), 'constant', constant_values=(0, 0))\n\n\n"
  },
  {
    "path": "code/utils/eulerangles.py",
    "content": "# emacs: -*- mode: python-mode; py-indent-offset: 4; indent-tabs-mode: nil -*-\n# vi: set ft=python sts=4 ts=4 sw=4 et:\n### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ##\n#\n#   See COPYING file distributed along with the NiBabel package for the\n#   copyright and license terms.\n#\n### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ##\n''' Module implementing Euler angle rotations and their conversions\n\nSee:\n\n* http://en.wikipedia.org/wiki/Rotation_matrix\n* http://en.wikipedia.org/wiki/Euler_angles\n* http://mathworld.wolfram.com/EulerAngles.html\n\nSee also: *Representing Attitude with Euler Angles and Quaternions: A\nReference* (2006) by James Diebel. A cached PDF link last found here:\n\nhttp://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.110.5134\n\nEuler's rotation theorem tells us that any rotation in 3D can be\ndescribed by 3 angles.  Let's call the 3 angles the *Euler angle vector*\nand call the angles in the vector :math:`alpha`, :math:`beta` and\n:math:`gamma`.  The vector is [ :math:`alpha`,\n:math:`beta`. :math:`gamma` ] and, in this description, the order of the\nparameters specifies the order in which the rotations occur (so the\nrotation corresponding to :math:`alpha` is applied first).\n\nIn order to specify the meaning of an *Euler angle vector* we need to\nspecify the axes around which each of the rotations corresponding to\n:math:`alpha`, :math:`beta` and :math:`gamma` will occur.\n\nThere are therefore three axes for the rotations :math:`alpha`,\n:math:`beta` and :math:`gamma`; let's call them :math:`i` :math:`j`,\n:math:`k`.\n\nLet us express the rotation :math:`alpha` around axis `i` as a 3 by 3\nrotation matrix `A`.  Similarly :math:`beta` around `j` becomes 3 x 3\nmatrix `B` and :math:`gamma` around `k` becomes matrix `G`.  Then the\nwhole rotation expressed by the Euler angle vector [ :math:`alpha`,\n:math:`beta`. 
:math:`gamma` ], `R` is given by::\n\n   R = np.dot(G, np.dot(B, A))\n\nSee http://mathworld.wolfram.com/EulerAngles.html\n\nThe order :math:`G B A` expresses the fact that the rotations are\nperformed in the order of the vector (:math:`alpha` around axis `i` =\n`A` first).\n\nTo convert a given Euler angle vector to a meaningful rotation, and a\nrotation matrix, we need to define:\n\n* the axes `i`, `j`, `k`\n* whether a rotation matrix should be applied on the left of a vector to\n  be transformed (vectors are column vectors) or on the right (vectors\n  are row vectors).\n* whether the rotations move the axes as they are applied (intrinsic\n  rotations) - compared the situation where the axes stay fixed and the\n  vectors move within the axis frame (extrinsic)\n* the handedness of the coordinate system\n\nSee: http://en.wikipedia.org/wiki/Rotation_matrix#Ambiguities\n\nWe are using the following conventions:\n\n* axes `i`, `j`, `k` are the `z`, `y`, and `x` axes respectively.  Thus\n  an Euler angle vector [ :math:`alpha`, :math:`beta`. 
:math:`gamma` ]\n  in our convention implies a :math:`alpha` radian rotation around the\n  `z` axis, followed by a :math:`beta` rotation around the `y` axis,\n  followed by a :math:`gamma` rotation around the `x` axis.\n* the rotation matrix applies on the left, to column vectors on the\n  right, so if `R` is the rotation matrix, and `v` is a 3 x N matrix\n  with N column vectors, the transformed vector set `vdash` is given by\n  ``vdash = np.dot(R, v)``.\n* extrinsic rotations - the axes are fixed, and do not move with the\n  rotations.\n* a right-handed coordinate system\n\nThe convention of rotation around ``z``, followed by rotation around\n``y``, followed by rotation around ``x``, is known (confusingly) as\n\"xyz\", pitch-roll-yaw, Cardan angles, or Tait-Bryan angles.\n'''\n\nimport math\n\nimport sys\nif sys.version_info >= (3,0):\n    from functools import reduce\n\nimport numpy as np\n\n\n_FLOAT_EPS_4 = np.finfo(float).eps * 4.0\n\n\ndef euler2mat(z=0, y=0, x=0):\n    ''' Return matrix for rotations around z, y and x axes\n\n    Uses the z, then y, then x convention above\n\n    Parameters\n    ----------\n    z : scalar\n       Rotation angle in radians around z-axis (performed first)\n    y : scalar\n       Rotation angle in radians around y-axis\n    x : scalar\n       Rotation angle in radians around x-axis (performed last)\n\n    Returns\n    -------\n    M : array shape (3,3)\n       Rotation matrix giving same rotation as for given angles\n\n    Examples\n    --------\n    >>> zrot = 1.3 # radians\n    >>> yrot = -0.1\n    >>> xrot = 0.2\n    >>> M = euler2mat(zrot, yrot, xrot)\n    >>> M.shape == (3, 3)\n    True\n\n    The output rotation matrix is equal to the composition of the\n    individual rotations\n\n    >>> M1 = euler2mat(zrot)\n    >>> M2 = euler2mat(0, yrot)\n    >>> M3 = euler2mat(0, 0, xrot)\n    >>> composed_M = np.dot(M3, np.dot(M2, M1))\n    >>> np.allclose(M, composed_M)\n    True\n\n    You can specify rotations by named 
arguments\n\n    >>> np.all(M3 == euler2mat(x=xrot))\n    True\n\n    When applying M to a vector, the vector should column vector to the\n    right of M.  If the right hand side is a 2D array rather than a\n    vector, then each column of the 2D array represents a vector.\n\n    >>> vec = np.array([1, 0, 0]).reshape((3,1))\n    >>> v2 = np.dot(M, vec)\n    >>> vecs = np.array([[1, 0, 0],[0, 1, 0]]).T # giving 3x2 array\n    >>> vecs2 = np.dot(M, vecs)\n\n    Rotations are counter-clockwise.\n\n    >>> zred = np.dot(euler2mat(z=np.pi/2), np.eye(3))\n    >>> np.allclose(zred, [[0, -1, 0],[1, 0, 0], [0, 0, 1]])\n    True\n    >>> yred = np.dot(euler2mat(y=np.pi/2), np.eye(3))\n    >>> np.allclose(yred, [[0, 0, 1],[0, 1, 0], [-1, 0, 0]])\n    True\n    >>> xred = np.dot(euler2mat(x=np.pi/2), np.eye(3))\n    >>> np.allclose(xred, [[1, 0, 0],[0, 0, -1], [0, 1, 0]])\n    True\n\n    Notes\n    -----\n    The direction of rotation is given by the right-hand rule (orient\n    the thumb of the right hand along the axis around which the rotation\n    occurs, with the end of the thumb at the positive end of the axis;\n    curl your fingers; the direction your fingers curl is the direction\n    of rotation).  
Therefore, the rotations are counterclockwise if\n    looking along the axis of rotation from positive to negative.\n    '''\n    Ms = []\n    if z:\n        cosz = math.cos(z)\n        sinz = math.sin(z)\n        Ms.append(np.array(\n                [[cosz, -sinz, 0],\n                 [sinz, cosz, 0],\n                 [0, 0, 1]]))\n    if y:\n        cosy = math.cos(y)\n        siny = math.sin(y)\n        Ms.append(np.array(\n                [[cosy, 0, siny],\n                 [0, 1, 0],\n                 [-siny, 0, cosy]]))\n    if x:\n        cosx = math.cos(x)\n        sinx = math.sin(x)\n        Ms.append(np.array(\n                [[1, 0, 0],\n                 [0, cosx, -sinx],\n                 [0, sinx, cosx]]))\n    if Ms:\n        return reduce(np.dot, Ms[::-1])\n    return np.eye(3)\n\n\ndef mat2euler(M, cy_thresh=None):\n    ''' Discover Euler angle vector from 3x3 matrix\n\n    Uses the conventions above.\n\n    Parameters\n    ----------\n    M : array-like, shape (3,3)\n    cy_thresh : None or scalar, optional\n       threshold below which to give up on straightforward arctan for\n       estimating x rotation.  
If None (default), estimate from\n       precision of input.\n\n    Returns\n    -------\n    z : scalar\n    y : scalar\n    x : scalar\n       Rotations in radians around z, y, x axes, respectively\n\n    Notes\n    -----\n    If there was no numerical error, the routine could be derived using\n    Sympy expression for z then y then x rotation matrix, which is::\n\n      [                       cos(y)*cos(z),                       -cos(y)*sin(z),         sin(y)],\n      [cos(x)*sin(z) + cos(z)*sin(x)*sin(y), cos(x)*cos(z) - sin(x)*sin(y)*sin(z), -cos(y)*sin(x)],\n      [sin(x)*sin(z) - cos(x)*cos(z)*sin(y), cos(z)*sin(x) + cos(x)*sin(y)*sin(z),  cos(x)*cos(y)]\n\n    with the obvious derivations for z, y, and x\n\n       z = atan2(-r12, r11)\n       y = asin(r13)\n       x = atan2(-r23, r33)\n\n    Problems arise when cos(y) is close to zero, because both of::\n\n       z = atan2(cos(y)*sin(z), cos(y)*cos(z))\n       x = atan2(cos(y)*sin(x), cos(x)*cos(y))\n\n    will be close to atan2(0, 0), and highly unstable.\n\n    The ``cy`` fix for numerical instability below is from: *Graphics\n    Gems IV*, Paul Heckbert (editor), Academic Press, 1994, ISBN:\n    0123361559.  
Specifically it comes from EulerAngles.c by Ken\n    Shoemake, and deals with the case where cos(y) is close to zero:\n\n    See: http://www.graphicsgems.org/\n\n    The code appears to be licensed (from the website) as \"can be used\n    without restrictions\".\n    '''\n    M = np.asarray(M)\n    if cy_thresh is None:\n        try:\n            cy_thresh = np.finfo(M.dtype).eps * 4\n        except ValueError:\n            cy_thresh = _FLOAT_EPS_4\n    r11, r12, r13, r21, r22, r23, r31, r32, r33 = M.flat\n    # cy: sqrt((cos(y)*cos(z))**2 + (cos(x)*cos(y))**2)\n    cy = math.sqrt(r33*r33 + r23*r23)\n    if cy > cy_thresh: # cos(y) not close to zero, standard form\n        z = math.atan2(-r12,  r11) # atan2(cos(y)*sin(z), cos(y)*cos(z))\n        y = math.atan2(r13,  cy) # atan2(sin(y), cy)\n        x = math.atan2(-r23, r33) # atan2(cos(y)*sin(x), cos(x)*cos(y))\n    else: # cos(y) (close to) zero, so x -> 0.0 (see above)\n        # so r21 -> sin(z), r22 -> cos(z) and\n        z = math.atan2(r21,  r22)\n        y = math.atan2(r13,  cy) # atan2(sin(y), cy)\n        x = 0.0\n    return z, y, x\n\n\ndef euler2quat(z=0, y=0, x=0):\n    ''' Return quaternion corresponding to these Euler angles\n\n    Uses the z, then y, then x convention above\n\n    Parameters\n    ----------\n    z : scalar\n       Rotation angle in radians around z-axis (performed first)\n    y : scalar\n       Rotation angle in radians around y-axis\n    x : scalar\n       Rotation angle in radians around x-axis (performed last)\n\n    Returns\n    -------\n    quat : array shape (4,)\n       Quaternion in w, x, y z (real, then vector) format\n\n    Notes\n    -----\n    We can derive this formula in Sympy using:\n\n    1. Formula giving quaternion corresponding to rotation of theta radians\n       about arbitrary axis:\n       http://mathworld.wolfram.com/EulerParameters.html\n    2. Generated formulae from 1.) 
for quaternions corresponding to\n       theta radians rotations about ``x, y, z`` axes\n    3. Apply quaternion multiplication formula -\n       http://en.wikipedia.org/wiki/Quaternions#Hamilton_product - to\n       formulae from 2.) to give formula for combined rotations.\n    '''\n    z = z/2.0\n    y = y/2.0\n    x = x/2.0\n    cz = math.cos(z)\n    sz = math.sin(z)\n    cy = math.cos(y)\n    sy = math.sin(y)\n    cx = math.cos(x)\n    sx = math.sin(x)\n    return np.array([\n             cx*cy*cz - sx*sy*sz,\n             cx*sy*sz + cy*cz*sx,\n             cx*cz*sy - sx*cy*sz,\n             cx*cy*sz + sx*cz*sy])\n\n\ndef quat2euler(q):\n    ''' Return Euler angles corresponding to quaternion `q`\n\n    Parameters\n    ----------\n    q : 4 element sequence\n       w, x, y, z of quaternion\n\n    Returns\n    -------\n    z : scalar\n       Rotation angle in radians around z-axis (performed first)\n    y : scalar\n       Rotation angle in radians around y-axis\n    x : scalar\n       Rotation angle in radians around x-axis (performed last)\n\n    Notes\n    -----\n    It's possible to reduce the amount of calculation a little, by\n    combining parts of the ``quat2mat`` and ``mat2euler`` functions, but\n    the reduction in computation is small, and the code repetition is\n    large.\n    '''\n    # delayed import to avoid cyclic dependencies\n    import nibabel.quaternions as nq\n    return mat2euler(nq.quat2mat(q))\n\n\ndef euler2angle_axis(z=0, y=0, x=0):\n    ''' Return angle, axis corresponding to these Euler angles\n\n    Uses the z, then y, then x convention above\n\n    Parameters\n    ----------\n    z : scalar\n       Rotation angle in radians around z-axis (performed first)\n    y : scalar\n       Rotation angle in radians around y-axis\n    x : scalar\n       Rotation angle in radians around x-axis (performed last)\n\n    Returns\n    -------\n    theta : scalar\n       angle of rotation\n    vector : array shape (3,)\n       axis around which 
rotation occurs\n\n    Examples\n    --------\n    >>> theta, vec = euler2angle_axis(0, 1.5, 0)\n    >>> print(theta)\n    1.5\n    >>> np.allclose(vec, [0, 1, 0])\n    True\n    '''\n    # delayed import to avoid cyclic dependencies\n    import nibabel.quaternions as nq\n    return nq.quat2angle_axis(euler2quat(z, y, x))\n\n\ndef angle_axis2euler(theta, vector, is_normalized=False):\n    ''' Convert angle, axis pair to Euler angles\n\n    Parameters\n    ----------\n    theta : scalar\n       angle of rotation\n    vector : 3 element sequence\n       vector specifying axis for rotation.\n    is_normalized : bool, optional\n       True if vector is already normalized (has norm of 1).  Default\n       False\n\n    Returns\n    -------\n    z : scalar\n    y : scalar\n    x : scalar\n       Rotations in radians around z, y, x axes, respectively\n\n    Examples\n    --------\n    >>> z, y, x = angle_axis2euler(0, [1, 0, 0])\n    >>> np.allclose((z, y, x), 0)\n    True\n\n    Notes\n    -----\n    It's possible to reduce the amount of calculation a little, by\n    combining parts of the ``angle_axis2mat`` and ``mat2euler``\n    functions, but the reduction in computation is small, and the code\n    repetition is large.\n    '''\n    # delayed import to avoid cyclic dependencies\n    import nibabel.quaternions as nq\n    M = nq.angle_axis2mat(theta, vector, is_normalized)\n    return mat2euler(M)\n"
  },
  {
    "path": "code/utils/modelnet_data_prep.py",
    "content": "import os\nimport sys\nimport glob\nimport h5py\nimport numpy as np\nfrom multiprocessing.dummy import Pool as ThreadPool\nimport threading\nfrom tqdm import tqdm\n\nSAMPLING_BIN = \"/home/lqyu/workspace/pcl-pcl-1.8.1/build/bin/pcl_mesh_sampling\"\nSAMPLING_POINT_NUM = 8192\nSAMPLING_LEAF_SIZE = 0.02\nSAVE_ROOT_PATH = '../../data/Patches'\n\n\ndef nonuniform_sampling(num = 4096, sample_num = 1024):\n    sample = set()\n    loc = np.random.rand()*0.6+0.2\n    while(len(sample)<sample_num):\n        a = int(np.random.normal(loc=loc,scale=0.4)*num)\n        if a<0 or a>=num:\n            continue\n        sample.add(a)\n\n    return list(sample)\n\ndef save_file(file_path, data):\n    if not os.path.exists(os.path.split(file_path)[0]):\n        os.makedirs(os.path.split(file_path)[0])\n    np.savetxt(file_path, data, fmt='%.6f')\n\ndef tmp(file_path=None):\n    file_list = glob.glob('/home/lqyu/workspace/PointSR/data/ModelNet10_test/ModelNet10_MC_2k/*.xyz')\n    file_list.sort()\n    for file_path in file_list:\n        print file_path\n        data_4096 = np.loadtxt(file_path)\n        data = data_4096[:, 0:3]\n        centroid = np.mean(data, axis=0, keepdims=True)\n        data = data - centroid\n        furthest_distance = np.amax(np.sqrt(np.sum(abs(data) ** 2, axis=-1)))\n        data = data / furthest_distance\n        data_4096[:, 0:3] = data\n        save_path = file_path.replace('ModelNet10_MC_2k','ModelNet10_MC_2k_normalize')\n        save_file(save_path,data_4096)\n\n        off_path = file_path.replace('ModelNet10_MC_2k','mesh')\n        off_path = off_path.replace('.xyz', '.off')\n        save_off_path = off_path.replace('mesh', 'mesh_normalize')\n        if not os.path.exists(os.path.split(save_off_path)[0]):\n            os.makedirs(os.path.split(save_off_path)[0])\n\n        offFile = open(off_path, 'r')\n        lines = offFile.readlines()\n        offFile.close()\n        with open(save_off_path, 'w') as f:\n            
f.writelines(lines[0:2])\n            params = lines[1].split(' ')\n            nVert = int(params[0])\n            for i in range(2, nVert + 2):\n                coord = lines[i].split(' ')\n                coords = []\n                for item in coord:\n                    if item!='':\n                        coords.append(item)\n                x = (float(coords[0]) - centroid[0, 0]) / furthest_distance\n                y = (float(coords[1]) - centroid[0, 1]) / furthest_distance\n                z = (float(coords[2]) - centroid[0, 2]) / furthest_distance\n                f.write('%.6f %.6f %.6f\\n' % (x, y, z))\n            f.writelines(lines[nVert + 2:])\n        # continue\n\n        # path0 = file_path.replace('poisson_20k','20000_normalize')\n        # save_file(path0, data_4096)\n\n        # idx = np.argsort(data_4096[:,np.random.randint(0,2)])\n        # data_4096 = data_4096[idx]\n        # path1 = file_path.replace('poisson_20k','poisson_5k2')\n        # idx1 = nonuniform_sampling(num=len(data_4096), sample_num=len(data_4096)/4)\n        # data1 = data_4096[idx1, ...]\n        # save_file(path1,data1)\n\n\ndef nonuniformsample_from_pointcloud_fn():\n    file_list1 = glob.glob('/home/lqyu/server/proj49/PointSR_data/test_data/our_collected_data/Poisson_20k/*.xyz')\n    file_list1.sort()\n    # #handle with file_list2 and select the complete whole\n    # tmp_list1  = []\n    # tmp_list2 = []\n    # for item in file_list2:\n    #     tmp_item = item.replace('big_girl','biggirl')\n    #     if len(tmp_item.split('/')[-1].split('_'))==3:#biggirl_01_1\n    #         tmp_list1.append(item)\n    #     else:#biggirl_1\n    #         tmp_list2.append(item)\n    file_list = file_list1\n\n    for file_path in tqdm(file_list):\n        print file_path\n        phase = file_path.split('/')[-2]\n        name = file_path.split('/')[-1].replace(\"xyz\", \"xyz\")\n\n        data_4096 = np.loadtxt(file_path)\n        # data = data_4096[:, 0:3]\n        # centroid = 
np.mean(data, axis=0, keepdims=True)\n        # data = data - centroid\n        # furthest_distance = np.amax(np.sqrt(np.sum(abs(data) ** 2, axis=-1)))\n        # data = data / furthest_distance\n        # data_4096[:, 0:3] = data\n\n        path_4096 = file_path\n        path_nonuniform_1024 = path_4096.replace('Poisson_20k','Poisson_5k_nonuniform')\n\n        #idx_nonuniform_2048 = nonuniform_sampling(num=4096, sample_num=2048)\n        #data_nonuniform_2048 = data_4096[idx_nonuniform_2048, ...]\n        sort_idx = np.argsort(data_4096[:, np.random.randint(0, 2)])\n        perm_idx = np.random.permutation(np.arange(len(data_4096)))\n\n        idx_nonuniform_1024 = nonuniform_sampling(num=len(data_4096), sample_num=5000)\n        data_nonuniform_1024 = data_4096[sort_idx[idx_nonuniform_1024],...]\n\n        # data_nonuniform_1024 = data_4096[perm_idx[:5000], ...]\n\n        #save_file(path_4096,data_4096)\n        #save_file(path_nonuniform_2048,data_nonuniform_2048)\n        save_file(path_nonuniform_1024,data_nonuniform_1024)\n\n\ndef possion_sample_fn(file_path):\n    phase = file_path.split('/')[-2]\n    name = file_path.split('/')[-1].replace(\"off\", \"xyz\")\n    xyz_name = os.path.join(SAVE_ROOT_PATH, '20000~', phase, name)\n    if not os.path.exists(xyz_name):\n        sample_cmd = '../../third_party/PdSampling_nofix %s %s %s' % (str(20000), file_path, xyz_name)\n        print sample_cmd\n        if os.system(sample_cmd):\n            print \"cannot sample file: %s\" % (file_path)\n            return 1\n\n    # xyz_name = os.path.join(SAVE_ROOT_PATH, '4096', phase, name)\n    # if not os.path.exists(xyz_name):\n    #     sample_cmd = '../../third_party_pc901/PdSampling %s %s %s' % (str(4096), file_path, xyz_name)\n    #     print sample_cmd\n    #     if os.system(sample_cmd):\n    #         print \"cannot sample file: %s\" % (file_path)\n    #         return 1\n    #\n    # xyz_name = os.path.join(SAVE_ROOT_PATH, '2048', phase, name)\n    # if not 
os.path.exists(xyz_name):\n    #     sample_cmd = '../../third_party_pc901/PdSampling %s %s %s' % (str(2048), file_path, xyz_name)\n    #     print sample_cmd\n    #     if os.system(sample_cmd):\n    #         print \"cannot sample file: %s\" % (file_path)\n    #         return 1\n\n    # xyz_name = os.path.join(SAVE_ROOT_PATH, '8192~', phase, name)\n    # if os.path.exists(xyz_name):\n    #     return\n    # print file_path\n    # sample_cmd = '../../third_party/PdSampling_nofix %s %s %s' % (str(8192), file_path, xyz_name)\n    # if os.system(sample_cmd):\n    #     print \"cannot sample file: %s\" % (file_path)\n    #     return 1\n\n\ndef normal_off_file(folder='/home/lqyu/workspace/PointSR/data/ModelNet10'):\n    file_list = glob.glob(folder+'/*/train/*.off')\n    file_list.sort()\n    for file_path in file_list:\n        phase = file_path.split('/')[-2]\n        name = file_path.split('/')[-1][:-4]+'.xyz'\n        data_1024 = np.loadtxt(os.path.join('../../data/ModelNet10_poisson', '1024', phase, name))\n        data_2048 = np.loadtxt(os.path.join('../../data/ModelNet10_poisson', '2048', phase, name))\n        data_4096 = np.loadtxt(os.path.join('../../data/ModelNet10_poisson', '4096', phase, name))\n\n        data_ori = np.concatenate((data_1024, data_2048, data_4096), axis=0)\n        data = data_ori[:, 0:3]\n        centroid = np.mean(data, axis=0, keepdims=True)\n        data = data - centroid\n        furthest_distance = np.amax(np.sqrt(np.sum(abs(data) ** 2, axis=-1)))\n\n        save_off_path = file_path.replace('ModelNet10','ModelNet10_normalize')\n        if not os.path.exists(os.path.split(save_off_path)[0]):\n            os.makedirs(os.path.split(save_off_path)[0])\n\n        offFile = open(file_path, 'r')\n        lines = offFile.readlines()\n        offFile.close()\n        with open(save_off_path,'w') as f:\n            f.writelines(lines[0:2])\n            params = lines[1].split(' ')\n            nVert = int(params[0])\n            for i in 
range(2,nVert+2):\n                coord = lines[i].split(' ')\n                x = (float(coord[0])-centroid[0,0])/furthest_distance\n                y = (float(coord[1])-centroid[0,1])/furthest_distance\n                z = (float(coord[2])-centroid[0,2])/furthest_distance\n                f.write('%.6f %.6f %.6f\\n'%(x,y,z))\n            f.writelines(lines[nVert+2:])\n\n\ndef recalculateNormal_fn(file_path):\n    SAVE_ROOT_PATH='../../data/ModelNet10_poisson_normal'\n    phase = file_path.split('/')[-2]\n    name = file_path.split('/')[-1]\n    xyz_name = file_path.split('/')[-1][:-4] + \"_p.xyz\"\n    xyz_normal_name = file_path.split('/')[-1].replace(\"off\", \"xyz\")\n\n    if os.path.exists(os.path.join(SAVE_ROOT_PATH, '2048_nonuniform', phase, name)):\n        return\n    data_1024 = np.loadtxt(os.path.join('../../ModelNet10', '1024', phase, name))\n    data_2048 = np.loadtxt(os.path.join('../../ModelNet10', '2048', phase, name))\n    data_4096 = np.loadtxt(os.path.join('../../ModelNet10', '4096', phase, name))\n    data_8196 = np.loadtxt(os.path.join('../../ModelNet10', '20000~', phase, name))\n\n    data = np.concatenate((data_1024, data_2048, data_4096,data_8196), axis=0)\n    np.savetxt(xyz_name, data)\n    normal_cmd = 'meshlabserver -i %s -o %s -s ../../third_party/calculate_normal.mlx -om vn' % (\n        xyz_name, xyz_normal_name)\n    if os.system(normal_cmd):\n        print \"cannot calculate normal file: %s\" % (file_path)\n        return 1\n\n    data_ori = np.loadtxt(xyz_normal_name)\n    data_ori = data_ori[0:1024*7,:]\n\n    data = data_ori[:, 0:3]\n    centroid = np.mean(data, axis=0, keepdims=True)\n    data = data - centroid\n    furthest_distance = np.amax(np.sqrt(np.sum(abs(data) ** 2, axis=-1)))\n    data = data / furthest_distance\n    data_ori[:, 0:3] = data\n\n    data_1024 = data_ori[0:1024 * 1, :]\n    data_2048 = data_ori[1024 * 1:1024 * 3, :]\n    data_4096 = data_ori[1024 * 3:1024 * 7, :]\n\n    #generate nonuniform data\n    
idx_nonuniform_2048 = nonuniform_sampling(num=len(data_4096), sample_num=2048)\n    data_nonuniform_2048 = data_4096[idx_nonuniform_2048, ...]\n    idx_nonuniform_1024 = nonuniform_sampling(num=len(data_4096), sample_num=1024)\n    data_nonuniform_1024 = data_4096[idx_nonuniform_1024, ...]\n\n    # save\n    path_1024 = os.path.join(SAVE_ROOT_PATH, '1024', phase, name)\n    path_2048 = os.path.join(SAVE_ROOT_PATH, '2048', phase, name)\n    path_4096 = os.path.join(SAVE_ROOT_PATH, '4096', phase, name)\n    path_nonuniform_1024 = os.path.join(SAVE_ROOT_PATH, '1024_nonuniform', phase, name)\n    path_nonuniform_2048 = os.path.join(SAVE_ROOT_PATH, '2048_nonuniform', phase, name)\n\n    if not os.path.exists(os.path.join(SAVE_ROOT_PATH, '1024', phase)):\n        os.makedirs(os.path.join(SAVE_ROOT_PATH, '1024', phase))\n    if not os.path.exists(os.path.join(SAVE_ROOT_PATH, '2048', phase)):\n        os.makedirs(os.path.join(SAVE_ROOT_PATH, '2048', phase))\n    if not os.path.exists(os.path.join(SAVE_ROOT_PATH, '4096', phase)):\n        os.makedirs(os.path.join(SAVE_ROOT_PATH, '4096', phase))\n    if not os.path.exists(os.path.join(SAVE_ROOT_PATH, '1024_nonuniform', phase)):\n        os.makedirs(os.path.join(SAVE_ROOT_PATH, '1024_nonuniform', phase))\n    if not os.path.exists(os.path.join(SAVE_ROOT_PATH, '2048_nonuniform', phase)):\n        os.makedirs(os.path.join(SAVE_ROOT_PATH, '2048_nonuniform', phase))\n\n    np.savetxt(path_1024, data_1024, fmt='%.6f')\n    np.savetxt(path_2048, data_2048, fmt='%.6f')\n    np.savetxt(path_4096, data_4096, fmt='%.6f')\n    np.savetxt(path_nonuniform_1024, data_nonuniform_1024, fmt='%.6f')\n    np.savetxt(path_nonuniform_2048, data_nonuniform_2048, fmt='%.6f')\n\n    os.remove(xyz_name)\n    os.remove(xyz_normal_name)\n    return\n\n\ndef fix_off_file(filepath):\n    with open(filepath,'r') as f:\n        line = f.readline()\n        if line=='OFF\\n':\n            return\n        print filepath\n        lines = f.readlines()\n\n    
nums = line.split(' ')\n    n1 = nums[0][3:]\n    n2 = nums[1]\n    n3 = nums[2]\n    with open(filepath,'w') as f:\n        f.write('OFF\\n%s %s %s'%(n1,n2,n3))\n        f.writelines(lines)\n\n\ndef possion_sample_fn(phase='train'):\n    file_list = glob.glob(os.path.join('../../data/ModelNet10', '*', phase, '*.off'))\n    file_list.sort()\n    new_file_list = []\n\n    for item in file_list:\n        name = item.split('/')[-1][:-3]+\"xyz\"\n        xyz_name = os.path.join(SAVE_ROOT_PATH, '20000~', phase,name)\n        if not os.path.exists(xyz_name):\n            new_file_list.append(item)\n    print('Got %d files in modelnet10.' % (len(new_file_list)))\n    pool = ThreadPool(8)\n    pool.map(possion_sample_fn, new_file_list)\n\ndef recalculateNormal(phase='train'):\n    file_list = glob.glob(os.path.join('../../ModelNet10/1024',phase,'*.xyz'))\n    file_list.sort()\n    print('Got %d files in modelnet10.' % (len(file_list)))\n    pool = ThreadPool(1)\n    pool.map(recalculateNormal_fn, file_list)\n\ndef nonuniformsample_from_pointcloud(phase='train'):\n    file_list = glob.glob(os.path.join('../../data/surface_with_area/pcl_4096~','*.xyz'))\n    file_list.sort()\n    print('Got %d files in modelnet10.' 
% (len(file_list)))\n    for item in file_list:\n        tmp(item)\n\n\ndef save_h52(save_names = ['poisson_4096','poisson_2048']):\n    h5_filename = '/home/lqyu/workspace/PointSR/data/Patches_tt.h5'\n    file_names1 = glob.glob(os.path.join('/home/lqyu/server/proj49/PointSR_data/train_data/SHREC',save_names[0],'*.xyz'))\n    file_names1.sort()\n    #select which data to save\n    select_file_names = []\n    for name in file_names1:\n        model_id = int(name.split('/')[-1].split('_')[0][1:])\n        if model_id<1000:\n            select_file_names.append(name)\n    ##read data\n    names = []\n    category = len(save_names)\n    data = [[] for _ in range(category)]\n\n    for item in tqdm(select_file_names):\n        item_data = []\n        for i in range(category):\n            path = item.replace(save_names[0],save_names[i])\n            tmp_data = np.loadtxt(path)\n\n            # normalize each patch to zero mean and unit radius\n            centroid = np.mean(tmp_data[:, 0:3], axis=0, keepdims=True)\n            tmp_data[:, 0:3] = tmp_data[:, 0:3] - centroid\n            furthest_distance = np.amax(np.sqrt(np.sum(abs(tmp_data[:, 0:3]) ** 2, axis=-1)))\n            tmp_data[:, 0:3] = tmp_data[:, 0:3] / furthest_distance\n            item_data.append(tmp_data)\n        if len(item_data)==category:\n            names.append(item)\n            for i in range(category):\n                data[i].append(item_data[i])\n\n    for i in range(category-1):\n        assert len(data[i])==len(data[i+1])\n    assert len(data[0])==len(names)\n    print(len(names))\n    h5_fout = h5py.File(h5_filename, 'w')\n    for i in range(category):\n        h5_fout.create_dataset(\n            save_names[i], data=data[i],\n            compression='gzip', compression_opts=4,\n            dtype=np.float32)\n    string_dt = h5py.special_dtype(vlen=str)\n    h5_fout.create_dataset(\n        'name', data=names,\n        compression='gzip', compression_opts=1,\n        dtype=string_dt)\n    h5_fout.close()\n\n\ndef save_h5(h5_filename,save_names = ['patch_poisson_4096','patch_poisson_1024_nonuniform']):\n\n    file_names = os.listdir(os.path.join(pointcloud_path,save_names[0]))\n    file_names.sort()\n\n    #select which data to save\n    test_name = ['nicolo','vaselion','bunny','gril']\n    select_file_names = []\n    for name in file_names:\n        mark = False\n        for tt in test_name:\n            if tt in name:\n                mark = True\n                break\n        if not mark:\n            select_file_names.append(name)\n\n    ##read data\n    names = []\n    category = len(save_names)\n    data = [[] for _ in range(category)]\n    for item in tqdm(select_file_names):\n        item_data = []\n        for i in range(category):\n            path = os.path.join(pointcloud_path,save_names[i],item)\n            tmp_data = np.loadtxt(path)\n            # keep the patch only if it has the expected point count\n            if tmp_data.shape[0]==int(save_names[i].split('_')[-1]):\n                item_data.append(tmp_data)\n        if len(item_data)==category:\n            names.append(item)\n            for i in range(category):\n                data[i].append(item_data[i])\n\n    for i in range(category-1):\n        assert len(data[i])==len(data[i+1])\n    assert len(data[0])==len(names)\n\n    h5_fout = h5py.File(h5_filename, 'w')\n    for i in range(category):\n        h5_fout.create_dataset(\n            save_names[i], data=data[i],\n            compression='gzip', compression_opts=4,\n            dtype=np.float32)\n    string_dt = h5py.special_dtype(vlen=str)\n    h5_fout.create_dataset(\n        'name', data=names,\n        compression='gzip', compression_opts=1,\n        dtype=string_dt)\n    h5_fout.close()\n\ndef load_h5(h5_filename):\n    f = h5py.File(h5_filename, 'r')\n    data_1024 = f['data_1024'][:]\n    data_4096 = f['data_4096'][:]\n    f.close()\n    return (data_1024,data_4096)\n\nif __name__ == '__main__':\n    #nonuniform_subsample_from_pointcloud('/home/lqyu/workspace/PointSR/data/ModelNet10_pc',sampling_num=2048)\n\n    pointcloud_path = '/home/lqyu/workspace/PointSR/data/Patches'\n    h5_filename = '/home/lqyu/workspace/PointSR/data/Patches_tt.h5'\n    #nonuniformsample_from_pointcloud_fn('aa')\n    #nonuniformsample_from_pointcloud_fn('ss')\n    #nonuniformsample_from_pointcloud('train')\n    #calculateNormal('train')\n    #load_h5('/home/lqyu/workspace/PointSR/data/ModelNet10_tt.h5')\n    save_h52()\n    #normal_off_file()"
  },
  {
    "path": "code/utils/off2obj.py",
    "content": "#! /usr/bin/python\n# Written by John Bowers\n# http://johnsresearch.wordpress.com\n# 2009\n# You are welcome to use this however you want, this is public domain.\n\nimport sys\n\nif len(sys.argv) == 3:\n    off_path = sys.argv[1]\n    obj_path = sys.argv[2]\nelse:\n    print \"USAGE: off2obj.py [path to mesh] [output path]\"\n    sys.exit(0)\n\n# Class Mesh represents a mesh by a vertex list and a face list\n# and has a method loadFromOffFile to load the Mesh data from an\n# OFF file.\nclass Mesh:\n    \"\"\"Class Represents a Mesh by (V, F)\"\"\"\n    def __init__(self):\n\tself.verts = []\n\tself.faces = []\n\tself.nVerts = 0\n\tself.nFaces = 0\n\tself.edges = None\n    def writeToObjFile(self, pathToObjFile):\n\tobjFile = open(pathToObjFile, 'w')\n\tobjFile.write(\"# off2obj OBJ File\\n\")\n\tobjFile.write(\"# http://johnsresearch.wordpress.com\\n\")\n\tfor vert in self.verts:\n\t    objFile.write(\"v \")\n\t    objFile.write(str(vert[0]))\n\t    objFile.write(\" \")\n\t    objFile.write(str(vert[1]))\n\t    objFile.write(\" \")\n\t    objFile.write(str(vert[2]))\n\t    objFile.write(\"\\n\")\n\tobjFile.write(\"s off\\n\")\n\tfor face in self.faces:\n\t    objFile.write(\"f \")\n\t    objFile.write(str(face[0]+1))\n\t    objFile.write(\" \")\n\t    objFile.write(str(face[1]+1))\n\t    objFile.write(\" \")\n\t    objFile.write(str(face[2]+1))\n\t    objFile.write(\"\\n\")\n\tobjFile.close()\n    def loadFromOffFile(self, pathToOffFile):\n\t#Reset this mesh:\n\tself.verts = []\n\tself.faces = []\n\tself.nVerts = 0\n\tself.nFaces = 0\n\n\t#Open the file for reading:\n\toffFile = open(pathToOffFile, 'r')\n\tlines = offFile.readlines()\n\n\t#Read the number of verts and faces\n\tparams = lines[1].split()\n\tself.nVerts = int(params[0])\n\tself.nFaces = int(params[1])\n\n\t#split the remaining lines into vert and face arrays\n\tvertLines = lines[2:2+self.nVerts]\n\tfaceLines = lines[2+self.nVerts:2+self.nVerts+self.nFaces]\n\n\t#Create the verts 
array\n\tfor vertLine in vertLines:\n\t    XYZ = vertLine.split()\n\t    self.verts.append([float(XYZ[0]), float(XYZ[1]), float(XYZ[2])])\n\n\t#Create the faces array\n\tfor faceLine in faceLines:\n\t    XYZ = faceLine.split()\n\t    self.faces.append((int(XYZ[1]), int(XYZ[2]), int(XYZ[3])))\n\t    if not(int(XYZ[0]) == 3):\n\t\tprint \"ERROR: This OFF loader can only handle meshes with 3 vertex faces.\"\n\t\tprint \"A face with\", XYZ[0], \"vertices is included in the file. Exiting.\"\n\t\toffFile.close()\n\t\tsys.exit(0)\n\n\t#Cleanup\n\toffFile.close()\n    def edgeList(self):\n\tif not(self.edges == None):\n\t    return self.edges\n\tself.edges = []\n\tfor i in range(0, self.nVerts):\n\t    self.edges.append([])\n\tfor face in self.faces:\n\t    i = face[0]\n\t    j = face[1]\n\t    k = face[2]\n\t    if not(j in self.edges[i]):\n\t\tself.edges[i].append(j)\n\t    if not(k in self.edges[i]):\n\t\tself.edges[i].append(k)\n\t    if not(i in self.edges[j]):\n\t\tself.edges[j].append(i)\n\t    if not(k in self.edges[j]):\n\t\tself.edges[j].append(k)\n\t    if not(i in self.edges[k]):\n\t\tself.edges[k].append(i)\n\t    if not(j in self.edges[k]):\n\t\tself.edges[k].append(j)\n\treturn self.edges\n\n\"\"\" Main Program \"\"\"\n\nmesh = Mesh()\nmesh.loadFromOffFile(off_path)\nmesh.writeToObjFile(obj_path)"
  },
  {
    "path": "code/utils/pc_util.py",
    "content": "\"\"\" Utility functions for processing point clouds.\n\nAuthor: Charles R. Qi, Hao Su\nDate: November 2016\n\"\"\"\n\nimport os\nimport sys\nfrom matplotlib import pyplot as plt\nfrom matplotlib import colors\n\nBASE_DIR = os.path.dirname(os.path.abspath(__file__))\nsys.path.append(BASE_DIR)\n\n# Draw point cloud\nfrom eulerangles import euler2mat\n\n# Point cloud IO\nimport numpy as np\nfrom plyfile import PlyData, PlyElement\n\n\n# ----------------------------------------\n# Point Cloud/Volume Conversions\n# ----------------------------------------\n\ndef point_cloud_to_volume_batch(point_clouds, vsize=12, radius=1.0, flatten=True):\n    \"\"\" Input is BxNx3 batch of point cloud\n        Output is Bx(vsize^3)\n    \"\"\"\n    vol_list = []\n    for b in range(point_clouds.shape[0]):\n        vol = point_cloud_to_volume(np.squeeze(point_clouds[b,:,:]), vsize, radius)\n        if flatten:\n            vol_list.append(vol.flatten())\n        else:\n            vol_list.append(np.expand_dims(np.expand_dims(vol, -1), 0))\n    if flatten:\n        return np.vstack(vol_list)\n    else:\n        return np.concatenate(vol_list, 0)\n\n\ndef point_cloud_to_volume(points, vsize, radius=1.0):\n    \"\"\" input is Nx3 points.\n        output is vsize*vsize*vsize\n        assumes points are in range [-radius, radius]\n    \"\"\"\n    vol = np.zeros((vsize,vsize,vsize))\n    voxel = 2*radius/float(vsize)\n    locations = (points + radius)/voxel\n    locations = locations.astype(int)\n    vol[locations[:,0],locations[:,1],locations[:,2]] = 1.0\n    return vol\n\n#a = np.zeros((16,1024,3))\n#print point_cloud_to_volume_batch(a, 12, 1.0, False).shape\n\ndef volume_to_point_cloud(vol):\n    \"\"\" vol is occupancy grid (value = 0 or 1) of size vsize*vsize*vsize\n        return Nx3 numpy array.\n    \"\"\"\n    vsize = vol.shape[0]\n    assert(vol.shape[1] == vsize and vol.shape[2] == vsize)\n    points = []\n    for a in range(vsize):\n        for b in 
range(vsize):\n            for c in range(vsize):\n                if vol[a,b,c] == 1:\n                    points.append(np.array([a,b,c]))\n    if len(points) == 0:\n        return np.zeros((0,3))\n    points = np.vstack(points)\n    return points\n\n# ----------------------------------------\n# Point cloud IO\n# ----------------------------------------\n\ndef read_ply(filename):\n    \"\"\" read XYZ point cloud from filename PLY file \"\"\"\n    plydata = PlyData.read(filename)\n    pc = plydata['vertex'].data\n    pc_array = np.array([[x, y, z] for x,y,z in pc])\n    return pc_array\n\n\ndef write_ply(points, filename, text=True):\n    \"\"\" input: Nx3, write points to filename as PLY format. \"\"\"\n    points = [(points[i,0], points[i,1], points[i,2]) for i in range(points.shape[0])]\n    vertex = np.array(points, dtype=[('x', 'f4'), ('y', 'f4'),('z', 'f4')])\n    el = PlyElement.describe(vertex, 'vertex', comments=['vertices'])\n    PlyData([el], text=text).write(filename)\n\n\n# ----------------------------------------\n# Simple Point cloud and Volume Renderers\n# ----------------------------------------\n\ndef draw_point_cloud(input_points, canvasSize=500, space=240, diameter=10,\n                     xrot=0, yrot=0, zrot=0, switch_xyz=[0,1,2], normalize=True):\n    \"\"\" Render point cloud to image with alpha channel.\n        Input:\n            points: Nx3 numpy array (+y is up direction)\n        Output:\n            gray image as numpy array of size canvasSizexcanvasSize\n    \"\"\"\n    canvasSizeX = canvasSize\n    canvasSizeY = canvasSize\n\n    image = np.zeros((canvasSizeX, canvasSizeY))\n    if input_points is None or input_points.shape[0] == 0:\n        return image\n\n    points = input_points[:, switch_xyz]\n    M = euler2mat(zrot, yrot, xrot)\n    points = (np.dot(M, points.transpose())).transpose()\n\n    # Normalize the point cloud\n    # We normalize scale to fit points in a unit sphere\n    if normalize:\n        centroid = 
np.mean(points, axis=0)\n        points -= centroid\n        furthest_distance = np.max(np.sqrt(np.sum(abs(points)**2,axis=-1)))\n        points /= furthest_distance\n\n    # Pre-compute the Gaussian disk\n    radius = (diameter-1)/2.0\n    disk = np.zeros((diameter, diameter))\n    for i in range(diameter):\n        for j in range(diameter):\n            if (i - radius) * (i-radius) + (j-radius) * (j-radius) <= radius * radius:\n                disk[i, j] = np.exp((-(i-radius)**2 - (j-radius)**2)/(radius**2))\n    mask = np.argwhere(disk > 0)\n    dx = mask[:, 0]\n    dy = mask[:, 1]\n    dv = disk[disk > 0]\n    \n    # Order points by z-buffer\n    zorder = np.argsort(points[:, 2])\n    points = points[zorder, :]\n    points[:, 2] = (points[:, 2] - np.min(points[:, 2])) / (np.max(points[:, 2] - np.min(points[:, 2])))\n    max_depth = np.max(points[:, 2])\n       \n    for i in range(points.shape[0]):\n        j = points.shape[0] - i - 1\n        x = points[j, 0]\n        y = points[j, 1]\n        xc = canvasSizeX/2 + (x*space)\n        yc = canvasSizeY/2 + (y*space)\n        xc = int(np.round(xc))\n        yc = int(np.round(yc))\n        \n        px = dx + xc\n        py = dy + yc\n        #image[px, py] = image[px, py] * 0.7 + dv * (max_depth - points[j, 2]) * 0.3\n        image[px, py] = image[px, py] * 0.7 + dv * 0.3\n\n    val = np.max(image)\n    val = np.percentile(image,99.9)\n    image = image / val\n    mask = image==0\n\n    image[image>1.0]=1.0\n    image = 1.0-image\n    #image = np.expand_dims(image, axis=-1)\n    #image = np.concatenate((image*0.3+0.7,np.ones_like(image), np.ones_like(image)), axis=2)\n    #image = colors.hsv_to_rgb(image)\n    image[mask]=1.0\n\n\n    return image\n\ndef point_cloud_three_views(points,diameter=5):\n    \"\"\" input points Nx3 numpy array (+y is up direction).\n        return an numpy array gray image of size 500x1500. 
\"\"\" \n    # +y is up direction\n    # xrot is azimuth\n    # yrot is in-plane\n    # zrot is elevation\n    img1 = draw_point_cloud(points, zrot=110 / 180.0 * np.pi, xrot=135 / 180.0 * np.pi, yrot=0 / 180.0 * np.pi,diameter=diameter)\n    img2 = draw_point_cloud(points, zrot=70 / 180.0 * np.pi, xrot=135 / 180.0 * np.pi, yrot=0 / 180.0 * np.pi,diameter=diameter)\n    img3 = draw_point_cloud(points, zrot=180.0 / 180.0 * np.pi, xrot=90 / 180.0 * np.pi, yrot=0 / 180.0 * np.pi,diameter=diameter)\n    image_large = np.concatenate([img1, img2, img3], 1)\n\n    return image_large\n\n\nfrom PIL import Image\ndef point_cloud_three_views_demo():\n    \"\"\" Demo for draw_point_cloud function \"\"\"\n    points = read_ply('../third_party/mesh_sampling/piano.ply')\n    im_array = point_cloud_three_views(points)\n    img = Image.fromarray(np.uint8(im_array*255.0))\n    img.save('piano.jpg')\n\n\nfrom mpl_toolkits.mplot3d import Axes3D  # registers the '3d' projection\ndef pyplot_draw_point_cloud(points, output_filename):\n    \"\"\" points is a Nx3 numpy array \"\"\"\n    fig = plt.figure()\n    ax = fig.add_subplot(111, projection='3d')\n    ax.scatter(points[:,0], points[:,1], points[:,2])\n    ax.set_xlabel('x')\n    ax.set_ylabel('y')\n    ax.set_zlabel('z')\n    plt.savefig(output_filename)\n\ndef pyplot_draw_volume(vol, output_filename):\n    \"\"\" vol is of size vsize*vsize*vsize\n        output an image to output_filename\n    \"\"\"\n    points = volume_to_point_cloud(vol)\n    pyplot_draw_point_cloud(points, output_filename)\n\nif __name__==\"__main__\":\n    point_cloud_three_views_demo()\n"
  },
  {
    "path": "code/utils/plyfile.py",
    "content": "#   Copyright 2014 Darsh Ranjan\n#\n#   This file is part of python-plyfile.\n#\n#   python-plyfile is free software: you can redistribute it and/or\n#   modify it under the terms of the GNU General Public License as\n#   published by the Free Software Foundation, either version 3 of the\n#   License, or (at your option) any later version.\n#\n#   python-plyfile is distributed in the hope that it will be useful,\n#   but WITHOUT ANY WARRANTY; without even the implied warranty of\n#   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU\n#   General Public License for more details.\n#\n#   You should have received a copy of the GNU General Public License\n#   along with python-plyfile.  If not, see\n#       <http://www.gnu.org/licenses/>.\n\nfrom itertools import islice as _islice\n\nimport numpy as _np\nfrom sys import byteorder as _byteorder\n\n\ntry:\n    _range = xrange\nexcept NameError:\n    _range = range\n\n\n# Many-many relation\n_data_type_relation = [\n    ('int8', 'i1'),\n    ('char', 'i1'),\n    ('uint8', 'u1'),\n    ('uchar', 'b1'),\n    ('uchar', 'u1'),\n    ('int16', 'i2'),\n    ('short', 'i2'),\n    ('uint16', 'u2'),\n    ('ushort', 'u2'),\n    ('int32', 'i4'),\n    ('int', 'i4'),\n    ('uint32', 'u4'),\n    ('uint', 'u4'),\n    ('float32', 'f4'),\n    ('float', 'f4'),\n    ('float64', 'f8'),\n    ('double', 'f8')\n]\n\n_data_types = dict(_data_type_relation)\n_data_type_reverse = dict((b, a) for (a, b) in _data_type_relation)\n\n_types_list = []\n_types_set = set()\nfor (_a, _b) in _data_type_relation:\n    if _a not in _types_set:\n        _types_list.append(_a)\n        _types_set.add(_a)\n    if _b not in _types_set:\n        _types_list.append(_b)\n        _types_set.add(_b)\n\n\n_byte_order_map = {\n    'ascii': '=',\n    'binary_little_endian': '<',\n    'binary_big_endian': '>'\n}\n\n_byte_order_reverse = {\n    '<': 'binary_little_endian',\n    '>': 'binary_big_endian'\n}\n\n_native_byte_order = {'little': '<', 
'big': '>'}[_byteorder]\n\n\ndef _lookup_type(type_str):\n    if type_str not in _data_type_reverse:\n        try:\n            type_str = _data_types[type_str]\n        except KeyError:\n            raise ValueError(\"field type %r not in %r\" %\n                             (type_str, _types_list))\n\n    return _data_type_reverse[type_str]\n\n\ndef _split_line(line, n):\n    fields = line.split(None, n)\n    if len(fields) == n:\n        fields.append('')\n\n    assert len(fields) == n + 1\n\n    return fields\n\n\ndef make2d(array, cols=None, dtype=None):\n    '''\n    Make a 2D array from an array of arrays.  The `cols' and `dtype'\n    arguments can be omitted if the array is not empty.\n\n    '''\n    if (cols is None or dtype is None) and not len(array):\n        raise RuntimeError(\"cols and dtype must be specified for empty \"\n                           \"array\")\n\n    if cols is None:\n        cols = len(array[0])\n\n    if dtype is None:\n        dtype = array[0].dtype\n\n    return _np.fromiter(array, [('_', dtype, (cols,))],\n                        count=len(array))['_']\n\n\nclass PlyParseError(Exception):\n\n    '''\n    Raised when a PLY file cannot be parsed.\n\n    The attributes `element', `row', `property', and `message' give\n    additional information.\n\n    '''\n\n    def __init__(self, message, element=None, row=None, prop=None):\n        self.message = message\n        self.element = element\n        self.row = row\n        self.prop = prop\n\n        s = ''\n        if self.element:\n            s += 'element %r: ' % self.element.name\n        if self.row is not None:\n            s += 'row %d: ' % self.row\n        if self.prop:\n            s += 'property %r: ' % self.prop.name\n        s += self.message\n\n        Exception.__init__(self, s)\n\n    def __repr__(self):\n        return ('PlyParseError(%r, element=%r, row=%r, prop=%r)' %\n                (self.message, self.element, self.row, self.prop))\n\n\nclass PlyData(object):\n\n 
   '''\n    PLY file header and data.\n\n    A PlyData instance is created in one of two ways: by the static\n    method PlyData.read (to read a PLY file), or directly from __init__\n    given a sequence of elements (which can then be written to a PLY\n    file).\n\n    '''\n\n    def __init__(self, elements=[], text=False, byte_order='=',\n                 comments=[], obj_info=[]):\n        '''\n        elements: sequence of PlyElement instances.\n\n        text: whether the resulting PLY file will be text (True) or\n            binary (False).\n\n        byte_order: '<' for little-endian, '>' for big-endian, or '='\n            for native.  This is only relevant if `text' is False.\n\n        comments: sequence of strings that will be placed in the header\n            between the 'ply' and 'format ...' lines.\n\n        obj_info: like comments, but will be placed in the header with\n            \"obj_info ...\" instead of \"comment ...\".\n\n        '''\n        if byte_order == '=' and not text:\n            byte_order = _native_byte_order\n\n        self.byte_order = byte_order\n        self.text = text\n\n        self.comments = list(comments)\n        self.obj_info = list(obj_info)\n        self.elements = elements\n\n    def _get_elements(self):\n        return self._elements\n\n    def _set_elements(self, elements):\n        self._elements = tuple(elements)\n        self._index()\n\n    elements = property(_get_elements, _set_elements)\n\n    def _get_byte_order(self):\n        return self._byte_order\n\n    def _set_byte_order(self, byte_order):\n        if byte_order not in ['<', '>', '=']:\n            raise ValueError(\"byte order must be '<', '>', or '='\")\n\n        self._byte_order = byte_order\n\n    byte_order = property(_get_byte_order, _set_byte_order)\n\n    def _index(self):\n        self._element_lookup = dict((elt.name, elt) for elt in\n                                    self._elements)\n        if len(self._element_lookup) != 
len(self._elements):\n            raise ValueError(\"two elements with same name\")\n\n    @staticmethod\n    def _parse_header(stream):\n        '''\n        Parse a PLY header from a readable file-like stream.\n\n        '''\n        lines = []\n        comments = {'comment': [], 'obj_info': []}\n        while True:\n            line = stream.readline().decode('ascii').strip()\n            fields = _split_line(line, 1)\n\n            if fields[0] == 'end_header':\n                break\n\n            elif fields[0] in comments.keys():\n                lines.append(fields)\n            else:\n                lines.append(line.split())\n\n        a = 0\n        if lines[a] != ['ply']:\n            raise PlyParseError(\"expected 'ply'\")\n\n        a += 1\n        while lines[a][0] in comments.keys():\n            comments[lines[a][0]].append(lines[a][1])\n            a += 1\n\n        if lines[a][0] != 'format':\n            raise PlyParseError(\"expected 'format'\")\n\n        if lines[a][2] != '1.0':\n            raise PlyParseError(\"expected version '1.0'\")\n\n        if len(lines[a]) != 3:\n            raise PlyParseError(\"too many fields after 'format'\")\n\n        fmt = lines[a][1]\n\n        if fmt not in _byte_order_map:\n            raise PlyParseError(\"don't understand format %r\" % fmt)\n\n        byte_order = _byte_order_map[fmt]\n        text = fmt == 'ascii'\n\n        a += 1\n        while a < len(lines) and lines[a][0] in comments.keys():\n            comments[lines[a][0]].append(lines[a][1])\n            a += 1\n\n        return PlyData(PlyElement._parse_multi(lines[a:]),\n                       text, byte_order,\n                       comments['comment'], comments['obj_info'])\n\n    @staticmethod\n    def read(stream):\n        '''\n        Read PLY data from a readable file-like object or filename.\n\n        '''\n        (must_close, stream) = _open_stream(stream, 'read')\n        try:\n            data = PlyData._parse_header(stream)\n   
         for elt in data:\n                elt._read(stream, data.text, data.byte_order)\n        finally:\n            if must_close:\n                stream.close()\n\n        return data\n\n    def write(self, stream):\n        '''\n        Write PLY data to a writeable file-like object or filename.\n\n        '''\n        (must_close, stream) = _open_stream(stream, 'write')\n        try:\n            stream.write(self.header.encode('ascii'))\n            stream.write(b'\\r\\n')\n            for elt in self:\n                elt._write(stream, self.text, self.byte_order)\n        finally:\n            if must_close:\n                stream.close()\n\n    @property\n    def header(self):\n        '''\n        Provide PLY-formatted metadata for the instance.\n\n        '''\n        lines = ['ply']\n\n        if self.text:\n            lines.append('format ascii 1.0')\n        else:\n            lines.append('format ' +\n                         _byte_order_reverse[self.byte_order] +\n                         ' 1.0')\n\n        # Some information is lost here, since all comments are placed\n        # between the 'format' line and the first element.\n        for c in self.comments:\n            lines.append('comment ' + c)\n\n        for c in self.obj_info:\n            lines.append('obj_info ' + c)\n\n        lines.extend(elt.header for elt in self.elements)\n        lines.append('end_header')\n        return '\\r\\n'.join(lines)\n\n    def __iter__(self):\n        return iter(self.elements)\n\n    def __len__(self):\n        return len(self.elements)\n\n    def __contains__(self, name):\n        return name in self._element_lookup\n\n    def __getitem__(self, name):\n        return self._element_lookup[name]\n\n    def __str__(self):\n        return self.header\n\n    def __repr__(self):\n        return ('PlyData(%r, text=%r, byte_order=%r, '\n                'comments=%r, obj_info=%r)' %\n                (self.elements, self.text, self.byte_order,\n               
  self.comments, self.obj_info))\n\n\ndef _open_stream(stream, read_or_write):\n    if hasattr(stream, read_or_write):\n        return (False, stream)\n    try:\n        return (True, open(stream, read_or_write[0] + 'b'))\n    except TypeError:\n        raise RuntimeError(\"expected open file or filename\")\n\n\nclass PlyElement(object):\n\n    '''\n    PLY file element.\n\n    A client of this library doesn't normally need to instantiate this\n    directly, so the following is only for the sake of documenting the\n    internals.\n\n    Creating a PlyElement instance is generally done in one of two ways:\n    as a byproduct of PlyData.read (when reading a PLY file) and by\n    PlyElement.describe (before writing a PLY file).\n\n    '''\n\n    def __init__(self, name, properties, count, comments=[]):\n        '''\n        This is not part of the public interface.  The preferred methods\n        of obtaining PlyElement instances are PlyData.read (to read from\n        a file) and PlyElement.describe (to construct from a numpy\n        array).\n\n        '''\n        self._name = str(name)\n        self._check_name()\n        self._count = count\n\n        self._properties = tuple(properties)\n        self._index()\n\n        self.comments = list(comments)\n\n        self._have_list = any(isinstance(p, PlyListProperty)\n                              for p in self.properties)\n\n    @property\n    def count(self):\n        return self._count\n\n    def _get_data(self):\n        return self._data\n\n    def _set_data(self, data):\n        self._data = data\n        self._count = len(data)\n        self._check_sanity()\n\n    data = property(_get_data, _set_data)\n\n    def _check_sanity(self):\n        for prop in self.properties:\n            if prop.name not in self._data.dtype.fields:\n                raise ValueError(\"dangling property %r\" % prop.name)\n\n    def _get_properties(self):\n        return self._properties\n\n    def _set_properties(self, 
properties):\n        self._properties = tuple(properties)\n        self._check_sanity()\n        self._index()\n\n    properties = property(_get_properties, _set_properties)\n\n    def _index(self):\n        self._property_lookup = dict((prop.name, prop)\n                                     for prop in self._properties)\n        if len(self._property_lookup) != len(self._properties):\n            raise ValueError(\"two properties with same name\")\n\n    def ply_property(self, name):\n        return self._property_lookup[name]\n\n    @property\n    def name(self):\n        return self._name\n\n    def _check_name(self):\n        if any(c.isspace() for c in self._name):\n            msg = \"element name %r contains spaces\" % self._name\n            raise ValueError(msg)\n\n    def dtype(self, byte_order='='):\n        '''\n        Return the numpy dtype of the in-memory representation of the\n        data.  (If there are no list properties, and the PLY format is\n        binary, then this also accurately describes the on-disk\n        representation of the element.)\n\n        '''\n        return [(prop.name, prop.dtype(byte_order))\n                for prop in self.properties]\n\n    @staticmethod\n    def _parse_multi(header_lines):\n        '''\n        Parse a list of PLY element definitions.\n\n        '''\n        elements = []\n        while header_lines:\n            (elt, header_lines) = PlyElement._parse_one(header_lines)\n            elements.append(elt)\n\n        return elements\n\n    @staticmethod\n    def _parse_one(lines):\n        '''\n        Consume one element definition.  
The unconsumed input is\n        returned along with a PlyElement instance.\n\n        '''\n        a = 0\n        line = lines[a]\n\n        if line[0] != 'element':\n            raise PlyParseError(\"expected 'element'\")\n        if len(line) > 3:\n            raise PlyParseError(\"too many fields after 'element'\")\n        if len(line) < 3:\n            raise PlyParseError(\"too few fields after 'element'\")\n\n        (name, count) = (line[1], int(line[2]))\n\n        comments = []\n        properties = []\n        while True:\n            a += 1\n            if a >= len(lines):\n                break\n\n            if lines[a][0] == 'comment':\n                comments.append(lines[a][1])\n            elif lines[a][0] == 'property':\n                properties.append(PlyProperty._parse_one(lines[a]))\n            else:\n                break\n\n        return (PlyElement(name, properties, count, comments),\n                lines[a:])\n\n    @staticmethod\n    def describe(data, name, len_types={}, val_types={},\n                 comments=[]):\n        '''\n        Construct a PlyElement from an array's metadata.\n\n        len_types and val_types can be given as mappings from list\n        property names to type strings (like 'u1', 'f4', etc., or\n        'int8', 'float32', etc.). These can be used to define the length\n        and value types of list properties.  
List property lengths\n        always default to type 'u1' (8-bit unsigned integer), and value\n        types default to 'i4' (32-bit integer).\n\n        '''\n        if not isinstance(data, _np.ndarray):\n            raise TypeError(\"only numpy arrays are supported\")\n\n        if len(data.shape) != 1:\n            raise ValueError(\"only one-dimensional arrays are \"\n                             \"supported\")\n\n        count = len(data)\n\n        properties = []\n        descr = data.dtype.descr\n\n        for t in descr:\n            if not isinstance(t[1], str):\n                raise ValueError(\"nested records not supported\")\n\n            if not t[0]:\n                raise ValueError(\"field with empty name\")\n\n            if len(t) != 2 or t[1][1] == 'O':\n                # non-scalar field, which corresponds to a list\n                # property in PLY.\n\n                if t[1][1] == 'O':\n                    if len(t) != 2:\n                        raise ValueError(\"non-scalar object fields not \"\n                                         \"supported\")\n\n                len_str = _data_type_reverse[len_types.get(t[0], 'u1')]\n                if t[1][1] == 'O':\n                    val_type = val_types.get(t[0], 'i4')\n                    val_str = _lookup_type(val_type)\n                else:\n                    val_str = _lookup_type(t[1][1:])\n\n                prop = PlyListProperty(t[0], len_str, val_str)\n            else:\n                val_str = _lookup_type(t[1][1:])\n                prop = PlyProperty(t[0], val_str)\n\n            properties.append(prop)\n\n        elt = PlyElement(name, properties, count, comments)\n        elt.data = data\n\n        return elt\n\n    def _read(self, stream, text, byte_order):\n        '''\n        Read the actual data from a PLY file.\n\n        '''\n        if text:\n            self._read_txt(stream)\n        else:\n            if self._have_list:\n                # There are list 
properties, so a simple load is\n                # impossible.\n                self._read_bin(stream, byte_order)\n            else:\n                # There are no list properties, so loading the data is\n                # much more straightforward.\n                self._data = _np.fromfile(stream,\n                                          self.dtype(byte_order),\n                                          self.count)\n\n        if len(self._data) < self.count:\n            k = len(self._data)\n            del self._data\n            raise PlyParseError(\"early end-of-file\", self, k)\n\n        self._check_sanity()\n\n    def _write(self, stream, text, byte_order):\n        '''\n        Write the data to a PLY file.\n\n        '''\n        if text:\n            self._write_txt(stream)\n        else:\n            if self._have_list:\n                # There are list properties, so serialization is\n                # slightly complicated.\n                self._write_bin(stream, byte_order)\n            else:\n                # no list properties, so serialization is\n                # straightforward.\n                self.data.astype(self.dtype(byte_order),\n                                 copy=False).tofile(stream)\n\n    def _read_txt(self, stream):\n        '''\n        Load a PLY element from an ASCII-format PLY file.  
The element\n        may contain list properties.\n\n        '''\n        self._data = _np.empty(self.count, dtype=self.dtype())\n\n        k = 0\n        for line in _islice(iter(stream.readline, b''), self.count):\n            fields = iter(line.strip().split())\n            for prop in self.properties:\n                try:\n                    self._data[prop.name][k] = prop._from_fields(fields)\n                except StopIteration:\n                    raise PlyParseError(\"early end-of-line\",\n                                        self, k, prop)\n                except ValueError:\n                    raise PlyParseError(\"malformed input\",\n                                        self, k, prop)\n            try:\n                next(fields)\n            except StopIteration:\n                pass\n            else:\n                raise PlyParseError(\"expected end-of-line\", self, k)\n            k += 1\n\n        if k < self.count:\n            del self._data\n            raise PlyParseError(\"early end-of-file\", self, k)\n\n    def _write_txt(self, stream):\n        '''\n        Save a PLY element to an ASCII-format PLY file.  The element may\n        contain list properties.\n\n        '''\n        for rec in self.data:\n            fields = []\n            for prop in self.properties:\n                fields.extend(prop._to_fields(rec[prop.name]))\n\n            _np.savetxt(stream, [fields], '%.18g', newline='\\r\\n')\n\n    def _read_bin(self, stream, byte_order):\n        '''\n        Load a PLY element from a binary PLY file.  
The element may\n        contain list properties.\n\n        '''\n        self._data = _np.empty(self.count, dtype=self.dtype(byte_order))\n\n        for k in _range(self.count):\n            for prop in self.properties:\n                try:\n                    self._data[prop.name][k] = \\\n                        prop._read_bin(stream, byte_order)\n                except StopIteration:\n                    raise PlyParseError(\"early end-of-file\",\n                                        self, k, prop)\n\n    def _write_bin(self, stream, byte_order):\n        '''\n        Save a PLY element to a binary PLY file.  The element may\n        contain list properties.\n\n        '''\n        for rec in self.data:\n            for prop in self.properties:\n                prop._write_bin(rec[prop.name], stream, byte_order)\n\n    @property\n    def header(self):\n        '''\n        Format this element's metadata as it would appear in a PLY\n        header.\n\n        '''\n        lines = ['element %s %d' % (self.name, self.count)]\n\n        # Some information is lost here, since all comments are placed\n        # between the 'element' line and the first property definition.\n        for c in self.comments:\n            lines.append('comment ' + c)\n\n        lines.extend(list(map(str, self.properties)))\n\n        return '\\r\\n'.join(lines)\n\n    def __getitem__(self, key):\n        return self.data[key]\n\n    def __setitem__(self, key, value):\n        self.data[key] = value\n\n    def __str__(self):\n        return self.header\n\n    def __repr__(self):\n        return ('PlyElement(%r, %r, count=%d, comments=%r)' %\n                (self.name, self.properties, self.count,\n                 self.comments))\n\n\nclass PlyProperty(object):\n\n    '''\n    PLY property description.  
This class is pure metadata; the data\n    itself is contained in PlyElement instances.\n\n    '''\n\n    def __init__(self, name, val_dtype):\n        self._name = str(name)\n        self._check_name()\n        self.val_dtype = val_dtype\n\n    def _get_val_dtype(self):\n        return self._val_dtype\n\n    def _set_val_dtype(self, val_dtype):\n        self._val_dtype = _data_types[_lookup_type(val_dtype)]\n\n    val_dtype = property(_get_val_dtype, _set_val_dtype)\n\n    @property\n    def name(self):\n        return self._name\n\n    def _check_name(self):\n        if any(c.isspace() for c in self._name):\n            msg = \"property name %r contains spaces\" % self._name\n            raise ValueError(msg)\n\n    @staticmethod\n    def _parse_one(line):\n        assert line[0] == 'property'\n\n        if line[1] == 'list':\n            if len(line) > 5:\n                raise PlyParseError(\"too many fields after \"\n                                    \"'property list'\")\n            if len(line) < 5:\n                raise PlyParseError(\"too few fields after \"\n                                    \"'property list'\")\n\n            return PlyListProperty(line[4], line[2], line[3])\n\n        else:\n            if len(line) > 3:\n                raise PlyParseError(\"too many fields after \"\n                                    \"'property'\")\n            if len(line) < 3:\n                raise PlyParseError(\"too few fields after \"\n                                    \"'property'\")\n\n            return PlyProperty(line[2], line[1])\n\n    def dtype(self, byte_order='='):\n        '''\n        Return the numpy dtype description for this property (as a tuple\n        of strings).\n\n        '''\n        return byte_order + self.val_dtype\n\n    def _from_fields(self, fields):\n        '''\n        Parse from generator.  
Raise StopIteration if the property could\n        not be read.\n\n        '''\n        return _np.dtype(self.dtype()).type(next(fields))\n\n    def _to_fields(self, data):\n        '''\n        Return generator over one item.\n\n        '''\n        yield _np.dtype(self.dtype()).type(data)\n\n    def _read_bin(self, stream, byte_order):\n        '''\n        Read data from a binary stream.  Raise StopIteration if the\n        property could not be read.\n\n        '''\n        try:\n            return _np.fromfile(stream, self.dtype(byte_order), 1)[0]\n        except IndexError:\n            raise StopIteration\n\n    def _write_bin(self, data, stream, byte_order):\n        '''\n        Write data to a binary stream.\n\n        '''\n        _np.dtype(self.dtype(byte_order)).type(data).tofile(stream)\n\n    def __str__(self):\n        val_str = _data_type_reverse[self.val_dtype]\n        return 'property %s %s' % (val_str, self.name)\n\n    def __repr__(self):\n        return 'PlyProperty(%r, %r)' % (self.name,\n                                        _lookup_type(self.val_dtype))\n\n\nclass PlyListProperty(PlyProperty):\n\n    '''\n    PLY list property description.\n\n    '''\n\n    def __init__(self, name, len_dtype, val_dtype):\n        PlyProperty.__init__(self, name, val_dtype)\n\n        self.len_dtype = len_dtype\n\n    def _get_len_dtype(self):\n        return self._len_dtype\n\n    def _set_len_dtype(self, len_dtype):\n        self._len_dtype = _data_types[_lookup_type(len_dtype)]\n\n    len_dtype = property(_get_len_dtype, _set_len_dtype)\n\n    def dtype(self, byte_order='='):\n        '''\n        List properties always have a numpy dtype of \"object\".\n\n        '''\n        return '|O'\n\n    def list_dtype(self, byte_order='='):\n        '''\n        Return the pair (len_dtype, val_dtype) (both numpy-friendly\n        strings).\n\n        '''\n        return (byte_order + self.len_dtype,\n                byte_order + self.val_dtype)\n\n    def 
_from_fields(self, fields):\n        (len_t, val_t) = self.list_dtype()\n\n        n = int(_np.dtype(len_t).type(next(fields)))\n\n        data = _np.loadtxt(list(_islice(fields, n)), val_t, ndmin=1)\n        if len(data) < n:\n            raise StopIteration\n\n        return data\n\n    def _to_fields(self, data):\n        '''\n        Return generator over the (numerical) PLY representation of the\n        list data (length followed by actual data).\n\n        '''\n        (len_t, val_t) = self.list_dtype()\n\n        data = _np.asarray(data, dtype=val_t).ravel()\n\n        yield _np.dtype(len_t).type(data.size)\n        for x in data:\n            yield x\n\n    def _read_bin(self, stream, byte_order):\n        (len_t, val_t) = self.list_dtype(byte_order)\n\n        try:\n            n = _np.fromfile(stream, len_t, 1)[0]\n        except IndexError:\n            raise StopIteration\n\n        data = _np.fromfile(stream, val_t, n)\n        if len(data) < n:\n            raise StopIteration\n\n        return data\n\n    def _write_bin(self, data, stream, byte_order):\n        '''\n        Write data to a binary stream.\n\n        '''\n        (len_t, val_t) = self.list_dtype(byte_order)\n\n        data = _np.asarray(data, dtype=val_t).ravel()\n\n        _np.array(data.size, dtype=len_t).tofile(stream)\n        data.tofile(stream)\n\n    def __str__(self):\n        len_str = _data_type_reverse[self.len_dtype]\n        val_str = _data_type_reverse[self.val_dtype]\n        return 'property list %s %s %s' % (len_str, val_str, self.name)\n\n    def __repr__(self):\n        return ('PlyListProperty(%r, %r, %r)' %\n                (self.name,\n                 _lookup_type(self.len_dtype),\n                 _lookup_type(self.val_dtype)))\n"
  },
  {
    "path": "code/utils/pointnet_util.py",
    "content": "\"\"\" PointNet++ Layers\n\nAuthor: Charles R. Qi\nDate: November 2017\n\"\"\"\n\nimport os\nimport sys\nfrom tf_ops.sampling.tf_sampling import farthest_point_sample, gather_point\nfrom tf_ops.grouping.tf_grouping import query_ball_point, group_point, knn_point\nfrom tf_ops.interpolation.tf_interpolate import three_nn, three_interpolate\nimport tensorflow as tf\nimport numpy as np\nimport tf_util2\n\ndef sample_and_group(npoint, radius, nsample, xyz, points, tnet_spec=None, knn=False, use_xyz=True):\n    '''\n    Input:\n        npoint: int32\n        radius: float32\n        nsample: int32\n        xyz: (batch_size, ndataset, 3) TF tensor\n        points: (batch_size, ndataset, channel) TF tensor, if None will just use xyz as points\n        tnet_spec: dict (keys: mlp, mlp2, is_training, bn_decay), if None do not apply tnet\n        knn: bool, if True use kNN instead of radius search\n        use_xyz: bool, if True concat XYZ with local point features, otherwise just use point features\n    Output:\n        new_xyz: (batch_size, npoint, 3) TF tensor\n        new_points: (batch_size, npoint, nsample, 3+channel) TF tensor\n        idx: (batch_size, npoint, nsample) TF tensor, indices of local points as in ndataset points\n        grouped_xyz: (batch_size, npoint, nsample, 3) TF tensor, normalized point XYZs\n            (subtracted by seed point XYZ) in local regions\n    '''\n\n    new_xyz = gather_point(xyz, farthest_point_sample(npoint, xyz)) # (batch_size, npoint, 3)\n    if knn:\n        _,idx = knn_point(nsample, xyz, new_xyz)\n    else:\n        if np.isscalar(radius):\n            idx, pts_cnt = query_ball_point(radius, nsample, xyz, new_xyz)\n        else:\n            idx_list = []\n            for radius_one, xyz_one, new_xyz_one in zip(tf.unstack(radius,axis=0), tf.unstack(xyz, axis=0),tf.unstack(new_xyz, axis=0)):\n                idx_one, _ = query_ball_point(radius_one, nsample, tf.expand_dims(xyz_one, axis=0), 
tf.expand_dims(new_xyz_one, axis=0))\n                idx_list.append(idx_one)\n            idx = tf.stack(idx_list, axis=0)\n            idx = tf.squeeze(idx, axis=1)\n\n    grouped_xyz = group_point(xyz, idx) # (batch_size, npoint, nsample, 3)\n    grouped_xyz -= tf.tile(tf.expand_dims(new_xyz, 2), [1,1,nsample,1]) # translation normalization\n    if tnet_spec is not None:\n        grouped_xyz = tnet(grouped_xyz, tnet_spec)\n    if points is not None:\n        grouped_points = group_point(points, idx) # (batch_size, npoint, nsample, channel)\n        if use_xyz:\n            # new_points = tf.concat([grouped_xyz, tf.tile(tf.expand_dims(new_xyz, 2), [1,1,nsample,1]),grouped_points], axis=-1) # (batch_size, npoint, nsample, 3+channel)\n            new_points = tf.concat([grouped_xyz, grouped_points], axis=-1)  # (batch_size, npoint, nsample, 3+channel)\n        else:\n            new_points = grouped_points\n    else:\n        # new_points =  tf.concat([grouped_xyz, tf.tile(tf.expand_dims(new_xyz, 2), [1,1,nsample,1])], axis=-1)\n        new_points = grouped_xyz\n\n    return new_xyz, new_points, idx, grouped_xyz\n\n\ndef sample_and_group_all(xyz, points, use_xyz=True):\n    '''\n    Inputs:\n        xyz: (batch_size, ndataset, 3) TF tensor\n        points: (batch_size, ndataset, channel) TF tensor, if None will just use xyz as points\n        use_xyz: bool, if True concat XYZ with local point features, otherwise just use point features\n    Outputs:\n        new_xyz: (batch_size, 1, 3) as (0,0,0)\n        new_points: (batch_size, 1, ndataset, 3+channel) TF tensor\n    Note:\n        Equivalent to sample_and_group with npoint=1, radius=inf, use (0,0,0) as the centroid\n    '''\n    batch_size = xyz.get_shape()[0].value\n    nsample = xyz.get_shape()[1].value\n    new_xyz = tf.constant(np.tile(np.array([0,0,0]).reshape((1,1,3)), (batch_size,1,1)),dtype=tf.float32) # (batch_size, 1, 3)\n    idx = tf.constant(np.tile(np.array(range(nsample)).reshape((1,1,nsample)), 
(batch_size,1,1)))\n    grouped_xyz = tf.reshape(xyz, (batch_size, 1, nsample, 3)) # (batch_size, npoint=1, nsample, 3)\n    if points is not None:\n        if use_xyz:\n            new_points = tf.concat([xyz, points], axis=2) # (batch_size, ndataset, 3+channel)\n        else:\n            new_points = points\n        new_points = tf.expand_dims(new_points, 1) # (batch_size, 1, ndataset, 3+channel)\n    else:\n        new_points = grouped_xyz\n    return new_xyz, new_points, idx, grouped_xyz\n\n\ndef pointnet_sa_module(xyz, points, npoint, radius, nsample, mlp, mlp2, group_all, is_training,\n                       bn_decay, scope, bn=True, ibn=False, pooling='max', tnet_spec=None, knn=False, use_xyz=True):\n    ''' PointNet Set Abstraction (SA) Module\n        Input:\n            xyz: (batch_size, ndataset, 3) TF tensor\n            points: (batch_size, ndataset, channel) TF tensor\n            npoint: int32 -- #points sampled in farthest point sampling\n            radius: float32 -- search radius in local region\n            nsample: int32 -- how many points in each local region\n            mlp: list of int32 -- output size for MLP on each point\n            mlp2: list of int32 -- output size for MLP on each region\n            group_all: bool -- group all points into one PC if set true, overriding the\n                npoint, radius and nsample settings\n            use_xyz: bool, if True concat XYZ with local point features, otherwise just use point features\n        Return:\n            new_xyz: (batch_size, npoint, 3) TF tensor\n            new_points: (batch_size, npoint, mlp[-1] or mlp2[-1]) TF tensor\n            idx: (batch_size, npoint, nsample) int32 -- indices for local regions\n    '''\n    with tf.variable_scope(scope) as sc:\n        if group_all:\n            nsample = xyz.get_shape()[1].value\n            new_xyz, new_points, idx, grouped_xyz = sample_and_group_all(xyz, points, use_xyz)\n        else:\n            new_xyz, 
new_points, idx, grouped_xyz = sample_and_group(npoint, radius, nsample, xyz, points, tnet_spec, knn, use_xyz)\n        for i, num_out_channel in enumerate(mlp):\n            new_points = tf_util2.conv2d(new_points, num_out_channel, [1,1],\n                                        padding='VALID', stride=[1,1],\n                                        bn=bn, ibn=ibn, is_training=is_training,\n                                        scope='conv%d'%(i), bn_decay=bn_decay)\n        if pooling=='avg':\n            new_points = tf.layers.average_pooling2d(new_points, [1,nsample], [1,1], padding='VALID', name='avgpool1')\n        elif pooling=='weighted_avg':\n            with tf.variable_scope('weighted_avg1'):\n                dists = tf.norm(grouped_xyz,axis=-1,ord=2,keep_dims=True)\n                exp_dists = tf.exp(-dists * 5)\n                weights = exp_dists/tf.reduce_sum(exp_dists,axis=2,keep_dims=True) # (batch_size, npoint, nsample, 1)\n                new_points *= weights # (batch_size, npoint, nsample, mlp[-1])\n                new_points = tf.reduce_sum(new_points, axis=2, keep_dims=True)\n        elif pooling=='max':\n            new_points = tf.reduce_max(new_points, axis=[2], keep_dims=True)\n        elif pooling=='min':\n            # negate before and after max pooling to obtain a true min pool\n            new_points = -1 * tf.layers.max_pooling2d(-1 * new_points, [1, nsample], [1, 1], padding='VALID', name='minpool1')\n        elif pooling=='max_and_avg':\n            max_points = tf.layers.max_pooling2d(new_points, [1,nsample], [1,1], padding='VALID', name='maxpool1')\n            avg_points = tf.layers.average_pooling2d(new_points, [1,nsample], [1,1], padding='VALID', name='avgpool1')\n            new_points = tf.concat([max_points, avg_points], axis=-1)\n\n        if mlp2 is None: mlp2 = []\n        for i, num_out_channel in enumerate(mlp2):\n            new_points = tf_util2.conv2d(new_points, num_out_channel, [1,1],\n                                        padding='VALID', 
stride=[1,1],\n                                        bn=bn, ibn=ibn, is_training=is_training,\n                                        scope='conv_post_%d'%(i), bn_decay=bn_decay)\n        new_points = tf.squeeze(new_points, [2]) # (batch_size, npoints, mlp2[-1])\n        return new_xyz, new_points, idx\n\ndef pointnet_sa_module_msg(xyz, points, npoint, radius_list, nsample_list, mlp_list, is_training, bn_decay, scope, bn=True, ibn=False, use_xyz=True):\n    ''' PointNet Set Abstraction (SA) module with Multi-Scale Grouping (MSG)\n        Input:\n            xyz: (batch_size, ndataset, 3) TF tensor\n            points: (batch_size, ndataset, channel) TF tensor\n            npoint: int32 -- #points sampled in farthest point sampling\n            radius_list: list of float32 -- search radius in local region\n            nsample_list: list of int32 -- how many points in each local region\n            mlp_list: list of list of int32 -- output size for MLP on each point\n            use_xyz: bool, if True concat XYZ with local point features, otherwise just use point features\n        Return:\n            new_xyz: (batch_size, npoint, 3) TF tensor\n            new_points: (batch_size, npoint, \\sum_k{mlp_list[k][-1]}) TF tensor\n    '''\n    with tf.variable_scope(scope) as sc:\n        new_xyz = gather_point(xyz, farthest_point_sample(npoint, xyz))\n        new_points_list = []\n        for i in range(len(radius_list)):\n            radius = radius_list[i]\n            nsample = nsample_list[i]\n            idx, pts_cnt = query_ball_point(radius, nsample, xyz, new_xyz)\n            grouped_xyz = group_point(xyz, idx)\n            grouped_xyz -= tf.expand_dims(new_xyz, 2)\n            if points is not None:\n                grouped_points = group_point(points, idx)\n                if use_xyz:\n                    grouped_points = tf.concat([grouped_points, grouped_xyz], axis=-1)\n            else:\n                grouped_points = grouped_xyz\n            for j, num_out_channel in 
enumerate(mlp_list[i]):\n                grouped_points = tf_util2.conv2d(grouped_points, num_out_channel, [1,1],\n                                                padding='VALID', stride=[1,1], bn=bn, ibn=ibn, is_training=is_training,\n                                                scope='conv%d_%d'%(i,j), bn_decay=bn_decay)\n            new_points = tf.reduce_max(grouped_points, axis=[2])\n            new_points_list.append(new_points)\n        new_points_concat = tf.concat(new_points_list, axis=-1)\n        return new_xyz, new_points_concat\n\n\ndef pointnet_fp_module(xyz1, xyz2, points1, points2, mlp, is_training, bn_decay, scope, bn=True, ibn=False):\n    ''' PointNet Feature Propagation (FP) Module\n        Input:\n            xyz1: (batch_size, ndataset1, 3) TF tensor\n            xyz2: (batch_size, ndataset2, 3) TF tensor, sparser than xyz1\n            points1: (batch_size, ndataset1, nchannel1) TF tensor\n            points2: (batch_size, ndataset2, nchannel2) TF tensor\n            mlp: list of int32 -- output size for MLP on each point\n        Return:\n            new_points: (batch_size, ndataset1, mlp[-1]) TF tensor\n    '''\n    with tf.variable_scope(scope) as sc:\n        dist, idx = three_nn(xyz1, xyz2)\n        dist = tf.maximum(dist, 1e-10)\n        norm = tf.reduce_sum((1.0/dist),axis=2,keep_dims=True)\n        norm = tf.tile(norm,[1,1,3])\n        weight = (1.0/dist) / norm\n        interpolated_points = three_interpolate(points2, idx, weight)\n\n        if points1 is not None:\n            new_points1 = tf.concat(axis=2, values=[interpolated_points, points1]) # B,ndataset1,nchannel1+nchannel2\n        else:\n            new_points1 = 
interpolated_points\n        new_points1 = tf.expand_dims(new_points1, 2)\n        for i, num_out_channel in enumerate(mlp):\n            new_points1 = tf_util2.conv2d(new_points1, num_out_channel, [1,1],\n                                         padding='VALID', stride=[1,1],\n                                         bn=bn, ibn=ibn,is_training=is_training,\n                                         scope='conv_%d'%(i), bn_decay=bn_decay)\n        new_points1 = tf.squeeze(new_points1, [2]) # B,ndataset1,mlp[-1]\n        return new_points1\n"
  },
  {
    "path": "code/utils/provider.py",
    "content": "import os\nimport sys\nimport numpy as np\nimport h5py\nBASE_DIR = os.path.dirname(os.path.abspath(__file__))\nsys.path.append(BASE_DIR)\n\n# Download dataset for point cloud classification\nDATA_DIR = os.path.join(BASE_DIR, 'data')\nif not os.path.exists(DATA_DIR):\n    os.mkdir(DATA_DIR)\nif not os.path.exists(os.path.join(DATA_DIR, 'modelnet40_ply_hdf5_2048')):\n    www = 'https://shapenet.cs.stanford.edu/media/modelnet40_ply_hdf5_2048.zip'\n    zipfile = os.path.basename(www)\n    os.system('wget %s; unzip %s' % (www, zipfile))\n    os.system('mv %s %s' % (zipfile[:-4], DATA_DIR))\n    os.system('rm %s' % (zipfile))\n\n\ndef shuffle_data(data, labels):\n    \"\"\" Shuffle data and labels.\n        Input:\n          data: B,N,... numpy array\n          label: B,... numpy array\n        Return:\n          shuffled data, label and shuffle indices\n    \"\"\"\n    idx = np.arange(len(labels))\n    np.random.shuffle(idx)\n    return data[idx, ...], labels[idx], idx\n\n\ndef rotate_point_cloud(batch_data):\n    \"\"\" Randomly rotate the point clouds to augment the dataset;\n        rotation is per shape, about the up direction\n        Input:\n          BxNx3 array, original batch of point clouds\n        Return:\n          BxNx3 array, rotated batch of point clouds\n    \"\"\"\n    rotated_data = np.zeros(batch_data.shape, dtype=np.float32)\n    for k in xrange(batch_data.shape[0]):\n        rotation_angle = np.random.uniform() * 2 * np.pi\n        cosval = np.cos(rotation_angle)\n        sinval = np.sin(rotation_angle)\n        rotation_matrix = np.array([[cosval, 0, sinval],\n                                    [0, 1, 0],\n                                    [-sinval, 0, cosval]])\n        shape_pc = batch_data[k, ...]\n        rotated_data[k, ...] 
= np.dot(shape_pc.reshape((-1, 3)), rotation_matrix)\n    return rotated_data\n\n\ndef rotate_point_cloud_by_angle(batch_data, rotation_angle):\n    \"\"\" Rotate the point cloud along up direction with certain angle.\n        Input:\n          BxNx3 array, original batch of point clouds\n        Return:\n          BxNx3 array, rotated batch of point clouds\n    \"\"\"\n    rotated_data = np.zeros(batch_data.shape, dtype=np.float32)\n    for k in xrange(batch_data.shape[0]):\n        #rotation_angle = np.random.uniform() * 2 * np.pi\n        cosval = np.cos(rotation_angle)\n        sinval = np.sin(rotation_angle)\n        rotation_matrix = np.array([[cosval, 0, sinval],\n                                    [0, 1, 0],\n                                    [-sinval, 0, cosval]])\n        shape_pc = batch_data[k, ...]\n        rotated_data[k, ...] = np.dot(shape_pc.reshape((-1, 3)), rotation_matrix)\n    return rotated_data\n\n\ndef rotate_perturbation_point_cloud(batch_data, angle_sigma=0.06, angle_clip=0.18):\n    \"\"\" Randomly perturb the point clouds by small rotations\n        Input:\n          BxNx3 array, original batch of point clouds\n        Return:\n          BxNx3 array, rotated batch of point clouds\n    \"\"\"\n    rotated_data = np.zeros(batch_data.shape, dtype=np.float32)\n    for k in xrange(batch_data.shape[0]):\n        angles = np.clip(angle_sigma*np.random.randn(3), -angle_clip, angle_clip)\n        Rx = np.array([[1,0,0],\n                       [0,np.cos(angles[0]),-np.sin(angles[0])],\n                       [0,np.sin(angles[0]),np.cos(angles[0])]])\n        Ry = np.array([[np.cos(angles[1]),0,np.sin(angles[1])],\n                       [0,1,0],\n                       [-np.sin(angles[1]),0,np.cos(angles[1])]])\n        Rz = np.array([[np.cos(angles[2]),-np.sin(angles[2]),0],\n                       [np.sin(angles[2]),np.cos(angles[2]),0],\n                       [0,0,1]])\n        R = np.dot(Rz, np.dot(Ry,Rx))\n        shape_pc = 
batch_data[k, ...]\n        rotated_data[k, ...] = np.dot(shape_pc.reshape((-1, 3)), R)\n    return rotated_data\n\n\ndef jitter_point_cloud(batch_data, sigma=0.01, clip=0.05):\n    \"\"\" Randomly jitter points. Jittering is per point.\n        Input:\n          BxNx3 array, original batch of point clouds\n        Return:\n          BxNx3 array, jittered batch of point clouds\n    \"\"\"\n    B, N, C = batch_data.shape\n    assert(clip > 0)\n    jittered_data = np.clip(sigma * np.random.randn(B, N, C), -1*clip, clip)\n    jittered_data += batch_data\n    return jittered_data\n\ndef shift_point_cloud(batch_data, shift_range=0.1):\n    \"\"\" Randomly shift point cloud. Shift is per point cloud.\n        Input:\n          BxNx3 array, original batch of point clouds\n        Return:\n          BxNx3 array, shifted batch of point clouds\n    \"\"\"\n    B, N, C = batch_data.shape\n    shifts = np.random.uniform(-shift_range, shift_range, (B,3))\n    for batch_index in range(B):\n        batch_data[batch_index,:,:] += shifts[batch_index,:]\n    return batch_data\n\n\ndef random_scale_point_cloud(batch_data, scale_low=0.8, scale_high=1.25):\n    \"\"\" Randomly scale the point cloud. Scale is per point cloud.\n        Input:\n            BxNx3 array, original batch of point clouds\n        Return:\n            BxNx3 array, scaled batch of point clouds\n    \"\"\"\n    B, N, C = batch_data.shape\n    scales = np.random.uniform(scale_low, scale_high, B)\n    for batch_index in range(B):\n        batch_data[batch_index,:,:] *= scales[batch_index]\n    return batch_data\n\ndef getDataFiles(list_filename):\n    return [line.rstrip() for line in open(list_filename)]\n\ndef load_h5(h5_filename):\n    f = h5py.File(h5_filename, 'r')\n    data = f['data'][:]\n    label = f['label'][:]\n    return (data, label)\n\ndef loadDataFile(filename):\n    return load_h5(filename)\n"
  },
  {
    "path": "code/utils/show3d.py",
    "content": "'''\r\n\r\nThe default behavior is to visualize the points as white dots\r\n>>>show3d.showpoints(np.random.rand(10000,3))\r\n\r\nControl:\r\nkey q: quit\r\nkey Q: sys.exit(0)\r\nkey n: zoom in\r\nkey m: zoom out\r\nkey s: save screenshot to 'show3d.png'\r\nMouse: rotate\r\n\r\nYou can also play a video by specifying waittime\r\n>>>[show3d.showpoints(np.random.rand(10000,3),waittime=10) for i in xrange(10000)]\r\n\r\nColor can also be useful\r\n>>>green=np.linspace(0,1,10000)\r\n>>>red=np.linspace(1,0,10000)\r\n>>>blue=np.linspace(1,0,10000)**2\r\n>>>show3d.showpoints(np.random.rand(10000,3),green,red,blue)\r\n\r\nAdditional Parameters\r\n---------------------\r\nnormalizecolor:\r\n    if True (default), scale the maximum color to 1 for each channel.\r\nmagnifyBlue:\r\n    if True, magnify the blue dots to make them more visible\r\nbackground:\r\n    the background color. Defaults to black (0,0,0)\r\nfreezerot:\r\n    disable rotation\r\n\r\n'''\r\n\r\n\r\nimport numpy as np\r\n# import cv2\r\nimport sys\r\nshowsz=800\r\nmousex,mousey=0.5,0.5\r\nzoom=1.0\r\nchanged=True\r\ndef onmouse(*args):\r\n    global mousex,mousey,changed\r\n    y=args[1]\r\n    x=args[2]\r\n    mousex=x/float(showsz)\r\n    mousey=y/float(showsz)\r\n    changed=True\r\n\r\ndef showpoints(xyz,c0=None,c1=None,c2=None,waittime=0,showrot=False,magnifyBlue=0,freezerot=False,background=(0,0,0),normalizecolor=True):\r\n    # cv2.namedWindow('show3d')\r\n    # cv2.moveWindow('show3d', 0, 0)\r\n    # cv2.setMouseCallback('show3d', onmouse)\r\n\r\n    global showsz,mousex,mousey,zoom,changed\r\n    if len(xyz.shape)!=2 or xyz.shape[1]!=3:\r\n        raise Exception('showpoints expects (n,3) shape for xyz')\r\n    if c0 is not None and c0.shape!=xyz.shape[:1]:\r\n        raise Exception('showpoints expects (n,) shape for c0')\r\n    if c1 is not None and c1.shape!=xyz.shape[:1]:\r\n        raise Exception('showpoints expects (n,) shape for c1')\r\n    if c2 is not None and 
c2.shape!=xyz.shape[:1]:\r\n        raise Exception('showpoints expects (n,) shape for c2')\r\n    xyz=xyz-xyz.mean(axis=0)\r\n    radius=((xyz**2).sum(axis=-1)**0.5).max()\r\n    xyz/=(radius*2.2)/showsz\r\n    if c0 is None:\r\n        c0=np.zeros((len(xyz),),dtype='float32')+255\r\n    if c1 is None:\r\n        c1=c0\r\n    if c2 is None:\r\n        c2=c0\r\n    if normalizecolor:\r\n        c0=c0/((c0.max()+1e-14)/255.0)\r\n        c1=c1/((c1.max()+1e-14)/255.0)\r\n        c2=c2/((c2.max()+1e-14)/255.0)\r\n\r\n    show=np.zeros((showsz,showsz,3),dtype='uint8')\r\n    def render():\r\n        rotmat=np.eye(3)\r\n        if not freezerot:\r\n            xangle=(mousey-0.5)*np.pi*1.2\r\n        else:\r\n            xangle=0\r\n        rotmat=rotmat.dot(np.array([\r\n            [1.0,0.0,0.0],\r\n            [0.0,np.cos(xangle),-np.sin(xangle)],\r\n            [0.0,np.sin(xangle),np.cos(xangle)],\r\n            ]))\r\n        if not freezerot:\r\n            yangle=(mousex-0.5)*np.pi*1.2\r\n        else:\r\n            yangle=0\r\n        rotmat=rotmat.dot(np.array([\r\n            [np.cos(yangle),0.0,-np.sin(yangle)],\r\n            [0.0,1.0,0.0],\r\n            [np.sin(yangle),0.0,np.cos(yangle)],\r\n            ]))\r\n        rotmat*=zoom\r\n        nxyz=xyz.dot(rotmat)\r\n        nz=nxyz[:,2].argsort()\r\n        nxyz=nxyz[nz]\r\n        nxyz=(nxyz[:,:2]+[showsz/2,showsz/2]).astype('int32')\r\n        p=nxyz[:,0]*showsz+nxyz[:,1]\r\n        show[:]=background\r\n        m=(nxyz[:,0]>=0)*(nxyz[:,0]<showsz)*(nxyz[:,1]>=0)*(nxyz[:,1]<showsz)\r\n        show.reshape((showsz*showsz,3))[p[m],1]=c0[nz][m]\r\n        show.reshape((showsz*showsz,3))[p[m],2]=c1[nz][m]\r\n        show.reshape((showsz*showsz,3))[p[m],0]=c2[nz][m]\r\n        if magnifyBlue>0:\r\n            show[:,:,0]=np.maximum(show[:,:,0],np.roll(show[:,:,0],1,axis=0))\r\n            if magnifyBlue>=2:\r\n                show[:,:,0]=np.maximum(show[:,:,0],np.roll(show[:,:,0],-1,axis=0))\r\n            
show[:,:,0]=np.maximum(show[:,:,0],np.roll(show[:,:,0],1,axis=1))\r\n            if magnifyBlue>=2:\r\n                show[:,:,0]=np.maximum(show[:,:,0],np.roll(show[:,:,0],-1,axis=1))\r\n        if showrot:\r\n            print(\"Show\")\r\n            # cv2.putText(show,'xangle %d'%(int(xangle/np.pi*180)),(30,showsz-30),0,0.5,cv2.cv.CV_RGB(255,0,0))\r\n            # cv2.putText(show,'yangle %d'%(int(yangle/np.pi*180)),(30,showsz-50),0,0.5,cv2.cv.CV_RGB(255,0,0))\r\n            # cv2.putText(show,'zoom %d%%'%(int(zoom*100)),(30,showsz-70),0,0.5,cv2.cv.CV_RGB(255,0,0))\r\n    changed=True\r\n    cmd=-1  # default key code; the cv2.waitKey calls below are commented out\r\n    while True:\r\n        if changed:\r\n            render()\r\n            changed=False\r\n        # cv2.imshow('show3d',show)\r\n        if waittime==0:\r\n            # cmd=cv2.waitKey(10)%256\r\n            print(\"Some waittime small\")\r\n        else:\r\n            print(\"Some waittime\")\r\n            # cmd=cv2.waitKey(waittime)%256\r\n        if cmd==ord('q'):\r\n            break\r\n        elif cmd==ord('Q'):\r\n            sys.exit(0)\r\n
        if cmd==ord('n'):\r\n            zoom*=1.1\r\n            changed=True\r\n        elif cmd==ord('m'):\r\n            zoom/=1.1\r\n            changed=True\r\n        elif cmd==ord('r'):\r\n            zoom=1.0\r\n            changed=True\r\n        elif cmd==ord('s'):\r\n            print(\"to show img\")\r\n            # cv2.imwrite('show3d.png',show)\r\n        if waittime!=0:\r\n            break\r\n    return cmd\r\nif __name__=='__main__':\r\n    showpoints(np.random.rand(10000,3))\r\n    green=np.linspace(0,1,10000)\r\n    red=np.linspace(1,0,10000)**0.5\r\n    blue=np.linspace(1,0,10000)\r\n    showpoints(np.random.rand(10000,3),green,red,blue,magnifyBlue=True)\r\n"
  },
  {
    "path": "code/utils/tf_util.py",
    "content": "\"\"\" Wrapper functions for TensorFlow layers.\n\nAuthor: Charles R. Qi\nDate: November 2017\n\"\"\"\n\nimport numpy as np\nimport tensorflow as tf\n\ndef _variable_on_cpu(name, shape, initializer, use_fp16=False):\n  \"\"\"Helper to create a Variable stored on CPU memory.\n  Args:\n    name: name of the variable\n    shape: list of ints\n    initializer: initializer for Variable\n  Returns:\n    Variable Tensor\n  \"\"\"\n  dtype = tf.float16 if use_fp16 else tf.float32\n  var = tf.get_variable(name, shape, initializer=initializer, dtype=dtype)\n  return var\n\ndef _variable_with_weight_decay(name, shape, stddev, wd, use_xavier=True):\n  \"\"\"Helper to create an initialized Variable with weight decay.\n\n  Note that the Variable is initialized with a truncated normal distribution.\n  A weight decay is added only if one is specified.\n\n  Args:\n    name: name of the variable\n    shape: list of ints\n    stddev: standard deviation of a truncated Gaussian\n    wd: add L2Loss weight decay multiplied by this float. 
If None, weight\n        decay is not added for this Variable.\n    use_xavier: bool, whether to use xavier initializer\n\n  Returns:\n    Variable Tensor\n  \"\"\"\n  if use_xavier:\n    initializer = tf.contrib.layers.xavier_initializer()\n  else:\n    initializer = tf.truncated_normal_initializer(stddev=stddev)\n  var = _variable_on_cpu(name, shape, initializer)\n  if wd is not None:\n    weight_decay = tf.multiply(tf.nn.l2_loss(var), wd, name='weight_loss')\n    tf.add_to_collection('losses', weight_decay)\n  return var\n\n\ndef conv1d(inputs,\n           num_output_channels,\n           kernel_size,\n           scope,\n           stride=1,\n           padding='SAME',\n           use_xavier=True,\n           stddev=1e-3,\n           weight_decay=0.0,\n           activation_fn=tf.nn.relu,\n           bn=False,\n           bn_decay=None,\n           is_training=None,\n           use_bias = True):\n  \"\"\" 1D convolution with non-linear operation.\n\n  Args:\n    inputs: 3-D tensor variable BxLxC\n    num_output_channels: int\n    kernel_size: int\n    scope: string\n    stride: int\n    padding: 'SAME' or 'VALID'\n    use_xavier: bool, use xavier_initializer if true\n    stddev: float, stddev for truncated_normal init\n    weight_decay: float\n    activation_fn: function\n    bn: bool, whether to use batch norm\n    bn_decay: float or float tensor variable in [0,1]\n    is_training: bool Tensor variable\n\n  Returns:\n    Variable tensor\n  \"\"\"\n  with tf.variable_scope(scope) as sc:\n    num_in_channels = inputs.get_shape()[-1].value\n    kernel_shape = [kernel_size,\n                    num_in_channels, num_output_channels]\n    kernel = _variable_with_weight_decay('weights',\n                                         shape=kernel_shape,\n                                         use_xavier=use_xavier,\n                                         stddev=stddev,\n                                         wd=weight_decay)\n    outputs = tf.nn.conv1d(inputs, 
kernel,\n                           stride=stride,\n                           padding=padding)\n    if use_bias:\n        biases = _variable_on_cpu('biases', [num_output_channels],\n                                tf.constant_initializer(0.0))\n        outputs = tf.nn.bias_add(outputs, biases)\n\n    if bn:\n      outputs = batch_norm_for_conv1d(outputs, is_training,\n                                      bn_decay=bn_decay, scope='bn')\n\n    if activation_fn is not None:\n      outputs = activation_fn(outputs)\n    return outputs\n\n\n\n\ndef conv2d(inputs,\n           num_output_channels,\n           kernel_size,\n           scope,\n           stride=[1, 1],\n           padding='SAME',\n           use_xavier=True,\n           stddev=1e-3,\n           weight_decay=0.00001,\n           activation_fn=tf.nn.relu,\n           bn=False,\n           bn_decay=None,\n           is_training=None):\n  \"\"\" 2D convolution with non-linear operation.\n\n  Args:\n    inputs: 4-D tensor variable BxHxWxC\n    num_output_channels: int\n    kernel_size: a list of 2 ints\n    scope: string\n    stride: a list of 2 ints\n    padding: 'SAME' or 'VALID'\n    use_xavier: bool, use xavier_initializer if true\n    stddev: float, stddev for truncated_normal init\n    weight_decay: float\n    activation_fn: function\n    bn: bool, whether to use batch norm\n    bn_decay: float or float tensor variable in [0,1]\n    is_training: bool Tensor variable\n\n  Returns:\n    Variable tensor\n  \"\"\"\n  with tf.variable_scope(scope) as sc:\n      kernel_h, kernel_w = kernel_size\n      num_in_channels = inputs.get_shape()[-1].value\n      kernel_shape = [kernel_h, kernel_w,\n                      num_in_channels, num_output_channels]\n      kernel = _variable_with_weight_decay('weights',\n                                           shape=kernel_shape,\n                                           use_xavier=use_xavier,\n                                           stddev=stddev,\n                     
                      wd=weight_decay)\n      stride_h, stride_w = stride\n      outputs = tf.nn.conv2d(inputs, kernel,\n                             [1, stride_h, stride_w, 1],\n                             padding=padding)\n      biases = _variable_on_cpu('biases', [num_output_channels],\n                                tf.constant_initializer(0.0))\n      outputs = tf.nn.bias_add(outputs, biases)\n\n      if bn:\n        outputs = batch_norm_for_conv2d(outputs, is_training,\n                                        bn_decay=bn_decay, scope='bn')\n\n      if activation_fn is not None:\n        outputs = activation_fn(outputs)\n      return outputs\n\n\ndef conv2d_transpose(inputs,\n                     num_output_channels,\n                     kernel_size,\n                     scope,\n                     stride=[1, 1],\n                     padding='SAME',\n                     use_xavier=True,\n                     stddev=1e-3,\n                     weight_decay=0.0,\n                     activation_fn=tf.nn.relu,\n                     bn=False,\n                     bn_decay=None,\n                     is_training=None):\n  \"\"\" 2D convolution transpose with non-linear operation.\n\n  Args:\n    inputs: 4-D tensor variable BxHxWxC\n    num_output_channels: int\n    kernel_size: a list of 2 ints\n    scope: string\n    stride: a list of 2 ints\n    padding: 'SAME' or 'VALID'\n    use_xavier: bool, use xavier_initializer if true\n    stddev: float, stddev for truncated_normal init\n    weight_decay: float\n    activation_fn: function\n    bn: bool, whether to use batch norm\n    bn_decay: float or float tensor variable in [0,1]\n    is_training: bool Tensor variable\n\n  Returns:\n    Variable tensor\n\n  Note: conv2d(conv2d_transpose(a, num_out, ksize, stride), a.shape[-1], ksize, stride) == a\n  \"\"\"\n  with tf.variable_scope(scope) as sc:\n      kernel_h, kernel_w = kernel_size\n      num_in_channels = inputs.get_shape()[-1].value\n      kernel_shape = 
[kernel_h, kernel_w,\n                      num_output_channels, num_in_channels] # reversed compared to conv2d\n      kernel = _variable_with_weight_decay('weights',\n                                           shape=kernel_shape,\n                                           use_xavier=use_xavier,\n                                           stddev=stddev,\n                                           wd=weight_decay)\n      stride_h, stride_w = stride\n      \n      # from slim.convolution2d_transpose\n      def get_deconv_dim(dim_size, stride_size, kernel_size, padding):\n          dim_size *= stride_size\n\n          if padding == 'VALID' and dim_size is not None:\n            dim_size += max(kernel_size - stride_size, 0)\n          return dim_size\n\n      # calculate output shape\n      batch_size = inputs.get_shape()[0].value\n      height = inputs.get_shape()[1].value\n      width = inputs.get_shape()[2].value\n      out_height = get_deconv_dim(height, stride_h, kernel_h, padding)\n      out_width = get_deconv_dim(width, stride_w, kernel_w, padding)\n      output_shape = [batch_size, out_height, out_width, num_output_channels]\n\n
      outputs = tf.nn.conv2d_transpose(inputs, kernel, output_shape,\n                             [1, stride_h, stride_w, 1],\n                             padding=padding)\n      biases = _variable_on_cpu('biases', [num_output_channels],\n                                tf.constant_initializer(0.0))\n      outputs = tf.nn.bias_add(outputs, biases)\n\n      if bn:\n        outputs = batch_norm_for_conv2d(outputs, is_training,\n                                        bn_decay=bn_decay, scope='bn')\n\n      if activation_fn is not None:\n        outputs = activation_fn(outputs)\n      return outputs\n\n   \n\ndef conv3d(inputs,\n           num_output_channels,\n           kernel_size,\n           scope,\n           stride=[1, 1, 1],\n           padding='SAME',\n           use_xavier=True,\n           stddev=1e-3,\n           weight_decay=0.0,\n    
       activation_fn=tf.nn.relu,\n           bn=False,\n           bn_decay=None,\n           is_training=None):\n  \"\"\" 3D convolution with non-linear operation.\n\n  Args:\n    inputs: 5-D tensor variable BxDxHxWxC\n    num_output_channels: int\n    kernel_size: a list of 3 ints\n    scope: string\n    stride: a list of 3 ints\n    padding: 'SAME' or 'VALID'\n    use_xavier: bool, use xavier_initializer if true\n    stddev: float, stddev for truncated_normal init\n    weight_decay: float\n    activation_fn: function\n    bn: bool, whether to use batch norm\n    bn_decay: float or float tensor variable in [0,1]\n    is_training: bool Tensor variable\n\n  Returns:\n    Variable tensor\n  \"\"\"\n  with tf.variable_scope(scope) as sc:\n    kernel_d, kernel_h, kernel_w = kernel_size\n    num_in_channels = inputs.get_shape()[-1].value\n    kernel_shape = [kernel_d, kernel_h, kernel_w,\n                    num_in_channels, num_output_channels]\n    kernel = _variable_with_weight_decay('weights',\n                                         shape=kernel_shape,\n                                         use_xavier=use_xavier,\n                                         stddev=stddev,\n                                         wd=weight_decay)\n    stride_d, stride_h, stride_w = stride\n    outputs = tf.nn.conv3d(inputs, kernel,\n                           [1, stride_d, stride_h, stride_w, 1],\n                           padding=padding)\n    biases = _variable_on_cpu('biases', [num_output_channels],\n                              tf.constant_initializer(0.0))\n    outputs = tf.nn.bias_add(outputs, biases)\n    \n    if bn:\n      outputs = batch_norm_for_conv3d(outputs, is_training,\n                                      bn_decay=bn_decay, scope='bn')\n\n    if activation_fn is not None:\n      outputs = activation_fn(outputs)\n    return outputs\n\ndef fully_connected(inputs,\n                    num_outputs,\n                    scope,\n                    
use_xavier=True,\n                    stddev=1e-3,\n                    weight_decay=0.0,\n                    activation_fn=tf.nn.relu,\n                    bn=False,\n                    bn_decay=None,\n                    is_training=None):\n  \"\"\" Fully connected layer with non-linear operation.\n  \n  Args:\n    inputs: 2-D tensor BxN\n    num_outputs: int\n  \n  Returns:\n    Variable tensor of size B x num_outputs.\n  \"\"\"\n  with tf.variable_scope(scope) as sc:\n    num_input_units = inputs.get_shape()[-1].value\n    weights = _variable_with_weight_decay('weights',\n                                          shape=[num_input_units, num_outputs],\n                                          use_xavier=use_xavier,\n                                          stddev=stddev,\n                                          wd=weight_decay)\n    outputs = tf.matmul(inputs, weights)\n    biases = _variable_on_cpu('biases', [num_outputs],\n                             tf.constant_initializer(0.0))\n    outputs = tf.nn.bias_add(outputs, biases)\n     \n    if bn:\n      outputs = batch_norm_for_fc(outputs, is_training, bn_decay, 'bn')\n\n    if activation_fn is not None:\n      outputs = activation_fn(outputs)\n    return outputs\n\n\ndef max_pool2d(inputs,\n               kernel_size,\n               scope,\n               stride=[2, 2],\n               padding='VALID'):\n  \"\"\" 2D max pooling.\n\n  Args:\n    inputs: 4-D tensor BxHxWxC\n    kernel_size: a list of 2 ints\n    stride: a list of 2 ints\n  \n  Returns:\n    Variable tensor\n  \"\"\"\n  with tf.variable_scope(scope) as sc:\n    kernel_h, kernel_w = kernel_size\n    stride_h, stride_w = stride\n    outputs = tf.nn.max_pool(inputs,\n                             ksize=[1, kernel_h, kernel_w, 1],\n                             strides=[1, stride_h, stride_w, 1],\n                             padding=padding,\n                             name=sc.name)\n    return outputs\n\ndef avg_pool2d(inputs,\n              
 kernel_size,\n               scope,\n               stride=[2, 2],\n               padding='VALID'):\n  \"\"\" 2D avg pooling.\n\n  Args:\n    inputs: 4-D tensor BxHxWxC\n    kernel_size: a list of 2 ints\n    stride: a list of 2 ints\n  \n  Returns:\n    Variable tensor\n  \"\"\"\n  with tf.variable_scope(scope) as sc:\n    kernel_h, kernel_w = kernel_size\n    stride_h, stride_w = stride\n    outputs = tf.nn.avg_pool(inputs,\n                             ksize=[1, kernel_h, kernel_w, 1],\n                             strides=[1, stride_h, stride_w, 1],\n                             padding=padding,\n                             name=sc.name)\n    return outputs\n\n\ndef max_pool3d(inputs,\n               kernel_size,\n               scope,\n               stride=[2, 2, 2],\n               padding='VALID'):\n  \"\"\" 3D max pooling.\n\n  Args:\n    inputs: 5-D tensor BxDxHxWxC\n    kernel_size: a list of 3 ints\n    stride: a list of 3 ints\n  \n  Returns:\n    Variable tensor\n  \"\"\"\n  with tf.variable_scope(scope) as sc:\n    kernel_d, kernel_h, kernel_w = kernel_size\n    stride_d, stride_h, stride_w = stride\n    outputs = tf.nn.max_pool3d(inputs,\n                               ksize=[1, kernel_d, kernel_h, kernel_w, 1],\n                               strides=[1, stride_d, stride_h, stride_w, 1],\n                               padding=padding,\n                               name=sc.name)\n    return outputs\n\ndef avg_pool3d(inputs,\n               kernel_size,\n               scope,\n               stride=[2, 2, 2],\n               padding='VALID'):\n  \"\"\" 3D avg pooling.\n\n  Args:\n    inputs: 5-D tensor BxDxHxWxC\n    kernel_size: a list of 3 ints\n    stride: a list of 3 ints\n  \n  Returns:\n    Variable tensor\n  \"\"\"\n  with tf.variable_scope(scope) as sc:\n    kernel_d, kernel_h, kernel_w = kernel_size\n    stride_d, stride_h, stride_w = stride\n    outputs = tf.nn.avg_pool3d(inputs,\n                               ksize=[1, kernel_d, 
kernel_h, kernel_w, 1],\n                               strides=[1, stride_d, stride_h, stride_w, 1],\n                               padding=padding,\n                               name=sc.name)\n    return outputs\n\n\n\n\n\ndef batch_norm_template(inputs, is_training, scope, moments_dims, bn_decay):\n  \"\"\" Batch normalization on convolutional maps and beyond...\n  Ref.: http://stackoverflow.com/questions/33949786/how-could-i-use-batch-normalization-in-tensorflow\n  \n  Args:\n      inputs:        Tensor, k-D input ... x C could be BC or BHWC or BDHWC\n      is_training:   boolean tf.Variable, true indicates training phase\n      scope:         string, variable scope\n      moments_dims:  a list of ints, indicating dimensions for moments calculation\n      bn_decay:      float or float tensor variable, controlling moving average weight\n  Return:\n      normed:        batch-normalized maps\n  \"\"\"\n  # For support of GAN\n  #bn_decay = bn_decay if bn_decay is not None else 0.9\n  #return tf.contrib.layers.batch_norm(inputs, \n  #                                    center=True, scale=True, \n  #                                    is_training=is_training, decay=bn_decay,updates_collections=None,\n  #                                    scope=scope)\n
  with tf.variable_scope(scope) as sc:\n    num_channels = inputs.get_shape()[-1].value\n    beta = tf.Variable(tf.constant(0.0, shape=[num_channels]),\n                       name='beta', trainable=True)\n    gamma = tf.Variable(tf.constant(1.0, shape=[num_channels]),\n                        name='gamma', trainable=True)\n    \n    batch_mean, batch_var = tf.nn.moments(inputs, moments_dims, name='moments')\n    decay = bn_decay if bn_decay is not None else 0.9\n    ema = tf.train.ExponentialMovingAverage(decay=decay)\n    # Operator that maintains moving averages of variables.\n    # Need to set reuse=False, otherwise if reuse, will see moments_1/mean/ExponentialMovingAverage/ does not exist\n    # https://github.com/shekkizh/WassersteinGAN.tensorflow/issues/3\n    with tf.variable_scope(tf.get_variable_scope(), reuse=False):\n        ema_apply_op = tf.cond(is_training,\n                               lambda: ema.apply([batch_mean, batch_var]),\n                               lambda: tf.no_op())\n    \n    # Update moving average and return current batch's avg and var.\n    def mean_var_with_update():\n      with tf.control_dependencies([ema_apply_op]):\n        return tf.identity(batch_mean), tf.identity(batch_var)\n    \n    # ema.average returns the Variable holding the average of var.\n    mean, var = tf.cond(is_training,\n                        mean_var_with_update,\n                        lambda: (ema.average(batch_mean), ema.average(batch_var)))\n    normed = tf.nn.batch_normalization(inputs, mean, var, beta, gamma, 1e-3)\n  return normed\n\n
def batch_norm_for_fc(inputs, is_training, bn_decay, scope):\n  \"\"\" Batch normalization on FC data.\n  \n  Args:\n      inputs:      Tensor, 2D BxC input\n      is_training: boolean tf.Variable, true indicates training phase\n      bn_decay:    float or float tensor variable, controlling moving average weight\n      scope:       string, variable scope\n  Return:\n      normed:      batch-normalized maps\n  \"\"\"\n  return batch_norm_template(inputs, is_training, scope, [0,], bn_decay)\n\n\ndef batch_norm_for_conv1d(inputs, is_training, bn_decay, scope):\n  \"\"\" Batch normalization on 1D convolutional maps.\n  \n  Args:\n      inputs:      Tensor, 3D BLC input maps\n      is_training: boolean tf.Variable, true indicates training phase\n      bn_decay:    float or float tensor variable, controlling moving average weight\n      scope:       string, variable scope\n  Return:\n      normed:      batch-normalized maps\n  \"\"\"\n  return batch_norm_template(inputs, is_training, scope, [0,1], bn_decay)\n\n\n\n  \n
def batch_norm_for_conv2d(inputs, is_training, bn_decay, scope):\n  \"\"\" Batch normalization on 2D convolutional maps.\n  \n  Args:\n      inputs:      Tensor, 4D BHWC input maps\n      is_training: boolean tf.Variable, true indicates training phase\n      bn_decay:    float or float tensor variable, controlling moving average weight\n      scope:       string, variable scope\n  Return:\n      normed:      batch-normalized maps\n  \"\"\"\n  return batch_norm_template(inputs, is_training, scope, [0,1,2], bn_decay)\n\n\n\ndef batch_norm_for_conv3d(inputs, is_training, bn_decay, scope):\n  \"\"\" Batch normalization on 3D convolutional maps.\n  \n  Args:\n      inputs:      Tensor, 5D BDHWC input maps\n      is_training: boolean tf.Variable, true indicates training phase\n      bn_decay:    float or float tensor variable, controlling moving average weight\n      scope:       string, variable scope\n  Return:\n      normed:      batch-normalized maps\n  \"\"\"\n  return batch_norm_template(inputs, is_training, scope, [0,1,2,3], bn_decay)\n\n\n
def dropout(inputs,\n            is_training,\n            scope,\n            keep_prob=0.5,\n            noise_shape=None):\n  \"\"\" Dropout layer.\n\n  Args:\n    inputs: tensor\n    is_training: boolean tf.Variable\n    scope: string\n    keep_prob: float in [0,1]\n    noise_shape: list of ints\n\n  Returns:\n    tensor variable\n  \"\"\"\n  with tf.variable_scope(scope) as sc:\n    outputs = tf.cond(is_training,\n                      lambda: tf.nn.dropout(inputs, keep_prob, noise_shape),\n                      lambda: inputs)\n    return outputs\n"
  },
  {
    "path": "code/utils/tf_util2.py",
    "content": "import tensorflow as tf\n\ndef lrelu(x, alpha=0.2):\n  return tf.nn.relu(x) - alpha * tf.nn.relu(-x)\n\n\n# def lrelu2(x, leak=0.2, name=\"lrelu\"):\n#     with tf.variable_scope(name):\n#         f1 = 0.5 * (1 + leak)\n#         f2 = 0.5 * (1 - leak)\n#         return f1 * x + f2 * abs(x)\n\ndef instance_norm(net, train=True,weight_decay=0.00001):\n    batch, rows, cols, channels = [i.value for i in net.get_shape()]\n    var_shape = [channels]\n    mu, sigma_sq = tf.nn.moments(net, [1, 2], keep_dims=True)\n\n    shift = tf.get_variable('shift',shape=var_shape,\n                            initializer=tf.zeros_initializer,\n                            regularizer=tf.contrib.layers.l2_regularizer(weight_decay))\n    scale = tf.get_variable('scale', shape=var_shape,\n                            initializer=tf.ones_initializer,\n                            regularizer=tf.contrib.layers.l2_regularizer(weight_decay))\n    epsilon = 1e-3\n    normalized = (net - mu) / tf.square(sigma_sq + epsilon)\n    return scale * normalized + shift\n\ndef conv2d(inputs,\n           num_output_channels,\n           kernel_size,\n           scope=None,\n           stride=[1, 1],\n           padding='SAME',\n           use_xavier=True,\n           stddev=1e-3,\n           weight_decay=0.00001,\n           activation_fn=tf.nn.relu,\n           bn=False,\n           ibn = False,\n           bn_decay=None,\n           use_bias = True,\n           is_training=None,\n           reuse=None):\n  \"\"\" 2D convolution with non-linear operation.\n\n  Args:\n    inputs: 4-D tensor variable BxHxWxC\n    num_output_channels: int\n    kernel_size: a list of 2 ints\n    scope: string\n    stride: a list of 2 ints\n    padding: 'SAME' or 'VALID'\n    use_xavier: bool, use xavier_initializer if true\n    stddev: float, stddev for truncated_normal init\n    weight_decay: float\n    activation_fn: function\n    bn: bool, whether to use batch norm\n    bn_decay: float or float tensor 
variable in [0,1]\n    is_training: bool Tensor variable\n\n  Returns:\n    Variable tensor\n  \"\"\"\n  with tf.variable_scope(scope,reuse=reuse) as sc:\n      if use_xavier:\n          initializer = tf.contrib.layers.xavier_initializer()\n      else:\n          initializer = tf.truncated_normal_initializer(stddev=stddev)\n\n      outputs = tf.layers.conv2d(inputs,num_output_channels,kernel_size,stride,padding,\n                                 kernel_initializer=initializer,\n                                 kernel_regularizer=tf.contrib.layers.l2_regularizer(weight_decay),\n                                 bias_regularizer=tf.contrib.layers.l2_regularizer(weight_decay),\n                                 use_bias=use_bias,reuse=None)\n      assert not (bn and ibn)\n      if bn:\n          outputs = tf.layers.batch_normalization(outputs,momentum=bn_decay,training=is_training,renorm=False,fused=True)\n          #outputs = tf.contrib.layers.batch_norm(outputs,is_training=is_training)\n      if ibn:\n          outputs = instance_norm(outputs,is_training)\n\n\n      if activation_fn is not None:\n        outputs = activation_fn(outputs)\n\n      return outputs\n\n\ndef fully_connected(inputs,\n                    num_outputs,\n                    scope,\n                    use_xavier=True,\n                    stddev=1e-3,\n                    weight_decay=0.00001,\n                    activation_fn=tf.nn.relu,\n                    bn=False,\n                    bn_decay=None,\n                    use_bias = True,\n                    is_training=None):\n    \"\"\" Fully connected layer with non-linear operation.\n\n    Args:\n      inputs: 2-D tensor BxN\n      num_outputs: int\n\n    Returns:\n      Variable tensor of size B x num_outputs.\n    \"\"\"\n\n    with tf.variable_scope(scope) as sc:\n        if use_xavier:\n            initializer = tf.contrib.layers.xavier_initializer()\n        else:\n            initializer = 
tf.truncated_normal_initializer(stddev=stddev)\n\n        outputs = tf.layers.dense(inputs,num_outputs,\n                                  use_bias=use_bias,kernel_initializer=initializer,\n                                  kernel_regularizer=tf.contrib.layers.l2_regularizer(weight_decay),\n                                  bias_regularizer=tf.contrib.layers.l2_regularizer(weight_decay),\n                                  reuse=None)\n\n        if bn:\n            outputs = tf.layers.batch_normalization(outputs, momentum=bn_decay, training=is_training, renorm=False)\n\n        if activation_fn is not None:\n            outputs = activation_fn(outputs)\n\n        return outputs\n"
  },
  {
    "path": "code/utils/write_result2html.py",
    "content": "import os\nimport numpy as np\nfrom tqdm import tqdm\nfrom utils import pc_util\nfrom scipy.misc import imsave\n\ndef write_result():\n    root_path = \"/home/lqyu/server/proj49/PointSR_data/test_data/our_collected_data\"\n    model_names = ['1024_nonormal_generator2_2', '1024_nonormal_generator2_2_uniformloss',\n                   '1024_nonormal_generator2_2_recursive']\n\n    index_path = os.path.join(\"index.html\")\n    index = open(index_path, \"w\")\n    index.write(\"<html><body><table><tr>\")\n    index.write(\"<th width='5%%'>name</th>\")\n\n    index.write(\"<tr><th></th>\")\n    for model in model_names:\n        index.write(\"<th>%s</th>\" % model)\n    index.write(\"</tr>\")\n\n    # get sample list\n    items = os.listdir(root_path + \"/\" + model_names[0])\n    items.sort()\n\n    # mkdir model image path\n    for model in model_names:\n        if not os.path.exists(root_path + \"/\" + model + \"_three_view_img/\"):\n            os.makedirs(root_path + \"/\" + model + \"_three_view_img/\")\n\n    # write img to file\n    for item in tqdm(items):\n        index.write(\"<tr>\")\n        index.write(\"<td>%s</td>\" % item)\n\n        # write prediction\n        for model in model_names:\n            path = root_path + \"/\" + model +\"/\" + item\n            if not os.path.exists(path):\n                continue\n            img_path = root_path + \"/\" + model + \"_three_view_img/\" + item\n            img_path = img_path.replace(\"xyz\", \"png\")\n            if not os.path.exists(img_path):\n                data = np.loadtxt(path)\n                data = data[:, 0:3]\n                img = pc_util.point_cloud_three_views(data, diameter=8)\n                imsave(img_path, img)\n            index.write(\"<td><img width='100%%', src='%s'></td>\" % img_path)\n        index.write(\"</tr>\")\n    index.close()\n\n\ndef write_result2html_benchmark():\n    root_path = \"/home/lqyu/server/proj49/PointSR_data/test_data/our_collected_data\"\n   
 phase = 'surface_benchmark'\n    input_path =\"../data/\"+phase+\"/1024_nonuniform\"\n    gt_path = \"../data/\"+phase+\"/4096\"\n    model_names = ['1024_nonormal_generator2_2','1024_nonormal_generator2_2_uniformloss','1024_nonormal_generator2_2_recursive']\n\n\n    index_path = os.path.join(root_path, phase + \"_index.html\")\n    index = open(index_path, \"w\")\n    index.write(\"<html><body><table><tr>\")\n    index.write(\"<th width='5%%'>name</th><th>Input</th>\")\n    index.write(\"<th>Reference GT</th></tr>\")\n\n    index.write(\"<tr><th></th>\")\n    for model in model_names:\n        index.write(\"<th>%s</th>\" % model)\n    index.write(\"</tr>\")\n\n    # get sample list\n    items = os.listdir(root_path + \"/\" + model_names[0] + \"/result/\" + phase)\n    items.sort()\n\n    # mkdir model image path\n    for model in model_names:\n        if not os.path.exists(root_path + \"/\" + model + \"/result/\" + phase + \"_three_view_img/\"):\n            os.makedirs(root_path + \"/\" + model + \"/result/\" + phase + \"_three_view_img/\")\n\n
    # write img to file\n    for item in tqdm(items):\n        index.write(\"<tr>\")\n        index.write(\"<td>%s</td>\" % item)\n\n        # write input image\n        object = item.split(\"_\")[0]\n        id = item.split(\".\")[0]\n        path = input_path + \"/%s.xyz\" % (id)\n        img_path = input_path + \"_three_view_img/%s.png\" % (id)\n        if not os.path.exists(input_path + \"_three_view_img/\"):\n            os.makedirs(input_path + \"_three_view_img/\")\n        if not os.path.exists(img_path):\n            data = np.loadtxt(path)\n            data = data[:, 0:3]\n            img = pc_util.point_cloud_three_views(data,diameter=8)\n            imsave(img_path, img)\n        index.write(\"<td><img width='100%%', src='%s'></td>\" % img_path)\n        # write gt image\n        path = gt_path + \"/%s.xyz\" % (id)\n        img_path = gt_path + \"_three_view_img/%s.png\" % (id)\n        if not os.path.exists(gt_path + \"_three_view_img/\"):\n            os.makedirs(gt_path + \"_three_view_img/\")\n        if not os.path.exists(img_path):\n            data = np.loadtxt(path)\n            data = data[:, 0:3]\n            img = pc_util.point_cloud_three_views(data,diameter=8)\n            imsave(img_path, img)\n        index.write(\"<td><img width='100%%', src='%s'></td>\" % img_path)\n        index.write(\"</tr>\")\n\n
        index.write(\"<tr><th></th>\")\n        # write prediction\n        for model in model_names:\n            path = root_path + \"/\" + model + \"/result/\" + phase + \"/\" + item\n            if not os.path.exists(path):\n                continue\n            img_path = root_path + \"/\" + model + \"/result/\" + phase + \"_three_view_img/\" + item\n            img_path = img_path.replace(\"xyz\", \"png\")\n            if not os.path.exists(img_path):\n                data = np.loadtxt(path)\n                data = data[:, 0:3]\n                img = pc_util.point_cloud_three_views(data,diameter=8)\n                imsave(img_path, img)\n            index.write(\"<td><img width='100%%', src='%s'></td>\" % img_path)\n        index.write(\"</tr>\")\n    index.close()\n\n\n
def write_result2html_ModelNet():\n    root_path = \"../model\"\n    gt_path = \"../data/ModelNet10_poisson_normal\"\n    #gt_path = \"../data/Patches\"\n    model_names = ['1024_generator2_2','new_1024_generator2_2','new_1024_generator2_2_fixed_lr']\n    phase = 'test'\n\n    index_path = os.path.join(root_path, phase + \"_index.html\")\n    index = open(index_path, \"w\")\n    index.write(\"<html><body><table><tr>\")\n    index.write(\"<th width='5%%'>name</th><th>Input</th>\")\n    index.write(\"<th>Reference GT</th></tr>\")\n\n    index.write(\"<tr><th></th>\")\n    for model in model_names:\n        index.write(\"<th>%s</th>\" % model)\n    index.write(\"</tr>\")\n\n    # get sample list\n    items = os.listdir(root_path + \"/\" + model_names[0] + \"/result/\" + 
phase)\n    items.sort()\n\n    # mkdir model image path\n    for model in model_names:\n        if not os.path.exists(root_path + \"/\" + model + \"/result/\" + phase + \"_three_view_img/\"):\n            os.makedirs(root_path + \"/\" + model + \"/result/\" + phase + \"_three_view_img/\")\n\n    # write img to file\n    for item in tqdm(items[::25]):\n        index.write(\"<tr>\")\n        index.write(\"<td>%s</td>\" % item)\n\n        # write input image\n        object = item.split(\"_\")[0]\n        id = item.split(\".\")[0]\n        fixed = \"%s/1024_nonuniform/%s\" % (gt_path, 'train')\n        path = fixed + \"/%s.xyz\" % (id)\n        img_path = fixed + \"_three_view_img/%s.png\" % (id)\n        if not os.path.exists(fixed + \"_three_view_img/\"):\n            os.makedirs(fixed + \"_three_view_img/\")\n        if not os.path.exists(img_path):\n            data = np.loadtxt(path)\n            data = data[:, 0:3]\n            img = pc_util.point_cloud_three_views(data,diameter=8)\n            imsave(img_path, img)\n        index.write(\"<td><img width='100%%', src='%s'></td>\" % img_path)\n        # write gt image\n        fixed = \"%s/4096/%s\" % (gt_path, 'train')\n        path = fixed + \"/%s.xyz\" % (id)\n        img_path = fixed + \"_three_view_img/%s.png\" % (id)\n        if not os.path.exists(fixed + \"_three_view_img/\"):\n            os.makedirs(fixed + \"_three_view_img/\")\n        if not os.path.exists(img_path):\n            data = np.loadtxt(path)\n            data = data[:, 0:3]\n            img = pc_util.point_cloud_three_views(data,diameter=8)\n            imsave(img_path, img)\n        index.write(\"<td><img width='100%%', src='%s'></td>\" % img_path)\n        index.write(\"</tr>\")\n\n        index.write(\"<tr><th></th>\")\n        # write prediction\n        for model in model_names:\n            path = root_path + \"/\" + model + \"/result/\" + phase + \"/\" + item\n            if not os.path.exists(path):\n                continue\n      
      img_path = root_path + \"/\" + model + \"/result/\" + phase + \"_three_view_img/\" + item\n            img_path = img_path.replace(\"xyz\", \"png\")\n            if not os.path.exists(img_path):\n                data = np.loadtxt(path)\n                data = data[:, 0:3]\n                img = pc_util.point_cloud_three_views(data,diameter=8)\n                imsave(img_path, img)\n            index.write(\"<td><img width='100%%', src='%s'></td>\" % img_path)\n        index.write(\"</tr>\")\n    index.close()\n\nif __name__ == '__main__':\n    write_result2html_ModelNet()\n    #write_result2html_benchmark()\n    #calculate_emd_error('ModelNet40')\n"
  },
  {
    "path": "docs/.nojekyll",
    "content": ""
  },
  {
    "path": "docs/_site/css/main.css",
    "content": "body { margin: 60px auto; width: 70%; }\n\na { text-decoration: none; color: #999; }\na:hover { text-decoration: underline; }\n\np, ul { font-size: 1.5em; line-height: 1.4em; color: #333; }\n\nh1, h2, h3, h4 { font-family: 'Helvetica', 'Arial', 'Sans-Serif'; }\nh1 { font-size: 3em;  }\nh2 { font-size: 2.7em; }\nh3 { font-size: 2.3em; }\nh4 { font-size: 1.9em; }\n\nnav ul, footer ul { font-size: 1em; font-family: 'Helvetica', 'Arial', 'Sans-Serif'; padding: 0px; list-style: none; font-weight: bold; }\nnav ul li, footer ul li { display: inline; margin-right: 20px; }\n\nfooter { border-top: 1px solid #d5d5d5; font-size: .8em; }\n\n/* Blog */\nul.posts { margin: 20px auto 40px; font-size: 1.5em; }\nul.posts li { list-style: none; }\n\n/* CV */\n.cv { margin: 0px 0px 60px; }\n.cv h1 { font-size: 3em; }\n.cv h2 { font-size: 2em; }\n.cv address, .cv .download { font-family: 'Helvetica', 'Arial', 'Sans-Serif'; }\n.cv address, .cv p { font-size: 1.2em; }\n.cv .download { float: right; font-size: .8em; text-transform: uppercase; }\n"
  },
  {
    "path": "docs/_site/css/project.css",
    "content": "\n@import url('https://fonts.googleapis.com/css?family=Lato:400,400i,700,900&subset=latin-ext');\n\nbody {\n  font-family: \"Lato\", sans-serif;\n  color: #322f30;\n  background: #fff;\n  -webkit-font-smoothing: antialiased;\n  margin: 10px auto; width: 70%;\n}\n\na {\n  text-decoration: none;\n  color: #001f3f;\n}\n\n\na:link, a:visited{\n  color: #39CCCC;\n  /* text-decoration: underline; */\n}\n\na:hover {\n  text-decoration: underline;\n  color: #001f3f;\n}\n\na.paper {\n    text-decoration: none;\n    color: #001f3f;\n}\n\n\na.paper:link, a.paper:visited{\n  color: #322f30;\n  /* text-decoration: underline; */\n}\n\na.paper:hover {\n  text-decoration: underline;\n  color: #39CCCC;\n}\n\n\n\nanchor{\n    line-height: 0;\n    font-size: 0;\n    color: transparent;\n}\n\n/*p, ul { font-size: 1.5em; line-height: 1.4em; color: #fff; }*/\np { font-size: 1.25em; line-height: 1.2em; color: #322f30; font-weight: 200; text-align: justify;}\np2{ font-size: 1.0em; line-height: 0.9em; color: #322f30; font-weight: 200; text-align: justify; margin-right: 20px}\nul { font-size: 1.5em; line-height: 1.4em; color: #322f30;}\n\nh2, h3, h4 { font-family: 'Lato', 'Sans-Serif';\n                font-weight:300; }\n/* h1 { font-size: 3em;  } */\nh2 { font-size: 2.7em; }\nh3 { font-size: 2.3em; margin-top: 0.8em; margin-bottom: 0.8em; text-align: center;}\nh4 { font-size: 1.9em; margin-top: 0.8em; margin-bottom: 0.8em; text-align: center;}\n\npapertitle {\nfont-family: 'Lato', Verdana, Helvetica, sans-serif;\nfont-size: 14px;\nfont-weight: 700\n}\n\n/* video {\n\n  width: 100% !important;\n  height: auto !important;\n  margin: 0 auto;\n  display: block;\n} */\n\nhr{border: 0px; height: 1px; background-image: linear-gradient(to right, rgba(0, 0, 0, 0), rgba(0, 0, 0, 0.75), rgba(0, 0, 0, 0)); }\n\n.center {\n  padding-top: 20px;\n  padding-bottom: 20px;\n  width: 90% !important;\n  height: 90% !important;\n  margin: 0 auto;\n  display: block;\n}\n\nh5{ font-family: 'Lato', 'Sans-Serif';font-weight:300; font-size: 3em; text-align: center; margin-top: 0.8em; margin-bottom: 0.8em;}\nh1{ font-family: 'Lato', 'Sans-Serif';font-weight:400; font-size: 3em; text-align: center; margin-top: 0.8em; margin-bottom: 0.8em;}\n\n/*nav ul{ font-size: 1.5em; font-family: 'Lato', 'Arial', 'Sans-Serif'; padding: 20px; list-style: none; text-align: center;}*/\nnav ul{ font-size: 1.5em; font-family: 'Lato', 'Arial', 'Sans-Serif'; padding: 20px;list-style: none; text-align: center; margin-top: 10px;}\nfooter ul { font-size: 1.5em; font-family: 'Lato', 'Arial', 'Sans-Serif'; padding: 0px; list-style: none;}\n/*nav ul li{ display: inline; margin-right: 20px; padding: 20px;}*/\nnav ul li{ display: inline; padding: 20px;}\n/*footer div ul li{ display: inline; margin-right: 20px; text-align: left;}*/\nfooter div ul li{ display: inline; margin-right: 20px; text-align: left;}\n\nfooter { border-top: 1px solid #d5d5d5; font-size: .8em; }\n\n/* Blog */\nul.posts { margin: 20px auto 40px; font-size: 1.5em; }\nul.posts li { list-style: none; }\n\n/* CV */\n.cv { margin: 0px 0px 60px; }\n.cv h1 { font-size: 3em; }\n.cv h2 { font-size: 2em; }\n.cv address, .cv .download { font-family: 'Lato', 'Arial', 'Sans-Serif'; }\n.cv address, .cv p { font-size: 1.2em; }\n.cv .download { float: right; font-size: .8em; text-transform: uppercase; }\n\nimg.center_thumbnail{\n    margin: 0 auto;\n    display: block;\n}\n\n\n\nimg {\n    display: inline;\n    margin: 0 auto;\n    height: 100%;\n    width: auto;\n}\n"
  },
  {
    "path": "docs/_site/css/project_file.css",
    "content": "@import url(https://cdnjs.cloudflare.com/ajax/libs/font-awesome/4.6.3/css/font-awesome.css);\n\n/*body {\n  font-family: \"Lato\", sans-serif;\n  color: #fff;\n  background: #322f30;\n  -webkit-font-smoothing: antialiased;\n  margin: 20px auto; width: 70%;\n}*/\n\nbody {\n  font-family: \"Lato\", sans-serif;\n  color: #fff;\n  background: #322f30;\n  -webkit-font-smoothing: antialiased;\n  margin: 10px auto; width: 70%;\n}\n\na {\n  text-decoration: none;\n  color: #fff;\n}\n\n\na:link, a:visited{\n  color: #7fdbff;\n  /*text-decoration: underline;*/\n}\n\na:hover {\n  text-decoration: underline;\n}\n\n/*p, ul { font-size: 1.5em; line-height: 1.4em; color: #fff; }*/\np { font-size: 1.25em; line-height: 1.2em; color: #fff; font-weight: 200; text-align: justify;}\nul { font-size: 1.5em; line-height: 1.4em; color: #fff;}\n\nh1, h2, h3, h4 { font-family: 'Lato', 'Sans-Serif';\n                font-weight:300; }\nh1 { font-size: 3em;  }\nh2 { font-size: 2.7em; }\nh3 { font-size: 2.3em; }\nh4 { font-size: 1.9em; }\n\nh5{ font-family: 'Lato', 'Sans-Serif';font-weight:300; font-size: 3em; text-align: center; margin-top: 0.8em; margin-bottom: 0.8em;}\n\n\n/*nav ul{ font-size: 1.5em; font-family: 'Lato', 'Arial', 'Sans-Serif'; padding: 20px; list-style: none; text-align: center;}*/\nnav ul{ font-size: 1.5em; font-family: 'Lato', 'Arial', 'Sans-Serif'; padding: 20px;list-style: none; text-align: center; margin-top: 10px;}\nfooter ul { font-size: 1.5em; font-family: 'Lato', 'Arial', 'Sans-Serif'; padding: 0px; list-style: none;}\n/*nav ul li{ display: inline; margin-right: 20px; padding: 20px;}*/\nnav ul li{ display: inline; padding: 20px;}\n/*footer div ul li{ display: inline; margin-right: 20px; text-align: left;}*/\nfooter div ul li{ display: inline; margin-right: 20px; text-align: left;}\n\nfooter { border-top: 1px solid #d5d5d5; font-size: .8em; }\n"
  },
  {
    "path": "docs/_site/index.html",
    "content": "<!DOCTYPE html>\n\n<html>\n\t<!-- <meta name=viewport content=“width=800”> -->\n\t<head>\n\t\t<title>Ganesh Iyer</title>\n\t\t<!-- link to main stylesheet -->\n\t\t<link href=\"https://fonts.googleapis.com/css?family=Lato:400,400i,700,900&subset=latin-ext\" rel=\"stylesheet\">\n\t\t<link rel=\"stylesheet\" type=\"text/css\" href=\"/css/project.css\">\n\t</head>\n\t<body>\n\n\t\t<h1>CalibNet: Self-Supervised Extrinsic Calibration using 3D Spatial Transformer Networks</h1>\n\n\t\t<nav>\n    \t\t<ul>\n        \t\t<li><a href=\"https://epiception.github.io/\">Ganesh Iyer<sup>1</sup></a></li>\n        \t\t<li><a href=\"http://karnikram.info/\">Karnik Ram<sup>1</sup></a></li>\n        \t\t<li><a href=\"http://krrish94.github.io/\">Krishna Murthy<sup>2</sup></a></li>\n\t\t\t\t<li><a href=\"https://www.iiit.ac.in/people/faculty/mkrishna/\">K. Madhava Krishna<sup>1</sup></a></li>\n\t\t\t\t<li><p2>1. Robotics Research Center, International Institute of Information Technology</p2></li>\n\t\t\t\t<li><p2>2. Montreal Institute for Learning Algorithms, Universit&#233; de Montr&#233;al</p2></li>\n    \t\t</ul>\n\t\t</nav>\n\n\t\t<div>\n\t\t\t<img src=\"/assets/teaser.png\" alt=\"Paper Teaser\" class = \"center\">\n\t\t</div>\n\n\t\t\t\t<p>3D LiDARs and 2D cameras are increasingly being\n\t\tused alongside each other in sensor rigs for perception tasks.\n\t\tBefore these sensors can be used to gather meaningful data,\n\t\thowever, their extrinsics (and intrinsics) need to be accurately\n\t\tcalibrated, as the performance of the sensor rig is extremely\n\t\tsensitive to these calibration parameters. A vast majority of\n\t\texisting calibration techniques require significant amounts of\n\t\tdata and/or calibration targets and human effort, severely\n\t\timpacting their applicability in large-scale production systems.\n\t\tWe address this gap with CalibNet: a self-supervised deep\n\t\tnetwork capable of automatically estimating the 6-DoF rigid\n\t\tbody transformation between a 3D LiDAR and a 2D camera in\n\t\treal-time. CalibNet alleviates the need for calibration targets,\n\t\tthereby resulting in significant savings in calibration efforts.\n\t\tDuring training, the network only takes as input a LiDAR point\n\t\tcloud, the corresponding monocular image, and the camera\n\t\tcalibration matrix K. At train time, we do not impose direct\n\t\tsupervision (i.e., we do not directly regress to the calibration\n\t\tparameters, for example). Instead, we train the network to\n\t\tpredict calibration parameters that maximize the geometric and\n\t\tphotometric consistency of the input images and point clouds.\n\t\tCalibNet learns to iteratively solve the underlying geometric\n\t\tproblem and accurately predicts extrinsic calibration parameters for a wide range of mis-calibrations, without requiring\n\t\tretraining or domain adaptation.</p>\n\n\t\t<hr>\n\t\t\t<h3> Video </h3>\n\n\t\t<div>\n\t \t\t<video class=\"center\" src=\"assets/video.mp4\" controls width=\"1200\" height=\"450\"></video>\n\t\t</div>\n\n\t\t<hr>\n\t\t\t<a href=\"https://github.com/epiception/CalibNet\">\n\t\t\t<h3> Code </h3></a>\n\t\t\t<h4>Coming Soon!</h4>\n\n\t\t<hr>\n\t\t\t<h3> Paper </h3>\n\t\t\t<a class=\"paper\" href=\"/\">\n \t\t\t\t<img class=\"center_thumbnail\" src=\"assets/pdf_thumbnail.png\" alt=\"Paper Thumbnail\" width=\"1\" height=\"2\" border-style = \"dotted\" border = 1px>\n\t\t\t</a>\n\t\t\t<h4>Coming Soon!</h4>\n\n\n\n\t</body>\n</html>\n"
  },
  {
    "path": "docs/css/main.css",
    "content": "body { margin: 60px auto; width: 70%; }\n\na { text-decoration: none; color: #999; }\na:hover { text-decoration: underline; }\n\np, ul { font-size: 1.5em; line-height: 1.4em; color: #333; }\n\nh1, h2, h3, h4 { font-family: 'Helvetica', 'Arial', 'Sans-Serif'; }\nh1 { font-size: 3em;  }\nh2 { font-size: 2.7em; }\nh3 { font-size: 2.3em; }\nh4 { font-size: 1.9em; }\n\nnav ul, footer ul { font-size: 1em; font-family: 'Helvetica', 'Arial', 'Sans-Serif'; padding: 0px; list-style: none; font-weight: bold; }\nnav ul li, footer ul li { display: inline; margin-right: 20px; }\n\nfooter { border-top: 1px solid #d5d5d5; font-size: .8em; }\n\n/* Blog */\nul.posts { margin: 20px auto 40px; font-size: 1.5em; }\nul.posts li { list-style: none; }\n\n/* CV */\n.cv { margin: 0px 0px 60px; }\n.cv h1 { font-size: 3em; }\n.cv h2 { font-size: 2em; }\n.cv address, .cv .download { font-family: 'Helvetica', 'Arial', 'Sans-Serif'; }\n.cv address, .cv p { font-size: 1.2em; }\n.cv .download { float: right; font-size: .8em; text-transform: uppercase; }\n"
  },
  {
    "path": "docs/css/project.css",
    "content": "\n@import url('https://fonts.googleapis.com/css?family=Lato:400,400i,700,900&subset=latin-ext');\n\nbody {\n  font-family: \"Lato\", sans-serif;\n  color: #322f30;\n  background: #fff;\n  -webkit-font-smoothing: antialiased;\n  margin: 10px auto; width: 70%;\n}\n\na {\n  text-decoration: none;\n  color: #001f3f;\n}\n\n\na:link, a:visited{\n  color: #39CCCC;\n  /* text-decoration: underline; */\n}\n\na:hover {\n  text-decoration: underline;\n  color: #001f3f;\n}\n\na.paper {\n    text-decoration: none;\n    color: #001f3f;\n}\n\n\na.paper:link, a.paper:visited{\n  color: #322f30;\n  /* text-decoration: underline; */\n}\n\na.paper:hover {\n  text-decoration: underline;\n  color: #39CCCC;\n}\n\n\n\nanchor{\n    line-height: 0;\n    font-size: 0;\n    color: transparent;\n}\n\n/*p, ul { font-size: 1.5em; line-height: 1.4em; color: #fff; }*/\np { font-size: 1.25em; line-height: 1.2em; color: #322f30; font-weight: 200; text-align: justify;}\np2{ font-size: 1.0em; line-height: 0.9em; color: #322f30; font-weight: 200; text-align: justify; margin-right: 20px}\nul { font-size: 1.5em; line-height: 1.4em; color: #322f30;}\n\nh2, h3, h4 { font-family: 'Lato', 'Sans-Serif';\n                font-weight:300; }\n/* h1 { font-size: 3em;  } */\nh2 { font-size: 2.7em; }\nh3 { font-size: 2.3em; margin-top: 0.8em; margin-bottom: 0.8em; text-align: center;}\nh4 { font-size: 1.9em; margin-top: 0.8em; margin-bottom: 0.8em; text-align: center;}\n\npapertitle {\nfont-family: 'Lato', Verdana, Helvetica, sans-serif;\nfont-size: 14px;\nfont-weight: 700\n}\n\n/* video {\n\n  width: 100% !important;\n  height: auto !important;\n  margin: 0 auto;\n  display: block;\n} */\n\nhr{border: 0px; height: 1px; background-image: linear-gradient(to right, rgba(0, 0, 0, 0), rgba(0, 0, 0, 0.75), rgba(0, 0, 0, 0)); }\n\n.center {\n  padding-top: 20px;\n  padding-bottom: 20px;\n  width: 90% !important;\n  height: 90% !important;\n  margin: 0 auto;\n  display: block;\n}\n\nh5{ font-family: 'Lato', 'Sans-Serif';font-weight:300; font-size: 3em; text-align: center; margin-top: 0.8em; margin-bottom: 0.8em;}\nh1{ font-family: 'Lato', 'Sans-Serif';font-weight:400; font-size: 3em; text-align: center; margin-top: 0.8em; margin-bottom: 0.8em;}\n\n/*nav ul{ font-size: 1.5em; font-family: 'Lato', 'Arial', 'Sans-Serif'; padding: 20px; list-style: none; text-align: center;}*/\nnav ul{ font-size: 1.5em; font-family: 'Lato', 'Arial', 'Sans-Serif'; padding: 20px;list-style: none; text-align: center; margin-top: 10px;}\nfooter ul { font-size: 1.5em; font-family: 'Lato', 'Arial', 'Sans-Serif'; padding: 0px; list-style: none;}\n/*nav ul li{ display: inline; margin-right: 20px; padding: 20px;}*/\nnav ul li{ display: inline; padding: 20px;}\n/*footer div ul li{ display: inline; margin-right: 20px; text-align: left;}*/\nfooter div ul li{ display: inline; margin-right: 20px; text-align: left;}\n\nfooter { border-top: 1px solid #d5d5d5; font-size: .8em; }\n\n/* Blog */\nul.posts { margin: 20px auto 40px; font-size: 1.5em; }\nul.posts li { list-style: none; }\n\n/* CV */\n.cv { margin: 0px 0px 60px; }\n.cv h1 { font-size: 3em; }\n.cv h2 { font-size: 2em; }\n.cv address, .cv .download { font-family: 'Lato', 'Arial', 'Sans-Serif'; }\n.cv address, .cv p { font-size: 1.2em; }\n.cv .download { float: right; font-size: .8em; text-transform: uppercase; }\n\nimg.center_thumbnail{\n    margin: 0 auto;\n    display: block;\n}\n\n\n\nimg {\n    display: inline;\n    margin: 0 auto;\n    height: 100%;\n    width: auto;\n}\n"
  },
  {
    "path": "docs/css/project_file.css",
    "content": "@import url(https://cdnjs.cloudflare.com/ajax/libs/font-awesome/4.6.3/css/font-awesome.css);\n\n/*body {\n  font-family: \"Lato\", sans-serif;\n  color: #fff;\n  background: #322f30;\n  -webkit-font-smoothing: antialiased;\n  margin: 20px auto; width: 70%;\n}*/\n\nbody {\n  font-family: \"Lato\", sans-serif;\n  color: #fff;\n  background: #322f30;\n  -webkit-font-smoothing: antialiased;\n  margin: 10px auto; width: 70%;\n}\n\na {\n  text-decoration: none;\n  color: #fff;\n}\n\n\na:link, a:visited{\n  color: #7fdbff;\n  /*text-decoration: underline;*/\n}\n\na:hover {\n  text-decoration: underline;\n}\n\n/*p, ul { font-size: 1.5em; line-height: 1.4em; color: #fff; }*/\np { font-size: 1.25em; line-height: 1.2em; color: #fff; font-weight: 200; text-align: justify;}\nul { font-size: 1.5em; line-height: 1.4em; color: #fff;}\n\nh1, h2, h3, h4 { font-family: 'Lato', 'Sans-Serif';\n                font-weight:300; }\nh1 { font-size: 3em;  }\nh2 { font-size: 2.7em; }\nh3 { font-size: 2.3em; }\nh4 { font-size: 1.9em; }\n\nh5{ font-family: 'Lato', 'Sans-Serif';font-weight:300; font-size: 3em; text-align: center; margin-top: 0.8em; margin-bottom: 0.8em;}\n\n\n/*nav ul{ font-size: 1.5em; font-family: 'Lato', 'Arial', 'Sans-Serif'; padding: 20px; list-style: none; text-align: center;}*/\nnav ul{ font-size: 1.5em; font-family: 'Lato', 'Arial', 'Sans-Serif'; padding: 20px;list-style: none; text-align: center; margin-top: 10px;}\nfooter ul { font-size: 1.5em; font-family: 'Lato', 'Arial', 'Sans-Serif'; padding: 0px; list-style: none;}\n/*nav ul li{ display: inline; margin-right: 20px; padding: 20px;}*/\nnav ul li{ display: inline; padding: 20px;}\n/*footer div ul li{ display: inline; margin-right: 20px; text-align: left;}*/\nfooter div ul li{ display: inline; margin-right: 20px; text-align: left;}\n\nfooter { border-top: 1px solid #d5d5d5; font-size: .8em; }\n"
  },
  {
    "path": "docs/index.html",
    "content": "<!DOCTYPE html>\n\n<html>\n\t<!-- <meta name=viewport content=“width=800”> -->\n\t<head>\n\t\t<title>CalibNet</title>\n\t\t<!-- link to main stylesheet -->\n\t\t<link href=\"https://fonts.googleapis.com/css?family=Lato:400,400i,700,900&subset=latin-ext\" rel=\"stylesheet\">\n\t\t<link rel=\"stylesheet\" type=\"text/css\" href=\"./css/project.css\">\n\t</head>\n\t<body>\n\n\t\t<h1>CalibNet: Self-Supervised Extrinsic Calibration using 3D Spatial Transformer Networks</h1>\n\n\t\t<nav>\n    \t\t<ul>\n        \t\t<li><a href=\"https://epiception.github.io/\">Ganesh Iyer<sup>1</sup></a></li>\n        \t\t<li><a href=\"http://karnikram.info/\">Karnik Ram<sup>1</sup></a></li>\n        \t\t<li><a href=\"http://krrish94.github.io/\">Krishna Murthy<sup>2</sup></a></li>\n\t\t\t\t<li><a href=\"https://www.iiit.ac.in/people/faculty/mkrishna/\">K. Madhava Krishna<sup>1</sup></a></li>\n\t\t\t\t<li><p2>1. Robotics Research Center, International Institute of Information Technology</p2></li>\n\t\t\t\t<li><p2>2. Montreal Institute for Learning Algorithms, Universit&#233; de Montr&#233;al</p2></li>\n    \t\t</ul>\n\t\t</nav>\n\n\t\t<div>\n\t\t\t<img src=\"./assets/teaser.png\" alt=\"Paper Teaser\" class = \"center\">\n\t\t</div>\n\n\t\t\t\t<p>3D LiDARs and 2D cameras are increasingly being\n\t\tused alongside each other in sensor rigs for perception tasks.\n\t\tBefore these sensors can be used to gather meaningful data,\n\t\thowever, their extrinsics (and intrinsics) need to be accurately\n\t\tcalibrated, as the performance of the sensor rig is extremely\n\t\tsensitive to these calibration parameters. A vast majority of\n\t\texisting calibration techniques require significant amounts of\n\t\tdata and/or calibration targets and human effort, severely\n\t\timpacting their applicability in large-scale production systems.\n\t\tWe address this gap with CalibNet: a self-supervised deep\n\t\tnetwork capable of automatically estimating the 6-DoF rigid\n\t\tbody transformation between a 3D LiDAR and a 2D camera in\n\t\treal-time. CalibNet alleviates the need for calibration targets,\n\t\tthereby resulting in significant savings in calibration efforts.\n\t\tDuring training, the network only takes as input a LiDAR point\n\t\tcloud, the corresponding monocular image, and the camera\n\t\tcalibration matrix K. At train time, we do not impose direct\n\t\tsupervision (i.e., we do not directly regress to the calibration\n\t\tparameters, for example). Instead, we train the network to\n\t\tpredict calibration parameters that maximize the geometric and\n\t\tphotometric consistency of the input images and point clouds.\n\t\tCalibNet learns to iteratively solve the underlying geometric\n\t\tproblem and accurately predicts extrinsic calibration parameters for a wide range of mis-calibrations, without requiring\n\t\tretraining or domain adaptation.</p>\n\n\t\t<hr>\n\t\t\t<h3> Video </h3>\n\n\t\t<div>\n\t \t\t<video class=\"center\" src=\"./assets/video.mp4\" controls width=\"1200\" height=\"450\"></video>\n\t\t</div>\n\n\t\t<hr>\n\t\t\t<h3> Paper </h3>\n\t\t\t<a class=\"paper\" href=\"https://arxiv.org/abs/1803.08181\">\n \t\t\t\t<img class=\"center_thumbnail\" src=\"./assets/pdf_thumbnail.png\" alt=\"Paper Thumbnail\" width=\"1\" height=\"2\" border-style = \"dotted\" border = 1px>\n\t\t\t</a>\n\t\t\t<!-- <h4>Coming Soon!</h4> -->\n\n\t\t<hr>\n\t\t\t<a href=\"https://github.com/epiception/CalibNet\">\n\t\t\t<p style=\"text-align:center\">For the Tensorflow implementation</p>\n\t\t\t<h3> Check out the Code! </h3></a>\n\n\t\t\t<!-- <h4>Coming Soon!</h4> -->\n\n\n\n\n\n\t</body>\n</html>\n"
  }
]