Repository: epiception/CalibNet
Branch: main
Commit: a6fed6de1805
Files: 83
Total size: 325.6 KB

Directory structure:

CalibNet/
├── LICENSE
├── README.md
├── code/
│   ├── common/
│   │   ├── Lie_functions.py
│   │   ├── __init__.py
│   │   ├── all_transformer.py
│   │   ├── cnn_utils_res.py
│   │   ├── global_agg_net.py
│   │   ├── resnet_depth_model.py
│   │   └── resnet_rgb_model.py
│   ├── config_res.py
│   ├── dataset_files/
│   │   ├── dataset_build_color.py
│   │   ├── dataset_build_color_2.py
│   │   ├── dataset_builder_parallel.sh
│   │   └── parser.py
│   ├── model_utils.py
│   ├── nw_loader_color.py
│   ├── tf_ops/
│   │   ├── CD/
│   │   │   ├── __init__.py
│   │   │   ├── makefile
│   │   │   ├── render_balls_so.cpp
│   │   │   ├── tf_nndistance.cpp
│   │   │   ├── tf_nndistance.py
│   │   │   ├── tf_nndistance_g.cu
│   │   │   └── tf_nndistance_g.cu.o
│   │   ├── __init__.py
│   │   ├── emd/
│   │   │   ├── __init__.py
│   │   │   ├── tf_auctionmatch.cpp
│   │   │   ├── tf_auctionmatch.py
│   │   │   ├── tf_auctionmatch_compile.sh
│   │   │   ├── tf_auctionmatch_g.cu
│   │   │   └── tf_auctionmatch_g.cu.o
│   │   ├── grouping/
│   │   │   ├── __init__.py
│   │   │   ├── compile.sh
│   │   │   ├── query_ball_point.cpp
│   │   │   ├── query_ball_point.cu
│   │   │   ├── query_ball_point_block.cu
│   │   │   ├── query_ball_point_grid.cu
│   │   │   ├── selection_sort
│   │   │   ├── selection_sort.cpp
│   │   │   ├── selection_sort.cu
│   │   │   ├── selection_sort_const.cu
│   │   │   ├── test_knn.py
│   │   │   ├── tf_grouping.cpp
│   │   │   ├── tf_grouping.py
│   │   │   ├── tf_grouping_compile.sh
│   │   │   ├── tf_grouping_g.cu
│   │   │   ├── tf_grouping_g.cu.o
│   │   │   └── tf_grouping_op_test.py
│   │   ├── interpolation/
│   │   │   ├── __init__.py
│   │   │   ├── interpolate.cpp
│   │   │   ├── tf_interpolate.cpp
│   │   │   ├── tf_interpolate.py
│   │   │   ├── tf_interpolate_compile.sh
│   │   │   ├── tf_interpolate_op_test.py
│   │   │   └── visu_interpolation.py
│   │   └── sampling/
│   │       ├── __init__.py
│   │       ├── tf_sampling.cpp
│   │       ├── tf_sampling.py
│   │       ├── tf_sampling_compile.sh
│   │       ├── tf_sampling_g.cu
│   │       └── tf_sampling_g.cu.o
│   ├── train_model_combined.py
│   └── utils/
│       ├── __init__.py
│       ├── data_prep_util.py
│       ├── eulerangles.py
│       ├── modelnet_data_prep.py
│       ├── off2obj.py
│       ├── pc_util.py
│       ├── plyfile.py
│       ├── pointnet_util.py
│       ├── provider.py
│       ├── show3d.py
│       ├── tf_util.py
│       ├── tf_util2.py
│       └── write_result2html.py
└── docs/
    ├── .nojekyll
    ├── _site/
    │   ├── css/
    │   │   ├── main.css
    │   │   ├── project.css
    │   │   └── project_file.css
    │   └── index.html
    ├── css/
    │   ├── main.css
    │   ├── project.css
    │   └── project_file.css
    └── index.html

================================================
FILE CONTENTS
================================================

================================================
FILE: LICENSE
================================================
MIT License

Copyright (c) 2018 Ganesh Iyer

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

================================================
FILE: README.md
================================================
# CalibNet

### [DEPRECATED] This repository is no longer actively supported.
While the authors work on an update, please check out this unofficial implementation: [CalibNet_pytorch](https://github.com/gitouni/CalibNet_pytorch) :fire: :slightly_smiling_face:

___

Code for our paper: [CalibNet: Self-Supervised Extrinsic Calibration using 3D Spatial Transformer Networks](https://arxiv.org/pdf/1803.08181.pdf)

Check out our [project page](https://epiception.github.io/CalibNet/)!

![CalibNet_gif1](https://media.giphy.com/media/1zjOgLf7j4lHmeMubG/giphy.gif)

### Prerequisites

CalibNet is trained on Tensorflow 1.3, CUDA 8.0, and CUDNN 7.0.1.

##### Installation

The code for the point cloud distance loss is modified from [PU-NET](https://github.com/yulequan/PU-Net), PointNet++, and PointSetGeneration. This repository is therefore based on Tensorflow and the TF operators from PointNet++ and PU-NET. To install Tensorflow, please follow the official instructions. The code is tested under TF 1.3 and Python 2.7 on Ubuntu 16.04.

To compile the TF operators, check `tf_xxx_compile.sh` under each op subfolder in the `code/tf_ops` folder, and set the include path correctly to `../path/to/tensorflow/include`. Update the nvcc, python, and tensorflow include paths if necessary. If compilation still fails, you may also need to remove the `-D_GLIBCXX_USE_CXX11_ABI=0` flag from the g++ command. We are working to update the code and installation steps for the latest Tensorflow versions.

### Dataset Preparation

To prepare the dataset, run `dataset_files/dataset_builder_parallel.sh` in the directory where you wish to store the data. This also creates a parser file, `parsed_set.txt`, that contains the file names used for training.
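The op compile scripts mentioned under Installation all follow the same two-step pattern (nvcc for the CUDA kernel, then g++ to link the shared library). The sketch below is illustrative only: `tf_op.cpp`/`tf_op_g.cu` are placeholder names, and the paths assume CUDA 8.0 as stated above; adapt them to your machine before use.

```shell
#!/bin/bash
# Sketch of a tf_xxx_compile.sh-style script; all names and paths are placeholders.
TF_INC=/path/to/tensorflow/include    # e.g. python -c 'import tensorflow as tf; print(tf.sysconfig.get_include())'
CUDA_ROOT=/usr/local/cuda-8.0

# 1) Compile the CUDA kernel into an object file.
$CUDA_ROOT/bin/nvcc tf_op_g.cu -o tf_op_g.cu.o -c -O2 \
    -DGOOGLE_CUDA=1 -x cu -Xcompiler -fPIC

# 2) Link the op into a loadable shared library. Remove the
#    -D_GLIBCXX_USE_CXX11_ABI=0 flag if your TF build uses the new C++ ABI.
g++ -std=c++11 tf_op.cpp tf_op_g.cu.o -o tf_op_so.so -shared -fPIC \
    -I $TF_INC -I $CUDA_ROOT/include -L $CUDA_ROOT/lib64 -lcudart \
    -O2 -D_GLIBCXX_USE_CXX11_ABI=0
```

The resulting `.so` is then loaded at runtime via `tf.load_op_library`, which is how the `tf_ops` wrappers (e.g. `tf_sampling.py`) pick up their compiled kernels.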
```
git clone https://github.com/epiception/CalibNet.git
# or, for the code only:
svn checkout https://github.com/epiception/CalibNet/trunk/code

cd ../path/to/dataset_directory
bash ../path/to/code_folder/dataset_files/dataset_builder_parallel.sh
cd ../path/to/CalibNet/code
python dataset_files/parser.py ../dataset_directory/2011_09_26/
```

##### Resnet-18

Pretrained Resnet-18 parameters can be found [here](https://drive.google.com/open?id=1XGqdBH3A88m1LgUIe5tS7VjKjtQc1A6V).

### Training

Before training, be sure to make the requisite changes to the paths and training parameters in the config file `config_res.py`. We trained using 2 GPUs; the base code assigns ops assuming this same device configuration. To begin training:

```
CUDA_VISIBLE_DEVICES=0,1 python -B train_model_combined.py
```

##### Trained Weights

Trained weights for the base variant (non-iterative) model are available [here](https://drive.google.com/drive/folders/138hq7OgTEBmG-wK52h7gchg5ob1WqARn?usp=sharing). This model was trained for 44 epochs. As mentioned in the paper, the iterative realignment model for better translation outputs will be uploaded soon.

### Evaluation/Test

Code for the direct evaluation/testing pipeline for point cloud calibration will be uploaded soon.

================================================
FILE: code/common/Lie_functions.py
================================================
import numpy as np
import tensorflow as tf


def for_translation(Old_transform, t):
    # Keep rotation R from the old transform; swap in translation t
    R = Old_transform[:3, :3]
    t = tf.expand_dims(t, 1)
    trans_T = tf.concat([R, t], 1)
    return trans_T


def for_rotation(Old_transform):
    # Keep rotation R from the old transform; zero out the translation
    R = Old_transform[:3, :3]
    t = tf.expand_dims(tf.constant(np.array([0.0, 0.0, 0.0], dtype=np.float32)), 1)
    rots_T = tf.concat([R, t], 1)
    return rots_T


def exponential_map_single(vec):
    """Exponential Map Operation. Decoupled for SO(3) and translation t"""
    with tf.name_scope("Exponential_map"):
        u = vec[:3]
        omega = vec[3:]

        theta = tf.sqrt(omega[0]*omega[0] + omega[1]*omega[1] + omega[2]*omega[2])

        omega_cross = tf.stack([0.0, -omega[2], omega[1],
                                omega[2], 0.0, -omega[0],
                                -omega[1], omega[0], 0.0])
        omega_cross = tf.reshape(omega_cross, [3, 3])

        # Taylor approximations for A, B and C are not used currently;
        # the series forms are preferable for small values of theta
        # A = 1.0 - (tf.pow(theta,2)/factorial(3.0)) + (tf.pow(theta, 4)/factorial(5.0))
        # B = 1.0/factorial(2.0) - (tf.pow(theta,2)/factorial(4.0)) + (tf.pow(theta, 4)/factorial(6.0))
        # C = 1.0/factorial(3.0) - (tf.pow(theta,2)/factorial(5.0)) + (tf.pow(theta, 4)/factorial(7.0))

        A = tf.sin(theta)/theta
        B = (1.0 - tf.cos(theta))/(tf.pow(theta, 2))
        C = (1.0 - A)/(tf.pow(theta, 2))

        omega_cross_square = tf.matmul(omega_cross, omega_cross)

        R = tf.eye(3, 3) + A*omega_cross + B*omega_cross_square
        V = tf.eye(3, 3) + B*omega_cross + C*omega_cross_square

        Vu = tf.matmul(V, tf.expand_dims(u, 1))
        T = tf.concat([R, Vu], 1)
        return T


def transforms_mul(T1, T2):
    T1 = tf.concat([T1, tf.constant(np.array([[0.0, 0.0, 0.0, 1.0]]), dtype=tf.float32)], 0)
    T2 = tf.concat([T2, tf.constant(np.array([[0.0, 0.0, 0.0, 1.0]]), dtype=tf.float32)], 0)
    product = tf.matmul(T1, T2)
    return product

================================================
FILE: code/common/__init__.py
================================================

================================================
FILE: code/common/all_transformer.py
================================================
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
import scipy.misc as smc

import config_res as config

IMG_HT = config.depth_img_params['IMG_HT']
IMG_WDT = config.depth_img_params['IMG_WDT']

batch_size = 1
shape = (IMG_HT, IMG_WDT)


def _simple_transformer(depth_map, t_mat, k_mat, small_transform):
    batch_grids, transformed_depth_map, sparse_cloud = _3D_meshgrid_batchwise_diff(IMG_HT, IMG_WDT,
depth_map, batch_size, t_mat, k_mat, small_transform) x_all = tf.reshape(batch_grids[:,0], (IMG_HT, IMG_WDT)) y_all = tf.reshape(batch_grids[:,1], (IMG_HT, IMG_WDT)) return _bilinear_sampling(transformed_depth_map, x_all, y_all), sparse_cloud def sparsify_cloud(S): """ Cluster centers of point clouds used to sparsify cloud for Earth Mover's Distance. Using 4096 centroids """ with tf.device('/cpu:0'): point_limit = 4096 no_points = tf.shape(S)[0] no_partitions = no_points/tf.constant(point_limit, dtype=tf.int32) saved_points = tf.gather_nd(S, [tf.expand_dims(tf.range(0, no_partitions*point_limit), 1)]) saved_points = tf.reshape(saved_points, [point_limit, no_partitions, 3]) saved_points_sparse = tf.reduce_mean(saved_points, 1) return saved_points_sparse def _3D_meshgrid_batchwise_diff(height, width, depth_img, num_batch, transformation_matrix, tf_K_mat, small_transform): """ Creates 3d sampling meshgrid """ x_index = tf.linspace(-1.0, 1.0, width) y_index = tf.linspace(-1.0, 1.0, height) z_index = tf.range(0, width*height) x_t, y_t = tf.meshgrid(x_index, y_index) # flatten x_t_flat = tf.reshape(x_t, [1,-1]) y_t_flat = tf.reshape(y_t, [1,-1]) ZZ = tf.reshape(depth_img, [-1]) zeros_target = tf.zeros_like(ZZ) mask = tf.not_equal(ZZ, zeros_target) ones = tf.ones_like(x_t_flat) sampling_grid_2d = tf.concat([x_t_flat, y_t_flat, ones], 0) sampling_grid_2d_sparse = tf.transpose(tf.boolean_mask(tf.transpose(sampling_grid_2d), mask)) ZZ_saved = tf.boolean_mask(ZZ, mask) ones_saved = tf.expand_dims(tf.ones_like(ZZ_saved), 0) projection_grid_3d = tf.matmul(tf.matrix_inverse(tf_K_mat), sampling_grid_2d_sparse*ZZ_saved) homog_points_3d = tf.concat([projection_grid_3d, ones_saved], 0) final_transformation_matrix = tf.matmul(transformation_matrix, small_transform)[:3,:] warped_sampling_grid = tf.matmul(final_transformation_matrix, homog_points_3d) points_2d = tf.matmul(tf_K_mat, warped_sampling_grid[:3,:]) Z = points_2d[2,:] x_dash_pred = points_2d[0,:] y_dash_pred = points_2d[1,:] 
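    # Note (added for clarity): points_2d above is the pinhole projection
    #   [x', y', Z]^T = K * ( [R | t] * [X, Y, Z_in, 1]^T )
    # applied to every retained depth pixel, so the pixel coordinates
    # recovered below are x = x'/Z and y = y'/Z (the perspective divide).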
point_cloud = tf.stack([x_dash_pred, y_dash_pred, Z], 1) sparse_point_cloud = sparsify_cloud(point_cloud) x = tf.transpose(points_2d[0,:]/Z) y = tf.transpose(points_2d[1,:]/Z) mask_int = tf.cast(mask, 'int32') updated_indices = tf.expand_dims(tf.boolean_mask(mask_int*z_index, mask), 1) updated_Z = tf.scatter_nd(updated_indices, Z, tf.constant([width*height])) updated_x = tf.scatter_nd(updated_indices, x, tf.constant([width*height])) neg_ones = tf.ones_like(updated_x)*-1.0 updated_x_fin = tf.where(tf.equal(updated_Z, zeros_target), neg_ones, updated_x) updated_y = tf.scatter_nd(updated_indices, y, tf.constant([width*height])) updated_y_fin = tf.where(tf.equal(updated_Z, zeros_target), neg_ones, updated_y) reprojected_grid = tf.stack([updated_x_fin, updated_y_fin], 1) transformed_depth = tf.reshape(updated_Z, (IMG_HT, IMG_WDT)) return reprojected_grid, transformed_depth, sparse_point_cloud def reverse_all(z): """Reversing from cantor function indices to correct indices""" z = tf.cast(z, 'float32') w = tf.floor((tf.sqrt(8.*z + 1.) - 1.)/2.0) t = (w**2 + w)/2.0 y = tf.clip_by_value(tf.expand_dims(z - t, 1), 0.0, IMG_HT - 1) x = tf.clip_by_value(tf.expand_dims(w - y[:,0], 1), 0.0, IMG_WDT - 1) return tf.concat([y,x], 1) def get_pixel_value(img, x, y): """Cantor pairing for removing non-unique updates and indices. At the time of implementation, unfixed issue with scatter_nd causes problems with int32 update values. 
Till resolution, implemented on cpu """ with tf.device('/cpu:0'): indices = tf.stack([y, x], 2) indices = tf.reshape(indices, (375*1242, 2)) values = tf.reshape(img, [-1]) Y = indices[:,0] X = indices[:,1] Z = (X + Y)*(X + Y + 1)/2 + Y filtered, idx = tf.unique(tf.squeeze(Z)) updated_values = tf.unsorted_segment_max(values, idx, tf.shape(filtered)[0]) # updated_indices = tf.map_fn(fn=lambda i: reverse(i), elems=filtered, dtype=tf.float32) updated_indices = reverse_all(filtered) updated_indices = tf.cast(updated_indices, 'int32') resolved_map = tf.scatter_nd(updated_indices, updated_values, img.shape) return resolved_map def _bilinear_sampling(img, x_func, y_func): """ Sampling from input image and performing bilinear interpolation """ max_y = tf.constant(IMG_HT - 1, dtype=tf.int32) max_x = tf.constant(IMG_WDT - 1, dtype=tf.int32) zero = tf.zeros([], dtype='int32') # rescale x and y to [0, W/H/D] x = 0.5 * ((x_func + 1.0) * tf.cast(IMG_WDT - 1, 'float32')) y = 0.5 * ((y_func + 1.0) * tf.cast(IMG_HT - 1, 'float32')) x = tf.clip_by_value(x, 0.0, tf.cast(max_x, 'float32')) y = tf.clip_by_value(y, 0.0, tf.cast(max_y, 'float32')) # grab 4 nearest corner points for each (x_i, y_i) # i.e. 
we need a rectangle around the point of interest
    x0 = tf.cast(tf.floor(x), 'int32')
    x1 = x0 + 1
    y0 = tf.cast(tf.floor(y), 'int32')
    y1 = y0 + 1

    # clip to range [0, H/W] to not violate img boundaries
    x0 = tf.clip_by_value(x0, 0, max_x)
    x1 = tf.clip_by_value(x1, 0, max_x)
    y0 = tf.clip_by_value(y0, 0, max_y)
    y1 = tf.clip_by_value(y1, 0, max_y)

    # find Ia, Ib, Ic, Id (the four corner intensities)
    Ia = get_pixel_value(img, x0, y0)
    Ib = get_pixel_value(img, x0, y1)
    Ic = get_pixel_value(img, x1, y0)
    Id = get_pixel_value(img, x1, y1)

    x0 = tf.cast(x0, 'float32')
    x1 = tf.cast(x1, 'float32')
    y0 = tf.cast(y0, 'float32')
    y1 = tf.cast(y1, 'float32')

    # calculate deltas (bilinear interpolation weights)
    wa = (x1-x) * (y1-y)
    wb = (x1-x) * (y-y0)
    wc = (x-x0) * (y1-y)
    wd = (x-x0) * (y-y0)

    loc = wa*Ia + wb*Ib + wc*Ic + wd*Id
    return loc

================================================
FILE: code/common/cnn_utils_res.py
================================================
import numpy as np
import tensorflow as tf


def weight_variable(shape, str):
    '''
    Helper function to create a weight variable with Xavier initialization

    Parameters
    ----------
    shape : list
        Size of weight variable
    '''
    W = tf.get_variable("weight" + str, shape=shape, initializer=tf.contrib.layers.xavier_initializer())
    return W


def weight_variable_fc(shape, str):
    '''
    Helper function to create a fully-connected weight variable with scaled
    (0.01x) Xavier initialization

    Parameters
    ----------
    shape : list
        Size of weight variable
    '''
    W = 0.01*tf.get_variable("weight" + str, shape=shape, initializer=tf.contrib.layers.xavier_initializer())
    return W


def bias_variable(shape, str):
    B = tf.Variable(tf.constant(0.0, shape=shape, dtype=tf.float32), name="bias" + str)
    return B


def init_weights(W, str, to_train):
    init = tf.constant_initializer(W)
    weight = tf.get_variable('weight'+str, shape=W.shape, dtype=tf.float32, initializer=init, trainable=to_train)
    return weight


def init_bias(B, layerno, to_train):
    init = tf.constant_initializer(B)
    bias = 
tf.get_variable('bias_%d'%layerno, shape=B.shape, dtype=tf.float32, initializer= init, trainable = to_train) return bias def conv2d_batchnorm(x, W, name, phase, beta_r, gamma_r, mean_r, variance_r, stride = [1,1,1,1], relu = True): beta = tf.constant_initializer(beta_r) gamma = tf.constant_initializer(gamma_r) moving_mean = tf.constant_initializer(mean_r) moving_variance = tf.constant_initializer(variance_r) with tf.name_scope(name): mid1 = tf.nn.conv2d(x, W, strides = stride, padding = "SAME") with tf.name_scope('batch_norm'): mid2 = tf.contrib.layers.batch_norm(mid1, param_initializers={'beta': beta, 'gamma': gamma, 'moving_mean': moving_mean,'moving_variance': moving_variance,}, is_training = phase, updates_collections = None, scale = True, decay = 0.9) if(relu == True): return tf.nn.relu(mid2) else: return mid2 def conv2d_batchnorm_init(x, W, name, phase, stride = [1,1,1,1], relu = True): with tf.name_scope(name): mid1 = tf.nn.conv2d(x, W, strides = stride, padding ="SAME") with tf.name_scope('batch_norm'): mid2 = tf.contrib.layers.batch_norm(mid1, is_training = phase, updates_collections = None) if(relu == True): return tf.nn.relu(mid2) else: return mid2 def conv2d_init(x, W, name, phase, stride, padding): with tf.name_scope(name): mid1 = tf.nn.conv2d(x, W, strides = stride, padding = padding) mid2 = tf.nn.relu(mid1) return mid2 def conv2d_bias_init(x, W, b, name): with tf.name_scope(name): mid1 = tf.nn.conv2d(x, W, strides = [1,1,1,1], padding = "SAME") + b mid2 = tf.nn.relu(mid1) return mid2 def max_pool(x, name): return tf.nn.max_pool(x, ksize = [1,2,2,1], strides = [1,2,2,1], padding = "SAME", name=name) def variable_summaries(var): """Attach a lot of summaries to a Tensor (for TensorBoard visualization).""" with tf.name_scope('summaries'): mean = tf.reduce_mean(var) sum_mean = tf.summary.scalar('mean', mean) with tf.name_scope('stddev'): stddev = tf.sqrt(tf.reduce_mean(tf.square(var - mean))) sum_stddev = tf.summary.scalar('stddev', stddev) 
#tf.summary.scalar('max', tf.reduce_max(var)) #tf.summary.scalar('min', tf.reduce_min(var)) sum_hist = tf.summary.histogram('histogram', var) return [sum_mean, sum_hist, sum_stddev] ================================================ FILE: code/common/global_agg_net.py ================================================ import numpy as np import tensorflow as tf import scipy.misc as smc import matplotlib.pyplot as plt import config_res as config from cnn_utils_res import * import resnet_rgb_model as model import resnet_depth_model as model_depth batch_size = config.net_params['batch_size'] current_epoch = config.net_params['load_epoch'] def End_Net_weights_init(): """ Initialize Aggregation Network Weights and Summaries """ W_ext1 = weight_variable([3,3,768,384], "_8") W_ext2 = weight_variable([3,3,384,384], "_9") W_ext3 = weight_variable([1,2,384,384], "_10") W_ext4_rot = weight_variable([1,1,384,384], "_11") W_fc_rot = weight_variable_fc([3840,3], "_12") W_ext4_tr = weight_variable([1,1,384,384], "_13") W_fc_tr = weight_variable_fc([3840,3], "_14") end_weights = [W_ext1, W_ext2, W_ext3, W_ext4_rot, W_fc_rot, W_ext4_tr, W_fc_tr] weight_summaries = [] for weight_index in range(len(end_weights)): with tf.name_scope('weight_%d'%weight_index): weight_summaries += variable_summaries(end_weights[weight_index]) return end_weights, weight_summaries def End_Net(input_x, phase_depth, keep_prob): """ Define Aggregation Network """ weights, summaries = End_Net_weights_init() layer8 = conv2d_batchnorm_init(input_x, weights[0], name="conv_9", phase= phase_depth, stride=[1,2,2,1]) layer9 = conv2d_batchnorm_init(layer8, weights[1], name="conv_10", phase= phase_depth, stride=[1,2,2,1]) layer10 = conv2d_batchnorm_init(layer9, weights[2], name="conv_11", phase= phase_depth, stride=[1,1,1,1]) layer11_rot = conv2d_batchnorm_init(layer10, weights[3], name="conv_12", phase= phase_depth, stride=[1,1,1,1]) layer11_m_rot = tf.reshape(layer11_rot, [batch_size, 3840]) layer11_drop_rot = 
tf.nn.dropout(layer11_m_rot, keep_prob) layer11_vec_rot = (tf.matmul(layer11_drop_rot, weights[4])) layer11_tr = conv2d_batchnorm_init(layer10, weights[5], name="conv_13", phase= phase_depth, stride=[1,1,1,1]) layer11_m_tr = tf.reshape(layer11_tr, [batch_size, 3840]) layer11_drop_tr = tf.nn.dropout(layer11_m_tr, keep_prob) layer11_vec_tr = (tf.matmul(layer11_drop_tr, weights[6])) output_vectors = tf.concat([layer11_vec_tr, layer11_vec_rot], 1) return output_vectors, summaries def End_Net_Out(X1, phase_rgb, pooled_input2, phase, keep_prob): """ Computation Graph """ RGB_Net_obj = model.Resnet(X1, phase_rgb) Depth_Net_obj = model_depth.Depthnet(pooled_input2, phase) with tf.variable_scope('ResNet'): with tf.device('/device:GPU:0'): output_rgb = RGB_Net_obj.Net() with tf.variable_scope('DepthNet'): with tf.device('/device:GPU:1'): output_depth = Depth_Net_obj.Net() layer_next = tf.concat([output_rgb, output_depth], 3) end_net_op = End_Net(layer_next, phase, keep_prob) return end_net_op ================================================ FILE: code/common/resnet_depth_model.py ================================================ import numpy as np import tensorflow as tf import json from cnn_utils_res import * import config_res with open(config_res.paths['resnet_params_path']) as f_in: parameters = json.load(f_in) class Depthnet: def __init__(self, input_y, phase, parameters = parameters): self.input_y = input_y self.phase = phase self.parameters = parameters self.layer_zero self.layer self.Net def Net(self): layer_zero_out = self.layer_zero(self.input_y) current_output = layer_zero_out for layer_idx in range(1,5): layer_out = self.layer(current_output, layer_idx) current_output = layer_out return current_output def layer_zero(self, layer_input): layer_dict =self.parameters['layer0'] bl_str = "block_1" W = np.array(layer_dict[bl_str]['conv1']['weight'], dtype = np.float32) bn_mov_mean = np.array(layer_dict[bl_str]['bn1']['running_mean'], dtype = np.float32) bn_mov_var = 
np.array(layer_dict[bl_str]['bn1']['running_var'], dtype = np.float32) bn_gamma = np.array(layer_dict[bl_str]['bn1']['weight'], dtype = np.float32) bn_beta = np.array(layer_dict[bl_str]['bn1']['bias'], dtype = np.float32) shapex = W.shape W_conv = weight_variable([shapex[0], shapex[1], 1, shapex[3]/2], "_depth_0") # out = conv2d_batchnorm(layer_input, W_conv, "layer_depth_0", self.phase, bn_beta, bn_gamma, bn_mov_mean, bn_mov_var, [1,2,2,1], True) out = conv2d_batchnorm_init(layer_input, W_conv, "layer_depth_0", self.phase, [1,2,2,1], True) out = tf.nn.max_pool(out, [1,3,3,1], strides=[1,2,2,1], padding="SAME") print('layer0', out.shape) return out def layer(self, layer_input, layer_no): layer_dict = self.parameters['layer%d'%layer_no] cur = layer_input res = layer_input for b_no in range(1,3): bl_str = "block_%d"%b_no stride = [0,0] if(b_no == 1): stride = [2,1] else: stride = [1,1] # for in_bno in range(1,3): W1 = np.array(layer_dict[bl_str]['conv1']['weight'], dtype = np.float32) bn_mov_mean1 = np.array(layer_dict[bl_str]['bn1']['running_mean'], dtype = np.float32) bn_mov_var1 = np.array(layer_dict[bl_str]['bn1']['running_var'], dtype = np.float32) bn_gamma1 = np.array(layer_dict[bl_str]['bn1']['weight'], dtype = np.float32) bn_beta1 = np.array(layer_dict[bl_str]['bn1']['bias'], dtype = np.float32) W2 = np.array(layer_dict[bl_str]['conv2']['weight'], dtype = np.float32) bn_mov_mean2 = np.array(layer_dict[bl_str]['bn2']['running_mean'], dtype = np.float32) bn_mov_var2 = np.array(layer_dict[bl_str]['bn2']['running_var'], dtype = np.float32) bn_gamma2 = np.array(layer_dict[bl_str]['bn2']['weight'], dtype = np.float32) bn_beta2 = np.array(layer_dict[bl_str]['bn2']['bias'], dtype = np.float32) # W_conv1 = init_weights(W1, "_l_%d_bl_%d_no_%d"%(layer_no,b_no, 1), False) # W_conv2 = init_weights(W2, "_l_%d_bl_%d_no_%d"%(layer_no,b_no, 2), False) shapex1 = W1.shape shapex2 = W2.shape W_conv1 = weight_variable([shapex1[0], shapex1[1], shapex1[2]/2, shapex1[3]/2], 
"dep_l_%d_bl_%d_no_%d"%(layer_no,b_no, 1)) W_conv2 = weight_variable([shapex2[0], shapex2[1], shapex2[2]/2, shapex2[3]/2], "dep_l_%d_bl_%d_no_%d"%(layer_no,b_no, 2)) # out1 = conv2d_batchnorm(cur, W_conv1, "layer_%d_%d_1"%(layer_no,b_no), self.phase, bn_beta1, bn_gamma1, bn_mov_mean1, bn_mov_var1, [1,stride[0],stride[0],1], False) out1 = conv2d_batchnorm_init(cur, W_conv1, "dep_layer_%d_%d_1"%(layer_no,b_no), self.phase, [1,stride[0],stride[0],1], False) print("layer_%d_%d_1"%(layer_no,b_no), out1.shape) """ if layer1 no downsample, so stride 2,1 then 1,1 """ """else stride 2,1 then downsample then 1,1 """ if(layer_no > 1 and b_no == 1): downsample_dict = self.parameters['layer%d_downsample'%layer_no] W_dn = np.array(downsample_dict['block_1']['conv']['weight'], dtype = np.float32) bn_mov_mean_dn = np.array(downsample_dict['block_1']['bn']['running_mean'], dtype = np.float32) bn_mov_var_dn = np.array(downsample_dict['block_1']['bn']['running_var'], dtype = np.float32) bn_gamma_dn = np.array(downsample_dict['block_1']['bn']['weight'], dtype = np.float32) bn_beta_dn = np.array(downsample_dict['block_1']['bn']['bias'], dtype = np.float32) # W_conv_dn = init_weights(W_dn, "downsample_%d"%(layer_no), False) # res = conv2d_batchnorm(res, W_conv_dn, "layer_dn_%d"%(layer_no), self.phase, bn_beta_dn, bn_gamma_dn, bn_mov_mean_dn, bn_mov_var_dn, [1,2,2,1], False) shapex = W_dn.shape W_conv_dn = weight_variable([shapex[0], shapex[1], shapex[2]/2, shapex[3]/2], "dep_downsample_%d"%(layer_no)) res = conv2d_batchnorm_init(res, W_conv_dn, "dep_layer_dn_%d"%(layer_no), self.phase, [1,2,2,1], False) print("downsample_layer_%d_%d_1"%(layer_no,b_no), res.shape) out1 = tf.nn.relu(out1 + res) else: out1 = tf.nn.relu(out1) out2 = conv2d_batchnorm_init(out1, W_conv2, "dep_layer_%d_%d_2"%(layer_no,b_no), self.phase, [1,stride[1],stride[1],1], True) print("layer_%d_%d_2"%(layer_no,b_no), out2.shape) cur = out2 return cur ================================================ FILE: 
code/common/resnet_rgb_model.py ================================================ import numpy as np import tensorflow as tf import json from cnn_utils_res import * import config_res with open(config_res.paths['resnet_params_path']) as f_in: parameters = json.load(f_in) class Resnet: def __init__(self, input_x, phase, parameters = parameters): self.input_x = input_x self.phase = phase self.parameters = parameters self.layer_zero self.layer self.Net def Net(self): layer_zero_out = self.layer_zero(self.input_x) current_output = layer_zero_out for layer_idx in range(1,5): layer_out = self.layer(current_output, layer_idx) current_output = layer_out return current_output def layer_zero(self, layer_input): layer_dict =self.parameters['layer0'] bl_str = "block_1" W = np.array(layer_dict[bl_str]['conv1']['weight'], dtype = np.float32) bn_mov_mean = np.array(layer_dict[bl_str]['bn1']['running_mean'], dtype = np.float32) bn_mov_var = np.array(layer_dict[bl_str]['bn1']['running_var'], dtype = np.float32) bn_gamma = np.array(layer_dict[bl_str]['bn1']['weight'], dtype = np.float32) bn_beta = np.array(layer_dict[bl_str]['bn1']['bias'], dtype = np.float32) W_conv = init_weights(W, "_0", False) # with tf.name_scope('layer_0'): out = conv2d_batchnorm(layer_input, W_conv, "layer_0", self.phase, bn_beta, bn_gamma, bn_mov_mean, bn_mov_var, [1,2,2,1], True) out = tf.nn.max_pool(out, [1,3,3,1], strides=[1,2,2,1], padding="SAME") print('layer0', out.shape) return out def layer(self, layer_input, layer_no): layer_dict = self.parameters['layer%d'%layer_no] cur = layer_input res = layer_input for b_no in range(1,3): bl_str = "block_%d"%b_no stride = [0,0] if(b_no == 1): stride = [2,1] else: stride = [1,1] # for in_bno in range(1,3): W1 = np.array(layer_dict[bl_str]['conv1']['weight'], dtype = np.float32) bn_mov_mean1 = np.array(layer_dict[bl_str]['bn1']['running_mean'], dtype = np.float32) bn_mov_var1 = np.array(layer_dict[bl_str]['bn1']['running_var'], dtype = np.float32) bn_gamma1 = 
np.array(layer_dict[bl_str]['bn1']['weight'], dtype = np.float32) bn_beta1 = np.array(layer_dict[bl_str]['bn1']['bias'], dtype = np.float32) W2 = np.array(layer_dict[bl_str]['conv2']['weight'], dtype = np.float32) bn_mov_mean2 = np.array(layer_dict[bl_str]['bn2']['running_mean'], dtype = np.float32) bn_mov_var2 = np.array(layer_dict[bl_str]['bn2']['running_var'], dtype = np.float32) bn_gamma2 = np.array(layer_dict[bl_str]['bn2']['weight'], dtype = np.float32) bn_beta2 = np.array(layer_dict[bl_str]['bn2']['bias'], dtype = np.float32) W_conv1 = init_weights(W1, "_l_%d_bl_%d_no_%d"%(layer_no,b_no, 1), False) W_conv2 = init_weights(W2, "_l_%d_bl_%d_no_%d"%(layer_no,b_no, 2), False) # with tf.name_scope("layer_%d_%d_1"%(layer_no,b_no)): out1 = conv2d_batchnorm(cur, W_conv1, "layer_%d_%d_1"%(layer_no,b_no), self.phase, bn_beta1, bn_gamma1, bn_mov_mean1, bn_mov_var1, [1,stride[0],stride[0],1], False) print("layer_%d_%d_1"%(layer_no,b_no), out1.shape) """ if layer1 no downsample, so stride 2,1 then 1,1 """ """else stride 2,1 then downsample then 1,1 """ if(layer_no > 1 and b_no == 1): downsample_dict = self.parameters['layer%d_downsample'%layer_no] W_dn = np.array(downsample_dict['block_1']['conv']['weight'], dtype = np.float32) bn_mov_mean_dn = np.array(downsample_dict['block_1']['bn']['running_mean'], dtype = np.float32) bn_mov_var_dn = np.array(downsample_dict['block_1']['bn']['running_var'], dtype = np.float32) bn_gamma_dn = np.array(downsample_dict['block_1']['bn']['weight'], dtype = np.float32) bn_beta_dn = np.array(downsample_dict['block_1']['bn']['bias'], dtype = np.float32) W_conv_dn = init_weights(W_dn, "downsample_%d"%(layer_no), False) # with tf.name_scope("downsample_layer_%d_%d"%(layer_no,b_no)): res = conv2d_batchnorm(res, W_conv_dn, "layer_dn_%d"%(layer_no), self.phase, bn_beta_dn, bn_gamma_dn, bn_mov_mean_dn, bn_mov_var_dn, [1,2,2,1], False) print("downsample_layer_%d_%d_1"%(layer_no,b_no), res.shape) out1 = tf.nn.relu(out1 + res) else: out1 = 
tf.nn.relu(out1)

            # with tf.name_scope("layer_%d_%d_2"%(layer_no,b_no)):
            out2 = conv2d_batchnorm(out1, W_conv2, "layer_%d_%d_2"%(layer_no,b_no), self.phase, bn_beta2, bn_gamma2, bn_mov_mean2, bn_mov_var2, [1,stride[1],stride[1],1], True)
            print("layer_%d_%d_2"%(layer_no,b_no), out2.shape)

            cur = out2

        return cur

================================================
FILE: code/config_res.py
================================================
import numpy as np

# Important paths
# resnet_params_path: path containing .pkl file of RESNET18_BN weights
# dataset_path_full: path to parser file
# checkpoint_path: dir where checkpoints are saved and loaded from
# training_imgs_path: dir to save training progress frames (spatial transformer outputs during training)
# validation_imgs_path: dir to save validation progress frames (spatial transformer outputs during validation)

paths = dict(
    resnet_params_path = "../Extrinsic_Calibration_2/parameters.json",
    dataset_path_full = "/home/ganeshiyer/Extrinsic_Calibration_1.5/model_with_emd/rotation_results/parsed_set.txt",
    checkpoint_path = "/tmp/ganesh_saved_Weights/Checkpoint_simple_transformer",
    training_imgs_path = "/tmp/ganesh_saved_Weights/training_imgs",
    validation_imgs_path = "/tmp/ganesh_saved_Weights/validation_imgs"
)

# Depth Map parameters
# IMG_HT: input image height
# IMG_WDT: input image width

depth_img_params = dict(
    IMG_HT = 375,
    IMG_WDT = 1242
)

# Camera parameters
# The current parameters are for the training set created from all the raw
# sequences recorded on 2011_09_26: http://www.cvlibs.net/datasets/kitti/raw_data.php
# Camera parameters change based on the date of the sequence, hence they are
# kept in the config
# fx: focal length x
# fy: focal length y
# cx: camera center cx
# cy: camera center cy
# cam_transform_02: Transform to color camera 02 (applied after transform)
# cam_transform_02_inv: Inverse translation of above

camera_params = dict(
    fx = 7.215377e+02,
    fy = 7.215377e+02,
    cx = 6.095593e+02,
    cy = 1.728540e+02,

    cam_transform_02 = np.array([1.0, 0.0, 0.0, (-4.485728e+01)/7.215377e+02,
                                 0.0, 1.0, 0.0, (-2.163791e-01)/7.215377e+02,
                                 0.0, 0.0, 1.0, -2.745884e-03,
                                 0.0, 0.0, 0.0, 1.0]).reshape(4,4),

    cam_transform_02_inv = np.array([1.0, 0.0, 0.0, (4.485728e+01)/7.215377e+02,
                                     0.0, 1.0, 0.0, (2.163791e-01)/7.215377e+02,
                                     0.0, 0.0, 1.0, 2.745884e-03,
                                     0.0, 0.0, 0.0, 1.0]).reshape(4,4)
)

# Network and Training Parameters
# batch_size: batch size used during training
# total_frames: total instances (check parsed_set.txt for total number of lines)
# total_frames_train: total training instances
# total_frames_validation: total validation instances
# partition_limit: partition size of the total dataset loaded into memory during training
# epochs: total number of epochs
# learning_rate
# beta1: momentum term for Adam Optimizer
# load_epoch: load checkpoint no. 0 at the start of training (can be changed to resume training)

net_params = dict(
    batch_size = 20,
    total_frames = 30000,
    total_frames_train = 24000,
    total_frames_validation = 6000,
    partition_limit = 1200,
    epochs = 40,
    learning_rate = 5e-4,
    beta1 = 0.9,
    load_epoch = 0
)

================================================
FILE: code/dataset_files/dataset_build_color.py
================================================
"""
For sequence: 2011_09_26
"""

import numpy as np
import matplotlib.pyplot as plt
import os
import glob
import argparse
from natsort import natsorted as ns
from skimage import io
import scipy.misc as smc

plt.ion()

IMG_HT = 375
IMG_WDT = 1242

fx = 7.215377e+02
fy = 7.215377e+02
cx = 6.095593e+02
cy = 1.728540e+02

K = np.array([7.215377e+02, 0.000000e+00, 6.095593e+02,
              0.000000e+00, 7.215377e+02, 1.728540e+02,
              0.000000e+00, 0.000000e+00, 1.000000e+00]).reshape(3,3)

velo_to_cam_R = np.array([7.533745e-03, -9.999714e-01, -6.166020e-04,
                          1.480249e-02, 7.280733e-04, -9.998902e-01,
                          9.998621e-01, 7.523790e-03, 1.480755e-02]).reshape(3,3)

velo_to_cam_T = np.array([-4.069766e-03, -7.631618e-02, -2.717806e-01]).reshape(3,1)

velo_to_cam = np.vstack((np.hstack((velo_to_cam_R, velo_to_cam_T)), np.array([[0,0,0,1]])))

R_rect_00 = np.array([9.999239e-01, 9.837760e-03, -7.445048e-03, 0.0,
                      -9.869795e-03, 9.999421e-01, -4.278459e-03, 0.0,
                      7.402527e-03, 4.351614e-03, 9.999631e-01, 0.0,
                      0.0, 0.0, 0.0, 1.0]).reshape(4,4)

cam_02_transform = np.array([1.0, 0.0, 0.0, 4.485728e+01/fx,
                             0.0, 1.0, 0.0, 2.163791e-01/fy,
                             0.0, 0.0, 1.0, 2.745884e-03,
                             0.0, 0.0, 0.0, 1.0]).reshape(4,4)

parser = argparse.ArgumentParser(description="Create Lidar Dataset")
parser.add_argument("path", help="path_to_folder, end with number", type=str)
args = parser.parse_args()

# main_path = "/tmp/lidar_calibration_dataset/2011_09_26/2011_09_26_drive_0104"
main_path = args.path

def timestamp_sync(path):
    txt1 = np.loadtxt(path + "_extract/velodyne_points/timestamps.txt", dtype=str)
    txt2 = np.loadtxt(path 
+ "_sync/velodyne_points/timestamps.txt", dtype = str) file_list = ns(glob.glob(path + "_extract/velodyne_points/data/*.txt")) times1 = txt1[:,1] times2 = txt2[:,1] for idx in range(times1.shape[0]): times1[idx] = times1[idx].split(":")[2] for idx in range(times2.shape[0]): times2[idx] = times2[idx].split(":")[2] start_pt = times2[0] end_pt = times2[-1] index_start = np.where(times1 == start_pt)[0][0] index_end = np.where(times1 == end_pt)[0][0] return file_list[index_start:index_end+1] if not os.path.exists(main_path + "_sync/depth_maps"): os.makedirs(main_path + "_sync/depth_maps") if not os.path.exists(main_path + "_sync/target_imgs"): os.makedirs(main_path + "_sync/target_imgs") if not os.path.exists(main_path + "_sync/depth_maps_transformed"): os.makedirs(main_path + "_sync/depth_maps_transformed") depth_maps_folder = main_path + "_sync/depth_maps" target_img_folder = main_path + "_sync/target_imgs" depth_maps_transformed_folder = main_path + "_sync/depth_maps_transformed" point_files = timestamp_sync(main_path) imgs_files = ns(glob.glob(main_path + "_sync/image_02/data/*.png")) angle_limit = 0.34722965035593395/1.25 tr_limit = 0.34722965035593395/1.25 angle_list = np.zeros((1,16), dtype = np.float32) for img_name, cloud_name in zip(imgs_files, point_files): print(img_name, cloud_name) omega_x = angle_limit*np.random.random_sample() - (angle_limit/2.0) omega_y = angle_limit*np.random.random_sample() - (angle_limit/2.0) omega_z = angle_limit*np.random.random_sample() - (angle_limit/2.0) tr_x = tr_limit*np.random.random_sample() - (tr_limit/2.0) tr_y = tr_limit*np.random.random_sample() - (tr_limit/2.0) tr_z = tr_limit*np.random.random_sample() - (tr_limit/2.0) theta = np.sqrt(omega_x**2 + omega_y**2 + omega_z**2) omega_cross = np.array([0.0, -omega_z, omega_y, omega_z, 0.0, -omega_x, -omega_y, omega_x, 0.0]).reshape(3,3) A = np.sin(theta)/theta B = (1.0 - np.cos(theta))/(theta**2) R = np.eye(3,3) + A*omega_cross + B*np.matmul(omega_cross, omega_cross) T = 
np.array([tr_x, tr_y, tr_z]).reshape(3,1) random_transform = np.vstack((np.hstack((R, T)), np.array([[0.0, 0.0, 0.0, 1.0]]))) to_write_tr = np.expand_dims(np.ndarray.flatten(random_transform), 0) angle_list = np.vstack((angle_list, to_write_tr)) points = np.loadtxt(cloud_name) points = points[:,:3] ones_col = np.ones(shape=(points.shape[0],1)) points = np.hstack((points,ones_col)) current_img = smc.imread(img_name) # current_img_color = smc.imread(color_name) img = smc.imread(img_name) img_ht = img.shape[0] img_wdt = img.shape[1] points_in_cam_axis = np.matmul(R_rect_00, (np.matmul(velo_to_cam, points.T))) transformed_points = np.matmul(random_transform, points_in_cam_axis) # transformed_points = transformed_points[:-1,:] points_2d = np.matmul(K, np.matmul(cam_02_transform, transformed_points)[:-1,:]) # points_2d = np.matmul(K, points_in_cam_axis[:-1,:]) Z = points_2d[2,:] x = (points_2d[0,:]/Z).T y = (points_2d[1,:]/Z).T x = np.clip(x, 0.0, img_wdt - 1) y = np.clip(y, 0.0, img_ht - 1) reprojected_img = np.zeros_like(img) for x_idx, y_idx,z_idx in zip(x,y,Z): if(z_idx>0): reprojected_img[int(y_idx), int(x_idx)] = z_idx smc.imsave(depth_maps_transformed_folder + "/" + img_name[-14:], reprojected_img) points_2d = np.matmul(K, np.matmul(cam_02_transform, points_in_cam_axis)[:-1,:]) Z = points_2d[2,:] x = (points_2d[0,:]/Z).T y = (points_2d[1,:]/Z).T x = np.clip(x, 0.0, img_wdt - 1) y = np.clip(y, 0.0, img_ht - 1) reprojected_img = np.zeros_like(img) for x_idx, y_idx,z_idx in zip(x,y,Z): if(z_idx>0): reprojected_img[int(y_idx), int(x_idx)] = z_idx pooled_img = reprojected_img print(img_name[-14:]) reconstructed_img = current_img*(pooled_img>0.) 
    smc.imsave(depth_maps_folder + "/" + img_name[-14:], pooled_img)
    smc.imsave(target_img_folder + "/" + img_name[-14:], reconstructed_img)

np.savetxt(depth_maps_transformed_folder + "/../angle_list.txt", angle_list[1:], fmt = "%.4f")

================================================ FILE: code/dataset_files/dataset_build_color_2.py ================================================

"""
For sequence: 2011_09_26
"""

import numpy as np
import matplotlib.pyplot as plt
import os
import glob
import argparse
from natsort import natsorted as ns
from skimage import io
import scipy.misc as smc

plt.ion()

IMG_HT = 375
IMG_WDT = 1242

fx = 7.215377e+02
fy = 7.215377e+02
cx = 6.095593e+02
cy = 1.728540e+02

K = np.array([7.215377e+02, 0.000000e+00, 6.095593e+02,
              0.000000e+00, 7.215377e+02, 1.728540e+02,
              0.000000e+00, 0.000000e+00, 1.000000e+00]).reshape(3,3)

velo_to_cam_R = np.array([7.533745e-03, -9.999714e-01, -6.166020e-04,
                          1.480249e-02, 7.280733e-04, -9.998902e-01,
                          9.998621e-01, 7.523790e-03, 1.480755e-02]).reshape(3,3)
velo_to_cam_T = np.array([-4.069766e-03, -7.631618e-02, -2.717806e-01]).reshape(3,1)
velo_to_cam = np.vstack((np.hstack((velo_to_cam_R, velo_to_cam_T)), np.array([[0,0,0,1]])))

R_rect_00 = np.array([9.999239e-01, 9.837760e-03, -7.445048e-03, 0.0,
                      -9.869795e-03, 9.999421e-01, -4.278459e-03, 0.0,
                      7.402527e-03, 4.351614e-03, 9.999631e-01, 0.0,
                      0.0, 0.0, 0.0, 1.0]).reshape(4,4)

cam_02_transform = np.array([1.0, 0.0, 0.0, 4.485728e+01/fx,
                             0.0, 1.0, 0.0, 2.163791e-01/fy,
                             0.0, 0.0, 1.0, 2.745884e-03,
                             0.0, 0.0, 0.0, 1.0]).reshape(4,4)

parser = argparse.ArgumentParser(description="Create Lidar Dataset")
parser.add_argument("path", help = "path_to_folder, end with number", type = str)
args = parser.parse_args()

# main_path = "/tmp/lidar_calibration_dataset/2011_09_26/2011_09_26_drive_0104"
main_path = args.path

def timestamp_sync(path):
    txt1 = np.loadtxt(path + "_extract/velodyne_points/timestamps.txt", dtype = str)
    txt2 = np.loadtxt(path + "_sync/velodyne_points/timestamps.txt", dtype = str)

    file_list = ns(glob.glob(path + "_extract/velodyne_points/data/*.txt"))

    times1 = txt1[:,1]
    times2 = txt2[:,1]

    for idx in range(times1.shape[0]):
        times1[idx] = times1[idx].split(":")[2]

    for idx in range(times2.shape[0]):
        times2[idx] = times2[idx].split(":")[2]

    start_pt = times2[0]
    end_pt = times2[-1]

    index_start = np.where(times1 == start_pt)[0][0]
    index_end = np.where(times1 == end_pt)[0][0]

    return file_list[index_start:index_end+1]

if not os.path.exists(main_path + "_sync/depth_maps_2"):
    os.makedirs(main_path + "_sync/depth_maps_2")
if not os.path.exists(main_path + "_sync/target_imgs_2"):
    os.makedirs(main_path + "_sync/target_imgs_2")
if not os.path.exists(main_path + "_sync/depth_maps_transformed_2"):
    os.makedirs(main_path + "_sync/depth_maps_transformed_2")

depth_maps_folder = main_path + "_sync/depth_maps_2"
target_img_folder = main_path + "_sync/target_imgs_2"
depth_maps_transformed_folder = main_path + "_sync/depth_maps_transformed_2"

point_files = timestamp_sync(main_path)
imgs_files = ns(glob.glob(main_path + "_sync/image_02/data/*.png"))

angle_limit = 0.34722965035593395/2.50
tr_limit = 0.34722965035593395*1.5
angle_list = np.zeros((1,16), dtype = np.float32)

for img_name, cloud_name in zip(imgs_files, point_files):
    print(img_name, cloud_name)

    # Sample a random perturbation: axis-angle vector (omega) and translation (tr)
    omega_x = angle_limit*np.random.random_sample() - (angle_limit/2.0)
    omega_y = angle_limit*np.random.random_sample() - (angle_limit/2.0)
    omega_z = angle_limit*np.random.random_sample() - (angle_limit/2.0)
    tr_x = tr_limit*np.random.random_sample() - (tr_limit/2.0)
    tr_y = tr_limit*np.random.random_sample() - (tr_limit/2.0)
    tr_z = tr_limit*np.random.random_sample() - (tr_limit/2.0)

    # Rodrigues formula: map the axis-angle vector to a rotation matrix
    theta = np.sqrt(omega_x**2 + omega_y**2 + omega_z**2)
    omega_cross = np.array([0.0, -omega_z, omega_y,
                            omega_z, 0.0, -omega_x,
                            -omega_y, omega_x, 0.0]).reshape(3,3)
    A = np.sin(theta)/theta
    B = (1.0 - np.cos(theta))/(theta**2)
    R = np.eye(3,3) + A*omega_cross + B*np.matmul(omega_cross, omega_cross)
    T = np.array([tr_x, tr_y, tr_z]).reshape(3,1)

    random_transform = np.vstack((np.hstack((R, T)), np.array([[0.0, 0.0, 0.0, 1.0]])))
    to_write_tr = np.expand_dims(np.ndarray.flatten(random_transform), 0)
    angle_list = np.vstack((angle_list, to_write_tr))

    points = np.loadtxt(cloud_name)
    points = points[:,:3]
    ones_col = np.ones(shape=(points.shape[0],1))
    points = np.hstack((points,ones_col))

    current_img = smc.imread(img_name)
    # current_img_color = smc.imread(color_name)
    img = smc.imread(img_name)
    img_ht = img.shape[0]
    img_wdt = img.shape[1]

    # Project the perturbed point cloud to the image plane
    points_in_cam_axis = np.matmul(R_rect_00, (np.matmul(velo_to_cam, points.T)))
    transformed_points = np.matmul(random_transform, points_in_cam_axis)
    # transformed_points = transformed_points[:-1,:]
    points_2d = np.matmul(K, np.matmul(cam_02_transform, transformed_points)[:-1,:])
    # points_2d = np.matmul(K, points_in_cam_axis[:-1,:])

    Z = points_2d[2,:]
    x = (points_2d[0,:]/Z).T
    y = (points_2d[1,:]/Z).T
    x = np.clip(x, 0.0, img_wdt - 1)
    y = np.clip(y, 0.0, img_ht - 1)

    reprojected_img = np.zeros_like(img)
    for x_idx, y_idx, z_idx in zip(x,y,Z):
        if(z_idx>0):
            reprojected_img[int(y_idx), int(x_idx)] = z_idx

    smc.imsave(depth_maps_transformed_folder + "/" + img_name[-14:], reprojected_img)

    # Project the unperturbed cloud for the target depth map
    points_2d = np.matmul(K, np.matmul(cam_02_transform, points_in_cam_axis)[:-1,:])

    Z = points_2d[2,:]
    x = (points_2d[0,:]/Z).T
    y = (points_2d[1,:]/Z).T
    x = np.clip(x, 0.0, img_wdt - 1)
    y = np.clip(y, 0.0, img_ht - 1)

    reprojected_img = np.zeros_like(img)
    for x_idx, y_idx, z_idx in zip(x,y,Z):
        if(z_idx>0):
            reprojected_img[int(y_idx), int(x_idx)] = z_idx

    pooled_img = reprojected_img
    print(img_name[-14:])
    print(np.max(pooled_img), np.min(pooled_img))

    reconstructed_img = current_img*(pooled_img>0.)

    smc.imsave(depth_maps_folder + "/" + img_name[-14:], pooled_img)
    smc.imsave(target_img_folder + "/" + img_name[-14:], reconstructed_img)

np.savetxt(depth_maps_transformed_folder + "/../angle_list_2.txt", angle_list[1:], fmt = "%.4f")

================================================ FILE: code/dataset_files/dataset_builder_parallel.sh ================================================

SCRIPTPATH="$( cd "$(dirname "$0")" ; pwd -P )"

array=( 0001 0002 0005 0009 0011 0013 0014 0015 0017 0018 0019 0020 0022 0023 0027 0028 0029 0032 0035 0036 0039 0046 0048 0051 0052 0056 0057 0059 0060 0061 0064 0070 0079 0084 0086 0087 0091 0093 0095 0096 0101 0104 0106 0113 0117 )

task() {
    wget -k -c https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_26_drive_$1/2011_09_26_drive_$1_extract.zip
    wget -k -c https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_26_drive_$1/2011_09_26_drive_$1_sync.zip
    echo "Downloading $1"

    unzip 2011_09_26_drive_$1_extract.zip
    unzip 2011_09_26_drive_$1_sync.zip
    echo "Extracting $1"

    python -B $SCRIPTPATH/dataset_build_color.py $PWD/2011_09_26/2011_09_26_drive_$1
    python -B $SCRIPTPATH/dataset_build_color_2.py $PWD/2011_09_26/2011_09_26_drive_$1
    echo "Perturbing Depth Maps for $1"
}

# Run at most N download/build jobs in parallel
N=4
(
for id in {0..44}
do
    ((i=i%N)); ((i++==0)) && wait
    task ${array[$id]} &
done
)
wait
echo "All processes done!"
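The two dataset builder scripts above sample a random miscalibration by drawing an axis-angle vector omega and a translation uniformly, then mapping omega to a rotation with the Rodrigues formula R = I + (sin(theta)/theta)*[omega]x + ((1-cos(theta))/theta^2)*[omega]x^2. A standalone NumPy sketch of that sampling step is below; the helper name `random_se3` and the `rng` argument are illustrative, not names from the repository:

```python
import numpy as np

def random_se3(angle_limit, tr_limit, rng=np.random):
    """Sample a random SE(3) perturbation the way the dataset builders do:
    a uniform axis-angle vector and translation, mapped to a 4x4 transform
    via the Rodrigues formula. Sketch only; `random_se3`/`rng` are not
    repository names. As in the original scripts, theta == 0 is not handled."""
    omega = angle_limit * rng.random_sample(3) - angle_limit / 2.0
    tr = tr_limit * rng.random_sample(3) - tr_limit / 2.0

    theta = np.linalg.norm(omega)
    # Skew-symmetric cross-product matrix [omega]x
    wx = np.array([[0.0, -omega[2], omega[1]],
                   [omega[2], 0.0, -omega[0]],
                   [-omega[1], omega[0], 0.0]])
    A = np.sin(theta) / theta
    B = (1.0 - np.cos(theta)) / (theta ** 2)
    R = np.eye(3) + A * wx + B * wx.dot(wx)  # Rodrigues rotation

    # Assemble the homogeneous 4x4 transform [R | t; 0 0 0 1]
    return np.vstack((np.hstack((R, tr.reshape(3, 1))),
                      np.array([[0.0, 0.0, 0.0, 1.0]])))
```

Because the exponential map always lands on SO(3), the sampled rotation block is orthonormal with determinant +1 regardless of the draw, which is what makes the inverse transform in the loader a simple `np.linalg.inv`.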
================================================ FILE: code/dataset_files/parser.py ================================================

import numpy as np
import scipy.misc as smc
from natsort import natsorted as ns
import glob, os
import argparse

parser = argparse.ArgumentParser(description="Create Lidar Dataset Parser file")
parser.add_argument("path", help = "path_to_folder", type = str)
args = parser.parse_args()

dataset_path = args.path

# Picking up all sync folders
folder_names = ns(glob.glob(dataset_path + "*_sync" + os.path.sep))

dataset_array = np.zeros(dtype = str, shape = (1,20))
dataset_array_2 = np.zeros(dtype = str, shape = (1,20))

for fn in folder_names:
    print(fn)

    file_names_source = ns(glob.glob(fn + "depth_maps_transformed/*.png"))
    file_names_target = ns(glob.glob(fn + "depth_maps/*.png"))
    img_source = ns(glob.glob(fn + "image_02/data/*.png"))
    img_target = ns(glob.glob(fn + "image_03/data/*.png"))
    transforms_list = np.loadtxt(fn + "angle_list.txt", dtype = str)

    file_names_source = np.array(file_names_source, dtype=str).reshape(-1,1)
    file_names_target = np.array(file_names_target, dtype=str).reshape(-1,1)
    img_source = np.array(img_source, dtype=str).reshape(-1,1)
    img_target = np.array(img_target, dtype=str).reshape(-1,1)

    dataset = np.hstack((file_names_source, file_names_target, img_source, img_target, transforms_list))
    print(dataset.shape)
    dataset_array = np.vstack((dataset_array, dataset))

    #######################################################################################

    file_names_source_2 = ns(glob.glob(fn + "depth_maps_transformed_2/*.png"))
    file_names_target_2 = ns(glob.glob(fn + "depth_maps_2/*.png"))
    transforms_list_2 = np.loadtxt(fn + "angle_list_2.txt", dtype = str)

    file_names_source_2 = np.array(file_names_source_2, dtype=str).reshape(-1,1)
    file_names_target_2 = np.array(file_names_target_2, dtype=str).reshape(-1,1)

    dataset_2 = np.hstack((file_names_source_2, file_names_target_2, img_source, img_target, transforms_list_2))
    print(dataset_2.shape)
    dataset_array_2 = np.vstack((dataset_array_2, dataset_2))

# Drop the all-zero seed rows
dataset_array = dataset_array[1:]
dataset_array_2 = dataset_array_2[1:]

final_array = np.vstack((dataset_array, dataset_array_2))
np.random.shuffle(final_array)

np.savetxt("parsed_set.txt", final_array, fmt = "%s", delimiter=' ')

================================================ FILE: code/model_utils.py ================================================

import os
import tensorflow as tf

from tf_ops.emd import tf_auctionmatch
from tf_ops.CD import tf_nndistance
from tf_ops.sampling import tf_sampling
from tf_ops.grouping.tf_grouping import query_ball_point, group_point

def pre_load_checkpoint(checkpoint_dir):
    ckpt = tf.train.get_checkpoint_state(checkpoint_dir)
    if ckpt and ckpt.model_checkpoint_path:
        # print(" [*] Reading checkpoint from {}".format(ckpt.model_checkpoint_path))
        epoch_step = int(os.path.basename(ckpt.model_checkpoint_path).split('-')[1])
        return epoch_step, ckpt.model_checkpoint_path
    else:
        return 0, None

def get_repulsion_loss4(pred, nsample=20, radius=0.07):
    # pred: (batch_size, npoint, 3)
    idx, pts_cnt = query_ball_point(radius, nsample, pred, pred)
    tf.summary.histogram('smooth/unque_index', pts_cnt)

    grouped_pred = group_point(pred, idx)  # (batch_size, npoint, nsample, 3)
    grouped_pred -= tf.expand_dims(pred, 2)

    # get the uniform loss
    h = 0.03
    dist_square = tf.reduce_sum(grouped_pred ** 2, axis=-1)
    dist_square, idx = tf.nn.top_k(-dist_square, 5)
    dist_square = -dist_square[:, :, 1:]  # remove the first one (the point itself)
    dist_square = tf.maximum(1e-12, dist_square)
    dist = tf.sqrt(dist_square)
    weight = tf.exp(-dist_square/h**2)
    uniform_loss = tf.reduce_mean(radius - dist*weight)
    return uniform_loss

def get_emd_loss(pred, gt):
    """ pred: BxNxC, label: BxN """
    batch_size = pred.get_shape()[0].value
    matchl_out, matchr_out = tf_auctionmatch.auction_match(pred, gt)
    matched_out = tf_sampling.gather_point(gt, matchl_out)
    dist = tf.reshape((pred - matched_out) ** 2, shape=(batch_size, -1))
    emd_loss = tf.reduce_sum(dist)
    return emd_loss

def get_cd_loss(pred, gt):
    """ pred: BxNxC, label: BxN """
    # dists_forward is, for each element in gt, the closest distance to that element
    dists_forward, _, dists_backward, _ = tf_nndistance.nn_distance(gt, pred)
    CD_dist = tf.reduce_sum(dists_forward) + tf.reduce_sum(dists_backward)
    return CD_dist

if __name__ == '__main__':
    gt = tf.constant([[[1,0,0],[2,0,0],[3,0,0],[4,0,0]]], tf.float32)
    pred = tf.constant([[[-10,0,0], [1,0,0], [2,0,0], [3,0,0]]], tf.float32)
    dists_forward, idx1, dists_backward, idx2 = tf_nndistance.nn_distance(gt, pred)
    with tf.Session() as sess:
        print(idx1.eval())  # for each element in gt, the idx of the nearest element in pred
        print(idx2.eval())  # for each element in pred, the idx of the nearest element in gt
        print(dists_forward.eval())
        print(dists_backward.eval())

================================================ FILE: code/nw_loader_color.py ================================================

import numpy as np
import glob, os, argparse
import scipy.misc as smc
from tqdm import tqdm
import matplotlib.pyplot as plt

import config_res as config

total = config.net_params['total_frames']
total_train = config.net_params['total_frames_train']
total_validation = config.net_params['total_frames_validation']
partition_limit = config.net_params['partition_limit']

IMG_HT = config.depth_img_params['IMG_HT']
IMG_WDT = config.depth_img_params['IMG_WDT']
batch_size = config.net_params['batch_size']

dataset = np.loadtxt(config.paths['dataset_path_full'], dtype = str)
dataset_train = dataset[:total_train]
dataset_validation = dataset[total_train:total]

def shuffle():
    np.random.shuffle(dataset_train)
    np.random.shuffle(dataset_validation)

def load(p_no, mode):
    if(mode == "train"):
        dataset_part = dataset_train[p_no*partition_limit:(p_no + 1)*partition_limit]
    elif(mode == "validation"):
        dataset_part = dataset_validation[p_no*partition_limit:(p_no + 1)*partition_limit]

    source_file_names = dataset_part[:,0]
    target_file_names = dataset_part[:,1]
    source_image_names = dataset_part[:,2]
    target_image_names = dataset_part[:,3]
    transforms = np.float32(dataset_part[:,4:])

    target_container = np.zeros((partition_limit, IMG_HT, IMG_WDT, 1), dtype = np.float32)
    source_container = np.zeros((partition_limit, IMG_HT, IMG_WDT, 1), dtype = np.float32)
    source_img_container = np.zeros((partition_limit, IMG_HT, IMG_WDT, 3), dtype = np.float32)
    target_img_container = np.zeros((partition_limit, IMG_HT, IMG_WDT, 3), dtype = np.float32)
    transforms_container = np.zeros((partition_limit, 4, 4), dtype = np.float32)

    c_idx = 0
    for s_name, t_name, img_source_name, img_target_name, transform in tqdm(zip(source_file_names, target_file_names, source_image_names, target_image_names, transforms)):
        # Depth maps: zero out a 5-pixel border and normalise
        warped_ip = np.float32(smc.imread(s_name, True))
        warped_ip[0:5,:] = 0.0
        warped_ip[:,0:5] = 0.0
        warped_ip[IMG_HT - 5:,:] = 0.0
        warped_ip[:,IMG_WDT - 5:] = 0.0
        warped_ip = (warped_ip - 40.0)/40.0
        source_container[c_idx, :, :, 0] = warped_ip

        target_ip = np.float32(smc.imread(t_name, True))
        target_ip[0:5,:] = 0.0
        target_ip[:,0:5] = 0.0
        target_ip[IMG_HT - 5:,:] = 0.0
        target_ip[:,IMG_WDT - 5:] = 0.0
        target_ip = (target_ip - 40.0)/40.0
        target_container[c_idx, :, :, 0] = target_ip

        # RGB images: same border masking, normalised to [-1, 1]
        source_img = np.float32(smc.imread(img_source_name))
        source_img[0:5,:,:] = 0.0
        source_img[:,0:5,:] = 0.0
        source_img[IMG_HT - 5:,:,:] = 0.0
        source_img[:,IMG_WDT - 5:,:] = 0.0
        source_img = (source_img - 127.5)/127.5
        source_img_container[c_idx, :, :, :] = source_img

        target_img = np.float32(smc.imread(img_target_name))
        target_img[0:5,:,:] = 0.0
        target_img[:,0:5,:] = 0.0
        target_img[IMG_HT - 5:,:,:] = 0.0
        target_img[:,IMG_WDT - 5:,:] = 0.0
        target_img = (target_img - 127.5)/127.5
        target_img_container[c_idx, :, :, :] = target_img

        transforms_container[c_idx, :, :] = np.linalg.inv(transform.reshape(4,4))
        c_idx += 1

    source_container = source_container.reshape(partition_limit/batch_size, batch_size, IMG_HT, IMG_WDT, 1)
    target_container = target_container.reshape(partition_limit/batch_size, batch_size, IMG_HT, IMG_WDT, 1)
    source_img_container = source_img_container.reshape(partition_limit/batch_size, batch_size, IMG_HT, IMG_WDT, 3)
    target_img_container = target_img_container.reshape(partition_limit/batch_size, batch_size, IMG_HT, IMG_WDT, 3)
    transforms_container = transforms_container.reshape(partition_limit/batch_size, batch_size, 4, 4)

    return source_container, target_container, source_img_container, target_img_container, transforms_container

================================================ FILE: code/tf_ops/CD/__init__.py ================================================

================================================ FILE: code/tf_ops/CD/makefile ================================================

nvcc = /usr/local/cuda-8.0/bin/nvcc
cudalib = /usr/local/cuda-8.0/lib64/
tensorflow = /home/ganeshiyer/tensorflow/lib/python2.7/site-packages/tensorflow/include

all: tf_nndistance_so.so render_balls_so.so
.PHONY : all

tf_nndistance_so.so: tf_nndistance_g.cu.o tf_nndistance.cpp
	g++ -std=c++11 tf_nndistance.cpp tf_nndistance_g.cu.o -o tf_nndistance_so.so -shared -fPIC -I $(tensorflow) -lcudart -L $(cudalib) -O2 -D_GLIBCXX_USE_CXX11_ABI=0

tf_nndistance_g.cu.o: tf_nndistance_g.cu
	$(nvcc) -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++11 -c -o tf_nndistance_g.cu.o tf_nndistance_g.cu -I $(tensorflow) -DGOOGLE_CUDA=1 -x cu -Xcompiler -fPIC -O2

render_balls_so.so: render_balls_so.cpp
	g++ -std=c++11 render_balls_so.cpp -o render_balls_so.so -shared -fPIC -O2 -D_GLIBCXX_USE_CXX11_ABI=0

================================================ FILE: code/tf_ops/CD/render_balls_so.cpp ================================================

#include #include #include #include
using namespace std;

struct PointInfo{
    int x,y,z;
    float r,g,b;
};

extern "C"{

void render_ball(int h,int w,unsigned char * show,int n,int * xyzs,float * c0,float * c1,float * c2,int r){
    r=max(r,1);
    vector depth(h*w,-2100000000);
    vector pattern;
    for (int dx=-r;dx<=r;dx++)
        for (int dy=-r;dy<=r;dy++)
            if (dx*dx+dy*dy=h || y2<0 || y2>=w) &&
depth[x2*w+y2]input(0); const Tensor& xyz2_tensor=context->input(1); OP_REQUIRES(context,xyz1_tensor.dims()==3,errors::InvalidArgument("NnDistance requires xyz1 be of shape (batch,#points,3)")); OP_REQUIRES(context,xyz1_tensor.shape().dim_size(2)==3,errors::InvalidArgument("NnDistance only accepts 3d point set xyz1")); int b=xyz1_tensor.shape().dim_size(0); int n=xyz1_tensor.shape().dim_size(1); OP_REQUIRES(context,xyz2_tensor.dims()==3,errors::InvalidArgument("NnDistance requires xyz2 be of shape (batch,#points,3)")); OP_REQUIRES(context,xyz2_tensor.shape().dim_size(2)==3,errors::InvalidArgument("NnDistance only accepts 3d point set xyz2")); int m=xyz2_tensor.shape().dim_size(1); OP_REQUIRES(context,xyz2_tensor.shape().dim_size(0)==b,errors::InvalidArgument("NnDistance expects xyz1 and xyz2 have same batch size")); auto xyz1_flat=xyz1_tensor.flat(); const float * xyz1=&xyz1_flat(0); auto xyz2_flat=xyz2_tensor.flat(); const float * xyz2=&xyz2_flat(0); Tensor * dist1_tensor=NULL; Tensor * idx1_tensor=NULL; Tensor * dist2_tensor=NULL; Tensor * idx2_tensor=NULL; OP_REQUIRES_OK(context,context->allocate_output(0,TensorShape{b,n},&dist1_tensor)); OP_REQUIRES_OK(context,context->allocate_output(1,TensorShape{b,n},&idx1_tensor)); auto dist1_flat=dist1_tensor->flat(); auto idx1_flat=idx1_tensor->flat(); OP_REQUIRES_OK(context,context->allocate_output(2,TensorShape{b,m},&dist2_tensor)); OP_REQUIRES_OK(context,context->allocate_output(3,TensorShape{b,m},&idx2_tensor)); auto dist2_flat=dist2_tensor->flat(); auto idx2_flat=idx2_tensor->flat(); float * dist1=&(dist1_flat(0)); int * idx1=&(idx1_flat(0)); float * dist2=&(dist2_flat(0)); int * idx2=&(idx2_flat(0)); nnsearch(b,n,m,xyz1,xyz2,dist1,idx1); nnsearch(b,m,n,xyz2,xyz1,dist2,idx2); } }; REGISTER_KERNEL_BUILDER(Name("NnDistance").Device(DEVICE_CPU), NnDistanceOp); class NnDistanceGradOp : public OpKernel{ public: explicit NnDistanceGradOp(OpKernelConstruction* context):OpKernel(context){} void Compute(OpKernelContext * 
context)override{ const Tensor& xyz1_tensor=context->input(0); const Tensor& xyz2_tensor=context->input(1); const Tensor& grad_dist1_tensor=context->input(2); const Tensor& idx1_tensor=context->input(3); const Tensor& grad_dist2_tensor=context->input(4); const Tensor& idx2_tensor=context->input(5); OP_REQUIRES(context,xyz1_tensor.dims()==3,errors::InvalidArgument("NnDistanceGrad requires xyz1 be of shape (batch,#points,3)")); OP_REQUIRES(context,xyz1_tensor.shape().dim_size(2)==3,errors::InvalidArgument("NnDistanceGrad only accepts 3d point set xyz1")); int b=xyz1_tensor.shape().dim_size(0); int n=xyz1_tensor.shape().dim_size(1); OP_REQUIRES(context,xyz2_tensor.dims()==3,errors::InvalidArgument("NnDistanceGrad requires xyz2 be of shape (batch,#points,3)")); OP_REQUIRES(context,xyz2_tensor.shape().dim_size(2)==3,errors::InvalidArgument("NnDistanceGrad only accepts 3d point set xyz2")); int m=xyz2_tensor.shape().dim_size(1); OP_REQUIRES(context,xyz2_tensor.shape().dim_size(0)==b,errors::InvalidArgument("NnDistanceGrad expects xyz1 and xyz2 have same batch size")); OP_REQUIRES(context,grad_dist1_tensor.shape()==(TensorShape{b,n}),errors::InvalidArgument("NnDistanceGrad requires grad_dist1 be of shape(batch,#points)")); OP_REQUIRES(context,idx1_tensor.shape()==(TensorShape{b,n}),errors::InvalidArgument("NnDistanceGrad requires idx1 be of shape(batch,#points)")); OP_REQUIRES(context,grad_dist2_tensor.shape()==(TensorShape{b,m}),errors::InvalidArgument("NnDistanceGrad requires grad_dist2 be of shape(batch,#points)")); OP_REQUIRES(context,idx2_tensor.shape()==(TensorShape{b,m}),errors::InvalidArgument("NnDistanceGrad requires idx2 be of shape(batch,#points)")); auto xyz1_flat=xyz1_tensor.flat(); const float * xyz1=&xyz1_flat(0); auto xyz2_flat=xyz2_tensor.flat(); const float * xyz2=&xyz2_flat(0); auto idx1_flat=idx1_tensor.flat(); const int * idx1=&idx1_flat(0); auto idx2_flat=idx2_tensor.flat(); const int * idx2=&idx2_flat(0); auto 
grad_dist1_flat=grad_dist1_tensor.flat(); const float * grad_dist1=&grad_dist1_flat(0); auto grad_dist2_flat=grad_dist2_tensor.flat(); const float * grad_dist2=&grad_dist2_flat(0); Tensor * grad_xyz1_tensor=NULL; OP_REQUIRES_OK(context,context->allocate_output(0,TensorShape{b,n,3},&grad_xyz1_tensor)); Tensor * grad_xyz2_tensor=NULL; OP_REQUIRES_OK(context,context->allocate_output(1,TensorShape{b,m,3},&grad_xyz2_tensor)); auto grad_xyz1_flat=grad_xyz1_tensor->flat(); float * grad_xyz1=&grad_xyz1_flat(0); auto grad_xyz2_flat=grad_xyz2_tensor->flat(); float * grad_xyz2=&grad_xyz2_flat(0); for (int i=0;iinput(0); const Tensor& xyz2_tensor=context->input(1); OP_REQUIRES(context,xyz1_tensor.dims()==3,errors::InvalidArgument("NnDistance requires xyz1 be of shape (batch,#points,3)")); OP_REQUIRES(context,xyz1_tensor.shape().dim_size(2)==3,errors::InvalidArgument("NnDistance only accepts 3d point set xyz1")); int b=xyz1_tensor.shape().dim_size(0); int n=xyz1_tensor.shape().dim_size(1); OP_REQUIRES(context,xyz2_tensor.dims()==3,errors::InvalidArgument("NnDistance requires xyz2 be of shape (batch,#points,3)")); OP_REQUIRES(context,xyz2_tensor.shape().dim_size(2)==3,errors::InvalidArgument("NnDistance only accepts 3d point set xyz2")); int m=xyz2_tensor.shape().dim_size(1); OP_REQUIRES(context,xyz2_tensor.shape().dim_size(0)==b,errors::InvalidArgument("NnDistance expects xyz1 and xyz2 have same batch size")); auto xyz1_flat=xyz1_tensor.flat(); const float * xyz1=&xyz1_flat(0); auto xyz2_flat=xyz2_tensor.flat(); const float * xyz2=&xyz2_flat(0); Tensor * dist1_tensor=NULL; Tensor * idx1_tensor=NULL; Tensor * dist2_tensor=NULL; Tensor * idx2_tensor=NULL; OP_REQUIRES_OK(context,context->allocate_output(0,TensorShape{b,n},&dist1_tensor)); OP_REQUIRES_OK(context,context->allocate_output(1,TensorShape{b,n},&idx1_tensor)); auto dist1_flat=dist1_tensor->flat(); auto idx1_flat=idx1_tensor->flat(); OP_REQUIRES_OK(context,context->allocate_output(2,TensorShape{b,m},&dist2_tensor)); 
OP_REQUIRES_OK(context,context->allocate_output(3,TensorShape{b,m},&idx2_tensor)); auto dist2_flat=dist2_tensor->flat(); auto idx2_flat=idx2_tensor->flat(); float * dist1=&(dist1_flat(0)); int * idx1=&(idx1_flat(0)); float * dist2=&(dist2_flat(0)); int * idx2=&(idx2_flat(0)); NmDistanceKernelLauncher(b,n,xyz1,m,xyz2,dist1,idx1,dist2,idx2); } }; REGISTER_KERNEL_BUILDER(Name("NnDistance").Device(DEVICE_GPU), NnDistanceGpuOp); void NmDistanceGradKernelLauncher(int b,int n,const float * xyz1,int m,const float * xyz2,const float * grad_dist1,const int * idx1,const float * grad_dist2,const int * idx2,float * grad_xyz1,float * grad_xyz2); class NnDistanceGradGpuOp : public OpKernel{ public: explicit NnDistanceGradGpuOp(OpKernelConstruction* context):OpKernel(context){} void Compute(OpKernelContext * context)override{ const Tensor& xyz1_tensor=context->input(0); const Tensor& xyz2_tensor=context->input(1); const Tensor& grad_dist1_tensor=context->input(2); const Tensor& idx1_tensor=context->input(3); const Tensor& grad_dist2_tensor=context->input(4); const Tensor& idx2_tensor=context->input(5); OP_REQUIRES(context,xyz1_tensor.dims()==3,errors::InvalidArgument("NnDistanceGrad requires xyz1 be of shape (batch,#points,3)")); OP_REQUIRES(context,xyz1_tensor.shape().dim_size(2)==3,errors::InvalidArgument("NnDistanceGrad only accepts 3d point set xyz1")); int b=xyz1_tensor.shape().dim_size(0); int n=xyz1_tensor.shape().dim_size(1); OP_REQUIRES(context,xyz2_tensor.dims()==3,errors::InvalidArgument("NnDistanceGrad requires xyz2 be of shape (batch,#points,3)")); OP_REQUIRES(context,xyz2_tensor.shape().dim_size(2)==3,errors::InvalidArgument("NnDistanceGrad only accepts 3d point set xyz2")); int m=xyz2_tensor.shape().dim_size(1); OP_REQUIRES(context,xyz2_tensor.shape().dim_size(0)==b,errors::InvalidArgument("NnDistanceGrad expects xyz1 and xyz2 have same batch size")); OP_REQUIRES(context,grad_dist1_tensor.shape()==(TensorShape{b,n}),errors::InvalidArgument("NnDistanceGrad requires 
grad_dist1 be of shape(batch,#points)")); OP_REQUIRES(context,idx1_tensor.shape()==(TensorShape{b,n}),errors::InvalidArgument("NnDistanceGrad requires idx1 be of shape(batch,#points)")); OP_REQUIRES(context,grad_dist2_tensor.shape()==(TensorShape{b,m}),errors::InvalidArgument("NnDistanceGrad requires grad_dist2 be of shape(batch,#points)")); OP_REQUIRES(context,idx2_tensor.shape()==(TensorShape{b,m}),errors::InvalidArgument("NnDistanceGrad requires idx2 be of shape(batch,#points)")); auto xyz1_flat=xyz1_tensor.flat(); const float * xyz1=&xyz1_flat(0); auto xyz2_flat=xyz2_tensor.flat(); const float * xyz2=&xyz2_flat(0); auto idx1_flat=idx1_tensor.flat(); const int * idx1=&idx1_flat(0); auto idx2_flat=idx2_tensor.flat(); const int * idx2=&idx2_flat(0); auto grad_dist1_flat=grad_dist1_tensor.flat(); const float * grad_dist1=&grad_dist1_flat(0); auto grad_dist2_flat=grad_dist2_tensor.flat(); const float * grad_dist2=&grad_dist2_flat(0); Tensor * grad_xyz1_tensor=NULL; OP_REQUIRES_OK(context,context->allocate_output(0,TensorShape{b,n,3},&grad_xyz1_tensor)); Tensor * grad_xyz2_tensor=NULL; OP_REQUIRES_OK(context,context->allocate_output(1,TensorShape{b,m,3},&grad_xyz2_tensor)); auto grad_xyz1_flat=grad_xyz1_tensor->flat(); float * grad_xyz1=&grad_xyz1_flat(0); auto grad_xyz2_flat=grad_xyz2_tensor->flat(); float * grad_xyz2=&grad_xyz2_flat(0); NmDistanceGradKernelLauncher(b,n,xyz1,m,xyz2,grad_dist1,idx1,grad_dist2,idx2,grad_xyz1,grad_xyz2); } }; REGISTER_KERNEL_BUILDER(Name("NnDistanceGrad").Device(DEVICE_GPU), NnDistanceGradGpuOp); ================================================ FILE: code/tf_ops/CD/tf_nndistance.py ================================================ import os import tensorflow as tf from tensorflow.python.framework import ops BASE_DIR = os.path.dirname(os.path.abspath(__file__)) nn_distance_module=tf.load_op_library(os.path.join(BASE_DIR, 'tf_nndistance_so.so')) def nn_distance(xyz1,xyz2): ''' Computes the distance of nearest neighbors for a pair of point 
clouds
    input: xyz1: (batch_size, #points_1, 3)  the first point cloud
    input: xyz2: (batch_size, #points_2, 3)  the second point cloud
    output: dist1: (batch_size, #points_1)  distance from first to second
    output: idx1: (batch_size, #points_1)  nearest neighbor from first to second
    output: dist2: (batch_size, #points_2)  distance from second to first
    output: idx2: (batch_size, #points_2)  nearest neighbor from second to first
    '''
    return nn_distance_module.nn_distance(xyz1, xyz2)

#@tf.RegisterShape('NnDistance')
#def _nn_distance_shape(op):
#    shape1 = op.inputs[0].get_shape().with_rank(3)
#    shape2 = op.inputs[1].get_shape().with_rank(3)
#    return [tf.TensorShape([shape1.dims[0], shape1.dims[1]]), tf.TensorShape([shape1.dims[0], shape1.dims[1]]),
#            tf.TensorShape([shape2.dims[0], shape2.dims[1]]), tf.TensorShape([shape2.dims[0], shape2.dims[1]])]

@ops.RegisterGradient('NnDistance')
def _nn_distance_grad(op, grad_dist1, grad_idx1, grad_dist2, grad_idx2):
    xyz1 = op.inputs[0]
    xyz2 = op.inputs[1]
    idx1 = op.outputs[1]
    idx2 = op.outputs[3]
    return nn_distance_module.nn_distance_grad(xyz1, xyz2, grad_dist1, idx1, grad_dist2, idx2)

if __name__ == '__main__':
    import numpy as np
    import random
    import time
    from tensorflow.python.ops.gradient_checker import compute_gradient
    random.seed(100)
    np.random.seed(100)
    with tf.Session('') as sess:
        xyz1 = np.random.randn(32, 16384, 3).astype('float32')
        xyz2 = np.random.randn(32, 1024, 3).astype('float32')
        #with tf.device('/gpu:0'):
        if True:
            inp1 = tf.Variable(xyz1)
            inp2 = tf.constant(xyz2)
            reta, retb, retc, retd = nn_distance(inp1, inp2)
            loss = tf.reduce_sum(reta) + tf.reduce_sum(retc)
            train = tf.train.GradientDescentOptimizer(learning_rate=0.05).minimize(loss)
            sess.run(tf.initialize_all_variables())
            t0 = time.time()
            t1 = t0
            best = 1e100
            for i in xrange(100):
                trainloss, _ = sess.run([loss, train])
                newt = time.time()
                best = min(best, newt - t1)
                print i, trainloss, (newt - t0) / (i + 1), best
                t1 = newt
            #print sess.run([inp1,retb,inp2,retd])
            #grads = compute_gradient([inp1,inp2],[(16,32,3),(16,32,3)],loss,(1,),[xyz1,xyz2])
            #for i, j in grads:
            #print
i.shape,j.shape,np.mean(np.abs(i-j)),np.mean(np.abs(i)),np.mean(np.abs(j)) #for i in xrange(10): #t0=time.time() #a,b,c,d=sess.run([reta,retb,retc,retd],feed_dict={inp1:xyz1,inp2:xyz2}) #print 'time',time.time()-t0 #print a.shape,b.shape,c.shape,d.shape #print a.dtype,b.dtype,c.dtype,d.dtype #samples=np.array(random.sample(range(xyz2.shape[1]),100),dtype='int32') #dist1=((xyz1[:,samples,None,:]-xyz2[:,None,:,:])**2).sum(axis=-1).min(axis=-1) #idx1=((xyz1[:,samples,None,:]-xyz2[:,None,:,:])**2).sum(axis=-1).argmin(axis=-1) #print np.abs(dist1-a[:,samples]).max() #print np.abs(idx1-b[:,samples]).max() #dist2=((xyz2[:,samples,None,:]-xyz1[:,None,:,:])**2).sum(axis=-1).min(axis=-1) #idx2=((xyz2[:,samples,None,:]-xyz1[:,None,:,:])**2).sum(axis=-1).argmin(axis=-1) #print np.abs(dist2-c[:,samples]).max() #print np.abs(idx2-d[:,samples]).max() ================================================ FILE: code/tf_ops/CD/tf_nndistance_g.cu ================================================ #if GOOGLE_CUDA #define EIGEN_USE_GPU #include "third_party/eigen3/unsupported/Eigen/CXX11/Tensor" __global__ void NmDistanceKernel(int b,int n,const float * xyz,int m,const float * xyz2,float * result,int * result_i){ const int batch=512; __shared__ float buf[batch*3]; for (int i=blockIdx.x;ibest){ result[(i*n+j)]=best; result_i[(i*n+j)]=best_i; } } __syncthreads(); } } } void NmDistanceKernelLauncher(int b,int n,const float * xyz,int m,const float * xyz2,float * result,int * result_i,float * result2,int * result2_i){ NmDistanceKernel<<>>(b,n,xyz,m,xyz2,result,result_i); NmDistanceKernel<<>>(b,m,xyz2,n,xyz,result2,result2_i); } __global__ void NmDistanceGradKernel(int b,int n,const float * xyz1,int m,const float * xyz2,const float * grad_dist1,const int * idx1,float * grad_xyz1,float * grad_xyz2){ for (int i=blockIdx.x;i>>(b,n,xyz1,m,xyz2,grad_dist1,idx1,grad_xyz1,grad_xyz2); NmDistanceGradKernel<<>>(b,m,xyz2,n,xyz1,grad_dist2,idx2,grad_xyz2,grad_xyz1); } #endif 
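Building `tf_nndistance_so.so` requires the CUDA toolchain, so the op is often unavailable when inspecting this code. For small inputs, the four outputs of `nn_distance` can be reproduced with plain NumPy. The sketch below is not part of the repository; `nn_distance_np` is a hypothetical helper name, and it assumes the op returns *squared* nearest-neighbor distances in each direction, as the brute-force checks in the `__main__` block above also compute.

```python
import numpy as np

def nn_distance_np(xyz1, xyz2):
    """Brute-force reference for the NnDistance op.

    xyz1: (b, n, 3), xyz2: (b, m, 3).
    Returns dist1 (b, n), idx1 (b, n), dist2 (b, m), idx2 (b, m),
    where dist* hold squared distances to the nearest neighbor in
    the other point cloud and idx* the corresponding indices.
    """
    # (b, n, m) matrix of pairwise squared distances
    d = ((xyz1[:, :, None, :] - xyz2[:, None, :, :]) ** 2).sum(axis=-1)
    dist1 = d.min(axis=2)    # nearest xyz2 point for each xyz1 point
    idx1 = d.argmin(axis=2)
    dist2 = d.min(axis=1)    # nearest xyz1 point for each xyz2 point
    idx2 = d.argmin(axis=1)
    return dist1, idx1, dist2, idx2
```

A symmetric Chamfer-style loss, matching the `loss` used in the self-test above, would then be `dist1.sum() + dist2.sum()`. This materializes the full (b, n, m) matrix, so it is only suitable for sanity checks, not the 16384-point clouds in the benchmark.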
================================================ FILE: code/tf_ops/__init__.py ================================================ ================================================ FILE: code/tf_ops/emd/__init__.py ================================================ ================================================ FILE: code/tf_ops/emd/tf_auctionmatch.cpp ================================================ #include "tensorflow/core/framework/op.h" #include "tensorflow/core/framework/op_kernel.h" #include "tensorflow/core/framework/shape_inference.h" #include "tensorflow/core/framework/common_shape_fns.h" #include #include #include using namespace tensorflow; REGISTER_OP("AuctionMatch") .Input("xyz1: float32") .Input("xyz2: float32") .Output("matchl: int32") .Output("matchr: int32") .SetShapeFn([](::tensorflow::shape_inference::InferenceContext* c){ ::tensorflow::shape_inference::ShapeHandle dims1; c->WithRank(c->input(0), 3, &dims1); ::tensorflow::shape_inference::ShapeHandle dims2; c->WithRank(c->input(1), 3, &dims2); ::tensorflow::shape_inference::ShapeHandle output1 = c->MakeShape({c->Dim(dims1, 0), c->Dim(dims1, 1)}); c->set_output(0, output1); ::tensorflow::shape_inference::ShapeHandle output2 = c->MakeShape({c->Dim(dims2, 0), c->Dim(dims2, 1)}); c->set_output(1, output2); return Status::OK(); }); void AuctionMatchLauncher(int b,int n,const float * xyz1,const float * xyz2,int * matchl,int * matchr,float * cost); class AuctionMatchGpuOp: public OpKernel{ public: explicit AuctionMatchGpuOp(OpKernelConstruction* context):OpKernel(context){} void Compute(OpKernelContext * context)override{ const Tensor& xyz1_tensor=context->input(0); OP_REQUIRES(context,xyz1_tensor.dims()==3 && xyz1_tensor.shape().dim_size(2)==3,errors::InvalidArgument("ApproxMatch expects (batch_size,num_points,3) xyz1 shape")); auto xyz1_flat=xyz1_tensor.flat(); const float * xyz1=&(xyz1_flat(0)); int b=xyz1_tensor.shape().dim_size(0); int n=xyz1_tensor.shape().dim_size(1); 
OP_REQUIRES(context,n<=4096,errors::InvalidArgument("AuctionMatch handles at most 4096 dataset points")); const Tensor& xyz2_tensor=context->input(1); OP_REQUIRES(context,xyz2_tensor.dims()==3 && xyz2_tensor.shape().dim_size(2)==3 && xyz2_tensor.shape().dim_size(0)==b && xyz2_tensor.shape().dim_size(1)==n,errors::InvalidArgument("AuctionMatch expects (batch_size,num_points,3) xyz2 shape, and shape must match with xyz1")); auto xyz2_flat=xyz2_tensor.flat(); const float * xyz2=&(xyz2_flat(0)); Tensor * matchl_tensor=NULL; OP_REQUIRES_OK(context,context->allocate_output(0,TensorShape{b,n},&matchl_tensor)); auto matchl_flat=matchl_tensor->flat(); int * matchl=&(matchl_flat(0)); Tensor * matchr_tensor=NULL; OP_REQUIRES_OK(context,context->allocate_output(1,TensorShape{b,n},&matchr_tensor)); auto matchr_flat=matchr_tensor->flat(); int * matchr=&(matchr_flat(0)); Tensor temp_tensor; OP_REQUIRES_OK(context,context->allocate_temp(DataTypeToEnum::value,TensorShape{b,n,n},&temp_tensor)); auto temp_flat=temp_tensor.flat(); float * temp=&(temp_flat(0)); AuctionMatchLauncher(b,n,xyz1,xyz2,matchl,matchr,temp); } }; REGISTER_KERNEL_BUILDER(Name("AuctionMatch").Device(DEVICE_GPU), AuctionMatchGpuOp); ================================================ FILE: code/tf_ops/emd/tf_auctionmatch.py ================================================ import tensorflow as tf from tensorflow.python.framework import ops import sys import os BASE_DIR = os.path.dirname(os.path.abspath(__file__)) sys.path.append(BASE_DIR) auctionmatch_module = tf.load_op_library(os.path.join(BASE_DIR, 'tf_auctionmatch_so.so')) def auction_match(xyz1,xyz2): ''' input: xyz1 : batch_size * #points * 3 xyz2 : batch_size * #points * 3 returns: matchl : batch_size * #npoints matchr : batch_size * #npoints ''' print ("Here") return auctionmatch_module.auction_match(xyz1,xyz2) ops.NoGradient('AuctionMatch') # TF1.0 API requires set shape in C++ # @tf.RegisterShape('AuctionMatch') # def _auction_match_shape(op): # 
shape1=op.inputs[0].get_shape().with_rank(3) # shape2=op.inputs[1].get_shape().with_rank(3) # return [ # tf.TensorShape([shape1.dims[0],shape1.dims[1]]), # tf.TensorShape([shape2.dims[0],shape2.dims[1]]) # ] if __name__=='__main__': from tf_ops.grouping import tf_grouping from tf_ops.sampling import tf_sampling npoint=4096 xyz1_in=tf.placeholder(tf.float32,shape=(32,npoint,3)) xyz2_in=tf.placeholder(tf.float32,shape=(32,npoint,3)) matchl_out,matchr_out=auction_match(xyz1_in,xyz2_in) matched_out=tf_sampling.gather_point(xyz2_in,matchl_out) import numpy as np np.random.seed(100) xyz1=np.random.randn(32,npoint,3).astype('float32') xyz2=xyz1.copy()+np.random.randn(32,npoint,3)*0.01 for i in xrange(len(xyz2)): xyz2[i]=np.roll(xyz2[i],i,axis=0) with tf.Session('') as sess: ret=sess.run(matched_out,feed_dict={xyz1_in:xyz1,xyz2_in:xyz2}) print ((xyz1-ret)**2).mean() ================================================ FILE: code/tf_ops/emd/tf_auctionmatch_compile.sh ================================================ echo 'nvcc' /usr/local/cuda-8.0/bin/nvcc tf_auctionmatch_g.cu -o tf_auctionmatch_g.cu.o -c -O2 -DGOOGLE_CUDA=1 -x cu -Xcompiler -fPIC -arch=sm_30 echo 'g++' g++ -std=c++11 tf_auctionmatch.cpp tf_auctionmatch_g.cu.o -o tf_auctionmatch_so.so -shared -fPIC -D_GLIBCXX_USE_CXX11_ABI=0 -I /home/ganeshiyer/tensorflow/lib/python2.7/site-packages/tensorflow/include -I /usr/local/cuda-8.0/include -lcudart -L /usr/local/cuda-8.0/lib64/ -O2 ================================================ FILE: code/tf_ops/emd/tf_auctionmatch_g.cu ================================================ #include __global__ void AuctionMatchKernel(int b,int n,const float * __restrict__ xyz1,const float * __restrict__ xyz2,int * matchl,int * matchr,float * cost){ //this kernel handles up to 4096 points const int NMax=4096; __shared__ short Queue[NMax]; __shared__ short matchrbuf[NMax]; __shared__ float pricer[NMax]; __shared__ float bests[32][3]; __shared__ int qhead,qlen; const int BufLen=2048; 
__shared__ float buf[BufLen]; for (int bno=blockIdx.x;bno1; } int vj,vj2,vj3,vj4; if (value1=blockDim.x*4){ for (int j=threadIdx.x;j=blockDim.x*2){ for (int j=threadIdx.x;j0;i>>=1){ float b1=__shfl_down(best,i,32); float b2=__shfl_down(best2,i,32); int bj=__shfl_down(bestj,i,32); if (best>5][0]=best; bests[threadIdx.x>>5][1]=best2; *(int*)&bests[threadIdx.x>>5][2]=bestj; } __syncthreads(); int nn=blockDim.x>>5; if (threadIdx.x>1;i>0;i>>=1){ float b1=__shfl_down(best,i,32); float b2=__shfl_down(best2,i,32); int bj=__shfl_down(bestj,i,32); if (best=n) qhead-=n; int old=matchrbuf[bestj]; pricer[bestj]+=delta; cnt++; if (old!=-1){ int ql=qlen; int tail=qhead+ql; qlen=ql+1; if (tail>=n) tail-=n; Queue[tail]=old; } if (cnt==(40*n)){ if (tolerance==1.0) qlen=0; tolerance=fminf(1.0,tolerance*100); cnt=0; } } __syncthreads(); if (threadIdx.x==0){ matchrbuf[bestj]=i; } } __syncthreads(); for (int j=threadIdx.x;j>>(b,n,xyz1,xyz2,matchl,matchr,cost); } ================================================ FILE: code/tf_ops/grouping/__init__.py ================================================ ================================================ FILE: code/tf_ops/grouping/compile.sh ================================================ g++ query_ball_point.cpp -o query_ball_point nvcc query_ball_point.cu -o query_ball_point_cuda nvcc query_ball_point_block.cu -o query_ball_point_block nvcc query_ball_point_grid.cu -o query_ball_point_grid nvcc query_ball_point_grid_count.cu -o query_ball_point_grid_count g++ -Wall selection_sort.cpp -o selection_sort nvcc selection_sort.cu -o selection_sort_cuda ================================================ FILE: code/tf_ops/grouping/query_ball_point.cpp ================================================ #include #include #include // memset #include // rand, RAND_MAX #include // sqrtf #include #include using namespace std; float randomf(){ return (rand()+0.5)/(RAND_MAX+1.0); } static double get_time(){ timespec tp; clock_gettime(CLOCK_MONOTONIC,&tp); return 
tp.tv_sec+tp.tv_nsec*1e-9; } // input: radius (1), nsample (1), xyz1 (b,n,3), xyz2 (b,m,3) // output: idx (b,m,nsample) void query_ball_point_cpu(int b, int n, int m, const float* radius, int nsample, const float *xyz1, const float *xyz2, int *idx) { for (int i=0;i #include #include // memset #include // rand, RAND_MAX #include // sqrtf #include #include #include "cuPrintf.cuh" #include "cuPrintf.cu" using namespace std; using namespace std; float randomf(){ return (rand()+0.5)/(RAND_MAX+1.0); } static double get_time(){ timespec tp; clock_gettime(CLOCK_MONOTONIC,&tp); return tp.tv_sec+tp.tv_nsec*1e-9; } // input: radius (1), nsample (1), xyz1 (b,n,3), xyz2 (b,m,3) // output: idx (b,m,nsample) __global__ void query_ball_point_gpu(int b, int n, int m, const float* radius, int nsample, const float *xyz1, const float *xyz2, int *idx) { for (int i=0;i>>(b,n,m,radius,nsample,xyz1,xyz2,idx); cudaDeviceSynchronize(); printf("query_ball_point gpu time %f\n",get_time()-t0); t0=get_time(); group_point_gpu<<<1,1>>>(b,n,c,m,nsample,points,idx,out); cudaDeviceSynchronize(); printf("grou_point gpu time %f\n",get_time()-t0); t0=get_time(); group_point_grad_gpu<<<1,1>>>(b,n,c,m,nsample,grad_out,idx,grad_points); cudaDeviceSynchronize(); printf("grou_point_grad gpu time %f\n",get_time()-t0); cudaFree(xyz1); cudaFree(xyz2); cudaFree(points); cudaFree(idx); cudaFree(out); cudaFree(grad_out); cudaFree(grad_points); return 0; } ================================================ FILE: code/tf_ops/grouping/query_ball_point_block.cu ================================================ #include #include #include // memset #include // rand, RAND_MAX #include // sqrtf #include #include using namespace std; float randomf(){ return (rand()+0.5)/(RAND_MAX+1.0); } static double get_time(){ timespec tp; clock_gettime(CLOCK_MONOTONIC,&tp); return tp.tv_sec+tp.tv_nsec*1e-9; } // input: radius (1), nsample (1), xyz1 (b,n,3), xyz2 (b,m,3) // output: idx (b,m,nsample) __global__ void 
query_ball_point_gpu(int b, int n, int m, const float *radius, int nsample, const float *xyz1, const float *xyz2, int *idx) { int index = threadIdx.x; xyz1 += n*3*index; xyz2 += m*3*index; idx += m*nsample*index; for (int j=0;j>>(b,n,m,radius,nsample,xyz1,xyz2,idx); cudaDeviceSynchronize(); printf("query_ball_point gpu time %f\n",get_time()-t0); t0=get_time(); group_point_gpu<<<1,b>>>(b,n,c,m,nsample,points,idx,out); cudaDeviceSynchronize(); printf("grou_point gpu time %f\n",get_time()-t0); t0=get_time(); group_point_grad_gpu<<<1,b>>>(b,n,c,m,nsample,grad_out,idx,grad_points); cudaDeviceSynchronize(); printf("grou_point_grad gpu time %f\n",get_time()-t0); cudaFree(xyz1); cudaFree(xyz2); cudaFree(points); cudaFree(idx); cudaFree(out); cudaFree(grad_out); cudaFree(grad_points); return 0; } ================================================ FILE: code/tf_ops/grouping/query_ball_point_grid.cu ================================================ #include #include #include // memset #include // rand, RAND_MAX #include // sqrtf #include #include using namespace std; float randomf(){ return (rand()+0.5)/(RAND_MAX+1.0); } static double get_time(){ timespec tp; clock_gettime(CLOCK_MONOTONIC,&tp); return tp.tv_sec+tp.tv_nsec*1e-9; } // input: radius (1), nsample (1), xyz1 (b,n,3), xyz2 (b,m,3) // output: idx (b,m,nsample) __global__ void query_ball_point_gpu(int b, int n, int m, const float *radius, int nsample, const float *xyz1, const float *xyz2, int *idx) { int batch_index = blockIdx.x; xyz1 += n*3*batch_index; xyz2 += m*3*batch_index; idx += m*nsample*batch_index; int index = threadIdx.x; int stride = blockDim.x; for (int j=index;j>>(b,n,m,radius,nsample,xyz1,xyz2,idx); cudaDeviceSynchronize(); printf("query_ball_point gpu time %f\n",get_time()-t0); t0=get_time(); group_point_gpu<<>>(b,n,c,m,nsample,points,idx,out); cudaDeviceSynchronize(); printf("grou_point gpu time %f\n",get_time()-t0); t0=get_time(); group_point_grad_gpu<<>>(b,n,c,m,nsample,grad_out,idx,grad_points); 
cudaDeviceSynchronize(); printf("grou_point_grad gpu time %f\n",get_time()-t0); cudaFree(xyz1); cudaFree(xyz2); cudaFree(points); cudaFree(idx); cudaFree(out); cudaFree(grad_out); cudaFree(grad_points); return 0; } ================================================ FILE: code/tf_ops/grouping/selection_sort.cpp ================================================ #include #include #include // memset #include // rand, RAND_MAX #include // sqrtf #include #include using namespace std; float randomf(){ return (rand()+0.5)/(RAND_MAX+1.0); } static double get_time(){ timespec tp; clock_gettime(CLOCK_MONOTONIC,&tp); return tp.tv_sec+tp.tv_nsec*1e-9; } // input: k (1), distance matrix dist (b,m,n) // output: idx (b,m,n), val (b,m,n) void selection_sort_cpu(int b, int n, int m, int k, const float *dist, int *idx, float *val) { float *p_dist; float tmp; int tmpi; for (int i=0;i #include #include // memset #include // rand, RAND_MAX #include // sqrtf #include #include using namespace std; float randomf(){ return (rand()+0.5)/(RAND_MAX+1.0); } static double get_time(){ timespec tp; clock_gettime(CLOCK_MONOTONIC,&tp); return tp.tv_sec+tp.tv_nsec*1e-9; } // input: k (1), distance matrix dist (b,m,n) // output: idx (b,m,k), val (b,m,k) __global__ void selection_sort_gpu(int b, int n, int m, int k, float *dist, int *idx, float *val) { int batch_index = blockIdx.x; dist+=m*n*batch_index; idx+=m*k*batch_index; val+=m*k*batch_index; int index = threadIdx.x; int stride = blockDim.x; float *p_dist; for (int j=index;j>>(b,n,m,k,dist,idx,val); cudaDeviceSynchronize(); printf("selection sort cpu time %f\n",get_time()-t0); return 0; } ================================================ FILE: code/tf_ops/grouping/selection_sort_const.cu ================================================ #include #include #include // memset #include // rand, RAND_MAX #include // sqrtf #include #include using namespace std; float randomf(){ return (rand()+0.5)/(RAND_MAX+1.0); } static double get_time(){ timespec tp; 
clock_gettime(CLOCK_MONOTONIC,&tp); return tp.tv_sec+tp.tv_nsec*1e-9; } // input: k (1), distance matrix dist (b,m,n) // output: idx (b,m,n), dist_out (b,m,n) __global__ void selection_sort_gpu(int b, int n, int m, int k, const float *dist, int *outi, float *out) { int batch_index = blockIdx.x; dist+=m*n*batch_index; outi+=m*n*batch_index; out+=m*n*batch_index; int index = threadIdx.x; int stride = blockDim.x; // copy from dist to dist_out for (int j=index;j>>(b,n,m,k,dist,idx,dist_out); cudaDeviceSynchronize(); printf("selection sort cpu time %f\n",get_time()-t0); //for (int i=0;i #include #include // memset #include // rand, RAND_MAX #include // sqrtf #include #include "tensorflow/core/framework/op.h" #include "tensorflow/core/framework/op_kernel.h" #include "tensorflow/core/framework/shape_inference.h" #include "tensorflow/core/framework/common_shape_fns.h" #include using namespace tensorflow; REGISTER_OP("QueryBallPoint") .Attr("nsample: int") .Input("xyz1: float32") .Input("xyz2: float32") .Input("radius: float32") .Output("idx: int32") .Output("pts_cnt: int32") .SetShapeFn([](::tensorflow::shape_inference::InferenceContext* c) { ::tensorflow::shape_inference::ShapeHandle dims2; // batch_size * npoint * 3 c->WithRank(c->input(1), 3, &dims2); int nsample; TF_RETURN_IF_ERROR(c->GetAttr("nsample", &nsample)); ::tensorflow::shape_inference::ShapeHandle output1 = c->MakeShape({c->Dim(dims2, 0), c->Dim(dims2, 1), nsample}); c->set_output(0, output1); ::tensorflow::shape_inference::ShapeHandle output2 = c->MakeShape({c->Dim(dims2, 0), c->Dim(dims2, 1)}); c->set_output(1, output2); return Status::OK(); }); REGISTER_OP("SelectionSort") .Attr("k: int") .Input("dist: float32") .Output("outi: int32") .Output("out: float32") .SetShapeFn([](::tensorflow::shape_inference::InferenceContext* c) { c->set_output(0, c->input(0)); c->set_output(1, c->input(0)); return Status::OK(); }); REGISTER_OP("GroupPoint") .Input("points: float32") .Input("idx: int32") .Output("out: float32") 
.SetShapeFn([](::tensorflow::shape_inference::InferenceContext* c) { ::tensorflow::shape_inference::ShapeHandle dims1; // batch_size * ndataset * channels c->WithRank(c->input(0), 3, &dims1); ::tensorflow::shape_inference::ShapeHandle dims2; // batch_size * npoints * nsample c->WithRank(c->input(1), 3, &dims2); // batch_size * npoints * nsample * channels ::tensorflow::shape_inference::ShapeHandle output = c->MakeShape({c->Dim(dims2, 0), c->Dim(dims2, 1), c->Dim(dims2, 2), c->Dim(dims1, 2)}); c->set_output(0, output); return Status::OK(); }); REGISTER_OP("GroupPointGrad") .Input("points: float32") .Input("idx: int32") .Input("grad_out: float32") .Output("grad_points: float32") .SetShapeFn([](::tensorflow::shape_inference::InferenceContext* c) { c->set_output(0, c->input(0)); return Status::OK(); }); void queryBallPointLauncher(int b, int n, int m, const float* radius, int nsample, const float *xyz1, const float *xyz2, int *idx, int *pts_cnt); class QueryBallPointGpuOp : public OpKernel { public: explicit QueryBallPointGpuOp(OpKernelConstruction* context) : OpKernel(context) { //OP_REQUIRES_OK(context, context->GetAttr("radius", &radius_)); //OP_REQUIRES(context, radius_ > 0, errors::InvalidArgument("QueryBallPoint expects positive radius")); OP_REQUIRES_OK(context, context->GetAttr("nsample", &nsample_)); OP_REQUIRES(context, nsample_ > 0, errors::InvalidArgument("QueryBallPoint expects positive nsample")); } void Compute(OpKernelContext* context) override { const Tensor& xyz1_tensor = context->input(0); OP_REQUIRES(context, xyz1_tensor.dims()==3 && xyz1_tensor.shape().dim_size(2)==3, errors::InvalidArgument("QueryBallPoint expects (batch_size, ndataset, 3) xyz1 shape.")); int b = xyz1_tensor.shape().dim_size(0); int n = xyz1_tensor.shape().dim_size(1); const Tensor& xyz2_tensor = context->input(1); OP_REQUIRES(context, xyz2_tensor.dims()==3 && xyz2_tensor.shape().dim_size(2)==3, errors::InvalidArgument("QueryBallPoint expects (batch_size, npoint, 3) xyz2 
shape.")); int m = xyz2_tensor.shape().dim_size(1); Tensor *idx_tensor = nullptr; OP_REQUIRES_OK(context, context->allocate_output(0, TensorShape{b,m,nsample_}, &idx_tensor)); Tensor *pts_cnt_tensor = nullptr; OP_REQUIRES_OK(context, context->allocate_output(1, TensorShape{b,m}, &pts_cnt_tensor)); const Tensor& radius_tensor = context->input(2); auto radius_flat = radius_tensor.flat(); const float *radius = &(radius_flat(0)); auto xyz1_flat = xyz1_tensor.flat(); const float *xyz1 = &(xyz1_flat(0)); auto xyz2_flat = xyz2_tensor.flat(); const float *xyz2 = &(xyz2_flat(0)); auto idx_flat = idx_tensor->flat(); int *idx = &(idx_flat(0)); auto pts_cnt_flat = pts_cnt_tensor->flat(); int *pts_cnt = &(pts_cnt_flat(0)); queryBallPointLauncher(b,n,m,radius,nsample_,xyz1,xyz2,idx,pts_cnt); } private: int nsample_; }; REGISTER_KERNEL_BUILDER(Name("QueryBallPoint").Device(DEVICE_GPU), QueryBallPointGpuOp); void selectionSortLauncher(int b, int n, int m, int k, const float *dist, int *outi, float *out); class SelectionSortGpuOp : public OpKernel { public: explicit SelectionSortGpuOp(OpKernelConstruction* context) : OpKernel(context) { OP_REQUIRES_OK(context, context->GetAttr("k", &k_)); OP_REQUIRES(context, k_ > 0, errors::InvalidArgument("SelectionSort expects positive k")); } void Compute(OpKernelContext* context) override { const Tensor& dist_tensor = context->input(0); OP_REQUIRES(context, dist_tensor.dims()==3, errors::InvalidArgument("SelectionSort expects (b,m,n) dist shape.")); int b = dist_tensor.shape().dim_size(0); int m = dist_tensor.shape().dim_size(1); int n = dist_tensor.shape().dim_size(2); Tensor *outi_tensor = nullptr; OP_REQUIRES_OK(context, context->allocate_output(0, TensorShape{b,m,n}, &outi_tensor)); Tensor *out_tensor = nullptr; OP_REQUIRES_OK(context, context->allocate_output(1, TensorShape{b,m,n}, &out_tensor)); auto dist_flat = dist_tensor.flat(); const float *dist = &(dist_flat(0)); auto outi_flat = outi_tensor->flat(); int *outi = &(outi_flat(0)); 
auto out_flat = out_tensor->flat(); float *out = &(out_flat(0)); selectionSortLauncher(b,n,m,k_,dist,outi,out); } private: int k_; }; REGISTER_KERNEL_BUILDER(Name("SelectionSort").Device(DEVICE_GPU), SelectionSortGpuOp); void groupPointLauncher(int b, int n, int c, int m, int nsample, const float *points, const int *idx, float *out); class GroupPointGpuOp: public OpKernel{ public: explicit GroupPointGpuOp(OpKernelConstruction * context):OpKernel(context){} void Compute(OpKernelContext * context) override { const Tensor& points_tensor=context->input(0); OP_REQUIRES(context, points_tensor.dims()==3, errors::InvalidArgument("GroupPoint expects (batch_size, num_points, channel) points shape")); int b = points_tensor.shape().dim_size(0); int n = points_tensor.shape().dim_size(1); int c = points_tensor.shape().dim_size(2); const Tensor& idx_tensor=context->input(1); OP_REQUIRES(context,idx_tensor.dims()==3 && idx_tensor.shape().dim_size(0)==b, errors::InvalidArgument("GroupPoint expects (batch_size, npoints, nsample) idx shape")); int m = idx_tensor.shape().dim_size(1); int nsample = idx_tensor.shape().dim_size(2); Tensor * out_tensor = nullptr; OP_REQUIRES_OK(context, context->allocate_output(0,TensorShape{b,m,nsample,c}, &out_tensor)); auto points_flat = points_tensor.flat(); const float *points = &(points_flat(0)); auto idx_flat = idx_tensor.flat(); const int *idx = &(idx_flat(0)); auto out_flat = out_tensor->flat(); float *out = &(out_flat(0)); groupPointLauncher(b,n,c,m,nsample,points,idx,out); } }; REGISTER_KERNEL_BUILDER(Name("GroupPoint").Device(DEVICE_GPU),GroupPointGpuOp); void groupPointGradLauncher(int b, int n, int c, int m, int nsample, const float *grad_out, const int *idx, float *grad_points); class GroupPointGradGpuOp: public OpKernel{ public: explicit GroupPointGradGpuOp(OpKernelConstruction * context):OpKernel(context){} void Compute(OpKernelContext * context) override { const Tensor& points_tensor=context->input(0); OP_REQUIRES(context, 
points_tensor.dims()==3, errors::InvalidArgument("GroupPointGrad expects (batch_size, num_points, channel) points shape")); int b = points_tensor.shape().dim_size(0); int n = points_tensor.shape().dim_size(1); int c = points_tensor.shape().dim_size(2); const Tensor& idx_tensor=context->input(1); OP_REQUIRES(context,idx_tensor.dims()==3 && idx_tensor.shape().dim_size(0)==b, errors::InvalidArgument("GroupPointGrad expects (batch_size, npoints, nsample) idx shape")); int m = idx_tensor.shape().dim_size(1); int nsample = idx_tensor.shape().dim_size(2); const Tensor& grad_out_tensor=context->input(2); OP_REQUIRES(context,grad_out_tensor.dims()==4 && grad_out_tensor.shape().dim_size(0)==b && grad_out_tensor.shape().dim_size(1)==m && grad_out_tensor.shape().dim_size(2)==nsample && grad_out_tensor.shape().dim_size(3)==c, errors::InvalidArgument("GroupPointGrad expects (batch_size, npoints, nsample, channel) grad_out shape")); Tensor * grad_points_tensor = nullptr; OP_REQUIRES_OK(context, context->allocate_output(0,TensorShape{b,n,c}, &grad_points_tensor)); auto points_flat = points_tensor.flat(); const float *points = &(points_flat(0)); auto idx_flat = idx_tensor.flat(); const int *idx = &(idx_flat(0)); auto grad_out_flat = grad_out_tensor.flat(); const float *grad_out = &(grad_out_flat(0)); auto grad_points_flat = grad_points_tensor->flat(); float *grad_points = &(grad_points_flat(0)); cudaMemset(grad_points, 0, sizeof(float)*b*n*c); groupPointGradLauncher(b,n,c,m,nsample,grad_out,idx,grad_points); } }; REGISTER_KERNEL_BUILDER(Name("GroupPointGrad").Device(DEVICE_GPU),GroupPointGradGpuOp); ================================================ FILE: code/tf_ops/grouping/tf_grouping.py ================================================ import tensorflow as tf from tensorflow.python.framework import ops import sys import os BASE_DIR = os.path.dirname(os.path.abspath(__file__)) sys.path.append(BASE_DIR) grouping_module=tf.load_op_library(os.path.join(BASE_DIR, 'tf_grouping_so.so')) 
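The wrappers in this file only work after `tf_grouping_so.so` has been compiled. As a reference for the `QueryBallPoint` semantics, here is a brute-force NumPy version; it is a sketch, not part of the repository (`query_ball_point_np` is a hypothetical name), and it assumes the kernel's convention of padding unused index slots with the first neighbor found.

```python
import numpy as np

def query_ball_point_np(radius, nsample, xyz1, xyz2):
    """Reference for the QueryBallPoint op.

    xyz1: (b, n, 3) dataset points, xyz2: (b, m, 3) query points.
    Returns idx (b, m, nsample) and pts_cnt (b, m); pts_cnt is the
    number of distinct neighbors found within the ball, and slots
    beyond it repeat the first neighbor's index.
    """
    b, m, _ = xyz2.shape
    idx = np.zeros((b, m, nsample), dtype=np.int32)
    pts_cnt = np.zeros((b, m), dtype=np.int32)
    for i in range(b):
        # (m, n) squared distances from each query point to the dataset
        d2 = ((xyz2[i][:, None, :] - xyz1[i][None, :, :]) ** 2).sum(axis=-1)
        for j in range(m):
            ball = np.flatnonzero(d2[j] < radius ** 2)[:nsample]
            pts_cnt[i, j] = len(ball)
            if len(ball) > 0:
                idx[i, j, :] = ball[0]          # pad with first neighbor
                idx[i, j, :len(ball)] = ball
    return idx, pts_cnt
```

Note that this fork's compiled op takes `radius` as a tensor input (the wrapper below passes arguments as `(xyz1, xyz2, radius, nsample)`), whereas this sketch treats it as a Python scalar.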
def query_ball_point(radius, nsample, xyz1, xyz2):
    '''
    Input:
        radius: float32, ball search radius
        nsample: int32, number of points selected in each ball region
        xyz1: (batch_size, ndataset, 3) float32 array, input points
        xyz2: (batch_size, npoint, 3) float32 array, query points
    Output:
        idx: (batch_size, npoint, nsample) int32 array, indices to input points
        pts_cnt: (batch_size, npoint) int32 array, number of unique points in each local region
    '''
    #return grouping_module.query_ball_point(radius, nsample, xyz1, xyz2)
    return grouping_module.query_ball_point(xyz1, xyz2, radius, nsample)
ops.NoGradient('QueryBallPoint')

def select_top_k(k, dist):
    '''
    Input:
        k: int32, number of k SMALLEST elements selected
        dist: (b,m,n) float32 array, distance matrix, m query points, n dataset points
    Output:
        idx: (b,m,n) int32 array, first k in n are indices to the top k
        dist_out: (b,m,n) float32 array, first k in n are the top k
    '''
    return grouping_module.selection_sort(dist, k)
ops.NoGradient('SelectionSort')

def group_point(points, idx):
    '''
    Input:
        points: (batch_size, ndataset, channel) float32 array, points to sample from
        idx: (batch_size, npoint, nsample) int32 array, indices to points
    Output:
        out: (batch_size, npoint, nsample, channel) float32 array, values sampled from points
    '''
    return grouping_module.group_point(points, idx)

@tf.RegisterGradient('GroupPoint')
def _group_point_grad(op, grad_out):
    points = op.inputs[0]
    idx = op.inputs[1]
    return [grouping_module.group_point_grad(points, idx, grad_out), None]

def knn_point(k, xyz1, xyz2):
    '''
    Input:
        k: int32, number of k in k-nn search
        xyz1: (batch_size, ndataset, c) float32 array, input points
        xyz2: (batch_size, npoint, c) float32 array, query points
    Output:
        val: (batch_size, npoint, k) float32 array, L2 distances
        idx: (batch_size, npoint, k) int32 array, indices to input points
    '''
    # b = xyz1.get_shape()[0].value
    # n = xyz1.get_shape()[1].value
    # c = xyz1.get_shape()[2].value
    # m = xyz2.get_shape()[1].value
    # xyz1 =
tf.tile(tf.reshape(xyz1, (b,1,n,c)), [1,m,1,1])
    # xyz2 = tf.tile(tf.reshape(xyz2, (b,m,1,c)), [1,1,n,1])
    xyz1 = tf.expand_dims(xyz1, axis=1)
    xyz2 = tf.expand_dims(xyz2, axis=2)
    dist = tf.reduce_sum((xyz1-xyz2)**2, -1)
    # outi, out = select_top_k(k, dist)
    # idx = tf.slice(outi, [0,0,0], [-1,-1,k])
    # val = tf.slice(out, [0,0,0], [-1,-1,k])
    val, idx = tf.nn.top_k(-dist, k=k)  # ONLY SUPPORT CPU
    return val, idx

if __name__=='__main__':
    knn=True
    import numpy as np
    import time
    np.random.seed(100)
    pts = np.random.random((32,512,64)).astype('float32')
    tmp1 = np.random.random((32,512,3)).astype('float32')
    tmp2 = np.random.random((32,128,3)).astype('float32')
    with tf.device('/gpu:1'):
        points = tf.constant(pts)
        xyz1 = tf.constant(tmp1)
        xyz2 = tf.constant(tmp2)
        radius = 0.1
        nsample = 64
        if knn:
            _, idx = knn_point(nsample, xyz1, xyz2)
            grouped_points = group_point(points, idx)
        else:
            idx, _ = query_ball_point(radius, nsample, xyz1, xyz2)
            grouped_points = group_point(points, idx)
        #grouped_points_grad = tf.ones_like(grouped_points)
        #points_grad = tf.gradients(grouped_points, points, grouped_points_grad)
    with tf.Session('') as sess:
        now = time.time()
        for _ in range(100):
            ret = sess.run(grouped_points)
        print time.time() - now
        print ret.shape, ret.dtype
        print ret

================================================
FILE: code/tf_ops/grouping/tf_grouping_compile.sh
================================================
#!/bin/bash
/usr/local/cuda-8.0/bin/nvcc tf_grouping_g.cu -o tf_grouping_g.cu.o -c -O2 -DGOOGLE_CUDA=1 -x cu -Xcompiler -fPIC
g++ -std=c++11 tf_grouping.cpp tf_grouping_g.cu.o -o tf_grouping_so.so -shared -fPIC -I /home/ganeshiyer/tensorflow/lib/python2.7/site-packages/tensorflow/include -I /usr/local/cuda-8.0/include -lcudart -L /usr/local/cuda-8.0/lib64/ -O2 -D_GLIBCXX_USE_CXX11_ABI=0

================================================
FILE: code/tf_ops/grouping/tf_grouping_g.cu
================================================
// input: radius (1), nsample (1), xyz1 (b,n,3), xyz2 (b,m,3)
// output: idx (b,m,nsample), pts_cnt (b,m)
__global__ void query_ball_point_gpu(int b, int n, int m, const float *radius, int nsample, const float *xyz1, const float *xyz2, int *idx, int *pts_cnt) {
    int batch_index = blockIdx.x;
    xyz1 += n*3*batch_index;
    xyz2 += m*3*batch_index;
    idx += m*nsample*batch_index;
    pts_cnt += m*batch_index; // counting how many unique points selected in local region

    int index = threadIdx.x;
    int stride = blockDim.x;
    for (int j=index;j<m;j+=stride) {
        int cnt = 0;
        for (int k=0;k<n;++k) {
            if (cnt == nsample)
                break; // only pick the FIRST nsample points in the ball
            float x2=xyz2[j*3+0];
            float y2=xyz2[j*3+1];
            float z2=xyz2[j*3+2];
            float x1=xyz1[k*3+0];
            float y1=xyz1[k*3+1];
            float z1=xyz1[k*3+2];
            float d=max(sqrtf((x2-x1)*(x2-x1)+(y2-y1)*(y2-y1)+(z2-z1)*(z2-z1)),1e-20f);
            if (d<radius[0]) { // radius arrives as a single-element device array in this fork
                if (cnt==0) { // set ALL indices to k, so rows with fewer points than nsample still hold valid (repeated) indices
                    for (int l=0;l<nsample;++l)
                        idx[j*nsample+l] = k;
                }
                idx[j*nsample+cnt] = k;
                cnt+=1;
            }
        }
        pts_cnt[j] = cnt;
    }
}

// input: k (1), distance matrix dist (b,m,n)
// output: idx (b,m,n), val (b,m,n)
__global__ void selection_sort_gpu(int b, int n, int m, int k, const float *dist, int *outi, float *out) {
    int batch_index = blockIdx.x;
    dist += m*n*batch_index;
    outi += m*n*batch_index;
    out += m*n*batch_index;

    int index = threadIdx.x;
    int stride = blockDim.x;
    // copy from dist to out, initialize indices
    for (int j=index;j<m;j+=stride) {
        for (int s=0;s<n;++s) {
            out[j*n+s] = dist[j*n+s];
            outi[j*n+s] = s;
        }
    }
    // partial selection sort: only the first k entries of each row end up sorted
    for (int j=index;j<m;j+=stride) {
        float *p_dist = out + j*n;
        for (int s=0;s<k;++s) {
            int min = s;
            for (int t=s+1;t<n;++t) {
                if (p_dist[t] < p_dist[min])
                    min = t;
            }
            if (min != s) {
                float tmp = p_dist[min];
                p_dist[min] = p_dist[s];
                p_dist[s] = tmp;
                int tmpi = outi[j*n+min];
                outi[j*n+min] = outi[j*n+s];
                outi[j*n+s] = tmpi;
            }
        }
    }
}

// input: points (b,n,c), idx (b,m,nsample)
// output: out (b,m,nsample,c)
__global__ void group_point_gpu(int b, int n, int c, int m, int nsample, const float *points, const int *idx, float *out) {
    int batch_index = blockIdx.x;
    points += n*c*batch_index;
    idx += m*nsample*batch_index;
    out += m*nsample*c*batch_index;

    int index = threadIdx.x;
    int stride = blockDim.x;
    for (int j=index;j<m;j+=stride) {
        for (int k=0;k<nsample;++k) {
            int ii = idx[j*nsample+k];
            for (int l=0;l<c;++l) {
                out[j*nsample*c+k*c+l] = points[ii*c+l];
            }
        }
    }
}

// input: grad_out (b,m,nsample,c), idx (b,m,nsample)
// output: grad_points (b,n,c)
__global__ void group_point_grad_gpu(int b, int n, int c, int m, int nsample, const float *grad_out, const int *idx, float *grad_points) {
    int batch_index = blockIdx.x;
    idx += m*nsample*batch_index;
    grad_out += m*nsample*c*batch_index;
    grad_points += n*c*batch_index;

    int index = threadIdx.x;
    int stride = blockDim.x;
    for (int j=index;j<m;j+=stride) {
        for (int k=0;k<nsample;++k) {
            int ii = idx[j*nsample+k];
            for (int l=0;l<c;++l) {
                atomicAdd(&grad_points[ii*c+l], grad_out[j*nsample*c+k*c+l]);
            }
        }
    }
}

void queryBallPointLauncher(int b, int n, int m, const float *radius, int nsample, const float *xyz1, const float *xyz2, int *idx, int *pts_cnt) {
    query_ball_point_gpu<<<b,256>>>(b,n,m,radius,nsample,xyz1,xyz2,idx,pts_cnt);
    //cudaDeviceSynchronize();
}
void selectionSortLauncher(int b, int n, int m, int k, const float *dist, int *outi, float *out) {
    selection_sort_gpu<<<b,256>>>(b,n,m,k,dist,outi,out);
    //cudaDeviceSynchronize();
}
void groupPointLauncher(int b, int n, int c, int m, int nsample, const float *points, const int *idx, float *out){
    group_point_gpu<<<b,256>>>(b,n,c,m,nsample,points,idx,out);
    //cudaDeviceSynchronize();
}
void groupPointGradLauncher(int b, int n, int c, int m, int nsample, const float *grad_out, const int *idx, float *grad_points){
    group_point_grad_gpu<<<b,256>>>(b,n,c,m,nsample,grad_out,idx,grad_points);
    //group_point_grad_gpu<<<1,1>>>(b,n,c,m,nsample,grad_out,idx,grad_points);
    //cudaDeviceSynchronize();
}

================================================
FILE: code/tf_ops/grouping/tf_grouping_op_test.py
================================================
import tensorflow as tf
import numpy as np
from tf_grouping import query_ball_point, group_point

class GroupPointTest(tf.test.TestCase):
    def test(self):
        pass

    def test_grad(self):
        with tf.device('/gpu:0'):
            points = tf.constant(np.random.random((1,128,16)).astype('float32'))
            print points
            xyz1 = tf.constant(np.random.random((1,128,3)).astype('float32'))
            xyz2 = tf.constant(np.random.random((1,8,3)).astype('float32'))
            radius = 0.3
            nsample = 32
            idx, pts_cnt = query_ball_point(radius, nsample, xyz1, xyz2)
            grouped_points = group_point(points, idx)
            print grouped_points
        with self.test_session():
            print "---- Going to compute gradient error"
            err = tf.test.compute_gradient_error(points, (1,128,16), grouped_points, (1,8,32,16))
            print err
            self.assertLess(err, 1e-4)

if __name__=='__main__':
    tf.test.main()

================================================
FILE: code/tf_ops/interpolation/__init__.py
================================================


================================================
FILE: code/tf_ops/interpolation/interpolate.cpp
================================================
#include <cstdio>
#include <ctime>
#include <cstring> // memset
#include <cstdlib> // rand, RAND_MAX
#include <cmath> // sqrtf
#include <vector>
#include <algorithm>
using namespace std;

float randomf(){
    return (rand()+0.5)/(RAND_MAX+1.0);
}
static double get_time(){
    timespec tp;
    clock_gettime(CLOCK_MONOTONIC,&tp);
    return tp.tv_sec+tp.tv_nsec*1e-9;
}
// Find three nearest neigbors with square distance
// input: xyz1 (b,n,3), xyz2 (b,m,3)
// output: dist (b,n,3), idx (b,n,3)
void threenn_cpu(int b, int n, int m, const float *xyz1, const float *xyz2, float *dist, int *idx) {
    for (int i=0;i<b;++i) {
        for (int j=0;j<n;++j) {
            float x1=xyz1[j*3+0];
            float y1=xyz1[j*3+1];
            float z1=xyz1[j*3+2];
            double best1=1e40; double best2=1e40; double best3=1e40;
            int besti1=0; int besti2=0; int besti3=0;
            for (int k=0;k<m;++k) {
                float x2=xyz2[k*3+0];
                float y2=xyz2[k*3+1];
                float z2=xyz2[k*3+2];
                double d=(x2-x1)*(x2-x1)+(y2-y1)*(y2-y1)+(z2-z1)*(z2-z1);
                if (d<best1) {
                    best3=best2;besti3=besti2;
                    best2=best1;besti2=besti1;
                    best1=d;besti1=k;
                } else if (d<best2) {
                    best3=best2;besti3=besti2;
                    best2=d;besti2=k;
                } else if (d<best3) {
                    best3=d;besti3=k;
                }
            }
            dist[j*3]=best1;idx[j*3]=besti1;
            dist[j*3+1]=best2;idx[j*3+1]=besti2;
            dist[j*3+2]=best3;idx[j*3+2]=besti3;
        }
        xyz1+=n*3;
        xyz2+=m*3;
        dist+=n*3;
        idx+=n*3;
    }
}

================================================
FILE: code/tf_ops/interpolation/tf_interpolate.cpp
================================================
#include <cstdio>
#include <ctime>
#include <cstring> // memset
#include <cstdlib> // rand, RAND_MAX
#include <cmath> // sqrtf
#include "tensorflow/core/framework/op.h"
#include "tensorflow/core/framework/op_kernel.h"
#include "tensorflow/core/framework/shape_inference.h"
#include "tensorflow/core/framework/common_shape_fns.h"
using namespace tensorflow;

REGISTER_OP("ThreeNN")
    .Input("xyz1: float32")
    .Input("xyz2: float32")
    .Output("dist: float32")
    .Output("idx: int32")
    .SetShapeFn([](::tensorflow::shape_inference::InferenceContext* c) {
        c->set_output(0, c->input(0));
        c->set_output(1, c->input(0));
        return Status::OK();
    });
REGISTER_OP("ThreeInterpolate")
    .Input("points: float32")
    .Input("idx: int32")
    .Input("weight: float32")
    .Output("out: float32")
    .SetShapeFn([](::tensorflow::shape_inference::InferenceContext* c) {
        ::tensorflow::shape_inference::ShapeHandle dims1; // (b,m,c)
        c->WithRank(c->input(0), 3, &dims1);
        ::tensorflow::shape_inference::ShapeHandle dims2; // (b,n,3)
        c->WithRank(c->input(1), 3, &dims2);
        // (b,n,c)
        ::tensorflow::shape_inference::ShapeHandle output = c->MakeShape({c->Dim(dims1, 0), c->Dim(dims2, 1), c->Dim(dims1, 2)});
        c->set_output(0, output);
        return Status::OK();
    });
REGISTER_OP("ThreeInterpolateGrad")
    .Input("points: float32")
    .Input("idx: int32")
    .Input("weight: float32")
    .Input("grad_out: float32")
    .Output("grad_points: float32")
    .SetShapeFn([](::tensorflow::shape_inference::InferenceContext* c) {
        c->set_output(0, c->input(0));
        return Status::OK();
    });

float randomf(){
    return (rand()+0.5)/(RAND_MAX+1.0);
}
static double get_time(){
    timespec tp;
    clock_gettime(CLOCK_MONOTONIC,&tp);
    return tp.tv_sec+tp.tv_nsec*1e-9;
}
// Find three nearest neigbors with square distance
// input: xyz1 (b,n,3), xyz2 (b,m,3)
// output: dist (b,n,3), idx (b,n,3)
void threenn_cpu(int b, int n, int m, const float *xyz1, const float *xyz2, float *dist, int *idx) {
    for (int i=0;i<b;++i) {
        for (int j=0;j<n;++j) {
            float x1=xyz1[j*3+0];
            float y1=xyz1[j*3+1];
            float z1=xyz1[j*3+2];
            double best1=1e40; double best2=1e40; double best3=1e40;
            int besti1=0; int besti2=0; int besti3=0;
            for (int k=0;k<m;++k) {
                float x2=xyz2[k*3+0];
                float y2=xyz2[k*3+1];
                float z2=xyz2[k*3+2];
                double d=(x2-x1)*(x2-x1)+(y2-y1)*(y2-y1)+(z2-z1)*(z2-z1);
                if (d<best1) {
                    best3=best2;besti3=besti2;
                    best2=best1;besti2=besti1;
                    best1=d;besti1=k;
                } else if (d<best2) {
                    best3=best2;besti3=besti2;
                    best2=d;besti2=k;
                } else if (d<best3) {
                    best3=d;besti3=k;
                }
            }
            dist[j*3]=best1;idx[j*3]=besti1;
            dist[j*3+1]=best2;idx[j*3+1]=besti2;
            dist[j*3+2]=best3;idx[j*3+2]=besti3;
        }
        xyz1+=n*3;
        xyz2+=m*3;
        dist+=n*3;
        idx+=n*3;
    }
}
// input: points (b,m,c), idx (b,n,3), weight (b,n,3)
// output: out (b,n,c)
void threeinterpolate_cpu(int b, int m, int c, int n, const float *points, const int *idx, const float *weight, float *out) {
    float w1,w2,w3;
    int i1,i2,i3;
    for (int i=0;i<b;++i) {
        for (int j=0;j<n;++j) {
            w1=weight[j*3];
            w2=weight[j*3+1];
            w3=weight[j*3+2];
            i1=idx[j*3];
            i2=idx[j*3+1];
            i3=idx[j*3+2];
            for (int l=0;l<c;++l) {
                out[j*c+l] = points[i1*c+l]*w1 + points[i2*c+l]*w2 + points[i3*c+l]*w3;
            }
        }
        points+=m*c;
        idx+=n*3;
        weight+=n*3;
        out+=n*c;
    }
}
// input: grad_out (b,n,c), idx (b,n,3), weight (b,n,3)
// output: grad_points (b,m,c)
void threeinterpolate_grad_cpu(int b, int n, int c, int m, const float *grad_out, const int *idx, const float *weight, float *grad_points) {
    float w1,w2,w3;
    int i1,i2,i3;
    for (int i=0;i<b;++i) {
        for (int j=0;j<n;++j) {
            w1=weight[j*3];
            w2=weight[j*3+1];
            w3=weight[j*3+2];
            i1=idx[j*3];
            i2=idx[j*3+1];
            i3=idx[j*3+2];
            for (int l=0;l<c;++l) {
                grad_points[i1*c+l] += grad_out[j*c+l]*w1;
                grad_points[i2*c+l] += grad_out[j*c+l]*w2;
                grad_points[i3*c+l] += grad_out[j*c+l]*w3;
            }
        }
        grad_out+=n*c;
        idx+=n*3;
        weight+=n*3;
        grad_points+=m*c;
    }
}

class ThreeNNOp : public OpKernel {
    public:
        explicit ThreeNNOp(OpKernelConstruction* context) : OpKernel(context) {}

        void Compute(OpKernelContext* context) override {
            const Tensor& xyz1_tensor = context->input(0);
            OP_REQUIRES(context, xyz1_tensor.dims()==3 && xyz1_tensor.shape().dim_size(2)==3, errors::InvalidArgument("ThreeNN expects (b,n,3) xyz1 shape."));
            int b = xyz1_tensor.shape().dim_size(0);
            int n = xyz1_tensor.shape().dim_size(1);

            const Tensor& xyz2_tensor = context->input(1);
            OP_REQUIRES(context, xyz2_tensor.dims()==3 && xyz2_tensor.shape().dim_size(2)==3, errors::InvalidArgument("ThreeNN expects (b,m,3) xyz2 shape."));
            int m = xyz2_tensor.shape().dim_size(1);

            Tensor *dist_tensor = nullptr;
            OP_REQUIRES_OK(context, context->allocate_output(0, TensorShape{b,n,3}, &dist_tensor));
            Tensor *idx_tensor = nullptr;
            OP_REQUIRES_OK(context, context->allocate_output(1, TensorShape{b,n,3}, &idx_tensor));

            auto xyz1_flat = xyz1_tensor.flat<float>();
            const float *xyz1 = &(xyz1_flat(0));
            auto xyz2_flat = xyz2_tensor.flat<float>();
            const float *xyz2 = &(xyz2_flat(0));
            auto dist_flat = dist_tensor->flat<float>();
            float *dist = &(dist_flat(0));
            auto idx_flat = idx_tensor->flat<int>();
            int *idx = &(idx_flat(0));
            threenn_cpu(b,n,m,xyz1,xyz2,dist,idx);
        }
};
REGISTER_KERNEL_BUILDER(Name("ThreeNN").Device(DEVICE_CPU), ThreeNNOp);

class ThreeInterpolateOp: public OpKernel{
    public:
        explicit ThreeInterpolateOp(OpKernelConstruction * context):OpKernel(context){}

        void Compute(OpKernelContext * context) override {
            const Tensor& points_tensor=context->input(0);
            OP_REQUIRES(context, points_tensor.dims()==3, errors::InvalidArgument("ThreeInterpolate expects (b,m,c) points shape"));
            int b = points_tensor.shape().dim_size(0);
            int m = points_tensor.shape().dim_size(1);
            int c = points_tensor.shape().dim_size(2);

            const Tensor& idx_tensor=context->input(1);
            OP_REQUIRES(context,idx_tensor.dims()==3 && idx_tensor.shape().dim_size(0)==b && idx_tensor.shape().dim_size(2)==3, errors::InvalidArgument("ThreeInterpolate expects (b,n,3) idx shape"));
            int n = idx_tensor.shape().dim_size(1);

            const Tensor& weight_tensor=context->input(2);
            OP_REQUIRES(context,weight_tensor.dims()==3 && weight_tensor.shape().dim_size(0)==b && weight_tensor.shape().dim_size(1)==n && weight_tensor.shape().dim_size(2)==3, errors::InvalidArgument("ThreeInterpolate expects (b,n,3) weight shape"));

            Tensor * out_tensor = nullptr;
            OP_REQUIRES_OK(context, context->allocate_output(0,TensorShape{b,n,c}, &out_tensor));

            auto points_flat = points_tensor.flat<float>();
            const float *points = &(points_flat(0));
            auto idx_flat = idx_tensor.flat<int>();
            const int *idx = &(idx_flat(0));
            auto weight_flat = weight_tensor.flat<float>();
            const float *weight = &(weight_flat(0));
            auto out_flat = out_tensor->flat<float>();
            float *out = &(out_flat(0));
            threeinterpolate_cpu(b,m,c,n,points,idx,weight,out);
        }
};
REGISTER_KERNEL_BUILDER(Name("ThreeInterpolate").Device(DEVICE_CPU),ThreeInterpolateOp);

class ThreeInterpolateGradOp: public OpKernel{
    public:
        explicit ThreeInterpolateGradOp(OpKernelConstruction * context):OpKernel(context){}

        void Compute(OpKernelContext * context) override {
            const Tensor& points_tensor=context->input(0);
            OP_REQUIRES(context, points_tensor.dims()==3, errors::InvalidArgument("ThreeInterpolateGrad expects (b,m,c) points shape"));
            int b = points_tensor.shape().dim_size(0);
            int m = points_tensor.shape().dim_size(1);
            int c = points_tensor.shape().dim_size(2);

            const Tensor& idx_tensor=context->input(1);
            OP_REQUIRES(context,idx_tensor.dims()==3 && idx_tensor.shape().dim_size(0)==b, errors::InvalidArgument("ThreeInterpolateGrad expects (b,n,3) idx shape"));
            int n = idx_tensor.shape().dim_size(1);

            const Tensor& weight_tensor=context->input(2);
            OP_REQUIRES(context,weight_tensor.dims()==3 && weight_tensor.shape().dim_size(0)==b && weight_tensor.shape().dim_size(1)==n && weight_tensor.shape().dim_size(2)==3, errors::InvalidArgument("ThreeInterpolateGrad expects (b,n,3) weight shape"));

            const Tensor& grad_out_tensor=context->input(3);
            OP_REQUIRES(context,grad_out_tensor.dims()==3 && grad_out_tensor.shape().dim_size(0)==b && grad_out_tensor.shape().dim_size(1)==n && grad_out_tensor.shape().dim_size(2)==c, errors::InvalidArgument("ThreeInterpolateGrad expects (b,n,c) grad_out shape"));

            Tensor * grad_points_tensor = nullptr;
            OP_REQUIRES_OK(context, context->allocate_output(0,TensorShape{b,m,c}, &grad_points_tensor));

            auto points_flat = points_tensor.flat<float>();
            const float *points = &(points_flat(0));
            auto idx_flat = idx_tensor.flat<int>();
            const int *idx = &(idx_flat(0));
            auto weight_flat = weight_tensor.flat<float>();
            const float *weight = &(weight_flat(0));
            auto grad_out_flat = grad_out_tensor.flat<float>();
            const float *grad_out = &(grad_out_flat(0));
            auto grad_points_flat = grad_points_tensor->flat<float>();
            float *grad_points = &(grad_points_flat(0));
            memset(grad_points, 0, sizeof(float)*b*m*c);
            threeinterpolate_grad_cpu(b,n,c,m,grad_out,idx,weight,grad_points);
        }
};
REGISTER_KERNEL_BUILDER(Name("ThreeInterpolateGrad").Device(DEVICE_CPU),ThreeInterpolateGradOp);

================================================
FILE: code/tf_ops/interpolation/tf_interpolate.py
================================================
import tensorflow as tf
from tensorflow.python.framework import ops
import sys
import os
BASE_DIR = os.path.dirname(__file__)
sys.path.append(BASE_DIR)
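# Usage sketch for the ops this module wraps (illustrative only: the tensor
# names `unknown_xyz`, `known_xyz` and `known_feats` are hypothetical, and
# tf_interpolate_so.so must first be built with tf_interpolate_compile.sh).
# Shapes follow the docstrings below:
#
#   dist, idx = three_nn(unknown_xyz, known_xyz)          # both (b,n,3)
#   weight = tf.ones_like(dist) / 3.0                     # equal weights over the 3 neighbors
#   feats = three_interpolate(known_feats, idx, weight)   # (b,n,c)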
interpolate_module=tf.load_op_library(os.path.join(BASE_DIR, 'tf_interpolate_so.so')) def three_nn(xyz1, xyz2): ''' Input: xyz1: (b,n,3) float32 array, unknown points xyz2: (b,m,3) float32 array, known points Output: dist: (b,n,3) float32 array, distances to known points idx: (b,n,3) int32 array, indices to known points ''' return interpolate_module.three_nn(xyz1, xyz2) ops.NoGradient('ThreeNN') def three_interpolate(points, idx, weight): ''' Input: points: (b,m,c) float32 array, known points idx: (b,n,3) int32 array, indices to known points weight: (b,n,3) float32 array, weights on known points Output: out: (b,n,c) float32 array, interpolated point values ''' return interpolate_module.three_interpolate(points, idx, weight) @tf.RegisterGradient('ThreeInterpolate') def _three_interpolate_grad(op, grad_out): points = op.inputs[0] idx = op.inputs[1] weight = op.inputs[2] return [interpolate_module.three_interpolate_grad(points, idx, weight, grad_out), None, None] if __name__=='__main__': import numpy as np import time np.random.seed(100) pts = np.random.random((32,128,64)).astype('float32') tmp1 = np.random.random((32,512,3)).astype('float32') tmp2 = np.random.random((32,128,3)).astype('float32') with tf.device('/cpu:0'): points = tf.constant(pts) xyz1 = tf.constant(tmp1) xyz2 = tf.constant(tmp2) dist, idx = three_nn(xyz1, xyz2) weight = tf.ones_like(dist)/3.0 interpolated_points = three_interpolate(points, idx, weight) with tf.Session('') as sess: now = time.time() for _ in range(100): ret = sess.run(interpolated_points) print time.time() - now print ret.shape, ret.dtype #print ret ================================================ FILE: code/tf_ops/interpolation/tf_interpolate_compile.sh ================================================ g++ -std=c++11 tf_interpolate.cpp -o tf_interpolate_so.so -shared -fPIC -I /home/ganeshiyer/tensorflow/lib/python2.7/site-packages/tensorflow/include -I /usr/local/cuda-8.0/include -lcudart -L /usr/local/cuda-8.0/lib64/ -O2 
-D_GLIBCXX_USE_CXX11_ABI=0 ================================================ FILE: code/tf_ops/interpolation/tf_interpolate_op_test.py ================================================ import tensorflow as tf import numpy as np from tf_interpolate import three_nn, three_interpolate class GroupPointTest(tf.test.TestCase): def test(self): pass def test_grad(self): with self.test_session(): points = tf.constant(np.random.random((1,8,16)).astype('float32')) print points xyz1 = tf.constant(np.random.random((1,128,3)).astype('float32')) xyz2 = tf.constant(np.random.random((1,8,3)).astype('float32')) dist, idx = three_nn(xyz1, xyz2) weight = tf.ones_like(dist)/3.0 interpolated_points = three_interpolate(points, idx, weight) print interpolated_points err = tf.test.compute_gradient_error(points, (1,8,16), interpolated_points, (1,128,16)) print err self.assertLess(err, 1e-4) if __name__=='__main__': tf.test.main() ================================================ FILE: code/tf_ops/interpolation/visu_interpolation.py ================================================ ''' Visualize part segmentation ''' import os import sys ROOT_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__))) sys.path.append('/home/rqi/Projects/toolkits/visualization') from show3d_balls import showpoints import numpy as np from tf_interpolate import three_nn, three_interpolate import tensorflow as tf pts2 = np.array([[0,0,1],[1,0,0],[0,1,0],[1,1,0]]).astype('float32') xyz1 = np.random.random((100,3)).astype('float32') xyz2 = np.array([[0,0,0],[1,0,0],[0,1,0],[1,1,1]]).astype('float32') def fun(xyz1,xyz2,pts2): with tf.device('/cpu:0'): points = tf.constant(np.expand_dims(pts2,0)) xyz1 = tf.constant(np.expand_dims(xyz1,0)) xyz2 = tf.constant(np.expand_dims(xyz2,0)) dist, idx = three_nn(xyz1, xyz2) #weight = tf.ones_like(dist)/3.0 dist = tf.maximum(dist, 1e-10) norm = tf.reduce_sum((1.0/dist),axis=2,keep_dims=True) norm = tf.tile(norm, [1,1,3]) print norm weight = (1.0/dist) / norm 
        interpolated_points = three_interpolate(points, idx, weight)
    with tf.Session('') as sess:
        tmp,pts1,d,w = sess.run([xyz1, interpolated_points, dist, weight])
        #print w
        pts1 = pts1.squeeze()
    return pts1

pts1 = fun(xyz1,xyz2,pts2)
all_pts = np.zeros((104,3))
all_pts[0:100,:] = pts1
all_pts[100:,:] = pts2
all_xyz = np.zeros((104,3))
all_xyz[0:100,:]=xyz1
all_xyz[100:,:]=xyz2
showpoints(xyz2, pts2, ballradius=8)
showpoints(xyz1, pts1, ballradius=8)
showpoints(all_xyz, all_pts, ballradius=8)

================================================
FILE: code/tf_ops/sampling/__init__.py
================================================


================================================
FILE: code/tf_ops/sampling/tf_sampling.cpp
================================================
/* Furthest point sampling
 * Original author: Haoqiang Fan
 * Modified by Charles R. Qi
 * All Rights Reserved. 2017.
 */
#include "tensorflow/core/framework/op.h"
#include "tensorflow/core/framework/op_kernel.h"
#include "tensorflow/core/framework/shape_inference.h"
#include "tensorflow/core/framework/common_shape_fns.h"
#include <cuda_runtime.h>
using namespace tensorflow;

REGISTER_OP("ProbSample")
    .Input("inp: float32")
    .Input("inpr: float32")
    .Output("out: int32")
    .SetShapeFn([](::tensorflow::shape_inference::InferenceContext* c) {
        ::tensorflow::shape_inference::ShapeHandle dims1; // batch_size * ncategory
        c->WithRank(c->input(0), 2, &dims1);
        ::tensorflow::shape_inference::ShapeHandle dims2; // batch_size * npoints
        c->WithRank(c->input(1), 2, &dims2);
        // batch_size * npoints
        ::tensorflow::shape_inference::ShapeHandle output = c->MakeShape({c->Dim(dims2, 0), c->Dim(dims2, 1)});
        c->set_output(0, output);
        return Status::OK();
    });
REGISTER_OP("FarthestPointSample")
    .Attr("npoint: int")
    .Input("inp: float32")
    .Output("out: int32")
    .SetShapeFn([](::tensorflow::shape_inference::InferenceContext* c) {
        ::tensorflow::shape_inference::ShapeHandle dims1; // batch_size * npoint * 3
        c->WithRank(c->input(0), 3, &dims1);
        int npoint;
        TF_RETURN_IF_ERROR(c->GetAttr("npoint", &npoint));
        ::tensorflow::shape_inference::ShapeHandle output = c->MakeShape({c->Dim(dims1, 0), npoint});
        c->set_output(0, output);
        return Status::OK();
    });
REGISTER_OP("GatherPoint")
    .Input("inp: float32")
    .Input("idx: int32")
    .Output("out: float32")
    .SetShapeFn([](::tensorflow::shape_inference::InferenceContext* c) {
        ::tensorflow::shape_inference::ShapeHandle dims1; // batch_size * ndataset * 3
        c->WithRank(c->input(0), 3, &dims1);
        ::tensorflow::shape_inference::ShapeHandle dims2; // batch_size * npoints
        c->WithRank(c->input(1), 2, &dims2);
        // batch_size * npoints * 3
        ::tensorflow::shape_inference::ShapeHandle output = c->MakeShape({c->Dim(dims1, 0), c->Dim(dims2, 1), c->Dim(dims1, 2)});
        c->set_output(0, output);
        return Status::OK();
    });
REGISTER_OP("GatherPointGrad")
    .Input("inp: float32")
    .Input("idx: int32")
    .Input("out_g: float32")
    .Output("inp_g: float32")
    .SetShapeFn([](::tensorflow::shape_inference::InferenceContext* c) {
        c->set_output(0, c->input(0));
        return Status::OK();
    });

void probsampleLauncher(int b,int n,int m,const float * inp_p,const float * inp_r,float * temp,int * out);
class ProbSampleGpuOp: public OpKernel{
    public:
        explicit ProbSampleGpuOp(OpKernelConstruction* context):OpKernel(context){}
        void Compute(OpKernelContext * context)override{
            const Tensor& inp_tensor=context->input(0);
            const Tensor& inpr_tensor=context->input(1);
            auto inp_flat=inp_tensor.flat<float>();
            auto inpr_flat=inpr_tensor.flat<float>();
            const float * inp=&(inp_flat(0));
            const float * inpr=&(inpr_flat(0));
            OP_REQUIRES(context,inp_tensor.dims()==2,errors::InvalidArgument("ProbSample expects (batch_size,num_choices) inp shape"));
            int b=inp_tensor.shape().dim_size(0);
            int n=inp_tensor.shape().dim_size(1);
            OP_REQUIRES(context,inpr_tensor.dims()==2 && inpr_tensor.shape().dim_size(0)==b,errors::InvalidArgument("ProbSample expects (batch_size,num_points) inpr shape"));
            int m=inpr_tensor.shape().dim_size(1);
            Tensor * out_tensor=NULL;
            OP_REQUIRES_OK(context,context->allocate_output(0,TensorShape{b,m},&out_tensor));
            auto out_flat=out_tensor->flat<int>();
            int * out=&(out_flat(0));
            Tensor temp_tensor;
            OP_REQUIRES_OK(context,context->allocate_temp(DataTypeToEnum<float>::value,TensorShape{b,n},&temp_tensor));
            auto temp_flat=temp_tensor.flat<float>();
            float * temp=&(temp_flat(0));
            probsampleLauncher(b,n,m,inp,inpr,temp,out);
        }
};
REGISTER_KERNEL_BUILDER(Name("ProbSample").Device(DEVICE_GPU), ProbSampleGpuOp);

void farthestpointsamplingLauncher(int b,int n,int m,const float * inp,float * temp,int * out);
class FarthestPointSampleGpuOp: public OpKernel{
    public:
        explicit FarthestPointSampleGpuOp(OpKernelConstruction* context):OpKernel(context) {
            OP_REQUIRES_OK(context, context->GetAttr("npoint", &npoint_));
            OP_REQUIRES(context, npoint_ > 0, errors::InvalidArgument("FarthestPointSample expects positive npoint"));
        }
        void Compute(OpKernelContext * context)override{
            int m = npoint_;
            const Tensor& inp_tensor=context->input(0);
            OP_REQUIRES(context,inp_tensor.dims()==3 && inp_tensor.shape().dim_size(2)==3,errors::InvalidArgument("FarthestPointSample expects (batch_size,num_points,3) inp shape"));
            int b=inp_tensor.shape().dim_size(0);
            int n=inp_tensor.shape().dim_size(1);
            auto inp_flat=inp_tensor.flat<float>();
            const float * inp=&(inp_flat(0));
            Tensor * out_tensor;
            OP_REQUIRES_OK(context,context->allocate_output(0,TensorShape{b,m},&out_tensor));
            auto out_flat=out_tensor->flat<int>();
            int * out=&(out_flat(0));
            Tensor temp_tensor;
            OP_REQUIRES_OK(context,context->allocate_temp(DataTypeToEnum<float>::value,TensorShape{32,n},&temp_tensor));
            auto temp_flat=temp_tensor.flat<float>();
            float * temp=&(temp_flat(0));
            farthestpointsamplingLauncher(b,n,m,inp,temp,out);
        }
    private:
        int npoint_;
};
REGISTER_KERNEL_BUILDER(Name("FarthestPointSample").Device(DEVICE_GPU),FarthestPointSampleGpuOp);

void gatherpointLauncher(int b,int n,int m,const float * inp,const int * idx,float * out);
class GatherPointGpuOp: public OpKernel{
    public:
        explicit GatherPointGpuOp(OpKernelConstruction * context):OpKernel(context){}
        void Compute(OpKernelContext * context)override{
            const Tensor& inp_tensor=context->input(0);
            OP_REQUIRES(context,inp_tensor.dims()==3 && inp_tensor.shape().dim_size(2)==3,errors::InvalidArgument("GatherPoint expects (batch_size,num_points,3) inp shape"));
            int b=inp_tensor.shape().dim_size(0);
            int n=inp_tensor.shape().dim_size(1);
            const Tensor& idx_tensor=context->input(1);
            OP_REQUIRES(context,idx_tensor.dims()==2 && idx_tensor.shape().dim_size(0)==b,errors::InvalidArgument("GatherPoint expects (batch_size,num_result) idx shape"));
            int m=idx_tensor.shape().dim_size(1);
            auto inp_flat=inp_tensor.flat<float>();
            const float * inp=&(inp_flat(0));
            auto idx_flat=idx_tensor.flat<int>();
            const int * idx=&(idx_flat(0));
            Tensor * out_tensor=NULL;
            OP_REQUIRES_OK(context,context->allocate_output(0,TensorShape{b,m,3},&out_tensor));
            auto out_flat=out_tensor->flat<float>();
            float * out=&(out_flat(0));
            gatherpointLauncher(b,n,m,inp,idx,out);
        }
};
REGISTER_KERNEL_BUILDER(Name("GatherPoint").Device(DEVICE_GPU),GatherPointGpuOp);

void scatteraddpointLauncher(int b,int n,int m,const float * out_g,const int * idx,float * inp_g);
class GatherPointGradGpuOp: public OpKernel{
    public:
        explicit GatherPointGradGpuOp(OpKernelConstruction * context):OpKernel(context){}
        void Compute(OpKernelContext * context)override{
            const Tensor& inp_tensor=context->input(0);
            OP_REQUIRES(context,inp_tensor.dims()==3 && inp_tensor.shape().dim_size(2)==3,errors::InvalidArgument("GatherPointGradGpuOp expects (batch_size,num_points,3) inp"));
            int b=inp_tensor.shape().dim_size(0);
            int n=inp_tensor.shape().dim_size(1);
            const Tensor& idx_tensor=context->input(1);
            OP_REQUIRES(context,idx_tensor.dims()==2 && idx_tensor.shape().dim_size(0)==b,errors::InvalidArgument("GatherPointGradGpuOp expects (batch_size,num_result) idx shape"));
            int m=idx_tensor.shape().dim_size(1);
            auto inp_flat=inp_tensor.flat<float>();
            const float * inp=&(inp_flat(0));
            auto idx_flat=idx_tensor.flat<int>();
            const int * idx=&(idx_flat(0));
            const Tensor& out_g_tensor=context->input(2);
            OP_REQUIRES(context,out_g_tensor.dims()==3 && out_g_tensor.shape().dim_size(0)==b && out_g_tensor.shape().dim_size(1)==m && out_g_tensor.shape().dim_size(2)==3,errors::InvalidArgument("GatherPointGradGpuOp expects (batch_size,num_result,3) out_g shape"));
            auto out_g_flat=out_g_tensor.flat<float>();
            const float * out_g=&(out_g_flat(0));
            Tensor * inp_g_tensor=NULL;
            OP_REQUIRES_OK(context,context->allocate_output(0,TensorShape{b,n,3},&inp_g_tensor));
            auto inp_g_flat=inp_g_tensor->flat<float>();
            float * inp_g=&(inp_g_flat(0));
            cudaMemset(inp_g,0,b*n*3*4);
            scatteraddpointLauncher(b,n,m,out_g,idx,inp_g);
        }
};
REGISTER_KERNEL_BUILDER(Name("GatherPointGrad").Device(DEVICE_GPU),GatherPointGradGpuOp);

================================================
FILE: code/tf_ops/sampling/tf_sampling.py
================================================
''' Furthest point sampling
Original author: Haoqiang Fan
Modified by Charles R. Qi
All Rights Reserved. 2017.
''' import tensorflow as tf from tensorflow.python.framework import ops import sys import os BASE_DIR = os.path.dirname(os.path.abspath(__file__)) sys.path.append(BASE_DIR) sampling_module=tf.load_op_library(os.path.join(BASE_DIR, 'tf_sampling_so.so')) def prob_sample(inp,inpr): ''' input: batch_size * ncategory float32 batch_size * npoints float32 returns: batch_size * npoints int32 ''' return sampling_module.prob_sample(inp,inpr) ops.NoGradient('ProbSample') # TF1.0 API requires set shape in C++ #@tf.RegisterShape('ProbSample') #def _prob_sample_shape(op): # shape1=op.inputs[0].get_shape().with_rank(2) # shape2=op.inputs[1].get_shape().with_rank(2) # return [tf.TensorShape([shape2.dims[0],shape2.dims[1]])] def gather_point(inp,idx): ''' input: batch_size * ndataset * 3 float32 batch_size * npoints int32 returns: batch_size * npoints * 3 float32 ''' return sampling_module.gather_point(inp,idx) #@tf.RegisterShape('GatherPoint') #def _gather_point_shape(op): # shape1=op.inputs[0].get_shape().with_rank(3) # shape2=op.inputs[1].get_shape().with_rank(2) # return [tf.TensorShape([shape1.dims[0],shape2.dims[1],shape1.dims[2]])] @tf.RegisterGradient('GatherPoint') def _gather_point_grad(op,out_g): inp=op.inputs[0] idx=op.inputs[1] return [sampling_module.gather_point_grad(inp,idx,out_g),None] def farthest_point_sample(npoint,inp): ''' input: int32 batch_size * ndataset * 3 float32 returns: batch_size * npoint int32 ''' return sampling_module.farthest_point_sample(inp, npoint) ops.NoGradient('FarthestPointSample') if __name__=='__main__': import numpy as np np.random.seed(100) triangles=np.random.rand(1,5,3,3).astype('float32') with tf.device('/gpu:1'): inp=tf.constant(triangles) tria=inp[:,:,0,:] trib=inp[:,:,1,:] tric=inp[:,:,2,:] areas=tf.sqrt(tf.reduce_sum(tf.cross(trib-tria,tric-tria)**2,2)+1e-9) randomnumbers=tf.random_uniform((1,8192)) triids=prob_sample(areas,randomnumbers) tria_sample=gather_point(tria,triids) trib_sample=gather_point(trib,triids) 
        tric_sample=gather_point(tric,triids)
        us=tf.random_uniform((1,8192))
        vs=tf.random_uniform((1,8192))
        uplusv=1-tf.abs(us+vs-1)
        uminusv=us-vs
        us=(uplusv+uminusv)*0.5
        vs=(uplusv-uminusv)*0.5
        pt_sample=tria_sample+(trib_sample-tria_sample)*tf.expand_dims(us,-1)+(tric_sample-tria_sample)*tf.expand_dims(vs,-1)
        print 'pt_sample: ', pt_sample
        reduced_sample=gather_point(pt_sample,farthest_point_sample(1024,pt_sample))
        print reduced_sample
    with tf.Session('') as sess:
        ret=sess.run(reduced_sample)
        print ret.shape,ret.dtype
        import cPickle as pickle
        pickle.dump(ret,open('1.pkl','wb'),-1)

================================================
FILE: code/tf_ops/sampling/tf_sampling_compile.sh
================================================
#!/bin/bash
/usr/local/cuda-8.0/bin/nvcc tf_sampling_g.cu -o tf_sampling_g.cu.o -c -O2 -DGOOGLE_CUDA=1 -x cu -Xcompiler -fPIC
g++ -std=c++11 tf_sampling.cpp tf_sampling_g.cu.o -o tf_sampling_so.so -shared -fPIC -I /home/ganeshiyer/tensorflow/lib/python2.7/site-packages/tensorflow/include -I /usr/local/cuda-8.0/include -lcudart -L /usr/local/cuda-8.0/lib64/ -O2 -D_GLIBCXX_USE_CXX11_ABI=0

================================================
FILE: code/tf_ops/sampling/tf_sampling_g.cu
================================================
/* Furthest point sampling GPU implementation
 * Original author: Haoqiang Fan
 * Modified by Charles R. Qi
 * All Rights Reserved. 2017.
*/ __global__ void cumsumKernel(int b,int n,const float * __restrict__ inp,float * __restrict__ out){ const int BlockSize=2048; const int paddingLevel=5; __shared__ float buffer4[BlockSize*4]; __shared__ float buffer[BlockSize+(BlockSize>>paddingLevel)]; for (int i=blockIdx.x;i>2; for (int k=threadIdx.x*4;k>2)+(k>>(2+paddingLevel))]=v4; }else{ float v=0; for (int k2=k;k2>2)+(k>>(2+paddingLevel))]=v; } } int u=0; for (;(2<>(u+1));k+=blockDim.x){ int i1=(((k<<1)+2)<>paddingLevel; i2+=i2>>paddingLevel; buffer[i1]+=buffer[i2]; } } u--; for (;u>=0;u--){ __syncthreads(); for (int k=threadIdx.x;k>(u+1));k+=blockDim.x){ int i1=(((k<<1)+3)<>paddingLevel; i2+=i2>>paddingLevel; buffer[i1]+=buffer[i2]; } } __syncthreads(); for (int k=threadIdx.x*4;k>2)-1)+(((k>>2)-1)>>paddingLevel); buffer4[k]+=buffer[k2]; buffer4[k+1]+=buffer[k2]; buffer4[k+2]+=buffer[k2]; buffer4[k+3]+=buffer[k2]; } } __syncthreads(); for (int k=threadIdx.x;k>paddingLevel)]+runningsum2; float r2=runningsum+t; runningsum2=t-(r2-runningsum); runningsum=r2; __syncthreads(); } } } __global__ void binarysearchKernel(int b,int n,int m,const float * __restrict__ dataset,const float * __restrict__ query, int * __restrict__ result){ int base=1; while (base=1;k>>=1) if (r>=k && dataset[i*n+r-k]>=q) r-=k; result[i*m+j]=r; } } } __global__ void farthestpointsamplingKernel(int b,int n,int m,const float * __restrict__ dataset,float * __restrict__ temp,int * __restrict__ idxs){ if (m<=0) return; const int BlockSize=512; __shared__ float dists[BlockSize]; __shared__ int dists_i[BlockSize]; const int BufferSize=3072; __shared__ float buf[BufferSize*3]; for (int i=blockIdx.x;ibest){ best=d2; besti=k; } } dists[threadIdx.x]=best; dists_i[threadIdx.x]=besti; for (int u=0;(1<>(u+1))){ int i1=(threadIdx.x*2)<>>(b,n,inp,out); } //require b*n working space void probsampleLauncher(int b,int n,int m,const float * inp_p,const float * inp_r,float * temp,int * out){ cumsumKernel<<<32,512>>>(b,n,inp_p,temp); 
    binarysearchKernel<<<dim3(32,8,1),512>>>(b,n,m,temp,inp_r,out);
}
//require 32*n working space
void farthestpointsamplingLauncher(int b,int n,int m,const float * inp,float * temp,int * out){
    farthestpointsamplingKernel<<<32,512>>>(b,n,m,inp,temp,out);
}
void gatherpointLauncher(int b,int n,int m,const float * inp,const int * idx,float * out){
    gatherpointKernel<<<dim3(2,8,1),512>>>(b,n,m,inp,idx,out);
}
void scatteraddpointLauncher(int b,int n,int m,const float * out_g,const int * idx,float * inp_g){
    scatteraddpointKernel<<<dim3(256,8,1),256>>>(b,n,m,out_g,idx,inp_g);
}

================================================
FILE: code/train_model_combined.py
================================================
import numpy as np
import tensorflow as tf
import scipy.misc as smc

import config_res as config
from common.cnn_utils_res import *
from common import resnet_rgb_model as model
from common import resnet_depth_model as model_depth
from common import all_transformer as at3
from common import global_agg_net
from common.Lie_functions import exponential_map_single

import nw_loader_color as ldr
import model_utils

_BETA_CONST = 1.0
_ALPHA_CONST = 1.0

IMG_HT = config.depth_img_params['IMG_HT']
IMG_WDT = config.depth_img_params['IMG_WDT']

batch_size = config.net_params['batch_size']
learning_rate = config.net_params['learning_rate']
n_epochs = config.net_params['epochs']
current_epoch = config.net_params['load_epoch']

tf.reset_default_graph()

X1 = tf.placeholder(tf.float32, shape = (batch_size, IMG_HT, IMG_WDT, 3), name = "X1")
X2 = tf.placeholder(tf.float32, shape = (batch_size, IMG_HT, IMG_WDT, 1), name = "X2")
depth_maps_target = tf.placeholder(tf.float32, shape = (batch_size, IMG_HT, IMG_WDT, 1), name = "depth_maps_target")
expected_transforms = tf.placeholder(tf.float32, shape = (batch_size, 4, 4), name = "expected_transforms")
phase = tf.placeholder(tf.bool, [], name = "phase")
phase_rgb = tf.placeholder(tf.bool, [], name = "phase_rgb")
keep_prob = tf.placeholder(tf.float32, name = "keep_prob")

fx = config.camera_params['fx']
fy = config.camera_params['fy'] cx = config.camera_params['cx'] cy = config.camera_params['cy'] fx_scaled = 2*(fx)/np.float32(IMG_WDT) # focal length x scaled for -1 to 1 range fy_scaled = 2*(fy)/np.float32(IMG_HT) # focal length y scaled for -1 to 1 range cx_scaled = -1 + 2*(cx - 1.0)/np.float32(IMG_WDT) # optical center x scaled for -1 to 1 range cy_scaled = -1 + 2*(cy - 1.0)/np.float32(IMG_HT) # optical center y scaled for -1 to 1 range K_mat_scaled = np.array([[fx_scaled, 0.0, cx_scaled], [0.0, fy_scaled, cy_scaled], [0.0, 0.0, 1.0]], dtype = np.float32) K_final = tf.constant(K_mat_scaled, dtype = tf.float32) small_transform = tf.constant(config.camera_params['cam_transform_02_inv'], dtype = tf.float32) X2_pooled = tf.nn.max_pool(X2, ksize=[1,5,5,1], strides=[1,1,1,1], padding="SAME") depth_maps_target_pooled = tf.nn.max_pool(depth_maps_target, ksize=[1,5,5,1], strides=[1,1,1,1], padding="SAME") output_vectors, weight_summaries = global_agg_net.End_Net_Out(X1, phase_rgb, X2_pooled, phase, keep_prob) # se(3) -> SE(3) for the whole batch predicted_transforms = tf.map_fn(lambda x:exponential_map_single(output_vectors[x]), elems=tf.range(0, batch_size, 1), dtype=tf.float32) # transforms depth maps by the predicted transformation depth_maps_predicted, cloud_pred = tf.map_fn(lambda x:at3._simple_transformer(X2_pooled[x,:,:,0]*40.0 + 40.0, predicted_transforms[x], K_final, small_transform), elems = tf.range(0, batch_size, 1), dtype = (tf.float32, tf.float32)) # transforms depth maps by the expected transformation depth_maps_expected, cloud_exp = tf.map_fn(lambda x:at3._simple_transformer(X2_pooled[x,:,:,0]*40.0 + 40.0, expected_transforms[x], K_final, small_transform), elems = tf.range(0, batch_size, 1), dtype = (tf.float32, tf.float32)) # photometric loss between predicted and expected transformation photometric_loss = tf.nn.l2_loss(tf.subtract((depth_maps_expected[:,10:-10,10:-10] - 40.0)/40.0, (depth_maps_predicted[:,10:-10,10:-10] - 40.0)/40.0)) # earth mover's 
distance between point clouds
cloud_loss = model_utils.get_emd_loss(cloud_pred, cloud_exp)

# final loss term
predicted_loss_train = _ALPHA_CONST*photometric_loss + _BETA_CONST*cloud_loss
tf.add_to_collection('losses1', predicted_loss_train)
loss1 = tf.add_n(tf.get_collection('losses1'))

train_step = tf.train.AdamOptimizer(learning_rate = config.net_params['learning_rate'],
                                    beta1 = config.net_params['beta1']).minimize(predicted_loss_train)

predicted_loss_validation = tf.nn.l2_loss(tf.subtract((depth_maps_expected[:,10:-10,10:-10] - 40.0)/40.0,
                                                      (depth_maps_predicted[:,10:-10,10:-10] - 40.0)/40.0))
cloud_loss_validation = model_utils.get_emd_loss(cloud_pred, cloud_exp)

training_summary_1 = tf.summary.scalar('cloud_loss', _BETA_CONST*cloud_loss)
training_summary_2 = tf.summary.scalar('photometric_loss', photometric_loss)
validation_summary_1 = tf.summary.scalar('Validation_loss', predicted_loss_validation)
validation_summary_2 = tf.summary.scalar('Validation_cloud_loss', cloud_loss_validation)

merge_train = tf.summary.merge([training_summary_1] + [training_summary_2] + weight_summaries)
merge_val = tf.summary.merge([validation_summary_1] + [validation_summary_2])

saver = tf.train.Saver()

# tensorflow gpu configuration.
# Not to be confused with the network configuration file
config_tf = tf.ConfigProto()
config_tf.gpu_options.allow_growth = True

with tf.Session(config = config_tf) as sess:

    sess.run(tf.global_variables_initializer())
    writer = tf.summary.FileWriter("./logs_simple_transformer/")

    total_iterations_train = 0
    total_iterations_validate = 0

    if(current_epoch == 0):
        writer.add_graph(sess.graph)

    checkpoint_path = config.paths['checkpoint_path']

    if(current_epoch > 0):
        print("Restoring Checkpoint")
        saver.restore(sess, checkpoint_path + "/model-%d"%current_epoch)
        current_epoch += 1
        total_iterations_train = current_epoch*config.net_params['total_frames_train']/batch_size
        total_iterations_validate = current_epoch*config.net_params['total_frames_validation']/batch_size

    for epoch in range(current_epoch, n_epochs):

        total_partitions_train = config.net_params['total_frames_train']/config.net_params['partition_limit']
        total_partitions_validation = config.net_params['total_frames_validation']/config.net_params['partition_limit']

        ldr.shuffle()

        for part in range(total_partitions_train):

            source_container, target_container, source_img_container, target_img_container, transforms_container = ldr.load(part, mode = "train")

            for source_b, target_b, source_img_b, target_img_b, transforms_b in zip(source_container, target_container, source_img_container, target_img_container, transforms_container):

                outputs = sess.run([depth_maps_predicted, depth_maps_expected, predicted_loss_train, X2_pooled,
                                    train_step, merge_train, predicted_transforms, cloud_loss, photometric_loss, loss1],
                                   feed_dict={X1: source_img_b, X2: source_b, depth_maps_target: target_b,
                                              expected_transforms: transforms_b, phase: True, keep_prob: 0.5, phase_rgb: False})

                dmaps_pred = outputs[0]
                dmaps_exp = outputs[1]
                loss = outputs[2]
                source = outputs[3]

                if(total_iterations_train%10 == 0):
                    writer.add_summary(outputs[5], total_iterations_train/10)
                    print(outputs[8], _ALPHA_CONST*outputs[8], outputs[7], _BETA_CONST*outputs[7], outputs[9], total_iterations_train)
                    random_disp = np.random.randint(batch_size)
                    print(outputs[6][random_disp])
                    print(transforms_b[random_disp])

                if(total_iterations_train%125 == 0):
                    smc.imsave(config.paths['training_imgs_path'] + "/training_save_%d.png"%total_iterations_train,
                               np.vstack((source[random_disp,:,:,0]*40.0 + 40.0, dmaps_pred[random_disp], dmaps_exp[random_disp])))

                total_iterations_train += 1

        if (epoch%1 == 0):
            print("Saving after epoch %d"%epoch)
            saver.save(sess, checkpoint_path + "/model-%d"%epoch)

        for part in range(total_partitions_validation):

            source_container, target_container, source_img_container, target_img_container, transforms_container = ldr.load(part, mode = "validation")

            for source_b, target_b, source_img_b, target_img_b, transforms_b in zip(source_container, target_container, source_img_container, target_img_container, transforms_container):

                outputs = sess.run([depth_maps_predicted, depth_maps_expected, predicted_loss_validation, X2_pooled, merge_val, cloud_loss_validation],
                                   feed_dict={X1: source_img_b, X2: source_b, depth_maps_target: target_b,
                                              expected_transforms: transforms_b, phase: False, keep_prob: 1.0, phase_rgb: False})

                dmaps_pred = outputs[0]
                dmaps_exp = outputs[1]
                loss = outputs[2]
                source = outputs[3]

                writer.add_summary(outputs[4], total_iterations_validate)
                total_iterations_validate += 1
                print(loss, total_iterations_validate, outputs[5])

                if(total_iterations_validate%25 == 0):
                    random_disp = np.random.randint(batch_size)
                    smc.imsave(config.paths['validation_imgs_path'] + "/validation_save_%d.png"%total_iterations_validate,
                               np.vstack((source[random_disp,:,:,0]*40.0 + 40.0, dmaps_pred[random_disp], dmaps_exp[random_disp])))

================================================
FILE: code/utils/__init__.py
================================================


================================================
FILE: code/utils/data_prep_util.py
================================================
import os
import sys
BASE_DIR =
os.path.dirname(os.path.abspath(__file__)) sys.path.append(BASE_DIR) from plyfile import (PlyData, PlyElement, make2d, PlyParseError, PlyProperty) import numpy as np import h5py SAMPLING_BIN = os.path.join(BASE_DIR, 'third_party/mesh_sampling/build/pcsample') SAMPLING_POINT_NUM = 2048 SAMPLING_LEAF_SIZE = 0.005 MODELNET40_PATH = '../datasets/modelnet40' def export_ply(pc, filename): vertex = np.zeros(pc.shape[0], dtype=[('x', 'f4'), ('y', 'f4'), ('z', 'f4')]) for i in range(pc.shape[0]): vertex[i] = (pc[i][0], pc[i][1], pc[i][2]) ply_out = PlyData([PlyElement.describe(vertex, 'vertex', comments=['vertices'])]) ply_out.write(filename) # Sample points on the obj shape def get_sampling_command(obj_filename, ply_filename): cmd = SAMPLING_BIN + ' ' + obj_filename cmd += ' ' + ply_filename cmd += ' -n_samples %d ' % SAMPLING_POINT_NUM cmd += ' -leaf_size %f ' % SAMPLING_LEAF_SIZE return cmd # -------------------------------------------------------------- # Following are the helper functions to load MODELNET40 shapes # -------------------------------------------------------------- # Read in the list of categories in MODELNET40 def get_category_names(): shape_names_file = os.path.join(MODELNET40_PATH, 'shape_names.txt') shape_names = [line.rstrip() for line in open(shape_names_file)] return shape_names # Return all the filepaths for the shapes in MODELNET40 def get_obj_filenames(): obj_filelist_file = os.path.join(MODELNET40_PATH, 'filelist.txt') obj_filenames = [os.path.join(MODELNET40_PATH, line.rstrip()) for line in open(obj_filelist_file)] print('Got %d obj files in modelnet40.' 
% len(obj_filenames))
    return obj_filenames

# Helper function to create the parent folder and all subdir folders if they do not exist
def batch_mkdir(output_folder, subdir_list):
    if not os.path.exists(output_folder):
        os.mkdir(output_folder)
    for subdir in subdir_list:
        if not os.path.exists(os.path.join(output_folder, subdir)):
            os.mkdir(os.path.join(output_folder, subdir))

# ----------------------------------------------------------------
# Following are the helper functions to save/load HDF5 files
# ----------------------------------------------------------------
# Write numpy array data, label and normals to h5_filename
def save_h5_data_label_normal(h5_filename, data, label, normal,
                              data_dtype='float32', label_dtype='uint8', normal_dtype='float32'):
    h5_fout = h5py.File(h5_filename)
    h5_fout.create_dataset(
        'data', data=data,
        compression='gzip', compression_opts=4,
        dtype=data_dtype)
    h5_fout.create_dataset(
        'normal', data=normal,
        compression='gzip', compression_opts=4,
        dtype=normal_dtype)
    h5_fout.create_dataset(
        'label', data=label,
        compression='gzip', compression_opts=1,
        dtype=label_dtype)
    h5_fout.close()

# Write numpy array data and label to h5_filename
def save_h5(h5_filename, data, label, data_dtype='uint8', label_dtype='uint8'):
    h5_fout = h5py.File(h5_filename)
    h5_fout.create_dataset(
        'data', data=data,
        compression='gzip', compression_opts=4,
        dtype=data_dtype)
    h5_fout.create_dataset(
        'label', data=label,
        compression='gzip', compression_opts=1,
        dtype=label_dtype)
    h5_fout.close()

# Read numpy array data, label and normals from h5_filename
def load_h5_data_label_normal(h5_filename):
    f = h5py.File(h5_filename)
    data = f['data'][:]
    label = f['label'][:]
    normal = f['normal'][:]
    return (data, label, normal)

# Read numpy array data, label and part ids from h5_filename
def load_h5_data_label_seg(h5_filename):
    f = h5py.File(h5_filename)
    data = f['data'][:]
    label = f['label'][:]
    seg = f['pid'][:]
    return (data, label, seg)

# Read numpy array data and label from h5_filename
def load_h5(h5_filename):
    f = h5py.File(h5_filename)
    data = f['data'][:]
    label = f['label'][:]
    return (data, label)

# ----------------------------------------------------------------
# Following are the helper functions to save/load PLY files
# ----------------------------------------------------------------
# Load PLY file
def load_ply_data(filename, point_num):
    plydata = PlyData.read(filename)
    pc = plydata['vertex'].data[:point_num]
    pc_array = np.array([[x, y, z] for x, y, z in pc])
    return pc_array

# Load PLY file
def load_ply_normal(filename, point_num):
    plydata = PlyData.read(filename)
    pc = plydata['normal'].data[:point_num]
    pc_array = np.array([[x, y, z] for x, y, z in pc])
    return pc_array

# Make up rows for Nxk array
# Input pad is 'edge' or 'constant'
def pad_arr_rows(arr, row, pad='edge'):
    assert(len(arr.shape) == 2)
    assert(arr.shape[0] <= row)
    assert(pad == 'edge' or pad == 'constant')
    if arr.shape[0] == row:
        return arr
    if pad == 'edge':
        return np.lib.pad(arr, ((0, row-arr.shape[0]), (0, 0)), 'edge')
    if pad == 'constant':
        return np.lib.pad(arr, ((0, row-arr.shape[0]), (0, 0)), 'constant', constant_values=(0, 0))

================================================
FILE: code/utils/eulerangles.py
================================================
# emacs: -*- mode: python-mode; py-indent-offset: 4; indent-tabs-mode: nil -*-
# vi: set ft=python sts=4 ts=4 sw=4 et:
### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ##
#
#   See COPYING file distributed along with the NiBabel package for the
#   copyright and license terms.
#
### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ##
''' Module implementing Euler angle rotations and their conversions

See:

* http://en.wikipedia.org/wiki/Rotation_matrix
* http://en.wikipedia.org/wiki/Euler_angles
* http://mathworld.wolfram.com/EulerAngles.html

See also: *Representing Attitude with Euler Angles and Quaternions: A
Reference* (2006) by James Diebel.
A cached PDF link last found here: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.110.5134 Euler's rotation theorem tells us that any rotation in 3D can be described by 3 angles. Let's call the 3 angles the *Euler angle vector* and call the angles in the vector :math:`alpha`, :math:`beta` and :math:`gamma`. The vector is [ :math:`alpha`, :math:`beta`. :math:`gamma` ] and, in this description, the order of the parameters specifies the order in which the rotations occur (so the rotation corresponding to :math:`alpha` is applied first). In order to specify the meaning of an *Euler angle vector* we need to specify the axes around which each of the rotations corresponding to :math:`alpha`, :math:`beta` and :math:`gamma` will occur. There are therefore three axes for the rotations :math:`alpha`, :math:`beta` and :math:`gamma`; let's call them :math:`i` :math:`j`, :math:`k`. Let us express the rotation :math:`alpha` around axis `i` as a 3 by 3 rotation matrix `A`. Similarly :math:`beta` around `j` becomes 3 x 3 matrix `B` and :math:`gamma` around `k` becomes matrix `G`. Then the whole rotation expressed by the Euler angle vector [ :math:`alpha`, :math:`beta`. :math:`gamma` ], `R` is given by:: R = np.dot(G, np.dot(B, A)) See http://mathworld.wolfram.com/EulerAngles.html The order :math:`G B A` expresses the fact that the rotations are performed in the order of the vector (:math:`alpha` around axis `i` = `A` first). To convert a given Euler angle vector to a meaningful rotation, and a rotation matrix, we need to define: * the axes `i`, `j`, `k` * whether a rotation matrix should be applied on the left of a vector to be transformed (vectors are column vectors) or on the right (vectors are row vectors). 
* whether the rotations move the axes as they are applied (intrinsic rotations) - compared the situation where the axes stay fixed and the vectors move within the axis frame (extrinsic) * the handedness of the coordinate system See: http://en.wikipedia.org/wiki/Rotation_matrix#Ambiguities We are using the following conventions: * axes `i`, `j`, `k` are the `z`, `y`, and `x` axes respectively. Thus an Euler angle vector [ :math:`alpha`, :math:`beta`. :math:`gamma` ] in our convention implies a :math:`alpha` radian rotation around the `z` axis, followed by a :math:`beta` rotation around the `y` axis, followed by a :math:`gamma` rotation around the `x` axis. * the rotation matrix applies on the left, to column vectors on the right, so if `R` is the rotation matrix, and `v` is a 3 x N matrix with N column vectors, the transformed vector set `vdash` is given by ``vdash = np.dot(R, v)``. * extrinsic rotations - the axes are fixed, and do not move with the rotations. * a right-handed coordinate system The convention of rotation around ``z``, followed by rotation around ``y``, followed by rotation around ``x``, is known (confusingly) as "xyz", pitch-roll-yaw, Cardan angles, or Tait-Bryan angles. 
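The z-y-x extrinsic convention spelled out above can be checked numerically. A minimal standalone sketch with plain NumPy (the helper names `rot_z`, `rot_y`, `rot_x` are ours, not part of this module):

```python
import numpy as np

def rot_z(a):
    # counter-clockwise rotation about the z axis
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def rot_y(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def rot_x(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])

# [alpha, beta, gamma]: alpha about z applied first, then beta about y,
# then gamma about x, so R = G . B . A as in the docstring
alpha, beta, gamma = 0.3, -0.7, 1.1
R = rot_x(gamma) @ rot_y(beta) @ rot_z(alpha)

# applying R in one step equals applying the three rotations in sequence
v = np.array([1.0, 2.0, 3.0])
step_by_step = rot_x(gamma) @ (rot_y(beta) @ (rot_z(alpha) @ v))
```

Because the matrices act on column vectors from the left, the first rotation applied sits rightmost in the product.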
''' import math import sys if sys.version_info >= (3,0): from functools import reduce import numpy as np _FLOAT_EPS_4 = np.finfo(float).eps * 4.0 def euler2mat(z=0, y=0, x=0): ''' Return matrix for rotations around z, y and x axes Uses the z, then y, then x convention above Parameters ---------- z : scalar Rotation angle in radians around z-axis (performed first) y : scalar Rotation angle in radians around y-axis x : scalar Rotation angle in radians around x-axis (performed last) Returns ------- M : array shape (3,3) Rotation matrix giving same rotation as for given angles Examples -------- >>> zrot = 1.3 # radians >>> yrot = -0.1 >>> xrot = 0.2 >>> M = euler2mat(zrot, yrot, xrot) >>> M.shape == (3, 3) True The output rotation matrix is equal to the composition of the individual rotations >>> M1 = euler2mat(zrot) >>> M2 = euler2mat(0, yrot) >>> M3 = euler2mat(0, 0, xrot) >>> composed_M = np.dot(M3, np.dot(M2, M1)) >>> np.allclose(M, composed_M) True You can specify rotations by named arguments >>> np.all(M3 == euler2mat(x=xrot)) True When applying M to a vector, the vector should column vector to the right of M. If the right hand side is a 2D array rather than a vector, then each column of the 2D array represents a vector. >>> vec = np.array([1, 0, 0]).reshape((3,1)) >>> v2 = np.dot(M, vec) >>> vecs = np.array([[1, 0, 0],[0, 1, 0]]).T # giving 3x2 array >>> vecs2 = np.dot(M, vecs) Rotations are counter-clockwise. 
>>> zred = np.dot(euler2mat(z=np.pi/2), np.eye(3)) >>> np.allclose(zred, [[0, -1, 0],[1, 0, 0], [0, 0, 1]]) True >>> yred = np.dot(euler2mat(y=np.pi/2), np.eye(3)) >>> np.allclose(yred, [[0, 0, 1],[0, 1, 0], [-1, 0, 0]]) True >>> xred = np.dot(euler2mat(x=np.pi/2), np.eye(3)) >>> np.allclose(xred, [[1, 0, 0],[0, 0, -1], [0, 1, 0]]) True Notes ----- The direction of rotation is given by the right-hand rule (orient the thumb of the right hand along the axis around which the rotation occurs, with the end of the thumb at the positive end of the axis; curl your fingers; the direction your fingers curl is the direction of rotation). Therefore, the rotations are counterclockwise if looking along the axis of rotation from positive to negative. ''' Ms = [] if z: cosz = math.cos(z) sinz = math.sin(z) Ms.append(np.array( [[cosz, -sinz, 0], [sinz, cosz, 0], [0, 0, 1]])) if y: cosy = math.cos(y) siny = math.sin(y) Ms.append(np.array( [[cosy, 0, siny], [0, 1, 0], [-siny, 0, cosy]])) if x: cosx = math.cos(x) sinx = math.sin(x) Ms.append(np.array( [[1, 0, 0], [0, cosx, -sinx], [0, sinx, cosx]])) if Ms: return reduce(np.dot, Ms[::-1]) return np.eye(3) def mat2euler(M, cy_thresh=None): ''' Discover Euler angle vector from 3x3 matrix Uses the conventions above. Parameters ---------- M : array-like, shape (3,3) cy_thresh : None or scalar, optional threshold below which to give up on straightforward arctan for estimating x rotation. If None (default), estimate from precision of input. 
Returns ------- z : scalar y : scalar x : scalar Rotations in radians around z, y, x axes, respectively Notes ----- If there was no numerical error, the routine could be derived using Sympy expression for z then y then x rotation matrix, which is:: [ cos(y)*cos(z), -cos(y)*sin(z), sin(y)], [cos(x)*sin(z) + cos(z)*sin(x)*sin(y), cos(x)*cos(z) - sin(x)*sin(y)*sin(z), -cos(y)*sin(x)], [sin(x)*sin(z) - cos(x)*cos(z)*sin(y), cos(z)*sin(x) + cos(x)*sin(y)*sin(z), cos(x)*cos(y)] with the obvious derivations for z, y, and x z = atan2(-r12, r11) y = asin(r13) x = atan2(-r23, r33) Problems arise when cos(y) is close to zero, because both of:: z = atan2(cos(y)*sin(z), cos(y)*cos(z)) x = atan2(cos(y)*sin(x), cos(x)*cos(y)) will be close to atan2(0, 0), and highly unstable. The ``cy`` fix for numerical instability below is from: *Graphics Gems IV*, Paul Heckbert (editor), Academic Press, 1994, ISBN: 0123361559. Specifically it comes from EulerAngles.c by Ken Shoemake, and deals with the case where cos(y) is close to zero: See: http://www.graphicsgems.org/ The code appears to be licensed (from the website) as "can be used without restrictions". 
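The closed-form extraction described in these notes can be exercised without nibabel. A short sketch under the same z-y-x convention (the names `zyx_to_mat` and `mat_to_zyx` are illustrative, not this module's API):

```python
import numpy as np

def zyx_to_mat(z, y, x):
    # compose R = Rx . Ry . Rz -- the z rotation is applied first,
    # matching the convention used by euler2mat
    cz, sz = np.cos(z), np.sin(z)
    cy, sy = np.cos(y), np.sin(y)
    cx, sx = np.cos(x), np.sin(x)
    Rz = np.array([[cz, -sz, 0.0], [sz, cz, 0.0], [0.0, 0.0, 1.0]])
    Ry = np.array([[cy, 0.0, sy], [0.0, 1.0, 0.0], [-sy, 0.0, cy]])
    Rx = np.array([[1.0, 0.0, 0.0], [0.0, cx, -sx], [0.0, sx, cx]])
    return Rx @ Ry @ Rz

def mat_to_zyx(M):
    # closed-form extraction from the derivation above;
    # stable while cos(y) is not close to zero
    r11, r12, r13 = M[0]
    r23, r33 = M[1, 2], M[2, 2]
    cy = np.hypot(r33, r23)   # equals |cos(y)|
    z = np.arctan2(-r12, r11)
    y = np.arctan2(r13, cy)
    x = np.arctan2(-r23, r33)
    return z, y, x

# round trip: angles with |y| < pi/2 are recovered
z, y, x = mat_to_zyx(zyx_to_mat(0.4, -0.2, 1.1))
```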
    '''
    M = np.asarray(M)
    if cy_thresh is None:
        try:
            cy_thresh = np.finfo(M.dtype).eps * 4
        except ValueError:
            cy_thresh = _FLOAT_EPS_4
    r11, r12, r13, r21, r22, r23, r31, r32, r33 = M.flat
    # cy: sqrt((cos(y)*cos(z))**2 + (cos(x)*cos(y))**2)
    cy = math.sqrt(r33*r33 + r23*r23)
    if cy > cy_thresh:  # cos(y) not close to zero, standard form
        z = math.atan2(-r12, r11)  # atan2(cos(y)*sin(z), cos(y)*cos(z))
        y = math.atan2(r13, cy)    # atan2(sin(y), cy)
        x = math.atan2(-r23, r33)  # atan2(cos(y)*sin(x), cos(x)*cos(y))
    else:  # cos(y) (close to) zero, so x -> 0.0 (see above)
        # so r21 -> sin(z), r22 -> cos(z) and
        z = math.atan2(r21, r22)
        y = math.atan2(r13, cy)    # atan2(sin(y), cy)
        x = 0.0
    return z, y, x

def euler2quat(z=0, y=0, x=0):
    ''' Return quaternion corresponding to these Euler angles

    Uses the z, then y, then x convention above

    Parameters
    ----------
    z : scalar
       Rotation angle in radians around z-axis (performed first)
    y : scalar
       Rotation angle in radians around y-axis
    x : scalar
       Rotation angle in radians around x-axis (performed last)

    Returns
    -------
    quat : array shape (4,)
       Quaternion in w, x, y, z (real, then vector) format

    Notes
    -----
    We can derive this formula in Sympy using:

    1. Formula giving quaternion corresponding to rotation of theta radians
       about arbitrary axis:
       http://mathworld.wolfram.com/EulerParameters.html
    2. Generated formulae from 1.) for quaternions corresponding to
       theta radians rotations about ``x, y, z`` axes
    3. Apply quaternion multiplication formula -
       http://en.wikipedia.org/wiki/Quaternions#Hamilton_product -
       to formulae from 2.) to give formula for combined rotations.
''' z = z/2.0 y = y/2.0 x = x/2.0 cz = math.cos(z) sz = math.sin(z) cy = math.cos(y) sy = math.sin(y) cx = math.cos(x) sx = math.sin(x) return np.array([ cx*cy*cz - sx*sy*sz, cx*sy*sz + cy*cz*sx, cx*cz*sy - sx*cy*sz, cx*cy*sz + sx*cz*sy]) def quat2euler(q): ''' Return Euler angles corresponding to quaternion `q` Parameters ---------- q : 4 element sequence w, x, y, z of quaternion Returns ------- z : scalar Rotation angle in radians around z-axis (performed first) y : scalar Rotation angle in radians around y-axis x : scalar Rotation angle in radians around x-axis (performed last) Notes ----- It's possible to reduce the amount of calculation a little, by combining parts of the ``quat2mat`` and ``mat2euler`` functions, but the reduction in computation is small, and the code repetition is large. ''' # delayed import to avoid cyclic dependencies import nibabel.quaternions as nq return mat2euler(nq.quat2mat(q)) def euler2angle_axis(z=0, y=0, x=0): ''' Return angle, axis corresponding to these Euler angles Uses the z, then y, then x convention above Parameters ---------- z : scalar Rotation angle in radians around z-axis (performed first) y : scalar Rotation angle in radians around y-axis x : scalar Rotation angle in radians around x-axis (performed last) Returns ------- theta : scalar angle of rotation vector : array shape (3,) axis around which rotation occurs Examples -------- >>> theta, vec = euler2angle_axis(0, 1.5, 0) >>> print(theta) 1.5 >>> np.allclose(vec, [0, 1, 0]) True ''' # delayed import to avoid cyclic dependencies import nibabel.quaternions as nq return nq.quat2angle_axis(euler2quat(z, y, x)) def angle_axis2euler(theta, vector, is_normalized=False): ''' Convert angle, axis pair to Euler angles Parameters ---------- theta : scalar angle of rotation vector : 3 element sequence vector specifying axis for rotation. is_normalized : bool, optional True if vector is already normalized (has norm of 1). 
       Default False

    Returns
    -------
    z : scalar
    y : scalar
    x : scalar
       Rotations in radians around z, y, x axes, respectively

    Examples
    --------
    >>> z, y, x = angle_axis2euler(0, [1, 0, 0])
    >>> np.allclose((z, y, x), 0)
    True

    Notes
    -----
    It's possible to reduce the amount of calculation a little, by
    combining parts of the ``angle_axis2mat`` and ``mat2euler``
    functions, but the reduction in computation is small, and the code
    repetition is large.
    '''
    # delayed import to avoid cyclic dependencies
    import nibabel.quaternions as nq
    M = nq.angle_axis2mat(theta, vector, is_normalized)
    return mat2euler(M)

================================================
FILE: code/utils/modelnet_data_prep.py
================================================
import os
import sys
import glob
import h5py
import numpy as np
from multiprocessing.dummy import Pool as ThreadPool
import threading
from tqdm import tqdm

SAMPLING_BIN = "/home/lqyu/workspace/pcl-pcl-1.8.1/build/bin/pcl_mesh_sampling"
SAMPLING_POINT_NUM = 8192
SAMPLING_LEAF_SIZE = 0.02
SAVE_ROOT_PATH = '../../data/Patches'

def nonuniform_sampling(num=4096, sample_num=1024):
    # draw indices from a gaussian centred at a random location until
    # sample_num distinct indices in [0, num) have been collected
    sample = set()
    loc = np.random.rand()*0.6 + 0.2
    while len(sample) < sample_num:
        a = int(np.random.normal(loc=loc, scale=0.3)*num)
        if a < 0 or a >= num:
            continue
        sample.add(a)
    return list(sample)

def save_file(file_path, data):
    if not os.path.exists(os.path.split(file_path)[0]):
        os.makedirs(os.path.split(file_path)[0])
    np.savetxt(file_path, data, fmt='%.6f')

def tmp(file_path=None):
    file_list = glob.glob('/home/lqyu/workspace/PointSR/data/ModelNet10_test/ModelNet10_MC_2k/*.xyz')
    file_list.sort()
    for file_path in file_list:
        print file_path
        data_4096 = np.loadtxt(file_path)
        data = data_4096[:, 0:3]
        centroid = np.mean(data, axis=0, keepdims=True)
        data = data - centroid
        furthest_distance = np.amax(np.sqrt(np.sum(abs(data) ** 2, axis=-1)))
        data = data / furthest_distance
        data_4096[:, 0:3] = data
        save_path = file_path.replace('ModelNet10_MC_2k', 'ModelNet10_MC_2k_normalize')
        save_file(save_path, data_4096)
        off_path = file_path.replace('ModelNet10_MC_2k', 'mesh')
off_path = off_path.replace('.xyz', '.off') save_off_path = off_path.replace('mesh', 'mesh_normalize') if not os.path.exists(os.path.split(save_off_path)[0]): os.makedirs(os.path.split(save_off_path)[0]) offFile = open(off_path, 'r') lines = offFile.readlines() offFile.close() with open(save_off_path, 'w') as f: f.writelines(lines[0:2]) params = lines[1].split(' ') nVert = int(params[0]) for i in range(2, nVert + 2): coord = lines[i].split(' ') coords = [] for item in coord: if item!='': coords.append(item) x = (float(coords[0]) - centroid[0, 0]) / furthest_distance y = (float(coords[1]) - centroid[0, 1]) / furthest_distance z = (float(coords[2]) - centroid[0, 2]) / furthest_distance f.write('%.6f %.6f %.6f\n' % (x, y, z)) f.writelines(lines[nVert + 2:]) # continue # path0 = file_path.replace('poisson_20k','20000_normalize') # save_file(path0, data_4096) # idx = np.argsort(data_4096[:,np.random.randint(0,2)]) # data_4096 = data_4096[idx] # path1 = file_path.replace('poisson_20k','poisson_5k2') # idx1 = nonuniform_sampling(num=len(data_4096), sample_num=len(data_4096)/4) # data1 = data_4096[idx1, ...] 
# save_file(path1,data1) def nonuniformsample_from_pointcloud_fn(): file_list1 = glob.glob('/home/lqyu/server/proj49/PointSR_data/test_data/our_collected_data/Poisson_20k/*.xyz') file_list1.sort() # #handle with file_list2 and select the complete whole # tmp_list1 = [] # tmp_list2 = [] # for item in file_list2: # tmp_item = item.replace('big_girl','biggirl') # if len(tmp_item.split('/')[-1].split('_'))==3:#biggirl_01_1 # tmp_list1.append(item) # else:#biggirl_1 # tmp_list2.append(item) file_list = file_list1 for file_path in tqdm(file_list): print file_path phase = file_path.split('/')[-2] name = file_path.split('/')[-1].replace("xyz", "xyz") data_4096 = np.loadtxt(file_path) # data = data_4096[:, 0:3] # centroid = np.mean(data, axis=0, keepdims=True) # data = data - centroid # furthest_distance = np.amax(np.sqrt(np.sum(abs(data) ** 2, axis=-1))) # data = data / furthest_distance # data_4096[:, 0:3] = data path_4096 = file_path path_nonuniform_1024 = path_4096.replace('Poisson_20k','Poisson_5k_nonuniform') #idx_nonuniform_2048 = nonuniform_sampling(num=4096, sample_num=2048) #data_nonuniform_2048 = data_4096[idx_nonuniform_2048, ...] sort_idx = np.argsort(data_4096[:, np.random.randint(0, 2)]) perm_idx = np.random.permutation(np.arange(len(data_4096))) idx_nonuniform_1024 = nonuniform_sampling(num=len(data_4096), sample_num=5000) data_nonuniform_1024 = data_4096[sort_idx[idx_nonuniform_1024],...] # data_nonuniform_1024 = data_4096[perm_idx[:5000], ...] 
#save_file(path_4096,data_4096) #save_file(path_nonuniform_2048,data_nonuniform_2048) save_file(path_nonuniform_1024,data_nonuniform_1024) def possion_sample_fn(file_path): phase = file_path.split('/')[-2] name = file_path.split('/')[-1].replace("off", "xyz") xyz_name = os.path.join(SAVE_ROOT_PATH, '20000~', phase, name) if not os.path.exists(xyz_name): sample_cmd = '../../third_party/PdSampling_nofix %s %s %s' % (str(20000), file_path, xyz_name) print sample_cmd if os.system(sample_cmd): print "cannot sample file: %s" % (file_path) return 1 # xyz_name = os.path.join(SAVE_ROOT_PATH, '4096', phase, name) # if not os.path.exists(xyz_name): # sample_cmd = '../../third_party_pc901/PdSampling %s %s %s' % (str(4096), file_path, xyz_name) # print sample_cmd # if os.system(sample_cmd): # print "cannot sample file: %s" % (file_path) # return 1 # # xyz_name = os.path.join(SAVE_ROOT_PATH, '2048', phase, name) # if not os.path.exists(xyz_name): # sample_cmd = '../../third_party_pc901/PdSampling %s %s %s' % (str(2048), file_path, xyz_name) # print sample_cmd # if os.system(sample_cmd): # print "cannot sample file: %s" % (file_path) # return 1 # xyz_name = os.path.join(SAVE_ROOT_PATH, '8192~', phase, name) # if os.path.exists(xyz_name): # return # print file_path # sample_cmd = '../../third_party/PdSampling_nofix %s %s %s' % (str(8192), file_path, xyz_name) # if os.system(sample_cmd): # print "cannot sample file: %s" % (file_path) # return 1 def normal_off_file(folder='/home/lqyu/workspace/PointSR/data/ModelNet10'): file_list = glob.glob(folder+'/*/train/*.off') file_list.sort() for file_path in file_list: phase = file_path.split('/')[-2] name = file_path.split('/')[-1][:-4]+'.xyz' data_1024 = np.loadtxt(os.path.join('../../data/ModelNet10_poisson', '1024', phase, name)) data_2048 = np.loadtxt(os.path.join('../../data/ModelNet10_poisson', '2048', phase, name)) data_4096 = np.loadtxt(os.path.join('../../data/ModelNet10_poisson', '4096', phase, name)) data_ori = 
np.concatenate((data_1024, data_2048, data_4096), axis=0) data = data_ori[:, 0:3] centroid = np.mean(data, axis=0, keepdims=True) data = data - centroid furthest_distance = np.amax(np.sqrt(np.sum(abs(data) ** 2, axis=-1))) save_off_path = file_path.replace('ModelNet10','ModelNet10_normalize') if not os.path.exists(os.path.split(save_off_path)[0]): os.makedirs(os.path.split(save_off_path)[0]) offFile = open(file_path, 'r') lines = offFile.readlines() offFile.close() with open(save_off_path,'w') as f: f.writelines(lines[0:2]) params = lines[1].split(' ') nVert = int(params[0]) for i in range(2,nVert+2): coord = lines[i].split(' ') x = (float(coord[0])-centroid[0,0])/furthest_distance y = (float(coord[1])-centroid[0,1])/furthest_distance z = (float(coord[2])-centroid[0,2])/furthest_distance f.write('%.6f %.6f %.6f\n'%(x,y,z)) f.writelines(lines[nVert+2:]) def recalculateNormal_fn(file_path): SAVE_ROOT_PATH='../../data/ModelNet10_poisson_normal' phase = file_path.split('/')[-2] name = file_path.split('/')[-1] xyz_name = file_path.split('/')[-1][:-4] + "_p.xyz" xyz_normal_name = file_path.split('/')[-1].replace("off", "xyz") if os.path.exists(os.path.join(SAVE_ROOT_PATH, '2048_nonuniform', phase, name)): return data_1024 = np.loadtxt(os.path.join('../../ModelNet10', '1024', phase, name)) data_2048 = np.loadtxt(os.path.join('../../ModelNet10', '2048', phase, name)) data_4096 = np.loadtxt(os.path.join('../../ModelNet10', '4096', phase, name)) data_8196 = np.loadtxt(os.path.join('../../ModelNet10', '20000~', phase, name)) data = np.concatenate((data_1024, data_2048, data_4096,data_8196), axis=0) np.savetxt(xyz_name, data) normal_cmd = 'meshlabserver -i %s -o %s -s ../../third_party/calculate_normal.mlx -om vn' % ( xyz_name, xyz_normal_name) if os.system(normal_cmd): print "cannot calculate normal file: %s" % (file_path) return 1 data_ori = np.loadtxt(xyz_normal_name) data_ori = data_ori[0:1024*7,:] data = data_ori[:, 0:3] centroid = np.mean(data, axis=0, keepdims=True) 
data = data - centroid furthest_distance = np.amax(np.sqrt(np.sum(abs(data) ** 2, axis=-1))) data = data / furthest_distance data_ori[:, 0:3] = data data_1024 = data_ori[0:1024 * 1, :] data_2048 = data_ori[1024 * 1:1024 * 3, :] data_4096 = data_ori[1024 * 3:1024 * 7, :] #generate nonuniform data idx_nonuniform_2048 = nonuniform_sampling(num=len(data_4096), sample_num=2048) data_nonuniform_2048 = data_4096[idx_nonuniform_2048, ...] idx_nonuniform_1024 = nonuniform_sampling(num=len(data_4096), sample_num=1024) data_nonuniform_1024 = data_4096[idx_nonuniform_1024, ...] # save path_1024 = os.path.join(SAVE_ROOT_PATH, '1024', phase, name) path_2048 = os.path.join(SAVE_ROOT_PATH, '2048', phase, name) path_4096 = os.path.join(SAVE_ROOT_PATH, '4096', phase, name) path_nonuniform_1024 = os.path.join(SAVE_ROOT_PATH, '1024_nonuniform', phase, name) path_nonuniform_2048 = os.path.join(SAVE_ROOT_PATH, '2048_nonuniform', phase, name) if not os.path.exists(os.path.join(SAVE_ROOT_PATH, '1024', phase)): os.makedirs(os.path.join(SAVE_ROOT_PATH, '1024', phase)) if not os.path.exists(os.path.join(SAVE_ROOT_PATH, '2048', phase)): os.makedirs(os.path.join(SAVE_ROOT_PATH, '2048', phase)) if not os.path.exists(os.path.join(SAVE_ROOT_PATH, '4096', phase)): os.makedirs(os.path.join(SAVE_ROOT_PATH, '4096', phase)) if not os.path.exists(os.path.join(SAVE_ROOT_PATH, '1024_nonuniform', phase)): os.makedirs(os.path.join(SAVE_ROOT_PATH, '1024_nonuniform', phase)) if not os.path.exists(os.path.join(SAVE_ROOT_PATH, '2048_nonuniform', phase)): os.makedirs(os.path.join(SAVE_ROOT_PATH, '2048_nonuniform', phase)) np.savetxt(path_1024, data_1024, fmt='%.6f') np.savetxt(path_2048, data_2048, fmt='%.6f') np.savetxt(path_4096, data_4096, fmt='%.6f') np.savetxt(path_nonuniform_1024, data_nonuniform_1024, fmt='%.6f') np.savetxt(path_nonuniform_2048, data_nonuniform_2048, fmt='%.6f') os.remove(xyz_name) os.remove(xyz_normal_name) return def fix_off_file(filepath): with open(filepath,'r') as f: line = 
f.readline() if line=='OFF\n': return print filepath lines = f.readlines() nums = line.split(' ') n1 = nums[0][3:] n2 = nums[1] n3 = nums[2] with open(filepath,'w') as f: f.write('OFF\n%s %s %s'%(n1,n2,n3)) f.writelines(lines) def possion_sample_fn(phase='train'): file_list = glob.glob(os.path.join('../../data/ModelNet10', '*', phase, '*.off')) file_list.sort() new_file_list = [] for item in file_list: name = item.split('/')[-1][:-3]+"xyz" xyz_name = os.path.join(SAVE_ROOT_PATH, '20000~', phase,name) if not os.path.exists(xyz_name): new_file_list.append(item) print('Got %d files in modelnet10.' % (len(new_file_list))) pool = ThreadPool(8) pool.map(possion_sample_fn, new_file_list) def recalculateNormal(phase='train'): file_list = glob.glob(os.path.join('../../ModelNet10/1024',phase,'*.xyz')) file_list.sort() print('Got %d files in modelnet10.' % (len(file_list))) pool = ThreadPool(1) pool.map(recalculateNormal_fn, file_list) def nonuniformsample_from_pointcloud(phase='train'): file_list = glob.glob(os.path.join('../../data/surface_with_area/pcl_4096~','*.xyz')) file_list.sort() print('Got %d files in modelnet10.' 
% (len(file_list))) for item in file_list: tmp(item) def save_h52(save_names = ['poisson_4096','poisson_2048']): h5_filename = '/home/lqyu/workspace/PointSR/data/Patches_tt.h5' file_names1 = glob.glob(os.path.join('/home/lqyu/server/proj49/PointSR_data/train_data/SHREC',save_names[0],'*.xyz')) file_names1.sort() #select which data to save select_file_names = [] for name in file_names1: id = int(name.split('/')[-1].split('_')[0][1:]) if id<1000: select_file_names.append(name) ##read data names = [] catetogy = len(save_names) data = [[] for item in range(catetogy)] # for item in tqdm(select_file_names): # path1 = os.path.join('/home/lqyu/server/proj49/PointSR_data/train_data/poisson2_4096', item.split('/')[-1]) # path2 = os.path.join('/home/lqyu/server/proj49/PointSR_data/train_data/poisson_1024', item.split('/')[-1]) # path3 = os.path.join('/home/lqyu/server/proj49/PointSR_data/train_data/MC_4096', item.split('/')[-1]) # # if os.path.exists(item) and os.path.exists(path1) and os.path.exists(path2): # tmp_data = np.loadtxt(item) # data[0].append(tmp_data) # # tmp_data = np.loadtxt(path1) # data[1].append(tmp_data) # # tmp_data = np.loadtxt(path2) # data[2].append(tmp_data) # # tmp_data = np.loadtxt(path3) # data[3].append(tmp_data) # names.append(item) # else: # print item for item in tqdm(select_file_names): item_data = [] for i in range(catetogy): path = item.replace(save_names[0],save_names[i]) tmp_data = np.loadtxt(path) centroid = np.mean(tmp_data[:, 0:3], axis=0, keepdims=True) tmp_data[:, 0:3] = tmp_data[:, 0:3] - centroid furthest_distance = np.amax(np.sqrt(np.sum(abs(tmp_data[:, 0:3]) ** 2, axis=-1))) tmp_data[:, 0:3] = tmp_data[:, 0:3] / furthest_distance item_data.append(tmp_data) if len(item_data)==catetogy: names.append(item) for i in range(catetogy): data[i].append(item_data[i]) for i in range(catetogy-1): assert len(data[i])==len(data[i+1]) assert len(data[i])==len(names) print len(names) h5_fout = h5py.File(h5_filename) for i in range(catetogy): 
h5_fout.create_dataset( save_names[i], data=data[i], compression='gzip', compression_opts=4, dtype=np.float32) string_dt = h5py.special_dtype(vlen=str) h5_fout.create_dataset( 'name', data=names, compression='gzip', compression_opts=1, dtype=string_dt) h5_fout.close() def save_h5(h5_filename,save_names = ['patch_poisson_4096','patch_poisson_1024_nonuniform']): file_names = os.listdir(os.path.join('',save_names[0])) file_names.sort() #select which data to save test_name = ['nicolo','vaselion','bunny','gril'] select_file_names = [] for name in file_names: mark = False for tt in test_name: if tt in name: mark = True break if mark==False: select_file_names.append(name) ##read data names = [] catetogy = len(save_names) data = [[] for item in range(catetogy)] for item in tqdm(select_file_names): item_data = [] for i in range(catetogy): path = os.path.join(pointcloud_path,save_names[i],item) tmp_data = np.loadtxt(path) if tmp_data.shape[0]==int(save_names[i].split('_')[0]): item_data.append(tmp_data) if len(item_data)==catetogy: names.append(item) for i in range(catetogy): data[i].append(item_data[i]) for i in range(catetogy-1): assert len(data[i])==len(data[i+1]) assert len(data[i])==len(names) h5_fout = h5py.File(h5_filename) for i in range(catetogy): h5_fout.create_dataset( save_names[i], data=data[i], compression='gzip', compression_opts=4, dtype=np.float32) string_dt = h5py.special_dtype(vlen=str) h5_fout.create_dataset( 'name', data=names, compression='gzip', compression_opts=1, dtype=string_dt) h5_fout.close() def load_h5(h5_filename): f = h5py.File(h5_filename) data_1024 = f['data_1024'][:] data_4096 = f['data_4096'][:] name = f['name'][:] return (data_1024,data_4096) if __name__ == '__main__': #nonuniform_subsample_from_pointcloud('/home/lqyu/workspace/PointSR/data/ModelNet10_pc',sampling_num=2048) pointcloud_path = '/home/lqyu/workspace/PointSR/data/Patches' h5_filename = '/home/lqyu/workspace/PointSR/data/Patches_tt.h5' 
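Several helpers in this file repeat the same normalization step: subtract the centroid, then divide by the distance to the furthest point so the cloud fits inside the unit sphere. A standalone sketch of that step (the function name is ours, not part of this file):

```python
import numpy as np

def normalize_to_unit_sphere(pc):
    # pc: (N, 3) array of xyz coordinates
    centroid = np.mean(pc, axis=0, keepdims=True)
    pc = pc - centroid
    furthest_distance = np.amax(np.sqrt(np.sum(pc ** 2, axis=-1)))
    return pc / furthest_distance

pts = normalize_to_unit_sphere(np.random.rand(2048, 3) * 5.0)
```

After this transform the cloud is zero-mean and its furthest point lies exactly on the unit sphere, which is what the `centroid` / `furthest_distance` blocks above compute inline.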
    #nonuniformsample_from_pointcloud_fn('aa')
    #nonuniformsample_from_pointcloud_fn('ss')
    #nonuniformsample_from_pointcloud('train')
    #calculateNormal('train')
    #load_h5('/home/lqyu/workspace/PointSR/data/ModelNet10_tt.h5')
    save_h52()
    #normal_off_file()


================================================
FILE: code/utils/off2obj.py
================================================
#! /usr/bin/python
# Written by John Bowers
# http://johnsresearch.wordpress.com
# 2009
# You are welcome to use this however you want, this is public domain.

import sys

if len(sys.argv) == 3:
    off_path = sys.argv[1]
    obj_path = sys.argv[2]
else:
    print "USAGE: off2obj.py [path to mesh] [output path]"
    sys.exit(0)

# Class Mesh represents a mesh by a vertex list and a face list
# and has a method loadFromOffFile to load the Mesh data from an
# OFF file.
class Mesh:
    """Class Represents a Mesh by (V, F)"""
    def __init__(self):
        self.verts = []
        self.faces = []
        self.nVerts = 0
        self.nFaces = 0
        self.edges = None

    def writeToObjFile(self, pathToObjFile):
        objFile = open(pathToObjFile, 'w')
        objFile.write("# off2obj OBJ File\n")
        objFile.write("# http://johnsresearch.wordpress.com\n")
        for vert in self.verts:
            objFile.write("v ")
            objFile.write(str(vert[0]))
            objFile.write(" ")
            objFile.write(str(vert[1]))
            objFile.write(" ")
            objFile.write(str(vert[2]))
            objFile.write("\n")
        objFile.write("s off\n")
        for face in self.faces:
            objFile.write("f ")
            objFile.write(str(face[0]+1))
            objFile.write(" ")
            objFile.write(str(face[1]+1))
            objFile.write(" ")
            objFile.write(str(face[2]+1))
            objFile.write("\n")
        objFile.close()

    def loadFromOffFile(self, pathToOffFile):
        #Reset this mesh:
        self.verts = []
        self.faces = []
        self.nVerts = 0
        self.nFaces = 0

        #Open the file for reading:
        offFile = open(pathToOffFile, 'r')
        lines = offFile.readlines()

        #Read the number of verts and faces
        params = lines[1].split()
        self.nVerts = int(params[0])
        self.nFaces = int(params[1])

        #split the remaining lines into vert and face arrays
        vertLines = lines[2:2+self.nVerts]
        faceLines = lines[2+self.nVerts:2+self.nVerts+self.nFaces]

        #Create the verts array
        for vertLine in vertLines:
            XYZ = vertLine.split()
            self.verts.append([float(XYZ[0]), float(XYZ[1]), float(XYZ[2])])

        #Create the faces array
        for faceLine in faceLines:
            XYZ = faceLine.split()
            self.faces.append((int(XYZ[1]), int(XYZ[2]), int(XYZ[3])))
            if not(int(XYZ[0]) == 3):
                print "ERROR: This OFF loader can only handle meshes with 3 vertex faces."
                print "A face with", XYZ[0], "vertices is included in the file. Exiting."
                offFile.close()
                sys.exit(0)

        #Cleanup
        offFile.close()

    def edgeList(self):
        if not(self.edges == None):
            return self.edges
        self.edges = []
        for i in range(0, self.nVerts):
            self.edges.append([])
        for face in self.faces:
            i = face[0]
            j = face[1]
            k = face[2]
            if not(j in self.edges[i]):
                self.edges[i].append(j)
            if not(k in self.edges[i]):
                self.edges[i].append(k)
            if not(i in self.edges[j]):
                self.edges[j].append(i)
            if not(k in self.edges[j]):
                self.edges[j].append(k)
            if not(i in self.edges[k]):
                self.edges[k].append(i)
            if not(j in self.edges[k]):
                self.edges[k].append(j)
        return self.edges

"""
Main Program
"""
mesh = Mesh()
mesh.loadFromOffFile(off_path)
mesh.writeToObjFile(obj_path)


================================================
FILE: code/utils/pc_util.py
================================================
""" Utility functions for processing point clouds.

Author: Charles R. Qi, Hao Su
Date: November 2016
"""

import os
import sys
from matplotlib import pyplot as plt
from matplotlib import colors
BASE_DIR = os.path.dirname(os.path.abspath(__file__))
sys.path.append(BASE_DIR)

# Draw point cloud
from eulerangles import euler2mat

# Point cloud IO
import numpy as np
from plyfile import PlyData, PlyElement


# ----------------------------------------
# Point Cloud/Volume Conversions
# ----------------------------------------

def point_cloud_to_volume_batch(point_clouds, vsize=12, radius=1.0, flatten=True):
    """ Input is BxNx3 batch of point cloud
        Output is Bx(vsize^3)
    """
    vol_list = []
    for b in range(point_clouds.shape[0]):
        vol = point_cloud_to_volume(np.squeeze(point_clouds[b,:,:]), vsize, radius)
        if flatten:
            vol_list.append(vol.flatten())
        else:
            vol_list.append(np.expand_dims(np.expand_dims(vol, -1), 0))
    if flatten:
        return np.vstack(vol_list)
    else:
        return np.concatenate(vol_list, 0)

def point_cloud_to_volume(points, vsize, radius=1.0):
    """ input is Nx3 points.
        output is vsize*vsize*vsize
        assumes points are in range [-radius, radius]
    """
    vol = np.zeros((vsize,vsize,vsize))
    voxel = 2*radius/float(vsize)
    locations = (points + radius)/voxel
    locations = locations.astype(int)
    vol[locations[:,0],locations[:,1],locations[:,2]] = 1.0
    return vol

#a = np.zeros((16,1024,3))
#print point_cloud_to_volume_batch(a, 12, 1.0, False).shape

def volume_to_point_cloud(vol):
    """ vol is occupancy grid (value = 0 or 1) of size vsize*vsize*vsize
        return Nx3 numpy array.
""" vsize = vol.shape[0] assert(vol.shape[1] == vsize and vol.shape[1] == vsize) points = [] for a in range(vsize): for b in range(vsize): for c in range(vsize): if vol[a,b,c] == 1: points.append(np.array([a,b,c])) if len(points) == 0: return np.zeros((0,3)) points = np.vstack(points) return points # ---------------------------------------- # Point cloud IO # ---------------------------------------- def read_ply(filename): """ read XYZ point cloud from filename PLY file """ plydata = PlyData.read(filename) pc = plydata['vertex'].data pc_array = np.array([[x, y, z] for x,y,z in pc]) return pc_array def write_ply(points, filename, text=True): """ input: Nx3, write points to filename as PLY format. """ points = [(points[i,0], points[i,1], points[i,2]) for i in range(points.shape[0])] vertex = np.array(points, dtype=[('x', 'f4'), ('y', 'f4'),('z', 'f4')]) el = PlyElement.describe(vertex, 'vertex', comments=['vertices']) PlyData([el], text=text).write(filename) # ---------------------------------------- # Simple Point cloud and Volume Renderers # ---------------------------------------- def draw_point_cloud(input_points, canvasSize=500, space=240, diameter=10, xrot=0, yrot=0, zrot=0, switch_xyz=[0,1,2], normalize=True): """ Render point cloud to image with alpha channel. 
Input: points: Nx3 numpy array (+y is up direction) Output: gray image as numpy array of size canvasSizexcanvasSize """ canvasSizeX = canvasSize canvasSizeY = canvasSize image = np.zeros((canvasSizeX, canvasSizeY)) if input_points is None or input_points.shape[0] == 0: return image points = input_points[:, switch_xyz] M = euler2mat(zrot, yrot, xrot) points = (np.dot(M, points.transpose())).transpose() # Normalize the point cloud # We normalize scale to fit points in a unit sphere if normalize: centroid = np.mean(points, axis=0) points -= centroid furthest_distance = np.max(np.sqrt(np.sum(abs(points)**2,axis=-1))) points /= furthest_distance # Pre-compute the Gaussian disk radius = (diameter-1)/2.0 disk = np.zeros((diameter, diameter)) for i in range(diameter): for j in range(diameter): if (i - radius) * (i-radius) + (j-radius) * (j-radius) <= radius * radius: disk[i, j] = np.exp((-(i-radius)**2 - (j-radius)**2)/(radius**2)) mask = np.argwhere(disk > 0) dx = mask[:, 0] dy = mask[:, 1] dv = disk[disk > 0] # Order points by z-buffer zorder = np.argsort(points[:, 2]) points = points[zorder, :] points[:, 2] = (points[:, 2] - np.min(points[:, 2])) / (np.max(points[:, 2] - np.min(points[:, 2]))) max_depth = np.max(points[:, 2]) for i in range(points.shape[0]): j = points.shape[0] - i - 1 x = points[j, 0] y = points[j, 1] xc = canvasSizeX/2 + (x*space) yc = canvasSizeY/2 + (y*space) xc = int(np.round(xc)) yc = int(np.round(yc)) px = dx + xc py = dy + yc #image[px, py] = image[px, py] * 0.7 + dv * (max_depth - points[j, 2]) * 0.3 image[px, py] = image[px, py] * 0.7 + dv * 0.3 val = np.max(image) val = np.percentile(image,99.9) image = image / val mask = image==0 image[image>1.0]=1.0 image = 1.0-image #image = np.expand_dims(image, axis=-1) #image = np.concatenate((image*0.3+0.7,np.ones_like(image), np.ones_like(image)), axis=2) #image = colors.hsv_to_rgb(image) image[mask]=1.0 return image def point_cloud_three_views(points,diameter=5): """ input points Nx3 numpy array (+y 
is up direction). return an numpy array gray image of size 500x1500. """ # +y is up direction # xrot is azimuth # yrot is in-plane # zrot is elevation # img1 = draw_point_cloud(points, xrot=90/180.0*np.pi, yrot=0/180.0*np.pi, zrot=0/180.0*np.pi,diameter=diameter) # img2 = draw_point_cloud(points, xrot=180/180.0*np.pi, yrot=0/180.0*np.pi, zrot=0/180.0*np.pi,diameter=diameter) # img3 = draw_point_cloud(points, xrot=0/180.0*np.pi, yrot=-90/180.0*np.pi, zrot=0/180.0*np.pi,diameter=diameter) # image_large = np.concatenate([img1, img2, img3], 1) img1 = draw_point_cloud(points, zrot=110 / 180.0 * np.pi, xrot=135 / 180.0 * np.pi, yrot=0 / 180.0 * np.pi,diameter=diameter) img2 = draw_point_cloud(points, zrot=70 / 180.0 * np.pi, xrot=135 / 180.0 * np.pi, yrot=0 / 180.0 * np.pi,diameter=diameter) img3 = draw_point_cloud(points, zrot=180.0 / 180.0 * np.pi, xrot=90 / 180.0 * np.pi, yrot=0 / 180.0 * np.pi,diameter=diameter) image_large = np.concatenate([img1, img2, img3], 1) return image_large from PIL import Image def point_cloud_three_views_demo(): """ Demo for draw_point_cloud function """ points = read_ply('../third_party/mesh_sampling/piano.ply') im_array = point_cloud_three_views(points) img = Image.fromarray(np.uint8(im_array*255.0)) img.save('piano.jpg') if __name__=="__main__": point_cloud_three_views_demo() import matplotlib.pyplot as plt def pyplot_draw_point_cloud(points, output_filename): """ points is a Nx3 numpy array """ fig = plt.figure() ax = fig.add_subplot(111, projection='3d') ax.scatter(points[:,0], points[:,1], points[:,2]) ax.set_xlabel('x') ax.set_ylabel('y') ax.set_zlabel('z') #savefig(output_filename) def pyplot_draw_volume(vol, output_filename): """ vol is of size vsize*vsize*vsize output an image to output_filename """ points = volume_to_point_cloud(vol) pyplot_draw_point_cloud(points, output_filename) ================================================ FILE: code/utils/plyfile.py ================================================ # Copyright 2014 Darsh 
Ranjan # # This file is part of python-plyfile. # # python-plyfile is free software: you can redistribute it and/or # modify it under the terms of the GNU General Public License as # published by the Free Software Foundation, either version 3 of the # License, or (at your option) any later version. # # python-plyfile is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU # General Public License for more details. # # You should have received a copy of the GNU General Public License # along with python-plyfile. If not, see # . from itertools import islice as _islice import numpy as _np from sys import byteorder as _byteorder try: _range = xrange except NameError: _range = range # Many-many relation _data_type_relation = [ ('int8', 'i1'), ('char', 'i1'), ('uint8', 'u1'), ('uchar', 'b1'), ('uchar', 'u1'), ('int16', 'i2'), ('short', 'i2'), ('uint16', 'u2'), ('ushort', 'u2'), ('int32', 'i4'), ('int', 'i4'), ('uint32', 'u4'), ('uint', 'u4'), ('float32', 'f4'), ('float', 'f4'), ('float64', 'f8'), ('double', 'f8') ] _data_types = dict(_data_type_relation) _data_type_reverse = dict((b, a) for (a, b) in _data_type_relation) _types_list = [] _types_set = set() for (_a, _b) in _data_type_relation: if _a not in _types_set: _types_list.append(_a) _types_set.add(_a) if _b not in _types_set: _types_list.append(_b) _types_set.add(_b) _byte_order_map = { 'ascii': '=', 'binary_little_endian': '<', 'binary_big_endian': '>' } _byte_order_reverse = { '<': 'binary_little_endian', '>': 'binary_big_endian' } _native_byte_order = {'little': '<', 'big': '>'}[_byteorder] def _lookup_type(type_str): if type_str not in _data_type_reverse: try: type_str = _data_types[type_str] except KeyError: raise ValueError("field type %r not in %r" % (type_str, _types_list)) return _data_type_reverse[type_str] def _split_line(line, n): fields = line.split(None, n) if len(fields) == 
n:
        fields.append('')

    assert len(fields) == n + 1
    return fields


def make2d(array, cols=None, dtype=None):
    '''
    Make a 2D array from an array of arrays.  The `cols' and `dtype'
    arguments can be omitted if the array is not empty.

    '''
    if (cols is None or dtype is None) and not len(array):
        raise RuntimeError("cols and dtype must be specified for empty "
                           "array")

    if cols is None:
        cols = len(array[0])

    if dtype is None:
        dtype = array[0].dtype

    return _np.fromiter(array, [('_', dtype, (cols,))],
                        count=len(array))['_']


class PlyParseError(Exception):
    '''
    Raised when a PLY file cannot be parsed.  The attributes `element',
    `row', `property', and `message' give additional information.

    '''
    def __init__(self, message, element=None, row=None, prop=None):
        self.message = message
        self.element = element
        self.row = row
        self.prop = prop

        s = ''
        if self.element:
            s += 'element %r: ' % self.element.name
        if self.row is not None:
            s += 'row %d: ' % self.row
        if self.prop:
            s += 'property %r: ' % self.prop.name
        s += self.message

        Exception.__init__(self, s)

    def __repr__(self):
        return ('PlyParseError(%r, element=%r, row=%r, prop=%r)' %
                (self.message, self.element, self.row, self.prop))


class PlyData(object):
    '''
    PLY file header and data.

    A PlyData instance is created in one of two ways: by the static
    method PlyData.read (to read a PLY file), or directly from __init__
    given a sequence of elements (which can then be written to a PLY
    file).

    '''
    def __init__(self, elements=[], text=False, byte_order='=',
                 comments=[], obj_info=[]):
        '''
        elements: sequence of PlyElement instances.

        text: whether the resulting PLY file will be text (True) or
            binary (False).

        byte_order: '<' for little-endian, '>' for big-endian, or '='
            for native.  This is only relevant if `text' is False.

        comments: sequence of strings that will be placed in the header
            between the 'ply' and 'format ...' lines.

        obj_info: like comments, but will be placed in the header with
            "obj_info ..." instead of "comment ...".
''' if byte_order == '=' and not text: byte_order = _native_byte_order self.byte_order = byte_order self.text = text self.comments = list(comments) self.obj_info = list(obj_info) self.elements = elements def _get_elements(self): return self._elements def _set_elements(self, elements): self._elements = tuple(elements) self._index() elements = property(_get_elements, _set_elements) def _get_byte_order(self): return self._byte_order def _set_byte_order(self, byte_order): if byte_order not in ['<', '>', '=']: raise ValueError("byte order must be '<', '>', or '='") self._byte_order = byte_order byte_order = property(_get_byte_order, _set_byte_order) def _index(self): self._element_lookup = dict((elt.name, elt) for elt in self._elements) if len(self._element_lookup) != len(self._elements): raise ValueError("two elements with same name") @staticmethod def _parse_header(stream): ''' Parse a PLY header from a readable file-like stream. ''' lines = [] comments = {'comment': [], 'obj_info': []} while True: line = stream.readline().decode('ascii').strip() fields = _split_line(line, 1) if fields[0] == 'end_header': break elif fields[0] in comments.keys(): lines.append(fields) else: lines.append(line.split()) a = 0 if lines[a] != ['ply']: raise PlyParseError("expected 'ply'") a += 1 while lines[a][0] in comments.keys(): comments[lines[a][0]].append(lines[a][1]) a += 1 if lines[a][0] != 'format': raise PlyParseError("expected 'format'") if lines[a][2] != '1.0': raise PlyParseError("expected version '1.0'") if len(lines[a]) != 3: raise PlyParseError("too many fields after 'format'") fmt = lines[a][1] if fmt not in _byte_order_map: raise PlyParseError("don't understand format %r" % fmt) byte_order = _byte_order_map[fmt] text = fmt == 'ascii' a += 1 while a < len(lines) and lines[a][0] in comments.keys(): comments[lines[a][0]].append(lines[a][1]) a += 1 return PlyData(PlyElement._parse_multi(lines[a:]), text, byte_order, comments['comment'], comments['obj_info']) @staticmethod def 
read(stream): ''' Read PLY data from a readable file-like object or filename. ''' (must_close, stream) = _open_stream(stream, 'read') try: data = PlyData._parse_header(stream) for elt in data: elt._read(stream, data.text, data.byte_order) finally: if must_close: stream.close() return data def write(self, stream): ''' Write PLY data to a writeable file-like object or filename. ''' (must_close, stream) = _open_stream(stream, 'write') try: stream.write(self.header.encode('ascii')) stream.write(b'\r\n') for elt in self: elt._write(stream, self.text, self.byte_order) finally: if must_close: stream.close() @property def header(self): ''' Provide PLY-formatted metadata for the instance. ''' lines = ['ply'] if self.text: lines.append('format ascii 1.0') else: lines.append('format ' + _byte_order_reverse[self.byte_order] + ' 1.0') # Some information is lost here, since all comments are placed # between the 'format' line and the first element. for c in self.comments: lines.append('comment ' + c) for c in self.obj_info: lines.append('obj_info ' + c) lines.extend(elt.header for elt in self.elements) lines.append('end_header') return '\r\n'.join(lines) def __iter__(self): return iter(self.elements) def __len__(self): return len(self.elements) def __contains__(self, name): return name in self._element_lookup def __getitem__(self, name): return self._element_lookup[name] def __str__(self): return self.header def __repr__(self): return ('PlyData(%r, text=%r, byte_order=%r, ' 'comments=%r, obj_info=%r)' % (self.elements, self.text, self.byte_order, self.comments, self.obj_info)) def _open_stream(stream, read_or_write): if hasattr(stream, read_or_write): return (False, stream) try: return (True, open(stream, read_or_write[0] + 'b')) except TypeError: raise RuntimeError("expected open file or filename") class PlyElement(object): ''' PLY file element. 
A client of this library doesn't normally need to instantiate this directly, so the following is only for the sake of documenting the internals. Creating a PlyElement instance is generally done in one of two ways: as a byproduct of PlyData.read (when reading a PLY file) and by PlyElement.describe (before writing a PLY file). ''' def __init__(self, name, properties, count, comments=[]): ''' This is not part of the public interface. The preferred methods of obtaining PlyElement instances are PlyData.read (to read from a file) and PlyElement.describe (to construct from a numpy array). ''' self._name = str(name) self._check_name() self._count = count self._properties = tuple(properties) self._index() self.comments = list(comments) self._have_list = any(isinstance(p, PlyListProperty) for p in self.properties) @property def count(self): return self._count def _get_data(self): return self._data def _set_data(self, data): self._data = data self._count = len(data) self._check_sanity() data = property(_get_data, _set_data) def _check_sanity(self): for prop in self.properties: if prop.name not in self._data.dtype.fields: raise ValueError("dangling property %r" % prop.name) def _get_properties(self): return self._properties def _set_properties(self, properties): self._properties = tuple(properties) self._check_sanity() self._index() properties = property(_get_properties, _set_properties) def _index(self): self._property_lookup = dict((prop.name, prop) for prop in self._properties) if len(self._property_lookup) != len(self._properties): raise ValueError("two properties with same name") def ply_property(self, name): return self._property_lookup[name] @property def name(self): return self._name def _check_name(self): if any(c.isspace() for c in self._name): msg = "element name %r contains spaces" % self._name raise ValueError(msg) def dtype(self, byte_order='='): ''' Return the numpy dtype of the in-memory representation of the data. 
(If there are no list properties, and the PLY format is binary, then this also accurately describes the on-disk representation of the element.) ''' return [(prop.name, prop.dtype(byte_order)) for prop in self.properties] @staticmethod def _parse_multi(header_lines): ''' Parse a list of PLY element definitions. ''' elements = [] while header_lines: (elt, header_lines) = PlyElement._parse_one(header_lines) elements.append(elt) return elements @staticmethod def _parse_one(lines): ''' Consume one element definition. The unconsumed input is returned along with a PlyElement instance. ''' a = 0 line = lines[a] if line[0] != 'element': raise PlyParseError("expected 'element'") if len(line) > 3: raise PlyParseError("too many fields after 'element'") if len(line) < 3: raise PlyParseError("too few fields after 'element'") (name, count) = (line[1], int(line[2])) comments = [] properties = [] while True: a += 1 if a >= len(lines): break if lines[a][0] == 'comment': comments.append(lines[a][1]) elif lines[a][0] == 'property': properties.append(PlyProperty._parse_one(lines[a])) else: break return (PlyElement(name, properties, count, comments), lines[a:]) @staticmethod def describe(data, name, len_types={}, val_types={}, comments=[]): ''' Construct a PlyElement from an array's metadata. len_types and val_types can be given as mappings from list property names to type strings (like 'u1', 'f4', etc., or 'int8', 'float32', etc.). These can be used to define the length and value types of list properties. List property lengths always default to type 'u1' (8-bit unsigned integer), and value types default to 'i4' (32-bit integer). 
''' if not isinstance(data, _np.ndarray): raise TypeError("only numpy arrays are supported") if len(data.shape) != 1: raise ValueError("only one-dimensional arrays are " "supported") count = len(data) properties = [] descr = data.dtype.descr for t in descr: if not isinstance(t[1], str): raise ValueError("nested records not supported") if not t[0]: raise ValueError("field with empty name") if len(t) != 2 or t[1][1] == 'O': # non-scalar field, which corresponds to a list # property in PLY. if t[1][1] == 'O': if len(t) != 2: raise ValueError("non-scalar object fields not " "supported") len_str = _data_type_reverse[len_types.get(t[0], 'u1')] if t[1][1] == 'O': val_type = val_types.get(t[0], 'i4') val_str = _lookup_type(val_type) else: val_str = _lookup_type(t[1][1:]) prop = PlyListProperty(t[0], len_str, val_str) else: val_str = _lookup_type(t[1][1:]) prop = PlyProperty(t[0], val_str) properties.append(prop) elt = PlyElement(name, properties, count, comments) elt.data = data return elt def _read(self, stream, text, byte_order): ''' Read the actual data from a PLY file. ''' if text: self._read_txt(stream) else: if self._have_list: # There are list properties, so a simple load is # impossible. self._read_bin(stream, byte_order) else: # There are no list properties, so loading the data is # much more straightforward. self._data = _np.fromfile(stream, self.dtype(byte_order), self.count) if len(self._data) < self.count: k = len(self._data) del self._data raise PlyParseError("early end-of-file", self, k) self._check_sanity() def _write(self, stream, text, byte_order): ''' Write the data to a PLY file. ''' if text: self._write_txt(stream) else: if self._have_list: # There are list properties, so serialization is # slightly complicated. self._write_bin(stream, byte_order) else: # no list properties, so serialization is # straightforward. 
self.data.astype(self.dtype(byte_order), copy=False).tofile(stream) def _read_txt(self, stream): ''' Load a PLY element from an ASCII-format PLY file. The element may contain list properties. ''' self._data = _np.empty(self.count, dtype=self.dtype()) k = 0 for line in _islice(iter(stream.readline, b''), self.count): fields = iter(line.strip().split()) for prop in self.properties: try: self._data[prop.name][k] = prop._from_fields(fields) except StopIteration: raise PlyParseError("early end-of-line", self, k, prop) except ValueError: raise PlyParseError("malformed input", self, k, prop) try: next(fields) except StopIteration: pass else: raise PlyParseError("expected end-of-line", self, k) k += 1 if k < self.count: del self._data raise PlyParseError("early end-of-file", self, k) def _write_txt(self, stream): ''' Save a PLY element to an ASCII-format PLY file. The element may contain list properties. ''' for rec in self.data: fields = [] for prop in self.properties: fields.extend(prop._to_fields(rec[prop.name])) _np.savetxt(stream, [fields], '%.18g', newline='\r\n') def _read_bin(self, stream, byte_order): ''' Load a PLY element from a binary PLY file. The element may contain list properties. ''' self._data = _np.empty(self.count, dtype=self.dtype(byte_order)) for k in _range(self.count): for prop in self.properties: try: self._data[prop.name][k] = \ prop._read_bin(stream, byte_order) except StopIteration: raise PlyParseError("early end-of-file", self, k, prop) def _write_bin(self, stream, byte_order): ''' Save a PLY element to a binary PLY file. The element may contain list properties. ''' for rec in self.data: for prop in self.properties: prop._write_bin(rec[prop.name], stream, byte_order) @property def header(self): ''' Format this element's metadata as it would appear in a PLY header. ''' lines = ['element %s %d' % (self.name, self.count)] # Some information is lost here, since all comments are placed # between the 'element' line and the first property definition. 
for c in self.comments: lines.append('comment ' + c) lines.extend(list(map(str, self.properties))) return '\r\n'.join(lines) def __getitem__(self, key): return self.data[key] def __setitem__(self, key, value): self.data[key] = value def __str__(self): return self.header def __repr__(self): return ('PlyElement(%r, %r, count=%d, comments=%r)' % (self.name, self.properties, self.count, self.comments)) class PlyProperty(object): ''' PLY property description. This class is pure metadata; the data itself is contained in PlyElement instances. ''' def __init__(self, name, val_dtype): self._name = str(name) self._check_name() self.val_dtype = val_dtype def _get_val_dtype(self): return self._val_dtype def _set_val_dtype(self, val_dtype): self._val_dtype = _data_types[_lookup_type(val_dtype)] val_dtype = property(_get_val_dtype, _set_val_dtype) @property def name(self): return self._name def _check_name(self): if any(c.isspace() for c in self._name): msg = "Error: property name %r contains spaces" % self._name raise RuntimeError(msg) @staticmethod def _parse_one(line): assert line[0] == 'property' if line[1] == 'list': if len(line) > 5: raise PlyParseError("too many fields after " "'property list'") if len(line) < 5: raise PlyParseError("too few fields after " "'property list'") return PlyListProperty(line[4], line[2], line[3]) else: if len(line) > 3: raise PlyParseError("too many fields after " "'property'") if len(line) < 3: raise PlyParseError("too few fields after " "'property'") return PlyProperty(line[2], line[1]) def dtype(self, byte_order='='): ''' Return the numpy dtype description for this property (as a tuple of strings). ''' return byte_order + self.val_dtype def _from_fields(self, fields): ''' Parse from generator. Raise StopIteration if the property could not be read. ''' return _np.dtype(self.dtype()).type(next(fields)) def _to_fields(self, data): ''' Return generator over one item. 
''' yield _np.dtype(self.dtype()).type(data) def _read_bin(self, stream, byte_order): ''' Read data from a binary stream. Raise StopIteration if the property could not be read. ''' try: return _np.fromfile(stream, self.dtype(byte_order), 1)[0] except IndexError: raise StopIteration def _write_bin(self, data, stream, byte_order): ''' Write data to a binary stream. ''' _np.dtype(self.dtype(byte_order)).type(data).tofile(stream) def __str__(self): val_str = _data_type_reverse[self.val_dtype] return 'property %s %s' % (val_str, self.name) def __repr__(self): return 'PlyProperty(%r, %r)' % (self.name, _lookup_type(self.val_dtype)) class PlyListProperty(PlyProperty): ''' PLY list property description. ''' def __init__(self, name, len_dtype, val_dtype): PlyProperty.__init__(self, name, val_dtype) self.len_dtype = len_dtype def _get_len_dtype(self): return self._len_dtype def _set_len_dtype(self, len_dtype): self._len_dtype = _data_types[_lookup_type(len_dtype)] len_dtype = property(_get_len_dtype, _set_len_dtype) def dtype(self, byte_order='='): ''' List properties always have a numpy dtype of "object". ''' return '|O' def list_dtype(self, byte_order='='): ''' Return the pair (len_dtype, val_dtype) (both numpy-friendly strings). ''' return (byte_order + self.len_dtype, byte_order + self.val_dtype) def _from_fields(self, fields): (len_t, val_t) = self.list_dtype() n = int(_np.dtype(len_t).type(next(fields))) data = _np.loadtxt(list(_islice(fields, n)), val_t, ndmin=1) if len(data) < n: raise StopIteration return data def _to_fields(self, data): ''' Return generator over the (numerical) PLY representation of the list data (length followed by actual data). 
''' (len_t, val_t) = self.list_dtype() data = _np.asarray(data, dtype=val_t).ravel() yield _np.dtype(len_t).type(data.size) for x in data: yield x def _read_bin(self, stream, byte_order): (len_t, val_t) = self.list_dtype(byte_order) try: n = _np.fromfile(stream, len_t, 1)[0] except IndexError: raise StopIteration data = _np.fromfile(stream, val_t, n) if len(data) < n: raise StopIteration return data def _write_bin(self, data, stream, byte_order): ''' Write data to a binary stream. ''' (len_t, val_t) = self.list_dtype(byte_order) data = _np.asarray(data, dtype=val_t).ravel() _np.array(data.size, dtype=len_t).tofile(stream) data.tofile(stream) def __str__(self): len_str = _data_type_reverse[self.len_dtype] val_str = _data_type_reverse[self.val_dtype] return 'property list %s %s %s' % (len_str, val_str, self.name) def __repr__(self): return ('PlyListProperty(%r, %r, %r)' % (self.name, _lookup_type(self.len_dtype), _lookup_type(self.val_dtype))) ================================================ FILE: code/utils/pointnet_util.py ================================================ """ PointNet++ Layers Author: Charles R. 
Qi Date: November 2017 """ import os import sys from tf_ops.sampling.tf_sampling import farthest_point_sample, gather_point from tf_ops.grouping.tf_grouping import query_ball_point, group_point, knn_point from tf_ops.interpolation.tf_interpolate import three_nn, three_interpolate import tensorflow as tf import numpy as np import tf_util2 def sample_and_group(npoint, radius, nsample, xyz, points, tnet_spec=None, knn=False, use_xyz=True): ''' Input: npoint: int32 radius: float32 nsample: int32 xyz: (batch_size, ndataset, 3) TF tensor points: (batch_size, ndataset, channel) TF tensor, if None will just use xyz as points tnet_spec: dict (keys: mlp, mlp2, is_training, bn_decay), if None do not apply tnet knn: bool, if True use kNN instead of radius search use_xyz: bool, if True concat XYZ with local point features, otherwise just use point features Output: new_xyz: (batch_size, npoint, 3) TF tensor new_points: (batch_size, npoint, nsample, 3+channel) TF tensor idx: (batch_size, npoint, nsample) TF tensor, indices of local points as in ndataset points grouped_xyz: (batch_size, npoint, nsample, 3) TF tensor, normalized point XYZs (subtracted by seed point XYZ) in local regions ''' new_xyz = gather_point(xyz, farthest_point_sample(npoint, xyz)) # (batch_size, npoint, 3) if knn: _,idx = knn_point(nsample, xyz, new_xyz) else: if np.isscalar(radius): idx, pts_cnt = query_ball_point(radius, nsample, xyz, new_xyz) else: idx_list = [] for radius_one, xyz_one, new_xyz_one in zip(tf.unstack(radius,axis=0), tf.unstack(xyz, axis=0),tf.unstack(new_xyz, axis=0)): idx_one, _ = query_ball_point(radius_one, nsample, tf.expand_dims(xyz_one, axis=0), tf.expand_dims(new_xyz_one, axis=0)) idx_list.append(idx_one) idx = tf.stack(idx_list, axis=0) idx = tf.squeeze(idx, axis=1) grouped_xyz = group_point(xyz, idx) # (batch_size, npoint, nsample, 3) grouped_xyz -= tf.tile(tf.expand_dims(new_xyz, 2), [1,1,nsample,1]) # translation normalization if tnet_spec is not None: grouped_xyz = 
tnet(grouped_xyz, tnet_spec) if points is not None: grouped_points = group_point(points, idx) # (batch_size, npoint, nsample, channel) if use_xyz: # new_points = tf.concat([grouped_xyz, tf.tile(tf.expand_dims(new_xyz, 2), [1,1,nsample,1]),grouped_points], axis=-1) # (batch_size, npoint, nample, 3+channel) new_points = tf.concat([grouped_xyz, grouped_points],axis=-1) # (batch_size, npoint, nample, 3+channel) else: new_points = grouped_points else: # new_points = tf.concat([grouped_xyz, tf.tile(tf.expand_dims(new_xyz, 2), [1,1,nsample,1])], axis=-1) new_points = grouped_xyz return new_xyz, new_points, idx, grouped_xyz def sample_and_group_all(xyz, points, use_xyz=True): ''' Inputs: xyz: (batch_size, ndataset, 3) TF tensor points: (batch_size, ndataset, channel) TF tensor, if None will just use xyz as points use_xyz: bool, if True concat XYZ with local point features, otherwise just use point features Outputs: new_xyz: (batch_size, 1, 3) as (0,0,0) new_points: (batch_size, 1, ndataset, 3+channel) TF tensor Note: Equivalent to sample_and_group with npoint=1, radius=inf, use (0,0,0) as the centroid ''' batch_size = xyz.get_shape()[0].value nsample = xyz.get_shape()[1].value new_xyz = tf.constant(np.tile(np.array([0,0,0]).reshape((1,1,3)), (batch_size,1,1)),dtype=tf.float32) # (batch_size, 1, 3) idx = tf.constant(np.tile(np.array(range(nsample)).reshape((1,1,nsample)), (batch_size,1,1))) grouped_xyz = tf.reshape(xyz, (batch_size, 1, nsample, 3)) # (batch_size, npoint=1, nsample, 3) if points is not None: if use_xyz: new_points = tf.concat([xyz, points], axis=2) # (batch_size, 16, 259) else: new_points = points new_points = tf.expand_dims(new_points, 1) # (batch_size, 1, 16, 259) else: new_points = grouped_xyz return new_xyz, new_points, idx, grouped_xyz def pointnet_sa_module(xyz, points, npoint, radius, nsample, mlp, mlp2, group_all, is_training, bn_decay, scope, bn=True, ibn=False, pooling='max', tnet_spec=None, knn=False, use_xyz=True): ''' PointNet Set Abstraction 
(SA) Module Input: xyz: (batch_size, ndataset, 3) TF tensor points: (batch_size, ndataset, channel) TF tensor npoint: int32 -- #points sampled in farthest point sampling radius: float32 -- search radius in local region batch_radius: the size of each object nsample: int32 -- how many points in each local region mlp: list of int32 -- output size for MLP on each point mlp2: list of int32 -- output size for MLP on each region group_all: bool -- group all points into one PC if set true, OVERRIDE npoint, radius and nsample settings use_xyz: bool, if True concat XYZ with local point features, otherwise just use point features Return: new_xyz: (batch_size, npoint, 3) TF tensor new_points: (batch_size, npoint, mlp[-1] or mlp2[-1]) TF tensor idx: (batch_size, npoint, nsample) int32 -- indices for local regions ''' with tf.variable_scope(scope) as sc: if group_all: nsample = xyz.get_shape()[1].value new_xyz, new_points, idx, grouped_xyz = sample_and_group_all(xyz, points, use_xyz) else: new_xyz, new_points, idx, grouped_xyz = sample_and_group(npoint, radius, nsample, xyz, points, tnet_spec, knn, use_xyz) if mlp2 is None: mlp2 = [] for i, num_out_channel in enumerate(mlp): new_points = tf_util2.conv2d(new_points, num_out_channel, [1,1], padding='VALID', stride=[1,1], bn=bn, ibn=ibn, is_training=is_training, scope='conv%d'%(i), bn_decay=bn_decay) if pooling=='avg': new_points = tf.layers.average_pooling2d(new_points, [1,nsample], [1,1], padding='VALID', name='avgpool1') elif pooling=='weighted_avg': with tf.variable_scope('weighted_avg1'): dists = tf.norm(grouped_xyz,axis=-1,ord=2,keep_dims=True) exp_dists = tf.exp(-dists * 5) weights = exp_dists/tf.reduce_sum(exp_dists,axis=2,keep_dims=True) # (batch_size, npoint, nsample, 1) new_points *= weights # (batch_size, npoint, nsample, mlp[-1]) new_points = tf.reduce_sum(new_points, axis=2, keep_dims=True) elif pooling=='max': new_points = tf.reduce_max(new_points, axis=[2], keep_dims=True) elif pooling=='min': new_points = 
tf.layers.max_pooling2d(-1 * new_points, [1, nsample], [1, 1], padding='VALID', name='minpool1')
        elif pooling=='max_and_avg':
            # the max-pooled branch goes through max_pooling2d and the average-pooled
            # branch through average_pooling2d (the variable names were swapped)
            max_points = tf.layers.max_pooling2d(new_points, [1, nsample], [1, 1], padding='VALID', name='maxpool1')
            avg_points = tf.layers.average_pooling2d(new_points, [1, nsample], [1, 1], padding='VALID', name='avgpool1')
            new_points = tf.concat([max_points, avg_points], axis=-1)
        for i, num_out_channel in enumerate(mlp2):
            new_points = tf_util2.conv2d(new_points, num_out_channel, [1, 1],
                                         padding='VALID', stride=[1, 1], bn=bn, ibn=ibn,
                                         is_training=is_training, scope='conv_post_%d' % (i),
                                         bn_decay=bn_decay)
        new_points = tf.squeeze(new_points, [2])  # (batch_size, npoints, mlp2[-1])
        return new_xyz, new_points, idx

def pointnet_sa_module_msg(xyz, points, npoint, radius_list, nsample_list, mlp_list, is_training, bn_decay, scope, bn=True, ibn=False, use_xyz=True):
    ''' PointNet Set Abstraction (SA) module with Multi-Scale Grouping (MSG)
        Input:
            xyz: (batch_size, ndataset, 3) TF tensor
            points: (batch_size, ndataset, channel) TF tensor
            npoint: int32 -- #points sampled in farthest point sampling
            radius_list: list of float32 -- search radius in each local region
            nsample_list: list of int32 -- how many points in each local region
            mlp_list: list of list of int32 -- output size for MLP on each point
            use_xyz: bool, if True concat XYZ with local point features, otherwise just use point features
        Return:
            new_xyz: (batch_size, npoint, 3) TF tensor
            new_points: (batch_size, npoint, \sum_k{mlp_list[k][-1]}) TF tensor
    '''
    with tf.variable_scope(scope) as sc:
        new_xyz = gather_point(xyz, farthest_point_sample(npoint, xyz))
        new_points_list = []
        for i in range(len(radius_list)):
            radius = radius_list[i]
            nsample = nsample_list[i]
            idx, pts_cnt = query_ball_point(radius, nsample, xyz, new_xyz)
            grouped_xyz = group_point(xyz, idx)
            grouped_xyz -= tf.expand_dims(new_xyz, 2)
            if points is not None:
                grouped_points = group_point(points, idx)
                if use_xyz:
                    grouped_points = tf.concat([grouped_points,
grouped_xyz], axis=-1) else: grouped_points = grouped_xyz for j,num_out_channel in enumerate(mlp_list[i]): grouped_points = tf_util2.conv2d(grouped_points, num_out_channel, [1,1], padding='VALID', stride=[1,1], bn=bn, ibn=ibn,is_training=is_training, scope='conv%d_%d'%(i,j), bn_decay=bn_decay) new_points = tf.reduce_max(grouped_points, axis=[2]) new_points_list.append(new_points) new_points_concat = tf.concat(new_points_list, axis=-1) return new_xyz, new_points_concat def pointnet_fp_module(xyz1, xyz2, points1, points2, mlp, is_training, bn_decay, scope, bn=True,ibn=False): ''' PointNet Feature Propogation (FP) Module Input: xyz1: (batch_size, ndataset1, 3) TF tensor xyz2: (batch_size, ndataset2, 3) TF tensor, sparser than xyz1 points1: (batch_size, ndataset1, nchannel1) TF tensor points2: (batch_size, ndataset2, nchannel2) TF tensor mlp: list of int32 -- output size for MLP on each point Return: new_points: (batch_size, ndataset1, mlp[-1]) TF tensor ''' with tf.variable_scope(scope) as sc: dist, idx = three_nn(xyz1, xyz2) dist = tf.maximum(dist, 1e-10) norm = tf.reduce_sum((1.0/dist),axis=2,keep_dims=True) norm = tf.tile(norm,[1,1,3]) weight = (1.0/dist) / norm interpolated_points = three_interpolate(points2, idx, weight) if points1 is not None: new_points1 = tf.concat(axis=2, values=[interpolated_points, points1]) # B,ndataset1,nchannel1+nchannel2 else: new_points1 = interpolated_points new_points1 = tf.expand_dims(new_points1, 2) for i, num_out_channel in enumerate(mlp): new_points1 = tf_util2.conv2d(new_points1, num_out_channel, [1,1], padding='VALID', stride=[1,1], bn=bn, ibn=ibn,is_training=is_training, scope='conv_%d'%(i), bn_decay=bn_decay) new_points1 = tf.squeeze(new_points1, [2]) # B,ndataset1,mlp[-1] return new_points1 ================================================ FILE: code/utils/provider.py ================================================ import os import sys import numpy as np import h5py BASE_DIR = os.path.dirname(os.path.abspath(__file__)) 
sys.path.append(BASE_DIR) # Download dataset for point cloud classification DATA_DIR = os.path.join(BASE_DIR, 'data') if not os.path.exists(DATA_DIR): os.mkdir(DATA_DIR) if not os.path.exists(os.path.join(DATA_DIR, 'modelnet40_ply_hdf5_2048')): www = 'https://shapenet.cs.stanford.edu/media/modelnet40_ply_hdf5_2048.zip' zipfile = os.path.basename(www) os.system('wget %s; unzip %s' % (www, zipfile)) os.system('mv %s %s' % (zipfile[:-4], DATA_DIR)) os.system('rm %s' % (zipfile)) def shuffle_data(data, labels): """ Shuffle data and labels. Input: data: B,N,... numpy array label: B,... numpy array Return: shuffled data, label and shuffle indices """ idx = np.arange(len(labels)) np.random.shuffle(idx) return data[idx, ...], labels[idx], idx def rotate_point_cloud(batch_data): """ Randomly rotate the point clouds to augument the dataset rotation is per shape based along up direction Input: BxNx3 array, original batch of point clouds Return: BxNx3 array, rotated batch of point clouds """ rotated_data = np.zeros(batch_data.shape, dtype=np.float32) for k in xrange(batch_data.shape[0]): rotation_angle = np.random.uniform() * 2 * np.pi cosval = np.cos(rotation_angle) sinval = np.sin(rotation_angle) rotation_matrix = np.array([[cosval, 0, sinval], [0, 1, 0], [-sinval, 0, cosval]]) shape_pc = batch_data[k, ...] rotated_data[k, ...] = np.dot(shape_pc.reshape((-1, 3)), rotation_matrix) return rotated_data def rotate_point_cloud_by_angle(batch_data, rotation_angle): """ Rotate the point cloud along up direction with certain angle. Input: BxNx3 array, original batch of point clouds Return: BxNx3 array, rotated batch of point clouds """ rotated_data = np.zeros(batch_data.shape, dtype=np.float32) for k in xrange(batch_data.shape[0]): #rotation_angle = np.random.uniform() * 2 * np.pi cosval = np.cos(rotation_angle) sinval = np.sin(rotation_angle) rotation_matrix = np.array([[cosval, 0, sinval], [0, 1, 0], [-sinval, 0, cosval]]) shape_pc = batch_data[k, ...] rotated_data[k, ...] 
= np.dot(shape_pc.reshape((-1, 3)), rotation_matrix) return rotated_data def rotate_perturbation_point_cloud(batch_data, angle_sigma=0.06, angle_clip=0.18): """ Randomly perturb the point clouds by small rotations Input: BxNx3 array, original batch of point clouds Return: BxNx3 array, rotated batch of point clouds """ rotated_data = np.zeros(batch_data.shape, dtype=np.float32) for k in xrange(batch_data.shape[0]): angles = np.clip(angle_sigma*np.random.randn(3), -angle_clip, angle_clip) Rx = np.array([[1,0,0], [0,np.cos(angles[0]),-np.sin(angles[0])], [0,np.sin(angles[0]),np.cos(angles[0])]]) Ry = np.array([[np.cos(angles[1]),0,np.sin(angles[1])], [0,1,0], [-np.sin(angles[1]),0,np.cos(angles[1])]]) Rz = np.array([[np.cos(angles[2]),-np.sin(angles[2]),0], [np.sin(angles[2]),np.cos(angles[2]),0], [0,0,1]]) R = np.dot(Rz, np.dot(Ry,Rx)) shape_pc = batch_data[k, ...] rotated_data[k, ...] = np.dot(shape_pc.reshape((-1, 3)), R) return rotated_data def jitter_point_cloud(batch_data, sigma=0.01, clip=0.05): """ Randomly jitter points. jittering is per point. Input: BxNx3 array, original batch of point clouds Return: BxNx3 array, jittered batch of point clouds """ B, N, C = batch_data.shape assert(clip > 0) jittered_data = np.clip(sigma * np.random.randn(B, N, C), -1*clip, clip) jittered_data += batch_data return jittered_data def shift_point_cloud(batch_data, shift_range=0.1): """ Randomly shift point cloud. Shift is per point cloud. Input: BxNx3 array, original batch of point clouds Return: BxNx3 array, shifted batch of point clouds """ B, N, C = batch_data.shape shifts = np.random.uniform(-shift_range, shift_range, (B,3)) for batch_index in range(B): batch_data[batch_index,:,:] += shifts[batch_index,:] return batch_data def random_scale_point_cloud(batch_data, scale_low=0.8, scale_high=1.25): """ Randomly scale the point cloud. Scale is per point cloud. 
Input: BxNx3 array, original batch of point clouds Return: BxNx3 array, scaled batch of point clouds """ B, N, C = batch_data.shape scales = np.random.uniform(scale_low, scale_high, B) for batch_index in range(B): batch_data[batch_index,:,:] *= scales[batch_index] return batch_data def getDataFiles(list_filename): return [line.rstrip() for line in open(list_filename)] def load_h5(h5_filename): f = h5py.File(h5_filename) data = f['data'][:] label = f['label'][:] return (data, label) def loadDataFile(filename): return load_h5(filename) ================================================ FILE: code/utils/show3d.py ================================================ ''' The default behavior is to visualize the points as white dots >>>show3d.showpoints(np.random.rand(10000,3)) Control: key q: quit key Q: sys.exit(0) key n: zoom in key m: zoom out key s: save screenshot to 'show3d.png' Mouse: rotate You can also play a video by specifying waittime >>>[show3d.showpoints(np.random.rand(10000,3),waittime=10) for i in xrange(10000)] Color can also be useful >>>green=np.linspace(0,1,10000) >>>red=np.linspace(1,0,10000) >>>blue=np.linspace(1,0,10000)**2 >>>show3d.showpoints(np.random.rand(10000,3),green,red,blue) Additional Parameters --------------------- normalizecolor: if True (default), scale the maximum color to 1 for each channel. magnifyBlue: if True, magnify the blue dots to make them more visible background: the background color. 
Defaults to black (0,0,0)
freezerot: disable rotation
'''
import numpy as np
# import cv2
import sys

showsz = 800
mousex, mousey = 0.5, 0.5
zoom = 1.0
changed = True

def onmouse(*args):
    global mousex, mousey, changed
    y = args[1]
    x = args[2]
    mousex = x / float(showsz)
    mousey = y / float(showsz)
    changed = True

def showpoints(xyz, c0=None, c1=None, c2=None, waittime=0, showrot=False, magnifyBlue=0,
               freezerot=False, background=(0, 0, 0), normalizecolor=True):
    # cv2.namedWindow('show3d')
    # cv2.moveWindow('show3d', 0, 0)
    # cv2.setMouseCallback('show3d', onmouse)
    global showsz, mousex, mousey, zoom, changed
    if len(xyz.shape) != 2 or xyz.shape[1] != 3:
        raise Exception('showpoints expects (n,3) shape for xyz')
    if c0 is not None and c0.shape != xyz.shape[:1]:
        raise Exception('showpoints expects (n,) shape for c0')
    if c1 is not None and c1.shape != xyz.shape[:1]:
        raise Exception('showpoints expects (n,) shape for c1')
    if c2 is not None and c2.shape != xyz.shape[:1]:
        raise Exception('showpoints expects (n,) shape for c2')
    xyz = xyz - xyz.mean(axis=0)
    radius = ((xyz ** 2).sum(axis=-1) ** 0.5).max()
    xyz /= (radius * 2.2) / showsz
    if c0 is None:
        c0 = np.zeros((len(xyz),), dtype='float32') + 255
    if c1 is None:
        c1 = c0
    if c2 is None:
        c2 = c0
    if normalizecolor:
        c0 = c0 / ((c0.max() + 1e-14) / 255.0)
        c1 = c1 / ((c1.max() + 1e-14) / 255.0)
        c2 = c2 / ((c2.max() + 1e-14) / 255.0)
    show = np.zeros((showsz, showsz, 3), dtype='uint8')

    def render():
        rotmat = np.eye(3)
        if not freezerot:
            xangle = (mousey - 0.5) * np.pi * 1.2
        else:
            xangle = 0
        rotmat = rotmat.dot(np.array([
            [1.0, 0.0, 0.0],
            [0.0, np.cos(xangle), -np.sin(xangle)],
            [0.0, np.sin(xangle), np.cos(xangle)],
        ]))
        if not freezerot:
            yangle = (mousex - 0.5) * np.pi * 1.2
        else:
            yangle = 0
        rotmat = rotmat.dot(np.array([
            [np.cos(yangle), 0.0, -np.sin(yangle)],
            [0.0, 1.0, 0.0],
            [np.sin(yangle), 0.0, np.cos(yangle)],
        ]))
        rotmat *= zoom
        nxyz = xyz.dot(rotmat)
        nz = nxyz[:, 2].argsort()
        nxyz = nxyz[nz]
        nxyz = (nxyz[:, :2] + [showsz / 2, showsz / 2]).astype('int32')
        p = nxyz[:, 0] * showsz + nxyz[:, 1]
        show[:] = background
        # in-bounds mask; the comparison operators were lost during extraction and
        # are reconstructed here (the code drawing the masked points is missing)
        m = (nxyz[:, 0] >= 0) * (nxyz[:, 0] < showsz) * (nxyz[:, 1] >= 0) * (nxyz[:, 1] < showsz)
        if magnifyBlue > 0:
            show[:, :, 0] = np.maximum(show[:, :, 0], np.roll(show[:, :, 0], 1, axis=0))
            if magnifyBlue >= 2:
                show[:, :, 0] = np.maximum(show[:, :, 0], np.roll(show[:, :, 0], -1, axis=0))
            show[:, :, 0] = np.maximum(show[:, :, 0], np.roll(show[:, :, 0], 1, axis=1))
            if magnifyBlue >= 2:
                show[:, :, 0] = np.maximum(show[:, :, 0], np.roll(show[:, :, 0], -1, axis=1))
        if showrot:
            print("Show")
            # cv2.putText(show,'xangle %d'%(int(xangle/np.pi*180)),(30,showsz-30),0,0.5,cv2.cv.CV_RGB(255,0,0))
            # cv2.putText(show,'yangle %d'%(int(yangle/np.pi*180)),(30,showsz-50),0,0.5,cv2.cv.CV_RGB(255,0,0))
            # cv2.putText(show,'zoom %d%%'%(int(zoom*100)),(30,showsz-70),0,0.5,cv2.cv.CV_RGB(255,0,0))

    changed = True
    cmd = -1  # key code; stays -1 while the cv2 event loop below is commented out
    while True:
        if changed:
            render()
            changed = False
        # cv2.imshow('show3d', show)
        if waittime == 0:
            # cmd = cv2.waitKey(10) % 256
            print("Some waittime small")
        else:
            print("Some waittime")
            # cmd = cv2.waitKey(waittime) % 256
        if cmd == ord('q'):
            break
        elif cmd == ord('Q'):
            sys.exit(0)
        if cmd == ord('n'):
            zoom *= 1.1
            changed = True
        elif cmd == ord('m'):
            zoom /= 1.1
            changed = True
        elif cmd == ord('r'):
            zoom = 1.0
            changed = True
        elif cmd == ord('s'):
            print("to show img")
            # cv2.imwrite('show3d.png', show)
        if waittime != 0:
            break
    return cmd

if __name__ == '__main__':
    showpoints(np.random.rand(10000, 3))
    green = np.linspace(0, 1, 10000)
    red = np.linspace(1, 0, 10000) ** 0.5
    blue = np.linspace(1, 0, 10000)
    showpoints(np.random.rand(10000, 3), green, red, blue, magnifyBlue=True)

================================================
FILE: code/utils/tf_util.py
================================================
""" Wrapper functions for TensorFlow layers.
Author: Charles R. Qi
Date: November 2017
"""

import numpy as np
import tensorflow as tf

def _variable_on_cpu(name, shape, initializer, use_fp16=False):
    """Helper to create a Variable stored on CPU memory.
Args: name: name of the variable shape: list of ints initializer: initializer for Variable Returns: Variable Tensor """ dtype = tf.float16 if use_fp16 else tf.float32 var = tf.get_variable(name, shape, initializer=initializer, dtype=dtype) return var def _variable_with_weight_decay(name, shape, stddev, wd, use_xavier=True): """Helper to create an initialized Variable with weight decay. Note that the Variable is initialized with a truncated normal distribution. A weight decay is added only if one is specified. Args: name: name of the variable shape: list of ints stddev: standard deviation of a truncated Gaussian wd: add L2Loss weight decay multiplied by this float. If None, weight decay is not added for this Variable. use_xavier: bool, whether to use xavier initializer Returns: Variable Tensor """ if use_xavier: initializer = tf.contrib.layers.xavier_initializer() else: initializer = tf.truncated_normal_initializer(stddev=stddev) var = _variable_on_cpu(name, shape, initializer) if wd is not None: weight_decay = tf.multiply(tf.nn.l2_loss(var), wd, name='weight_loss') tf.add_to_collection('losses', weight_decay) return var def conv1d(inputs, num_output_channels, kernel_size, scope, stride=1, padding='SAME', use_xavier=True, stddev=1e-3, weight_decay=0.0, activation_fn=tf.nn.relu, bn=False, bn_decay=None, is_training=None, use_bias = True): """ 1D convolution with non-linear operation. 
Args: inputs: 3-D tensor variable BxLxC num_output_channels: int kernel_size: int scope: string stride: int padding: 'SAME' or 'VALID' use_xavier: bool, use xavier_initializer if true stddev: float, stddev for truncated_normal init weight_decay: float activation_fn: function bn: bool, whether to use batch norm bn_decay: float or float tensor variable in [0,1] is_training: bool Tensor variable Returns: Variable tensor """ with tf.variable_scope(scope) as sc: num_in_channels = inputs.get_shape()[-1].value kernel_shape = [kernel_size, num_in_channels, num_output_channels] kernel = _variable_with_weight_decay('weights', shape=kernel_shape, use_xavier=use_xavier, stddev=stddev, wd=weight_decay) outputs = tf.nn.conv1d(inputs, kernel, stride=stride, padding=padding) if use_bias: biases = _variable_on_cpu('biases', [num_output_channels], tf.constant_initializer(0.0)) outputs = tf.nn.bias_add(outputs, biases) if bn: outputs = batch_norm_for_conv1d(outputs, is_training, bn_decay=bn_decay, scope='bn') if activation_fn is not None: outputs = activation_fn(outputs) return outputs def conv2d(inputs, num_output_channels, kernel_size, scope, stride=[1, 1], padding='SAME', use_xavier=True, stddev=1e-3, weight_decay=0.00001, activation_fn=tf.nn.relu, bn=False, bn_decay=None, is_training=None): """ 2D convolution with non-linear operation. 
Args: inputs: 4-D tensor variable BxHxWxC num_output_channels: int kernel_size: a list of 2 ints scope: string stride: a list of 2 ints padding: 'SAME' or 'VALID' use_xavier: bool, use xavier_initializer if true stddev: float, stddev for truncated_normal init weight_decay: float activation_fn: function bn: bool, whether to use batch norm bn_decay: float or float tensor variable in [0,1] is_training: bool Tensor variable Returns: Variable tensor """ with tf.variable_scope(scope) as sc: kernel_h, kernel_w = kernel_size num_in_channels = inputs.get_shape()[-1].value kernel_shape = [kernel_h, kernel_w, num_in_channels, num_output_channels] kernel = _variable_with_weight_decay('weights', shape=kernel_shape, use_xavier=use_xavier, stddev=stddev, wd=weight_decay) stride_h, stride_w = stride outputs = tf.nn.conv2d(inputs, kernel, [1, stride_h, stride_w, 1], padding=padding) biases = _variable_on_cpu('biases', [num_output_channels], tf.constant_initializer(0.0)) outputs = tf.nn.bias_add(outputs, biases) if bn: outputs = batch_norm_for_conv2d(outputs, is_training, bn_decay=bn_decay, scope='bn') if activation_fn is not None: outputs = activation_fn(outputs) return outputs def conv2d_transpose(inputs, num_output_channels, kernel_size, scope, stride=[1, 1], padding='SAME', use_xavier=True, stddev=1e-3, weight_decay=0.0, activation_fn=tf.nn.relu, bn=False, bn_decay=None, is_training=None): """ 2D convolution transpose with non-linear operation. 
Args: inputs: 4-D tensor variable BxHxWxC num_output_channels: int kernel_size: a list of 2 ints scope: string stride: a list of 2 ints padding: 'SAME' or 'VALID' use_xavier: bool, use xavier_initializer if true stddev: float, stddev for truncated_normal init weight_decay: float activation_fn: function bn: bool, whether to use batch norm bn_decay: float or float tensor variable in [0,1] is_training: bool Tensor variable Returns: Variable tensor Note: conv2d(conv2d_transpose(a, num_out, ksize, stride), a.shape[-1], ksize, stride) == a """ with tf.variable_scope(scope) as sc: kernel_h, kernel_w = kernel_size num_in_channels = inputs.get_shape()[-1].value kernel_shape = [kernel_h, kernel_w, num_output_channels, num_in_channels] # reversed to conv2d kernel = _variable_with_weight_decay('weights', shape=kernel_shape, use_xavier=use_xavier, stddev=stddev, wd=weight_decay) stride_h, stride_w = stride # from slim.convolution2d_transpose def get_deconv_dim(dim_size, stride_size, kernel_size, padding): dim_size *= stride_size if padding == 'VALID' and dim_size is not None: dim_size += max(kernel_size - stride_size, 0) return dim_size # caculate output shape batch_size = inputs.get_shape()[0].value height = inputs.get_shape()[1].value width = inputs.get_shape()[2].value out_height = get_deconv_dim(height, stride_h, kernel_h, padding) out_width = get_deconv_dim(width, stride_w, kernel_w, padding) output_shape = [batch_size, out_height, out_width, num_output_channels] outputs = tf.nn.conv2d_transpose(inputs, kernel, output_shape, [1, stride_h, stride_w, 1], padding=padding) biases = _variable_on_cpu('biases', [num_output_channels], tf.constant_initializer(0.0)) outputs = tf.nn.bias_add(outputs, biases) if bn: outputs = batch_norm_for_conv2d(outputs, is_training, bn_decay=bn_decay, scope='bn') if activation_fn is not None: outputs = activation_fn(outputs) return outputs def conv3d(inputs, num_output_channels, kernel_size, scope, stride=[1, 1, 1], padding='SAME', use_xavier=True, 
stddev=1e-3, weight_decay=0.0, activation_fn=tf.nn.relu, bn=False, bn_decay=None, is_training=None): """ 3D convolution with non-linear operation. Args: inputs: 5-D tensor variable BxDxHxWxC num_output_channels: int kernel_size: a list of 3 ints scope: string stride: a list of 3 ints padding: 'SAME' or 'VALID' use_xavier: bool, use xavier_initializer if true stddev: float, stddev for truncated_normal init weight_decay: float activation_fn: function bn: bool, whether to use batch norm bn_decay: float or float tensor variable in [0,1] is_training: bool Tensor variable Returns: Variable tensor """ with tf.variable_scope(scope) as sc: kernel_d, kernel_h, kernel_w = kernel_size num_in_channels = inputs.get_shape()[-1].value kernel_shape = [kernel_d, kernel_h, kernel_w, num_in_channels, num_output_channels] kernel = _variable_with_weight_decay('weights', shape=kernel_shape, use_xavier=use_xavier, stddev=stddev, wd=weight_decay) stride_d, stride_h, stride_w = stride outputs = tf.nn.conv3d(inputs, kernel, [1, stride_d, stride_h, stride_w, 1], padding=padding) biases = _variable_on_cpu('biases', [num_output_channels], tf.constant_initializer(0.0)) outputs = tf.nn.bias_add(outputs, biases) if bn: outputs = batch_norm_for_conv3d(outputs, is_training, bn_decay=bn_decay, scope='bn') if activation_fn is not None: outputs = activation_fn(outputs) return outputs def fully_connected(inputs, num_outputs, scope, use_xavier=True, stddev=1e-3, weight_decay=0.0, activation_fn=tf.nn.relu, bn=False, bn_decay=None, is_training=None): """ Fully connected layer with non-linear operation. Args: inputs: 2-D tensor BxN num_outputs: int Returns: Variable tensor of size B x num_outputs. 
""" with tf.variable_scope(scope) as sc: num_input_units = inputs.get_shape()[-1].value weights = _variable_with_weight_decay('weights', shape=[num_input_units, num_outputs], use_xavier=use_xavier, stddev=stddev, wd=weight_decay) outputs = tf.matmul(inputs, weights) biases = _variable_on_cpu('biases', [num_outputs], tf.constant_initializer(0.0)) outputs = tf.nn.bias_add(outputs, biases) if bn: outputs = batch_norm_for_fc(outputs, is_training, bn_decay, 'bn') if activation_fn is not None: outputs = activation_fn(outputs) return outputs def max_pool2d(inputs, kernel_size, scope, stride=[2, 2], padding='VALID'): """ 2D max pooling. Args: inputs: 4-D tensor BxHxWxC kernel_size: a list of 2 ints stride: a list of 2 ints Returns: Variable tensor """ with tf.variable_scope(scope) as sc: kernel_h, kernel_w = kernel_size stride_h, stride_w = stride outputs = tf.nn.max_pool(inputs, ksize=[1, kernel_h, kernel_w, 1], strides=[1, stride_h, stride_w, 1], padding=padding, name=sc.name) return outputs def avg_pool2d(inputs, kernel_size, scope, stride=[2, 2], padding='VALID'): """ 2D avg pooling. Args: inputs: 4-D tensor BxHxWxC kernel_size: a list of 2 ints stride: a list of 2 ints Returns: Variable tensor """ with tf.variable_scope(scope) as sc: kernel_h, kernel_w = kernel_size stride_h, stride_w = stride outputs = tf.nn.avg_pool(inputs, ksize=[1, kernel_h, kernel_w, 1], strides=[1, stride_h, stride_w, 1], padding=padding, name=sc.name) return outputs def max_pool3d(inputs, kernel_size, scope, stride=[2, 2, 2], padding='VALID'): """ 3D max pooling. 
Args: inputs: 5-D tensor BxDxHxWxC kernel_size: a list of 3 ints stride: a list of 3 ints Returns: Variable tensor """ with tf.variable_scope(scope) as sc: kernel_d, kernel_h, kernel_w = kernel_size stride_d, stride_h, stride_w = stride outputs = tf.nn.max_pool3d(inputs, ksize=[1, kernel_d, kernel_h, kernel_w, 1], strides=[1, stride_d, stride_h, stride_w, 1], padding=padding, name=sc.name) return outputs def avg_pool3d(inputs, kernel_size, scope, stride=[2, 2, 2], padding='VALID'): """ 3D avg pooling. Args: inputs: 5-D tensor BxDxHxWxC kernel_size: a list of 3 ints stride: a list of 3 ints Returns: Variable tensor """ with tf.variable_scope(scope) as sc: kernel_d, kernel_h, kernel_w = kernel_size stride_d, stride_h, stride_w = stride outputs = tf.nn.avg_pool3d(inputs, ksize=[1, kernel_d, kernel_h, kernel_w, 1], strides=[1, stride_d, stride_h, stride_w, 1], padding=padding, name=sc.name) return outputs def batch_norm_template(inputs, is_training, scope, moments_dims, bn_decay): """ Batch normalization on convolutional maps and beyond... Ref.: http://stackoverflow.com/questions/33949786/how-could-i-use-batch-normalization-in-tensorflow Args: inputs: Tensor, k-D input ... 
x C could be BC or BHWC or BDHWC is_training: boolean tf.Varialbe, true indicates training phase scope: string, variable scope moments_dims: a list of ints, indicating dimensions for moments calculation bn_decay: float or float tensor variable, controling moving average weight Return: normed: batch-normalized maps """ # For support of GAN #bn_decay = bn_decay if bn_decay is not None else 0.9 #return tf.contrib.layers.batch_norm(inputs, # center=True, scale=True, # is_training=is_training, decay=bn_decay,updates_collections=None, # scope=scope) with tf.variable_scope(scope) as sc: num_channels = inputs.get_shape()[-1].value beta = tf.Variable(tf.constant(0.0, shape=[num_channels]), name='beta', trainable=True) gamma = tf.Variable(tf.constant(1.0, shape=[num_channels]), name='gamma', trainable=True) batch_mean, batch_var = tf.nn.moments(inputs, moments_dims, name='moments') decay = bn_decay if bn_decay is not None else 0.9 ema = tf.train.ExponentialMovingAverage(decay=decay) # Operator that maintains moving averages of variables. # Need to set reuse=False, otherwise if reuse, will see moments_1/mean/ExponentialMovingAverage/ does not exist # https://github.com/shekkizh/WassersteinGAN.tensorflow/issues/3 with tf.variable_scope(tf.get_variable_scope(), reuse=False): ema_apply_op = tf.cond(is_training, lambda: ema.apply([batch_mean, batch_var]), lambda: tf.no_op()) # Update moving average and return current batch's avg and var. def mean_var_with_update(): with tf.control_dependencies([ema_apply_op]): return tf.identity(batch_mean), tf.identity(batch_var) # ema.average returns the Variable holding the average of var. mean, var = tf.cond(is_training, mean_var_with_update, lambda: (ema.average(batch_mean), ema.average(batch_var))) normed = tf.nn.batch_normalization(inputs, mean, var, beta, gamma, 1e-3) return normed def batch_norm_for_fc(inputs, is_training, bn_decay, scope): """ Batch normalization on FC data. 
Args: inputs: Tensor, 2D BxC input is_training: boolean tf.Varialbe, true indicates training phase bn_decay: float or float tensor variable, controling moving average weight scope: string, variable scope Return: normed: batch-normalized maps """ return batch_norm_template(inputs, is_training, scope, [0,], bn_decay) def batch_norm_for_conv1d(inputs, is_training, bn_decay, scope): """ Batch normalization on 1D convolutional maps. Args: inputs: Tensor, 3D BLC input maps is_training: boolean tf.Varialbe, true indicates training phase bn_decay: float or float tensor variable, controling moving average weight scope: string, variable scope Return: normed: batch-normalized maps """ return batch_norm_template(inputs, is_training, scope, [0,1], bn_decay) def batch_norm_for_conv2d(inputs, is_training, bn_decay, scope): """ Batch normalization on 2D convolutional maps. Args: inputs: Tensor, 4D BHWC input maps is_training: boolean tf.Varialbe, true indicates training phase bn_decay: float or float tensor variable, controling moving average weight scope: string, variable scope Return: normed: batch-normalized maps """ return batch_norm_template(inputs, is_training, scope, [0,1,2], bn_decay) def batch_norm_for_conv3d(inputs, is_training, bn_decay, scope): """ Batch normalization on 3D convolutional maps. Args: inputs: Tensor, 5D BDHWC input maps is_training: boolean tf.Varialbe, true indicates training phase bn_decay: float or float tensor variable, controling moving average weight scope: string, variable scope Return: normed: batch-normalized maps """ return batch_norm_template(inputs, is_training, scope, [0,1,2,3], bn_decay) def dropout(inputs, is_training, scope, keep_prob=0.5, noise_shape=None): """ Dropout layer. 
    Args:
      inputs: tensor
      is_training: boolean tf.Variable
      scope: string
      keep_prob: float in [0,1]
      noise_shape: list of ints

    Returns:
      tensor variable
    """
    with tf.variable_scope(scope) as sc:
        outputs = tf.cond(is_training,
                          lambda: tf.nn.dropout(inputs, keep_prob, noise_shape),
                          lambda: inputs)
        return outputs

================================================
FILE: code/utils/tf_util2.py
================================================
import tensorflow as tf

def lrelu(x, alpha=0.2):
    return tf.nn.relu(x) - alpha * tf.nn.relu(-x)

# def lrelu2(x, leak=0.2, name="lrelu"):
#     with tf.variable_scope(name):
#         f1 = 0.5 * (1 + leak)
#         f2 = 0.5 * (1 - leak)
#         return f1 * x + f2 * abs(x)

def instance_norm(net, train=True, weight_decay=0.00001):
    batch, rows, cols, channels = [i.value for i in net.get_shape()]
    var_shape = [channels]
    mu, sigma_sq = tf.nn.moments(net, [1, 2], keep_dims=True)
    shift = tf.get_variable('shift', shape=var_shape, initializer=tf.zeros_initializer,
                            regularizer=tf.contrib.layers.l2_regularizer(weight_decay))
    scale = tf.get_variable('scale', shape=var_shape, initializer=tf.ones_initializer,
                            regularizer=tf.contrib.layers.l2_regularizer(weight_decay))
    epsilon = 1e-3
    # normalize by the standard deviation, i.e. sqrt(var + eps), not its square
    normalized = (net - mu) / tf.sqrt(sigma_sq + epsilon)
    return scale * normalized + shift

def conv2d(inputs,
           num_output_channels,
           kernel_size,
           scope=None,
           stride=[1, 1],
           padding='SAME',
           use_xavier=True,
           stddev=1e-3,
           weight_decay=0.00001,
           activation_fn=tf.nn.relu,
           bn=False,
           ibn=False,
           bn_decay=None,
           use_bias=True,
           is_training=None,
           reuse=None):
    """ 2D convolution with non-linear operation.
    Args:
        inputs: 4-D tensor variable BxHxWxC
        num_output_channels: int
        kernel_size: a list of 2 ints
        scope: string
        stride: a list of 2 ints
        padding: 'SAME' or 'VALID'
        use_xavier: bool, use xavier_initializer if true
        stddev: float, stddev for truncated_normal init
        weight_decay: float
        activation_fn: function
        bn: bool, whether to use batch norm
        bn_decay: float or float tensor variable in [0,1]
        is_training: bool Tensor variable
    Returns:
        Variable tensor
    """
    with tf.variable_scope(scope, reuse=reuse) as sc:
        if use_xavier:
            initializer = tf.contrib.layers.xavier_initializer()
        else:
            initializer = tf.truncated_normal_initializer(stddev=stddev)
        outputs = tf.layers.conv2d(inputs, num_output_channels, kernel_size, stride, padding,
                                   kernel_initializer=initializer,
                                   kernel_regularizer=tf.contrib.layers.l2_regularizer(weight_decay),
                                   bias_regularizer=tf.contrib.layers.l2_regularizer(weight_decay),
                                   use_bias=use_bias, reuse=None)
        assert not (bn and ibn)
        if bn:
            outputs = tf.layers.batch_normalization(outputs, momentum=bn_decay,
                                                    training=is_training,
                                                    renorm=False, fused=True)
            # outputs = tf.contrib.layers.batch_norm(outputs, is_training=is_training)
        if ibn:
            outputs = instance_norm(outputs, is_training)
        if activation_fn is not None:
            outputs = activation_fn(outputs)
        return outputs


def fully_connected(inputs, num_outputs, scope, use_xavier=True, stddev=1e-3,
                    weight_decay=0.00001, activation_fn=tf.nn.relu, bn=False,
                    bn_decay=None, use_bias=True, is_training=None):
    """ Fully connected layer with non-linear operation.

    Args:
        inputs: 2-D tensor BxN
        num_outputs: int
    Returns:
        Variable tensor of size B x num_outputs.
    """
    with tf.variable_scope(scope) as sc:
        if use_xavier:
            initializer = tf.contrib.layers.xavier_initializer()
        else:
            initializer = tf.truncated_normal_initializer(stddev=stddev)
        outputs = tf.layers.dense(inputs, num_outputs,
                                  use_bias=use_bias, kernel_initializer=initializer,
                                  kernel_regularizer=tf.contrib.layers.l2_regularizer(weight_decay),
                                  bias_regularizer=tf.contrib.layers.l2_regularizer(weight_decay),
                                  reuse=None)
        if bn:
            outputs = tf.layers.batch_normalization(outputs, momentum=bn_decay,
                                                    training=is_training, renorm=False)
        if activation_fn is not None:
            outputs = activation_fn(outputs)
        return outputs

================================================
FILE: code/utils/write_result2html.py
================================================
import os

import numpy as np
from tqdm import tqdm

from utils import pc_util
from scipy.misc import imsave


def write_result():
    root_path = "/home/lqyu/server/proj49/PointSR_data/test_data/our_collected_data"
    model_names = ['1024_nonormal_generator2_2',
                   '1024_nonormal_generator2_2_uniformloss',
                   '1024_nonormal_generator2_2_recursive']

    index_path = os.path.join("index.html")
    index = open(index_path, "w")
    index.write("<html><body><table border='1'>")
    index.write("<tr>")
    index.write("<th>name</th>")
    for model in model_names:
        index.write("<th>%s</th>" % model)
    index.write("</tr>")

    # get sample list
    items = os.listdir(root_path + "/" + model_names[0])
    items.sort()

    # mkdir model image path
    for model in model_names:
        if not os.path.exists(root_path + "/" + model + "_three_view_img/"):
            os.makedirs(root_path + "/" + model + "_three_view_img/")

    # write img to file
    for item in tqdm(items):
        index.write("<tr>")
        index.write("<td>%s</td>" % item)
        # write prediction
        for model in model_names:
            path = root_path + "/" + model + "/" + item
            if not os.path.exists(path):
                continue
            img_path = root_path + "/" + model + "_three_view_img/" + item
            img_path = img_path.replace("xyz", "png")
            if not os.path.exists(img_path):
                data = np.loadtxt(path)
                data = data[:, 0:3]
                img = pc_util.point_cloud_three_views(data, diameter=8)
                imsave(img_path, img)
            index.write("<td><img src='%s'></td>" % img_path)
        index.write("</tr>")
    index.write("</table></body></html>")
    index.close()


def write_result2html_benchmark():
    root_path = "/home/lqyu/server/proj49/PointSR_data/test_data/our_collected_data"
    phase = 'surface_benchmark'
    input_path = "../data/" + phase + "/1024_nonuniform"
    gt_path = "../data/" + phase + "/4096"
    model_names = ['1024_nonormal_generator2_2',
                   '1024_nonormal_generator2_2_uniformloss',
                   '1024_nonormal_generator2_2_recursive']

    index_path = os.path.join(root_path, phase + "_index.html")
    index = open(index_path, "w")
    index.write("<html><body><table border='1'>")
    index.write("<tr>")
    index.write("<th>name</th><th>Input</th><th>Referred GT</th>")
    for model in model_names:
        index.write("<th>%s</th>" % model)
    index.write("</tr>")

    # get sample list
    items = os.listdir(root_path + "/" + model_names[0] + "/result/" + phase)
    items.sort()

    # mkdir model image path
    for model in model_names:
        if not os.path.exists(root_path + "/" + model + "/result/" + phase + "_three_view_img/"):
            os.makedirs(root_path + "/" + model + "/result/" + phase + "_three_view_img/")

    # write img to file
    for item in tqdm(items):
        index.write("<tr>")
        index.write("<td>%s</td>" % item)

        # write input image
        object = item.split("_")[0]
        id = item.split(".")[0]
        path = input_path + "/%s.xyz" % (id)
        img_path = input_path + "_three_view_img/%s.png" % (id)
        if not os.path.exists(input_path + "_three_view_img/"):
            os.makedirs(input_path + "_three_view_img/")
        if not os.path.exists(img_path):
            data = np.loadtxt(path)
            data = data[:, 0:3]
            img = pc_util.point_cloud_three_views(data, diameter=8)
            imsave(img_path, img)
        index.write("<td><img src='%s'></td>" % img_path)

        # write gt image
        path = gt_path + "/%s.xyz" % (id)
        img_path = gt_path + "_three_view_img/%s.png" % (id)
        if not os.path.exists(gt_path + "_three_view_img/"):
            os.makedirs(gt_path + "_three_view_img/")
        if not os.path.exists(img_path):
            data = np.loadtxt(path)
            data = data[:, 0:3]
            img = pc_util.point_cloud_three_views(data, diameter=8)
            imsave(img_path, img)
        index.write("<td><img src='%s'></td>" % img_path)

        # write prediction
        for model in model_names:
            path = root_path + "/" + model + "/result/" + phase + "/" + item
            if not os.path.exists(path):
                continue
            img_path = root_path + "/" + model + "/result/" + phase + "_three_view_img/" + item
            img_path = img_path.replace("xyz", "png")
            if not os.path.exists(img_path):
                data = np.loadtxt(path)
                data = data[:, 0:3]
                img = pc_util.point_cloud_three_views(data, diameter=8)
                imsave(img_path, img)
            index.write("<td><img src='%s'></td>" % img_path)
        index.write("</tr>")
    index.write("</table></body></html>")
    index.close()


def write_result2html_ModelNet():
    root_path = "../model"
    gt_path = "../data/ModelNet10_poisson_normal"
    # gt_path = "../data/Patches"
    model_names = ['1024_generator2_2',
                   'new_1024_generator2_2',
                   'new_1024_generator2_2_fixed_lr']
    phase = 'test'

    index_path = os.path.join(root_path, phase + "_index.html")
    index = open(index_path, "w")
    index.write("<html><body><table border='1'>")
    index.write("<tr>")
    index.write("<th>name</th><th>Input</th><th>Referred GT</th>")
    for model in model_names:
        index.write("<th>%s</th>" % model)
    index.write("</tr>")

    # get sample list
    items = os.listdir(root_path + "/" + model_names[0] + "/result/" + phase)
    items.sort()

    # mkdir model image path
    for model in model_names:
        if not os.path.exists(root_path + "/" + model + "/result/" + phase + "_three_view_img/"):
            os.makedirs(root_path + "/" + model + "/result/" + phase + "_three_view_img/")

    # write img to file
    for item in tqdm(items[::25]):
        index.write("<tr>")
        index.write("<td>%s</td>" % item)

        # write input image
        object = item.split("_")[0]
        id = item.split(".")[0]
        fixed = "%s/1024_nonuniform/%s" % (gt_path, 'train')
        path = fixed + "/%s.xyz" % (id)
        img_path = fixed + "_three_view_img/%s.png" % (id)
        if not os.path.exists(fixed + "_three_view_img/"):
            os.makedirs(fixed + "_three_view_img/")
        if not os.path.exists(img_path):
            data = np.loadtxt(path)
            data = data[:, 0:3]
            img = pc_util.point_cloud_three_views(data, diameter=8)
            imsave(img_path, img)
        index.write("<td><img src='%s'></td>" % img_path)

        # write gt image
        fixed = "%s/4096/%s" % (gt_path, 'train')
        path = fixed + "/%s.xyz" % (id)
        img_path = fixed + "_three_view_img/%s.png" % (id)
        if not os.path.exists(fixed + "_three_view_img/"):
            os.makedirs(fixed + "_three_view_img/")
        if not os.path.exists(img_path):
            data = np.loadtxt(path)
            data = data[:, 0:3]
            img = pc_util.point_cloud_three_views(data, diameter=8)
            imsave(img_path, img)
        index.write("<td><img src='%s'></td>" % img_path)

        # write prediction
        for model in model_names:
            path = root_path + "/" + model + "/result/" + phase + "/" + item
            if not os.path.exists(path):
                continue
            img_path = root_path + "/" + model + "/result/" + phase + "_three_view_img/" + item
            img_path = img_path.replace("xyz", "png")
            if not os.path.exists(img_path):
                data = np.loadtxt(path)
                data = data[:, 0:3]
                img = pc_util.point_cloud_three_views(data, diameter=8)
                imsave(img_path, img)
            index.write("<td><img src='%s'></td>" % img_path)
        index.write("</tr>")
    index.write("</table></body></html>")
    index.close()


if __name__ == '__main__':
    write_result2html_ModelNet()
    # write_result2html_benchmark()
    # calculate_emd_error('ModelNet40')

================================================
FILE: docs/.nojekyll
================================================

================================================
FILE: docs/_site/css/main.css
================================================
body { margin: 60px auto; width: 70%; } a { text-decoration: none; color: #999; } a:hover { text-decoration: underline; } p, ul { font-size: 1.5em; line-height: 1.4em; color: #333; } h1, h2, h3, h4 { font-family: 'Helvetica', 'Arial', 'Sans-Serif'; } h1 { font-size: 3em; } h2 { font-size: 2.7em; } h3 { font-size: 2.3em; } h4 { font-size: 1.9em; } nav ul, footer ul { font-size: 1em; font-family: 'Helvetica', 'Arial', 'Sans-Serif'; padding: 0px; list-style: none; font-weight: bold; } nav ul li, footer ul li { display: inline; margin-right: 20px; } footer { border-top: 1px solid #d5d5d5; font-size: .8em; } /* Blog */ ul.posts { margin: 20px auto 40px; font-size: 1.5em; } ul.posts li { list-style: none; } /* CV */ .cv { margin: 0px 0px 60px; } .cv h1 { font-size: 3em; } .cv h2 { font-size: 2em; } .cv address, .cv .download { font-family: 'Helvetica', 'Arial', 'Sans-Serif'; } .cv address, .cv p { font-size: 1.2em; } .cv .download { float: right; font-size: .8em; text-transform: uppercase; }
================================================
FILE: docs/_site/css/project.css
================================================
@import url('https://fonts.googleapis.com/css?family=Lato:400,400i,700,900&subset=latin-ext'); body { font-family: "Lato", sans-serif; color: #322f30; background: #fff; -webkit-font-smoothing: antialiased; margin: 10px auto; width: 70%; } a { text-decoration: none; color: #001f3f; } a:link, a:visited{ color: #39CCCC; /* text-decoration: underline; */ } a:hover { text-decoration: underline; color: #001f3f; } a.paper { text-decoration: none; color: #001f3f; } a.paper:link, a.paper:visited{ color: #322f30; /* text-decoration: underline;
*/ } a.paper:hover { text-decoration: underline; color: #39CCCC; } anchor{ line-height: 0; font-size: 0; color: transparent; } /*p, ul { font-size: 1.5em; line-height: 1.4em; color: #fff; }*/ p { font-size: 1.25em; line-height: 1.2em; color: #322f30; font-weight: 200; text-align: justify;} p2{ font-size: 1.0em; line-height: 0.9em; color: #322f30; font-weight: 200; text-align: justify; margin-right: 20px} ul { font-size: 1.5em; line-height: 1.4em; color: #322f30;} h2, h3, h4 { font-family: 'Lato', 'Sans-Serif'; font-weight:300; } /* h1 { font-size: 3em; } */ h2 { font-size: 2.7em; } h3 { font-size: 2.3em; margin-top: 0.8em; margin-bottom: 0.8em; text-align: center;} h4 { font-size: 1.9em; margin-top: 0.8em; margin-bottom: 0.8em; text-align: center;} papertitle { font-family: 'Lato', Verdana, Helvetica, sans-serif; font-size: 14px; font-weight: 700 } /* video { width: 100% !important; height: auto !important; margin: 0 auto; display: block; } */ hr{border: 0px; height: 1px; background-image: linear-gradient(to right, rgba(0, 0, 0, 0), rgba(0, 0, 0, 0.75), rgba(0, 0, 0, 0)); } .center { padding-top: 20px; padding-bottom: 20px; width: 90% !important; height: 90% !important; margin: 0 auto; display: block; } h5{ font-family: 'Lato', 'Sans-Serif';font-weight:300; font-size: 3em; text-align: center; margin-top: 0.8em; margin-bottom: 0.8em;} h1{ font-family: 'Lato', 'Sans-Serif';font-weight:400; font-size: 3em; text-align: center; margin-top: 0.8em; margin-bottom: 0.8em;} /*nav ul{ font-size: 1.5em; font-family: 'Lato', 'Arial', 'Sans-Serif'; padding: 20px; list-style: none; text-align: center;}*/ nav ul{ font-size: 1.5em; font-family: 'Lato', 'Arial', 'Sans-Serif'; padding: 20px;list-style: none; text-align: center; margin-top: 10px;} footer ul { font-size: 1.5em; font-family: 'Lato', 'Arial', 'Sans-Serif'; padding: 0px; list-style: none;} /*nav ul li{ display: inline; margin-right: 20px; padding: 20px;}*/ nav ul li{ display: inline; padding: 20px;} /*footer div ul li{ 
display: inline; margin-right: 20px; text-align: left;}*/ footer div ul li{ display: inline; margin-right: 20px; text-align: left;} footer { border-top: 1px solid #d5d5d5; font-size: .8em; } /* Blog */ ul.posts { margin: 20px auto 40px; font-size: 1.5em; } ul.posts li { list-style: none; } /* CV */ .cv { margin: 0px 0px 60px; } .cv h1 { font-size: 3em; } .cv h2 { font-size: 2em; } .cv address, .cv .download { font-family: 'Lato', 'Arial', 'Sans-Serif'; } .cv address, .cv p { font-size: 1.2em; } .cv .download { float: right; font-size: .8em; text-transform: uppercase; } img.center_thumbnail{ margin: 0 auto; display: block; } img { display: inline; margin: 0 auto; height: 100%; width: auto; } ================================================ FILE: docs/_site/css/project_file.css ================================================ @import url(https://cdnjs.cloudflare.com/ajax/libs/font-awesome/4.6.3/css/font-awesome.css); /*body { font-family: "Lato", sans-serif; color: #fff; background: #322f30; -webkit-font-smoothing: antialiased; margin: 20px auto; width: 70%; }*/ body { font-family: "Lato", sans-serif; color: #fff; background: #322f30; -webkit-font-smoothing: antialiased; margin: 10px auto; width: 70%; } a { text-decoration: none; color: #fff; } a:link, a:visited{ color: #7fdbff; /*text-decoration: underline;*/ } a:hover { text-decoration: underline; } /*p, ul { font-size: 1.5em; line-height: 1.4em; color: #fff; }*/ p { font-size: 1.25em; line-height: 1.2em; color: #fff; font-weight: 200; text-align: justify;} ul { font-size: 1.5em; line-height: 1.4em; color: #fff;} h1, h2, h3, h4 { font-family: 'Lato', 'Sans-Serif'; font-weight:300; } h1 { font-size: 3em; } h2 { font-size: 2.7em; } h3 { font-size: 2.3em; } h4 { font-size: 1.9em; } h5{ font-family: 'Lato', 'Sans-Serif';font-weight:300; font-size: 3em; text-align: center; margin-top: 0.8em; margin-bottom: 0.8em;} /*nav ul{ font-size: 1.5em; font-family: 'Lato', 'Arial', 'Sans-Serif'; padding: 20px; list-style: none; 
text-align: center;}*/ nav ul{ font-size: 1.5em; font-family: 'Lato', 'Arial', 'Sans-Serif'; padding: 20px;list-style: none; text-align: center; margin-top: 10px;} footer ul { font-size: 1.5em; font-family: 'Lato', 'Arial', 'Sans-Serif'; padding: 0px; list-style: none;} /*nav ul li{ display: inline; margin-right: 20px; padding: 20px;}*/ nav ul li{ display: inline; padding: 20px;} /*footer div ul li{ display: inline; margin-right: 20px; text-align: left;}*/ footer div ul li{ display: inline; margin-right: 20px; text-align: left;} footer { border-top: 1px solid #d5d5d5; font-size: .8em; } ================================================ FILE: docs/_site/index.html ================================================ Ganesh Iyer

CalibNet: Self-Supervised Extrinsic Calibration using 3D Spatial Transformer Networks

Paper Teaser

3D LiDARs and 2D cameras are increasingly being used alongside each other in sensor rigs for perception tasks. Before these sensors can be used to gather meaningful data, however, their extrinsics (and intrinsics) need to be accurately calibrated, as the performance of the sensor rig is extremely sensitive to these calibration parameters. The vast majority of existing calibration techniques require significant amounts of data and/or calibration targets and human effort, severely impacting their applicability in large-scale production systems. We address this gap with CalibNet: a self-supervised deep network capable of automatically estimating the 6-DoF rigid body transformation between a 3D LiDAR and a 2D camera in real time. CalibNet alleviates the need for calibration targets, thereby resulting in significant savings in calibration effort. During training, the network takes as input only a LiDAR point cloud, the corresponding monocular image, and the camera calibration matrix K. At train time, we do not impose direct supervision (i.e., we do not directly regress to the calibration parameters). Instead, we train the network to predict calibration parameters that maximize the geometric and photometric consistency of the input images and point clouds. CalibNet learns to iteratively solve the underlying geometric problem and accurately predicts extrinsic calibration parameters for a wide range of mis-calibrations, without requiring retraining or domain adaptation.
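The two geometric operations at the heart of this objective are mapping a predicted 6-DoF vector onto a rigid-body transform and projecting the transformed LiDAR points through the intrinsics K. A minimal NumPy sketch of both, for illustration only: it is not the network's TensorFlow pipeline, `se3_exp` and `project_lidar_to_image` are hypothetical names, and the translation part applies a small-angle simplification of the full SE(3) exponential.

```python
import numpy as np

def se3_exp(xi):
    """Map a 6-vector xi = (v, w) to a 4x4 rigid-body transform.
    w is an axis-angle rotation (Rodrigues' formula); v is used
    directly as the translation (small-angle simplification)."""
    v, w = xi[:3], xi[3:]
    theta = np.linalg.norm(w)
    K = np.array([[0.0, -w[2], w[1]],
                  [w[2], 0.0, -w[0]],
                  [-w[1], w[0], 0.0]])   # skew-symmetric matrix of w
    if theta < 1e-8:
        R = np.eye(3)
    else:
        R = (np.eye(3) + (np.sin(theta) / theta) * K
             + ((1 - np.cos(theta)) / theta ** 2) * (K @ K))
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = v
    return T

def project_lidar_to_image(points, T, K_cam, h, w):
    """Transform Nx3 LiDAR points by T, project them with pinhole
    intrinsics K_cam, and return a sparse HxW depth map."""
    pts_h = np.hstack([points, np.ones((len(points), 1))])
    cam = (T @ pts_h.T)[:3]          # points in the camera frame
    front = cam[2] > 0               # keep points in front of the camera
    uvw = K_cam @ cam[:, front]
    u = (uvw[0] / uvw[2]).astype(int)
    v = (uvw[1] / uvw[2]).astype(int)
    z = uvw[2]
    depth = np.zeros((h, w))
    ok = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    depth[v[ok], u[ok]] = z[ok]
    return depth
```

During training, a depth map rendered this way from the (mis-)calibrated cloud can be compared against the target depth to form the geometric consistency signal.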


Video


Code

Coming Soon!


Paper

Paper Thumbnail

Coming Soon!

================================================ FILE: docs/css/main.css ================================================ body { margin: 60px auto; width: 70%; } a { text-decoration: none; color: #999; } a:hover { text-decoration: underline; } p, ul { font-size: 1.5em; line-height: 1.4em; color: #333; } h1, h2, h3, h4 { font-family: 'Helvetica', 'Arial', 'Sans-Serif'; } h1 { font-size: 3em; } h2 { font-size: 2.7em; } h3 { font-size: 2.3em; } h4 { font-size: 1.9em; } nav ul, footer ul { font-size: 1em; font-family: 'Helvetica', 'Arial', 'Sans-Serif'; padding: 0px; list-style: none; font-weight: bold; } nav ul li, footer ul li { display: inline; margin-right: 20px; } footer { border-top: 1px solid #d5d5d5; font-size: .8em; } /* Blog */ ul.posts { margin: 20px auto 40px; font-size: 1.5em; } ul.posts li { list-style: none; } /* CV */ .cv { margin: 0px 0px 60px; } .cv h1 { font-size: 3em; } .cv h2 { font-size: 2em; } .cv address, .cv .download { font-family: 'Helvetica', 'Arial', 'Sans-Serif'; } .cv address, .cv p { font-size: 1.2em; } .cv .download { float: right; font-size: .8em; text-transform: uppercase; } ================================================ FILE: docs/css/project.css ================================================ @import url('https://fonts.googleapis.com/css?family=Lato:400,400i,700,900&subset=latin-ext'); body { font-family: "Lato", sans-serif; color: #322f30; background: #fff; -webkit-font-smoothing: antialiased; margin: 10px auto; width: 70%; } a { text-decoration: none; color: #001f3f; } a:link, a:visited{ color: #39CCCC; /* text-decoration: underline; */ } a:hover { text-decoration: underline; color: #001f3f; } a.paper { text-decoration: none; color: #001f3f; } a.paper:link, a.paper:visited{ color: #322f30; /* text-decoration: underline; */ } a.paper:hover { text-decoration: underline; color: #39CCCC; } anchor{ line-height: 0; font-size: 0; color: transparent; } /*p, ul { font-size: 1.5em; line-height: 1.4em; color: #fff; }*/ p {
font-size: 1.25em; line-height: 1.2em; color: #322f30; font-weight: 200; text-align: justify;} p2{ font-size: 1.0em; line-height: 0.9em; color: #322f30; font-weight: 200; text-align: justify; margin-right: 20px} ul { font-size: 1.5em; line-height: 1.4em; color: #322f30;} h2, h3, h4 { font-family: 'Lato', 'Sans-Serif'; font-weight:300; } /* h1 { font-size: 3em; } */ h2 { font-size: 2.7em; } h3 { font-size: 2.3em; margin-top: 0.8em; margin-bottom: 0.8em; text-align: center;} h4 { font-size: 1.9em; margin-top: 0.8em; margin-bottom: 0.8em; text-align: center;} papertitle { font-family: 'Lato', Verdana, Helvetica, sans-serif; font-size: 14px; font-weight: 700 } /* video { width: 100% !important; height: auto !important; margin: 0 auto; display: block; } */ hr{border: 0px; height: 1px; background-image: linear-gradient(to right, rgba(0, 0, 0, 0), rgba(0, 0, 0, 0.75), rgba(0, 0, 0, 0)); } .center { padding-top: 20px; padding-bottom: 20px; width: 90% !important; height: 90% !important; margin: 0 auto; display: block; } h5{ font-family: 'Lato', 'Sans-Serif';font-weight:300; font-size: 3em; text-align: center; margin-top: 0.8em; margin-bottom: 0.8em;} h1{ font-family: 'Lato', 'Sans-Serif';font-weight:400; font-size: 3em; text-align: center; margin-top: 0.8em; margin-bottom: 0.8em;} /*nav ul{ font-size: 1.5em; font-family: 'Lato', 'Arial', 'Sans-Serif'; padding: 20px; list-style: none; text-align: center;}*/ nav ul{ font-size: 1.5em; font-family: 'Lato', 'Arial', 'Sans-Serif'; padding: 20px;list-style: none; text-align: center; margin-top: 10px;} footer ul { font-size: 1.5em; font-family: 'Lato', 'Arial', 'Sans-Serif'; padding: 0px; list-style: none;} /*nav ul li{ display: inline; margin-right: 20px; padding: 20px;}*/ nav ul li{ display: inline; padding: 20px;} /*footer div ul li{ display: inline; margin-right: 20px; text-align: left;}*/ footer div ul li{ display: inline; margin-right: 20px; text-align: left;} footer { border-top: 1px solid #d5d5d5; font-size: .8em; } /* Blog 
*/ ul.posts { margin: 20px auto 40px; font-size: 1.5em; } ul.posts li { list-style: none; } /* CV */ .cv { margin: 0px 0px 60px; } .cv h1 { font-size: 3em; } .cv h2 { font-size: 2em; } .cv address, .cv .download { font-family: 'Lato', 'Arial', 'Sans-Serif'; } .cv address, .cv p { font-size: 1.2em; } .cv .download { float: right; font-size: .8em; text-transform: uppercase; } img.center_thumbnail{ margin: 0 auto; display: block; } img { display: inline; margin: 0 auto; height: 100%; width: auto; } ================================================ FILE: docs/css/project_file.css ================================================ @import url(https://cdnjs.cloudflare.com/ajax/libs/font-awesome/4.6.3/css/font-awesome.css); /*body { font-family: "Lato", sans-serif; color: #fff; background: #322f30; -webkit-font-smoothing: antialiased; margin: 20px auto; width: 70%; }*/ body { font-family: "Lato", sans-serif; color: #fff; background: #322f30; -webkit-font-smoothing: antialiased; margin: 10px auto; width: 70%; } a { text-decoration: none; color: #fff; } a:link, a:visited{ color: #7fdbff; /*text-decoration: underline;*/ } a:hover { text-decoration: underline; } /*p, ul { font-size: 1.5em; line-height: 1.4em; color: #fff; }*/ p { font-size: 1.25em; line-height: 1.2em; color: #fff; font-weight: 200; text-align: justify;} ul { font-size: 1.5em; line-height: 1.4em; color: #fff;} h1, h2, h3, h4 { font-family: 'Lato', 'Sans-Serif'; font-weight:300; } h1 { font-size: 3em; } h2 { font-size: 2.7em; } h3 { font-size: 2.3em; } h4 { font-size: 1.9em; } h5{ font-family: 'Lato', 'Sans-Serif';font-weight:300; font-size: 3em; text-align: center; margin-top: 0.8em; margin-bottom: 0.8em;} /*nav ul{ font-size: 1.5em; font-family: 'Lato', 'Arial', 'Sans-Serif'; padding: 20px; list-style: none; text-align: center;}*/ nav ul{ font-size: 1.5em; font-family: 'Lato', 'Arial', 'Sans-Serif'; padding: 20px;list-style: none; text-align: center; margin-top: 10px;} footer ul { font-size: 1.5em; font-family: 
'Lato', 'Arial', 'Sans-Serif'; padding: 0px; list-style: none;} /*nav ul li{ display: inline; margin-right: 20px; padding: 20px;}*/ nav ul li{ display: inline; padding: 20px;} /*footer div ul li{ display: inline; margin-right: 20px; text-align: left;}*/ footer div ul li{ display: inline; margin-right: 20px; text-align: left;} footer { border-top: 1px solid #d5d5d5; font-size: .8em; } ================================================ FILE: docs/index.html ================================================ CalibNet

CalibNet: Self-Supervised Extrinsic Calibration using 3D Spatial Transformer Networks

Paper Teaser

3D LiDARs and 2D cameras are increasingly being used alongside each other in sensor rigs for perception tasks. Before these sensors can be used to gather meaningful data, however, their extrinsics (and intrinsics) need to be accurately calibrated, as the performance of the sensor rig is extremely sensitive to these calibration parameters. The vast majority of existing calibration techniques require significant amounts of data and/or calibration targets and human effort, severely impacting their applicability in large-scale production systems. We address this gap with CalibNet: a self-supervised deep network capable of automatically estimating the 6-DoF rigid body transformation between a 3D LiDAR and a 2D camera in real time. CalibNet alleviates the need for calibration targets, thereby resulting in significant savings in calibration effort. During training, the network takes as input only a LiDAR point cloud, the corresponding monocular image, and the camera calibration matrix K. At train time, we do not impose direct supervision (i.e., we do not directly regress to the calibration parameters). Instead, we train the network to predict calibration parameters that maximize the geometric and photometric consistency of the input images and point clouds. CalibNet learns to iteratively solve the underlying geometric problem and accurately predicts extrinsic calibration parameters for a wide range of mis-calibrations, without requiring retraining or domain adaptation.
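The phrase "learns to iteratively solve the underlying geometric problem" can be pictured as repeated composition of small corrective transforms onto a running extrinsic estimate. A toy NumPy sketch of that loop, with a hand-written predictor standing in for the network (`refine_extrinsics` and `toy_step` are illustrative names, not part of the released code):

```python
import numpy as np

def refine_extrinsics(predict_correction, T_init, n_iter=10):
    """Iterative refinement: each step predicts a small corrective 4x4
    transform from the current alignment and composes it onto the
    running estimate."""
    T = T_init.copy()
    for _ in range(n_iter):
        T = predict_correction(T) @ T
    return T

# Toy predictor: nudges the translation halfway toward a known target
# each step, so the estimate converges geometrically.
target_t = np.array([1.0, -0.5, 0.25])

def toy_step(T):
    delta = np.eye(4)
    delta[:3, 3] = 0.5 * (target_t - T[:3, 3])
    return delta

T_est = refine_extrinsics(toy_step, np.eye(4))
```

In CalibNet the corrective step comes from the network itself, driven by the geometric and photometric consistency losses rather than a known target.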


Video


Paper

Paper Thumbnail

For the Tensorflow implementation

Check out the Code!
