[
  {
    "path": "CMakeLists.txt",
    "content": "cmake_minimum_required(VERSION 2.8.3)\nproject(locnerf)\n\nfind_package(catkin REQUIRED COMPONENTS\n  geometry_msgs\n  sensor_msgs\n  std_msgs\n  rospy\n)\n\n#set(cv_bridge_DIR /usr/local/share/cv_bridge/cmake)\n\ncatkin_python_setup()\n\ncatkin_package()\n\ninclude_directories(\n  ${catkin_INCLUDE_DIRS}\n)\n\ncatkin_install_python(PROGRAMS src/nav_node.py\n  DESTINATION ${CATKIN_PACKAGE_BIN_DESTINATION}\n)\n"
  },
  {
    "path": "LICENSE",
    "content": "MIT License\n\nCopyright (c) 2022 Massachusetts Institute of Technology\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the \"Software\"), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n"
  },
  {
    "path": "README.md",
    "content": "# Loc-NeRF\n\nMonte Carlo Localization using Neural Radiance Fields. \n\n## Coordinate Frames\nTo be consistent throughout the code and in the yaml files, we define coordinates using the camera frame commonly used for NeRF (x right, y up, z inward from the perspective of the camera) unless stated otherwise. Coordinates are FROM Camera TO World unless otherwise stated. Note this is not the same as the more common camera frame used in robotics (x right, y down, z outward).\n\n## Publications\n\nIf you find this code relevant for your work, please consider citing our paper:\n\n[Loc-NeRF: Monte Carlo Localization using Neural Radiance Fields](https://arxiv.org/abs/2209.09050)\n\nThis work was done in collaboration with MIT and Draper Labs and was partially funded \nby the NASA Flight Opportunities under grant Nos 80NSSC21K0348, ARL DCIST CRA W911NF-17-2-0181, and an Amazon Research Award.\n\n# 1. Installation\n\n## A. Prerequisities\n\n- Install ROS by following [ROS website](http://wiki.ros.org/ROS/Installation).\n\n- If you want to use VIO for the predict step for a real robot demo, install a VIO such as \n[VINS-Fusion](https://github.com/HKUST-Aerial-Robotics/VINS-Fusion) or [Kimera](https://github.com/MIT-SPARK/Kimera-VIO-ROS).\n\n# 2. Loc-NeRF installation\n```bash\n# Setup catkin workspace\nmkdir -p ~/catkin_ws/src\ncd ~/catkin_ws/\ncatkin init\n\n# Clone repo\ncd ~/catkin_ws/src\ngit clone https://github.com/MIT-SPARK/Loc-NeRF\n\n# Compile code\ncatkin build\n\n# source workspace\nsource ~/catkin_ws/devel/setup.bash\n\n#install dependencies:\ncd ~/catkin_ws/src/Loc-NeRF\npip install -r requirements.txt\n```\n\n## Starting Loc-NeRF\nWe will use ROS and rviz as a structure for running Loc-NeRF and for visualizing performance. \nAs a general good practice, remember to source your workspace for each terminal you use.\n\n  1. Open a new terminal and run: `roscore`\n\n  2. In another terminal, launch Loc-NeRF:\n  ```bash\n  roslaunch locnerf navigate.launch parameter_file:=<param_file.yaml>\n  ```\n\n  3. In another terminal, launch rviz for visualization:\n  ```bash\n  rviz -d $(rospack find locnerf)/rviz/rviz.rviz\n  ```\n\n  4. If you are not running with a rosbag, i.e. you are using LLFF data, then Loc-NeRF should start and you should be set. If you are using a rosbag, continue to the next steps.\n\n  5. In another terminal launch VIO\n\n  6. Finally, in another terminal, play your rosbag:\n  ```bash\n  rosbag play /PATH/TO/ROSBAG\n  ```\n\n## Provided config files\nWe provide three yaml files in /cfg to get you started. \n\n```jackal.yaml``` is setup to run a real-time demo. The predict step runs at the rate of available prediction poses (for which we used VIO). The update step processes incoming images as fast as computation limits will allow (in our case about 2.5 Hz) and discards all other images to prevent large latency.\n\n```llff.yaml``` runs Loc-NeRF on the LLFF dataset as described in our paper.\n\n```llff_global.yaml``` runs Loc-NeRF on the LLFF dataset with a wider spread of particles to test the ability to perform global localization as described in our paper.\n\n# 3. Usage\n\nCurrently we provide example code to run Loc-NeRF on two types of data: running on LLFF data to compare with iNeRF (cite), and running a real-time \ndemo with custom data. For both we use NeRF-Pytorch (cite) as our NeRF map.\n\nThe fastest way to start running Loc-NeRF is to download LLFF data with pre-trained NeRF weights. 
\n## Publications\n\nIf you find this code relevant for your work, please consider citing our paper:\n\n[Loc-NeRF: Monte Carlo Localization using Neural Radiance Fields](https://arxiv.org/abs/2209.09050)\n\nThis work was done in collaboration with MIT and Draper Labs and was partially funded by NASA Flight Opportunities grant No. 80NSSC21K0348, ARL DCIST CRA W911NF-17-2-0181, and an Amazon Research Award.\n\n# 1. Installation\n\n## A. Prerequisites\n\n- Install ROS by following the [ROS website](http://wiki.ros.org/ROS/Installation).\n\n- If you want to use VIO for the predict step for a real robot demo, install a VIO such as [VINS-Fusion](https://github.com/HKUST-Aerial-Robotics/VINS-Fusion) or [Kimera](https://github.com/MIT-SPARK/Kimera-VIO-ROS).\n\n# 2. Loc-NeRF installation\n```bash\n# Set up catkin workspace\nmkdir -p ~/catkin_ws/src\ncd ~/catkin_ws/\ncatkin init\n\n# Clone repo\ncd ~/catkin_ws/src\ngit clone https://github.com/MIT-SPARK/Loc-NeRF\n\n# Compile code\ncatkin build\n\n# Source workspace\nsource ~/catkin_ws/devel/setup.bash\n\n# Install dependencies\ncd ~/catkin_ws/src/Loc-NeRF\npip install -r requirements.txt\n```\n\n## Starting Loc-NeRF\nWe will use ROS and rviz as a structure for running Loc-NeRF and for visualizing performance. As a general good practice, remember to source your workspace in each terminal you use.\n\n  1. Open a new terminal and run: `roscore`\n\n  2. In another terminal, launch Loc-NeRF:\n  ```bash\n  roslaunch locnerf navigate.launch parameter_file:=<param_file.yaml>\n  ```\n\n  3. In another terminal, launch rviz for visualization:\n  ```bash\n  rviz -d $(rospack find locnerf)/rviz/rviz.rviz\n  ```\n\n  4. If you are not running with a rosbag, i.e., you are using LLFF data, then Loc-NeRF should start and you should be set. If you are using a rosbag, continue to the next steps.\n\n  5. In another terminal, launch your VIO.\n\n  6. Finally, in another terminal, play your rosbag:\n  ```bash\n  rosbag play /PATH/TO/ROSBAG\n  ```\n\n## Provided config files\nWe provide three yaml files in ```cfg/``` to get you started.\n\n```jackal.yaml``` is set up to run a real-time demo. The predict step runs at the rate of available prediction poses (for which we used VIO). The update step processes incoming images as fast as computation limits allow (in our case about 2.5 Hz) and discards all other images to prevent large latency.\n\n```llff.yaml``` runs Loc-NeRF on the LLFF dataset as described in our paper.\n\n```llff_global.yaml``` runs Loc-NeRF on the LLFF dataset with a wider spread of particles to test the ability to perform global localization as described in our paper.\n\n# 3. Usage\n\nCurrently we provide example code to run Loc-NeRF on two types of data: running on LLFF data to compare with [iNeRF](https://github.com/salykovaa/inerf), and running a real-time demo with custom data. For both we use [NeRF-Pytorch](https://github.com/yenchenlin/nerf-pytorch) as our NeRF map.\n\nThe fastest way to start running Loc-NeRF is to download LLFF data with pre-trained NeRF weights. We also provide instructions for running a demo on a real robot, which requires training a NeRF with metric scaled poses.\n\n## Using LLFF data\n\nDownload LLFF images and pretrained NeRF-Pytorch weights from [NeRF-Pytorch](https://github.com/yenchenlin/nerf-pytorch). If you download our fork of [iNeRF](https://github.com/Dominic101/inerf), the configs and ckpts folders will already be set up correctly with the pre-trained weights, and you just need to add the data folder from NeRF-Pytorch.\n\nPlace data using the following structure:\n\n```\n├── configs\n│   ├── ...\n├── ckpts\n│   │   ├── fern\n│   │   │   └── fern.tar\n│   │   ├── fortress\n│   │   │   └── fortress.tar\n│   │   ├── horns\n│   │   │   └── horns.tar\n│   │   ├── room\n│   │   │   └── room.tar\n│   │   └── ...\n├── data\n│   ├── nerf_llff_data\n│   │   └── fern  # downloaded llff dataset\n│   │   └── fortress  # downloaded llff dataset\n│   │   └── horns   # downloaded llff dataset\n│   │   └── room   # downloaded llff dataset\n│   │   └── ...\n```\n\nAfter updating your yaml file ```llff.yaml``` with the directory where you placed the data and any other params you want to change, you are ready to run Loc-NeRF! By default, Loc-NeRF will estimate the camera pose of 5 random images from each of fern, fortress, horns, and room (20 images in total). You can use rviz for real-time visualization of the ground truth pose and the particles.\n\n### Plotting results\n\nIf you log results from Loc-NeRF, we provide code to plot the position and rotation error inside ```tools/eval_logs.py```. The script also contains code to automatically run iNeRF using the same initial conditions that Loc-NeRF used and plot the results of iNeRF.\n\nIf you don't want to run iNeRF, running the plotting code is as easy as changing a few parameters at the top of ```tools/eval_logs.py``` and then running ```python3 tools/eval_logs.py```. Note that ```run_inerf``` should be set to False.\n\n#### Automatically generating iNeRF - Loc-NeRF comparison\nInstall our fork of [iNeRF](https://github.com/Dominic101/inerf), which is set up to interface with our automatic comparison script.\n\nTo automatically run iNeRF after you have logged data from Loc-NeRF (because we need to know the initial start for iNeRF), run the following:\n```python3 tools/eval_logs.py --config /home/dominic/inerf/configs/fern.txt --data_dir /home/dominic/inerf/data/nerf_llff_data/ --ckpt_dir /home/dominic/inerf/ckpts/ --N_importance 64 --batch_size 2048```. Note that ```run_inerf``` should be set to True. This will both run iNeRF and log the results, so you don't need to rerun iNeRF. Note that to make the interface to iNeRF easier, we directly pass a specific config on the command line (in this case fern.txt), but setting N_importance and batch_size directly, along with the provided scripts, ensures that the correct params are used for each LLFF dataset.\n\n## Using Custom data for real-time experiment\n\nTo run Loc-NeRF on your robot or simulator, you will first need to train a NeRF with metric scaled poses. This is important so that a movement estimated in the world by the predict step roughly corresponds to a movement inside the NeRF. There are several options for training a NeRF; for ours we used https://colmap.github.io/faq.html#geo-registration. We first estimate an unscaled trajectory with Colmap and then provide Colmap with a small subset of scaled poses that it uses to scale the entire trajectory.\n\nOur procedure is as follows:\n\n1. Record a rosbag with RGB images (for NeRF) along with stereo images and IMU (for VIO).\n\n2. Use the provided script ```tools/ros_to_jpg.py``` to convert a downsampled set of images from the rosbag into images for Colmap (see the sketch after this list).\n\n3. Use the imgs2poses.py script provided at [LLFF](https://github.com/fyusion/llff) to generate Colmap poses for the trajectory up to scale.\n\n4. You should now have a set of poses up to scale. To add scale, you will need to determine the metric scaled pose of a small subset of images (we used five images) and use the provided Colmap function at [colmap-add-scale](https://colmap.github.io/faq.html#geo-registration) to generate optimized metric scaled poses for your dataset.\n\n5. Use our fork of NeRF-Pytorch at [train-nerf](https://github.com/Dominic101/nerf-pytorch) to train a NeRF with metric scaled poses.\n\n6. Now you are all set to run Loc-NeRF with a rosbag for a real-time demo.\n
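\nFor step 2, the extraction amounts to saving every Nth image message to disk. Below is a minimal sketch of that step (the topic name, skip rate, and output path are placeholders; see ```tools/ros_to_jpg.py``` for the actual script):\n\n```python\nimport cv2\nimport rosbag\nfrom cv_bridge import CvBridge\n\nbridge = CvBridge()\nskip = 10  # keep every 10th frame (placeholder rate)\ncount = 0\nwith rosbag.Bag('/PATH/TO/ROSBAG') as bag:\n    for i, (topic, msg, t) in enumerate(bag.read_messages(topics=['/camera_d455/color/image_raw'])):\n        if i % skip != 0:\n            continue\n        # convert the ROS image message to an OpenCV image and write it out\n        img = bridge.imgmsg_to_cv2(msg, desired_encoding='bgr8')\n        cv2.imwrite('images/%06d.jpg' % count, img)\n        count += 1\n```\n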
\n# 4. FAQ\n\nThis code is under active development, with more updates coming soon. Below are some known common issues/TODOs. We also welcome PRs and feedback on any issues you encounter to keep improving this code for the community:\n\n1. We use ROS as a way to provide real-time visualization with rviz and as a structure to run on real robot platforms. For some users, a reliance on ROS may be an unnecessary requirement, and future work is to separate Loc-NeRF from a ROS wrapper.\n\n# Third-party code:\nParts of this code were based on [this pytorch implementation of iNeRF](https://github.com/salykovaa/inerf) and [NeRF-Pytorch](https://github.com/yenchenlin/nerf-pytorch).\n\n```\nNeRF-Pytorch:\n\n@misc{lin2020nerfpytorch,\n  title={NeRF-pytorch},\n  author={Yen-Chen, Lin},\n  publisher = {GitHub},\n  journal = {GitHub repository},\n  howpublished={\\url{https://github.com/yenchenlin/nerf-pytorch/}},\n  year={2020}\n}\n```\n"
  },
  {
    "path": "cfg/jackal.yaml",
    "content": "#### main filtering flags ####\nrun_predicts: True # Use odometry for predict step, otherwise use 0 mean gaussian noise.\nuse_received_image: True # Update the sensor image.\ncenter_about_true_pose: False # Use gt orientation as the mean pose to generate an initial start pose - if False use origin.\nuse_weighted_avg: False # Use weighted average to computer position estimate, otherwise use simple average.\nno_ndc: True # note that custom data cannot use ndc but LLFF should\n\n#### main update step params ####\nfactor: 4 # image down-sample factor\ncourse_samples: 64 # number course samples per ray\nfine_samples: 64 # number fine samples per ray\nphotometric_loss: rgb # rgb\nsampling_strategy: random # only random is supported right now\nbatch_size: 32 # number of pixels to use for measurement points\nnum_particles: 400 # Number of initial particles.\n\n#### dataset and weight loading ####\ndataset_type: custom # llff, custom\nmodel_name: nerf_hallway_low\nckpt_dir: /home/dominic/nerf-pytorch/logs\ndata_dir: /home/dominic/inerf/data/nerf_llff_data # only needed for LLFF data\n\n#### convergence protection #####\n# randomly sample a few particle positions to protect against false convergence\n# TODO eventually want to just remove the lowest weight particles instead of random particles\nuse_convergence_protection: True\nnumber_convergence_particles: 15 # number of particles to distribute\nconvergence_noise: 0.2 # meters, uniformly sample this far from mean for each axis.\n\n#### inerf comparision ####\nrun_inerf_compare: False\nglobal_loc_mode: False # Perform Loc-NeRF with Loc-NeRF.\nlog_results: False # Save pose estimate for each iteration. Also save a file with gt pose and the initial pose guess if use_logged_start is False.\nuse_logged_start: True # use saved logs for initial pose\nlog_prefix: use_annealing # add a prefix to the log file names for convenience to keep track of different runs\nlog_directory: /home/dominic/mocNeRF_ws/src/lonerf/logs/inerf_compare # Directory to save npy files.\nforward_passes_limit: 78643200 # end optimization after this number of forward passes have run\n\n#### custom data realtime tracking mode params ####\n# Note these values are all overwritten/not used if run_inerf_compare is True\n# subscriber topics\nrgb_topic: /camera_d455/color/image_raw\nvio_topic: /vins_estimator/odometry\nfocal: 635\nH: 720\nW: 1280\n# Near and far bounds of the NeRF scene, should be similar to how the NeRF was trained.\nnear: 0\nfar: 8\n# set distribution of particles, units are meters and degrees \nmin_bounds:\n  px: -0.5\n  py: -0.3\n  pz: 0.0\n  rz: -2.5\n  ry: -179.0\n  rx: -2.5\nmax_bounds:\n  px: 0.5\n  py: 0.2\n  pz: 3.5\n  rz: 2.5\n  ry: 179.0\n  rx: 2.5\n# rotation from the body frame of VIO to the NeRF camera frame (x right, y up, z inward). 
R_bodyVins_camNerf:\n  rz: 0.0\n  ry: 0.0\n  rx: 3.14159265\n\n#### Nerf Navigation comparison ####\n# uses user set distribution of particles\nrun_nerfnav_compare: False\nnerf_nav_directory: /home/dominic/Downloads/input_data1\n\n#### particle annealing params ####\nuse_refining: True # Decrease position noise as we start to converge.\nuse_particle_reduction: True # Reduce number of particles as they start to converge\nmin_number_particles: 150 # reduce particles to this amount during particle reduction\n\n# position standard deviation thresholds to activate particle annealing\nalpha_refine: 0.18\nalpha_super_refine: 0.03\n\n#### motion model noise params ####\n# position noise\npx_noise: 0.002\npy_noise: 0.002\npz_noise: 0.003\n# rotation noise\nrot_x_noise: 0.002\nrot_y_noise: 0.002\nrot_z_noise: 0.002\n\n#### visualization params ####\nvisualize_particles: True\nview_debug_image_iteration: 20 # view NeRF rendered image at estimated pose after this number of iterations (set to 0 to disable)\n\n####### DO NOT CHANGE THESE UNLESS YOU KNOW WHAT YOU ARE DOING ########\n# These are params from iNeRF and NeRF-Pytorch\n# In general they should be the same as how the NeRF model was trained\nmultires: 10 # log2 of max freq for positional encoding (3D location)\nmultires_views: 4 # log2 of max freq for positional encoding (2D direction)\ni_embed: 0 # set 0 for default positional encoding, -1 for none\nnetdepth: 8 # layers in network\nnetwidth: 256 # channels per layer\nnetdepth_fine: 8 # layers in fine network\nnetwidth_fine: 256 # channels per layer in fine network\nuse_viewdirs: True\nperturb: 0 # set to 0. for no jitter, 1. for jitter\nwhite_bkgd: False # set to render synthetic data on a white bkgd (always use for dvoxels)\nraw_noise_std: 1.0 # std dev of noise added to regularize sigma_a output, 1e0 recommended\nlindisp: False # sampling linearly in disparity rather than depth\nchunk: 65536 # 1024x64 # number of rays processed in parallel, decrease if running out of memory\nnetchunk: 65536 # 1024x64 # number of pts sent through network in parallel, decrease if running out of memory\nbd_factor: 0.75\nkernel_size: 3\nlrate: 0.01\ndil_iter: 3 # only used with interest_regions sampling"
  },
  {
    "path": "cfg/llff.yaml",
    "content": "#### filtering flags ####\nrun_predicts: False # Use odometry for predict step, otherwise use 0 mean gaussian noise.\nuse_received_image: False # Update the sensor image.\ncenter_about_true_pose: True # Use gt orientation as the mean pose to generate an initial start pose - if False use origin.\nuse_weighted_avg: True # Use weighted average to computer position estimate, otherwise use simple average.\nno_ndc: False # note that custom data cannot use ndc but LLFF should\n\n#### main update step params ####\nfactor: 4 # image down-sample factor\ncourse_samples: 64 # number course samples per ray\nfine_samples: 64 # number fine samples per ray\nphotometric_loss: rgb # rgb\nsampling_strategy: random # only random is supported right now\nbatch_size: 64 # number of pixels to use for measurement points\nnum_particles: 300 # Number of initial particles.\n\n#### dataset and weight loading ####\ndataset_type: llff # llff, custom\nckpt_dir: /home/dominic/inerf/ckpts\ndata_dir: /home/dominic/inerf/data/nerf_llff_data # only needed for LLFF data\n\n#### convergence protection #####\n# randomly sample a few particle positions to protect against false convergence\n# TODO eventually want to just remove the lowest weight particles instead of random particles\nuse_convergence_protection: True\nnumber_convergence_particles: 10 # number of particles to distribute\nconvergence_noise: 0.2 # meters, uniformly sample this far from mean for each axis.\n\n#### inerf comparision ####\nrun_inerf_compare: True\nglobal_loc_mode: False # Perform Loc-NeRF with Loc-NeRF.\nlog_results: False # Save pose estimate for each iteration. Also save a file with gt pose and the initial pose guess if use_logged_start is False.\nuse_logged_start: True # use saved logs for initial pose\nlog_prefix: use_annealing # add a prefix to the log file names for convenience to keep track of different runs\nlog_directory: /home/dominic/mocNeRF_ws/src/locnerf/logs/inerf_compare # Directory to save npy files.\nforward_passes_limit: 78643200 # end optimization after this number of forward passes have run\n\n#### custom data realtime tracking mode params ####\n# Note these values are all overwritten/not used if run_inerf_compare is True\n# subscriber topics\nrgb_topic: /camera/color/image_raw\nvio_topic: /vins_estimator/odometry\nfocal: 635\nH: 720\nW: 1280\n# Near and far bounds of the NeRF scene, should be similar to how the NeRF was trained.\nnear: 0\nfar: 8\n# set distribution of particles, units are meters and degrees \nmin_bounds:\n  px: -0.5\n  py: -0.3\n  pz: 0.0\n  rz: -2.5\n  ry: -179.0\n  rx: -2.5\nmax_bounds:\n  px: 0.5\n  py: 0.2\n  pz: 3.5\n  rz: 2.5\n  ry: 179.0\n  rx: 2.5\n# rotation from the body frame of VIO to the NeRF camera frame (x right, y up, z inward). 
all values in radians\nR_bodyVins_camNerf: \n  rz: 0.0\n  ry: 0.0\n  rx: 3.14159256\n\n#### Nerf Navigation comparision ####\n# uses user set distribution of particles\nrun_nerfnav_compare: False\nnerf_nav_directory : /home/dominic/Downloads/input_data1\n\n#### particle annealing params ####\nuse_refining: True # Decrease position noise as we start to converge.\nuse_particle_reduction: True # Reduce number of particles as they start to converge\nmin_number_particles: 100 # reduce particles to this amount during particle reduction\n\n# position standard deviation thresholds to activate particle annealing\nalpha_refine: 0.08\nalpha_super_refine: 0.03\n\n#### motion model noise params ####\n# position noise\npx_noise: 0.01\npy_noise: 0.01\npz_noise: 0.01\n# rotation noise\nrot_x_noise: 0.02\nrot_y_noise: 0.02\nrot_z_noise: 0.02\n\n#### visualization params ####\nvisualize_particles: True # publish particles for rviz\nview_debug_image_iteration: 0 # view NeRF rendered image at estimated pose after number of iterations (set to 0 to disable)\n\n####### DO NOT CHANGE THESE UNLESS YOU KNOW WHAT YOU ARE DOING ########\n# These are params from iNeRF and NeRF-Pytorch\n# In general they should be the same as how the NeRF model was changed\nmultires: 10 # log2 of max freq for positional encoding (3D location)\nmultires_views: 4 # log2 of max freq for positional encoding (2D direction)\ni_embed: 0 # set 0 for default positional encoding, -1 for none\nnetdepth: 8 # layers in network\nnetwidth: 256 # channels per layer\nnetdepth_fine: 8 # layers in fine network\nnetwidth_fine: 256 # channels per layer in fine network\nuse_viewdirs: True\nperturb: 0 # set to 0. for no jitter, 1. for jitter\nwhite_bkgd: False # set to render synthetic data on a white bkgd (always use for dvoxels)\nraw_noise_std: 1.0 # std dev of noise added to regularize sigma_a output, 1e0 recommended\nlindisp: False # sampling linearly in disparity rather than depth\nchunk: 65536 # 1024x64 # number of rays processed in parallel, decrease if running out of memory\nnetchunk: 65536 # 1024x64 # number of pts sent through network in parallel, decrease if running out of memory\nbd_factor: 0.75\nkernel_size: 3\nlrate: 0.01\ndil_iter: 3 # only used with interest_regions sampling"
  },
  {
    "path": "cfg/llff_global.yaml",
    "content": "#### filtering flags ####\nrun_predicts: False # Use odometry for predict step, otherwise use 0 mean gaussian noise.\nuse_received_image: False # Update the sensor image.\ncenter_about_true_pose: True # Use gt orientation as the mean pose to generate an initial start pose - if False use origin.\nuse_weighted_avg: True # Use weighted average to computer position estimate, otherwise use simple average.\nno_ndc: False # note that custom data cannot use ndc but LLFF should\n\n#### main update step params ####\nfactor: 4 # image down-sample factor\ncourse_samples: 64 # number course samples per ray\nfine_samples: 64 # number fine samples per ray\nphotometric_loss: rgb # rgb\nsampling_strategy: random # only random is supported right now\nbatch_size: 32 # number of pixels to use for measurement points\nnum_particles: 600 # Number of initial particles.\n\n#### dataset and weight loading ####\ndataset_type: llff # llff, custom\nckpt_dir: /home/dominic/inerf/ckpts\ndata_dir: /home/dominic/inerf/data/nerf_llff_data # only needed for LLFF data\n\n#### convergence protection #####\n# randomly sample a few particle positions to protect against false convergence\n# TODO eventually want to just remove the lowest weight particles instead of random particles\nuse_convergence_protection: True\nnumber_convergence_particles: 10 # number of particles to distribute\nconvergence_noise: 0.2 # meters, uniformly sample this far from mean for each axis.\n\n#### inerf comparision ####\nrun_inerf_compare: True\nglobal_loc_mode: True # Perform Loc-NeRF with Loc-NeRF.\nlog_results: False # Save pose estimate for each iteration. Also save a file with gt pose and the initial pose guess if use_logged_start is False.\nuse_logged_start: True # use saved logs for initial pose\nlog_prefix: use_annealing # add a prefix to the log file names for convenience to keep track of different runs\nlog_directory: /home/dominic/mocNeRF_ws/src/locnerf/logs/global_loc # Directory to save npy files.\nforward_passes_limit: 78643200 # end optimization after this number of forward passes have run\n\n#### custom data realtime tracking mode params ####\n# Note these values are all overwritten/not used if run_inerf_compare is True\n# subscriber topics\nrgb_topic: /camera/color/image_raw\nvio_topic: /vins_estimator/odometry\nfocal: 635\nH: 720\nW: 1280\n# Near and far bounds of the NeRF scene, should be similar to how the NeRF was trained.\nnear: 0\nfar: 8\n# set distribution of particles, units are meters and degrees \nmin_bounds:\n  px: -0.5\n  py: -0.3\n  pz: 0.0\n  rz: -2.5\n  ry: -179.0\n  rx: -2.5\nmax_bounds:\n  px: 0.5\n  py: 0.2\n  pz: 3.5\n  rz: 2.5\n  ry: 179.0\n  rx: 2.5\n# rotation from the body frame of VIO to the NeRF camera frame (x right, y up, z inward). 
all values in radians\nR_bodyVins_camNerf: \n  rz: 0.0\n  ry: 0.0\n  rx: 3.14159256\n\n#### Nerf Navigation comparision ####\n# uses user set distribution of particles\nrun_nerfnav_compare: False\nnerf_nav_directory : /home/dominic/Downloads/input_data1\n\n#### particle annealing params ####\nuse_refining: True # Decrease position noise as we start to converge.\nuse_particle_reduction: True # Reduce number of particles as they start to converge\nmin_number_particles: 100 # reduce particles to this amount during particle reduction\n\n# position standard deviation thresholds to activate particle annealing\nalpha_refine: 0.08\nalpha_super_refine: 0.03\n\n#### motion model noise params ####\n# position noise\npx_noise: 0.02\npy_noise: 0.02\npz_noise: 0.02\n# rotation noise\nrot_x_noise: 0.01\nrot_y_noise: 0.04\nrot_z_noise: 0.01\n\n#### visualization params ####\nvisualize_particles: True # publish particles for rviz\nview_debug_image_iteration: 0 # view NeRF rendered image at estimated pose after number of iterations (set to 0 to disable)\n\n####### DO NOT CHANGE THESE UNLESS YOU KNOW WHAT YOU ARE DOING ########\n# These are params from iNeRF and NeRF-Pytorch\n# In general they should be the same as how the NeRF model was changed\nmultires: 10 # log2 of max freq for positional encoding (3D location)\nmultires_views: 4 # log2 of max freq for positional encoding (2D direction)\ni_embed: 0 # set 0 for default positional encoding, -1 for none\nnetdepth: 8 # layers in network\nnetwidth: 256 # channels per layer\nnetdepth_fine: 8 # layers in fine network\nnetwidth_fine: 256 # channels per layer in fine network\nuse_viewdirs: True\nperturb: 0 # set to 0. for no jitter, 1. for jitter\nwhite_bkgd: False # set to render synthetic data on a white bkgd (always use for dvoxels)\nraw_noise_std: 1.0 # std dev of noise added to regularize sigma_a output, 1e0 recommended\nlindisp: False # sampling linearly in disparity rather than depth\nchunk: 65536 # 1024x64 # number of rays processed in parallel, decrease if running out of memory\nnetchunk: 65536 # 1024x64 # number of pts sent through network in parallel, decrease if running out of memory\nbd_factor: 0.75\nkernel_size: 3\nlrate: 0.01\ndil_iter: 3 # only used with interest_regions sampling"
  },
  {
    "path": "launch/navigate.launch",
    "content": "<launch>\n  <arg name=\"parameter_file\" default=\"config.yaml\" />\n\n  <rosparam command=\"load\" file=\"$(find locnerf)/cfg/$(arg parameter_file)\" />\n\n  <node pkg=\"locnerf\" type=\"nav_node.py\" name=\"nav_node\" output=\"screen\" />\n</launch>\n"
  },
  {
    "path": "package.xml",
    "content": "<?xml version=\"1.0\"?>\n<package>\n  <name>locnerf</name>\n  <version>0.1.0</version>\n  <description>Run particle filter with NeRF and VIO</description>\n\n  <maintainer email=\"drmaggio@mit.edu\">Dominic Maggio</maintainer>\n\n  <license>MIT</license>\n\n  <buildtool_depend>catkin</buildtool_depend>\n  <build_depend>geometry_msgs</build_depend>\n  <build_depend>sensor_msgs</build_depend>\n  <build_depend>std_msgs</build_depend>\n\n  <export>\n  </export>\n</package>\n"
  },
  {
    "path": "requirements.txt",
    "content": "cv_bridge\ngtsam\nimageio\nmatplotlib\nnumpy\nrospy\nrostopic\nscipy\nsensor_msgs\nopencv_contrib_python\ntorch\ntorch_ema\ntorchvision\ntqdm\n"
  },
  {
    "path": "rviz/rviz.rviz",
    "content": "Panels:\n  - Class: rviz/Displays\n    Help Height: 78\n    Name: Displays\n    Property Tree Widget:\n      Expanded:\n        - /Global Options1\n        - /Status1\n        - /PoseArray1\n        - /PoseArray2\n        - /Image1\n        - /Odometry1\n      Splitter Ratio: 0.5\n    Tree Height: 213\n  - Class: rviz/Selection\n    Name: Selection\n  - Class: rviz/Tool Properties\n    Expanded:\n      - /2D Pose Estimate1\n      - /2D Nav Goal1\n      - /Publish Point1\n    Name: Tool Properties\n    Splitter Ratio: 0.5886790156364441\n  - Class: rviz/Views\n    Expanded:\n      - /Current View1\n    Name: Views\n    Splitter Ratio: 0.5\n  - Class: rviz/Time\n    Experimental: false\n    Name: Time\n    SyncMode: 0\n    SyncSource: \"\"\nPreferences:\n  PromptSaveOnExit: true\nToolbars:\n  toolButtonStyle: 2\nVisualization Manager:\n  Class: \"\"\n  Displays:\n    - Alpha: 0.5\n      Cell Size: 1\n      Class: rviz/Grid\n      Color: 160; 160; 164\n      Enabled: true\n      Line Style:\n        Line Width: 0.029999999329447746\n        Value: Lines\n      Name: Grid\n      Normal Cell Count: 0\n      Offset:\n        X: 0\n        Y: 0\n        Z: 0\n      Plane: XY\n      Plane Cell Count: 10\n      Reference Frame: <Fixed Frame>\n      Value: true\n    - Alpha: 1\n      Arrow Length: 0.30000001192092896\n      Axes Length: 0.30000001192092896\n      Axes Radius: 0.009999999776482582\n      Class: rviz/PoseArray\n      Color: 255; 25; 0\n      Enabled: true\n      Head Length: 0.07000000029802322\n      Head Radius: 0.029999999329447746\n      Name: PoseArray\n      Queue Size: 10\n      Shaft Length: 0.23000000417232513\n      Shaft Radius: 0.009999999776482582\n      Shape: Arrow (Flat)\n      Topic: /particles\n      Unreliable: false\n      Value: true\n    - Alpha: 1\n      Arrow Length: 0.30000001192092896\n      Axes Length: 0.30000001192092896\n      Axes Radius: 0.009999999776482582\n      Class: rviz/PoseArray\n      Color: 0; 128; 0\n      Enabled: true\n      Head Length: 0.07000000029802322\n      Head Radius: 0.029999999329447746\n      Name: PoseArray\n      Queue Size: 10\n      Shaft Length: 0.23000000417232513\n      Shaft Radius: 0.009999999776482582\n      Shape: Arrow (Flat)\n      Topic: /gt_pose\n      Unreliable: false\n      Value: true\n    - Alpha: 1\n      Class: rviz/Axes\n      Enabled: true\n      Length: 1\n      Name: Axes\n      Radius: 0.10000000149011612\n      Reference Frame: <Fixed Frame>\n      Show Trail: false\n      Value: true\n    - Class: rviz/Image\n      Enabled: true\n      Image Topic: /camera_d455/color/image_raw\n      Max Value: 1\n      Median window: 5\n      Min Value: 0\n      Name: Image\n      Normalize Range: true\n      Queue Size: 2\n      Transport Hint: raw\n      Unreliable: false\n      Value: true\n    - Angle Tolerance: 0.10000000149011612\n      Class: rviz/Odometry\n      Covariance:\n        Orientation:\n          Alpha: 0.5\n          Color: 255; 255; 127\n          Color Style: Unique\n          Frame: Local\n          Offset: 1\n          Scale: 1\n          Value: true\n        Position:\n          Alpha: 0.30000001192092896\n          Color: 204; 51; 204\n          Scale: 1\n          Value: true\n        Value: true\n      Enabled: true\n      Keep: 1000\n      Name: Odometry\n      Position Tolerance: 0.10000000149011612\n      Queue Size: 1000\n      Shape:\n        Alpha: 1\n        Axes Length: 1\n        Axes Radius: 0.10000000149011612\n        Color: 255; 25; 0\n        Head Length: 
0.30000001192092896\n        Head Radius: 0.10000000149011612\n        Shaft Length: 1\n        Shaft Radius: 0.05000000074505806\n        Value: Arrow\n      Topic: /vins_estimator/odometry\n      Unreliable: false\n      Value: true\n  Enabled: true\n  Global Options:\n    Background Color: 48; 48; 48\n    Default Light: true\n    Fixed Frame: world\n    Frame Rate: 30\n  Name: root\n  Tools:\n    - Class: rviz/Interact\n      Hide Inactive Objects: true\n    - Class: rviz/MoveCamera\n    - Class: rviz/Select\n    - Class: rviz/FocusCamera\n    - Class: rviz/Measure\n    - Class: rviz/SetInitialPose\n      Theta std deviation: 0.2617993950843811\n      Topic: /initialpose\n      X std deviation: 0.5\n      Y std deviation: 0.5\n    - Class: rviz/SetGoal\n      Topic: /move_base_simple/goal\n    - Class: rviz/PublishPoint\n      Single click: true\n      Topic: /clicked_point\n  Value: true\n  Views:\n    Current:\n      Class: rviz/Orbit\n      Distance: 2.0610387325286865\n      Enable Stereo Rendering:\n        Stereo Eye Separation: 0.05999999865889549\n        Stereo Focal Distance: 1\n        Swap Stereo Eyes: false\n        Value: false\n      Field of View: 0.7853981852531433\n      Focal Point:\n        X: 0\n        Y: 0\n        Z: 0\n      Focal Shape Fixed Size: true\n      Focal Shape Size: 0.05000000074505806\n      Invert Z Axis: false\n      Name: Current View\n      Near Clip Distance: 0.009999999776482582\n      Pitch: 0.19980168342590332\n      Target Frame: <Fixed Frame>\n      Yaw: 5.5282206535339355\n    Saved: ~\nWindow Geometry:\n  Displays:\n    collapsed: false\n  Height: 973\n  Hide Left Dock: false\n  Hide Right Dock: true\n  Image:\n    collapsed: false\n  QMainWindow State: 000000ff00000000fd0000000400000000000001e40000032ffc0200000009fb0000001200530065006c0065006300740069006f006e00000001e10000009b0000005c00fffffffb0000001e0054006f006f006c002000500072006f007000650072007400690065007302000001ed000001df00000185000000a3fb000000120056006900650077007300200054006f006f02000001df000002110000018500000122fb000000200054006f006f006c002000500072006f0070006500720074006900650073003203000002880000011d000002210000017afb000000100044006900730070006c006100790073010000003d00000160000000c900fffffffb0000002000730065006c0065006300740069006f006e00200062007500660066006500720200000138000000aa0000023a00000294fb00000014005700690064006500530074006500720065006f02000000e6000000d2000003ee0000030bfb0000000c004b0069006e0065006300740200000186000001060000030c00000261fb0000000a0049006d00610067006501000001a3000001c90000001600ffffff000000010000010f000002b0fc0200000003fb0000001e0054006f006f006c002000500072006f00700065007200740069006500730100000041000000780000000000000000fb0000000a00560069006500770073000000003d000002b0000000a400fffffffb0000001200530065006c0065006300740069006f006e010000025a000000b200000000000000000000000200000490000000a9fc0100000001fb0000000a00560069006500770073030000004e00000080000002e10000019700000003000006b70000003efc0100000002fb0000000800540069006d00650100000000000006b7000002eb00fffffffb0000000800540069006d00650100000000000004500000000000000000000004cd0000032f00000004000000040000000800000008fc0000000100000002000000010000000a0054006f006f006c00730100000000ffffffff0000000000000000\n  Selection:\n    collapsed: false\n  Time:\n    collapsed: false\n  Tool Properties:\n    collapsed: false\n  Views:\n    collapsed: true\n  Width: 1719\n  X: 3640\n  Y: 142\n"
  },
  {
    "path": "setup.py",
    "content": "from distutils.core import setup\nfrom catkin_pkg.python_setup import generate_distutils_setup\n\nargs = generate_distutils_setup(\n    packages=['locnerf'],\n    package_dir={'':'src'}\n)\nsetup(**args)"
  },
  {
    "path": "src/full_filter.py",
    "content": "from matplotlib.markers import MarkerStyle\nimport torch\nimport numpy as np\nimport cv2\nimport matplotlib.pyplot as plt\nimport time\nfrom utils import show_img, find_POI, img2mse, load_llff_data, get_pose\nfrom full_nerf_helpers import load_nerf\nfrom render_helpers import render, to8b, get_rays\nfrom particle_filter import ParticleFilter\n\nfrom scipy.spatial.transform import Rotation as R\n\ndevice = torch.device(\"cuda\" if torch.cuda.is_available() else \"cpu\")\n\n# part of this script is adapted from iNeRF https://github.com/salykovaa/inerf\n# and NeRF-Pytorch https://github.com/yenchenlin/nerf-pytorch/blob/master/load_llff.py\n\nclass NeRF:\n    \n    def __init__(self, nerf_params):\n        # Parameters\n        self.output_dir = './output/'\n        self.data_dir = nerf_params['data_dir']\n        self.model_name = nerf_params['model_name']\n        self.obs_img_num = nerf_params['obs_img_num']\n        self.batch_size = nerf_params['batch_size']\n        self.factor = nerf_params['factor']\n        self.near = nerf_params['near']\n        self.far = nerf_params['far']\n        self.spherify = False\n        self.kernel_size = nerf_params['kernel_size']\n        self.lrate = nerf_params['lrate']\n        self.dataset_type = nerf_params['dataset_type']\n        self.sampling_strategy = nerf_params['sampling_strategy']\n        self.delta_phi, self.delta_theta, self.delta_psi, self.delta_x, self.delta_y, self.delta_z = [0,0,0,0,0,0]\n        self.no_ndc = nerf_params['no_ndc']\n        self.dil_iter = nerf_params['dil_iter']\n        self.chunk = nerf_params['chunk'] # number of rays processed in parallel, decrease if running out of memory\n        self.bd_factor = nerf_params['bd_factor']\n\n        print(\"dataset type:\", self.dataset_type)\n        print(\"no ndc:\", self.no_ndc)\n        \n        if self.dataset_type == 'custom':\n            print(\"self.factor\", self.factor)\n            self.focal = nerf_params['focal'] / self.factor\n            self.H =  nerf_params['H'] / self.factor\n            self.W =  nerf_params['W'] / self.factor\n\n            # we don't actually use obs_img_pose when we run live images. 
\n            # this prevents attribute errors later in the code\n            self.obs_img_pose = None\n\n            self.H, self.W = int(self.H), int(self.W)\n\n        else:\n            self.obs_img, self.hwf, self.start_pose, self.obs_img_pose, self.bds = load_llff_data(self.data_dir, self.model_name, self.obs_img_num, self.delta_phi,\n                                                self.delta_theta, self.delta_psi, self.delta_x, self.delta_y, self.delta_z, factor=self.factor, recenter=True, bd_factor=self.bd_factor, spherify=self.spherify)\n\n            self.H, self.W, self.focal = self.hwf\n\n            if self.no_ndc:\n                self.near = np.ndarray.min(self.bds) * .9\n                self.far = np.ndarray.max(self.bds) * 1.\n                print(self.near, self.far)\n            else:\n                self.near = 0.\n                self.far = 1.\n\n            self.H, self.W = int(self.H), int(self.W)\n            self.obs_img = (np.array(self.obs_img) / 255.).astype(np.float32)\n            self.obs_img_noised = self.obs_img\n\n            # create meshgrid from the observed image\n            self.coords = np.asarray(np.stack(np.meshgrid(np.linspace(0, self.W - 1, self.W), np.linspace(0, self.H - 1, self.H)), -1),\n                                dtype=int)\n\n            self.coords = self.coords.reshape(self.H * self.W, 2)\n\n        # print(\"height, width, focal:\", self.H, self.W, self.focal)\n\n        # Load NeRF Model\n        self.render_kwargs = load_nerf(nerf_params, device)\n        bds_dict = {\n            'near': self.near,\n            'far': self.far,\n        }\n        self.render_kwargs.update(bds_dict)\n\n    def get_poi_interest_regions(self, show_img=False, sampling_type = None):\n        # TODO see if other image normalization routines are better\n        self.obs_img_noised = (np.array(self.obs_img) / 255.0).astype(np.float32)\n\n        if show_img:\n            plt.imshow(self.obs_img_noised)\n            plt.show()\n\n        self.coords = np.asarray(np.stack(np.meshgrid(np.linspace(0, self.W - 1, self.W), np.linspace(0, self.H - 1, self.H)), -1),\n                            dtype=int)\n\n        if sampling_type == 'random':\n            self.coords = self.coords.reshape(self.H * self.W, 2)\n
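\n    # In the update step every particle is a candidate camera-to-world pose.\n    # For one update we draw a single set of batch_size pixel coordinates,\n    # build the corresponding rays for each particle's pose, and render all\n    # particles' rays in one batched call to the NeRF; the caller then turns\n    # each per-particle photometric loss into a particle weight (1/loss).\n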
\n    def get_loss(self, particles, batch, photometric_loss='rgb'):\n        target_s = self.obs_img_noised[batch[:, 1], batch[:, 0]] # TODO check ordering here\n        target_s = torch.Tensor(target_s).to(device)\n\n        start_time = time.time()\n        num_pixels = len(particles) * len(batch)\n        all_rays_o = np.zeros((num_pixels,3))\n        all_rays_d = np.zeros((num_pixels,3))\n        for i, particle in enumerate(particles):\n            pose = torch.Tensor(particle).to(device)\n\n            rays_o, rays_d = get_rays(self.H, self.W, self.focal, pose) # TODO this line can be stored as a param\n            rays_o = rays_o[batch[:, 1], batch[:, 0]]\n            rays_d = rays_d[batch[:, 1], batch[:, 0]]\n            all_rays_o[i*len(batch): i*len(batch) + len(batch),:] = rays_o.cpu().detach().numpy()\n            all_rays_d[i*len(batch): i*len(batch) + len(batch),:] = rays_d.cpu().detach().numpy()\n\n        all_rays_o = torch.Tensor(all_rays_o).to(device)\n        all_rays_d = torch.Tensor(all_rays_d).to(device)\n        batch_rays = torch.stack([all_rays_o, all_rays_d], 0)\n        rgb_all, disp, acc, extras = render(self.H, self.W, self.focal, chunk=self.chunk, rays=batch_rays,\n                                            retraw=True,\n                                            **self.render_kwargs)\n        nerf_time = time.time() - start_time\n\n        losses = []\n        for i in range(len(particles)):\n            rgb = rgb_all[i*len(batch): i*len(batch) + len(batch)]\n            if photometric_loss == 'rgb':\n                loss = img2mse(rgb, target_s)\n            else:\n                raise ValueError('unsupported photometric loss: ' + str(photometric_loss))\n            losses.append(loss.item())\n        return losses, nerf_time\n\n    def visualize_nerf_image(self, nerf_pose):\n        pose_dummy = torch.from_numpy(nerf_pose).cuda()\n        with torch.no_grad():\n            print(nerf_pose)\n            rgb, disp, acc, _ = render(self.H, self.W, self.focal, chunk=self.chunk, c2w=pose_dummy[:3, :4], **self.render_kwargs)\n            rgb = rgb.cpu().detach().numpy()\n            rgb8 = to8b(rgb)\n            ref = to8b(self.obs_img)\n        plt.imshow(rgb8)\n        plt.show()\n"
  },
  {
    "path": "src/full_nerf_helpers.py",
    "content": "import torch\nimport os\ntorch.autograd.set_detect_anomaly(True)\nimport torch.nn as nn\nimport torch.nn.functional as F\nimport numpy as np\n\n# most of this script is adapted from iNeRF https://github.com/salykovaa/inerf\n# and NeRF-Pytorch https://github.com/yenchenlin/nerf-pytorch/blob/master/load_llff.py\n\n\ndef load_nerf(nerf_params, device):\n    \"\"\"Instantiate NeRF's MLP model.\n    \"\"\"\n    embed_fn, input_ch = get_embedder(nerf_params['multires'], nerf_params['i_embed'])\n    embeddirs_fn, input_ch_views = get_embedder(nerf_params['multires_views'], nerf_params['i_embed'])\n    output_ch = 4\n    skips = [4]\n    model = NeRF(D=nerf_params['netdepth'], W=nerf_params['netwidth'],\n                 input_ch=input_ch, output_ch=output_ch, skips=skips,\n                 input_ch_views=input_ch_views, use_viewdirs=nerf_params['use_viewdirs']).to(device)\n\n    model_fine = NeRF(D=nerf_params['netdepth_fine'], W=nerf_params['netwidth_fine'],\n                      input_ch=input_ch, output_ch=output_ch, skips=skips,\n                      input_ch_views=input_ch_views, use_viewdirs=nerf_params['use_viewdirs']).to(device)\n\n    network_query_fn = lambda inputs, viewdirs, network_fn: run_network(inputs, viewdirs, network_fn,\n                                                                        embed_fn=embed_fn,\n                                                                        embeddirs_fn=embeddirs_fn,\n                                                                        netchunk=nerf_params['netchunk'])\n\n    # Load checkpoint, order tar files and pick the most trained if multiple exist.\n    ckpt_dir = nerf_params['ckpt_dir']\n    ckpt_name = \"\" \n    ckpts = [os.path.join(ckpt_dir, ckpt_name, f) for f in sorted(os.listdir(os.path.join(ckpt_dir, ckpt_name))) if 'tar' in f]\n    ckpt_path = ckpts[-1]\n    print('Found ckpts', ckpts)\n    print('Reloading from', ckpt_path)\n    ckpt = torch.load(ckpt_path)\n\n    # Load model\n\n    model.load_state_dict(ckpt['network_fn_state_dict'], strict=False)\n    model_fine.load_state_dict(ckpt['network_fine_state_dict'], strict=False)\n\n    render_kwargs = {\n        'network_query_fn': network_query_fn,\n        'perturb': nerf_params['perturb'],\n        'N_importance': nerf_params['fine_samples'],\n        'network_fine': model_fine,\n        'N_samples': nerf_params['course_samples'],\n        'network_fn': model,\n        'use_viewdirs': nerf_params['use_viewdirs'],\n        'white_bkgd': nerf_params['white_bkgd'],\n        'raw_noise_std': nerf_params['raw_noise_std']\n    }\n\n    # NDC only good for LLFF-style forward facing data\n    if nerf_params['dataset_type'] != 'llff' or nerf_params['no_ndc']:\n        print('Not ndc!')\n        render_kwargs['ndc'] = False\n        render_kwargs['lindisp'] = nerf_params['lindisp']\n\n    if nerf_params['dataset_type'] != 'llff' and not nerf_params['no_ndc']:\n        print(\"WARNING YOU CANNOT USE NDC WITH NON LLFF DATA - NDC WILL NOT BE USED\")\n\n    # Disable updating of the weights\n    for param in model.parameters():\n        param.requires_grad = False\n    for param in model_fine.parameters():\n        param.requires_grad = False\n\n    return render_kwargs\n\n\n# Positional encoding (section 5.1)\nclass Embedder:\n    def __init__(self, **kwargs):\n        self.kwargs = kwargs\n        self.create_embedding_fn()\n\n    def create_embedding_fn(self):\n        embed_fns = []\n        d = self.kwargs['input_dims']\n        out_dim = 0\n        if 
self.kwargs['include_input']:\n            embed_fns.append(lambda x: x)\n            out_dim += d\n\n        max_freq = self.kwargs['max_freq_log2']\n        N_freqs = self.kwargs['num_freqs']\n\n        if self.kwargs['log_sampling']:\n            freq_bands = 2. ** torch.linspace(0., max_freq, steps=N_freqs)\n        else:\n            freq_bands = torch.linspace(2. ** 0., 2. ** max_freq, steps=N_freqs)\n\n        for freq in freq_bands:\n            for p_fn in self.kwargs['periodic_fns']:\n                embed_fns.append(lambda x, p_fn=p_fn, freq=freq: p_fn(x * freq))\n                out_dim += d\n\n        self.embed_fns = embed_fns\n        self.out_dim = out_dim\n\n    def embed(self, inputs):\n        return torch.cat([fn(inputs) for fn in self.embed_fns], -1)\n\n\ndef get_embedder(multires, i=0):\n    if i == -1:\n        return nn.Identity(), 3\n\n    embed_kwargs = {\n        'include_input': True,\n        'input_dims': 3,\n        'max_freq_log2': multires - 1,\n        'num_freqs': multires,\n        'log_sampling': True,\n        'periodic_fns': [torch.sin, torch.cos],\n    }\n\n    embedder_obj = Embedder(**embed_kwargs)\n    embed = lambda x, eo=embedder_obj: eo.embed(x)\n    return embed, embedder_obj.out_dim\n\n\n# Model\nclass NeRF(nn.Module):\n    def __init__(self, D=8, W=256, input_ch=3, input_ch_views=3, output_ch=4, skips=[4], use_viewdirs=False):\n        \"\"\"\n        \"\"\"\n        super(NeRF, self).__init__()\n        self.D = D\n        self.W = W\n        self.input_ch = input_ch\n        self.input_ch_views = input_ch_views\n        self.skips = skips\n        self.use_viewdirs = use_viewdirs\n\n        self.pts_linears = nn.ModuleList(\n            [nn.Linear(input_ch, W)] + [nn.Linear(W, W) if i not in self.skips else nn.Linear(W + input_ch, W) for i in\n                                        range(D - 1)])\n\n        ### Implementation according to the official code release (https://github.com/bmild/nerf/blob/master/run_nerf_helpers.py#L104-L105)\n        self.views_linears = nn.ModuleList([nn.Linear(input_ch_views + W, W // 2)])\n\n        ### Implementation according to the paper\n        # self.views_linears = nn.ModuleList(\n        #     [nn.Linear(input_ch_views + W, W//2)] + [nn.Linear(W//2, W//2) for i in range(D//2)])\n\n        if use_viewdirs:\n            self.feature_linear = nn.Linear(W, W)\n            self.alpha_linear = nn.Linear(W, 1)\n            self.rgb_linear = nn.Linear(W // 2, 3)\n        else:\n            self.output_linear = nn.Linear(W, output_ch)\n\n    def forward(self, x):\n        input_pts, input_views = torch.split(x, [self.input_ch, self.input_ch_views], dim=-1)\n        h = input_pts\n        for i, l in enumerate(self.pts_linears):\n            h = self.pts_linears[i](h)\n            h = F.relu(h)\n            if i in self.skips:\n                h = torch.cat([input_pts, h], -1)\n\n        if self.use_viewdirs:\n            alpha = self.alpha_linear(h)\n            feature = self.feature_linear(h)\n            h = torch.cat([feature, input_views], -1)\n\n            for i, l in enumerate(self.views_linears):\n                h = self.views_linears[i](h)\n                h = F.relu(h)\n\n            rgb = self.rgb_linear(h)\n            outputs = torch.cat([rgb, alpha], -1)\n        else:\n            outputs = self.output_linear(h)\n\n        return outputs\n\n    def load_weights_from_keras(self, weights):\n        assert self.use_viewdirs, \"Not implemented if use_viewdirs=False\"\n        print(\"here\")\n\n 
       # Load pts_linears\n        for i in range(self.D):\n            idx_pts_linears = 2 * i\n            self.pts_linears[i].weight.data = torch.from_numpy(np.transpose(weights[idx_pts_linears]))\n            self.pts_linears[i].bias.data = torch.from_numpy(np.transpose(weights[idx_pts_linears + 1]))\n\n        # Load feature_linear\n        idx_feature_linear = 2 * self.D\n        self.feature_linear.weight.data = torch.from_numpy(np.transpose(weights[idx_feature_linear]))\n        self.feature_linear.bias.data = torch.from_numpy(np.transpose(weights[idx_feature_linear + 1]))\n\n        # Load views_linears\n        idx_views_linears = 2 * self.D + 2\n        self.views_linears[0].weight.data = torch.from_numpy(np.transpose(weights[idx_views_linears]))\n        self.views_linears[0].bias.data = torch.from_numpy(np.transpose(weights[idx_views_linears + 1]))\n\n        # Load rgb_linear\n        idx_rgb_linear = 2 * self.D + 4\n        self.rgb_linear.weight.data = torch.from_numpy(np.transpose(weights[idx_rgb_linear]))\n        self.rgb_linear.bias.data = torch.from_numpy(np.transpose(weights[idx_rgb_linear + 1]))\n\n        # Load alpha_linear\n        idx_alpha_linear = 2 * self.D + 6\n        self.alpha_linear.weight.data = torch.from_numpy(np.transpose(weights[idx_alpha_linear]))\n        self.alpha_linear.bias.data = torch.from_numpy(np.transpose(weights[idx_alpha_linear + 1]))\n\n\ndef batchify(fn, chunk):\n    \"\"\"Constructs a version of 'fn' that applies to smaller batches.\n    \"\"\"\n    if chunk is None:\n        return fn\n    def ret(inputs):\n        return torch.cat([fn(inputs[i:i+chunk]) for i in range(0, inputs.shape[0], chunk)], 0)\n    return ret\n\n\ndef run_network(inputs, viewdirs, fn, embed_fn, embeddirs_fn, netchunk=1024*64):\n    \"\"\"Prepares inputs and applies network 'fn'.\n    \"\"\"\n    inputs_flat = torch.reshape(inputs, [-1, inputs.shape[-1]])\n    embedded = embed_fn(inputs_flat)\n\n    if viewdirs is not None:\n        input_dirs = viewdirs[:,None].expand(inputs.shape)\n        input_dirs_flat = torch.reshape(input_dirs, [-1, input_dirs.shape[-1]])\n        embedded_dirs = embeddirs_fn(input_dirs_flat)\n        embedded = torch.cat([embedded, embedded_dirs], -1)\n\n    outputs_flat = batchify(fn, netchunk)(embedded)\n    outputs = torch.reshape(outputs_flat, list(inputs.shape[:-1]) + [outputs_flat.shape[-1]])\n    return outputs"
  },
  {
    "path": "src/nav_node.py",
    "content": "#!/usr/bin/env python\n\nfrom turtle import pos\n\nimport rospy\nimport numpy as np\nimport gtsam\nimport cv2\nimport torch\nimport time\nimport glob\nimport matplotlib.pyplot as plt\n\nfrom cv_bridge import CvBridge\nfrom sensor_msgs.msg import Image\nfrom nav_msgs.msg import Odometry\nfrom geometry_msgs.msg import PoseArray, Pose\n\nfrom scipy.spatial.transform import Rotation as R\nfrom copy import copy\n\nimport locnerf\nfrom full_filter import NeRF\nfrom particle_filter import ParticleFilter\nfrom utils import get_pose\nfrom navigator_base import NavigatorBase\n\ntorch.set_default_tensor_type('torch.cuda.FloatTensor')\ndevice = torch.device(\"cuda:0\" if torch.cuda.is_available() else \"cpu\")\n\nclass Navigator(NavigatorBase):\n    def __init__(self, img_num=None, dataset_name=None):\n        # Base class handles loading most params.\n        NavigatorBase.__init__(self, img_num=img_num, dataset_name=dataset_name)\n\n        # Set initial distribution of particles.\n        self.get_initial_distribution()\n\n        self.br = CvBridge()\n\n        # Set up publishers.\n        self.particle_pub = rospy.Publisher(\"/particles\", PoseArray, queue_size = 10)\n        self.pose_pub = rospy.Publisher(\"/estimated_pose\", Odometry, queue_size = 10)\n        self.gt_pub = rospy.Publisher(\"/gt_pose\", PoseArray, queue_size = 10)\n\n        # Set up subscribers.\n        # We don't need callbacks to compare against inerf.\n        if not self.run_inerf_compare:\n            self.image_sub = rospy.Subscriber(self.rgb_topic,Image,self.rgb_callback, queue_size=1, buff_size=2**24)\n            if self.run_predicts:\n                self.vio_sub = rospy.Subscriber(self.pose_topic, Odometry, self.vio_callback, queue_size = 10)\n\n        # Show initial distribution of particles\n        if self.plot_particles:\n            self.visualize()\n\n        if self.log_results:\n            # If using a provided start we already have ground truth, so don't log redundant gt.\n            if not self.use_logged_start:\n                with open(self.log_directory + \"/\" + \"gt_\" + self.model_name + \"_\" + str(self.obs_img_num) + \"_\" + \"poses.npy\", 'wb') as f:\n                    np.save(f, self.gt_pose)\n\n            # Add initial pose estimate before first update step is run.\n            if self.use_weighted_avg:\n                position_est = self.filter.compute_weighted_position_average()\n            else:\n                position_est = self.filter.compute_simple_position_average()\n            rot_est = self.filter.compute_simple_rotation_average()\n            pose_est = gtsam.Pose3(rot_est, position_est).matrix()\n            self.all_pose_est.append(pose_est)\n\n    def get_initial_distribution(self):\n        # NOTE for now assuming everything stays in NeRF coordinates (x right, y up, z inward)\n        if self.run_inerf_compare:\n            # for non-global loc mode, get random pose based on iNeRF evaluation method from their paper\n            # sample random axis from unit sphere and then rotate by a random amount between [-40, 40] degrees\n            # translate along each axis by a random amount between [-10, 10] cm\n            rot_rand = 40.0\n            if self.global_loc_mode:\n                trans_rand = 1.0\n            else:\n                trans_rand = 0.1\n            \n            # get random axis and angle for rotation\n            x = np.random.rand()\n            y = np.random.rand()\n            z = np.random.rand()\n            axis = 
            axis = np.array([x,y,z])\n            axis = axis / np.linalg.norm(axis)\n            angle = np.pi * np.random.uniform(low=-rot_rand, high=rot_rand) / 180.0\n            euler = (gtsam.Rot3.AxisAngle(axis, angle)).ypr()\n\n            # get random translation offset\n            t_x = np.random.uniform(low=-trans_rand, high=trans_rand)\n            t_y = np.random.uniform(low=-trans_rand, high=trans_rand)\n            t_z = np.random.uniform(low=-trans_rand, high=trans_rand)\n\n            # use initial random pose from previously saved log\n            if self.use_logged_start:\n                log_file = self.log_directory + \"/\" + \"initial_pose_\" + self.model_name + \"_\" + str(self.obs_img_num) + \"_\" + \"poses.npy\"\n                start = np.load(log_file)\n                print(start)\n                euler[0], euler[1], euler[2], t_x, t_y, t_z = start\n\n            # log initial random pose\n            elif self.log_results:\n                with open(self.log_directory + \"/\" + \"initial_pose_\" + self.model_name + \"_\" + str(self.obs_img_num) + \"_\" + \"poses.npy\", 'wb') as f:\n                    np.save(f, np.array([euler[0], euler[1], euler[2], t_x, t_y, t_z]))\n\n            if self.global_loc_mode:\n                # 360 degree rotation distribution about yaw\n                self.initial_particles_noise = np.random.uniform(np.array([-trans_rand, -trans_rand, -trans_rand, 0, -179, 0]), np.array([trans_rand, trans_rand, trans_rand, 0, 179, 0]), size = (self.num_particles, 6))\n            else:\n                self.initial_particles_noise = np.random.uniform(np.array([-trans_rand, -trans_rand, -trans_rand, 0, 0, 0]), np.array([trans_rand, trans_rand, trans_rand, 0, 0, 0]), size = (self.num_particles, 6))\n\n            # center translation at randomly sampled position\n            self.initial_particles_noise[:, 0] += t_x\n            self.initial_particles_noise[:, 1] += t_y\n            self.initial_particles_noise[:, 2] += t_z\n\n            if not self.global_loc_mode:\n                for i in range(self.initial_particles_noise.shape[0]):\n                    # rotate random 3 DOF rotation about initial random rotation for each particle\n                    n1 = np.pi * np.random.uniform(low=-rot_rand, high=rot_rand) / 180.0\n                    n2 = np.pi * np.random.uniform(low=-rot_rand, high=rot_rand) / 180.0\n                    n3 = np.pi * np.random.uniform(low=-rot_rand, high=rot_rand) / 180.0\n                    euler_particle = gtsam.Rot3.AxisAngle(axis, angle).retract(np.array([n1, n2, n3])).ypr()\n\n                    # add rotation noise for initial particle distribution\n                    self.initial_particles_noise[i,3] = euler_particle[0] * 180.0 / np.pi\n                    self.initial_particles_noise[i,4] = euler_particle[1] * 180.0 / np.pi\n                    self.initial_particles_noise[i,5] = euler_particle[2] * 180.0 / np.pi\n\n        else:\n            # get distribution of particles from user\n            self.initial_particles_noise = np.random.uniform(np.array(\n                [self.min_bounds['px'], self.min_bounds['py'], self.min_bounds['pz'], self.min_bounds['rz'], self.min_bounds['ry'], self.min_bounds['rx']]),\n                np.array([self.max_bounds['px'], self.max_bounds['py'], self.max_bounds['pz'], self.max_bounds['rz'], self.max_bounds['ry'], self.max_bounds['rx']]),\n                size = (self.num_particles, 6))\n\n        self.initial_particles = self.set_initial_particles()\n        self.filter = ParticleFilter(self.initial_particles)\n
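\n    # The VIO predict step, in brief: each odometry message gives the pose of the\n    # VIO body frame in the VIO world frame. We rotate it into the NeRF camera frame\n    # with the fixed R_bodyVins_camNerf extrinsic from the yaml, then feed the relative\n    # motion between consecutive messages to the filter's predict step.\n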
ParticleFilter(self.initial_particles)\n\n    def vio_callback(self, msg):\n        # extract rotation and position from msg\n        quat = msg.pose.pose.orientation\n        position = msg.pose.pose.position\n\n        # rotate vins to be in nerf frame\n        rx = self.R_bodyVins_camNerf['rx']\n        ry = self.R_bodyVins_camNerf['ry']\n        rz = self.R_bodyVins_camNerf['rz']\n        T_bodyVins_camNerf = gtsam.Pose3(gtsam.Rot3.Ypr(rz, ry, rx), gtsam.Point3(0,0,0))\n        T_wVins_camVins = gtsam.Pose3(gtsam.Rot3(quat.w, quat.x, quat.y, quat.z), gtsam.Point3(position.x, position.y, position.z))\n        T_wVins_camNeRF = gtsam.Pose3(T_wVins_camVins.matrix() @ T_bodyVins_camNerf.matrix())\n\n        if self.previous_vio_pose is not None:\n            T_camNerft_camNerftp1 = gtsam.Pose3(self.previous_vio_pose.inverse().matrix() @ T_wVins_camNeRF.matrix())\n            self.run_predict(T_camNerft_camNerftp1)\n\n        # log pose for next transform computation\n        self.previous_vio_pose = T_wVins_camNeRF\n\n        # publish particles for rviz\n        if self.plot_particles:\n            self.visualize()\n    \n    def rgb_callback(self, msg):\n        self.img_msg = msg\n        \n    def rgb_run(self, msg, get_rays_fn=None, render_full_image=False):\n        print(\"processing image\")\n        start_time = time.time()\n        self.rgb_input_count += 1\n\n        # make copies to prevent mutations\n        particles_position_before_update = np.copy(self.filter.particles['position'])\n        particles_rotation_before_update = [gtsam.Rot3(i.matrix()) for i in self.filter.particles['rotation']]\n\n        if self.use_convergence_protection:\n            for i in range(self.number_convergence_particles):\n                t_x = np.random.uniform(low=-self.convergence_noise, high=self.convergence_noise)\n                t_y = np.random.uniform(low=-self.convergence_noise, high=self.convergence_noise)\n                t_z = np.random.uniform(low=-self.convergence_noise, high=self.convergence_noise)\n                # TODO this is not thread safe. 
we add the perturbation in two places because both\n                # the filter's stored particles and the local copies used for the loss must move\n                self.filter.particles[\"position\"][i] = self.filter.particles[\"position\"][i] + np.array([t_x, t_y, t_z])\n                particles_position_before_update[i] = particles_position_before_update[i] + np.array([t_x, t_y, t_z])\n\n        if self.use_received_image:\n            img = self.br.imgmsg_to_cv2(msg)\n            # resize input image so it matches the scale that NeRF expects\n            img = cv2.resize(img, (int(self.nerf.W), int(self.nerf.H)))\n            # img = cv2.cvtColor(img, cv2.COLOR_RGB2BGR)\n            self.nerf.obs_img = img\n            show_true = self.view_debug_image_iteration != 0 and self.num_updates == self.view_debug_image_iteration-1\n            self.nerf.get_poi_interest_regions(show_true, self.sampling_strategy)\n            # plt.imshow(self.nerf.obs_img)\n            # plt.show()\n\n        total_nerf_time = 0\n\n        if self.sampling_strategy == 'random':\n            rand_inds = np.random.choice(self.nerf.coords.shape[0], size=self.nerf.batch_size, replace=False)\n            batch = self.nerf.coords[rand_inds]\n\n        loss_poses = []\n        for index, particle in enumerate(particles_position_before_update):\n            loss_pose = np.zeros((4,4))\n            rot = particles_rotation_before_update[index]\n            loss_pose[0:3, 0:3] = rot.matrix()\n            loss_pose[0:3,3] = particle[0:3]\n            loss_pose[3,3] = 1.0\n            loss_poses.append(loss_pose)\n        losses, nerf_time = self.nerf.get_loss(loss_poses, batch, self.photometric_loss)\n\n        # a lower photometric loss means a better pose hypothesis, so weight by the reciprocal\n        for index in range(len(losses)):\n            self.filter.weights[index] = 1 / losses[index]\n        total_nerf_time += nerf_time\n\n        self.filter.update()\n        self.num_updates += 1\n        print(\"UPDATE STEP NUMBER\", self.num_updates, \"RAN\")\n        print(\"number particles:\", self.num_particles)\n\n        if self.use_refining: # TODO make it where you can reduce number of particles without using refining\n            self.check_refine_gate()\n\n        if self.use_weighted_avg:\n            avg_pose = self.filter.compute_weighted_position_average()\n        else:\n            avg_pose = self.filter.compute_simple_position_average()\n\n        avg_rot = self.filter.compute_simple_rotation_average()\n        self.nerf_pose = gtsam.Pose3(avg_rot, gtsam.Point3(avg_pose[0], avg_pose[1], avg_pose[2])).matrix()\n\n        if self.plot_particles:\n            self.visualize()\n\n        # TODO add ability to render several frames\n        if self.view_debug_image_iteration != 0 and (self.num_updates == self.view_debug_image_iteration):\n            self.nerf.visualize_nerf_image(self.nerf_pose)\n\n        if not self.use_received_image:\n            if self.use_weighted_avg:\n                print(\"average position of all particles: \", self.filter.compute_weighted_position_average())\n                print(\"position error: \", np.linalg.norm(self.gt_pose[0:3,3] - self.filter.compute_weighted_position_average()))\n            else:\n                print(\"average position of all particles: \", self.filter.compute_simple_position_average())\n                print(\"position error: \", np.linalg.norm(self.gt_pose[0:3,3] - self.filter.compute_simple_position_average()))\n\n        if self.use_weighted_avg:\n            position_est = 
self.filter.compute_weighted_position_average()\n        else:\n            position_est = self.filter.compute_simple_position_average()\n        rot_est = self.filter.compute_simple_rotation_average()\n        pose_est = gtsam.Pose3(rot_est, position_est).matrix()\n\n        if self.log_results:\n            self.all_pose_est.append(pose_est)\n        \n        if not self.run_inerf_compare:\n            img_timestamp = msg.header.stamp\n            self.publish_pose_est(pose_est, img_timestamp)\n        else:\n            self.publish_pose_est(pose_est)\n    \n        update_time = time.time() - start_time\n        print(\"forward passes took:\", total_nerf_time, \"out of total\", update_time, \"for update step\")\n\n        if not self.run_predicts:\n            self.filter.predict_no_motion(self.px_noise, self.py_noise, self.pz_noise, self.rot_x_noise, self.rot_y_noise, self.rot_z_noise) #  used if you want to localize a static image\n        \n        # return is just for logging\n        return pose_est\n    \n    def check_if_position_error_good(self, return_error = False):\n        \"\"\"\n        check if position error is less than 5cm, or return the error if return_error is True\n        \"\"\"\n        acceptable_error = 0.05\n        if self.use_weighted_avg:\n            error = np.linalg.norm(self.gt_pose[0:3,3] - self.filter.compute_weighted_position_average())\n            if return_error:\n                return error\n            return error < acceptable_error\n        else:\n            error = np.linalg.norm(self.gt_pose[0:3,3] - self.filter.compute_simple_position_average())\n            if return_error:\n                return error\n            return error < acceptable_error\n\n    def check_if_rotation_error_good(self, return_error = False):\n        \"\"\"\n        check if rotation error is less than 5 degrees, or return the error if return_error is True\n        \"\"\"\n        acceptable_error = 5.0\n        average_rot_t = (self.filter.compute_simple_rotation_average()).transpose()\n        # check rot in bounds by getting angle using https://math.stackexchange.com/questions/2113634/comparing-two-rotation-matrices\n\n        r_ab = average_rot_t @ (self.gt_pose[0:3,0:3])\n        rot_error = np.rad2deg(np.arccos((np.trace(r_ab) - 1) / 2))\n        print(\"rotation error: \", rot_error)\n        if return_error:\n            return rot_error\n        return abs(rot_error) < acceptable_error\n\n    def run_predict(self, delta_pose):\n        self.filter.predict_with_delta_pose(delta_pose, self.px_noise, self.py_noise, self.pz_noise, self.rot_x_noise, self.rot_y_noise, self.rot_z_noise)\n\n        if self.plot_particles:\n            self.visualize()\n    \n    def set_initial_particles(self):\n        initial_positions = np.zeros((self.num_particles, 3))\n        rots = []\n        for index, particle in enumerate(self.initial_particles_noise):\n            x = particle[0]\n            y = particle[1]\n            z = particle[2]\n            phi = particle[3]\n            theta = particle[4]\n            psi = particle[5]\n\n            particle_pose = get_pose(phi, theta, psi, x, y, z, self.nerf.obs_img_pose, self.center_about_true_pose)\n            \n            # set positions\n            initial_positions[index,:] = [particle_pose[0,3], particle_pose[1,3], particle_pose[2,3]]\n            # set orientations\n            rots.append(gtsam.Rot3(particle_pose[0:3,0:3]))\n            # print(initial_particles)\n        return {'position':initial_positions, 
'rotation':np.array(rots)}\n\n    def set_noise(self, scale):\n        self.px_noise = rospy.get_param('px_noise') / scale\n        self.py_noise = rospy.get_param('py_noise') / scale\n        self.pz_noise = rospy.get_param('pz_noise') / scale\n        self.rot_x_noise = rospy.get_param('rot_x_noise') / scale\n        self.rot_y_noise = rospy.get_param('rot_y_noise') / scale\n        self.rot_z_noise = rospy.get_param('rot_z_noise') / scale\n\n    def check_refine_gate(self):\n        # get standard deviation of particle positions\n        sd_xyz = np.std(self.filter.particles['position'], axis=0)\n        norm_std = np.linalg.norm(sd_xyz)\n        refining_used = False\n        print(\"sd_xyz:\", sd_xyz)\n        print(\"norm sd_xyz:\", norm_std)\n\n        if norm_std < self.alpha_super_refine:\n            print(\"SUPER REFINE MODE ON\")\n            # reduce original noise by a factor of 4\n            self.set_noise(scale = 4.0)\n            refining_used = True\n        elif norm_std < self.alpha_refine:\n            print(\"REFINE MODE ON\")\n            # reduce original noise by a factor of 2\n            self.set_noise(scale = 2.0)\n            refining_used = True\n        else:\n            # reset noise to original value\n            self.set_noise(scale = 1.0)\n\n        if refining_used and self.use_particle_reduction:\n            self.filter.reduce_num_particles(self.min_number_particles)\n            self.num_particles = self.min_number_particles\n\n    def publish_pose_est(self, pose_est_gtsam, img_timestamp = None):\n        pose_est = Odometry()\n        pose_est.header.frame_id = \"world\"\n\n        # if we don't run on rosbag data then we don't have timestamps\n        if img_timestamp is not None:\n            pose_est.header.stamp = img_timestamp\n\n        pose_est_gtsam = gtsam.Pose3(pose_est_gtsam)\n        position_est = pose_est_gtsam.translation()\n        rot_est = pose_est_gtsam.rotation().quaternion()\n\n        # populate msg with pose information; gtsam quaternions are ordered [w, x, y, z]\n        pose_est.pose.pose.position.x = position_est[0]\n        pose_est.pose.pose.position.y = position_est[1]\n        pose_est.pose.pose.position.z = position_est[2]\n        pose_est.pose.pose.orientation.w = rot_est[0]\n        pose_est.pose.pose.orientation.x = rot_est[1]\n        pose_est.pose.pose.orientation.y = rot_est[2]\n        pose_est.pose.pose.orientation.z = rot_est[3]\n        # print(pose_est_gtsam.rotation().ypr())\n\n        # publish pose\n        self.pose_pub.publish(pose_est)\n\n    def visualize(self):\n        # publish pose array of particles' poses\n        poses = []\n        for index, particle in enumerate(self.filter.particles['position']):\n            p = Pose()\n            p.position.x = particle[0]\n            p.position.y = particle[1]\n            p.position.z = particle[2]\n            rot = self.filter.particles['rotation'][index]\n            orient = rot.quaternion()\n            p.orientation.w = orient[0]\n            p.orientation.x = orient[1]\n            p.orientation.y = orient[2]\n            p.orientation.z = orient[3]\n            poses.append(p)\n\n        pa = PoseArray()\n        pa.poses = poses\n        pa.header.frame_id = \"world\"\n        self.particle_pub.publish(pa)\n\n        # if we have a ground truth pose then publish it\n        if not self.use_received_image or self.gt_pose is not None:\n      
      gt_array = PoseArray()\n            gt = Pose()\n            gt_rot = gtsam.Rot3(self.gt_pose[0:3,0:3]).quaternion()\n            gt.orientation.w = gt_rot[0]\n            gt.orientation.x = gt_rot[1]\n            gt.orientation.y = gt_rot[2]\n            gt.orientation.z = gt_rot[3]\n            gt.position.x = self.gt_pose[0,3]\n            gt.position.y = self.gt_pose[1,3]\n            gt.position.z = self.gt_pose[2,3]\n            gt_array.poses = [gt]\n            gt_array.header.frame_id = \"world\"\n            self.gt_pub.publish(gt_array)\n \ndef average_arrays(axis_list):\n    \"\"\"\n    average arrays of different size\n    adapted from https://stackoverflow.com/questions/49037902/how-to-interpolate-a-line-between-two-other-lines-in-python/49041142#49041142\n\n    axis_list = [forward_passes_all, accuracy]\n    \"\"\"\n    min_max_xs = [(min(axis), max(axis)) for axis in axis_list[0]]\n\n    new_axis_xs = [np.linspace(min_x, max_x, 100) for min_x, max_x in min_max_xs]\n    new_axis_ys = []\n    for i in range(len(axis_list[0])):\n        new_axis_ys.append(np.interp(new_axis_xs[i], axis_list[0][i], axis_list[1][i]))\n\n    midx = [np.mean([new_axis_xs[axis_idx][i] for axis_idx in range(len(axis_list[0]))])/1000.0 for i in range(100)]\n    midy = [np.mean([new_axis_ys[axis_idx][i] for axis_idx in range(len(axis_list[0]))]) for i in range(100)]\n\n    plt.plot(midx, midy)\n    plt.xlabel(\"number of forward passes (in thousands)\")\n    plt.grid()\n    plt.show()\n\nif __name__ == \"__main__\":\n    rospy.init_node(\"nav_node\")\n\n    run_inerf_compare = rospy.get_param(\"run_inerf_compare\")\n    use_logged_start = rospy.get_param(\"use_logged_start\")\n    log_directory = rospy.get_param(\"log_directory\")\n\n    if run_inerf_compare:\n        num_starts_per_dataset = 5 # TODO make this a param\n        datasets = ['fern', 'horns', 'fortress', 'room'] # TODO make this a param\n        # datasets = ['fern']\n\n        total_position_error_good = []\n        total_rotation_error_good = []\n        total_num_forward_passes = []\n        for dataset_index, dataset_name in enumerate(datasets):\n            print(\"Starting iNeRF Style Test on Dataset: \", dataset_name)\n            if use_logged_start:\n                start_pose_files = glob.glob(log_directory + \"/initial_pose_\" + dataset_name + '_' +'*')\n\n            # only use an image number once per dataset\n            used_img_nums = set()\n            for i in range(num_starts_per_dataset):\n                if not use_logged_start:\n                    img_num = np.random.randint(low=0, high=20) # TODO can increase the range of images\n                    while img_num in used_img_nums:\n                        img_num = np.random.randint(low=0, high=20)\n                    used_img_nums.add(img_num)\n                \n                else:\n                    start_file = start_pose_files[i]\n                    img_num = int(start_file.split('_')[5])\n\n                mcl_local = Navigator(img_num, dataset_name)\n                print()\n                print(\"Using Image Number:\", mcl_local.obs_img_num)\n                print(\"Test\", i+1, \"out of\", num_starts_per_dataset)\n\n                num_forward_passes_per_iteration = [0]\n                position_error_good = []\n                rotation_error_good = []\n                ii = 0\n                while num_forward_passes_per_iteration[-1] < mcl_local.forward_passes_limit:\n                    print()\n                    print(\"forward pass 
limit, current number forward passes:\", mcl_local.forward_passes_limit, num_forward_passes_per_iteration[-1])\n\n                    position_error_good.append(int(mcl_local.check_if_position_error_good()))\n                    rotation_error_good.append(int(mcl_local.check_if_rotation_error_good()))\n                    if ii != 0:\n                        mcl_local.rgb_run('temp')\n                        num_forward_passes_per_iteration.append(num_forward_passes_per_iteration[ii-1] + mcl_local.num_particles * (mcl_local.course_samples + mcl_local.fine_samples) * mcl_local.batch_size)\n                    ii += 1\n\n                if mcl_local.log_results:\n                    with open(mcl_local.log_directory + \"/\" + \"mocnerf_\" + mcl_local.log_prefix + \"_\" + mcl_local.model_name + \"_\" + str(mcl_local.obs_img_num) + \"_\" + \"poses.npy\", 'wb') as f:\n                        np.save(f, np.array(mcl_local.all_pose_est))\n                    with open(mcl_local.log_directory + \"/\" + \"mocnerf_\" + mcl_local.log_prefix + \"_\" + mcl_local.model_name + \"_\" + str(mcl_local.obs_img_num) + \"_\" + \"forward_passes.npy\", 'wb') as f:\n                        np.save(f, np.array(num_forward_passes_per_iteration))\n\n                total_num_forward_passes.append(num_forward_passes_per_iteration)\n                total_position_error_good.append(position_error_good)\n                total_rotation_error_good.append(rotation_error_good)\n\n        average_arrays([total_num_forward_passes, total_position_error_good])\n        average_arrays([total_num_forward_passes, total_rotation_error_good])\n\n    # run normal live ROS mode\n    else:\n        mcl = Navigator()      \n        while not rospy.is_shutdown():\n            if mcl.img_msg is not None:\n                mcl.rgb_run(mcl.img_msg)\n                mcl.img_msg = None # TODO not thread safe\n"
  },
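The `check_if_rotation_error_good` method above recovers the rotation error as the geodesic angle of the relative rotation, using the identity trace(R) = 1 + 2 cos(angle). A minimal standalone sketch of that computation (illustration only, not repo code; the `np.clip` is an added guard against `arccos` domain errors from floating-point round-off, the NaN case the logging script's TODO also mentions):

```python
import numpy as np

def rotation_error_deg(R_a, R_b):
    # angle of the relative rotation R_a^T @ R_b, from trace(R) = 1 + 2*cos(angle)
    r_ab = R_a.T @ R_b
    cos_angle = np.clip((np.trace(r_ab) - 1.0) / 2.0, -1.0, 1.0)
    return np.rad2deg(np.arccos(cos_angle))

# sanity check: a 30 degree rotation about z vs. identity reports ~30 degrees
theta = np.deg2rad(30.0)
R_z = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                [np.sin(theta),  np.cos(theta), 0.0],
                [0.0,            0.0,           1.0]])
print(rotation_error_deg(np.eye(3), R_z))  # ~30.0
```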
  {
    "path": "src/navigator_base.py",
    "content": "import rospy\nimport numpy as np\nimport warnings\nfrom full_filter import NeRF\n\n # Base class to handle loading params from yaml.\n\nclass NavigatorBase:\n    def __init__(self, img_num=0, dataset_name=None):\n        # extract params\n        self.factor = rospy.get_param('factor')\n        self.focal = rospy.get_param('focal')\n        self.H = rospy.get_param('H')\n        self.W = rospy.get_param('W')\n        self.dataset_type = rospy.get_param('dataset_type')\n        self.num_particles = rospy.get_param('num_particles')\n        self.plot_particles  = rospy.get_param('visualize_particles')\n        self.rgb_topic = rospy.get_param('rgb_topic')\n        self.pose_topic = rospy.get_param('vio_topic')\n        self.near = rospy.get_param('near')\n        self.far = rospy.get_param('far')\n        self.course_samples = rospy.get_param('course_samples')\n        self.fine_samples = rospy.get_param('fine_samples')\n        self.batch_size = rospy.get_param('batch_size')\n        self.kernel_size = rospy.get_param('kernel_size')\n        self.lrate = rospy.get_param('lrate')\n        self.sampling_strategy = rospy.get_param('sampling_strategy')\n        self.no_ndc = rospy.get_param('no_ndc')\n        self.dil_iter = rospy.get_param('dil_iter')\n        self.multires = rospy.get_param('multires')\n        self.multires_views = rospy.get_param('multires_views')\n        self.i_embed = rospy.get_param('i_embed')\n        self.netwidth = rospy.get_param('netwidth')\n        self.netdepth = rospy.get_param('netdepth')\n        self.netdepth_fine = rospy.get_param('netdepth_fine')\n        self.netwidth_fine = rospy.get_param('netwidth_fine')\n        self.use_viewdirs = rospy.get_param('use_viewdirs')\n        self.perturb = rospy.get_param('perturb')\n        self.white_bkgd = rospy.get_param('white_bkgd')\n        self.raw_noise_std = rospy.get_param('raw_noise_std')\n        self.lindisp = rospy.get_param('lindisp')\n        self.netchunk = rospy.get_param('netchunk')\n        self.chunk = rospy.get_param('chunk')\n        self.bd_factor = rospy.get_param('bd_factor')\n\n        self.log_prefix = rospy.get_param('log_prefix')\n\n        # just used for Nerf-Navigation comparison\n        self.model_ngp = None\n        self.ngp_opt = None\n\n        if dataset_name is not None:\n            self.model_name = dataset_name\n        else:\n            self.model_name = rospy.get_param('model_name')\n        self.data_dir = rospy.get_param('data_dir') + \"/\" + self.model_name\n        self.ckpt_dir = rospy.get_param('ckpt_dir') + \"/\" + self.model_name     \n\n        self.obs_img_num = img_num\n\n        # TODO these don't individually need to be part of the navigator class\n        nerf_params = {'near':self.near, 'far':self.far, 'course_samples':self.course_samples, 'fine_samples':self.fine_samples,\n                       'batch_size':self.batch_size, 'factor':self.factor, 'focal':self.focal, 'H':self.H, 'W':self.W, 'dataset_type':self.dataset_type,\n                       'obs_img_num':self.obs_img_num, 'kernel_size':self.kernel_size, 'lrate':self.lrate, 'sampling_strategy':self.sampling_strategy,\n                       'model_name':self.model_name, 'data_dir':self.data_dir, 'no_ndc':self.no_ndc, 'dil_iter':self.dil_iter,\n                       'multires':self.multires, 'multires_views':self.multires_views, 'i_embed':self.i_embed, 'netwidth':self.netwidth, 'netdepth':self.netdepth,\n                       'netdepth_fine':self.netdepth_fine, 
'netwidth_fine':self.netwidth_fine, 'use_viewdirs':self.use_viewdirs, 'ckpt_dir':self.ckpt_dir,\n                       'perturb':self.perturb, 'white_bkgd':self.white_bkgd, 'raw_noise_std':self.raw_noise_std, 'lindisp':self.lindisp,\n                       'netchunk':self.netchunk, 'chunk':self.chunk, 'bd_factor':self.bd_factor}\n        self.nerf = NeRF(nerf_params)\n        \n        self.image = None\n        self.rgb_input_count = 0\n        self.num_updates = 0\n        self.photometric_loss = rospy.get_param('photometric_loss')\n\n        self.view_debug_image_iteration = rospy.get_param('view_debug_image_iteration')\n\n        self.px_noise = rospy.get_param('px_noise')\n        self.py_noise = rospy.get_param('py_noise')\n        self.pz_noise = rospy.get_param('pz_noise')\n        self.rot_x_noise = rospy.get_param('rot_x_noise')\n        self.rot_y_noise = rospy.get_param('rot_y_noise')\n        self.rot_z_noise = rospy.get_param('rot_z_noise')\n\n        self.use_convergence_protection = rospy.get_param('use_convergence_protection')\n        self.number_convergence_particles = rospy.get_param('number_convergence_particles')\n        self.convergence_noise = rospy.get_param('convergence_noise')\n\n        self.use_weighted_avg = rospy.get_param('use_weighted_avg')\n\n        self.min_number_particles = rospy.get_param('min_number_particles')\n        self.use_particle_reduction = rospy.get_param('use_particle_reduction')\n\n        self.alpha_refine = rospy.get_param('alpha_refine')\n        self.alpha_super_refine = rospy.get_param('alpha_super_refine')\n\n        self.run_predicts = rospy.get_param('run_predicts')\n        self.use_received_image = rospy.get_param('use_received_image')\n        self.run_inerf_compare = rospy.get_param('run_inerf_compare')\n        self.global_loc_mode = rospy.get_param('global_loc_mode')\n        self.run_nerfnav_compare = rospy.get_param('run_nerfnav_compare')\n        self.nerf_nav_directory = rospy.get_param('nerf_nav_directory')\n        self.center_about_true_pose = rospy.get_param('center_about_true_pose')\n        self.use_refining = rospy.get_param('use_refining')\n        self.log_results = rospy.get_param('log_results')\n        self.log_directory = rospy.get_param('log_directory')\n        self.use_logged_start = rospy.get_param('use_logged_start')\n        self.forward_passes_limit = rospy.get_param('forward_passes_limit')\n\n        self.min_bounds = rospy.get_param('min_bounds')\n        self.max_bounds = rospy.get_param('max_bounds')\n\n        self.R_bodyVins_camNerf = rospy.get_param('R_bodyVins_camNerf')\n        \n        self.previous_vio_pose = None\n        self.nerf_pose = None\n        self.all_pose_est = [] # plus 1 since we put in the initial pose before the first update\n        self.img_msg = None\n        \n        # for now only have gt pose for llff dataset for inerf comparison and nerf-nav comparison\n        self.gt_pose = None\n        if not self.use_received_image:\n            self.gt_pose = np.copy(self.nerf.obs_img_pose)\n        \n        self.check_params()\n\n    def check_params(self):\n        \"\"\"\n        Useful helper function to check if suspicious or invalid params are being used.\n        TODO: Not all bad combinations of params are currently checked here.\n        \"\"\"\n\n        if self.alpha_super_refine > self.alpha_refine:\n            warnings.warn(\"alpha_super_refine is larger than alpha_refine, code will run but they are probably flipped by the user\")\n        \n        if 
self.sampling_strategy != \"random\":\n            warnings.warn(\"invalid sampling strategy; currently only the following is supported: random\")\n\n        if self.photometric_loss != \"rgb\":\n            warnings.warn(\"invalid photometric loss; currently only the following is supported: rgb\")"
  },
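`NavigatorBase` reads every setting with bare `rospy.get_param` calls, so it fails loudly if the yaml from `/cfg` was not loaded onto the parameter server (the launch file is expected to do that). A minimal sketch of the pattern (hypothetical standalone node, not repo code):

```python
import rospy

rospy.init_node("param_demo")

# required param: raises KeyError if the cfg yaml was not loaded
num_particles = rospy.get_param("num_particles")

# optional read with a fallback default; NavigatorBase deliberately
# avoids defaults so a misconfigured run fails immediately
plot_particles = rospy.get_param("visualize_particles", False)
print(num_particles, plot_particles)
```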
  {
    "path": "src/particle_filter.py",
    "content": "import numpy as np\nimport gtsam\n\nfrom multiprocessing import Lock\n\nclass ParticleFilter:\n\n    def __init__(self, initial_particles):\n        self.num_particles=len(initial_particles['position'])\n        self.particles = initial_particles\n        self.weights=np.ones(self.num_particles)\n        self.particle_lock = Lock()\n\n    def reduce_num_particles(self, num_particles):\n        self.particle_lock.acquire()\n        self.num_particles = num_particles\n        self.weights = self.weights[0:num_particles]\n        self.particles['position'] = self.particles['position'][0:num_particles]\n        self.particles['rotation'] = self.particles['rotation'][0:num_particles]\n        self.particle_lock.release()\n\n    def predict_no_motion(self, p_x, p_y, p_z, r_x, r_y, r_z):\n        self.particle_lock.acquire()\n        self.particles['position'][:,0] += p_x * np.random.normal(size = (self.particles['position'].shape[0]))\n        self.particles['position'][:,1] += p_y * np.random.normal(size = (self.particles['position'].shape[0]))\n        self.particles['position'][:,2] += p_z * np.random.normal(size = (self.particles['position'].shape[0]))\n\n        # TODO see if this can be made faster\n        for i in range(len(self.particles['rotation'])):\n            n1 = r_x * np.random.normal()\n            n2 = r_y * np.random.normal()\n            n3 = r_z * np.random.normal()\n            self.particles['rotation'][i] = self.particles['rotation'][i].retract(np.array([n1, n2, n3]))\n        self.particle_lock.release()\n\n    def predict_with_delta_pose(self, delta_pose, p_x, p_y, p_z, r_x, r_y, r_z):\n        self.particle_lock.acquire()\n\n        # TODO see if this can be made faster\n        delta_rot_t_tp1= delta_pose.rotation()\n        for i in range(len(self.particles['rotation'])):\n            # TODO do rotation in gtsam without casting to matrix\n            pose = gtsam.Pose3(self.particles['rotation'][i], self.particles['position'][i])\n            new_pose = gtsam.Pose3(pose.matrix() @ delta_pose.matrix())\n            new_position = new_pose.translation()\n            self.particles['position'][i][0] = new_position[0]\n            self.particles['position'][i][1] = new_position[1]\n            self.particles['position'][i][2] = new_position[2]\n            self.particles['rotation'][i] = new_pose.rotation()\n\n            n1 = r_x * np.random.normal()\n            n2 = r_y * np.random.normal()\n            n3 = r_z * np.random.normal()\n            self.particles['rotation'][i] = gtsam.Rot3(self.particles['rotation'][i].retract(np.array([n1, n2, n3])).matrix())\n\n        self.particles['position'][:,0] += (p_x * np.random.normal(size = (self.particles['position'].shape[0])))\n        self.particles['position'][:,1] += (p_y * np.random.normal(size = (self.particles['position'].shape[0])))\n        self.particles['position'][:,2] += (p_z * np.random.normal(size = (self.particles['position'].shape[0])))\n        \n        self.particle_lock.release()\n\n    def update(self):\n        # use fourth power\n        self.weights = np.square(self.weights)\n        self.weights = np.square(self.weights)\n\n        # normalize weights\n        sum_weights=np.sum(self.weights)\n        # print(\"pre-normalized weight sum\", sum_weights)\n        self.weights=self.weights / sum_weights\n    \n        #resample\n        self.particle_lock.acquire()\n        choice = np.random.choice(self.num_particles, self.num_particles, p = self.weights, replace=True)\n        temp 
= {'position':np.copy(self.particles['position'])[choice, :], 'rotation':np.copy(self.particles['rotation'])[choice]}\n        self.particles = temp\n        self.particle_lock.release()\n\n    def compute_simple_position_average(self):\n        # Simple averaging does not use weighted average or k means.\n        avg_pose = np.average(self.particles['position'], axis=0)\n        return avg_pose\n\n    def compute_weighted_position_average(self):\n        avg_pose = np.average(self.particles['position'], weights=self.weights, axis=0)\n        return avg_pose\n\n    def compute_simple_rotation_average(self):\n        # Simple averaging does not use weighted average or k means.\n        # https://users.cecs.anu.edu.au/~hartley/Papers/PDF/Hartley-Trumpf:Rotation-averaging:IJCV.pdf section 5.3 Algorithm 1\n        epsilon = 0.000001\n        max_iters = 300\n        rotations = self.particles['rotation']\n        R = rotations[0]\n        for i in range(max_iters):\n            rot_sum = np.zeros((3))\n            for rot in rotations:\n                rot_sum = rot_sum + gtsam.Rot3.Logmap(gtsam.Rot3(R.transpose() @ rot.matrix()))\n\n            r = rot_sum / len(rotations)\n            if np.linalg.norm(r) < epsilon:\n                # print(\"rotation averaging converged at iteration: \", i)\n                # print(\"average rotation: \", R)\n                return R\n            else:\n                # TODO do the matrix math in gtsam to avoid all the type casting\n                R = gtsam.Rot3(R.matrix() @ gtsam.Rot3.Expmap(r).matrix())\n\n        # did not converge within max_iters; return the latest estimate\n        return R"
  },
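`ParticleFilter.update` sharpens the weights by squaring twice (a fourth power), normalizes them into a distribution, and then multinomially resamples with `np.random.choice`, so low-loss particles are duplicated and high-loss particles die off. A standalone toy run of that step with made-up losses (illustration only, not repo code):

```python
import numpy as np

rng = np.random.default_rng(0)
positions = rng.uniform(-1.0, 1.0, size=(8, 3))  # toy particle positions
losses = rng.uniform(0.1, 1.0, size=8)           # stand-in NeRF losses

weights = 1.0 / losses     # same reciprocal weighting rgb_run applies
weights = weights ** 4     # fourth power, as in ParticleFilter.update
weights /= weights.sum()   # normalize into a probability distribution

# multinomial resampling: indices of low-loss particles repeat
choice = rng.choice(len(positions), size=len(positions), p=weights, replace=True)
resampled = positions[choice]
print("copies kept of each original particle:", np.bincount(choice, minlength=8))
```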
  {
    "path": "src/render_helpers.py",
    "content": "import os\nimport torch\nfrom tqdm import tqdm\nimport time\nimport imageio\nimport numpy as np\nimport torch.nn.functional as F\n#from torchsearchsorted import searchsorted\nfrom utils import to8b\nDEBUG = False\n\n# most of this script is from iNeRF https://github.com/salykovaa/inerf\n# and NeRF-Pytorch https://github.com/yenchenlin/nerf-pytorch/blob/master/load_llff.py\n\n# Ray helpers\ndef get_rays(H, W, focal, c2w):\n    i, j = torch.meshgrid(torch.linspace(0, W-1, W), torch.linspace(0, H-1, H))  # pytorch's meshgrid has indexing='ij'\n    i = i.t()\n    j = j.t()\n    dirs = torch.stack([(i-W*.5)/focal, -(j-H*.5)/focal, -torch.ones_like(i)], -1)\n    # Rotate ray directions from camera frame to the world frame\n    rays_d = torch.sum(dirs[..., np.newaxis, :] * c2w[:3,:3], -1)  # dot product, equals to: [c2w.dot(dir) for dir in dirs]\n    # Translate camera frame's origin to the world frame. It is the origin of all rays.\n    rays_o = c2w[:3,-1].expand(rays_d.shape)\n    return rays_o, rays_d\n\n\ndef get_rays_np(H, W, focal, c2w):\n    i, j = np.meshgrid(np.arange(W, dtype=np.float32), np.arange(H, dtype=np.float32), indexing='xy')\n    dirs = np.stack([(i - W * .5) / focal, -(j - H * .5) / focal, -np.ones_like(i)], -1)\n    # Rotate ray directions from camera frame to the world frame\n    rays_d = np.sum(dirs[..., np.newaxis, :] * c2w[:3, :3],\n                    -1)  # dot product, equals to: [c2w.dot(dir) for dir in dirs]\n    # Translate camera frame's origin to the world frame. It is the origin of all rays.\n    rays_o = np.broadcast_to(c2w[:3, -1], np.shape(rays_d))\n    return rays_o, rays_d\n\n\ndef ndc_rays(H, W, focal, near, rays_o, rays_d):\n    # Shift ray origins to near plane\n    t = -(near + rays_o[..., 2]) / rays_d[..., 2]\n    rays_o = rays_o + t[..., None] * rays_d\n\n    # Projection\n    o0 = -1. / (W / (2. * focal)) * rays_o[..., 0] / rays_o[..., 2]\n    o1 = -1. / (H / (2. * focal)) * rays_o[..., 1] / rays_o[..., 2]\n    o2 = 1. + 2. * near / rays_o[..., 2]\n\n    d0 = -1. / (W / (2. * focal)) * (rays_d[..., 0] / rays_d[..., 2] - rays_o[..., 0] / rays_o[..., 2])\n    d1 = -1. / (H / (2. * focal)) * (rays_d[..., 1] / rays_d[..., 2] - rays_o[..., 1] / rays_o[..., 2])\n    d2 = -2. 
* near / rays_o[..., 2]\n\n    rays_o = torch.stack([o0, o1, o2], -1)\n    rays_d = torch.stack([d0, d1, d2], -1)\n\n    return rays_o, rays_d\n\n\n# Hierarchical sampling (section 5.2)\ndef sample_pdf(bins, weights, N_samples, det=False, pytest=False):\n    # Get pdf\n    weights = weights + 1e-5  # prevent nans\n    pdf = weights / torch.sum(weights, -1, keepdim=True)\n    cdf = torch.cumsum(pdf, -1)\n    cdf = torch.cat([torch.zeros_like(cdf[..., :1]), cdf], -1)  # (batch, len(bins))\n\n    # Take uniform samples\n    if det:\n        u = torch.linspace(0., 1., steps=N_samples)\n        u = u.expand(list(cdf.shape[:-1]) + [N_samples])\n    else:\n        u = torch.rand(list(cdf.shape[:-1]) + [N_samples])\n\n    # Pytest, overwrite u with numpy's fixed random numbers\n    if pytest:\n        print(\"pytest is true\")\n        np.random.seed(0)\n        new_shape = list(cdf.shape[:-1]) + [N_samples]\n        if det:\n            u = np.linspace(0., 1., N_samples)\n            u = np.broadcast_to(u, new_shape)\n        else:\n            u = np.random.rand(*new_shape)\n        u = torch.Tensor(u)\n\n    # Invert CDF\n    u = u.contiguous()\n    inds = torch.searchsorted(cdf, u, right=True)\n    below = torch.max(torch.zeros_like(inds - 1), inds - 1)\n    above = torch.min((cdf.shape[-1] - 1) * torch.ones_like(inds), inds)\n    inds_g = torch.stack([below, above], -1)  # (batch, N_samples, 2)\n\n    # cdf_g = tf.gather(cdf, inds_g, axis=-1, batch_dims=len(inds_g.shape)-2)\n    # bins_g = tf.gather(bins, inds_g, axis=-1, batch_dims=len(inds_g.shape)-2)\n    matched_shape = [inds_g.shape[0], inds_g.shape[1], cdf.shape[-1]]\n    cdf_g = torch.gather(cdf.unsqueeze(1).expand(matched_shape), 2, inds_g)\n    bins_g = torch.gather(bins.unsqueeze(1).expand(matched_shape), 2, inds_g)\n\n    denom = (cdf_g[..., 1] - cdf_g[..., 0])\n    denom = torch.where(denom < 1e-5, torch.ones_like(denom), denom)\n    t = (u - cdf_g[..., 0]) / denom\n    samples = bins_g[..., 0] + t * (bins_g[..., 1] - bins_g[..., 0])\n    return samples\n\n\ndef raw2outputs(raw, z_vals, rays_d, raw_noise_std=0, white_bkgd=False, pytest=False):\n    \"\"\"Transforms model's predictions to semantically meaningful values.\n    Args:\n        raw: [num_rays, num_samples along ray, 4]. Prediction from model.\n        z_vals: [num_rays, num_samples along ray]. Integration time.\n        rays_d: [num_rays, 3]. Direction of each ray.\n    Returns:\n        rgb_map: [num_rays, 3]. Estimated RGB color of a ray.\n        disp_map: [num_rays]. Disparity map. Inverse of depth map.\n        acc_map: [num_rays]. Sum of weights along each ray.\n        weights: [num_rays, num_samples]. Weights assigned to each sampled color.\n        depth_map: [num_rays]. 
Estimated distance to object.\n    \"\"\"\n    raw2alpha = lambda raw, dists, act_fn=F.relu: 1.-torch.exp(-act_fn(raw)*dists)\n\n    dists = z_vals[...,1:] - z_vals[...,:-1]\n    dists = torch.cat([dists, torch.Tensor([1e10]).expand(dists[...,:1].shape)], -1)  # [N_rays, N_samples]\n\n    dists = dists * torch.norm(rays_d[...,None,:], dim=-1)\n\n    rgb = torch.sigmoid(raw[...,:3])  # [N_rays, N_samples, 3]\n    noise = 0.\n    if raw_noise_std > 0.:\n        noise = torch.randn(raw[...,3].shape) * raw_noise_std\n\n        # Overwrite randomly sampled data if pytest\n        if pytest:\n            np.random.seed(0)\n            noise = np.random.rand(*list(raw[...,3].shape)) * raw_noise_std\n            noise = torch.Tensor(noise)\n\n    alpha = raw2alpha(raw[...,3] + noise, dists)  # [N_rays, N_samples]\n    # weights = alpha * tf.math.cumprod(1.-alpha + 1e-10, -1, exclusive=True)\n    weights = alpha * torch.cumprod(torch.cat([torch.ones((alpha.shape[0], 1)), 1.-alpha + 1e-10], -1), -1)[:, :-1]\n    rgb_map = torch.sum(weights[...,None] * rgb, -2)  # [N_rays, 3]\n\n    depth_map = torch.sum(weights * z_vals, -1)\n    disp_map = 1./torch.max(1e-10 * torch.ones_like(depth_map), depth_map / torch.sum(weights, -1))\n    acc_map = torch.sum(weights, -1)\n\n    if white_bkgd:\n        rgb_map = rgb_map + (1.-acc_map[...,None])\n\n    return rgb_map, disp_map, acc_map, weights, depth_map\n\n\ndef render_rays(ray_batch,\n                network_fn,\n                network_query_fn,\n                N_samples,\n                retraw=False,\n                lindisp=False,\n                perturb=0.,\n                N_importance=0,\n                network_fine=None,\n                white_bkgd=False,\n                raw_noise_std=0.,\n                verbose=False,\n                pytest=False):\n    \"\"\"Volumetric rendering.\n    Args:\n      ray_batch: array of shape [batch_size, ...]. All information necessary\n        for sampling along a ray, including: ray origin, ray direction, min\n        dist, max dist, and unit-magnitude viewing direction.\n      network_fn: function. Model for predicting RGB and density at each point\n        in space.\n      network_query_fn: function used for passing queries to network_fn.\n      N_samples: int. Number of different times to sample along each ray.\n      retraw: bool. If True, include model's raw, unprocessed predictions.\n      lindisp: bool. If True, sample linearly in inverse depth rather than in depth.\n      perturb: float, 0 or 1. If non-zero, each ray is sampled at stratified\n        random points in time.\n      N_importance: int. Number of additional times to sample along each ray.\n        These samples are only passed to network_fine.\n      network_fine: \"fine\" network with same spec as network_fn.\n      white_bkgd: bool. If True, assume a white background.\n      raw_noise_std: ...\n      verbose: bool. If True, print more debugging info.\n    Returns:\n      rgb_map: [num_rays, 3]. Estimated RGB color of a ray. Comes from fine model.\n      disp_map: [num_rays]. Disparity map. 1 / depth.\n      acc_map: [num_rays]. Accumulated opacity along each ray. Comes from fine model.\n      raw: [num_rays, num_samples, 4]. Raw predictions from model.\n      rgb0: See rgb_map. Output for coarse model.\n      disp0: See disp_map. Output for coarse model.\n      acc0: See acc_map. Output for coarse model.\n      z_std: [num_rays]. 
Standard deviation of distances along ray for each\n        sample.\n    \"\"\"\n    N_rays = ray_batch.shape[0]\n    rays_o, rays_d = ray_batch[:,0:3], ray_batch[:,3:6] # [N_rays, 3] each\n    viewdirs = ray_batch[:,-3:] if ray_batch.shape[-1] > 8 else None\n    bounds = torch.reshape(ray_batch[...,6:8], [-1,1,2])\n    near, far = bounds[...,0], bounds[...,1] # [-1,1]\n\n    t_vals = torch.linspace(0., 1., steps=N_samples)\n    if not lindisp:\n        z_vals = near * (1.-t_vals) + far * (t_vals)\n    else:\n        z_vals = 1./(1./near * (1.-t_vals) + 1./far * (t_vals))\n\n    z_vals = z_vals.expand([N_rays, N_samples])\n\n    if perturb > 0.:\n        # get intervals between samples\n        mids = .5 * (z_vals[...,1:] + z_vals[...,:-1])\n        upper = torch.cat([mids, z_vals[...,-1:]], -1)\n        lower = torch.cat([z_vals[...,:1], mids], -1)\n        # stratified samples in those intervals\n        t_rand = torch.rand(z_vals.shape)\n\n        # Pytest, overwrite u with numpy's fixed random numbers\n        if pytest:\n            np.random.seed(0)\n            t_rand = np.random.rand(*list(z_vals.shape))\n            t_rand = torch.Tensor(t_rand)\n\n        z_vals = lower + (upper - lower) * t_rand\n\n    pts = rays_o[...,None,:] + rays_d[...,None,:] * z_vals[...,:,None] # [N_rays, N_samples, 3]\n\n\n#     raw = run_network(pts)\n    raw = network_query_fn(pts, viewdirs, network_fn)\n    rgb_map, disp_map, acc_map, weights, depth_map = raw2outputs(raw, z_vals, rays_d, raw_noise_std, white_bkgd, pytest=pytest)\n\n    if N_importance > 0:\n\n        rgb_map_0, disp_map_0, acc_map_0 = rgb_map, disp_map, acc_map\n\n        z_vals_mid = .5 * (z_vals[...,1:] + z_vals[...,:-1])\n        z_samples = sample_pdf(z_vals_mid, weights[...,1:-1], N_importance, det=(perturb==0.), pytest=pytest)\n        z_samples = z_samples.detach()\n\n        z_vals, _ = torch.sort(torch.cat([z_vals, z_samples], -1), -1)\n        pts = rays_o[...,None,:] + rays_d[...,None,:] * z_vals[...,:,None] # [N_rays, N_samples + N_importance, 3]\n\n        run_fn = network_fn if network_fine is None else network_fine\n#         raw = run_network(pts, fn=run_fn)\n        raw = network_query_fn(pts, viewdirs, run_fn)\n\n        rgb_map, disp_map, acc_map, weights, depth_map = raw2outputs(raw, z_vals, rays_d, raw_noise_std, white_bkgd, pytest=pytest)\n\n    ret = {'rgb_map' : rgb_map, 'disp_map' : disp_map, 'acc_map' : acc_map}\n    if retraw:\n        ret['raw'] = raw\n    if N_importance > 0:\n        ret['rgb0'] = rgb_map_0\n        ret['disp0'] = disp_map_0\n        ret['acc0'] = acc_map_0\n        ret['z_std'] = torch.std(z_samples, dim=-1, unbiased=False)  # [N_rays]\n\n    for k in ret:\n        if (torch.isnan(ret[k]).any() or torch.isinf(ret[k]).any()) and DEBUG:\n            print(f\"! 
[Numerical Error] {k} contains nan or inf.\")\n\n    return ret\n\ndef batchify_rays(rays_flat, chunk=1024*32, **kwargs):\n    \"\"\"Render rays in smaller minibatches to avoid OOM.\n    \"\"\"\n    all_ret = {}\n    for i in range(0, rays_flat.shape[0], chunk):\n        ret = render_rays(rays_flat[i:i+chunk], **kwargs)\n        for k in ret:\n            if k not in all_ret:\n                all_ret[k] = []\n            all_ret[k].append(ret[k])\n\n    all_ret = {k : torch.cat(all_ret[k], 0) for k in all_ret}\n    return all_ret\n\n\ndef render(H, W, focal, chunk=1024*32, rays=None, c2w=None, ndc=True,\n                  near=0., far=1.,\n                  use_viewdirs=False, c2w_staticcam=None,\n                  **kwargs):\n    \"\"\"Render rays\n    Args:\n      H: int. Height of image in pixels.\n      W: int. Width of image in pixels.\n      focal: float. Focal length of pinhole camera.\n      chunk: int. Maximum number of rays to process simultaneously. Used to\n        control maximum memory usage. Does not affect final results.\n      rays: array of shape [2, batch_size, 3]. Ray origin and direction for\n        each example in batch.\n      c2w: array of shape [3, 4]. Camera-to-world transformation matrix.\n      ndc: bool. If True, represent ray origin, direction in NDC coordinates.\n      near: float or array of shape [batch_size]. Nearest distance for a ray.\n      far: float or array of shape [batch_size]. Farthest distance for a ray.\n      use_viewdirs: bool. If True, use viewing direction of a point in space in model.\n      c2w_staticcam: array of shape [3, 4]. If not None, use this transformation matrix for\n       camera while using other c2w argument for viewing directions.\n    Returns:\n      rgb_map: [batch_size, 3]. Predicted RGB values for rays.\n      disp_map: [batch_size]. Disparity map. Inverse of depth.\n      acc_map: [batch_size]. 
Accumulated opacity (alpha) along a ray.\n      extras: dict with everything returned by render_rays().\n    \"\"\"\n    if c2w is not None:\n        # special case to render full image\n        rays_o, rays_d = get_rays(H, W, focal, c2w)\n    else:\n        # use provided ray batch\n        rays_o, rays_d = rays\n\n    if use_viewdirs:\n        # provide ray directions as input\n        viewdirs = rays_d\n        if c2w_staticcam is not None:\n            # special case to visualize effect of viewdirs\n            rays_o, rays_d = get_rays(H, W, focal, c2w_staticcam)\n        viewdirs = viewdirs / torch.norm(viewdirs, dim=-1, keepdim=True)\n        viewdirs = torch.reshape(viewdirs, [-1,3]).float()\n\n    sh = rays_d.shape # [..., 3]\n    if ndc:\n        # for forward facing scenes\n        rays_o, rays_d = ndc_rays(H, W, focal, 1., rays_o, rays_d)\n\n    # Create ray batch\n    rays_o = torch.reshape(rays_o, [-1,3]).float()\n    rays_d = torch.reshape(rays_d, [-1,3]).float()\n\n    near, far = near * torch.ones_like(rays_d[...,:1]), far * torch.ones_like(rays_d[...,:1])\n    rays = torch.cat([rays_o, rays_d, near, far], -1)\n    if use_viewdirs:\n        rays = torch.cat([rays, viewdirs], -1)\n\n    # Render and reshape\n    all_ret = batchify_rays(rays, chunk, **kwargs)\n    for k in all_ret:\n        k_sh = list(sh[:-1]) + list(all_ret[k].shape[1:])\n        all_ret[k] = torch.reshape(all_ret[k], k_sh)\n\n    k_extract = ['rgb_map', 'disp_map', 'acc_map']\n    ret_list = [all_ret[k] for k in k_extract]\n    ret_dict = {k : all_ret[k] for k in all_ret if k not in k_extract}\n    return ret_list + [ret_dict]\n\n\ndef render_path(render_poses, hwf, chunk, render_kwargs, gt_imgs=None, savedir=None, render_factor=0):\n\n    H, W, focal = hwf\n\n    if render_factor!=0:\n        # Render downsampled for speed\n        H = H//render_factor\n        W = W//render_factor\n        focal = focal/render_factor\n\n    rgbs = []\n    disps = []\n\n    t = time.time()\n    for i, c2w in enumerate(tqdm(render_poses)):\n        print(i, time.time() - t)\n        t = time.time()\n        rgb, disp, acc, _ = render(H, W, focal, chunk=chunk, c2w=c2w[:3,:4], **render_kwargs)\n        rgbs.append(rgb.cpu().numpy())\n        disps.append(disp.cpu().numpy())\n        if i==0:\n            print(rgb.shape, disp.shape)\n\n        \"\"\"\n        if gt_imgs is not None and render_factor==0:\n            p = -10. * np.log10(np.mean(np.square(rgb.cpu().numpy() - gt_imgs[i])))\n            print(p)\n        \"\"\"\n\n        if savedir is not None:\n            rgb8 = to8b(rgbs[-1])\n            filename = os.path.join(savedir, '{:03d}.png'.format(i))\n            imageio.imwrite(filename, rgb8)\n\n\n    rgbs = np.stack(rgbs, 0)\n    disps = np.stack(disps, 0)\n\n    return rgbs, disps"
  },
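`sample_pdf` above is NeRF's hierarchical sampling: it treats the coarse-pass weights along a ray as a piecewise-constant pdf and draws the fine samples by inverting the CDF. A stripped-down, single-ray illustration of that idea (toy numbers, not repo code, without the batched `gather` bookkeeping; the clamp plays the same role as the `below`/`above` clipping in `sample_pdf`):

```python
import torch

weights = torch.tensor([0.05, 0.80, 0.10, 0.05])   # coarse weights, 4 bins
bins = torch.tensor([0.0, 1.0, 2.0, 3.0, 4.0])     # 5 bin edges along the ray
cdf = torch.cat([torch.zeros(1), torch.cumsum(weights / weights.sum(), 0)])

u = torch.rand(1000)                               # uniform draws in [0, 1)
inds = torch.searchsorted(cdf, u, right=True)      # bin each draw falls into
below = (inds - 1).clamp(min=0, max=len(weights) - 1)

# linear interpolation inside the chosen bin, as sample_pdf does
t = (u - cdf[below]) / (cdf[below + 1] - cdf[below])
samples = bins[below] + t * (bins[below + 1] - bins[below])
print(torch.histc(samples, bins=4, min=0.0, max=4.0))  # most samples land in [1, 2)
```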
  {
    "path": "src/utils.py",
    "content": "import torch\nimport numpy as np\nimport imageio\nimport cv2\nimport json\nimport os\n\n# most of this script is adapted from iNeRF https://github.com/salykovaa/inerf\n# and NeRF-Pytorch https://github.com/yenchenlin/nerf-pytorch/blob/master/load_llff.py\n\nrot_psi = lambda phi: np.array([\n        [1, 0, 0, 0],\n        [0, np.cos(phi), -np.sin(phi), 0],\n        [0, np.sin(phi), np.cos(phi), 0],\n        [0, 0, 0, 1]])\n\nrot_theta = lambda th: np.array([\n        [np.cos(th), 0, -np.sin(th), 0],\n        [0, 1, 0, 0],\n        [np.sin(th), 0, np.cos(th), 0],\n        [0, 0, 0, 1]])\n\nrot_phi = lambda psi: np.array([\n        [np.cos(psi), -np.sin(psi), 0, 0],\n        [np.sin(psi), np.cos(psi), 0, 0],\n        [0, 0, 1, 0],\n        [0, 0, 0, 1]])\n\ntrans_t = lambda x,y,z: np.array([\n        [1, 0, 0, x],\n        [0, 1, 0, y],\n        [0, 0, 1, z],\n        [0, 0, 0, 1]])\n\ndef load_blender(data_dir, model_name, obs_img_num, half_res, white_bkgd, *kwargs):\n\n    with open(os.path.join(data_dir + str(model_name) + \"/obs_imgs/\", 'transforms.json'), 'r') as fp:\n        meta = json.load(fp)\n    frames = meta['frames']\n\n    img_path =  os.path.join(data_dir + str(model_name) + \"/obs_imgs/\", frames[obs_img_num]['file_path'] + '.png')\n    img_rgba = imageio.imread(img_path)\n    img_rgba = (np.array(img_rgba) / 255.).astype(np.float32) # rgba image of type float32\n    H, W = img_rgba.shape[:2]\n    camera_angle_x = float(meta['camera_angle_x'])\n    focal = .5 * W / np.tan(.5 * camera_angle_x)\n    if white_bkgd:\n        img_rgb = img_rgba[..., :3] * img_rgba[..., -1:] + (1. - img_rgba[..., -1:])\n    else:\n        img_rgb = img_rgba[..., :3]\n\n    if half_res:\n        H = H // 2\n        W = W // 2\n        focal = focal / 2.\n        img_rgb = cv2.resize(img_rgb, (W, H), interpolation=cv2.INTER_AREA)\n\n    img_rgb = np.asarray(img_rgb*255, dtype=np.uint8)\n    obs_img_pose = np.array(frames[obs_img_num]['transform_matrix']).astype(np.float32)\n    phi, theta, psi, x, y, z = kwargs\n    start_pose =  trans_t(x, y, z) @ rot_phi(phi/180.*np.pi) @ rot_theta(theta/180.*np.pi) @ rot_psi(psi/180.*np.pi)  @ obs_img_pose\n    return img_rgb, [H, W, focal], start_pose, obs_img_pose # image of type uint8\n\ndef get_pose(phi, theta, psi, x, y, z, obs_img_pose, center_about_true_pose):\n    if center_about_true_pose:\n        # print(\"recentering\")\n        # print(obs_img_pose)\n        pose = trans_t(x, y, z) @ rot_phi(phi/180.*np.pi) @ rot_theta(theta/180.*np.pi) @ rot_psi(psi/180.*np.pi)  @ obs_img_pose\n    else:\n        pose = trans_t(x, y, z) @ rot_phi(phi/180.*np.pi) @ rot_theta(theta/180.*np.pi) @ rot_psi(psi/180.*np.pi)\n        \n    return pose\n\ndef rgb2bgr(img_rgb):\n    img_bgr = cv2.cvtColor(img_rgb, cv2.COLOR_RGB2BGR)\n    return img_bgr\n\n\ndef show_img(title, img_rgb):  # img - rgb image\n    img_bgr = rgb2bgr(img_rgb)\n    cv2.imshow(title, img_bgr)\n    cv2.waitKey(0)\n    cv2.destroyAllWindows()\n\n\ndef find_POI(img_rgb, num_points, DEBUG=False): # img - RGB image in range 0...255\n    img = np.copy(img_rgb)\n    img_gray = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)\n\n    sift = cv2.SIFT_create()\n    keypoints = sift.detect(img_gray, None)\n    if DEBUG:\n        img = cv2.drawKeypoints(img_gray, keypoints, img)\n        show_img(\"Detected points\", img)\n    xy = [keypoint.pt for keypoint in keypoints]\n    xy = np.array(xy).astype(int)\n    # Remove duplicate points\n    xy_set = set(tuple(point) for point in xy)\n    xy = 
np.array([list(point) for point in xy_set]).astype(int)\n    return xy # pixel coordinates\n\n\n# Misc\nimg2mse = lambda x, y : torch.mean((x - y) ** 2)\nmse2psnr = lambda x : -10. * torch.log(x) / torch.log(torch.Tensor([10.]))\nto8b = lambda x : (255*np.clip(x,0,1)).astype(np.uint8)\n\n# Load llff data\n\n########## Slightly modified version of LLFF data loading code\n##########  see https://github.com/Fyusion/LLFF for original\n\ndef _minify(basedir, factors=[], resolutions=[]):\n    needtoload = False\n    for r in factors:\n        imgdir = os.path.join(basedir, 'images_{}'.format(r))\n        if not os.path.exists(imgdir):\n            needtoload = True\n    for r in resolutions:\n        imgdir = os.path.join(basedir, 'images_{}x{}'.format(r[1], r[0]))\n        if not os.path.exists(imgdir):\n            needtoload = True\n    if not needtoload:\n        return\n\n    from subprocess import check_output\n\n    imgdir = os.path.join(basedir, 'images')\n    imgs = [os.path.join(imgdir, f) for f in sorted(os.listdir(imgdir))]\n    imgs = [f for f in imgs if any([f.endswith(ex) for ex in ['JPG', 'jpg', 'png', 'jpeg', 'PNG']])]\n    imgdir_orig = imgdir\n\n    wd = os.getcwd()\n\n    for r in factors + resolutions:\n        if isinstance(r, int):\n            name = 'images_{}'.format(r)\n            resizearg = '{}%'.format(100. / r)\n        else:\n            name = 'images_{}x{}'.format(r[1], r[0])\n            resizearg = '{}x{}'.format(r[1], r[0])\n        imgdir = os.path.join(basedir, name)\n        if os.path.exists(imgdir):\n            continue\n\n        print('Minifying', r, basedir)\n\n        os.makedirs(imgdir)\n        check_output('cp {}/* {}'.format(imgdir_orig, imgdir), shell=True)\n\n        ext = imgs[0].split('.')[-1]\n        args = ' '.join(['mogrify', '-resize', resizearg, '-format', 'png', '*.{}'.format(ext)])\n        print(args)\n        os.chdir(imgdir)\n        check_output(args, shell=True)\n        os.chdir(wd)\n\n        if ext != 'png':\n            check_output('rm {}/*.{}'.format(imgdir, ext), shell=True)\n            print('Removed duplicates')\n        print('Done')\n\n\ndef _load_data(basedir, factor=None, width=None, height=None, load_imgs=True):\n    poses_arr = np.load(os.path.join(basedir, 'poses_bounds.npy'))\n    poses = poses_arr[:, :-2].reshape([-1, 3, 5]).transpose([1, 2, 0])\n    bds = poses_arr[:, -2:].transpose([1, 0])\n\n    img0 = [os.path.join(basedir, 'images', f) for f in sorted(os.listdir(os.path.join(basedir, 'images'))) \\\n            if f.endswith('JPG') or f.endswith('jpg') or f.endswith('png')][0]\n    sh = imageio.imread(img0).shape\n\n    sfx = ''\n\n    if factor is not None:\n        sfx = '_{}'.format(factor)\n        _minify(basedir, factors=[factor])\n        factor = factor\n    elif height is not None:\n        factor = sh[0] / float(height)\n        width = int(sh[1] / factor)\n        _minify(basedir, resolutions=[[height, width]])\n        sfx = '_{}x{}'.format(width, height)\n    elif width is not None:\n        factor = sh[1] / float(width)\n        height = int(sh[0] / factor)\n        _minify(basedir, resolutions=[[height, width]])\n        sfx = '_{}x{}'.format(width, height)\n    else:\n        factor = 1\n\n    imgdir = os.path.join(basedir, 'images' + sfx)\n    if not os.path.exists(imgdir):\n        print(imgdir, 'does not exist, returning')\n        return\n\n    imgfiles = [os.path.join(imgdir, f) for f in sorted(os.listdir(imgdir)) if\n                f.endswith('JPG') or f.endswith('jpg') or 
f.endswith('png')]\n    if poses.shape[-1] != len(imgfiles):\n        print('Mismatch between imgs {} and poses {} !!!!'.format(len(imgfiles), poses.shape[-1]))\n        return\n\n    sh = imageio.imread(imgfiles[0]).shape\n    poses[:2, 4, :] = np.array(sh[:2]).reshape([2, 1])\n    poses[2, 4, :] = poses[2, 4, :] * 1. / factor\n\n    if not load_imgs:\n        return poses, bds\n\n    def imread(f):\n        if f.endswith('png'):\n            return imageio.imread(f, ignoregamma=True)\n        else:\n            return imageio.imread(f)\n\n    imgs = imgs = [imread(f)[..., :3] / 255. for f in imgfiles]\n    imgs = np.stack(imgs, -1)\n\n    # print('Loaded image data', imgs.shape, poses[:, -1, 0])\n    return poses, bds, imgs\n\n\ndef normalize(x):\n    return x / np.linalg.norm(x)\n\n\ndef viewmatrix(z, up, pos):\n    vec2 = normalize(z)\n    vec1_avg = up\n    vec0 = normalize(np.cross(vec1_avg, vec2))\n    vec1 = normalize(np.cross(vec2, vec0))\n    m = np.stack([vec0, vec1, vec2, pos], 1)\n    return m\n\n\ndef ptstocam(pts, c2w):\n    tt = np.matmul(c2w[:3, :3].T, (pts - c2w[:3, 3])[..., np.newaxis])[..., 0]\n    return tt\n\n\ndef poses_avg(poses):\n    hwf = poses[0, :3, -1:]\n\n    center = poses[:, :3, 3].mean(0)\n    vec2 = normalize(poses[:, :3, 2].sum(0))\n    up = poses[:, :3, 1].sum(0)\n    c2w = np.concatenate([viewmatrix(vec2, up, center), hwf], 1)\n\n    return c2w\n\n\ndef recenter_poses(poses):\n    poses_ = poses + 0\n    bottom = np.reshape([0, 0, 0, 1.], [1, 4])\n    c2w = poses_avg(poses)\n    c2w = np.concatenate([c2w[:3, :4], bottom], -2)\n    bottom = np.tile(np.reshape(bottom, [1, 1, 4]), [poses.shape[0], 1, 1])\n    poses = np.concatenate([poses[:, :3, :4], bottom], -2)\n\n    poses = np.linalg.inv(c2w) @ poses\n    poses_[:, :3, :4] = poses[:, :3, :4]\n    poses = poses_\n    return poses\n\n\n#####################\n\n\ndef spherify_poses(poses, bds):\n    p34_to_44 = lambda p: np.concatenate([p, np.tile(np.reshape(np.eye(4)[-1, :], [1, 1, 4]), [p.shape[0], 1, 1])], 1)\n\n    rays_d = poses[:, :3, 2:3]\n    rays_o = poses[:, :3, 3:4]\n\n    def min_line_dist(rays_o, rays_d):\n        A_i = np.eye(3) - rays_d * np.transpose(rays_d, [0, 2, 1])\n        b_i = -A_i @ rays_o\n        pt_mindist = np.squeeze(-np.linalg.inv((np.transpose(A_i, [0, 2, 1]) @ A_i).mean(0)) @ (b_i).mean(0))\n        return pt_mindist\n\n    pt_mindist = min_line_dist(rays_o, rays_d)\n\n    center = pt_mindist\n    up = (poses[:, :3, 3] - center).mean(0)\n\n    vec0 = normalize(up)\n    vec1 = normalize(np.cross([.1, .2, .3], vec0))\n    vec2 = normalize(np.cross(vec0, vec1))\n    pos = center\n    c2w = np.stack([vec1, vec2, vec0, pos], 1)\n\n    poses_reset = np.linalg.inv(p34_to_44(c2w[None])) @ p34_to_44(poses[:, :3, :4])\n\n    rad = np.sqrt(np.mean(np.sum(np.square(poses_reset[:, :3, 3]), -1)))\n\n    sc = 1. 
/ rad\n    poses_reset[:, :3, 3] *= sc\n    bds *= sc\n    rad *= sc\n\n    centroid = np.mean(poses_reset[:, :3, 3], 0)\n    zh = centroid[2]\n\n    poses_reset = np.concatenate(\n        [poses_reset[:, :3, :4], np.broadcast_to(poses[0, :3, -1:], poses_reset[:, :3, -1:].shape)], -1)\n\n    return poses_reset, bds\n\n\ndef load_llff_data(data_dir, model_name, obs_img_num, *kwargs, factor=8, recenter=True, bd_factor=.75, spherify=False):\n    poses, bds, imgs = _load_data(data_dir + \"/\", factor=factor)  # factor=8 downsamples original imgs by 8x\n    # print('Loaded', data_dir + str(model_name) + \"/\", bds.min(), bds.max())\n\n    # Correct rotation matrix ordering and move variable dim to axis 0\n    poses = np.concatenate([poses[:, 1:2, :], -poses[:, 0:1, :], poses[:, 2:, :]], 1)\n    poses = np.moveaxis(poses, -1, 0).astype(np.float32)\n    images = np.moveaxis(imgs, -1, 0).astype(np.float32)\n    bds = np.moveaxis(bds, -1, 0).astype(np.float32)\n\n    # Rescale if bd_factor is provided\n    sc = 1. if bd_factor is None else 1. / (bds.min() * bd_factor)\n    poses[:, :3, 3] *= sc\n    bds *= sc\n\n    if recenter:\n        poses = recenter_poses(poses)\n\n    if spherify:\n        poses, bds = spherify_poses(poses, bds)\n    print(\"loading image number: \", obs_img_num)\n    #images = images.astype(np.float32)\n    images = np.asarray(images * 255, dtype=np.uint8)\n    poses = poses.astype(np.float32)\n    hwf = poses[0,:3,-1]\n    poses = poses[:,:3,:4]\n    obs_img = images[obs_img_num]\n    obs_img_pose = np.concatenate((poses[obs_img_num], np.array([[0,0,0,1.]])), axis=0)\n    phi, theta, psi, x, y, z = kwargs\n    start_pose = rot_phi(phi/180.*np.pi) @ rot_theta(theta/180.*np.pi) @ rot_psi(psi/180.*np.pi) @ trans_t(x, y, z) @ obs_img_pose\n    return obs_img, hwf, start_pose, obs_img_pose, bds\n"
  },
  {
    "path": "tools/eval_logs.py",
    "content": "import matplotlib.pyplot as plt\nimport numpy as np\nimport glob\nimport gtsam\nimport sys\n\n######################## user params ########################\nrun_inerf = False\n# only needed if you run iNeRF\nsys.path.insert(0, '/home/dominic')\nsys.path.insert(0, '/home/dominic/inerf') # path to iNeRF root directory\nfrom inerf.run import run\n\n# font size to use for plots\nfontsize = 17\n\n# plot mean error or plot ratio of trials with position error < 5 cm and rotation error < 5 degrees\nget_mean_error = False\n\n# directory where logs are stored\nlog_directory = '/home/dominic/mocNeRF_ws/src/nav/logs/inerf_compare'\n\nnum_starts_per_dataset = 5 # number of trials per dataset\ndatasets = ['fern', 'horns', 'fortress', 'room']\n# datasets = ['fern']\n\n# input number of iterations runs by iNeRF and M\n# note this is used for plotting number of forward passes and not for telling iNeRF what params to use\nparams = {'inerf': {'num_iterations': 300, 'batch_size': 2048, 'course_samples':64, 'fine_samples':64}}\n\n##############################################################\n\ndef check_if_position_error_good(gt_pose, est_pose, get_mean_error=False):\n    acceptable_error = 0.05 # meters\n    error = np.linalg.norm(gt_pose.translation() - est_pose.translation())\n    if get_mean_error:\n        return error\n    return error < acceptable_error\n\ndef check_if_rotation_error_good(gt_pose, est_pose, get_mean_error=False):\n    acceptable_error = 5.0 # degrees\n    est_t = np.transpose(est_pose.rotation().matrix())\n    r_ab = est_t @ gt_pose.rotation().matrix()\n    rot_error = np.rad2deg(np.arccos((np.trace(r_ab) - 1) / 2)) # TODO make 0 if nan\n    \n    if get_mean_error:\n        return rot_error\n\n    return abs(rot_error) < acceptable_error\n\ndef average_arrays(axis_list, million_scale = True):\n    \"\"\"\n    average arrays of different size\n    adapted from https://stackoverflow.com/questions/49037902/how-to-interpolate-a-line-between-two-other-lines-in-python/49041142#49041142\n\n    forward_passes_all, accuracy\n    \"\"\"\n    min_max_xs = [(min(axis), max(axis)) for axis in axis_list[0]]\n\n    new_axis_xs = [np.linspace(min_x, max_x, 100) for min_x, max_x in min_max_xs]\n    new_axis_ys = []\n    for i in range(len(axis_list[0])):\n        new_axis_ys.append(np.interp(new_axis_xs[i], axis_list[0][i], axis_list[1][i]))\n    \n    scale = 1.0\n    if million_scale:\n        scale = 1000000.0\n\n    midx = [np.mean([new_axis_xs[axis_idx][i] for axis_idx in range(len(axis_list[0]))])/scale for i in range(100)]\n    midy = [np.mean([new_axis_ys[axis_idx][i] for axis_idx in range(len(axis_list[0]))]) for i in range(100)]\n\n    return midx, midy\n\ndef get_ratio(alg, get_mean_error=False, prefix=\"\"):\n    # alg can be 'mocnerf' or 'inerf'\n\n    if prefix != \"\":\n        prefix = prefix + \"_\"\n\n    if alg == 'inerf':\n        num_forward_passes_per_iteration = (params[alg]['course_samplese'] + params[alg]['fine_samplese']) * params[alg]['batch_size'] \n        forward_passes_count = [i * num_forward_passes_per_iteration / 1000000.0 for i in range(params[alg]['num_iterations'] + 1)]\n\n    total_num_forward_passes_per_iteration = []\n    total_position_error_good = []\n    total_rotation_error_good = []\n\n    for dataset_index, dataset_name in enumerate(datasets):\n        gt_files = glob.glob(log_directory + '/gt_' + dataset_name + '*')\n        for start_num, gt_file in enumerate(gt_files):\n            img_num = gt_file.split('/')[-1].split('_')[2]\n         
   gt_pose = gtsam.Pose3(np.load(gt_file))\n            # print(gt_pose)\n\n            if run_inerf and alg == 'inerf':\n                # delta_phi, delta_theta, delta_psi, delta_x, delta_y, delta_z, obs_img_num, model_name\n                pertubation_file = log_directory + \"/initial_pose_\" + dataset_name + \"_\" + str(img_num) + \"_\" + \"poses.npy\"\n                pertubation_pose = np.load(pertubation_file)\n                experiment_params =  [pertubation_pose[0] * 180.0/np.pi, pertubation_pose[1] * 180.0/np.pi, pertubation_pose[2] * 180.0/np.pi,\n                                      pertubation_pose[3], pertubation_pose[4], pertubation_pose[5],\n                                      int(img_num), dataset_name]\n                # print(experiment_params)\n                run(experiment_params, log_directory=log_directory, log_results=True)\n\n            est_file = log_directory + \"/\" + alg + \"_\" + prefix + dataset_name + \"_\" + str(img_num) + \"_\" + \"poses.npy\"\n            pose_estimates = np.load(est_file)\n\n            if alg == 'mocnerf':\n                iteration_file = log_directory + \"/\" + alg + \"_\" + prefix + dataset_name + \"_\" + str(img_num) + \"_\" + \"forward_passes.npy\"\n                num_forward_passes_per_iteration = np.load(iteration_file)\n\n            position_error_good = []\n            rotation_error_good = []\n\n            for pose_index in range(pose_estimates.shape[0]):\n                est_pose_matrix = gtsam.Pose3(pose_estimates[pose_index])\n                position_error_good.append(check_if_position_error_good(gt_pose, est_pose_matrix, get_mean_error))\n                rotation_error_good.append(check_if_rotation_error_good(gt_pose, est_pose_matrix, get_mean_error))\n\n            total_num_forward_passes_per_iteration.append(num_forward_passes_per_iteration)\n            total_position_error_good.append(position_error_good)\n            total_rotation_error_good.append(rotation_error_good)\n\n    if alg == 'inerf':\n        average_position_metric = np.average(np.array(total_position_error_good), axis=0)\n        average_rotation_metric = np.average(np.array(total_rotation_error_good), axis=0)\n        update_steps_count = None\n    \n    else:\n        total_update_steps_per_iteration = [ [i for i in range(len(update_in_trial))] for update_in_trial in total_num_forward_passes_per_iteration]\n        forward_passes_count, average_position_metric = average_arrays([total_num_forward_passes_per_iteration, total_position_error_good])\n        forward_passes_count, average_rotation_metric = average_arrays([total_num_forward_passes_per_iteration, total_rotation_error_good])\n        update_steps_count, average_position_metric = average_arrays([total_update_steps_per_iteration, total_position_error_good], million_scale=False)\n\n    return forward_passes_count, average_position_metric, average_rotation_metric, update_steps_count\n\ninerf_forward_passes_count, inerf_average_position_metric, inerf_average_rotation_metric, _ = get_ratio('inerf', get_mean_error)\nmcl_forward_passes_count, mcl_average_position_metric, mcl_average_rotation_metric, _ = get_ratio('mocnerf', get_mean_error)\n# Note the prefix argument can be used if you have multiple logs with different params from Loc-NeRF.\n# For example, in this case we wanted to do an ablation study on running with and without annealing.\nmcl_forward_passes_count_norefine, mcl_average_position_metric_norefine, mcl_average_rotation_metric_norefine, _ = get_ratio('mocnerf', get_mean_error, 
prefix=\"started\")\n\nposition_plot = plt.figure(1, figsize=(8,4.5))\nplt.plot(mcl_forward_passes_count, mcl_average_position_metric, label=\"Loc-NeRF w/ annealing\", linewidth=2.5, color='b')\nplt.plot(mcl_forward_passes_count_norefine, mcl_average_position_metric_norefine, label=\"Loc-NeRF w/o annealing\", linewidth=2.5, color='k')\nplt.plot(inerf_forward_passes_count[0:-1], inerf_average_position_metric[0:-1], label=\"iNeRF\", linewidth=2.5, color='orange')\nplt.xlabel(\"Number of Forward Passes (in millions)\", fontsize=fontsize, family='Arial')\nplt.grid()\nplt.legend(loc='upper left', fontsize=fontsize-4)\n\nif not get_mean_error:\n    plt.ylim([0, 1.01])\n    plt.title(\"Translation Accuracy Evaluation\", fontsize=fontsize, family='Arial')\n    plt.ylabel(\"Ratio of Trials with Translation \\n Error < 5 cm\", fontsize=fontsize, family='Arial')\nelse:\n    plt.title(\"Average Translation Error for 20 Trials\", fontsize=fontsize, family='Arial')\n    plt.ylabel(\"error (m)\", fontsize=fontsize, family='Arial')\n\nplt.tight_layout()\nplt.show()\n\nrotation_plot = plt.figure(2, figsize=(8,4.5))\nplt.plot(mcl_forward_passes_count, mcl_average_rotation_metric, label=\"Loc-NeRF w/ annealing\", linewidth=2.5, color='b')\nplt.plot(mcl_forward_passes_count_norefine, mcl_average_rotation_metric_norefine, label=\"Loc-NeRF w/o annealing\", linewidth=2.5, color='k')\nplt.plot(inerf_forward_passes_count[0:-1], inerf_average_rotation_metric[0:-1], label=\"iNeRF\", linewidth=2.5, color='orange')\nplt.xlabel(\"Number of Forward Passes (in millions)\", fontsize=fontsize, family='Arial')\nplt.grid()\nplt.legend(loc='upper right', fontsize=fontsize-4)\n\nif not get_mean_error:\n    plt.ylim([0, 1.01])\n    plt.title(\"Rotation Accuracy Evaluation\", fontsize=fontsize, family='Arial')\n    plt.ylabel(\"Ratio of Trials with Rotation Error < 5$^\\circ$\", fontsize=fontsize, family='Arial')\nelse:\n    plt.title(\"Average Rotation Error for 20 Trials\", fontsize=fontsize, family='Arial')\n    plt.ylabel(\"error (deg)\", fontsize=fontsize, family='Arial')\nplt.tight_layout()\nplt.show()\n\n# To run iNeRF with this code use this (this makes sure that iNeRF params are set correctly for all runs):\n#  python3 eval_logs.py --config /home/dominic/inerf/configs/fern.txt --data_dir /home/dominic/inerf/data/nerf_llff_data/ --ckpt_dir /home/dominic/inerf/ckpts/ --N_importance 64 --batch_size 2048"
  },
  {
    "path": "tools/ros_to_jpg.py",
    "content": "import rosbag\nimport rospy\nimport numpy as np\nimport csv\nimport os\nimport cv2\nfrom cv_bridge import CvBridge\n\n######## user params ##########\nTOPIC_NAME = \"/camera/color/image_raw\"\ndata_bag = rosbag.Bag(\"/home/dominic/<rosbag.bag>\")\nSAVE_LOC = \"/home/dominic\"\nDATA_NAME = \"nerf_hallway\"\ndownsample = 10\n###############################\n\nos.makedirs(SAVE_LOC + \"/\" + DATA_NAME)\n\ncount = 0\nindex=0\nfor topic, msg, t in data_bag.read_messages(topics=[TOPIC_NAME]):\n\tif topic == TOPIC_NAME:\n\t\tif count%downsample==0:\n\t\t\tbridge = CvBridge()\n\t\t\tcv_image = bridge.imgmsg_to_cv2(msg)\n\t\t\tcv_image.astype(np.uint8)\n\t\t\tcv_image = cv2.cvtColor(cv_image, cv2.COLOR_BGR2RGB)\n\t\t\tcv2.imwrite(SAVE_LOC + \"/\" + DATA_NAME + \"/r\" + str(index) + '.jpg', cv_image)\n\t\t\tindex += 1\n\t\tcount += 1\n"
  }
]