Repository: francoisp/DuplexRsync
Branch: master
Commit: c0ddfe64d12d
Files: 3
Total size: 16.2 KB

Directory structure:
gitextract_ic2ikyi3/
├── MIT-LICENSE
├── README.md
└── duplexRsync.sh

================================================
FILE CONTENTS
================================================

================================================
FILE: MIT-LICENSE
================================================
Copyright (c) 2019 Francois Payette

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

================================================
FILE: README.md
================================================
# DuplexRsync
🌟 Simple realtime 2-way sync.

### Problem
I often find myself editing quite a few files on remote hosts; for anything non-trivial I like to use local-running tools such as Sublime. I've used [rsub](https://github.com/henrikpersson/rsub), it's very nice and lightweight. Sometimes (often) the light editing turns heavier and more and more files are worked on.
I have noticed that when the ssh tunnel dies and is recreated while a file is open, the file will be truncated to zilch, a glitch to look out for that is more likely to occur when multiple files are open.

When things keep getting heavier, I've then used [sshfs](https://github.com/osxfuse/osxfuse/wiki/SSHFS) to mount a remote directory and fuse it to the local filesystem. This usually works OK, but for some types of workflows, such as Sublime projects with a lot of files in subfolders (node_modules? sometimes this one starts to feel like a whole Gentoo distro), it is inadequate. Search becomes extra slow. The SublimeText project tree spins and spins and spins; features that have become automatisms are unworkable. Also, open files prevent the tunnelling connection from exiting, and a broken tunnel (say you close your laptop without closing everything and unmounting) can leave the fuse subsystem in a weird state where you cannot remount to the previous location until a reboot, as well as other minor glitches.

### Solution
DuplexRsync is a simple and pretty sweet (although only lightly tested as of 2019/03, PLEASE BE CAREFUL AND ALWAYS HAVE BACKUPS and/or VERSIONING!) solution based on fswatch and rsync. It's a single file you'll put in your local directory that will maintain (DropBox|GoogleDrive)-style 2-way sync between the current directory and a remote directory via SSH. This has the advantage of working fine when offline. This bash script is a bit macOS-centric because that's what I use locally; please feel free to adapt. By default the script excludes node_modules and all folders that start with a period (.git etc.).

### Merging
If a file has been edited on both ends while offline (duplexRsync not running), merging will simply crush the oldest edit; it will never result in conflict files. This is harsh but simpler; with git these days I think edits that have some value should be committed, so we delegate versioning there.
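
To make the "crush the oldest edit" rule concrete, here is a minimal local sketch. It is not how the script does it: duplexRsync.sh gets this behavior from running `rsync -u` in both directions, whereas this sketch uses bash's `-nt` (newer-than) file test on two local files. The function name `newest_wins` and the demo file names are hypothetical, made up for illustration only.

```shell
#!/bin/bash
# Hypothetical illustration of "newest edit wins" on two local files.
# The real script obtains the same effect from rsync -u run in both directions.
newest_wins() {
  local a="$1" b="$2"
  if [ "$a" -nt "$b" ]; then
    cp -p "$a" "$b"   # a is newer: overwrite b, keeping a's mtime
  elif [ "$b" -nt "$a" ]; then
    cp -p "$b" "$a"   # b is newer: overwrite a, keeping b's mtime
  fi                  # same mtime: leave both untouched
}

demo=$(mktemp -d)
echo "old edit" > "$demo/local.txt"
echo "new edit" > "$demo/remote.txt"
# Force distinct modification times so the comparison is deterministic
touch -t 201903010000 "$demo/local.txt"
touch -t 201903020000 "$demo/remote.txt"

newest_wins "$demo/local.txt" "$demo/remote.txt"
result=$(cat "$demo/local.txt")
echo "$result"
```

Note that the older content is simply gone afterwards; nothing like a conflict copy is kept, which is why committing to git before syncing is the suggested safety net.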
If you attempt to sync mismatched folders, a lot of files in the remote folder would get deleted. When launching duplexRsync you'll be prompted to either merge the folders (create these files in the local folder), or destroy all the extra files in the remote folder.

Latency for multiple remote edits to propagate to the local folder is set by default to 3 seconds; this prevents infinite cycling of change detection. Over very slow network connections you might need to increase this value.

### Setup
On your remote machine you'll need fswatch:

    sudo add-apt-repository ppa:hadret/fswatch
    sudo apt-get update
    sudo apt-get install -y fswatch

On your local machine you'll need brew, that's it. This script will install the other required components (socat, fswatch and gnu-getopt).

    chmod u+x duplexRsync.sh
    ./duplexRsync.sh --remoteHost user@192.168.0.2

### Caveats
This is a simple solution; it does not implement any distributed locking. If you or processes are editing at both ends simultaneously, over and above the crushing of the oldest edit of the same file mentioned above, there's a window during which a newly created file can get deleted. Conversely, but less seriously, there's also a window during which a deleted file could be recreated. Something like a --delete-older-than "seconds" argument in rsync would mitigate the first edge case; I think the second one (zombie file coming back) is an annoyance I can live with.

### Related
Thanks for all the feedback in various forums. Here are a few related projects that have been brought to my attention. I have not tried any of these; they all look very well written and could come in handy later.

#### Heavier
- [osync](https://github.com/deajan/osync)

#### Heaviest
- [Mutagen.io](https://mutagen.io/)
- [Syncthing](https://github.com/syncthing/syncthing)
- [Unison](https://github.com/bcpierce00/unison)

That's it!🔥 Cheers!

Please Note: A few hidden files are created to maintain the 2-way sync; their names all match .____*.
The remote directory will sit directly under your remote user's home directory; there's an optional --remoteParent if you need to change that.

License: MIT

================================================
FILE: duplexRsync.sh
================================================
#!/bin/bash
# REQUIREMENT we need fswatch on both ends, run this to get it on ubuntu1604
#sudo add-apt-repository ppa:hadret/fswatch
#sudo apt-get update
#sudo apt-get install -y fswatch

printHelp(){
  echo "USAGE:
  duplexRsync --remoteHost user@host

DuplexRsync requires fswatch on both ends, this tries to install it locally using brew (required).
On the remote end run:
  sudo add-apt-repository ppa:hadret/fswatch
  sudo apt-get update
  sudo apt-get install -y fswatch

You need to specify:
  --remoteHost    ex: user@192.168.0.2.
You can also optionally specify:
  --remoteParent  contains/will contain the remoteDir"
}

# if our arguments match this string, it's the socat fork trigger for remote change detection; increment sentinel and exit
if [ "$*" = "sentinelIncrement" ]; then
  sentval=$(cat .____sentinel); sentval=$((sentval+1)); echo $sentval > .____sentinel;
  exit;
fi

if [ "$*" = "" ]; then
  printHelp;
  exit;
fi

# we need brew on macosx (quoted so the test is well-formed when brew is missing)
if [ -z "$(command -v brew)" ]; then
  printHelp;
  exit
fi

# this is for macosx, we also need socat to create a socket to remote trigger rsync
brew install socat fswatch gnu-getopt

# pick a pseudo-random port of the form 42NNN (the 42 prefix is prepended to $RANDOM % 999)
function randomLocalPort() {
  localPort=42
  localPort=$RANDOM;
  let "localPort %= 999";
  localPort="42$localPort"
}

function randomRemotePort() {
  remotePort=42
  remotePort=$RANDOM;
  let "remotePort %= 999";
  remotePort="42$remotePort"
}

if ! \
options=$(/usr/local/Cellar/gnu-getopt/*/bin/getopt -u -o hr:p: -l help,remoteHost:,remoteParent: -- "$@")
then
  # something went wrong, getopt will put out an error message for us
  exit 1
fi

set -- $options

while [ $# -gt 0 ]
do
  case $1 in
    # for options with required arguments, an additional shift is required
    -h|--help ) printHelp; exit; shift;;
    -r|--remoteHost ) remoteHost=$2; shift;;
    -p|--remoteParent ) remoteParent=$2; shift;;
    --) shift; break;;
    #(-*) echo "$0: error - unrecognized option $1" 1>&2; exit 1;;
    (*) break;;
  esac
  shift
done

if [ -z "$remoteHost" ]; then
  echo "Missing Argument: --remoteHost"
  printHelp;
  exit;
fi

# the remote directory is named after the current local directory
remoteDir=${PWD##*/}
remoteDir="$remoteParent$remoteDir"

if [ ! -f ~/.ssh/id_rsa.pub ]; then
  echo "You need a key pair to use duplexRsync. You can generate one using: ssh-keygen -t rsa"
  exit;
fi

# we'll need to ssh without pass - use public key crypto to ssh into remote end, rsync needs this
# we are copying our pubkey to ssh in without prompt
# (mkdir -p avoids an error when .ssh exists; grep -F matches the key as a literal string)
cat ~/.ssh/id_rsa.pub | ssh "$remoteHost" 'mkdir -p .ssh;pubkey=$(cat); touch .ssh/authorized_keys; if grep -qF "$pubkey" ".ssh/authorized_keys"; then echo "public key for this user already present"; else echo "$pubkey" >> .ssh/authorized_keys;fi'

fswatchPath=$(ssh "$remoteHost" 'command -v fswatch')
# on a macosx remote the $PATH variable differs between local and ssh sessions, so try the usual local path
if [ -z "$fswatchPath" ]; then
  fswatchPath=$(ssh "$remoteHost" 'command -v /usr/local/bin/fswatch')
fi
if [ -z "$fswatchPath" ]; then
  echo "ERROR: missing fswatch at remote end"
  printHelp;
  exit;
fi

# kill all remote fswatches for this path that might be lingering
ssh $remoteHost "pkill -P \$(ps alx | egrep '.*pipe_w.*____rsyncSignal.sh --pwd $PWD --port $remotePort' | awk '{print \$4}' | head -n 1)"
ssh $remoteHost "pkill -f '____rsyncSignal.sh --pwd $PWD'"

# if we have the ssh tunnel running this will match and we kill it; pwd args to prevent killing other folders being watched
pkill -f "rsyncSignal.sh --pwd \
$PWD"

# if we have a lingering socat kill it
# we shouldn't have one, this is a bad plan if using multiple sockets
#pkill -f "sentinelIncrement.sh --pwd $PWD"

echo '0' > .____sentinel

# create local socket to listen for remote changes
socatRes="not listening yet, we get a random port in the following loop";
while [ ! -z "$socatRes" ]
do
  randomLocalPort;
  socatRes="";
  # fork: call this script with a special argument that simply increments the sentinel and exits
  socatRes=$(socat TCP-LISTEN:$localPort,fork EXEC:"./duplexRsync.sh sentinelIncrement" 2>&1 &) &
  # result should be empty when listen works
done;
echo "listening locally on:$localPort"

# for now we use the same port at both ends, this is a bit sloppy; we should test to make sure it's not used with the ssh -R call
remotePort=$localPort

# we dump to a remote file the fswatch command that allows the locally running socat to get a signal of a remote change
# modification to add the -r switch to all subs excluding node_modules. This is required because fswatch will still iterate over all subdirs because the -e switch is a pattern, not a path
# if you get a bunch of: inotify_add_watch: No space left on device
# you will need to https://github.com/guard/listen/wiki/Increasing-the-amount-of-inotify-watchers
# check your current limit: cat /proc/sys/fs/inotify/max_user_watches
# ATTENTION: you cannot change this kernel param if running in an unprivileged container, you'll need to run this in the hosting kernel's env
# echo fs.inotify.max_user_watches=524288 | tee -a /etc/sysctl.conf && sysctl -p; echo "increasing the limit of watches, cannot be done in unpriv container"

#echo "$fswatchPath -r -e \"node_modules\" -o . | while read f; do echo 1 | nc localhost $remotePort; done" | ssh $remoteHost "mkdir -p $remoteDir; cd $remoteDir; cat > .____rsyncSignal.sh"

absPath=$(ssh $remoteHost "mkdir -p $remoteDir; cd $remoteDir; pwd")

# we are excluding node_modules and folders starting with .
ssh $remoteHost "mkdir -p $remoteDir; cd $remoteDir; find $absPath -maxdepth 1 -mindepth 1 -type d ! -name \"node_modules\" ! -name \".*\"| awk '{ print \"\\\"\"\$0\"\\\"\"}' | nl | awk -F\\\" '{printf \"/usr/bin/fswatch -x --event Updated --event Created --event Removed --event Renamed --event MovedFrom --event MovedTo -r \\\"%s\\\" | while read f; do echo 1 | nc localhost $remotePort; done \& \n\", \$2, \$1, \$1}' > .____rsyncSignal.sh"
ssh $remoteHost "cd $remoteDir; echo \"/usr/bin/fswatch -x --event Updated --event Created --event Removed --event Renamed --event MovedFrom --event MovedTo -o $absPath | while read f; do echo 1 | nc localhost $remotePort; done\" >> .____rsyncSignal.sh"

# we are excluding node_modules and folders starting with .
# this should work, but there seems to be a bug in fswatch, so we are using multiple processes instead
#ssh $remoteHost "mkdir -p $remoteDir; cd $remoteDir; find $absPath -maxdepth 1 -mindepth 1 -type d ! -name \"node_modules\" ! -name \".*\" | awk '{ print \"\\\"\"\$0\"\\\"\"}' | awk -F\\\" '{printf \" \\\"%s\\\" \", \$2}' | (echo -n \" /usr/bin/fswatch -x --event Updated --event Created --event Removed --event Renamed --event MovedFrom --event MovedTo -r \" && cat) > .____rsyncSignal.sh"
#ssh $remoteHost "cd $remoteDir; echo \" | while read f; do if [ -z \\\"\$skip\\\" ]; then skip=\\\"recursive first msg is spurious\\\"; else echo 1 | nc localhost $remotePort; fi done & /usr/bin/fswatch -o $absPath | while read f; do echo 1 | nc localhost $remotePort; done\" >> .____rsyncSignal.sh"
#exit 1;

function duplex_rsync() {
  # kill all remote fswatches, also suppress the kill notice in bash
  # (redirect fixed from "2&>1", which passed a stray "2" argument and wrote to a file named 1)
  ssh $remoteHost "pkill -P \$(ps alx | egrep '.*pipe_w.*____rsyncSignal.sh --pwd $PWD --port $remotePort' | awk '{print \$4}' | head -n 1) >/dev/null 2>&1"
  # kill the remote fswatch while we sync, pwd arg used to prevent attempting to kill other watches; port prevents killing if 2 locals have the exact same local path
  # also this discloses local path
  # to remote end; don't think this is serious
  ssh $remoteHost "pkill -f '____rsyncSignal.sh --pwd $PWD --port $remotePort'"
  # also kill the tunnel
  pkill -f "rsyncSignal.sh --pwd $PWD"

  # order matters; if we got a remote trigger we'll process remote as src first to prevent restoring files that might have just been deleted
  if [ "$trigger" = "remote" ]; then
    rsync -auzP --exclude ".*/" --exclude ".____*" --exclude "node_modules" --delete "$remoteHost:$remoteDir/" .;
    rsync -auzP --exclude ".*/" --exclude ".____*" --exclude "node_modules" --delete . "$remoteHost:$remoteDir";
  else
    # local as src first
    rsync -auzP --exclude ".*/" --exclude ".____*" --exclude "node_modules" --delete . "$remoteHost:$remoteDir";
    rsync -auzP --exclude ".*/" --exclude ".____*" --exclude "node_modules" --delete "$remoteHost:$remoteDir/" .;
  fi;

  ssh -R localhost:$localPort:127.0.0.1:$remotePort $remoteHost "cd $remoteDir; bash .____rsyncSignal.sh --pwd $PWD --port $remotePort"&
  #tunnelPid="$!"
  # echo "tunnelPid:$tunnelPid"
}

lastSentinel=$(cat .____sentinel);
# we always start from the local dir
trigger=local;

# do a trial run to see if we'd delete files on the remote end
wouldDeleteCount=$(rsync -anuzP --exclude ".*/" --exclude ".____*" --exclude "node_modules" --delete . $remoteHost:$remoteDir/ | grep deleting | wc -l);
wouldDeleteCount="$(echo -e "${wouldDeleteCount}" | tr -d '[:space:]')"
wouldDeleteRemoteFiles=$(rsync -anuzP --exclude ".*/" --exclude ".____*" --exclude "node_modules" --delete . $remoteHost:$remoteDir/ | grep deleting);

if [ ! -z "$wouldDeleteRemoteFiles" ]; then
  unset destroyAhead
  unset localFileCount
  localFileCount=$(find . -type f | egrep -v '\..+/' | egrep -v '\./duplexRsync.sh' | egrep -v '\./\.____*' | wc -l | tr -d '[:space:]')
  # if the local directory is empty using same pattern as rsync above we always merge
  if [ "$localFileCount" -eq 0 ]
  then
    destroyAhead="merge"
  else
    echo "WOULD delete count: $wouldDeleteCount"
    echo "$wouldDeleteRemoteFiles"
  fi

  while ! \
[[ "$destroyAhead" =~ ^(destroy|merge|abort)$ ]]
  do
    if [ "$wouldDeleteCount" -gt 5 ]
    then
      major=" ----MAJOR----- ";
    fi
    if [ "$wouldDeleteCount" -gt 42 ]
    then
      major=" ----INTERSTELLAR BYPASS LEVEL----- ";
    fi
    echo "ATTENTION $major DESTRUCTION AHEAD: There is/are $wouldDeleteCount file(s) present in the remote folder that are not present locally. Could the remote folder be totally unrelated? Would you like to merge the folders by creating these locally (merge), sync and destroy (destroy), or abort? (merge/destroy/abort)"
    read destroyAhead
  done
  if [ "$destroyAhead" = "abort" ]; then
    exit;
  elif [ "$destroyAhead" = "merge" ]; then
    # sync from remote without delete
    rsync -auzP --exclude ".*/" --exclude ".____*" --exclude "node_modules" "$remoteHost:$remoteDir/" .;
  fi
fi;

duplex_rsync;
fswatch -r -o . | while read f;
do
  sentinel=$(cat .____sentinel);
  echo "sentinel $sentinel lastSentinel: $lastSentinel"
  sentinelInc=$((sentinel-lastSentinel));
  # if the change is remote (incremented ____sentinel) let's slow down and wait to gobble multiple events
  if [ $sentinelInc -gt 0 ]
  then
    echo 'remote change detected';
    trigger=remote;
    duplex_rsync;
    sleep 3;
  else
    echo 'local change detected';
    trigger=local;
    duplex_rsync;
  fi
  lastSentinel=$sentinel;
done;