[
  {
    "path": ".gitignore",
    "content": "# Byte-compiled / optimized / DLL files\n__pycache__/\n*.py[cod]\n\n# C extensions\n*.so\n\n# Distribution / packaging\n.Python\nenv/\nbuild/\ndevelop-eggs/\ndist/\ndownloads/\neggs/\n.eggs/\nlib/\nlib64/\nparts/\nsdist/\nvar/\n*.egg-info/\n.installed.cfg\n*.egg\n\n# PyInstaller\n#  Usually these files are written by a python script from a template\n#  before PyInstaller builds the exe, so as to inject date/other infos into it.\n*.manifest\n*.spec\n\n# Installer logs\npip-log.txt\npip-delete-this-directory.txt\n\n# Unit test / coverage reports\nhtmlcov/\n.tox/\n.coverage\n.coverage.*\n.cache\nnosetests.xml\ncoverage.xml\n*,cover\n\n# Translations\n*.mo\n*.pot\n\n# Django stuff:\n*.log\n\n# Sphinx documentation\ndocs/_build/\n\n# PyBuilder\ntarget/\n"
  },
  {
    "path": "LICENSE",
    "content": "The MIT License (MIT)\n\nCopyright (c) 2015 Prashanth Ellina\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the \"Software\"), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n\n"
  },
  {
    "path": "README.md",
    "content": "# Pullbox\n\n`Pullbox` is a very simple implementation that can serve as an alternative\nfor Dropbox that is based on Git. It works currently on any Linux-like OS\nand OSX but not on Windows.\n\n## Why?\n\nDropbox works well enough and works on many platforms. Although your data is\non someone else's server, it is probably safer over there than with you (for\nmost cases). I wrote `Pullbox` to overcome a specific limitation in Dropbox\ni.e. Symlinks. Dropbox does not \"see\" symlinks. Although it synchronizes the\ncontent pointed to by the symlink, it forgets that fact that it is a Symlink\nwhen you sync to another computer.\n\nI want to maintain my personal wiki and journal as plain text files. In\norder to organize my notes structure, I depend on symlinks (so I can put the\nsame note under multiple directories). Dropbox does not support this\nuse-case.\n\n## How does it work?\n\n`Pullbox` needs SSH access to a remote Linux server that has `git` and\n`inotifywait` commands installed. This serves as the backup location for\nyour local data.\n\n`Pullbox` monitors file system activity in the local directory and\nautomatically pushes changes to the remote repo. The monitoring is done\nusing `inotify` on Linux, `FSEvents` on OSX, `kqueue` on BSD style OSs.\n\n`Pullbox` also monitors file system activity on the remote repo and\nautomatically pulls changes to the local repo when needed. This is achieved\nby using `ssh` and running `inotifywait` on the server (a lot like AJAX\nlong-polling except we use SSH here instead of HTTP).\n\n## Setting up\n\n### Backup Server\n\nInstructions shown below assume Ubuntu Linux. You can modify based on the\nactual distro you have. 
Let us say the domain name of the backup server is\n`example.com`.\n\n```bash\nsudo apt-get install git inotify-tools\n```\n\n### Your local machine\n\n```bash\nsudo pip install git+https://github.com/prashanthellina/pullbox\n```\n\nI am assuming that the username on the backup server is `prashanth`. We need\nto set up password-less SSH login to `prashanth@example.com` (instructions\n[here](http://www.linuxproblem.org/art_9.html)).\n\n`Pullbox` depends on password-less login, so make sure it is working before\nproceeding.\n\nLet us assume that the local directory that you want to sync is\n`/home/prashanth/notes`. Make sure that this directory is *not* present\nthe very first time you start `Pullbox`. This allows `Pullbox` to clone\nthe remote repo properly. You can run `Pullbox` manually with the\nfollowing command.\n\n```bash\npullbox --log-level DEBUG /home/prashanth/notes prashanth@example.com\n```\n\nThat's it! Your directory will now be kept in sync with the remote\nserver repo as long as the `pullbox` command above runs. In order to have\nthe command run all the time (after a system reboot, after being killed\naccidentally, etc.), put an entry in crontab like so:\n\n```bash\n* * * * * /usr/local/bin/pullbox --log-level DEBUG --log /tmp/pullbox.log --quiet /home/prashanth/notes prashanth@example.com > /dev/null 2>&1\n```\n"
  },
  {
    "path": "pullbox/__init__.py",
    "content": "import os\nimport sys\nimport time\nimport shlex\nimport logging\nimport logging.handlers\nimport tempfile\nimport datetime\nimport argparse\nimport threading\nimport subprocess\nfrom distutils.spawn import find_executable\n\nimport filelock\nfrom watchdog.observers import Observer\nfrom watchdog.events import FileSystemEventHandler\n\n# prevent watchdog module from writing DEBUG logs\n# as that is adding too much confusion during debugging\nlogging.getLogger('watchdog').setLevel(logging.WARNING)\n\nDEFAULT_LOCK_FILE = os.path.join(tempfile.gettempdir(), 'pullbox.lock')\n\nclass PullboxException(Exception): pass\n\nclass PullboxCalledProcessError(PullboxException):\n    def __init__(self, cmd, retcode):\n        self.cmd = cmd\n        self.retcode = retcode\n\n    def __str__(self):\n        return 'PullboxCalledProcessError(code=%s, cmd=\"%s\")' % \\\n            (self.retcode, self.cmd)\n\n    __unicode__ = __str__\n    __repr__ = __str__\n\nLOG_FORMATTER = logging.Formatter('%(asctime)s %(levelname)s %(message)s')\nLOG_DEFAULT_FNAME = 'log.pullbox'\nMAX_LOG_FILE_SIZE = 10 * 1024 * 1024 # 10MB\n\ndef init_logger(fname, log_level, quiet=False):\n    log = logging.getLogger('')\n\n    stderr_hdlr = logging.StreamHandler(sys.stderr)\n    rofile_hdlr = logging.handlers.RotatingFileHandler(fname,\n        maxBytes=MAX_LOG_FILE_SIZE, backupCount=10)\n    hdlrs = (stderr_hdlr, rofile_hdlr)\n\n    for hdlr in hdlrs:\n        hdlr.setFormatter(LOG_FORMATTER)\n        log.addHandler(hdlr)\n\n    log.addHandler(rofile_hdlr)\n    if not quiet: log.addHandler(stderr_hdlr)\n\n    log.setLevel(getattr(logging, log_level.upper()))\n\n    return log\n\nclass LocalFSEventHandler(FileSystemEventHandler):\n    def __init__(self, on_change):\n        self.on_change = on_change\n\n    def on_any_event(self, evt):\n        is_git_dir = '.git' in evt.src_path.split(os.path.sep)\n        is_dot_file = os.path.basename(evt.src_path).startswith('.')\n        
is_dir_modified = evt.event_type == 'modified' and evt.is_directory\n\n        if not (is_git_dir or is_dot_file or is_dir_modified):\n            self.on_change()\n\nclass Pullbox(object):\n    BINARIES_NEEDED = ['git', 'ssh']\n    BINARIES_NEEDED_REMOTE = ['git', 'inotifywait']\n\n    # Interval at which the client polls the\n    # server for data changes (if any)\n    POLL_INTERVAL = 60 # seconds\n\n    def __init__(self, server, path, log, suffix):\n        self.server = server\n        self.path = os.path.abspath(path)\n        self.log = log\n\n        self.remote_name = os.path.basename(path.rstrip(os.path.sep))\n        if suffix:\n            self.remote_name += \".git\"\n\n        # Set up monitoring of local repo changes\n        self.fs_observer = Observer()\n        self.fs_observer.schedule(LocalFSEventHandler(self.on_fs_change),\n            path, recursive=True)\n\n        self.fs_changed = True\n\n        # time at which changes (if any) need to be downloaded\n        # from remote repo\n        self.next_pull_at = 0\n\n    def on_fs_change(self):\n        self.fs_changed = True\n\n    def invoke_process(self, cmd, ignore_code=0):\n        self.log.debug('invoke_process(%s)' % cmd)\n\n        # discard child output; close the devnull handle once the call returns\n        with open(os.devnull, 'w') as devnull:\n            r = subprocess.call(shlex.split(cmd), stdout=devnull,\n                stderr=devnull)\n\n        if r == 130:\n            raise KeyboardInterrupt\n\n        if not isinstance(ignore_code, (list, tuple)):\n            ignore_code = [ignore_code]\n\n        if r != 0 and r not in ignore_code:\n            raise PullboxCalledProcessError(cmd, r)\n\n    def check_binaries(self):\n        self.log.debug('Checking presence of local binaries \"%s\"' % \\\n            ', '.join(self.BINARIES_NEEDED))\n\n        for binf in self.BINARIES_NEEDED:\n            if not find_executable(binf):\n                raise PullboxException('\"%s\" binary required' % binf)\n\n    def check_remote_binaries(self):\n        self.log.debug('Checking presence of 
remote binaries \"%s\"' % \\\n            ', '.join(self.BINARIES_NEEDED_REMOTE))\n\n        for binf in self.BINARIES_NEEDED_REMOTE:\n            cmd = 'ssh %s which %s' % (self.server, binf)\n            try:\n                self.invoke_process(cmd)\n            except PullboxCalledProcessError:\n                raise PullboxException(\n                    '\"%s\" remote binary required (or) '\n                    'could not connect to server' % binf)\n\n    def ensure_remote_repo(self):\n        # git init creates a new repo if none exists else\n        # \"reinitializes\" which is like a no-op for our purposes\n        cmd = 'ssh %s git init --bare %s' % (self.server,\n            self.remote_name)\n        self.invoke_process(cmd)\n            \n\n    def keeprunning(self, fn, wait=0, error_wait=1):\n        '''\n        Keep @fn running on success or failure in an infinite loop\n         - On failure, log exception, wait for @error_wait seconds\n         - On success, wait for @wait seconds\n        '''\n\n        while 1:\n            try:\n                fn()\n            except (SystemExit, KeyboardInterrupt): raise\n            except:\n                self.log.exception('During run of \"%s\" func' % fn.func_name)\n                time.sleep(error_wait)\n                continue\n\n            time.sleep(wait)\n\n    def track_remote_changes(self):\n        cmd = 'ssh %s inotifywait -rqq -e modify -e move -e create -e delete %s' % \\\n            (self.server, self.remote_name)\n        self.invoke_process(cmd)\n        self.next_pull_at = time.time()\n\n    def init_local_repo(self):\n        bpath = os.path.dirname(self.path.rstrip(os.path.sep))\n\n        if not os.path.exists(bpath):\n            os.makedirs(bpath)\n\n        cwd = os.getcwd()\n        try:\n            os.chdir(bpath)\n            self.invoke_process('git clone %s:%s' % (self.server, self.remote_name))\n            # add a dummy file to avoid trouble\n            
os.chdir(self.path)\n            self.invoke_process('touch README.md')\n            self.invoke_process('git add README.md')\n            self.invoke_process('git commit -a -m \"initial\"')\n            self.invoke_process('git push origin master')\n        finally:\n            os.chdir(cwd)\n\n    def pull_changes(self):\n        if self.next_pull_at > time.time(): return\n\n        if not os.path.exists(self.path):\n            self.init_local_repo()\n\n        cwd = os.getcwd()\n        try:\n            os.chdir(self.path)\n            self.invoke_process('git pull')\n        finally:\n            os.chdir(cwd)\n\n        self.next_pull_at = time.time() + self.POLL_INTERVAL\n\n    def push_changes(self):\n        if not self.fs_changed: return\n\n        cwd = os.getcwd()\n        try:\n            os.chdir(self.path)\n            self.invoke_process('git add .')\n\n            dt = datetime.datetime.utcnow().strftime('%Y%m%dT%H%M%S')\n            msg = 'auto commit at %s' % dt\n            self.invoke_process('git commit -a -m \"%s\"' % msg, ignore_code=1)\n\n            self.invoke_process('git push origin master', ignore_code=-2)\n        finally:\n            os.chdir(cwd)\n\n        self.fs_changed = False\n\n    def run_thread(self, target):\n        t = threading.Thread(target=target)\n        t.daemon = True\n        t.start()\n        return t\n\n    def start(self):\n        # ensure required binaries are available in PATH\n        self.check_binaries()\n        self.check_remote_binaries()\n\n        # ensure remote git repo is present (if not init one)\n        self.ensure_remote_repo()\n\n        # ensure local repo has latest data from server\n        self.pull_changes()\n\n        # start listening for changes in local repo\n        self.fs_observer.start()\n\n        # start threads\n        K = self.keeprunning\n        R = self.run_thread\n\n        self.thread_track_remote_changes = R(lambda: K(self.track_remote_changes))\n        
self.thread_pull_changes = R(lambda: K(self.pull_changes, wait=0.1))\n        self.thread_push_changes = R(lambda: K(self.push_changes, wait=0.1))\n\n        # wait for threads to complete\n        self.thread_track_remote_changes.join()\n        self.thread_pull_changes.join()\n        self.thread_push_changes.join()\n\ndef main():\n    parser = argparse.ArgumentParser(description='Pullbox')\n\n    parser.add_argument('path', help='Path to data directory')\n    parser.add_argument('server', help='IP/Domain name of backup server')\n    parser.add_argument('--standard-suffix', action='store_true',\n        help='Makes Pullbox use the standard .git suffix for bare git repos (server side only)')\n    parser.add_argument('--log', default=LOG_DEFAULT_FNAME,\n        help='Name of log file')\n    parser.add_argument('--log-level', default='WARNING',\n        help='Logging level as picked from the logging module')\n    parser.add_argument('--quiet', action='store_true')\n\n    parser.add_argument('--lock-file', default=DEFAULT_LOCK_FILE,\n        help='Lock file to prevent multiple instances from running')\n\n    args = parser.parse_args()\n    lock = filelock.FileLock(args.lock_file)\n\n    try:\n        with lock.acquire(timeout=0):\n            log = init_logger(args.log, args.log_level, quiet=args.quiet)\n            p = Pullbox(args.server, args.path, log, args.standard_suffix)\n            p.start()\n    except (SystemExit, KeyboardInterrupt): sys.exit(1)\n    except Exception, e:\n        log = logging.getLogger('')\n        log.exception('exiting process because of exception')\n        print >> sys.stderr, str(e)\n        sys.exit(1)\n\n    sys.exit(0)\n\nif __name__ == '__main__':\n    main()\n"
  },
  {
    "path": "setup.py",
    "content": "from setuptools import setup, find_packages\n\nsetup(\n    name=\"pullbox\",\n    version='0.1',\n    description=\"A dead-simle Dropbox alternative using Git\",\n    keywords='dropbox,file synchronization,git',\n    author='Prashanth Ellina',\n    author_email=\"Use the github issues\",\n    url=\"https://github.com/prashanthellina/pullbox\",\n    license='MIT License',\n    install_requires=[\n        'filelock',\n        'watchdog',\n    ],\n    package_dir={'pullbox': 'pullbox'},\n    packages=find_packages('.'),\n    include_package_data=True,\n\n    entry_points = {\n        'console_scripts': [\n            'pullbox = pullbox:main',\n        ],\n    },\n)\n"
  }
]