Repository: prashanthellina/pullbox
Branch: master
Commit: 35d970072dbf
Files: 5
Total size: 14.4 KB
Directory structure:
gitextract_2w4e_2aq/
├── .gitignore
├── LICENSE
├── README.md
├── pullbox/
│   └── __init__.py
└── setup.py
================================================
FILE CONTENTS
================================================
================================================
FILE: .gitignore
================================================
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
# C extensions
*.so
# Distribution / packaging
.Python
env/
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
*.egg-info/
.installed.cfg
*.egg
# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec
# Installer logs
pip-log.txt
pip-delete-this-directory.txt
# Unit test / coverage reports
htmlcov/
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*,cover
# Translations
*.mo
*.pot
# Django stuff:
*.log
# Sphinx documentation
docs/_build/
# PyBuilder
target/
================================================
FILE: LICENSE
================================================
The MIT License (MIT)
Copyright (c) 2015 Prashanth Ellina
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
================================================
FILE: README.md
================================================
# Pullbox
`Pullbox` is a very simple, Git-based alternative to Dropbox. It currently
works on Linux-like operating systems and OSX, but not on Windows.
## Why?
Dropbox works well enough and runs on many platforms. Although your data is
on someone else's server, it is probably safer over there than with you (in
most cases). I wrote `Pullbox` to overcome one specific limitation of
Dropbox: symlinks. Dropbox does not "see" symlinks. Although it synchronizes
the content a symlink points to, it forgets the fact that it is a symlink
when you sync to another computer.
I want to maintain my personal wiki and journal as plain text files. To
organize my notes, I depend on symlinks (so I can put the same note under
multiple directories). Dropbox does not support this use-case.
## How does it work?
`Pullbox` needs SSH access to a remote Linux server that has `git` and
`inotifywait` commands installed. This serves as the backup location for
your local data.
`Pullbox` monitors file system activity in the local directory and
automatically pushes changes to the remote repo. The monitoring is done
using `inotify` on Linux, `FSEvents` on OSX, and `kqueue` on BSD-style OSs.
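Not every file system event should trigger a push. The following is a minimal sketch of the filter that `LocalFSEventHandler.on_any_event` in `pullbox/__init__.py` applies, pulled out as a plain function so it can be read without `watchdog`. The function name `should_sync` and its flat argument list are assumptions made for illustration; the real handler receives a `watchdog` event object.

```python
import os


def should_sync(src_path, event_type, is_directory):
    """Return True when an event should mark the local repo as changed.

    Hypothetical helper mirroring the checks in LocalFSEventHandler.
    """
    # Anything under a .git directory is Git's own bookkeeping.
    is_git_dir = '.git' in src_path.split(os.path.sep)
    # Dot files (editor swap files and the like) are ignored.
    is_dot_file = os.path.basename(src_path).startswith('.')
    # A "modified" event on a directory only says that an entry inside it
    # changed; that entry produces its own event.
    is_dir_modified = event_type == 'modified' and is_directory
    return not (is_git_dir or is_dot_file or is_dir_modified)
```

An event that passes this filter only sets a flag; the push itself happens later in a separate thread.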
`Pullbox` also monitors file system activity on the remote repo and
automatically pulls changes to the local repo when needed. This is achieved
by using `ssh` and running `inotifywait` on the server (a lot like AJAX
long-polling except we use SSH here instead of HTTP).
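The long-polling idea can be sketched as follows. The `inotifywait` flags are the ones used by `track_remote_changes` in `pullbox/__init__.py`; the function names and the injectable `run` parameter are assumptions added here so the sketch can be exercised without a server. It also treats only exit code 0 as "something changed", which is stricter than Pullbox itself.

```python
import shlex
import subprocess


def build_watch_command(server, remote_name):
    """Build the argv for a blocking remote watch (hypothetical helper)."""
    remote = ('inotifywait -rqq -e modify -e move -e create -e delete %s'
              % remote_name)
    return ['ssh', server] + shlex.split(remote)


def wait_for_remote_change(server, remote_name, run=subprocess.call):
    """Block until the remote repo changes or the connection ends.

    Returns True when the remote command exited with code 0. Any other
    code (for example an SSH failure) is reported as False.
    """
    return run(build_watch_command(server, remote_name)) == 0
```

Calling `wait_for_remote_change('prashanth@example.com', 'notes')` in a loop gives the long-poll behaviour: each call holds one SSH connection open until the server reports an event, after which the client can run `git pull`.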
## Setting up
### Backup Server
The instructions below assume Ubuntu Linux; adapt them to the distro you
actually have. Let us say the domain name of the backup server is
`example.com`.
```bash
sudo apt-get install git inotify-tools
```
### Your local machine
```bash
sudo pip install git+https://github.com/prashanthellina/pullbox
```
I am assuming that the username on the backup server is `prashanth`. We need
to set up password-less SSH login to `prashanth@example.com` (instructions
[here](http://www.linuxproblem.org/art_9.html)).
`Pullbox` depends on password-less login, so make sure it is working before
proceeding.
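One way to confirm that password-less login works is to ask `ssh` to fail instead of prompting. The sketch below is not part of Pullbox; the helper names and the `run` parameter are hypothetical. It relies on the standard OpenSSH options `BatchMode` and `ConnectTimeout`.

```python
import subprocess


def ssh_check_command(server):
    """Build an ssh invocation that never prompts for a password."""
    return ['ssh', '-o', 'BatchMode=yes', '-o', 'ConnectTimeout=10',
            server, 'true']


def can_login_without_password(server, run=subprocess.call):
    """Return True when `ssh server true` succeeds without any prompt."""
    return run(ssh_check_command(server)) == 0
```

If `can_login_without_password('prashanth@example.com')` returns False, fix the SSH key setup before starting `pullbox`.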
Let us assume that your local directory that you want to sync is
`/home/prashanth/notes`. Make sure that this directory is *not* present
the very first time you start `Pullbox`. This allows `Pullbox` to clone
the remote repo properly. You can run `Pullbox` manually by running the
following command.
```bash
pullbox --log-level DEBUG /home/prashanth/notes prashanth@example.com
```
That's it! Your directory will now be kept in sync with the remote
server repo for as long as the `pullbox` command above keeps running. To
have the command restarted automatically (after a system reboot, an
accidental kill, etc.), put an entry in your crontab like so:
```bash
* * * * * /usr/local/bin/pullbox --log-level DEBUG --log /tmp/pullbox.log --quiet /home/prashanth/notes prashanth@example.com > /dev/null 2>&1
```
================================================
FILE: pullbox/__init__.py
================================================
import os
import sys
import time
import shlex
import logging
import logging.handlers
import tempfile
import datetime
import argparse
import threading
import subprocess
from distutils.spawn import find_executable

import filelock
from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler

# prevent watchdog module from writing DEBUG logs
# as that is adding too much confusion during debugging
logging.getLogger('watchdog').setLevel(logging.WARNING)

DEFAULT_LOCK_FILE = os.path.join(tempfile.gettempdir(), 'pullbox.lock')

class PullboxException(Exception): pass

class PullboxCalledProcessError(PullboxException):
    def __init__(self, cmd, retcode):
        self.cmd = cmd
        self.retcode = retcode

    def __str__(self):
        return 'PullboxCalledProcessError(code=%s, cmd="%s")' % \
                    (self.retcode, self.cmd)

    __unicode__ = __str__
    __repr__ = __str__

LOG_FORMATTER = logging.Formatter('%(asctime)s %(levelname)s %(message)s')
LOG_DEFAULT_FNAME = 'log.pullbox'
MAX_LOG_FILE_SIZE = 10 * 1024 * 1024 # 10MB
def init_logger(fname, log_level, quiet=False):
    log = logging.getLogger('')

    stderr_hdlr = logging.StreamHandler(sys.stderr)
    rofile_hdlr = logging.handlers.RotatingFileHandler(fname,
        maxBytes=MAX_LOG_FILE_SIZE, backupCount=10)
    hdlrs = (stderr_hdlr, rofile_hdlr)

    for hdlr in hdlrs:
        hdlr.setFormatter(LOG_FORMATTER)

    # always log to the file; log to stderr only when --quiet is not given
    log.addHandler(rofile_hdlr)
    if not quiet: log.addHandler(stderr_hdlr)

    log.setLevel(getattr(logging, log_level.upper()))

    return log
class LocalFSEventHandler(FileSystemEventHandler):
    def __init__(self, on_change):
        self.on_change = on_change

    def on_any_event(self, evt):
        is_git_dir = '.git' in evt.src_path.split(os.path.sep)
        is_dot_file = os.path.basename(evt.src_path).startswith('.')
        is_dir_modified = evt.event_type == 'modified' and evt.is_directory

        if not (is_git_dir or is_dot_file or is_dir_modified):
            self.on_change()
class Pullbox(object):
    BINARIES_NEEDED = ['git', 'ssh']
    BINARIES_NEEDED_REMOTE = ['git', 'inotifywait']

    # Frequency at which client must poll data
    # server for data changes (if any)
    POLL_INTERVAL = 60 # seconds

    def __init__(self, server, path, log, suffix):
        self.server = server
        self.path = os.path.abspath(path)
        self.log = log

        self.remote_name = os.path.basename(path.rstrip(os.path.sep))
        if suffix:
            self.remote_name += ".git"

        # Setup monitoring of local repo changes
        self.fs_observer = Observer()
        self.fs_observer.schedule(LocalFSEventHandler(self.on_fs_change),
            path, recursive=True)
        self.fs_changed = True

        # time at which changes (if any) need to be downloaded
        # from remote repo
        self.next_pull_at = 0

    def on_fs_change(self):
        self.fs_changed = True

    def invoke_process(self, cmd, ignore_code=0):
        self.log.debug('invoke_process(%s)' % cmd)

        # discard child output; close the handle once the child exits
        with open(os.devnull, 'w') as devnull:
            r = subprocess.call(shlex.split(cmd), stdout=devnull, stderr=devnull)

        # 130 is the exit code of a child interrupted by Ctrl-C
        if r == 130:
            raise KeyboardInterrupt

        if not isinstance(ignore_code, (list, tuple)):
            ignore_code = [ignore_code]

        if r != 0 and r not in ignore_code:
            raise PullboxCalledProcessError(cmd, r)

    def check_binaries(self):
        self.log.debug('Checking presence of local binaries "%s"' % \
            ', '.join(self.BINARIES_NEEDED))

        for binf in self.BINARIES_NEEDED:
            if not find_executable(binf):
                raise PullboxException('"%s" binary required' % binf)

    def check_remote_binaries(self):
        self.log.debug('Checking presence of remote binaries "%s"' % \
            ', '.join(self.BINARIES_NEEDED_REMOTE))

        for binf in self.BINARIES_NEEDED_REMOTE:
            cmd = 'ssh %s which %s' % (self.server, binf)
            try:
                self.invoke_process(cmd)
            except PullboxCalledProcessError:
                raise PullboxException(
                    '"%s" remote binary required (or) '
                    'could not connect to server' % binf)

    def ensure_remote_repo(self):
        # git init creates a new repo if none exists else
        # "reinitializes" which is like a no-op for our purposes
        cmd = 'ssh %s git init --bare %s' % (self.server,
            self.remote_name)
        self.invoke_process(cmd)

    def keeprunning(self, fn, wait=0, error_wait=1):
        '''
        Keep @fn running on success or failure in an infinite loop
        - On failure, log exception, wait for @error_wait seconds
        - On success, wait for @wait seconds
        '''

        while 1:
            try:
                fn()
            except (SystemExit, KeyboardInterrupt): raise
            except:
                # __name__ works on both Python 2 and Python 3
                self.log.exception('During run of "%s" func' % fn.__name__)
                time.sleep(error_wait)
                continue

            time.sleep(wait)

    def track_remote_changes(self):
        # blocks until inotifywait on the server reports a change
        cmd = 'ssh %s inotifywait -rqq -e modify -e move -e create -e delete %s' % \
            (self.server, self.remote_name)
        self.invoke_process(cmd)
        self.next_pull_at = time.time()

    def init_local_repo(self):
        bpath = os.path.dirname(self.path.rstrip(os.path.sep))
        if not os.path.exists(bpath):
            os.makedirs(bpath)

        cwd = os.getcwd()
        try:
            os.chdir(bpath)
            self.invoke_process('git clone %s:%s' % (self.server, self.remote_name))

            # add a dummy file to avoid trouble
            os.chdir(self.path)
            self.invoke_process('touch README.md')
            self.invoke_process('git add README.md')
            self.invoke_process('git commit -a -m "initial"')
            self.invoke_process('git push origin master')
        finally:
            os.chdir(cwd)

    def pull_changes(self):
        if self.next_pull_at > time.time(): return

        if not os.path.exists(self.path):
            self.init_local_repo()

        cwd = os.getcwd()
        try:
            os.chdir(self.path)
            self.invoke_process('git pull')
        finally:
            os.chdir(cwd)

        self.next_pull_at = time.time() + self.POLL_INTERVAL

    def push_changes(self):
        if not self.fs_changed: return

        cwd = os.getcwd()
        try:
            os.chdir(self.path)
            self.invoke_process('git add .')
            dt = datetime.datetime.utcnow().strftime('%Y%m%dT%H%M%S')
            msg = 'auto commit at %s' % dt
            # git commit exits with 1 when there is nothing to commit
            self.invoke_process('git commit -a -m "%s"' % msg, ignore_code=1)
            self.invoke_process('git push origin master', ignore_code=-2)
        finally:
            os.chdir(cwd)

        self.fs_changed = False

    def run_thread(self, target):
        t = threading.Thread(target=target)
        t.daemon = True
        t.start()
        return t

    def start(self):
        # ensure required binaries are available in PATH
        self.check_binaries()
        self.check_remote_binaries()

        # ensure remote git repo is present (if not init one)
        self.ensure_remote_repo()

        # ensure local repo has latest data from server
        self.pull_changes()

        # start listening for changes in local repo
        self.fs_observer.start()

        # start threads
        K = self.keeprunning
        R = self.run_thread

        self.thread_track_remote_changes = R(lambda: K(self.track_remote_changes))
        self.thread_pull_changes = R(lambda: K(self.pull_changes, wait=0.1))
        self.thread_push_changes = R(lambda: K(self.push_changes, wait=0.1))

        # wait for threads to complete
        self.thread_track_remote_changes.join()
        self.thread_pull_changes.join()
        self.thread_push_changes.join()
def main():
    parser = argparse.ArgumentParser(description='Pullbox')

    parser.add_argument('path', help='Path to data directory')
    parser.add_argument('server', help='IP/Domain name of backup server')
    parser.add_argument('--standard-suffix', action='store_true',
        help='Makes Pullbox use the standard .git suffix for bare git repos (server side only)')

    parser.add_argument('--log', default=LOG_DEFAULT_FNAME,
        help='Name of log file')
    parser.add_argument('--log-level', default='WARNING',
        help='Logging level as picked from the logging module')
    parser.add_argument('--quiet', action='store_true')

    parser.add_argument('--lock-file', default=DEFAULT_LOCK_FILE,
        help='Lock file to prevent multiple instances from running')

    args = parser.parse_args()

    lock = filelock.FileLock(args.lock_file)

    try:
        # timeout=0: fail at once if another instance holds the lock
        with lock.acquire(timeout=0):
            log = init_logger(args.log, args.log_level, quiet=args.quiet)

            p = Pullbox(args.server, args.path, log, args.standard_suffix)
            p.start()

    except (SystemExit, KeyboardInterrupt): sys.exit(1)
    # "except ... as e" and sys.stderr.write work on Python 2.6+ and Python 3
    except Exception as e:
        log = logging.getLogger('')
        log.exception('exiting process because of exception')
        sys.stderr.write(str(e) + '\n')
        sys.exit(1)

    sys.exit(0)

if __name__ == '__main__':
    main()
================================================
FILE: setup.py
================================================
from setuptools import setup, find_packages

setup(
    name="pullbox",
    version='0.1',
    description="A dead-simple Dropbox alternative using Git",
    keywords='dropbox,file synchronization,git',
    author='Prashanth Ellina',
    author_email="Use the github issues",
    url="https://github.com/prashanthellina/pullbox",
    license='MIT License',
    install_requires=[
        'filelock',
        'watchdog',
    ],
    package_dir={'pullbox': 'pullbox'},
    packages=find_packages('.'),
    include_package_data=True,

    entry_points = {
        'console_scripts': [
            'pullbox = pullbox:main',
        ],
    },
)
SYMBOL INDEX (23 symbols across 1 files)
FILE: pullbox/__init__.py
class PullboxException (line 24) | class PullboxException(Exception): pass
class PullboxCalledProcessError (line 26) | class PullboxCalledProcessError(PullboxException):
method __init__ (line 27) | def __init__(self, cmd, retcode):
method __str__ (line 31) | def __str__(self):
function init_logger (line 42) | def init_logger(fname, log_level, quiet=False):
class LocalFSEventHandler (line 61) | class LocalFSEventHandler(FileSystemEventHandler):
method __init__ (line 62) | def __init__(self, on_change):
method on_any_event (line 65) | def on_any_event(self, evt):
class Pullbox (line 73) | class Pullbox(object):
method __init__ (line 81) | def __init__(self, server, path, log, suffix):
method on_fs_change (line 100) | def on_fs_change(self):
method invoke_process (line 103) | def invoke_process(self, cmd, ignore_code=0):
method check_binaries (line 118) | def check_binaries(self):
method check_remote_binaries (line 126) | def check_remote_binaries(self):
method ensure_remote_repo (line 139) | def ensure_remote_repo(self):
method keeprunning (line 147) | def keeprunning(self, fn, wait=0, error_wait=1):
method track_remote_changes (line 165) | def track_remote_changes(self):
method init_local_repo (line 171) | def init_local_repo(self):
method pull_changes (line 190) | def pull_changes(self):
method push_changes (line 205) | def push_changes(self):
method run_thread (line 223) | def run_thread(self, target):
method start (line 229) | def start(self):
function main (line 256) | def main():