[
  {
    "path": ".github/workflows/test.yaml",
    "content": "name: test-workflow\n\non: [push]\n\npermissions:\n  contents: read\n\njobs:\n  lint:\n    runs-on: ubuntu-24.04\n    steps:\n      - uses: actions/checkout@v4\n      - name: Set up Python\n        uses: actions/setup-python@v5\n        with:\n          python-version: \"3.13\"\n      - name: Install dependencies\n        run: |\n          python -m pip install --upgrade pip\n          pip install -r requirements.txt\n      - name: Format\n        run: |\n          ruff format --check --no-cache\n      - name: Lint\n        run: |\n          ruff check --no-cache\n\n  test:\n    runs-on: ubuntu-24.04\n    strategy:\n      fail-fast: false\n      matrix:\n        python-version: [\"3.10\", \"3.11\", \"3.12\", \"3.13\"]\n    services:\n      redis:\n        image: redis:7.2.4\n        options: >-\n          --health-cmd \"redis-cli ping\"\n          --health-interval 10s\n          --health-timeout 5s\n          --health-retries 5\n        ports:\n          - 6379:6379\n    steps:\n      - uses: actions/checkout@v4\n      - name: Set up Python\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n      - name: Install dependencies\n        run: |\n          python -m pip install --upgrade pip\n          pip install -r requirements.txt\n      - name: Test\n        run: |\n          pytest\n"
  },
  {
    "path": ".gitignore",
    "content": "*.pyc\nvenv/\n*.egg\n*.egg-info/\n"
  },
  {
    "path": "LICENSE",
    "content": "The MIT License (MIT)\n\nCopyright (c) 2015-2024 Elastic Inc. (Close)\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the \"Software\"), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n"
  },
  {
    "path": "README.md",
    "content": "# `redis-hashring`\n\n`redis-hashring` is a Python library that implements a consistent hash ring for\nbuilding distributed applications. The hash ring is stored in Redis.\n\n## The problem\n\nLet's assume you're building a distributed application that's responsible for\nsyncing accounts. Accounts are synced continuously, e.g. by keeping a\nconnection open. Given the large number of accounts, the application can't run\nin one process and has to be distributed and split up into multiple processes.\nAlso, if one of the processes fails or crashes, other machines need to be able\nto take over accounts quickly. The load should be balanced equally between the\nmachines.\n\n## The solution\n\nA solution to this problem is to use a consistent hash ring: Different Python\ninstances (\"nodes\") are responsible for a different set of keys. In our account\nexample, the account IDs could be used as keys. A consistent hash ring is a\nlarge (integer) space that wraps around to form a circle. Each node picks a few\nrandom points (\"replicas\") on the hash ring when starting. Keys are hashed and\nlooked up on the hash ring: In order to find the node that's responsible for a\ngiven key, we move on the hash ring until we find the next smaller point that\nbelongs to a replica. The reason for multiple replicas per node is to ensure\nbetter distribution of the keys amongst the nodes. It can also be used to give\ncertain nodes more weight. The ring is automatically rebalanced when a node\nenters or leaves the ring: If a node crashes or shuts down, its replicas are\nremoved from the ring.\n\n## How it works\n\nThe ring is stored as a sorted set (ZSET) in Redis. Each replica is a member of\nthe set, scored by its expiration time. Each node needs to periodically\nrefresh the score of its replicas to stay on the ring.\n\nThe ring contains 2^32 points, and a replica is created by randomly placing a\npoint on the ring. 
A replica of a node is responsible for the range of points\nfrom its randomly generated starting point until the starting point of the next\nnode or replica.\n\nTo check if a node is responsible for a given key, the key's position on the\nring is determined by hashing the key using xxHash (CRC-32 is also supported\nfor backwards-compatibility).\n\nFor example, let's say there are two nodes, having one replica each. The first\nnode is at 1 000 000 000 (1e9), the second at 2e9. In this case, the first node\nis responsible for the range [1e9, 2e9-1], the second node is responsible for\n[2e9, 2^32-1] and [0, 1e9-1], since the ring wraps. To check to which node the\nkey *hello* belongs, we compute its hash, which is 4 211 111 929, and the value\nis therefore on the second node.\n\nSince the node replica points are picked randomly, it is recommended to have\nmultiple replicas of the node on a ring to ensure a more even distribution of\nthe nodes.\n\n## Demo\n\nAs an example, let's assume you have a process that is responsible for syncing\naccounts. In this example they are numbered from 0 to 99. 
Starting node 1 will\nassign all accounts to node 1, since it's the only node on the ring.\n\nWe can see this by running the provided example script on node 1:\n\n```\n% python example.py\nINFO:root:PID 80721, 100 keys ([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99])\n```\n\nWe can print the ring for debugging and see all the nodes and replicas on the\nring:\n\n```\n% python example.py --print\nHash ring \"ring\" replicas:\nStart      Range  Delay   Node\n 706234936  2.97%      0s mbp.local:80721:249d729d\n 833679955  3.58%      0s mbp.local:80721:aa60d44c\n 987624694 24.44%      0s mbp.local:80721:aa7d4433\n2037338983  3.41%      0s mbp.local:80721:e810d068\n2183761853  3.55%      0s mbp.local:80721:3917f572\n2336151471  2.82%      0s mbp.local:80721:e42b1b46\n2457297989  4.40%      0s mbp.local:80721:e6bd5726\n2646391033  4.37%      0s mbp.local:80721:6de2fc22\n2834073726  5.30%      0s mbp.local:80721:b6f950b2\n3061910569  3.96%      0s mbp.local:80721:d176c9e2\n3231812046  5.70%      0s mbp.local:80721:65432143\n3476455773  5.71%      0s mbp.local:80721:f2b29682\n3721589736  0.65%      0s mbp.local:80721:51d0cb09\n3749333446  5.53%      0s mbp.local:80721:3572f718\n3986767934  4.39%      0s mbp.local:80721:42147f45\n4175523935 19.22%      0s mbp.local:80721:296c9522\n\nHash ring \"ring\" nodes:\nRange    Replicas Delay   Hostname             PID\n100.00%       16      0s mbp.local            80721\n```\n\nWe can see that the node is responsible for the entire ring (range 100%) and\nhas 16 replicas on the ring.\n\nNow let's start another node by running the script again. 
It will add its\nreplicas to the ring and notify all the remaining nodes.\n\n```\n% python example.py\nINFO:root:PID 80721, 51 keys ([1, 5, 8, 9, 10, 14, 17, 20, 21, 24, 25, 28, 30, 32, 33, 34, 36, 38, 41, 42, 45, 46, 49, 50, 52, 54, 56, 58, 59, 60, 61, 62, 65, 66, 68, 69, 71, 74, 75, 78, 79, 81, 82, 85, 86, 87, 88, 89, 92, 93, 96])\n```\n\nNode 1 will rebalance and is now only responsible for keys not in node 2:\n\n```\nINFO:root:PID 80808, 49 keys ([0, 2, 3, 4, 6, 7, 11, 12, 13, 15, 16, 18, 19, 22, 23, 26, 27, 29, 31, 35, 37, 39, 40, 43, 44, 47, 48, 51, 53, 55, 57, 63, 64, 67, 70, 72, 73, 76, 77, 80, 83, 84, 90, 91, 94, 95, 97, 98, 99])\n```\n\nWe can inspect the ring:\n\n```\n% python example.py --print\nHash ring \"ring\" replicas:\nStart      Range  Delay   Node\n 204632062  1.06%      0s mbp.local:80808:f933c33c\n 250215779  0.36%      0s mbp.local:80808:3b104c45\n 265648189  1.15%      0s mbp.local:80808:84d71125\n 315059885  2.77%      0s mbp.local:80808:bab5a03c\n 434081415  6.34%      0s mbp.local:80808:6eec1b26\n 706234936  2.97%      0s mbp.local:80721:249d729d\n 833679955  1.59%      0s mbp.local:80721:aa60d44c\n 901926411  2.00%      0s mbp.local:80808:bd6f3b27\n 987624694  2.87%      0s mbp.local:80721:aa7d4433\n1110943067  5.42%      0s mbp.local:80808:abfa5d78\n1343923832  0.83%      0s mbp.local:80808:5261947f\n1379658747  4.70%      0s mbp.local:80808:cb0904de\n1581392642  1.06%      0s mbp.local:80808:3050daa3\n1627017290  9.55%      0s mbp.local:80808:8e1cef12\n2037338983  3.41%      0s mbp.local:80721:e810d068\n2183761853  3.55%      0s mbp.local:80721:3917f572\n2336151471  2.82%      0s mbp.local:80721:e42b1b46\n2457297989  4.40%      0s mbp.local:80721:e6bd5726\n2646391033  4.37%      0s mbp.local:80721:6de2fc22\n2834073726  2.30%      0s mbp.local:80721:b6f950b2\n2932842903  3.01%      0s mbp.local:80808:58f09769\n3061910569  3.08%      0s mbp.local:80721:d176c9e2\n3194206736  0.88%      0s mbp.local:80808:ce94a1cf\n3231812046  5.70%      
0s mbp.local:80721:65432143\n3476455773  0.21%      0s mbp.local:80721:f2b29682\n3485592199  5.49%      0s mbp.local:80808:6fc107a3\n3721589736  0.65%      0s mbp.local:80721:51d0cb09\n3749333446  0.68%      0s mbp.local:80721:3572f718\n3778349273  4.85%      0s mbp.local:80808:e7cc7485\n3986767934  1.29%      0s mbp.local:80721:42147f45\n4042192844  3.10%      0s mbp.local:80808:001590b5\n4175523935  7.55%      0s mbp.local:80721:296c9522\n\nHash ring \"ring\" nodes:\nRange    Replicas Delay   Hostname             PID\n47.42%       16      0s mbp.local            80721\n52.58%       16      0s mbp.local            80808\n```\n\n## `gevent` example\n\n`redis-hashring` provides a `GeventRingNode` class for `gevent`-based\napplications. The `GeventRingNode.start()` method spawns a greenlet that\ninitializes the ring and periodically updates the node's replicas.\n\nAn example app could look as follows:\n\n```python\nfrom redis import Redis\nfrom redis_hashring import GeventRingNode\n\nKEY = \"example-ring\"\n\nredis = Redis()\nnode = GeventRingNode(redis, KEY)\nnode.start()\n\n\ndef get_items():\n    \"\"\"\n    Implement this method and return items to be processed.\n    \"\"\"\n    raise NotImplementedError()\n\n\ndef process_items(items):\n    \"\"\"\n    Implement this method and process the given items.\n    \"\"\"\n    raise NotImplementedError()\n\n\ntry:\n    while True:\n        # Only process items this node is responsible for.\n        items = [item for item in get_items() if node.contains(item)]\n        process_items(items)\nexcept KeyboardInterrupt:\n    pass\n\nnode.stop()\n```\n\n## Implementation considerations\n\nWhen implementing a distributed application using `redis-hashring`, be aware of\nthe following:\n\n- Locking\n\n  When nodes are added to the ring, multiple nodes might assume they're\n  responsible for the same key until they are notified about the new state of\n  the ring. 
Depending on the application, locking may be necessary to avoid\n  duplicate processing.\n\n  For example, in the demo above the node could add a per-account-ID lock if an\n  account should never be synced by multiple nodes at the same time. This can\n  be done using a Redis lock class or any other distributed lock.\n\n- Limit\n\n  It is recommended to add an upper limit to the number of keys a node can\n  process to avoid overloading a node when there are few nodes on the ring or\n  all nodes need to be restarted.\n\n  For example, in the demo above we could implement a limit of 50 accounts, if\n  we know that a node may not be capable of syncing much more. In this case,\n  multiple nodes would need to be running to sync all the accounts. Also note\n  that the ring is not usually equally balanced, so running 2 nodes wouldn't be\n  enough in this example.\n"
  },
  {
    "path": "example.py",
    "content": "import argparse\nimport logging\nimport os\nimport sys\nimport time\n\nimport redis\n\nfrom redis_hashring import RingNode\n\nN_KEYS = 100\n\nlogging.basicConfig(level=logging.DEBUG)\n\n\ndef _parse_arguments():\n    parser = argparse.ArgumentParser(\"Hash ring example.\", add_help=False)\n    parser.add_argument(\n        \"--host\", \"-h\", default=\"localhost\", help=\"Redis hostname.\"\n    )\n    parser.add_argument(\n        \"--port\", \"-p\", type=int, default=6379, help=\"Redis port.\"\n    )\n    parser.add_argument(\n        \"--print\",\n        action=\"store_true\",\n        dest=\"print\",\n        help=\"Print the hash ring.\",\n    )\n    parser.add_argument(\n        \"--help\",\n        action=\"help\",\n        default=argparse.SUPPRESS,\n        help=\"Show this help message and exit.\",\n    )\n    return parser.parse_args()\n\n\nif __name__ == \"__main__\":\n    args = _parse_arguments()\n\n    print(f\"Attempting to connect to Redis at {args.host}:{args.port}.\")\n    r = redis.Redis(args.host, args.port)\n    try:\n        r.ping()\n    except redis.exceptions.ConnectionError:\n        print(\"Failed to connect to Redis.\")\n        sys.exit(1)\n\n    node = RingNode(r, \"ring\")\n\n    if args.print:\n        node.debug_print()\n        sys.exit()\n\n    pid = os.getpid()\n\n    node.start()\n\n    try:\n        while True:\n            keys = [key for key in range(N_KEYS) if node.contains(str(key))]\n            logging.info(\"PID %d, %d keys (%s)\", pid, len(keys), repr(keys))\n            time.sleep(2)\n    except KeyboardInterrupt:\n        pass\n\n    node.stop()\n"
  },
  {
    "path": "pyproject.toml",
    "content": "[tool.ruff]\ntarget-version = \"py38\"\nline-length = 79\nexclude = [\n    \".git\",\n    \"venv\",\n    \".venv\",\n    \"__pycache__\",\n]\n\n[tool.ruff.lint]\nextend-select = [\"I\"]\nunfixable = [\n    # Variable assigned but never used - automatically removing the assignment\n    # is annoying when running autofix on work-in-progress code.\n    \"F841\",\n]\n\n[tool.ruff.lint.flake8-tidy-imports]\nban-relative-imports = \"all\"\n\n[tool.ruff.lint.flake8-tidy-imports.banned-api]\n\"functools.partial\".msg = \"Use a lambda or a named function instead. Partials don't type check correctly.\"\n\"datetime.datetime.utcnow\".msg = \"Use `datetime.datetime.now(datetime.timezone.utc).replace(tzinfo=None)` instead.\"\n\"datetime.datetime.utcfromtimestamp\".msg = \"Use `datetime.datetime.fromtimestamp(timestamp, datetime.timezone.utc).replace(tzinfo=None)` instead.\"\n\n[tool.ruff.lint.isort]\ncombine-as-imports = true\nforced-separate = [\"tests\"]\n\n[tool.ruff.lint.pydocstyle]\n# https://google.github.io/styleguide/pyguide.html#383-functions-and-methods\nconvention = \"google\"\n\n[tool.ruff.lint.flake8-annotations]\nignore-fully-untyped = true\n\n[tool.pytest.ini_options]\ntimeout = 180\npython_files = \"tests.py\"\ntestpaths = [\".\"]\nxfail_strict = true\n"
  },
  {
    "path": "redis_hashring/__init__.py",
    "content": "import binascii\nimport collections\nimport enum\nimport operator\nimport os\nimport random\nimport select\nimport socket\nimport threading\nimport time\n\ntry:\n    import xxhash\nexcept ImportError:\n    xxhash = None\n\n# Amount of points on the ring. Must not be higher than 2**32.\nRING_SIZE = 2**32\n\n# Default amount of replicas per node.\nRING_REPLICAS = 16\n\n\nclass HashAlgorithm(enum.Enum):\n    CRC32 = \"crc32\"\n    XXHASH = \"xxhash\"\n\n\n# How often to update a node's heartbeat.\nPOLL_INTERVAL = 10\n\n# After how much time a node is considered to be dead.\nNODE_TIMEOUT = 60\n\n# How often expired nodes are cleaned up from the ring.\nCLEANUP_INTERVAL = 120\n\n\ndef _hash_with_xxhash(key):\n    return xxhash.xxh32(key.encode()).intdigest() % RING_SIZE\n\n\ndef _hash_with_crc32(key):\n    return binascii.crc32(key.encode()) % RING_SIZE\n\n\ndef _decode(data):\n    # Compatibility with different redis-py `decode_responses` settings.\n    if isinstance(data, bytes):\n        return data.decode()\n    else:\n        return data\n\n\nclass RingNode(object):\n    \"\"\"\n    A node in a Redis hash ring.\n\n    Each node may have multiple replicas on the ring for more balanced hashing.\n\n    The ring is stored as follows in Redis:\n\n    ZSET <key>\n    Represents the ring in Redis. The keys of this ZSET represent\n    \"start:replica_name\", where start is the start of the range for which the\n    replica is responsible.\n\n    CHANNEL <key>\n    Represents a pubsub channel in Redis which receives a message every time\n    the ring structure has changed.\n\n    Simple usage example:\n\n    ```\n    node = RingNode(redis, key)\n    node.start()\n\n    while is_running:\n        # Only process items this node is responsible for. 
`item` should be an\n        # object that can be encoded to bytes by calling `item.encode()` on it,\n        # like a `str`.\n        items = [item for item in get_items() if node.contains(item)]\n        process_items(items)\n\n    node.stop()\n    ```\n\n    Using CRC-32 (if you need to support hashrings created before xxHash\n    support was introduced):\n\n    ```\n    from redis_hashring import RingNode, HashAlgorithm\n\n    node = RingNode(redis, key, hash_algorithm=HashAlgorithm.CRC32)\n    node.start()\n    ```\n\n    As a context manager:\n\n    ```\n    with RingNode(redis, key) as node:\n        while is_running:\n            # Only process items this node is responsible for. `item` should be\n            # an object that can be encoded to bytes by calling `item.encode()`\n            # on it, like a `str`.\n            items = [item for item in get_items() if node.contains(item)]\n            process_items(items)\n    ```\n    \"\"\"\n\n    def __init__(\n        self,\n        conn,\n        key,\n        *,\n        n_replicas=RING_REPLICAS,\n        hash_algorithm=HashAlgorithm.XXHASH,\n    ):\n        \"\"\"\n        Initializes a Redis hash ring node.\n\n        Args:\n            conn: The Redis connection to use.\n            key: A key to use for this node.\n            n_replicas: Number of replicas this node should have on the ring.\n            hash_algorithm: Hash algorithm to use. It is recommended to use\n                `HashAlgorithm.XXHASH` (the default) because it provides better\n                uniform distribution than CRC-32 with faster hashing. 
If you\n                need to support hashrings created before we introduced support\n                for xxHash, use `HashAlgorithm.CRC32`.\n        \"\"\"\n        self._polling_thread = None\n        self._stop_polling_fd_r = None\n        self._stop_polling_fd_w = None\n\n        self._conn = conn\n        self._key = key\n\n        if hash_algorithm is HashAlgorithm.XXHASH:\n            if xxhash is None:\n                raise ImportError(\n                    \"xxhash library is required for XXHASH algorithm. \"\n                    \"Install with: pip install redis-hashring[xxhash]\"\n                )\n            self._hash_function = _hash_with_xxhash\n        elif hash_algorithm is HashAlgorithm.CRC32:\n            self._hash_function = _hash_with_crc32\n        else:\n            raise ValueError(\"Unexpected hash algorithm requested\")\n\n        host = socket.gethostname()\n        pid = os.getpid()\n\n        # Create unique identifiers for the replicas.\n        self._replicas = [\n            (\n                random.randrange(2**32),\n                \"{host}:{pid}:{id_}\".format(\n                    host=host,\n                    pid=pid,\n                    id_=binascii.hexlify(os.urandom(4)).decode(),\n                ),\n            )\n            for _ in range(n_replicas)\n        ]\n\n        # Number of nodes currently active in the ring.\n        self._node_count = 0\n        # List of tuples of ranges this node is responsible for, where a tuple\n        # (a, b) includes any N matching a <= N < b.\n        self._ranges = []\n\n        self._select = select.select\n\n    def _fetch_ring(self):\n        \"\"\"\n        Fetch the ring from Redis.\n\n        The fetched ring only includes active nodes. 
Returns a list of tuples\n        (start, replica) (see _fetch_ring_all docs for more details).\n        \"\"\"\n        expiry_time = time.time() - NODE_TIMEOUT\n        data = self._conn.zrangebyscore(self._key, expiry_time, \"INF\")\n\n        ring = []\n        for replica_data in data:\n            start, replica = _decode(replica_data).split(\":\", 1)\n            ring.append((int(start), replica))\n        return sorted(ring, key=operator.itemgetter(0))\n\n    def _fetch_ring_all(self):\n        \"\"\"\n        Fetch the ring from Redis.\n\n        The fetched ring will include inactive nodes. Returns a list of tuples\n        (start, replica, heartbeat, expired), where:\n        * start: start of the range for which the replica is responsible.\n        * replica: name of the replica.\n        * heartbeat: timestamp of the last heartbeat.\n        * expired: boolean denoting whether this replica is inactive.\n        \"\"\"\n        expiry_time = time.time() - NODE_TIMEOUT\n        data = self._conn.zrange(self._key, 0, -1, withscores=True)\n\n        ring = []\n        for replica_data, heartbeat in data:\n            start, replica = _decode(replica_data).split(\":\", 1)\n            ring.append(\n                (int(start), replica, heartbeat, heartbeat < expiry_time)\n            )\n        return sorted(ring, key=operator.itemgetter(0))\n\n    def debug_print(self):\n        \"\"\"\n        Prints the ring for debugging purposes.\n        \"\"\"\n        ring = self._fetch_ring_all()\n\n        print('Hash ring \"{key}\" replicas:'.format(key=self._key))\n\n        now = time.time()\n\n        n_replicas = len(ring)\n        if ring:\n            print(\n                \"{:10} {:6} {:7} {}\".format(\"Start\", \"Range\", \"Delay\", \"Node\")\n            )\n        else:\n            print(\"(no replicas)\")\n\n        nodes = collections.defaultdict(list)\n\n        for n, (start, replica, heartbeat, expired) in enumerate(ring):\n            hostname, pid, 
_ = replica.split(\":\")\n            node = \":\".join([hostname, pid])\n\n            abs_size = (ring[(n + 1) % n_replicas][0] - ring[n][0]) % RING_SIZE\n            size = 100.0 / RING_SIZE * abs_size\n            delay = int(now - heartbeat)\n            expired_str = \"(EXPIRED)\" if expired else \"\"\n\n            nodes[node].append((hostname, pid, abs_size, delay, expired))\n\n            print(\n                f\"{start:10} {size:5.2f}% {delay:6}s {replica} {expired_str}\"\n            )\n\n        print()\n        print('Hash ring \"{key}\" nodes:'.format(key=self._key))\n\n        if nodes:\n            print(\n                \"{:8} {:8} {:7} {:20} {:5}\".format(\n                    \"Range\", \"Replicas\", \"Delay\", \"Hostname\", \"PID\"\n                )\n            )\n        else:\n            print(\"(no nodes)\")\n\n        for _, v in nodes.items():\n            hostname, pid = v[0][0], v[0][1]\n            abs_size = sum(replica[2] for replica in v)\n            size = 100.0 / RING_SIZE * abs_size\n            delay = max(replica[3] for replica in v)\n            expired = any(replica[4] for replica in v)\n            count = len(v)\n            expired_str = \"(EXPIRED)\" if expired else \"\"\n            print(\n                f\"{size:5.2f}% {count:8} {delay:6}s {hostname:20} {pid:5}\"\n                f\" {expired_str}\"\n            )\n\n    def heartbeat(self):\n        \"\"\"\n        Add/update the node in Redis.\n\n        Needs to be called regularly by the node.\n        \"\"\"\n        pipeline = self._conn.pipeline()\n\n        now = time.time()\n\n        for replica in self._replicas:\n            pipeline.zadd(self._key, {f\"{replica[0]}:{replica[1]}\": now})\n        ret = pipeline.execute()\n\n        # Only notify the other nodes if we're not in the ring yet.\n        if any(ret):\n            self._notify()\n\n    def remove(self):\n        \"\"\"\n        Remove the node from the ring.\n        \"\"\"\n        
pipeline = self._conn.pipeline()\n\n        for replica in self._replicas:\n            pipeline.zrem(self._key, f\"{replica[0]}:{replica[1]}\")\n        pipeline.execute()\n\n        # Make sure this node won't contain any items.\n        self._node_count = 0\n        self._ranges = []\n\n        self._notify()\n\n    def _notify(self):\n        \"\"\"\n        Publish an update to the ring's activity channel.\n        \"\"\"\n        self._conn.publish(self._key, \"*\")\n\n    def cleanup(self):\n        \"\"\"\n        Removes expired nodes from the ring.\n        \"\"\"\n        expired = time.time() - NODE_TIMEOUT\n\n        if self._conn.zremrangebyscore(self._key, 0, expired):\n            self._notify()\n\n    def update(self):\n        \"\"\"\n        Fetches the updated ring from Redis and updates the current ranges.\n        \"\"\"\n        ring = self._fetch_ring()\n        nodes = set()\n        n_replicas = len(ring)\n\n        own_replicas = {r[1] for r in self._replicas}\n\n        self._ranges = []\n        for n, (start, replica) in enumerate(ring):\n            host, pid, _ = replica.split(\":\")\n            node = \":\".join([host, pid])\n            nodes.add(node)\n\n            if replica in own_replicas:\n                end = ring[(n + 1) % n_replicas][0] % RING_SIZE\n                if start < end:\n                    self._ranges.append((start, end))\n                elif end < start:\n                    self._ranges.append((start, RING_SIZE))\n                    self._ranges.append((0, end))\n                else:\n                    self._ranges.append((0, RING_SIZE))\n\n        self._node_count = len(nodes)\n\n    def get_ranges(self):\n        \"\"\"\n        Return the hash ring ranges that this node owns.\n        \"\"\"\n        return self._ranges\n\n    def get_node_count(self):\n        \"\"\"\n        Return the number of active nodes in the ring.\n        \"\"\"\n        return self._node_count\n\n    def contains(self, 
key):\n        \"\"\"\n        Check whether this node is responsible for the item.\n        \"\"\"\n        return self._contains_ring_point(self.key_as_ring_point(key))\n\n    def key_as_ring_point(self, key):\n        \"\"\"Turn a key into a point on a hash ring.\"\"\"\n        return self._hash_function(key)\n\n    def _contains_ring_point(self, n):\n        \"\"\"\n        Check whether this node is responsible for the ring point.\n        \"\"\"\n        for start, end in self._ranges:\n            if start <= n < end:\n                return True\n        return False\n\n    def poll(self):\n        \"\"\"\n        Keep a node in the hash ring.\n\n        This should be kept running for as long as the node needs to stay in\n        the ring. Can be run in a separate thread or in a greenlet. This takes\n        care of:\n        * Updating the heartbeat.\n        * Checking for ring updates.\n        * Cleaning up expired nodes periodically.\n        \"\"\"\n        pubsub = self._conn.pubsub()\n        pubsub.subscribe(self._key)\n        pubsub_fd = pubsub.connection._sock.fileno()\n\n        last_heartbeat = time.time()\n        self.heartbeat()\n\n        last_cleanup = time.time()\n        self.cleanup()\n\n        self._stop_polling_fd_r, self._stop_polling_fd_w = os.pipe()\n\n        try:\n            while True:\n                # Since Redis' `listen` method blocks, we use `select` to\n                # inspect the underlying socket to see if there is activity.\n                timeout = max(\n                    0.0, POLL_INTERVAL - (time.time() - last_heartbeat)\n                )\n                r, _, _ = self._select(\n                    [self._stop_polling_fd_r, pubsub_fd], [], [], timeout\n                )\n\n                if self._stop_polling_fd_r in r:\n                    os.close(self._stop_polling_fd_r)\n                    os.close(self._stop_polling_fd_w)\n                    self._stop_polling_fd_r = None\n                    
self._stop_polling_fd_w = None\n                    break\n\n                if pubsub_fd in r:\n                    while pubsub.get_message():\n                        pass\n                    self.update()\n\n                last_heartbeat = time.time()\n                self.heartbeat()\n\n                now = time.time()\n                if now - last_cleanup > CLEANUP_INTERVAL:\n                    last_cleanup = now\n                    self.cleanup()\n        finally:\n            pubsub.close()\n\n    def start(self):\n        \"\"\"\n        Start the node for threads-based applications.\n        \"\"\"\n        self._polling_thread = threading.Thread(target=self.poll, daemon=True)\n        self._polling_thread.start()\n\n    def stop(self):\n        \"\"\"\n        Stop the node for threads-based applications.\n        \"\"\"\n        if self._polling_thread:\n            while not self._stop_polling_fd_w:\n                # Let's give the thread some time to create the fd.\n                time.sleep(0.1)\n            os.write(self._stop_polling_fd_w, b\"1\")\n            self._polling_thread.join()\n            self._polling_thread = None\n        self.remove()\n\n    def __enter__(self):\n        self.start()\n        return self\n\n    def __exit__(self, *args, **kwargs):\n        self.stop()\n\n\nclass GeventRingNode(RingNode):\n    \"\"\"\n    A node in a Redis hash ring.\n\n    This works exactly the same as `RingNode`, except that `start` and `stop`\n    will create a gevent greenlet to maintain the node information up to date\n    with the hash ring.\n\n    For a usage example, see the documentation for `RingNode`.\n    \"\"\"\n\n    def __init__(self, *args, **kwargs):\n        self._polling_greenlet = None\n        super().__init__(*args, **kwargs)\n\n    def start(self):\n        \"\"\"\n        Start the node for gevent-based applications.\n        \"\"\"\n        import gevent\n        import gevent.select\n\n        self._select = 
gevent.select.select\n        self._polling_greenlet = gevent.spawn(self.poll)\n\n        # Even though `self.poll` will run `self.heartbeat` and `self.update`\n        # immediately as it starts, this is gevent and `self.poll` may take a\n        # while to run, depending on how long the greenlet that creates the\n        # node takes to yield. So we'll run these functions here to make sure\n        # the node is up to date immediately.\n        self.heartbeat()\n        self.update()\n\n    def stop(self):\n        \"\"\"\n        Stop the node for gevent-based applications.\n        \"\"\"\n        import gevent\n\n        if self._polling_greenlet:\n            while not self._stop_polling_fd_w:\n                # Yield to the hub so the polling greenlet can create the fd.\n                # `time.sleep` would block it unless gevent is monkey-patched.\n                gevent.sleep(0.1)\n            os.write(self._stop_polling_fd_w, b\"1\")\n            self._polling_greenlet.join()\n            self._polling_greenlet = None\n        self.remove()\n        self._select = select.select\n"
  },
  {
    "path": "requirements.txt",
    "content": "pytest==7.2.2\nredis==4.6.0\nruff==0.4.3\nxxhash==3.5.0\n"
  },
  {
    "path": "setup.py",
    "content": "from setuptools import setup\n\nsetup(\n    name=\"redis-hashring\",\n    version=\"0.6.0\",\n    author=\"Close Engineering\",\n    author_email=\"engineering@close.com\",\n    url=\"https://github.com/closeio/redis-hashring\",\n    license=\"MIT\",\n    description=(\n        \"Python library for distributed applications using a Redis hash ring\"\n    ),\n    install_requires=[\"redis>=3\"],\n    extras_require={\n        \"xxhash\": [\"xxhash>=3.5.0\"],\n    },\n    platforms=\"any\",\n    classifiers=[\n        \"Intended Audience :: Developers\",\n        \"License :: OSI Approved :: MIT License\",\n        \"Operating System :: OS Independent\",\n        \"Topic :: Software Development :: Libraries :: Python Modules\",\n        \"Programming Language :: Python\",\n        \"Programming Language :: Python :: 3\",\n        \"Programming Language :: Python :: 3 :: Only\",\n        \"Programming Language :: Python :: 3.10\",\n        \"Programming Language :: Python :: 3.11\",\n        \"Programming Language :: Python :: 3.12\",\n        \"Programming Language :: Python :: 3.13\",\n    ],\n    packages=[\"redis_hashring\"],\n)\n"
  },
  {
    "path": "tests.py",
    "content": "import socket\nfrom unittest.mock import patch\n\nimport pytest\nfrom redis import Redis\n\nfrom redis_hashring import HashAlgorithm, RingNode\n\nTEST_KEY = \"hashring-test\"\n\n\n@pytest.fixture\ndef redis():\n    redis = Redis()\n    yield redis\n    redis.delete(TEST_KEY)\n\n\ndef get_node(redis, n_replicas, total_replicas, hash_algorithm):\n    node = RingNode(\n        redis, TEST_KEY, n_replicas=n_replicas, hash_algorithm=hash_algorithm\n    )\n\n    assert len(node._replicas) == n_replicas\n    assert redis.zcard(TEST_KEY) == total_replicas - n_replicas\n\n    node.heartbeat()\n\n    assert redis.zcard(TEST_KEY) == total_replicas\n    assert len(node.get_ranges()) == 0\n\n    return node\n\n\ndef test_node(redis):\n    with patch.object(socket, \"gethostname\", return_value=\"host1\"):\n        node1 = get_node(redis, 1, 1, HashAlgorithm.XXHASH)\n    node1.update()\n    assert len(node1.get_ranges()) == 1\n    assert node1.get_node_count() == 1\n\n    with patch.object(socket, \"gethostname\", return_value=\"host2\"):\n        node2 = get_node(redis, 1, 2, HashAlgorithm.XXHASH)\n    node1.update()\n    node2.update()\n    assert len(node1.get_ranges()) + len(node2.get_ranges()) == 3\n    assert node1.get_node_count() == 2\n    assert node2.get_node_count() == 2\n\n    with patch.object(socket, \"gethostname\", return_value=\"host3\"):\n        node3 = get_node(redis, 2, 4, HashAlgorithm.XXHASH)\n    node1.update()\n    node2.update()\n    node3.update()\n    assert (\n        len(node1.get_ranges())\n        + len(node2.get_ranges())\n        + len(node3.get_ranges())\n        == 5\n    )\n    assert node1.get_node_count() == 3\n    assert node2.get_node_count() == 3\n    assert node3.get_node_count() == 3\n\n    node1.remove()\n    node2.update()\n    node3.update()\n    assert len(node1.get_ranges()) == 0\n    assert node1.get_node_count() == 0\n    assert len(node2.get_ranges()) + len(node3.get_ranges()) == 4\n    assert 
node2.get_node_count() == 2\n    assert node3.get_node_count() == 2\n\n\n@pytest.mark.parametrize(\n    \"hash_algorithm\", [HashAlgorithm.CRC32, HashAlgorithm.XXHASH]\n)\ndef test_contains(redis, hash_algorithm):\n    node1 = get_node(redis, 1, 1, hash_algorithm=hash_algorithm)\n    node1.update()\n    assert node1.contains(\"item\") is True\n\n    node1.remove()\n    assert node1.contains(\"item\") is False\n"
  }
]