[
  {
    "path": "README.md",
    "content": "# How do I Locust.\n\n[Locust](https://locust.io/) generates load and does pretty much nothing else.\nIt comes with a web server with UI controls out of the box. It can also run in\na single-master, multiple-worker configuration in which the workers generate load\nand the master aggregates reports from the workers.\n\nYou write a locust file, which is just a regular Python file:\n\n```python\n# a descriptively named locust file\nsomething.py\n```\n\nPut a task set in your locust file. Each task typically contains a particular\nAPI operation.\n\n```python\nimport locust\n\nclass MyTaskSet(locust.TaskSet):\n\n    @locust.task\n    def do_get(self):\n        self.client.get('/resources/{id}')\n```\n\nThen define other test parameters, including the API endpoint to hit and the\nwait time:\n\n```python\nclass MyLocust(locust.HttpLocust):\n    task_set = MyTaskSet\n\n    # wait times are in milliseconds\n    min_wait = 900\n    max_wait = 1100\n\n    host = 'http://localhost:8000'\n```\n\nThe min/max wait times (in milliseconds) control how long each simulated user\nwaits between executing tasks. Each user will execute a task at random, wait a random\namount of time between `min_wait` and `max_wait`, and then repeat.\n\nAt this point, you can start locust's web server:\n\n```bash\n$ locust -f something.py\n```\n\nThen you can start the test in one of two ways:\n\n1. Go to `localhost:8089` in your browser, enter the user count and hatch rate, and click start.\n2. POST `localhost:8089/swarm` with `{\"locust_count\": 3, \"hatch_rate\": 1}`\n\nThe locust count is the total number of users to spawn. The hatch rate is the\nnumber of users to spawn per second, starting from zero when load generation\nfirst begins. (The \"hatching\" period is the rampup period from when the test\nfirst starts until the max number of users is reached.)\n\nEach user does the following:\n\n1. Pick one of the tasks from your locust file\n2. Run the task (execute that task function)\n3. 
Pick a random wait time between min_wait and max_wait (specified in your\nlocust file)\n4. Wait that amount of time\n5. Repeat from 1\n\n\n### But how do I use workers?\n\nLocust can be run in a single-master, multiple-worker configuration. The workers\ndo all the load generation, while the master controls and monitors. Every 3\nseconds (by default), a worker sends a single report for all requests made *on\nthat worker* in the last 3 seconds. The master receives these reports from all\nof its workers and consolidates them in real time.\n\nThe master controls the starting and stopping of *load generation* on the\nworkers. The master cannot start/stop the locust process running on the workers.\nThis means you need to create servers and start locust's processes yourself.\n\nTo start the master process:\n\n```bash\n$ locust -f something.py --master\n```\n\nYou must start the master process before the workers. Then start the workers:\n\n```bash\n$ locust -f something.py --worker --master-host=<master-ip>\n```\n\nYou should be able to see the clients connect/disconnect in the master's logs.\n\n### Random tips\n\n##### Detecting whether you're the master or worker\nIt seems normal to use the same locust file on the master and worker. (I've never\ntried using a master locust file and a separate, different worker locust file.\nEven if it works, you lose the ability to run your tests in non-distributed\nmode.)\n\nBut sometimes I wanted to do certain things only on the master (like saving a\nreport to disk) and some things only on the worker (like doing some data prep\nbefore starting load generation).\n\nFrom what I can tell, locust doesn't provide a way to detect which mode you're\nrunning in. 
What I did was check the command line arguments:\n\n```python\nimport sys\n\ndef is_master():\n    return '--master' in sys.argv\n\ndef is_worker():\n    return '--worker' in sys.argv\n```\n\n##### Event hooks\nLocust has some nifty event hooks that let you execute a function when an\nevent is fired. Locust has [some built-in\nevents](http://docs.locust.io/en/latest/api.html#available-hooks), or you can\ncreate your own:\n\n```python\nimport locust.events\n\nmy_event = locust.events.EventHook()\n\ndef handler1(a, b): print(\"handled\", a, b)\ndef handler2(a, b): print(\"sum\", a + b)\n\nmy_event += handler1\nmy_event += handler2\n\n# invokes each event handler in the order you inserted them\n# note that you have to use keyword arguments when you call .fire()\nmy_event.fire(a=1, b=2)\n```\n\n##### Pre-test actions\nThere are several different points at which you can do something before the\ntest starts. Some of these are kind of non-obvious, and I ran into situations\nwhere a setup task was being run every time a new user was spawned (during\nrampup) instead of just when load generation started. Anyway, these are the\npoints at which you can do something before your test:\n\n1. *At process startup*: You can do whatever you like in the global scope of\nyour locust file, but this will only occur once for the entire time your\nlocust process is running. I read config files and do other \"framework\" setup,\nlike registering event handlers or preparing integration with another service.\n\n```python\nimport locust\n# get_config and get_auth_token are your own functions, defined elsewhere\n# read the config file once, at the start of the locust process\nCONF = get_config()\n# only register this event handler once\nlocust.events.locust_start_hatching += get_auth_token\n\nclass MyTasks(locust.TaskSet):\n    ...\n```\n\n2. 
*At test start*: You click the start button to start load generation. You can\nhook into this by attaching event handlers to the\n`locust.events.locust_start_hatching` event:\n\n```python\ndef do_thing():\n    # do a thing\n    pass\n\nlocust.events.locust_start_hatching += do_thing\n```\n\nThis will call `do_thing` exactly once when you press the start button. If\nyou're running locust in master/worker mode, then\n`locust.events.locust_start_hatching` fires only on workers, and\n`locust.events.master_start_hatching` fires only on the master.\n\n3. *At user spawn time*: You can do per-user setup in two places. Locust will\ncall the `on_start` method when a Locust user is started:\n\n```python\nimport uuid\n\nclass MyTasks(locust.TaskSet):\n    def on_start(self):\n        # Each locust user gets a different id\n        self.random_id = str(uuid.uuid4())\n```\n\nLocust creates a new instance of your TaskSet class once per user, so you can\nalso do setup in the class constructor (I think this is less preferred):\n\n```python\nimport uuid\n\nclass MyTasks(locust.TaskSet):\n    def __init__(self, *args, **kwargs):\n        super(MyTasks, self).__init__(*args, **kwargs)\n        # Each locust user gets a different id\n        self.random_id = str(uuid.uuid4())\n```\n\n##### Grouping requests in the report\nBy default, locust uses the request's URL path (e.g. `/resources/{uuid}`) as the\nname that shows up in the summary report. So if the URL has an id in it, you'll\nget a separate entry in locust's report for each id.\n\nInstead, you can provide an explicit `name` argument for Locust to use in its\nreport:\n\n```python\n@locust.task\ndef do_get(self):\n    self.client.get('/resources/{id}', name='/resources/UUID')\n```\n\n##### Manually marking requests as success/failure\nNormally, locust automatically records a success or failure by looking at the\nHTTP status code. 
You can pass `catch_response=True` and use a `with` block to override this:\n\n```python\nwith self.client.post('/things', catch_response=True) as post_resp:\n    # parse the body once, then decide how to record the request\n    status = post_resp.json()['status']\n    if status == 'ACTIVE':\n        post_resp.success()\n    elif status == 'ERROR':\n        post_resp.failure(\"Saw ERROR status\")\n    else:\n        post_resp.failure(\"Unknown error\")\n```\n\n##### Achieving precise request rates\nYou can compute the approximate request rate using the average wait time and\nthe total number of users (roughly the number of users divided by the average\nwait in seconds, assuming one request per task). This will be inaccurate though,\nbecause it doesn't take into account the time each user spends executing a task.\nThis is a problem when you have long-running tasks (e.g. if you need to poll).\n\nFor example, consider a long-running task like this:\n\n```python\nimport gevent\nfrom locust import task\n\n@task\ndef do_sleep(self):\n    gevent.sleep(60)\n```\n\nA locust user will enter this task and sleep for 60 seconds. If you have a\nsmall number of users, chances are they'll all eventually get stuck sleeping in\nthis task, unable to continue executing other tasks, so your request rates will\nplummet. (You should use `gevent.sleep()` here rather than `time.sleep()`. See\nthe \"How do I sleep\" section for more on this.)\n\nInstead, you want the task to take as short a time as possible. To do this, you\ncan spawn a new greenlet that sleeps in the background:\n\n```python\nimport gevent\nfrom locust import task\n\ndef _do_async_thing_handler(self):\n    gevent.sleep(60)\n\n@task\ndef do_async_thing(self):\n    gevent.spawn(self._do_async_thing_handler)\n```\n\nThis accomplishes two things:\n\n1. Your task (`do_async_thing`) returns effectively immediately, and the locust\nuser can go on to do other things. This means your actual request rate\nshould be much closer to what you expect.\n2. 
Your task's functionality (`_do_async_thing_handler`) continues running in the\nbackground and will terminate on its own.\n\n##### How do I sleep\n\nWith gevent, you should use `gevent.sleep` instead of `time.sleep` to avoid\n\"blocking the world\", or causing all your task functions to block/sleep. Why?\nBecause gevent operates entirely within a single OS thread. Calling\n`time.sleep` actually sleeps the thread, which means gevent cannot continue\nexecuting your task functions.\n\n*However*, Locust runs gevent's [monkey\npatching](http://www.gevent.org/intro.html#monkey-patching), during which\ngevent replaces certain functions from the Python standard library with\ngevent-friendly versions. During this monkey patching step, `time.sleep` is\nreplaced with `gevent.sleep`. After monkey patching occurs, `time.sleep` _is_\nthe `gevent.sleep` function and you have nothing to worry about.\n\nSo how do I not block the world? Either:\n\n- Call `gevent.sleep` explicitly within your Locust task functions and other\nLocust-related code, OR\n- Ensure gevent's monkey patching has run, so that `time.sleep` can be used safely.\nLocust runs monkey patching for you, but you must have performed an\n`import locust` _prior_ to using `time.sleep`. (You can also call\n`gevent.monkey.patch_all()` explicitly yourself, if you need to.)\n\nMonkey patching is a must when you are using an external library that\ncan't be modified to use `gevent.sleep`.\n\n##### Asynchronous polling\n\nYou can use a similar pattern as above to poll for a status. 
One thing to\nwatch out for is having too many greenlets running in the background on a single\nmachine (you can lower the polling frequency or run in distributed mode to\nhelp manage this).\n\nFirst, I added functions to report the result of asynchronous operations to Locust.\nThese use the native [request_success](\nhttps://docs.locust.io/en/stable/api.html#locust.event.Events.request_success)\nand [request_failure](\nhttps://docs.locust.io/en/stable/api.html#locust.event.Events.request_failure)\nevents.\n\n```python\nimport time\n\nimport locust\n\n# response_time is reported in milliseconds\ndef async_success(name, start_time, resp):\n    locust.events.request_success.fire(\n        request_type=resp.request.method,\n        name=name,\n        response_time=int((time.monotonic() - start_time) * 1000),\n        response_length=len(resp.content),\n    )\n\ndef async_failure(name, start_time, resp, message):\n    locust.events.request_failure.fire(\n        request_type=resp.request.method,\n        name=name,\n        response_time=int((time.monotonic() - start_time) * 1000),\n        exception=Exception(message),\n    )\n```\n\nThen, I implemented polling logic in an async \"handler\" function and called\nthe `async_success`/`async_failure` functions above when polling is complete.\nThese calculate the total elapsed time for the async operation, and report\nthe result to Locust. 
I also added the usual `@task` function to spawn the async\nhandler function in the background.\n\n```python\nimport time\n\nimport gevent\nfrom locust import task\n\ndef _do_async_thing_handler(self, timeout=600):\n    post_resp = self.client.post('/things')\n    if not post_resp.ok:\n        return\n    # named thing_id to avoid shadowing the built-in id()\n    thing_id = post_resp.json()['id']\n\n    # Now poll for an ACTIVE status\n    start_time = time.monotonic()\n    end_time = start_time + timeout\n    while time.monotonic() < end_time:\n        r = self.client.get('/things/%s' % thing_id)\n        if r.ok and r.json()['status'] == 'ACTIVE':\n            async_success('POST /things/ID - async', start_time, post_resp)\n            return\n        elif r.ok and r.json()['status'] == 'ERROR':\n            async_failure('POST /things/ID - async', start_time, post_resp,\n                          'Failed - saw ERROR status')\n            return\n\n        # IMPORTANT: Sleep must be monkey-patched by gevent (typical), or else\n        # use gevent.sleep to avoid blocking the world.\n        time.sleep(1)\n    async_failure('POST /things/ID - async', start_time, post_resp,\n                  'Failed - timed out after %s seconds' % timeout)\n\n@task\ndef do_async_thing(self):\n    gevent.spawn(self._do_async_thing_handler)\n```\n\nThe key things here are:\n\n1. Spawning a new greenlet in your `@task` function so that other Locust tasks\nare not blocked by the long-running polling loop. This ensures that the `do_async_thing`\ntask is executed at a predictable (request) rate that will not be affected by\nthe time spent in the polling loop.\n2. Calling the `async_success`/`async_failure` functions to compute the time taken\nfor the async operation to finish. These use the `request_success`/`request_failure`\nevent hooks to report the result to Locust, so it is included in the request metrics.\n\n(Another issue here: When I stop load generation, any greenlets running in the background\nwill continue running - until the greenlets finish running, or until the process is\nterminated. 
I tracked my greenlets in a list and used an event hook to kill them all when\nthe test is stopped.)\n"
  }
]