[
  {
    "path": ".gitignore",
    "content": "*.a\n*.o\n*.so\ngmon.out\nmpsc_test\n"
  },
  {
    "path": "LICENSE",
    "content": "Use this code however you may see fit, as long as you maintain the\ncomments at the top the source code.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND,\nEXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF\nMERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.\nIN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR\nOTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,\nARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR\nOTHER DEALINGS IN THE SOFTWARE.\n"
  },
  {
    "path": "Makefile",
    "content": "# 2015 Daniel Bittman <danielbittman1@gmail.com>: http://dbittman.github.io/\n\nCFLAGS=-Wall -Wextra -Werror -std=gnu11 -O3\nLDFLAGS=\nLDLIBS=-lpthread\nCC=gcc\n\nifeq ($(strip $(DEBUG)),tsan)\n\tCC=clang\n\tCFLAGS+=-fsanitize=thread\n\tLDFLAGS+=-fsanitize=thread\nelse ifeq ($(strip $(DEBUG)),asan)\n\tCC=clang\n\tCFLAGS+=-fsanitize=address\n\tLDFLAGS+=-fsanitize=address\nendif\n\nall: libmpscq.so libmpscq.a mpsc_test\n\nmpsc_test: mpsc.o mpsc_test.o\n\nmpsc_test.o: mpsc_test.c\n\nmpsc.o: mpsc.c mpscq.h\n\nlibmpscq.a: mpsc.o\n\tar -cvq libmpscq.a mpsc.o\n\nlibmpscq.so: mpsc.c mpscq.h Makefile\n\t$(CC) $(CFLAGS) -shared -o libmpscq.so -fPIC mpsc.c\n\nclean:\n\t-rm libmpscq.* *.o mpsc_test\n\nprof: mpsc_test\n\tfor i in $$(seq 1 100); do sudo nice -n -20 ./mpsc_test; mv -f gmon.out gmon.out.$$i; done\n\tgprof -s ./mpsc_test gmon.out.*;  gprof ./mpsc_test gmon.sum -bQ\n\n"
  },
  {
    "path": "README.md",
    "content": "MPSCQ - Multiple Producer, Single Consumer Wait-Free Queue\n==========================================================\nC11 library that allows multiple threads to enqueue something to a queue, and allows one thread (and only one thread) to dequeue from it.\n\nThis code is tested, but not proven. Use it at your own peril.\n\nInterface\n---------\nCreation and destruction of a queue can be done with:\n\n    struct mpscq *mpscq_create(struct mpscq *n, size_t capacity);\n    void mpscq_destroy(struct mpscq *q);\n\nPassing a NULL pointer as _n_ will allocate a new queue with malloc, initialize it, and return it. Passing a pointer to a struct mpscq as _n_ will initialize that object. Calling the destroy function will free the internal data of the object, and if the object was allocated via malloc, it will be freed as well.\n\nEnqueuing can be done with:\n\n    bool mpscq_enqueue(struct mpscq *q, void *obj);\n\nwhich will enqueue _obj_ in _q_, returning true if it was enqueued and false if it wasn't (queue was full).\n\nDequeuing can be done with:\n\n    void *mpscq_dequeue(struct mpscq *q);\n\nwhich will return NULL if the queue was empty or an object from the queue if it wasn't. Note that\na queue may appear to be empty if a thread is in the process if writing the object in the next slot in the buffer, but that's okay because the function can be called again (see the comments in the source for more interesting comments on this).\n\nThe queue may also be queried for current number of items and for total capacity:\n\n    size_t mpscq_capacity(struct mpscq *q);\n    size_t mpscq_count(struct mpscq *q);\n\nComments\n--------\nPLEASE report bugs to me if you find any (email me at danielbittman1@gmail.com).\n\nTechnical Details\n-----------------\nDuring the first half of the enqueuing function, we prevent writing to the queue if the queue is full. This is done by doing an add anyway, and then seeing if the old value was greater than or equal to max. 
If so, then we cannot write to the queue because it's full. This is safe for multiple threads, since the worst thing that can happen is a thread sees the count to be way above the max. This is okay, since it'll just report the queue as being full.\n\nThe second half of the enqueuing function gains near-exclusive access to the head element. It isn't completely exclusive, since the consumer thread may be observing that element. However, we prevent any producer threads from trying to write to the same area of the queue. Once head is fetched and incremented, we store the object to the head location, thus releasing that memory location.\n\nIn the dequeue function, we exchange the tail with NULL, and observe the return value. If the return value is NULL, then there's nothing in the queue and so we return NULL. If we got back an object, we just increment the tail and decrement the count, before returning.\n\nPerformance (preliminary)\n-------------------------\nHere's a quick comparison to a locked circular queue I wrote quickly, fueled by beer. With 64 threads, each writing 200 objects to the queue with the speed of 64 fairly slow threads (and, of course, a singular thread reading from it with the speed of a one fairly slow thread... ) the lock-free queue wins pretty convincingly:\n\n![I WILL WRITE 500 OBJECTS, AND I WILL WRITE 500 MORE](https://raw.githubusercontent.com/dbittman/waitfree-mpsc-queue/master/data/64-200.png)\n\n(hard to see: the left-most data points are at x=50, not 0)\n\nWell, that's pretty nice. If your queue is small, then MPSCQ does wonders compared to locking, which is what I would expect.\n\n"
  },
  {
    "path": "data/lmeans-64-200",
    "content": "50 860.97\n450 505.85\n850 386.29\n1250 358.71\n1650 251.43\n2050 256.77\n2450 229.72\n2850 235.15\n3250 168.17\n3650 152.26\n4050 145.67\n4450 150.03\n4850 167.51\n5250 144.98\n5650 157.39\n6050 121.42\n6450 131.01\n6850 105.88\n7250 119.45\n7650 108.03\n8050 143.55\n8450 111.33\n8850 107.54\n9250 92.02\n9650 112.6\n10050 102.3\n10450 94.12\n10850 102.36\n11250 102.26\n11650 94.43\n12050 87.08\n12450 89.49\n12850 85.4\n13250 84.51\n13650 85.61\n14050 82.17\n14450 83.93\n14850 89.81\n15250 79.43\n15650 82\n16050 81.19\n16450 79.62\n16850 79.62\n17250 80.31\n17650 79.04\n18050 81.74\n18450 78.78\n18850 80.6\n19250 79.81\n19650 79.13\n20050 79.27\n20450 79.71\n20850 81.15\n21250 79.12\n21650 79.26\n22050 79.12\n22450 79.35\n22850 79.14\n23250 79.39\n23650 80.1\n24050 78.49\n24450 79.86\n24850 79.19\n25250 78.23\n25650 80.26\n26050 79.83\n26450 79.05\n26850 79.62\n27250 79.17\n27650 80.14\n28050 80.54\n28450 78.57\n28850 80.49\n29250 78.96\n29650 79.78\n30050 79.49\n30450 79.15\n30850 78.71\n31250 79.66\n31650 79.31\n32050 79.51\n32450 77.89\n32850 79.2\n33250 79.11\n33650 78.81\n34050 78.41\n34450 78.82\n34850 77.99\n35250 79.3\n35650 80.08\n36050 78.52\n36450 79.96\n36850 80.1\n37250 79.45\n37650 79.62\n38050 79.06\n38450 78.94\n38850 80.75\n39250 80.14\n39650 79.65\n40050 79.36\n40450 78.67\n40850 79.95\n41250 79.58\n41650 78.29\n42050 78.63\n42450 78.81\n42850 79.54\n43250 79.27\n43650 78.73\n44050 79.3\n44450 78.72\n44850 80.26\n45250 80.61\n45650 79.96\n46050 78.74\n46450 80.24\n46850 79.55\n47250 79.39\n47650 78.94\n48050 79.02\n48450 79.64\n48850 78.58\n49250 80.15\n49650 79.59\n"
  },
  {
    "path": "data/means-64-200",
    "content": "50 95.14\n450 68.97\n850 62.81\n1250 67.54\n1650 81.86\n2050 67.5\n2450 74.89\n2850 59.35\n3250 62.08\n3650 58.8\n4050 54.12\n4450 62.22\n4850 78.34\n5250 59.76\n5650 53.9\n6050 53.22\n6450 53.98\n6850 54.87\n7250 46.89\n7650 47.05\n8050 46.98\n8450 48.91\n8850 52.5\n9250 45.67\n9650 45.52\n10050 45.73\n10450 45.29\n10850 45.66\n11250 46.21\n11650 45.25\n12050 45.85\n12450 45.72\n12850 45.83\n13250 46.06\n13650 45.34\n14050 46.05\n14450 46\n14850 45.71\n15250 45.78\n15650 46.61\n16050 44.48\n16450 45.1\n16850 45.62\n17250 45.77\n17650 45.33\n18050 45.65\n18450 45.99\n18850 45.85\n19250 46.57\n19650 45.72\n20050 45.74\n20450 45.73\n20850 46.36\n21250 45.98\n21650 45.88\n22050 46.53\n22450 46.26\n22850 46.05\n23250 46.39\n23650 45.74\n24050 45.47\n24450 45.93\n24850 45.95\n25250 45.75\n25650 45.56\n26050 45.69\n26450 45.73\n26850 46.67\n27250 46.95\n27650 46.1\n28050 46.06\n28450 45.88\n28850 45.32\n29250 45.87\n29650 46.06\n30050 44.71\n30450 45.49\n30850 45.23\n31250 47.05\n31650 46.2\n32050 45.5\n32450 45.57\n32850 46.3\n33250 44.79\n33650 45.04\n34050 46.67\n34450 45.81\n34850 45.35\n35250 46.06\n35650 46.46\n36050 45.99\n36450 45.61\n36850 45.95\n37250 45.84\n37650 46.12\n38050 46.11\n38450 46.08\n38850 45.88\n39250 44.78\n39650 45.99\n40050 46.44\n40450 45.98\n40850 46.1\n41250 46.14\n41650 46.14\n42050 46.16\n42450 45.82\n42850 45.23\n43250 45.85\n43650 45.7\n44050 46.55\n44450 46.03\n44850 46.39\n45250 46.05\n45650 46.29\n46050 46.49\n46450 46.17\n46850 46.17\n47250 46.84\n47650 45.65\n48050 45.87\n48450 46.26\n48850 45.84\n49250 45.32\n49650 45.26\n"
  },
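  {
    "path": "example.c",
    "content": "/* Hypothetical usage sketch for libmpscq; NOT part of the original repository.\n * It spins up a few producers and a single consumer draining the queue, using\n * only the functions declared in mpscq.h. The thread count, item count, and\n * capacity below are arbitrary. Assumed build: gcc -std=gnu11 example.c mpsc.c -lpthread */\n#include <stdio.h>\n#include <pthread.h>\n#include \"mpscq.h\"\n\n#define PRODUCERS 4\n#define PER_THREAD 1000\n\nstatic struct mpscq *q;\n\nstatic void *producer(void *arg)\n{\n\tlong base = (long)arg * PER_THREAD;\n\tfor(long i = 0; i < PER_THREAD; i++) {\n\t\t/* mpscq_enqueue fails only when the queue is full, so retry until accepted */\n\t\twhile(!mpscq_enqueue(q, (void *)(base + i + 1)))\n\t\t\t;\n\t}\n\treturn NULL;\n}\n\nint main(void)\n{\n\tpthread_t threads[PRODUCERS];\n\tq = mpscq_create(NULL, 128);\n\tfor(long i = 0; i < PRODUCERS; i++)\n\t\tpthread_create(&threads[i], NULL, producer, (void *)i);\n\tlong seen = 0;\n\twhile(seen < PRODUCERS * PER_THREAD) {\n\t\t/* NULL just means \"nothing yet\": empty, or a producer is mid-write; retry */\n\t\tif(mpscq_dequeue(q))\n\t\t\tseen++;\n\t}\n\tfor(int i = 0; i < PRODUCERS; i++)\n\t\tpthread_join(threads[i], NULL);\n\tprintf(\"consumed %ld items\\n\", seen);\n\tmpscq_destroy(q);\n\treturn 0;\n}\n"
  },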
  {
    "path": "mpsc.c",
    "content": "/* 2015 Daniel Bittman <danielbittman1@gmail.com>: http://dbittman.github.io/ */\n\n#include <stdatomic.h>\n#include <stdbool.h>\n#include <stdlib.h>\n#include <assert.h>\n\n#include \"mpscq.h\"\n\n/* multi-producer, single consumer queue *\n * Requirements: max must be >= 2 */\nstruct mpscq *mpscq_create(struct mpscq *n, size_t capacity)\n{\n\tif(!n) {\n\t\tn = calloc(1, sizeof(*n));\n\t\tn->flags |= MPSCQ_MALLOC;\n\t} else {\n\t\tn->flags = 0;\n\t}\n\tn->count = ATOMIC_VAR_INIT(0);\n\tn->head = ATOMIC_VAR_INIT(0);\n\tn->tail = 0;\n\tn->buffer = calloc(capacity, sizeof(void *));\n\tn->max = capacity;\n\tatomic_thread_fence(memory_order_release);\n\treturn n;\n}\n\nvoid mpscq_destroy(struct mpscq *q)\n{\n\tfree(q->buffer);\n\tif(q->flags & MPSCQ_MALLOC)\n\t\tfree(q);\n}\n\nbool mpscq_enqueue(struct mpscq *q, void *obj)\n{\n\tsize_t count = atomic_fetch_add_explicit(&q->count, 1, memory_order_acquire);\n\tif(count >= q->max) {\n\t\t/* back off, queue is full */\n\t\tatomic_fetch_sub_explicit(&q->count, 1, memory_order_release);\n\t\treturn false;\n\t}\n\n\t/* increment the head, which gives us 'exclusive' access to that element */\n\tsize_t head = atomic_fetch_add_explicit(&q->head, 1, memory_order_acquire);\n\tassert(q->buffer[head % q->max] == 0);\n\tvoid *rv = atomic_exchange_explicit(&q->buffer[head % q->max], obj, memory_order_release);\n\tassert(rv == NULL);\n\treturn true;\n}\n\nvoid *mpscq_dequeue(struct mpscq *q)\n{\n\tvoid *ret = atomic_exchange_explicit(&q->buffer[q->tail], NULL, memory_order_acquire);\n\tif(!ret) {\n\t\t/* a thread is adding to the queue, but hasn't done the atomic_exchange yet\n\t\t * to actually put the item in. Act as if nothing is in the queue.\n\t\t * Worst case, other producers write content to tail + 1..n and finish, but\n\t\t * the producer that writes to tail doesn't do it in time, and we get here.\n\t\t * But that's okay, because once it DOES finish, we can get at all the data\n\t\t * that has been filled in. 
*/\n\t\treturn NULL;\n\t}\n\tif(++q->tail >= q->max)\n\t\tq->tail = 0;\n\tsize_t r = atomic_fetch_sub_explicit(&q->count, 1, memory_order_release);\n\tassert(r > 0);\n\treturn ret;\n}\n\nsize_t mpscq_count(struct mpscq *q)\n{\n\treturn atomic_load_explicit(&q->count, memory_order_relaxed);\n}\n\nsize_t mpscq_capacity(struct mpscq *q)\n{\n\treturn q->max;\n}\n\n"
  },
  {
    "path": "mpsc_test.c",
    "content": "/* 2015 Daniel Bittman <danielbittman1@gmail.com>: http://dbittman.github.io/ */\n#include <stdio.h>\n#include <stdatomic.h>\n#include <stdbool.h>\n#include <pthread.h>\n#include <stdlib.h>\n#include <assert.h>\n#include \"mpscq.h\"\n\nstruct mpscq *queue;\n_Atomic int amount_produced = ATOMIC_VAR_INIT(0);\n_Atomic int amount_consumed = ATOMIC_VAR_INIT(0);\n_Atomic bool done = ATOMIC_VAR_INIT(false);\n_Atomic int retries = ATOMIC_VAR_INIT(0);\n_Atomic long long total = ATOMIC_VAR_INIT(0);\n#define NUM_ITEMS 10000\n#define NUM_THREADS 32\n\nstruct item {\n\t_Atomic int sent, recv;\n};\n\nstruct item items[NUM_THREADS][NUM_ITEMS];\n\nvoid *producer_main(void *x)\n{\n\tlong tid = (long)x;\n\tstruct timespec start, end;\n\tfor(int i=0;i<NUM_ITEMS;i++) {\n\t\tassert(atomic_fetch_add(&items[tid][i].sent, 1) == 0);\n\t\tclock_gettime(CLOCK_PROCESS_CPUTIME_ID, &start);\n\t\tbool r = mpscq_enqueue(queue, &items[tid][i]);\n\t\tclock_gettime(CLOCK_PROCESS_CPUTIME_ID, &end);\n\t\tif(r) {\n\t\t\tatomic_fetch_add(&amount_produced, 1);\n\t\t\ttotal += (end.tv_nsec - start.tv_nsec) / 1000;\n\t\t} else {\n\t\t\titems[tid][i].sent = 0;\n\t\t\ti--;\n\t\t\tretries++;\n\t\t}\n\t}\n\tfor(int i=0;i<NUM_ITEMS;i++) {\n\t\tassert(items[tid][i].sent != 0);\n\t}\n\tatomic_thread_fence(memory_order_seq_cst);\n\tpthread_exit(0);\n}\n\nvoid *consumer_main(void *x)\n{\n\t(void)x;\n\tbool doublechecked = false;\n\twhile(true) {\n\t\tvoid *ret = mpscq_dequeue(queue);\n\t\tif(ret) {\n\t\t\tatomic_fetch_add(&amount_consumed, 1);\n\t\t\tstruct item *it = ret;\n\t\t\tassert(atomic_fetch_add(&it->sent, 1) == 1);\n\t\t\tassert(atomic_fetch_add(&it->recv, 1) == 0);\n\t\t\tdoublechecked = false;\n\t\t} else if(done && doublechecked) {\n\t\t\tbreak;\n\t\t} else if(done) {\n\t\t\tdoublechecked = true;\n\t\t}\n\t}\n\tassert(!mpscq_dequeue(queue));\n\tatomic_thread_fence(memory_order_seq_cst);\n\tassert(queue->count == 0);\n\tassert(queue->head % queue->max == 
queue->tail);\n\tpthread_exit(0);\n}\n\n#include <time.h>\nint main(int argc, char **argv)\n{\n\t(void)argc;\n\t(void)argv;\n\tint num_producers = NUM_THREADS-1;\n\tpthread_t producers[num_producers];\n\tpthread_t consumer;\n\n\tstruct timespec start, end;\n\n\tfor(int i=0;i<NUM_THREADS;i++) {\n\t\tfor(int j=0;j<NUM_ITEMS;j++) {\n\t\t\titems[i][j].sent = 0;\n\t\t\titems[i][j].recv = 0;\n\t\t}\n\t}\n\n\tint cap = atoi(argv[1]);\n\tqueue = mpscq_create(NULL, cap);\n\n\tpthread_create(&consumer, NULL, consumer_main, NULL);\n\tclock_gettime(CLOCK_PROCESS_CPUTIME_ID, &start);\n\tfor(long i=0;i<num_producers;i++) {\n\t\tpthread_create(&producers[i], NULL, producer_main, (void *)i);\n\t}\n\n\tfor(int i=0;i<num_producers;i++) {\n\t\tpthread_join(producers[i], NULL);\n\t}\n\tclock_gettime(CLOCK_PROCESS_CPUTIME_ID, &end);\n\tdone = true;\n\tpthread_join(consumer, NULL);\n\n\tatomic_thread_fence(memory_order_seq_cst);\n\t\n\tfor(int i=0;i<num_producers;i++) {\n\t\tfor(int j=0;j<NUM_ITEMS;j++) {\n\t\t\tif(items[i][j].sent != 2) {\n\t\t\t\tprintf(\":(%d %d): %d %d, %d %d\\n\", i, j, items[i][j].sent,\n\t\t\t\t\t\titems[i][j].recv, amount_produced, amount_consumed);\n\t\t\t}\n\t\t\tassert(items[i][j].sent == 2);\n\t\t\tassert(items[i][j].recv == 1);\n\t\t}\n\t}\n\t\n\tlong ms = (end.tv_sec - start.tv_sec) * 1000;\n\tms += (end.tv_nsec - start.tv_nsec) / 1000000;\n\n\tfprintf(stdout, \"\\t%d\\t%ld\\t%ld\\n\",\n\t\t\tretries, ms, (long)(total / amount_produced));\n\tassert(amount_produced == amount_consumed);\n\texit(amount_produced != amount_consumed);\n}\n\n"
  },
  {
    "path": "mpscq.h",
    "content": "/* 2015 Daniel Bittman <danielbittman1@gmail.com>: http://dbittman.github.io/ */\n#ifndef __MPSCQ_H\n#define __MPSCQ_H\n\n#include <stdint.h>\n#ifndef __cplusplus\n#include <stdatomic.h>\n#endif\n#include <stdbool.h>\n#include <sys/types.h>\n\n#define MPSCQ_MALLOC 1\n\n#ifndef __cplusplus\nstruct mpscq {\n\t_Atomic size_t count;\n\t_Atomic size_t head;\n\tsize_t tail;\n\tsize_t max;\n\tvoid * _Atomic *buffer;\n\tint flags;\n};\n#else\nstruct mpscq;\n#endif\n\n#ifdef __cplusplus\nextern \"C\" {\n#endif\n/* create a new mpscq. If n == NULL, it will allocate\n * a new one and return it. If n != NULL, it will\n * initialize the structure that was passed in. \n * capacity must be greater than 1, and it is recommended\n * to be much, much larger than that. It must also be a power of 2. */\nstruct mpscq *mpscq_create(struct mpscq *n, size_t capacity);\n\n/* enqueue an item into the queue. Returns true on success\n * and false on failure (queue full). This is safe to call\n * from multiple threads */\nbool mpscq_enqueue(struct mpscq *q, void *obj);\n\n/* dequeue an item from the queue and return it.\n * THIS IS NOT SAFE TO CALL FROM MULTIPLE THREADS.\n * Returns NULL on failure, and the item it dequeued\n * on success */\nvoid *mpscq_dequeue(struct mpscq *q);\n\n/* get the number of items in the queue currently */\nsize_t mpscq_count(struct mpscq *q);\n\n/* get the capacity of the queue */\nsize_t mpscq_capacity(struct mpscq *q);\n\n/* destroy a mpscq. Frees the internal buffer, and\n * frees q if it was created by passing NULL to mpscq_create */\nvoid mpscq_destroy(struct mpscq *q);\n\n#ifdef __cplusplus\n}\n#endif\n\n#endif\n\n"
  }
]