queue: Only ACK drain_queue once it has completed work on the list.

Currently, drain_queue and json_drain_queue ACK every message as it is
pulled off of the queue, until the queue is empty.  This means that if
the consumer crashes between pulling a batch of messages off the queue
and actually processing them, those messages will be permanently
lost.  Sending an ACK on every message also results in a significant
amount of traffic to rabbitmq, with notable performance
implications.

Send a single ACK after the processing has completed, by making
`drain_queue` into a contextmanager.  Additionally, use the `multiple`
flag to ACK all of the messages at once -- or explicitly NACK the
messages if processing failed.  Sending a NACK will re-queue them at
the front of the queue.
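
The ACK-on-success, NACK-on-failure pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: `FakeChannel` is a hypothetical in-memory stand-in for an AMQP channel, and its `basic_get`, `basic_ack` and `basic_nack` methods only loosely mirror the shape of a real client's calls.

```python
from contextlib import contextmanager
from typing import Iterator, List, Optional, Tuple


class FakeChannel:
    """Hypothetical in-memory stand-in for an AMQP channel (assumption)."""

    def __init__(self, bodies: List[bytes]) -> None:
        self.ready: List[bytes] = list(bodies)
        self.unacked: List[Tuple[int, bytes]] = []
        self.next_tag = 1
        self.calls: List[Tuple[str, int, bool]] = []

    def basic_get(self) -> Optional[Tuple[int, bytes]]:
        # Hand out the next message without acknowledging it.
        if not self.ready:
            return None
        body = self.ready.pop(0)
        tag = self.next_tag
        self.next_tag += 1
        self.unacked.append((tag, body))
        return tag, body

    def _settle(self, delivery_tag: int, multiple: bool) -> List[Tuple[int, bytes]]:
        # With multiple=True, every tag up to and including delivery_tag is covered.
        settled = [
            m for m in self.unacked
            if m[0] == delivery_tag or (multiple and m[0] <= delivery_tag)
        ]
        self.unacked = [m for m in self.unacked if m not in settled]
        return settled

    def basic_ack(self, delivery_tag: int, multiple: bool = False) -> None:
        self.calls.append(("ack", delivery_tag, multiple))
        self._settle(delivery_tag, multiple)

    def basic_nack(self, delivery_tag: int, multiple: bool = False,
                   requeue: bool = True) -> None:
        self.calls.append(("nack", delivery_tag, multiple))
        settled = self._settle(delivery_tag, multiple)
        if requeue:
            # Put the rejected messages back at the front, in original order.
            self.ready = [body for _, body in settled] + self.ready


@contextmanager
def drain_queue(channel: FakeChannel) -> Iterator[List[bytes]]:
    """Yield everything currently queued; settle the batch on exit."""
    messages: List[bytes] = []
    max_tag: Optional[int] = None
    while True:
        got = channel.basic_get()
        if got is None:
            break
        max_tag, body = got
        messages.append(body)
    try:
        yield messages
    except Exception:
        # Processing failed: reject the whole batch so it is re-queued.
        if max_tag is not None:
            channel.basic_nack(max_tag, multiple=True, requeue=True)
        raise
    else:
        # Processing succeeded: one ACK covers every message in the batch.
        if max_tag is not None:
            channel.basic_ack(max_tag, multiple=True)
```

A caller would wrap its processing in `with drain_queue(channel) as events:`; nothing is acknowledged until the `with` body exits, so a crash in the middle of processing leaves the messages unacknowledged rather than lost.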

Performance of a no-op dequeue before this change:
```
$ ./manage.py queue_rate --count 50000 --batch
Purging queue...
Enqueue rate: 10847 / sec
Dequeue rate: 2479 / sec
```
Performance of a no-op dequeue after this change (about a 24% increase in dequeue rate):
```
$ ./manage.py queue_rate --count 50000 --batch
Purging queue...
Enqueue rate: 10752 / sec
Dequeue rate: 3079 / sec
```
Alex Vandiver
2020-09-29 19:03:57 -07:00
committed by Tim Abbott
parent df86a564dc
commit baf882a133
4 changed files with 62 additions and 23 deletions

@@ -324,8 +324,8 @@ class LoopQueueProcessingWorker(QueueProcessingWorker):
         self.initialize_statistics()
         self.is_consuming = True
         while self.is_consuming:
-            events = self.q.json_drain_queue(self.queue_name)
-            self.do_consume(self.consume_batch, events)
+            with self.q.json_drain_queue(self.queue_name) as events:
+                self.do_consume(self.consume_batch, events)
             # To avoid spinning the CPU, we go to sleep if there's
             # nothing in the queue, or for certain queues with
             # sleep_only_if_empty=False, unconditionally.