This name seems to have been copied from
zerver.worker.email_senders.ImmediateEmailSenderWorker, but deferred
is the opposite of immediate.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
Using postfix to handle the incoming email gateway complicates things
a great deal:
- It cannot verify that incoming email addresses exist in Zulip before
accepting them; it thus accepts mail at the `RCPT TO` stage which it
cannot handle, and thus must reject after the `DATA`.
- It is built to handle both incoming and outgoing email, which
results in subtle errors (1c17583ad5, 79931051bd, a53092687e,
#18600).
- Rate-limiting happens much too late to avoid denial of
service (#12501).
- Mis-configurations of the HTTP endpoint can break incoming
mail (#18105).
Provide a replacement SMTP server which accepts incoming email on port
25, verifies that Zulip can accept the address, and that no
rate-limits are being broken, and then adds it directly to the
relevant queue.
Removes an incorrect comment which implied that missed-message
addresses were only usable once. We leave rate-limiting to only
channel email addresses, since missed-message addresses are unlikely
to be placed into automated systems, as channel email addresses are.
Also simplifies #7814 somewhat.
Sending emails synchronously is useful because it reports
configuration errors -- but it also means that occasional failures can
result in ugly 500's, since those don't retry.
Add a setting which forces all email to go through the `emil_senders`
queue, so it can be retried as needed.
`deliver_scheduled_emails` tries to deliver the email synchronously,
and if it fails, it retries after 10 seconds. Since it does not track
retries, and always tries the earliest-scheduled-but-due message
first, the worker will not make forward progress if there is a
persistent failure with that message, and will retry indefinitely.
This can result in excessive network or email delivery charges from
the remote SMTP server.
Switch to delivering emails via a new queue worker. The
`deliver_scheduled_emails` job now serves only to pull deferred jobs
out of the table once they are due, insert them into RabbitMQ, and
then delete them. This limits the potential for head-of-queue
failures to failures inserting into RabbitMQ, which is more reasonable
than failures speaking to a complex external system we do not control.
Retries and any connections to the SMTP server are left to the
RabbitMQ consumer.
We build a new RabbitMQ queue, rather than use the existing
`email_senders` queue, because that queue is expected to be reasonably
low-latency, for things like missed message notifications. The
`send_future_email` codepath which inserts into ScheduledEmails is
also (ab)used to digest emails, which are extremely bursty in their
frequency -- and a large burst could significantly delay emails behind
it in the queue.
The new queue is explicitly only for messages which were not initiated
by user actions (e.g., invitation reminders, digests, new account
follow-ups) which are thus not latency-sensitive.
Fixes: #32463.
This means that only ImageAttachment row needs to be fetched, and
removes the need to pass around an extra parameter. This
denormalization is safe, since in general Attachment rows are
read-only, so we are not concerned with drift between the Attachment
and ImageAttachment tables.
We cannot make content_type non-null, since while the both the
`content_type` column in Attachment and populating that from requests
predates the ImageAttachment table, we have both backfilled
ImageAttachment rows to consider, and imports may also leave files
with no `content_type`. Any backfill of currently-null `content_type`
values will thus need to update both tables.
This change fixes a race condition when importing. ImageAttachment
rows are imported before rendering Messages, which are both before
importing Attachment rows; if the thumbnailing finished after the
Message was imported but before Attachment rows were imported, then
the re-rendering step would not know the image's content-type.
This commit updates embedded bots to mark messages they have process as
read. Since the service bots have their own `UserMessage` rows, this
change enables us to track whether the bot has in fact processed the
message by adding the `read` flag to their `UserMessage`.
Fixes#28869.
This commit updates outgoing bots to mark messages they process as read.
Since the service bots have their own `UserMessage` rows, this change
enables us to track whether the bot has in fact processed the message by
adding the `read` flag to their `UserMessage`.
71406ac767 switched the IMAGE_BOMB_TOTAL_PIXELS cutoff for what
images we preview to include the number of frames in the calculation.
While accurate to the implementation (thumbnailing a 1k-frame animation is
prohibitive, even a small resolutions), this was a behaviour change
from without thumbnailing -- animated gifs did not display inline at
all anymore.
Switch to thumbnailing as many frames as we can fit into a pixel-based
animated thumbnailing threshold, with a minimum of three (to be able
to convey that the image is actually animated). Smaller-resolution
images will hence get more frames in their preview. This also allows
the standard animate-on-hover or always-animate behaviour to be true
to their configurations, without confusing edge cases.
Fixes: #32609.
If the content-type of the image is not in INLINE_MIME_TYPES, then we
do not expect browsers to be able to display it. This behaviour is
particularly confusing because the thumbnail will render properly,
since that will be in the more widely-supported WebP format, but the
lightbox will show a broken image.
In these cases, generate a high-resolution (4032x3024) "thumbnail"
which clients can choose to use instead. This thumbnail format is not
in the listed in the server's advertised thumbnail size list, because
it is not reliably generated for every image.
The transcoded thumbnail format is set on the `img` tag if it is
generated, and the original content-type is always passed to the
client, so it can decide how or if to render the original image. This
content-type is as the _original uploader_ specified it, so may be
incorrect.
The transcoded image is not animated, even if the original was. HEIC
files can nominally be animated, but in testing libvips was not able
to correctly recognize them as such. TIFF files are parsed as being
"animated," with one page per frame; this is of dubious utility, so
we merely transcode the first page. Always generating a static
transcoded image serves to also limit the computational time spent.
THUMBNAIL_OUTPUT_FORMATS is switched to be a tuple to ensure that it
is not accidentally mutated.
Introduce a feature to schedule realm data deletion time during realm
deactivation. This includes a server-level setting to configure the
minimum and maximum allowed deletion days.
Co-authored-by: Ujjawal Modi <umodi2003@gmail.com>
Co-authored-by: Lauryn Menard <lauryn@zulip.com>
Fixes#24677.
This commit renames the 'queue_json_publish' function to
'queue_json_publish_rollback_unsafe' to reflect the fact that it doesn't
wait for the db transaction (within which it gets called, if any)
to commit and sends event irrespective of commit or rollback.
In most of the cases we don't want to send event in the case of
rollbacks, so the caller should be aware that calling the function
directly is rollback unsafe.
Fixes part of #30489.
This timeout needs to be short enough that we don't drop the RabbitMQ
connection. Also drop the offending message (by returning with no
further exception) so we don't hit a head-of-queue failure situation.
Ideally, the parser would just be lightning-fast, so this would never
happen.
It helps to avoid creating unintended savepoints in the future.
This is as a part of our plan to explicitly mark all the
transaction.atomic calls with either 'savepoint=False' or
'durable=True' as required.
This commit adds `durable=True` to the outermost db transactions
created in the following:
* confirm_email_change
* handle_upload_pre_finish_hook
* deliver_scheduled_emails
* restore_data_from_archive
* do_change_realm_subdomain
* do_create_realm
* do_deactivate_realm
* do_reactivate_realm
* do_delete_user
* do_delete_user_preserving_messages
* create_stripe_customer
* process_initial_upgrade
* do_update_plan
* request_sponsorship
* upload_message_attachment
* register_remote_server
* do_soft_deactivate_users
* maybe_send_batched_emails
It helps to avoid creating unintended savepoints in the future.
This is as a part of our plan to explicitly mark all the
transaction.atomic calls with either 'savepoint=False' or
'durable=True' as required.
* 'savepoint=True' is used in special cases.
Earlier, we used to store the key data related to realm exports
in RealmAuditLog. This commit adds a separate table to store
those data.
It includes the code to migrate the concerned existing data in
RealmAuditLog to RealmExport.
Fixes part of #31201.
This prevents a deadlock between the thumbnailing worker and message
sending, as follows:
1. A user uploads an image, making Attachment and ImageAttachment
rows, as well as enqueuing a job in the thumbnailing queue.
2. Message sending starts a transaction, creates the Message row,
and calls `do_claim_attachments`, which edits the Attachment row
of the upload (implicitly locking it).
3. The thumbnailing worker starts a transaction, locks the
ImageAttachment row for its image, thumbnails it, and then
attempts to `select_for_update()` the message objects (joined to
the Attachments table) to find the ones which link to the
attachment in question. This query blocks, since "a locking
clause without a table list affects all tables used in the
statement"[^1] and the message-send request already has a write
lock on the Attachments row in question.
4. The message-send request attempts to re-fetch the ImageAttachment
row inside the transaction, which tries to pull a lock on it.
5. Deadlock, because the message-send request has the Attachment
lock, and waits for the ImageAttachment lock; the thumbnailing
worker has the ImageAttachment lock, and waits for the Attachment
lock.
We break this deadlock by limiting the
`update_message_rendered_content` `select_for_update` to only take
the lock on the Message table, and not also the Attachments table --
no changes will be made to the Attachments, so no lock is necessary
there. This allows the thumbnailing worker to successfully pull the
empty list of messages (since the message-send request has not
commits its transaction, and thus the Message row is not visible
yet), and release its ImageAttachment lock so that the message-send
request can proceed.
[^1]: https://www.postgresql.org/docs/current/sql-select.html#SQL-FOR-UPDATE-SHARE
For exporting full with consent:
* Earlier, a message advertising users to react with thumbs up
was sent and later used to determine the users who consented.
* Now, we no longer need to send such a message. This commit
updates the logic to use `allow_private_data_export` user-setting
to determine users who consented.
Fixes part of #31201.
Messages are rendered outside of a transaction, for performance
reasons, and then sent inside of one. This opens thumbnailing up to a
race where the thumbnails have not yet been written when the message
is rendered, but the message has not been sent when thumbnailing
completes, causing `rewrite_thumbnailed_images` to be a no-op and the
message being left with a spinner which never resolves.
Explicitly lock and use he ImageAttachment data inside the
message-sending transaction, to rewrite the message content with the
latest information about the existing thumbnails.
Despite the thumbnailing worker taking a lock on Message rows to
update them, this does not lead to deadlocks -- the INSERT of the
Message rows happens in a transaction, ensuring that either the
message rending blocks the thumbnailing until the Message row is
created, or that the `rewrite_thumbnailed_images` and Message INSERT
waits until thumbnailing is complete (and updated no Message rows).
This allows clients to potentially lay out the thumbnails more
intelligently, or to provide a better "progressive-load" experience
when enlarging the thumbnail.
A new table is created to track which path_id attachments are images,
and for those their metadata, and which thumbnails have been created.
Using path_id as the effective primary key lets us ignore if the
attachment is archived or not, saving some foreign key messes.
A new worker is added to observe events when rows are added to this
table, and to generate and store thumbnails for those images in
differing sizes and formats.