1565 Commits

Author SHA1 Message Date
Anders Kaseorg
996eb72e2a install-uv: Upgrade uv from 0.7.15 to 0.7.21.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2025-07-15 07:32:44 -07:00
Prakhar Pratyush
8b3cef554b settings: Add push_registration_encryption_keys map.
The `push_registration_encryption_keys` map stores the
asymmetric key pair generated on the bouncer.

The public key will be used by the client to encrypt
registration data and the bouncer will use the corresponding
private key to decrypt.

- Updated the `generate_secrets.py` script to generate the map
during installation in the dev environment.
- Added a management command to add or remove a key, so that it
can be used for key rotation while retaining the older key pair
for a period of time.
2025-07-06 21:11:26 -07:00
Anders Kaseorg
7959a1853c install-node: Upgrade Node.js from 22.16.0 to 22.17.0.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2025-06-25 16:53:09 -07:00
Anders Kaseorg
9f8f6e60d9 install-uv: Upgrade uv from 0.7.11 to 0.7.15.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2025-06-25 16:52:39 -07:00
Anders Kaseorg
cdbe2d157f flush_memcached: Respect DJANGO_SETTINGS_MODULE.
We don’t need to flush anything for zproject.test_settings, which
disables memcached.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2025-06-25 11:49:02 -07:00
Alex Vandiver
b924169d17 setup-apt-repo: Add libheif PPA, and debian bookworm backport.
libheif 1.18 is required to be able to parse images generated by iOS
18; none of Zulip's supported distributions package libheif 1.18, so
we pull a newer version of the package from a PPA (Ubuntu) or backports
(Debian).
2025-06-25 11:39:18 -07:00
Alex Vandiver
a0683927ef check_rabbitmq_queue: Relax paging thresholds for email_senders.
2025-06-18 12:29:57 -07:00
Anders Kaseorg
acd6c51b6f manage: Delete custom PYTHONSTARTUP.
In Django 5.2, manage.py shell automatically imports models.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2025-06-12 09:28:00 -07:00
Anders Kaseorg
927ea011d3 upgrade-postgresql: Get PostgreSQL version without manage.py shell.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2025-06-11 16:25:25 -07:00
Alex Vandiver
6f1950ac0e restart-server: Send client reload events in the background.
For deploys with --skip-puppet, this makes the output visible much
more promptly.
2025-06-11 10:16:46 -07:00
Anders Kaseorg
56470bba8d install-uv: Upgrade uv from 0.7.2 to 0.7.11.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2025-06-06 11:35:32 -07:00
Tim Abbott
0ec07fe4c8 queue: Allow sharding user_activity worker.
This follows the existing patterns for the sharded mobile
notifications worker.
2025-06-06 10:33:20 -07:00
Anders Kaseorg
f6be163bcc install-node: Upgrade Node.js from 22.15.0 to 22.16.0.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2025-06-04 16:24:47 -07:00
Alex Vandiver
c6e0f0b436 email-mirror: Remove HTTP interface.
2025-05-19 16:39:44 -07:00
Alex Vandiver
7a62a9b509 upgrade: Swap postfix_localmail for local_mailserver.
2025-05-19 16:39:44 -07:00
Alex Vandiver
1f0cfd4662 email-mirror: Add a standalone server that processes incoming email.
Using postfix to handle the incoming email gateway complicates things
a great deal:

- It cannot verify that incoming email addresses exist in Zulip before
  accepting them; it thus accepts mail at the `RCPT TO` stage which it
  cannot handle, and thus must reject after the `DATA`.

- It is built to handle both incoming and outgoing email, which
  results in subtle errors (1c17583ad5, 79931051bd, a53092687e,
  #18600).

- Rate-limiting happens much too late to avoid denial of
  service (#12501).

- Mis-configurations of the HTTP endpoint can break incoming
  mail (#18105).

Provide a replacement SMTP server which accepts incoming email on port
25, verifies that Zulip can accept the address, and that no
rate-limits are being broken, and then adds it directly to the
relevant queue.

Removes an incorrect comment which implied that missed-message
addresses were only usable once.  We leave rate-limiting to only
channel email addresses, since missed-message addresses are unlikely
to be placed into automated systems, as channel email addresses are.

Also simplifies #7814 somewhat.
2025-05-19 16:39:44 -07:00
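
The acceptance flow described in 1f0cfd4662 (verify the address, check rate limits, then accept) can be sketched roughly as below. This is a minimal illustration and not the server's actual code: `check_recipient`, `KNOWN_ADDRESSES`, `WindowLimiter`, and the example address are all hypothetical, and the sketch rate-limits every known address for simplicity. Only the SMTP reply code classes (250, 451, 550) are standard.

```python
from collections import defaultdict

# Hypothetical stand-ins for Zulip's address lookup and rate limiter.
KNOWN_ADDRESSES = {"channel.abc123@zulip.example.com"}


class WindowLimiter:
    """Counts deliveries per address within one window (illustrative only)."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.counts: dict[str, int] = defaultdict(int)

    def allow(self, address: str) -> bool:
        self.counts[address] += 1
        return self.counts[address] <= self.limit


def check_recipient(address: str, limiter: WindowLimiter) -> str:
    """Pick the reply at the RCPT TO stage, before any DATA is accepted."""
    if address not in KNOWN_ADDRESSES:
        # Unknown addresses are refused up front rather than after DATA.
        return "550 5.1.1 Bad destination mailbox address"
    if not limiter.allow(address):
        # A temporary failure lets a well-behaved sender retry later.
        return "451 4.7.1 Rate limit exceeded, try again later"
    return "250 OK"
```

Rejecting at `RCPT TO` is the point of the design: the sender learns immediately that the mail cannot be handled, and the message body is never transferred.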
Alex Vandiver
0442bb6f0e upgrade-postgresql: Slightly better error-proof post-upgrade scripts.
2025-05-16 11:33:20 -07:00
Alex Vandiver
3ab6be650b upgrade-postgresql: Explicitly ask to not start the new cluster.
Recent versions of postgresql-common's `pg_upgradecluster`, starting
with version 254, (i.e. on Ubuntu 24.04, but not 22.04) will not just
_suggest_ running the analyze, but will do so automatically.  While
somewhat helpful, it always does so with `--analyze-in-stages`, which
as noted in f77bbd3323, is actually the incorrect choice for us.
Passing `--no-start` ensures that `pg_upgradecluster` consistently
does not do any analyzing, allowing us to start the cluster manually
and then perform the analyze correctly ourselves.
2025-05-16 11:33:20 -07:00
Alex Vandiver
e13f82f048 upgrade-postgresql: Use tags to partially-apply configuration.
This uses the same technique used in 840884ec89, to only apply select
parts of the Puppet configuration.  This is more correct, and simpler,
than attempting to chop out some base puppet roles, and hack around
the `purge => true` supervisor.d configuration.
2025-05-16 11:33:20 -07:00
Alex Vandiver
2dc5c6c50e upgrade-postgresql: Only touch pgroonga_setup.sql.applied if required.
Since c8ec3dfcf6, the file must contain the version that was
configured, or we run `ALTER EXTENSION pgroonga UPDATE`; if the file
is missing, and pgroonga was previously installed, we run `CREATE
EXTENSION pgroonga`, which will be an error.  If the file is present
but pgroonga was not configured, a later attempt to enable pgroonga
will incorrectly run `ALTER EXTENSION pgroonga UPDATE` instead of
`CREATE EXTENSION pgroonga`.

If the file existed on the previous version, touch it in the new
PostgreSQL version.  This will ensure that puppet will *always* run
the pgroonga update, which may be necessary in case the pgroonga
version also changed.  At worst, if the pgroonga version has not
changed, this will be a safe no-op.
2025-05-16 11:33:20 -07:00
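
The decision that 2dc5c6c50e describes can be sketched as a small function. This is an illustration under stated assumptions, not the Puppet logic itself: `pgroonga_action` and the version strings are hypothetical, and the comparison against the file's contents is inferred from the commit message.

```python
from typing import Optional


def pgroonga_action(applied_contents: Optional[str], configured_version: str) -> Optional[str]:
    """Return the SQL that would run, given the contents of the .applied file.

    `applied_contents` is None when pgroonga_setup.sql.applied does not exist.
    """
    if applied_contents is None:
        # No marker file: treat pgroonga as never set up.
        return "CREATE EXTENSION pgroonga"
    if applied_contents.strip() != configured_version:
        # Marker exists but does not record the configured version.
        return "ALTER EXTENSION pgroonga UPDATE"
    # Marker records the configured version: nothing to do.
    return None
```

A freshly touched file is empty, so it never matches the configured version, and the update statement runs on the new cluster; that is why touching the file only when it already existed gives the behavior the message describes.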
Anders Kaseorg
8e9de0b053 configure-rabbitmq: Restore startup retry loop.
‘rabbitmqctl await_startup’ does not retry while waiting for the Erlang
runtime to start; it only waits for the RabbitMQ application to start
once Erlang is running.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2025-05-15 16:59:27 -07:00
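
A startup retry loop of the kind 8e9de0b053 restores might look like the sketch below. It is not the script's actual code: `retry_until_ready`, its defaults, and the `probe` callable are hypothetical. In real use, `probe` would be assumed to run `rabbitmqctl await_startup` and report whether it exited successfully.

```python
import time
from typing import Callable


def retry_until_ready(
    probe: Callable[[], bool],
    attempts: int = 10,
    delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call `probe` until it succeeds or `attempts` tries are used up."""
    for attempt in range(attempts):
        if probe():
            return True
        # Do not sleep after the final failed attempt.
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

The outer loop covers the window in which the Erlang runtime is not yet up, which is the case the message says `await_startup` does not handle on its own.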
Anders Kaseorg
2354653f33 install-node: Upgrade Node.js from 22.14.0 to 22.15.0.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2025-05-05 14:15:44 -07:00
Anders Kaseorg
fe96666782 install-uv: Upgrade uv from 0.6.13 to 0.7.2.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2025-05-05 09:10:19 -07:00
Anders Kaseorg
e4a2695f54 install-uv: Upgrade uv from 0.6.6 to 0.6.13.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2025-04-08 10:17:49 -07:00
Anders Kaseorg
80b607c8cb install: Remove PostgreSQL 13 support.
PostgreSQL 13 reaches end of life on November 13, 2025, and Django 5.2
does not support it.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2025-04-07 17:41:55 -07:00
Anders Kaseorg
818742c62b install: Support PostgreSQL 17.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2025-04-07 16:42:19 -07:00
Alex Vandiver
342d278c8a upgrade: Remove handling of "googleblob" emoji rename.
This code was originally added in 608173657d in Zulip Server 2.0;
since we can only directly upgrade from 5.0 or later, this code is
guaranteed to have run already. Remove it.
2025-04-03 10:46:12 -07:00
Alex Vandiver
e93d43e8d1 upgrade: Remove puppet_classes renaming code.
This code was originally added in 5f3765b872 in Zulip Server 4.0;
since we can only directly upgrade from 5.0 or later, this code is
guaranteed to have run already. Remove it.
2025-04-03 10:46:12 -07:00
Alex Vandiver
f48a3a772f upgrade: Remove ancient symlink of settings.py.
This code was originally added in 2b146012e1 in Zulip Server 1.7.0;
since we can only directly upgrade from 5.0 or later, this code is
guaranteed to have run already. Remove it.
2025-04-03 10:46:12 -07:00
Alex Vandiver
e404a9b71c upgrade: Remove explicit python3-yaml install step.
The python3-yaml dependency was added at install time in 3314fefaec
in Zulip Server 4.0, and this workaround was added in de41a10d38,
also in 4.0.  Since we can only directly upgrade from 5.0 or later,
the dependency is guaranteed to be installed already, by one or the
other of those ways. Remove this workaround.
2025-04-03 10:46:12 -07:00
Alex Vandiver
39cc830ae5 upgrade: Remove tsearch_extras cleanup code.
This code was originally added in 382261dc72 in Zulip Server 3.0;
since we can only directly upgrade from 5.0 or later, this code is
guaranteed to have run already. Remove it.
2025-04-03 10:46:12 -07:00
Alex Vandiver
53bf48a873 upgrade: Remove RabbitMQ cookie randomization code.
This code was originally added in e705883857 in Zulip Server 5.0;
since we can only directly upgrade from 5.0 or later, this code is
guaranteed to have run already. Remove it.
2025-04-03 10:46:12 -07:00
Alex Vandiver
ba9569a6fe sha256-tarball-to: Support zipfiles.
2025-03-27 21:56:54 -07:00
Anders Kaseorg
7702f53d90 clean-venv-cache: Remove.
The current stable branch is on uv, so we no longer need to preserve
the old-style zulip-venv-cache directories from the last 14 days.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2025-03-26 15:51:48 -07:00
Anders Kaseorg
c517e95e6b install: Move ourself to deployments path before creating venv.
This prevents the venv from ending up with references to /root.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2025-03-25 10:17:17 -07:00
Anders Kaseorg
ce81d8498d provision: Ignore Python warnings while building requirements.
Build warnings are unfortunately very common in third-party packages.
They’re difficult to reliably detect since packages don’t always build
from source, and they can’t be whitelisted on a per-package basis
since they’re all attributed to setuptools or an anonymous code
string.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2025-03-19 17:15:09 -07:00
Anders Kaseorg
838ae38b43 install-uv: Upgrade uv from 0.6.3 to 0.6.6.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2025-03-14 17:31:50 -07:00
Alex Vandiver
29a0d287fc puppet: Allow for arbitrary queues to have more than one worker.
This generalizes from thumbnail_workers, to include any other queue.
We only additionally choose to document `email_senders_workers`,
however, since other queues are not guaranteed to work correctly with
multiple consumers.
2025-03-14 14:07:09 -07:00
Alex Vandiver
232de4b98f check_rabbitmq_queue: Increase deferred_email_senders paging thresholds.
c5200e8b05 switched `digest_emails` from sending emails by inserting
into the ScheduledEmail table, and being processed later by
`deliver_scheduled_emails`, to inserting into the
`deferred_email_senders` RabbitMQ queue.  This moved it from being in
an unmonitored table, to a monitored queue.

This slightly improved throughput -- but began paging, since the
backlog was now in a monitored form.  Increase the paging thresholds
to not page for expected behaviour.
2025-03-11 12:34:11 -07:00
Alex Vandiver
a9337e7641 nagios: Change the cron jobs to exit 0 for all ok/warning/critical.
The cron jobs are potentially wrapped by Sentry, which logs "cron
failures" and sends emails.  We would like those failures to only be
when the cron job itself failed to run successfully -- not when the
underlying metric is outside of its normal range.  We would like to
differentiate a failure of the monitoring infrastructure from a
failure of what it is monitoring.

Swap to return 0 on everything except "unknown" results.
2025-03-05 09:49:36 -08:00
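
The exit-code policy in a9337e7641 can be illustrated with the standard Nagios plugin return codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The function name `cron_exit_code` is hypothetical, and returning 3 for "unknown" is an assumption; the commit message only says that everything except "unknown" returns 0.

```python
# Standard Nagios plugin return codes.
NAGIOS_STATES = {"ok": 0, "warning": 1, "critical": 2, "unknown": 3}


def cron_exit_code(state: str) -> int:
    """Exit status for the cron wrapper, as opposed to the Nagios state itself."""
    if state in ("ok", "warning", "critical"):
        # The check ran successfully; the metric's value is reported elsewhere.
        return 0
    # Assumption: anything else means the check itself could not run.
    return NAGIOS_STATES["unknown"]
```

With this split, a non-zero exit from the cron job signals a failure of the monitoring itself, not an out-of-range metric.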
Alex Vandiver
c5200e8b05 deliver_scheduled_emails: Use a queue, instead of infinite retries.
`deliver_scheduled_emails` tries to deliver the email synchronously,
and if it fails, it retries after 10 seconds.  Since it does not track
retries, and always tries the earliest-scheduled-but-due message
first, the worker will not make forward progress if there is a
persistent failure with that message, and will retry indefinitely.
This can result in excessive network or email delivery charges from
the remote SMTP server.

Switch to delivering emails via a new queue worker.  The
`deliver_scheduled_emails` job now serves only to pull deferred jobs
out of the table once they are due, insert them into RabbitMQ, and
then delete them.  This limits the potential for head-of-queue
failures to failures inserting into RabbitMQ, which is more reasonable
than failures speaking to a complex external system we do not control.
Retries and any connections to the SMTP server are left to the
RabbitMQ consumer.

We build a new RabbitMQ queue, rather than use the existing
`email_senders` queue, because that queue is expected to be reasonably
low-latency, for things like missed message notifications.  The
`send_future_email` codepath which inserts into ScheduledEmail is
also (ab)used to send digest emails, which are extremely bursty in their
frequency -- and a large burst could significantly delay emails behind
it in the queue.

The new queue is explicitly only for messages which were not initiated
by user actions (e.g., invitation reminders, digests, new account
follow-ups) which are thus not latency-sensitive.

Fixes: #32463.
2025-03-04 16:09:25 -08:00
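
The reduced role of `deliver_scheduled_emails` after c5200e8b05 (pull due jobs from the table, insert them into the queue, then delete them) can be sketched as below. This is a simplified model, not Zulip's code: `ScheduledJob`, `enqueue_due_jobs`, the in-memory list standing in for the table, and the `deque` standing in for RabbitMQ are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class ScheduledJob:
    job_id: int
    due_at: float  # timestamp at which the job becomes due


def enqueue_due_jobs(table: list[ScheduledJob], queue: deque, now: float) -> int:
    """Move every due job from the table into the queue; return how many moved."""
    due = sorted((job for job in table if job.due_at <= now), key=lambda job: job.due_at)
    for job in due:
        # Enqueue first, delete second: a failure here leaves the row in place.
        queue.append(job.job_id)
        table.remove(job)
    return len(due)
```

No SMTP work happens in this loop, so the only thing that can block the head of the table is a failure to enqueue; sending and retrying belong to the queue consumer.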
Alex Vandiver
47e622f5a5 run_hooks: Pass down, and respect, --from-git argument.
The refactoring in 4e28e1d3ff incorrectly switched a check for
`if args.from_git` into `if NEW_ZULIP_MERGE_BASE`, which is
incorrect -- the merge-base is always defined, it may just match the
version.  This led to errors when installing from tarball, without a
git repo.

Since the run_hooks command was already set up to take a `--from-git`
argument, but was ignoring it, pass down that flag from
upgrade-zulip-stage-3 when necessary, and swap the run_hooks logic
back to basing the version-resolution logic on that flag.
2025-03-04 13:18:50 -08:00
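
The difference between the two checks in 47e622f5a5 can be shown with a small `argparse` sketch. The function `version_resolution_flags` and the version strings are hypothetical; the sketch only illustrates why testing the merge-base for truthiness cannot stand in for the `--from-git` flag.

```python
import argparse


def version_resolution_flags(argv: list[str], merge_base: str) -> tuple[bool, bool]:
    """Return (buggy_check, fixed_check) for the given arguments."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--from-git", action="store_true")
    args = parser.parse_args(argv)

    # Buggy form: the merge-base is always defined, so this is always true.
    buggy_check = bool(merge_base)
    # Fixed form: only true when the caller passed --from-git.
    fixed_check = args.from_git
    return buggy_check, fixed_check
```

For a tarball install, `version_resolution_flags([], "10.0")` gives `(True, False)`: the buggy check would still take the git path even though there is no repository.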
Anders Kaseorg
3af4900891 install-node: Upgrade Node.js from 22.12.0 to 22.14.0.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2025-02-26 16:20:47 -08:00
Anders Kaseorg
d7556b4060 requirements: Migrate to uv.
https://docs.astral.sh/uv/

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2025-02-24 22:29:24 -08:00
Anders Kaseorg
72f5df2e09 install: Remove --cacert and CUSTOM_CA_CERTIFICATES.
This has been broken for many years and nobody’s complained.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2025-02-24 22:29:24 -08:00
Alex Vandiver
6ac9e3328e cache: Flush caches from all known key prefixes.
When flushing caches, we want to ensure that even processes which may
have a wrong cache-key-prefix know to fetch the latest data from the
database.  This is complicated by the cache-key-prefixes being stored
on disk, and thus checking them on every cache delete is not sufficiently
performant.

We store the list of cache-key-prefixes in the cache itself, with no
prefix.  This cache is updated when a new cache-key-prefix is written, and
is also allowed to lapse after 24 hours.  Updating this global cache
entry on new prefix creation ensures that even a
not-yet-restarted-into deployment will have its caches appropriately
purged if changes are made to the underlying data.

However, this both adds a cache-get and multiplies the size of
all cache clears; for large bulk clears (e.g. for stream renames,
which clear the cache for all message-ids in them) this may prove
untenable.
2025-02-21 14:11:08 -08:00
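
The scheme in 6ac9e3328e (a list of known prefixes stored under an unprefixed key, consulted on every delete) can be modeled with a plain dictionary. This is a toy sketch, not Zulip's cache layer: `PrefixedCache`, the key name `known_key_prefixes`, and the example prefixes are hypothetical, and the 24-hour lapse is omitted.

```python
class PrefixedCache:
    """A cache view with its own key prefix over a shared backing store."""

    PREFIXES_KEY = "known_key_prefixes"  # stored with no prefix

    def __init__(self, store: dict, prefix: str) -> None:
        self.store = store
        self.prefix = prefix
        # Register this prefix in the shared, unprefixed entry.
        store.setdefault(self.PREFIXES_KEY, set()).add(prefix)

    def set(self, key: str, value: object) -> None:
        self.store[self.prefix + key] = value

    def get(self, key: str) -> object:
        return self.store.get(self.prefix + key)

    def delete(self, key: str) -> None:
        # One extra read, then one delete per known prefix.
        for prefix in self.store.get(self.PREFIXES_KEY, {self.prefix}):
            self.store.pop(prefix + key, None)
```

The cost noted in the message is visible in `delete`: each call reads the prefix list and then issues one removal per prefix, so bulk clears grow with the number of live prefixes.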
Alex Vandiver
e2df4f52ef kandra: Update Teleport version.
2025-02-21 10:16:33 -08:00
Anders Kaseorg
3823697e6c clean_node_cache: Remove.
The old /srv/zulip-npm-cache system has been unused for two
years (Zulip Server ≥ 7.0).  We can just delete this directory.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2025-02-19 16:44:02 -08:00
Alex Vandiver
72f667fb31 upgrade-zulip: Prevent restarting only Django into inconsistent caching.
2025-02-14 12:03:13 -08:00
Mateusz Mandera
367d193639 register_server: Rename flag to --agree-to-terms-of-service.
That's a better style than the underscores.
2025-02-13 11:03:44 -08:00