When flushing caches, we want to ensure that even processes which may
have a wrong cache-key-prefix know to fetch the latest data from the
database. This is complicated by the cache-key-prefixes being stored
on disk; reading them from disk on every cache delete is not
sufficiently performant.
We store the list of cache-key-prefixes in the cache itself, with no
prefix. This cache entry is updated when a new cache-key-prefix is
written, and is also allowed to lapse after 24 hours. Updating this
global cache
entry on new prefix creation ensures that even a
not-yet-restarted-into deployment will have its caches appropriately
purged if changes are made to the underlying data.
However, this both adds a cache-get and multiplies the size of
all cache clears; for large bulk clears (e.g. for stream renames,
which clear the cache for all message-ids in them) this may prove
untenable.
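
A minimal sketch of the scheme described above, using a plain dict in
place of the real cache. The names `PREFIX_LIST_KEY`, `register_prefix`,
and `delete_everywhere` are hypothetical, not Zulip's actual API, and
the 24-hour expiry is left out for brevity.

```python
from typing import Any

# Hypothetical stand-in for the shared cache; not Zulip's real cache API.
cache: dict[str, Any] = {}

# Assumed name for the un-prefixed entry that lists every known prefix.
PREFIX_LIST_KEY = "cache_key_prefixes"


def register_prefix(prefix: str) -> None:
    """Record a deployment's prefix in the shared, un-prefixed list."""
    prefixes = set(cache.get(PREFIX_LIST_KEY, ()))
    prefixes.add(prefix)
    cache[PREFIX_LIST_KEY] = sorted(prefixes)


def delete_everywhere(keys: list[str]) -> int:
    """Delete each key under every known prefix; return the deletes issued."""
    prefixes = cache.get(PREFIX_LIST_KEY, [])  # the extra cache-get
    issued = 0
    for prefix in prefixes:
        for key in keys:
            cache.pop(prefix + key, None)
            issued += 1
    return issued
```

With P known prefixes and K keys, this issues P × K deletes, which is
the multiplication of bulk clears noted above.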
The old /srv/zulip-npm-cache system has been unused for two
years (Zulip Server ≥ 7.0). We can just delete this directory.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
This consolidates the list of stale migrations into
`lib/migration_status.py` as `STALE_MIGRATIONS`.
This is prep work to let the migration status tool at
`migration_status.py` clean its output of these migrations
too.
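
As a rough illustration of how a shared list could be used to clean
tool output, here is a sketch. The shape of `STALE_MIGRATIONS` (a list
of `(app, migration_name)` pairs), its entries, and the helper
`drop_stale` are all assumptions made for this example.

```python
# Hypothetical entries and shape; the real list lives in lib/migration_status.py.
STALE_MIGRATIONS = [
    ("example_app", "0001_initial"),
    ("example_app", "0002_old_change"),
]


def drop_stale(migrations_by_app: dict[str, list[str]]) -> dict[str, list[str]]:
    """Return a copy of the per-app migration lists without stale entries."""
    stale = set(STALE_MIGRATIONS)
    cleaned: dict[str, list[str]] = {}
    for app, names in migrations_by_app.items():
        kept = [name for name in names if (app, name) not in stale]
        if kept:  # omit apps whose migrations were all stale
            cleaned[app] = kept
    return cleaned
```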
Currently, it handles two hook types: 'pre-create' (to verify that the
user is authenticated and the file size is within the limit) and
'pre-finish' (which creates an attachment row).
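
The dispatch between the two hook types might look roughly like the
sketch below. The function `handle_hook`, its parameters, and the dict
standing in for attachment rows are hypothetical; the real endpoint
parses tusd's hook request and uses Django models.

```python
from typing import Optional


def handle_hook(
    hook_type: str,
    *,
    authenticated: bool,
    upload_id: str,
    size: int,
    max_size: int,
    attachments: dict[str, int],
) -> tuple[bool, Optional[str]]:
    """Return (accepted, rejection_reason) for one hook call."""
    # Credentials are checked as for any end-user request.
    if not authenticated:
        return False, "not authenticated"
    if hook_type == "pre-create":
        if size > max_size:
            return False, "file too large"
        return True, None
    if hook_type == "pre-finish":
        # Stand-in for creating the attachment row.
        attachments[upload_id] = size
        return True, None
    # Assumption for this sketch: other hook types are accepted and ignored.
    return True, None
```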
No secret is shared between Django and tusd for authentication of the
hooks endpoints, because none is necessary -- tusd forwards the
end-user's credentials, and the hook checks them like it would any
end-user request. An end-user gaining access to the endpoint would be
able to do no more harm than via tusd or the normal file upload API.
Regardless, the previous commit has restricted access to the endpoint
at the nginx layer.
Co-authored-by: Brijmohan Siyag <brijsiyag@gmail.com>
5308fbdeac split out `zulip::postgresql_client`, and 80ef38757a
made it no longer depend on `zulip::postgresql_common`, but directly
on `zulipconf('postgresql', 'version', undef)`. However, the
installer depended on recognizing `zulip::postgresql_common` in the
list of pulled-in classes to know that we needed to keep the
`postgresql.version` setting in `/etc/zulip.conf`.
Update the installer to also recognize `zulip::postgresql_client` as a
class which tells us to keep `postgresql.version` in our settings.
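
The check amounts to a membership test over the pulled-in classes. The
sketch below is in Python purely for illustration; the helper name
`needs_postgresql_version` is hypothetical, and the real installer may
obtain and represent the class list differently.

```python
from collections.abc import Iterable

# Classes whose presence means postgresql.version must stay in /etc/zulip.conf.
POSTGRESQL_VERSION_CLASSES = frozenset(
    {"zulip::postgresql_common", "zulip::postgresql_client"}
)


def needs_postgresql_version(pulled_in_classes: Iterable[str]) -> bool:
    """True if any pulled-in class relies on the postgresql.version setting."""
    return any(cls in POSTGRESQL_VERSION_CLASSES for cls in pulled_in_classes)
```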
This provides significant size savings:
| Emoji set | png size | webp size | webp/png percent |
| ----------- | -------- | --------- | ---------------- |
| google-blob | 1968954 | 1373350 | 69.75% |
| twitter | 2972820 | 2149672 | 72.31% |
| google | 3455270 | 2327834 | 67.37% |
Since these are the largest assets that we ship to clients, it is
worth shaving off every byte we can.
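
The percentage column can be reproduced from the byte counts in the
table; the helper name `webp_percent` is just for this example.

```python
# Byte counts copied from the table above: (png size, webp size).
SIZES = {
    "google-blob": (1968954, 1373350),
    "twitter": (2972820, 2149672),
    "google": (3455270, 2327834),
}


def webp_percent(png_size: int, webp_size: int) -> str:
    """WebP size as a percentage of PNG size, to two decimal places."""
    return f"{100 * webp_size / png_size:.2f}%"


for name, (png, webp) in SIZES.items():
    print(f"{name}: {webp_percent(png, webp)}")
```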
`setup_path()` previously only checked that some `zulip-py3-venv` was
the `sys.prefix`, not that it was the one associated with this
deployment. When `uwsgi` is started, it is started from `bin/uwsgi`
within a `zulip-py3-venv` virtualenv, and as such sets
`sys.executable` to that, resulting in uwsgi workers picking up the
library path of that virtualenv. On first start, `sys.path` thus
already matches the expected virtualenv, and the `setup_path` in
`zproject.wsgi` does nothing.
If a rolling restart was later done into a deployment with a different
virtualenv, the `zproject.wsgi` call to `setup_path()` did not change
`sys.path` to the new virtualenv, since it was already running within
_a_ virtualenv. This led to dependency version mismatches, and
potentially even more disastrous consequences if the old (but still
erroneously in use) virtualenv was later garbage-collected.
PR #26771 was a previous attempt to resolve this, but failed due to
not thinking of the uwsgi binary itself as possibly providing a
virtualenv path. We leave the `chdir` hooks from that in place, since
it cannot hurt for the "master" uwsgi process to be chdir'd to `/`,
and the `hook-post-fork` `chdir` is reasonable as well.
Resolve the virtualenv in `setup_path()`, and activate it if it
differs from the one that is currently active. To be sure that no
other old virtualenvs are used, we also filter out any paths which
appear to be from other Zulip virtualenvs.
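
The filtering step could be sketched as below. The helper name
`filter_paths` and the "path component named `zulip-py3-venv`"
heuristic are assumptions made for illustration, not the actual
`setup_path()` code.

```python
import os

VENV_NAME = "zulip-py3-venv"


def filter_paths(paths: list[str], current_venv: str) -> list[str]:
    """Drop entries that look like they belong to a different Zulip virtualenv."""
    current = os.path.realpath(current_venv)
    kept = []
    for path in paths:
        real = os.path.realpath(path)
        # Heuristic: any path with a component named zulip-py3-venv is "in a venv".
        in_some_venv = VENV_NAME in real.split(os.sep)
        in_current = real == current or real.startswith(current + os.sep)
        if in_some_venv and not in_current:
            continue
        kept.append(path)
    return kept
```

Applying this to `sys.path` after activating the resolved virtualenv
would leave system paths and the current virtualenv's paths untouched.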
We should not proceed and send client reload events until we know that
all of the server processes have updated to the latest version, or
clients may reload into the old server version if they hit a Django
worker which has not yet restarted.
Because the logic controlling the number of workers is mildly complex,
and lives in Puppet, use the `uwsgi` Python bindings to know when the
process being reloaded is the last one, and use that to write out a
file signifying the success of the chain reload. `restart-server`
awaits the creation of this file before proceeding.
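
Both halves of this handshake can be sketched as follows. The function
names and marker path are hypothetical, and the sketch assumes workers
are numbered 1 through N and reloaded in that order; in the real code
the worker id and worker count would come from the `uwsgi` bindings
rather than from parameters.

```python
import os
import time


def note_worker_reloaded(worker_id: int, num_workers: int, marker_path: str) -> bool:
    """Write the marker file if this worker is the last one in the chain."""
    if worker_id != num_workers:
        return False
    with open(marker_path, "w") as f:
        f.write("ok\n")
    return True


def wait_for_marker(marker_path: str, timeout: float = 30.0, interval: float = 0.1) -> bool:
    """Poll until the marker file exists; return False if the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if os.path.exists(marker_path):
            return True
        time.sleep(interval)
    return os.path.exists(marker_path)
```

The waiting side would remove any stale marker before triggering the
reload, so that a file left over from an earlier restart is not
mistaken for success.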
We need this check when switching to a branch without the `help-beta`
package. `node_modules` is removed when working on a non-`help-beta`
branch, but if `node_modules/.pnpm/lock.yaml` has not been updated by that
branch, we can end up without `node_modules` even after running the
provision command.
We may no longer need this check once the initial `help-beta` folder
has been merged for a week or two, by which point almost all active PRs
will have been rebased onto main, making it easy to switch branches.
There's no need for sharding, but this allows one to spend a bit of
extra memory to reduce image-processing latency when bursts of images
are uploaded at once.
A new table is created to track which path_id attachments are images,
and, for those, their metadata and which thumbnails have been created.
Using path_id as the effective primary key lets us ignore whether the
attachment is archived, saving some foreign key messes.
A new worker is added to observe events when rows are added to this
table, and to generate and store thumbnails for those images in
differing sizes and formats.
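
A toy model of the table and worker described above, keyed by
path_id. The class `ImageRecord`, the `WANTED` variants, and
`on_row_added` are all hypothetical; the real code uses a database
table and an event-driven queue worker, and actually resizes images.

```python
from dataclasses import dataclass, field


@dataclass
class ImageRecord:
    """Hypothetical stand-in for one row of the new table."""

    path_id: str
    width: int
    height: int
    thumbnails: set[str] = field(default_factory=set)


# Assumed output variants (size and format); the real list is not given here.
WANTED = ["100x75.webp", "840x560.webp"]

# path_id acts as the effective primary key.
images: dict[str, ImageRecord] = {}


def on_row_added(record: ImageRecord) -> list[str]:
    """Worker step: create any missing thumbnails and record them on the row."""
    images[record.path_id] = record
    created = []
    for variant in WANTED:
        if variant not in record.thumbnails:
            # A real worker would resize and store the image here.
            record.thumbnails.add(variant)
            created.append(variant)
    return created
```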