This is most important for `send_zulip_update_announcements`, which
can race with the version run as a post-deploy hook. However, all of
these crons can tolerate being slightly delayed, and there's little
benefit to them taking CPU or possibly hitting odd borderline race
conditions when the deploy is in progress.
For safety, we only trust the deploy lockfile if it was created
within the last hour -- deploys should not take more than an hour, and
failing to ever run hourly crons is much worse than perhaps running
them during a real very-long deploy.
When these exceptions are thrown from the request-to-bouncer functions
inside of manage.py register_server/update_analytics_counts, they
shouldn't be silenced, merely calling maybe_mark_pushes_disabled in the
background.
This results in the occurrence of the error not being shown to the
user. Failure to upload analytics data when running these commands
should result in a loud, obvious error.
Failure of running register_server before this change:
```
./manage.py register_server
This command registers your server for the Mobile Push Notifications Service.
Doing so will share basic metadata with the service's maintainers:
* This server's configured hostname: zulipdev.com:9991
* This server's configured contact email address: desdemona+admin@zulip.com
* Metadata about each organization hosted by the server; see:
<https://zulip.com/doc-permalinks/basic-metadata>
Use of this service is governed by the Zulip Terms of Service:
<https://zulip.com/policies/terms>
Do you want to agree to the Zulip Terms of Service and proceed? [Y/n]
Mobile Push Notification Service registration successfully updated!
```
The occurrence of the error is not revealed to the user. Same concern
applies to the update_analytics_counts command.
After this change:
```
./manage.py register_server
This command registers your server for the Mobile Push Notifications Service.
Doing so will share basic metadata with the service's maintainers:
<...>
Do you want to agree to the Zulip Terms of Service and proceed? [Y/n]
Traceback (most recent call last):
File "/srv/zulip/./manage.py", line 150, in <module>
execute_from_command_line(sys.argv)
File "/srv/zulip/./manage.py", line 115, in execute_from_command_line
utility.execute()
File "/srv/zulip-venv-cache/bb36fc1fcb6d8c70a9a0bcb7bac45d78623a9ff4/zulip-py3-venv/lib/python3.10/site-packages/django/core/management/__init__.py", line 436, in execute
self.fetch_command(subcommand).run_from_argv(self.argv)
File "/srv/zulip-venv-cache/bb36fc1fcb6d8c70a9a0bcb7bac45d78623a9ff4/zulip-py3-venv/lib/python3.10/site-packages/django/core/management/base.py", line 413, in run_from_argv
self.execute(*args, **cmd_options)
File "/srv/zulip/zerver/lib/management.py", line 97, in execute
super().execute(*args, **options)
File "/srv/zulip-venv-cache/bb36fc1fcb6d8c70a9a0bcb7bac45d78623a9ff4/zulip-py3-venv/lib/python3.10/site-packages/django/core/management/base.py", line 459, in execute
output = self.handle(*args, **options)
File "/srv/zulip/zerver/management/commands/register_server.py", line 137, in handle
send_server_data_to_push_bouncer(consider_usage_statistics=False, raise_on_error=True)
File "/srv/zulip/zerver/lib/remote_server.py", line 453, in send_server_data_to_push_bouncer
response = send_to_push_bouncer(
File "/srv/zulip/zerver/lib/remote_server.py", line 233, in send_to_push_bouncer
raise JsonableError(msg)
zerver.lib.exceptions.JsonableError: Duplicate registration detected.
```
Instead of the PUSH_NOTIFICATIONS_BOUNCER_URL and
SUBMIT_USAGE_STATISTICS settings, we want servers to configure
individual ZULIP_SERVICE_* settings, while maintaining backward
compatibility with the old settings. Thus, if all the new
ZULIP_SERVICE_* are at their default False value, but the legacy
settings are activated, they need to be translated in computed_settings
to the modern way.
Factor out the repeated pattern of taking a lock, or immediately
aborting with a message if it cannot be acquired. The exit code in
that situation is changed to be exit code 1, rather than the successful
0; we are likely missing new work since that process started.
We move the lockfiles to a common directory under `/srv/zulip-locks`
rather than muddy up `/home/zulip/deployments`.
Adds a re-usable lockfile_nonblocking helper to context_managers.
Relying on naive `os.mkdir` is not enough especially now that the
successful operation of this command is necessary for push notifications
to work for many servers.
We can't use `lockfile` context manager from
`zerver.lib.context_managers`, because we want the custom behavior of
failing if the lock can't be acquired, instead of waiting.
That's because if an instance of this gets stuck, we don't want to start
queueing up more processes waiting forever whenever the cronjob runs
again and fail->exit is preferrable instead.
Given that most of the use cases for realms-only code path would
really like to upload audit logs too, and the others would likely
produce a better user experience if they upoaded audit logs, we
should just have a single main code path here i.e.
'send_analytics_to_push_bouncer'.
We still only upload usage statistics according to documented
option, and only from the analytics cron job.
The error handling takes place in 'send_analytics_to_push_bouncer'
itself.
These metadata are essentially all publicily available anyway, and
making uploading them unconditional will simplify some things.
The documentation is not quite accurate in that it claims the server
will upload some metadata that is not actually uploaded yet (but will
by soon). This seems harmless.
This reduces the giant load spike at 5 minute past the hour, when all
remote servers currently attempt to submit their records.
We do not wish to slew over a full hour, because we want to ensure
that we do not hold the lock when the next hour's analytics runs. It
is also not necessary to have that much variation; 10 minutes is
picked as an arbitrary "long enough" time to spread requests over.
The former name is kind of misleading - this function is for the remote
server to send analytics to the push bouncer. Under our usual
terminology, a "remote server" is a self-hosted Zulip server. So data is
sent FROM not TO a remote server.
Fixes#2665.
Regenerated by tabbott with `lint --fix` after a rebase and change in
parameters.
Note from tabbott: In a few cases, this converts technical debt in the
form of unsorted imports into different technical debt in the form of
our largest files having very long, ugly import sequences at the
start. I expect this change will increase pressure for us to split
those files, which isn't a bad thing.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
datetime.timezone is available in Python ≥ 3.2. This also lets us
remove a pytz dependency from the PostgreSQL scripts.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
This is a preparatory commit for using isort for sorting all of our
imports, merging changes to files where we can easily review the
changes as something we're happy with.
These are also files with relatively little active development, which
means we don't expect much merge conflict risk from these changes.
We were updating FillState with FillState.objects.filter(..).update(..),
which does not update the last_modified field (which has auto_now=True).
The correct incantation is the save() method of the actual FillState
object.
Adds two simplifying assumptions to how we process analytics stats:
* Sets the atomic unit of work to: a stat processed at an hour boundary.
* For any given stat, only allows these atomic units of work to be processed
in chronological order.
Adds a table FillState that, for each stat, keeps track of the last unit of
work that was processed.