restart-server has grown more complex since
efresh-sharding-and-restart was written -- use that complexity. For
instance, workers are stopped one-at-a-time, rather than requiring
that they all stop sequentially before being started again.
It also means that Django restarts will be rolling, which trades off
500s during login for realms which are moving, in exchange for leaving
other realms unaffected.
Staging and other hosts that are `zulip::app_frontend_base` but not
`zulip::app_frontend_once` do not have a
/etc/supervisor/conf.d/zulip/zulip-once.conf and as such do not have
`zulip_deliver_scheduled_emails` or `zulip_deliver_scheduled_messages`
and thus supervisor will fail to reload.
Making the contents of `zulip-workers` contingent on if the server is
_also_ a `-once` server is complicated, and would involve using Concat
fragments, which severely limit readability.
Instead, expel those two from `zulip-workers`; this is somewhat
reasonable, since they are use an entirely different codepath from
zulip_events_*, using the database rather than RabbitMQ for their
queuing.
This supports running puppet to pick up new sharding changes, which
will warn of the need to finalize them via
`refresh-sharding-and-restart`, or simply running that directly.
This allows straight-forward configuration of realm-based Tornado
sharding through simply editing /etc/zulip/zulip.conf to configure
shards and running scripts/refresh-sharding-and-restart.
Co-Author-By: Mateusz Mandera <mateusz.mandera@zulip.com>