Commit Graph

648 Commits

Author SHA1 Message Date
Aditya Bansal
efe8545303 local-uploads: Start running authentication checks on file requests.
From here on we start to authenticate uploaded file request before
serving this files in production. This involves allowing NGINX to
pass on these file requests to Django for authentication and then
serve these files by making use on internal redirect requests having
x-accel-redirect field. The redirection on requests and loading
of x-accel-redirect param is handled by django-sendfile.

NOTE: This commit starts to authenticate these requests for Zulip
servers running platforms either Ubuntu Xenial (16.04) or above.

Fixes: #320 and #291 partially.
2018-02-16 05:06:37 +05:30
Greg Price
20c734c90a puppet: Fix type error in new Nagios check for analytics state. 2018-02-09 17:46:46 -08:00
Tim Abbott
005b0fb566 puppet: Clean up ssh authorized_keys configuration rules. 2018-02-09 16:37:03 -08:00
Tim Abbott
aca25b6f0a puppet: Move ssh configuration to use notify.
This handles more correctly the case where we're using the upstream
sshd_config file.
2018-02-09 16:37:03 -08:00
Tim Abbott
486de8abfc puppet: Edit some rules to support chat.zulip.org.
This should make it possible to use the zulip_ops base rules
successfully on chat.zulip.org.  Many of the changes in this commit
are hacks and probably can be cleaned up later, but given that we plan
to drop trusty support soon, it's likely that most of them will simply
be deleted then.
2018-02-09 16:37:03 -08:00
Rishi Gupta
1d581a9c6e nagios: Add nagios check for analytics state.
This should help us detect issues where the analytics cron jobs aren't
running properly.

The cron/nagios part of the implementation done by tabbott.
2018-02-09 16:36:05 -08:00
Greg Price
7df29e7a7c puppet: Only use those "modern" options when on xenial.
On trusty, we of course have an older version -- 1.4.14 -- and it is
not so modern, so this just gives an error.
2018-02-08 18:11:52 -08:00
Greg Price
23e6a2e579 puppet: Update memcached config to turn on this decade's technology.
We've been running this change on zulipchat.com for a couple of months
now.  Before then, we used to regularly get exceptions like this:

     File "./zerver/views/messages.py", line 749, in get_messages_backend
       setter=stringify_message_dict)
     File "./zerver/lib/cache.py", line 275, in generic_bulk_cached_fetch
       cache_set_many(items_for_remote_cache)
     File "./zerver/lib/cache.py", line 215, in cache_set_many
       get_cache_backend(cache_name).set_many(items, timeout=timeout)
     File "/home/zulip/deployments/2017-09-28-21-04-12/zulip-py3-venv/lib/python3.5/site-packages/django/core/cache/backends/memcached.py", line 150, in set_many
       self._cache.set_multi(safe_data, self.get_backend_timeout(timeout))
   pylibmc.Error: error 48 from memcached_set_multi

This error means memcached was unable to find space for the new value.
You might think that because memcached provides an LRU cache, this
shouldn't happen because it would just evict something... but in fact
  * memcached splits its data into "slabs" by object size, and
  * until recently, once a 1MiB "chunk" is allocated to a given "slab"
    i.e. size class, it wouldn't be reclaimed to allocate to another.

So once the cache has been filled up with objects of some distribution
of sizes, if some objects come in that would go in a different size
class, we have no chunks for that size class / slab, and can't get one.
And that's exactly what was happening on zulipchat.com.

Useful background can be found in
  https://github.com/memcached/memcached/wiki/ServerMaint#slab-imbalance
  https://github.com/memcached/memcached/wiki/ReleaseNotes1411
  https://github.com/memcached/memcached/wiki/ReleaseNotes1425
  https://github.com/memcached/memcached/wiki/ReleaseNotes150
We're already running v1.4.25, which provides an "automover" that should
be well equipped to fix this; v1.5.0 turns it on by default.

With this commit, adopt the "modern start line" recommended in the
release notes for our v1.4.25, including turning on the automover.
2018-02-08 16:34:49 -08:00
Vishnu Ks
bf2961418b puppet: Remove comment about period of soft deactivate users.
This often becomes wrong over time as it is currently.
2018-01-24 17:15:08 -08:00
Vishnu Ks
a11b742984 messages: Calculate value of first visible message ID using cron job.
[greg: Fixed buggy time conversion in estimate_recent_messages.]
2018-01-24 17:15:08 -08:00
Tim Abbott
9ed2a94b8c nagios: Add configuration designed for full-stack servers.
This doesn't yet pass all Nagios checks correctly, and still has a few
flaws:
* The ideal setup code for the `nagios` user in the database isn't included.
* Some of the other details are a bit off; we need to split some host roles.

But it's better than nothing, and we can iterate from here.
2018-01-24 14:16:03 -08:00
Aditya Bansal
dd0e6c8025 reminders: Fix issue with log file permissions in production. 2018-01-24 03:33:40 +05:30
Tim Abbott
2365b13b68 puppet: Move postgres Nagios plugin to main postgres-common.
This plugins package is required in order to use Nagios checks to
verify the Zulip postgres database, and thus belongs in the default
package set.
2018-01-23 10:31:48 -08:00
Aditya Bansal
ec1297c1e8 schedulemessages: Add delivery system for scheduled message. 2018-01-10 09:18:02 -05:00
Umair Khan
68513952fb email-worker: Create EmailSendingWorker.
This commit just copies all the code from MissedMessageSendingWorker
class to a new EmailSendingWorker class. All the logic to send an email
through a queue was already there. This commit only makes the logic
generic. It does so by creating a special purpose queue called
'email_senders' to send any type of email. To make
MissedMessageSendingWorker still work we derive it from
EmailSendingWorker. All the tests that were testing
MissedMessageSendingWorker now run against EmailSendingWorker.
2017-12-20 19:36:27 -08:00
Tim Abbott
f423dc4930 check_send_receive_time: Fix parsing bug.
This was a regression introduced with the argparse migration.
2017-11-27 14:01:30 -08:00
rht
e55898850a Replace optparse with argparse in remaining tools.
Tweaked by tabbott to fix various bugs with the usage output.
2017-11-21 21:34:38 -08:00
Vishnu Ks
766511e519 actions: Mark all messages as read when user unsubscribes from stream.
This fixes a bug where, when a user is unsubscribed from a stream,
they might have unread messages on that stream leak.  While it might
seem to be a minor problem, it can cause significant problems for
computing the `unread_msgs` data structures, since it means we need to
add an extra filter for whether the user is still subscribed, either
in the backend or in the UI.

Fixes #7095.
2017-11-21 20:09:17 -08:00
Greg Price
ae901309fc certbot: Control auto-renew with a zulip.conf setting.
This causes the cron job to run only when a Zulip-managed certbot
install is actually set up.

Inside `install`, zulip.conf doesn't yet exist when we run
setup-certbot, so we write the setting later.  But we also give
setup-certbot the ability to write the setting itself, so that we
can recommend it in instructions for adopting certbot in an
existing Zulip installation.
2017-11-15 21:50:41 -08:00
Greg Price
dacf65b301 certbot: Move verification webroot under /var/lib/zulip .
If we were making an old-fashioned webroot where hand-written static
HTML files went, somewhere under `/srv` would be most appropriate.
Here, this webroot is really more of an implementation detail of the
certbot set up by the Zulip installer/packaging, containing transient
state.  So someplace under `/var` is appropriate, and specifically
under `/var/lib/zulip` in order to properly namespace it.

For background on `/var/www` and friends, see the top couple of answers
on
  https://unix.stackexchange.com/questions/47436/why-web-server-var-www
2017-11-15 21:50:41 -08:00
Tim Abbott
2afc3b9e50 certbot: Move path to /usr/local/sbin.
[greg: fixed typo bug]
2017-11-15 21:50:41 -08:00
rht
97ec56276c certbot: Add certbot renew cron job to puppet.
Tweaked by tabbott to use the proper command.
2017-11-15 21:50:41 -08:00
Tim Abbott
94554c65da certbot: Modify nginx configuration to support automated renewal. 2017-11-08 12:32:26 -08:00
Tim Abbott
62bb465896 puppet: Modify lb0 nginx configuration. 2017-11-08 12:32:26 -08:00
rht
ccf2792c1c refactor: Remove six.moves.configparser import. 2017-11-07 10:51:44 -08:00
rht
549a26860f refactor: Remove six.moves.range import. 2017-11-07 10:46:42 -08:00
Tim Abbott
acb0b6ee43 process_fts_updates: Fix pgroonga search in development.
For some reason, we have the USING_PGROONGA setting on in development
right now.  I'm going to disable that in another commit to match what
we're doing in production, but we'll still want that setting to work
in development.

The problem here was that process_fts_updates only attempted to read
the USING_PGROONGA setting from a /etc/zulip/zulip.conf source, and
thus would just not be updating the index in development.
2017-10-30 11:44:04 -07:00
Tim Abbott
0d1194811f mypy: Remove ignores for a few typeshed bugs fixed upstream. 2017-10-27 17:09:00 -07:00
Tim Abbott
89b97e7480 python3: Fix REMOTE_USER Apache configuration for Python 3.
We were previously still installing the Python 2 version of mod_wsgi,
which of course doesn't work and can't use the Zulip virtualenv.
2017-10-24 11:48:14 -07:00
Tim Abbott
15f3d5f714 nginx: Fix some buggy gzip compression configuration.
We weren't compressing SVG, while at the same time were incorrectly
compressing octet-stream (Which meant downloading .tar.gz files in
Chrome would get double-compressed).
2017-10-20 11:01:28 -07:00
Tim Abbott
540cae19a8 puppet: Remove obsolete sparkle configuration.
Sparkle was the auto-update system used by the legacy desktop app.  We
haven't been capable of using it for auto-update in years, so there's
no reason to keep around the configuration.

The new Electron app uses a different system anyway.
2017-10-19 16:35:55 -07:00
rht
b57289aacd py3: Remove all `from __future__ import print_function.
Except for these files:
- tools/linter_lib/*
- tools/lib
- tools/lister.py
2017-10-18 12:07:19 -07:00
rht
2f3ae84e5a py3: Remove all __future__ import division. 2017-10-17 23:09:12 -07:00
Tim Abbott
6a5cb0e48c puppet: Make problems with Zephyr mirroring pageable.
Generally this indicates sending messages is completely broken.
2017-10-12 00:16:32 -07:00
rht
de30400fc5 pg_backup_and_purge.py: Remove .py extension. 2017-10-08 15:32:43 -07:00
Tim Abbott
47c5aae5b2 log2zulip: Enforce using python 3 in cron job.
We aren't guaranteed to have the Zulip dependencies installed on
Python 2.
2017-10-06 16:37:17 -07:00
Tim Abbott
0f2e4a55c0 soft deactivation: Shorten management command name.
This command is really for soft deactivation; there's just an undo
feature.
2017-10-06 08:48:43 -07:00
Tim Abbott
f2055397c1 nagios: Update apache configuration to be generated.
Since this is basically just stock Apache configuration for Nagios
with a hostname put in, we can just fetch the hostname from our
configuration.
2017-10-05 21:51:29 -07:00
Tim Abbott
3af01bed85 puppet: Simplify zulip_ops nginx configuration.
Whatever dist/ functionality this had in 2014 is now served by
zulip.org, and since this serves as a sample, it should be as simple
as possible.

Previously, this was more cluttered than it needed to be.
2017-10-05 21:17:57 -07:00
Tim Abbott
e6e7bcf6e1 nagios: Move camo_check_url into configuration. 2017-10-05 21:09:24 -07:00
Tim Abbott
82cee4fde9 check_worker_memory: Increase limits for what leaking means.
The old limits were such that these would sometimes oscillated too
high and page erroneously.  The purpose of this check is to prevent
large memory leaks, and will still achieve that with a higher limit.
2017-10-05 20:54:03 -07:00
Tim Abbott
1c453fdf2a puppet: Add redis_password file for Nagios.
This allows the Nagios user to access redis without having full access
to the redis system.  Ideally, this would eventually use a password
that only has statistics read access, but I'm not sure redis supports
that.
2017-10-05 20:42:07 -07:00
Tim Abbott
13a36d9af3 puppet: Make old redis_tunnel configuration usable.
This old puppet configuration was never really used, and regardless
hardcoded an ancient zulip.net hostname.  We fix this to use the
zulipconf system to get the host domain (though not, at present, the
hostname).
2017-10-05 20:40:22 -07:00
Tim Abbott
96c3014da0 nagios: Automate configuration of outgoing email with msmtp.
Now we no longer need to check in a bunch of hostnames in order to
configure Nagios.
2017-10-05 20:29:47 -07:00
Tim Abbott
5b4c260c3f puppet: Add munin apache auth configuration.
This is completely stock configuration, and seems to be required for
munin to run properly.
2017-10-05 20:17:12 -07:00
Tim Abbott
ba7be4102e puppet: Update munin tunnels configuration to use zulipconf.
This eliminates another old hardcoding of zulip.net.
2017-10-05 20:14:43 -07:00
Tim Abbott
162eaf8917 nagios: Modify check for swap to allow no swap.
If a machine is configured with no swap intentationally, that
shouldn't be a Nagios problem.  This alert is intended to flag
machines which are swapping.
2017-10-05 20:07:44 -07:00
Tim Abbott
80a16bf873 nagios: Fix path to source zulip_nagios.cfg.
Arguably, we should make this a symlink, but it's probably a good idea
to have every change in the production Nagios configuration go through
the zulip-puppet-apply diff experience.
2017-10-05 20:06:50 -07:00
Tim Abbott
886a8853ac nagios: Move server-specific config into hostgroups.
These new hostgroups exist so we can eliminate explicit references to
individual hosts in services.cfg.
2017-10-05 20:06:48 -07:00
Tim Abbott
b6ce9583a9 nagios: Fetch list of hosts from zulip.conf.
This makes this much more configurable and much less hardcoded.
2017-10-05 20:06:30 -07:00