The nginx ‘add_header’ directive doesn’t inherit the way you’d
want (https://trac.nginx.org/nginx/ticket/854), so we need to manually
simulate inheritance using ‘include’, like we previously did with
api_headers.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
Zulip has had a small use of WebSockets (specifically, for the code
path of sending messages, via the webapp only) since ~2013. We
originally added this use of WebSockets in the hope that the latency
benefits of doing so would allow us to avoid implementing a markdown
local echo; they were not. Further, HTTP/2 may have eliminated the
latency difference we hoped to exploit by using WebSockets in any
case.
While we’d originally imagined using WebSockets for other endpoints,
there was never a good justification for moving more components to the
WebSockets system.
This WebSockets code path had a lot of downsides/complexity,
including:
* The messy hack involving constructing an emulated request object to
hook into doing Django requests.
* The `message_senders` queue processor system, which increases RAM
needs and must be provisioned independently from the rest of the
server).
* A duplicate check_send_receive_time Nagios test specific to
WebSockets.
* The requirement for users to have their firewalls/NATs allow
WebSocket connections, and a setting to disable them for networks
where WebSockets don’t work.
* Dependencies on the SockJS family of libraries, which has at times
been poorly maintained, and periodically throws random JavaScript
exceptions in our production environments without a deep enough
traceback to effectively investigate.
* A total of about 1600 lines of our code related to the feature.
* Increased load on the Tornado system, especially around a Zulip
server restart, and especially for large installations like
zulipchat.com, resulting in extra delay before messages can be sent
again.
As detailed in
https://github.com/zulip/zulip/pull/12862#issuecomment-536152397, it
appears that removing WebSockets moderately increases the time it
takes for the `send_message` API query to return from the server, but
does not significantly change the time between when a message is sent
and when it is received by clients. We don’t understand the reason
for that change (suggesting the possibility of a measurement error),
and even if it is a real change, we consider that potential small
latency regression to be acceptable.
If we later want WebSockets, we’ll likely want to just use Django
Channels.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
Previous cleanups (mostly the removals of Python __future__ imports)
were done in a way that introduced leading newlines. Delete leading
newlines from all files, except static/assets/zulip-emoji/NOTICE,
which is a verbatim copy of the Apache 2.0 license.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
Since 204 responses don’t contain a payload body, Content-Type is
neither required nor encouraged (RFC 7231 §3.1.1.5), and ours was
missing a semicolon to boot; Content-Length is expressly
forbidden (RFC 7230 §3.3.2).
Furthermore, these add_header directives were silencing the CORS
headers set in api_headers, because add_header inheritance doesn’t
work the way you think it does, as was known before commit
5614d51afc.
Fixes: #12902.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
Disable TLS 1.0 and TLS 1.1. (We no longer need to support IE8 on
Windows XP.)
Prefer client-selected cipher order. (Now that all enabled ciphers
provide good security, this allows mobile clients lacking AES hardware
acceleration to pick ChaCha20 for better performance.)
Disable session tickets. (Mozilla discourages them based on
https://www.imperialviolet.org/2013/06/27/botchingpfs.html.)
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
This was only used in Ubuntu 14.04 Trusty.
Removing this also finally lets us simplify our security model
discussion of uploaded files.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
Lengthen the session timeout and enlarge the session cache. Upgrade
Diffie-Hellman parameters from fixed 1024-bit to custom 2048-bit.
Enable OCSP stapling.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
The overall goal of this change is to fix an issue where on Ubuntu
Trusty, we were accidentally overriding the configuration to serve
uploads from disk with the regular expressions for adding access
control headers.
However, while investigating this, it became clear that we could
considerably simplify the mental energy required to understand this
system by making the uploads-route file be unconditionally available
and included from `zulip-include/app` (which means the zulip_ops code
can share behavior here).
We also move the Access-Control-Allow-* headers to a separate include
file, to avoid duplicating it in 5 places. Fixing this duplication
discovered a potential bug in the settings used for Tornado, where
DELETE was not allowed on a route that definitely expects DELETE.
Fixes#11758.
Update the list of ciphers that nginx will use to the current
Mozilla recommended ones.
These are Intermediate compatibility ones suitable for clients
running anything newer than Firefox 1, Chrome 1, IE 7, Opera 5
and Safari 1. Modern compatibility is not suitable as it excludes
Andriod 4 which is still seen on ~1% of traffic.
More info: https://wiki.mozilla.org/Security/Server_Side_TLS
This should be a nice performance improvement for browsers that
support it.
We can't yet enabled this in the Zulip on-premise nginx configuration,
because that still has to support Trusty.
This fixes a bug where our API routes for uploaded files (where we
need to use a consistent URL between session auth and API auth) were
not properly configured to pass through the API authentication headers
(and otherwise provide REST endpoint settings).
In particular, this prevented the Zulip mobile apps from being able to
access authenticated image files using these URLs.
This commits adds the necessary puppet configuration and
installer/upgrade code for installing and managing the thumbor service
in production. This configuration is gated by the 'thumbor.pp'
manifest being enabled (which is not yet the default), and so this
commit should have no effect in a default Zulip production environment
(or in the long term, in any Zulip production server that isn't using
thumbor).
Credit for this effort is shared by @TigorC (who initiated the work on
this project), @joshland (who did a great deal of work on this and got
it working during PyCon 2017) and @adnrs96, who completed the work.
While there are legitimate use cases for embedded Zulip in an iFrame,
they're rare, and it's more important to prevent this category of
attack by default.
Sysadmins can switch this to a whitelist when they want to use frames.
Apparently, our nginx configuration's use of "localhost", combined
with the default in modern Linux of having localhost resolve to both
the IPv4 and IPv6 addresses on a given machine, resulted in `nginx`
load-balancing requests to a given Zulip server between the IPv4 and
IPv6 addresses. This, in turn, resulted in irrelevant 502 errors
problems every few minutes on the /events endpoints for some clients.
Disabling IPv6 on the server resolved the problem, as does simply
spelling localhost as 127.0.0.1 for the `nginx` upstreams that we
declare for proxying to non-Django services on localhost.
This option is intended to support situations like a quick Docker
setup where doing HTTPS adds more setup overhead than it's worth.
It's not intended to be used in actual production environments.
Our recent addition of Content-Security-Policy to the file uploads
backend broke in-browser previews of PDFs.
The content-types change in the last commit fixed loading PDFs for
most users; but the result was ugly, because e.g. Chrome would put the
PDF previewer into a frame (so there were 2 left scrollbars).
There were two changes needed to fix this:
* Loading the style to use the plugin. We corrected this by adding
`style-src 'self' 'unsafe-inline';`
* Loading the plugin. Our CSP blocked loading the PDf viewer plugin.
To correct this, we add object-src 'self', and then limit the
plugin-type to just the one for application/pdf.
We verified this new CSP using https://csp-evaluator.withgoogle.com/
in addition to manual testing.
Previously, user-uploaded PDF files were not properly rendered by
browsers with the local uploads backend, because we weren't setting
the correct content-type.
This adds a basic Content-Security-Policy for user-uploaded avatars
served by the LOCAL_UPLOADS backend.
I think this is for now an unnecessary follow-up to
d608a9d315, but is worth doing because
we may later change what can be uploaded in the avatars directory.
This adds a basic Content-Security-Policy for user-uploaded files with local uploads.
While over time, we plan to add CSP for the main site as well, this CSP is particularly
important for the local-uploads backend, which often shares a domain with the main site.
From here on we start to authenticate uploaded file request before
serving this files in production. This involves allowing NGINX to
pass on these file requests to Django for authentication and then
serve these files by making use on internal redirect requests having
x-accel-redirect field. The redirection on requests and loading
of x-accel-redirect param is handled by django-sendfile.
NOTE: This commit starts to authenticate these requests for Zulip
servers running platforms either Ubuntu Xenial (16.04) or above.
Fixes: #320 and #291 partially.
If we were making an old-fashioned webroot where hand-written static
HTML files went, somewhere under `/srv` would be most appropriate.
Here, this webroot is really more of an implementation detail of the
certbot set up by the Zulip installer/packaging, containing transient
state. So someplace under `/var` is appropriate, and specifically
under `/var/lib/zulip` in order to properly namespace it.
For background on `/var/www` and friends, see the top couple of answers
on
https://unix.stackexchange.com/questions/47436/why-web-server-var-www
We weren't compressing SVG, while at the same time were incorrectly
compressing octet-stream (Which meant downloading .tar.gz files in
Chrome would get double-compressed).
Camo is a caching image proxy, used in Zulip to avoid mixed-content
warnings by proxying HTTP image content over HTTPS. We've been using
it in zulip.com production for years; this change makes it available
in standalone Zulip deployments.
Apparently, previously nginx was only compressing text/html content.
This should result in a substantial savings in network traffic -- some
quick testing I did found it cut the total data transferred for
loading a logged-in zulip.com instance from 3MB to 1.2MB.