Commit Graph

172 Commits

Author SHA1 Message Date
Alex Vandiver
21cedabbdf subdomains: Extend "static" to include resources hosted on S3.
This causes avatars and emoji which are hosted by Zulip in S3 (or
compatible) servers to no longer go through camo.  Routing these
requests through camo does not add any privacy benefit (as the request
logs there go to the Zulip admins regardless), and may break emoji
imported from Slack before 1bf385e35f,
which have `application/octet-stream` as their stored Content-Type.
2021-06-08 15:28:32 -07:00
Tim Abbott
43be62c7ef upload: Use get_public_upload_url for export tarballs too.
This deduplicates the code so that we now just have one function for
constructing S3 URLs.
2021-05-27 23:30:00 -07:00
ryanreh99
7b15ce71c2 s3 uploads: Refactor to access objects via get_public_upload_url.
Our current logic only allows S3 block storage providers whose
upload URL matches with the format used by AWS. This also allows
other styles such as the "virtual host" format used by Oracle cloud.

Fixes #17762.
2021-05-27 23:29:59 -07:00
Mateusz Mandera
6a8586e989 upload: Mention new difference between sanitize_name and slugify.
In Django 3.2 slugify strips trailing dashes and underscores:
0382ecfe02

sanitize_name doesn't so this difference should be documented like the
others.
2021-05-03 08:36:22 -07:00
Mateusz Mandera
389c7bdb5a upload: Fix docstring and regex in sanitize_name regarding underscore.
Underscore character is already covered by \w, so _ in the regex is
redundant. Also the docstring is mildly incorrect - underscore already
is an allowed character by django's slugify (and always was) for the
aforementioned reason.
2021-05-03 08:36:22 -07:00
Ganesh Pawar
830f1fa8c5 upload: Refactor and add tests for ensure_avatar_image in upload.py.
`ensure_basic_avatar_image` and `ensure_medium_avatar_image` are
essentially the same thing, except a size parameter.
So, refactor them into a single function.

This doesn't introduce any functional changes.
2021-04-29 21:18:13 -07:00
Anders Kaseorg
e7ed907cf6 python: Convert deprecated Django ugettext alias to gettext.
django.utils.translation.ugettext is a deprecated alias of
django.utils.translation.gettext as of Django 3.0, and will be removed
in Django 4.0.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-04-15 18:01:34 -07:00
Anders Kaseorg
6e4c3e41dc python: Normalize quotes with Black.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-02-12 13:11:19 -08:00
Anders Kaseorg
11741543da python: Reformat with Black, except quotes.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-02-12 13:11:19 -08:00
ryanreh99
dfa7ce5637 uploads: Support non-AWS S3-compatible server.
Boto3 does not allow setting the endpoint url from
the config file. Thus we create a django setting
variable (`S3_ENDPOINT_URL`) which is passed to
service clients and resources of `boto3.Session`.

We also update the uploads-backend documentation
and remove the config environment variable as now
AWS supports the SIGv4 signature format by default.
And the region name is passed as a parameter instead
of creating a config file for just this value.

Fixes #16246.
2020-10-28 21:59:07 -07:00
ryanreh99
1c370a975c refactor: Access a bucket by calling zerver.lib.uploads.get_bucket. 2020-10-28 21:52:08 -07:00
Anders Kaseorg
72d6ff3c3b docs: Fix more capitalization issues.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-10-23 11:46:55 -07:00
Cody Piersall
5dab6e9d31 emoji-upload: Fix transparency issues on GIF emoji upload.
This preserves the alpha layer on GIF images that need to be resized
before being uploaded.  Two important changes occur here:

1. The new frame is a *copy* of the original image, which preserves the
   GIF info.
2. The disposal method of the original GIF is preserved.  This
   essentially determines what state each frame of the GIF starts from
   when it is drawn; see PIL's docs:
   https://pillow.readthedocs.io/en/stable/handbook/image-file-formats.html#saving
   for more info.

This resolves some but not all of the test cases in #16370.
2020-10-11 16:23:07 -07:00
akshatdalton
52c411df8a emoji: Add padding around the gif on GIF emoji upload.
Replaced ImageOps.fit by ImageOps.pad, in zerver/lib/upload.py, which
returns a sized and padded version of the image, expanded to fill the
requested aspect ratio and size.
Fixes part of #16370.
2020-10-06 17:28:02 -07:00
Anders Kaseorg
faf600e9f5 urls: Remove unused URL names and shorten others.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-09-22 10:46:28 -07:00
Anders Kaseorg
ddf8ec33df upload: Strip leading slash from deleted S3 export paths.
Previously, S3UploadBackend.delete_export_tarball failed to strip the
leading ‘/’ from the export path.  This mistake is now caught by Moto
1.3.15.  I expect it caused deletion failures in the real S3, although
I haven’t verified this.

We store export_path in the audit log with a leading ‘/’, but the
actual S3 keys do not have a leading ‘/’.  Changing either system
would require a migration.  So the new convention is that the
variables named ‘export_path’ have a leading ‘/’, while variables
named ‘path_id’ or ‘key’ do not.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-09-13 20:59:09 -07:00
Anders Kaseorg
b7b7475672 python: Use standard secrets module to generate random tokens.
There are three functional side effects:

• Correct an insignificant but mathematically offensive bias toward
repeated characters in generate_api_key introduced in commit
47b4283c4b4c70ecde4d3c8de871c90ee2506d87; its entropy is increased
from 190.52864 bits to 190.53428 bits.

• Use the base32 alphabet in confirmation.models.generate_key; its
entropy is reduced from 124.07820 bits to the documented 120 bits, but
now it uses 1 syscall instead of 24.

• Use the base32 alphabet in get_bigbluebutton_url; its entropy is
reduced from 51.69925 bits to 50 bits, but now it uses 1 syscall
instead of 10.

(The base32 alphabet is A-Z 2-7.  We could probably replace all of
these with plain secrets.token_urlsafe, since I expect most callers
can handle the full urlsafe_b64 alphabet A-Z a-z 0-9 - _ without
problems.)

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-09-09 15:52:57 -07:00
Anders Kaseorg
f91d287447 python: Pre-fix a few spots for better Black formatting.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-09-03 17:51:09 -07:00
Clara Dantas
05bf72a75c attachments: Add is_web_public field.
This commit adds the is_web_public field in the AbstractAttachment
class. This is useful when validating user access to the attachment,
as otherwise we would have to make a query in the db to check if
that attachment was sent in a message in a web-public stream or not.
2020-08-12 17:26:03 -07:00
Tim Abbott
6130a61be0 export: Only print .s with percent_callback to console.
The S3 data export tool's upload code path uses this nice boto
callback feature for showing a progress bar, which is nice for the
management command.  It's spammy/broken in production and the backend
tests, so we change percent_callback to be a parameter passed in so
that it can only be used in the contexts where it makes sense.
2020-07-30 13:14:53 -07:00
Tim Abbott
0b6ebb4fbb upload: Remove unused get_realm_for_filename. 2020-06-18 17:55:13 -07:00
Tim Abbott
5962d1ea14 upload: Avoid fetching bucket objects repeatedly.
This takes of advantage of saving the bucket object on the
UploadBackend class to deduplicate a bunch of redundant code getting
buckets.
2020-06-18 17:55:13 -07:00
Wyatt Hoodes
2ef791fc21 upload.py: Support using non S3-providers.
With #14378, we regressed back to the state of that
prior to 7e0ea61b00.

We fix this by getting our avatar bucket on
object initialization, and use the appropriate means
of gathering the network location for the urls.

Fixes #14484.
2020-06-18 17:55:13 -07:00
Anders Kaseorg
365fe0b3d5 python: Sort imports with isort.
Fixes #2665.

Regenerated by tabbott with `lint --fix` after a rebase and change in
parameters.

Note from tabbott: In a few cases, this converts technical debt in the
form of unsorted imports into different technical debt in the form of
our largest files having very long, ugly import sequences at the
start.  I expect this change will increase pressure for us to split
those files, which isn't a bad thing.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-06-11 16:45:32 -07:00
Anders Kaseorg
69730a78cc python: Use trailing commas consistently.
Automatically generated by the following script, based on the output
of lint with flake8-comma:

import re
import sys

last_filename = None
last_row = None
lines = []

for msg in sys.stdin:
    m = re.match(
        r"\x1b\[35mflake8    \|\x1b\[0m \x1b\[1;31m(.+):(\d+):(\d+): (\w+)", msg
    )
    if m:
        filename, row_str, col_str, err = m.groups()
        row, col = int(row_str), int(col_str)

        if filename == last_filename:
            assert last_row != row
        else:
            if last_filename is not None:
                with open(last_filename, "w") as f:
                    f.writelines(lines)

            with open(filename) as f:
                lines = f.readlines()
            last_filename = filename
        last_row = row

        line = lines[row - 1]
        if err in ["C812", "C815"]:
            lines[row - 1] = line[: col - 1] + "," + line[col - 1 :]
        elif err in ["C819"]:
            assert line[col - 2] == ","
            lines[row - 1] = line[: col - 2] + line[col - 1 :].lstrip(" ")

if last_filename is not None:
    with open(last_filename, "w") as f:
        f.writelines(lines)

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-06-11 16:04:12 -07:00
Graham Bleaney
461d5b1a3e pysa: Introduce sanitizers, models, and inline marking safe.
This commit adds three `.pysa` model files: `false_positives.pysa`
for ruling out false positive flows with `Sanitize` annotations,
`req_lib.pysa` for educating pysa about Zulip's `REQ()` pattern for
extracting user input, and `redirects.pysa` for capturing the risk
of open redirects within Zulip code. Additionally, this commit
introduces `mark_sanitized`, an identity function which can be used
to selectively clear taint in cases where `Sanitize` models will not
work. This commit also puts `mark_sanitized` to work removing known
false postive flows.
2020-06-11 12:57:49 -07:00
Anders Kaseorg
67e7a3631d python: Convert percent formatting to Python 3.6 f-strings.
Generated by pyupgrade --py36-plus.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-06-10 15:02:09 -07:00
Anders Kaseorg
444fbbf964 python: Whitespace fixes from autopep8.
Generated by autopep8.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-06-08 15:21:30 -07:00
whoodes
cea7d713cd requirements: Upgrade boto to boto3.
Fixes: #3490

Contributors include:

Author:    whoodes <hoodesw@hawaii.edu>
Author:    zhoufeng1989 <zhoufengloop@gmail.com>
Author:    rht <rhtbot@protonmail.com>
2020-05-26 23:18:07 -07:00
Anders Kaseorg
bdc365d0fe logging: Pass format arguments to logging.
https://docs.python.org/3/howto/logging.html#optimization

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-05-02 10:18:02 -07:00
Anders Kaseorg
fead14951c python: Convert assignment type annotations to Python 3.6 style.
This commit was split by tabbott; this piece covers the vast majority
of files in Zulip, but excludes scripts/, tools/, and puppet/ to help
ensure we at least show the right error messages for Xenial systems.

We can likely further refine the remaining pieces with some testing.

Generated by com2ann, with whitespace fixes and various manual fixes
for runtime issues:

-    invoiced_through: Optional[LicenseLedger] = models.ForeignKey(
+    invoiced_through: Optional["LicenseLedger"] = models.ForeignKey(

-_apns_client: Optional[APNsClient] = None
+_apns_client: Optional["APNsClient"] = None

-    notifications_stream: Optional[Stream] = models.ForeignKey('Stream', related_name='+', null=True, blank=True, on_delete=CASCADE)
-    signup_notifications_stream: Optional[Stream] = models.ForeignKey('Stream', related_name='+', null=True, blank=True, on_delete=CASCADE)
+    notifications_stream: Optional["Stream"] = models.ForeignKey('Stream', related_name='+', null=True, blank=True, on_delete=CASCADE)
+    signup_notifications_stream: Optional["Stream"] = models.ForeignKey('Stream', related_name='+', null=True, blank=True, on_delete=CASCADE)

-    author: Optional[UserProfile] = models.ForeignKey('UserProfile', blank=True, null=True, on_delete=CASCADE)
+    author: Optional["UserProfile"] = models.ForeignKey('UserProfile', blank=True, null=True, on_delete=CASCADE)

-    bot_owner: Optional[UserProfile] = models.ForeignKey('self', null=True, on_delete=models.SET_NULL)
+    bot_owner: Optional["UserProfile"] = models.ForeignKey('self', null=True, on_delete=models.SET_NULL)

-    default_sending_stream: Optional[Stream] = models.ForeignKey('zerver.Stream', null=True, related_name='+', on_delete=CASCADE)
-    default_events_register_stream: Optional[Stream] = models.ForeignKey('zerver.Stream', null=True, related_name='+', on_delete=CASCADE)
+    default_sending_stream: Optional["Stream"] = models.ForeignKey('zerver.Stream', null=True, related_name='+', on_delete=CASCADE)
+    default_events_register_stream: Optional["Stream"] = models.ForeignKey('zerver.Stream', null=True, related_name='+', on_delete=CASCADE)

-descriptors_by_handler_id: Dict[int, ClientDescriptor] = {}
+descriptors_by_handler_id: Dict[int, "ClientDescriptor"] = {}

-worker_classes: Dict[str, Type[QueueProcessingWorker]] = {}
-queues: Dict[str, Dict[str, Type[QueueProcessingWorker]]] = {}
+worker_classes: Dict[str, Type["QueueProcessingWorker"]] = {}
+queues: Dict[str, Dict[str, Type["QueueProcessingWorker"]]] = {}

-AUTH_LDAP_REVERSE_EMAIL_SEARCH: Optional[LDAPSearch] = None
+AUTH_LDAP_REVERSE_EMAIL_SEARCH: Optional["LDAPSearch"] = None

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-04-22 11:02:32 -07:00
Mateusz Mandera
4018dcb8e7 upload: Include filename at the end of temporary access URLs. 2020-04-20 10:25:48 -07:00
Tim Abbott
0ccc0f02ce upload: Support requesting a temporary unauthenticated URL.
This is be useful for the mobile and desktop apps to hand an uploaded
file off to the system browser so that it can render PDFs (Etc.).

The S3 backend implementation is simple; for the local upload backend,
we use Django's signing feature to simulate the same sort of 60-second
lifetime token.

Co-Author-By: Mateusz Mandera <mateusz.mandera@protonmail.com>
2020-04-17 09:08:10 -07:00
Tim Abbott
7f582b3861 upload: Increase the lifetime of signed upload URLs.
For some mobile use cases, 15 seconds is potentially too short for a
busy+slow device to open a browser and fetch the URL.  60 seconds is
plenty, and doesn't carry a materially increased security risk.
2020-04-17 09:08:10 -07:00
Anders Kaseorg
c734bbd95d python: Modernize legacy Python 2 syntax with pyupgrade.
Generated by `pyupgrade --py3-plus --keep-percent-format` on all our
Python code except `zthumbor` and `zulip-ec2-configure-interfaces`,
followed by manual indentation fixes.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-04-09 16:43:22 -07:00
Anders Kaseorg
7ff9b22500 docs: Convert many http URLs to https.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-03-26 21:35:32 -07:00
Vishnu Ks
af3a37b58b upload: Refactor out realm_avatar_and_logo_path function. 2020-02-03 14:09:05 -08:00
Tim Abbott
7ccc8373e2 bugdown: Fix logic for extracting attachment path_id.
In 3892a8afd8, we restructured the
system for managing uploaded files to a much cleaner model where we
just do parsing inside bugdown.

That new model had potentially buggy handling of cases around both
relative URLs and URLS starting with `realm.host`.

We address this by further rewriting the handling of attachments to
avoid regular expressions entirely, instead relying on urllib for
parsing, and having bugdown output `path_id` values, so that there's
no need for any conversions between formats outside bugdowm.

The check_attachment_reference_change function for processing message
updates is significantly simplified in the process.

The new check on the hostname has the side effect of requiring us to
fix some previously weird/buggy test data.

Co-Author-By: Anders Kaseorg <anders@zulipchat.com>
Co-Author-By: Rohitt Vashishtha <aero31aero@gmail.com>
2019-12-12 20:30:26 -08:00
Rohitt Vashishtha
3fbb050216 messages: Remove dependence on regex for claiming attachments.
This commit wraps up the work to remove basic regex based parsing
of messages to handle attachment claiming/unclaiming. We now use
the more dependable Bugdown processor to find potential links and
only operate upon those links instead of parsing the full message
content again.
2019-12-11 11:03:49 -08:00
Tim Abbott
7e0ea61b00 upload: Support S3-compatible S3 hosting providers.
Previously, we were hardcoding the domain s3.amazonaws.com.  Given
that we already have an interface for configuring the host in
/etc/zulip/boto.cfg (which in turn, automatically configures boto), we
just need to actually use the value configured in boto for what S3
hostname to use.

We don't have tests for this new use case, in part because they're
likely annoying to write with `moto` and there hasn't been a huge
amount of demand for it.  Since this doesn't regress existing S3
backend support, it seems worth merging.
2019-09-24 17:17:21 -07:00
Tim Abbott
b8b0ae362c uploads: Only initialize S3 connection once in __init__.
This should be a mild performance optimization for the S3
authentication backend, since we aren't initializing unnecessary
duplicate connections.
2019-09-24 17:15:44 -07:00
Tim Abbott
96726c00ce export: Fix broken URLs in UI with S3 backend.
Apparently, the Zulip notifications (and resulting emails) were
correct, but the download links inside the Zulip UI were incorrectly
not including S3 prefix on the URL, making them not work.

While we're at this, we rewrite the somewhat convoluted previous
system for formatting the data export output.
2019-09-24 13:56:49 -07:00
Anders Kaseorg
780ecb672b CVE-2019-16216: Fix MIME type validation.
* Whitelist a small number of image/ types to be served as
  non-attachments.
* Serve the file using the type that we validated rather than relying
  on an independent guess to match.

This issue can lead to a stored XSS security vulnerability for older
browsers that don't support Content-Security-Policy.

It primarily affects servers using Zulip's local file uploads backend
for servers running Ubuntu 16.04 Xenial or newer; the legacy local
file upload backend for (now EOL) Ubuntu 14.04 Trusty was not affected
and it has limited impact for the S3 upload backend (which uses an
unprivileged S3 bucket domain to serve files).

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-09-11 15:46:36 -07:00
neiljp (Neil Pilgrim)
5f673f5820 mypy: Remove type ignores after boto stub improvements. 2019-08-06 23:24:56 -07:00
Wyatt Hoodes
22481f63bf upload: Fix typing for key variable. 2019-07-29 15:23:10 -07:00
Wyatt Hoodes
b1900c406a public_export: Add logic for deleting the export tarball.
The path to the uploaded tarball is reconstructed via the relative url and
removed with the canonical methods in `upload.py`.
2019-07-26 15:52:03 -07:00
Wyatt Hoodes
e331a758c3 python: Migrate open statements to use with.
This is low priority, but it's nice to be consistently using the best
practice pattern.

Fixes: #12419.
2019-07-20 15:48:52 -07:00
Wyatt Hoodes
af4eb8c0d5 export/upload: Refactor tarball upload logic to upload_backend.
The conditional block containing the tarball upload logic for both S3
and local uploads was deconstructed and moved to the more appropriate
location within `zerver/lib/upload.py`.
2019-07-03 15:40:35 -07:00
vinitS101
a6eda858d0 ldap: Fix avatar sync not working with the S3 backend.
This fixes an issue that caused LDAP synchronization to fail for
avatars.  The problem occurred due to the lack of a 'name' attribute
on the BytesIO object that we pass to the upload backend (which is
only used in the S3 backend for computing Content-Type).

Fixes #12411.
2019-06-13 15:12:13 -07:00
Anders Kaseorg
61982d9d47 uploads: Revert "Url encoded name of the file should be an ascii."
This reverts commit fd9dd51d16 (#1815).

The issue described does not exist in Python 3, where urllib.parse now
_only_ accepts (Unicode) str and does the right thing with it.  The
workaround was not being triggered and would have failed if it were.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-04-22 22:28:39 -07:00