Commit Graph

184 Commits

Author SHA1 Message Date
Riken Shah
8c31e6f96e emoji: Add backend changes to support still image for animated emojis.
Now, when we add a custom animated emoji to the realm
we also save a still image of it (1st frame of the gif). So
we can avoid showing an animated emoji every time.
2021-09-12 07:13:04 +00:00
rht
6ff659d199 upload: Extract generate_message_upload_path helper.
This helper will let us avoid copying this logic in the data import
code path.
2021-09-02 16:31:08 -07:00
PIG208
04f5f25478 typing: Replace File with IO[bytes]. 2021-08-20 06:02:28 -07:00
Anders Kaseorg
1bdb7b1141 mypy: Add boto3-stubs.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-08-09 20:32:19 -07:00
Anders Kaseorg
5c90522e69 mypy: Add types-Pillow.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-08-09 20:32:19 -07:00
Anders Kaseorg
14f0594795 upload: Replace exif_rotate with Pillow exif_transpose.
Fixes #18599.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-08-09 20:32:19 -07:00
Anders Kaseorg
ad5f0c05b5 python: Remove default "utf8" argument for encode(), decode().
Partially generated by pyupgrade.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-08-02 15:53:52 -07:00
PIG208
7d1c475f69 typing: Use assertions for function arguments.
Utilize the assert_is_not_None helper to eliminate errors of
'Argument x to "Foo" has incompatible type "Optional[Bar]"...'
2021-07-26 14:48:45 -07:00
Anders Kaseorg
fb3ddf50d4 python: Fix mypy no_implicit_reexport errors.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-07-16 14:02:31 -07:00
Mateusz Mandera
b9a8fb4453 upload: Deduplicate logic for public upload url creation.
get_public_upload_root_url and construct_public_upload_url_base were
both doing basically the same thing in the same. We deduplicate this,
making them share the same code, using the approach from
get_public_upload_root_url of using urljoin.

Using a format string is not a great idea, as it doesn't handle the case
of the URL already having parts that will be interpreted as format
string metacharacters. On the downside, this approach negatively affects
performance:

```
...: s = time.time()
...: for i in range(0, 250):
...:     r = u.get_public_upload_url("foo")
...: print(time.time()-s)
0.020366191864013672
```

up from 0.001 before this change.
2021-07-02 08:05:53 -07:00
Mateusz Mandera
85e19b2bde upload: Use URL manipulation for get_public_upload_url logic.
This is much faster than calling generate_presigned_url each time.

```
In [3]: t = time.time()
   ...: for i in range(250):
   ...:     x = u.get_public_upload_url("foo")
   ...: print(time.time()-t)
0.0010945796966552734
```
2021-06-22 09:35:56 -07:00
Mateusz Mandera
e883ab057f upload: Cache the boto client to improve performance.
Fixes #18915

This was very slow, causing performance issues. After investigating,
generate_presigned_url is the cheap part of this, but the
session.client() call is expensive - so that's what we should cache.

Before the change:
```
In [4]: t = time.time()
   ...: for i in range(250):
   ...:     x = u.get_public_upload_url("foo")
   ...: print(time.time()-t)
6.408717393875122
```

After:
```
In [4]: t = time.time()
   ...: for i in range(250):
   ...:     x = u.get_public_upload_url("foo")
   ...: print(time.time()-t)
0.48990607261657715
```

This is not good enough to avoid doing something ugly like replacing
generate_presigned_url with some manual URL manipulation, but it's a
helpful structure that we may find useful with further refactoring.
2021-06-22 09:35:19 -07:00
Alex Vandiver
721546dfc0 subdomains: Extend "static" to include resources hosted on S3.
This causes avatars and emoji which are hosted by Zulip in S3 (or
compatible) servers to no longer go through camo.  Routing these
requests through camo does not add any privacy benefit (as the request
logs there go to the Zulip admins regardless), and may break emoji
imported from Slack before 1bf385e35f,
which have `application/octet-stream` as their stored Content-Type.
2021-06-08 15:28:10 -07:00
Tim Abbott
9f2daeee45 upload: Use get_public_upload_url for export tarballs too.
This deduplicates the code so that we now just have one function for
constructing S3 URLs.
2021-05-27 23:26:45 -07:00
ryanreh99
5a4aecfc40 s3 uploads: Refactor to access objects via get_public_upload_url.
Our current logic only allows S3 block storage providers whose
upload URL matches with the format used by AWS. This also allows
other styles such as the "virtual host" format used by Oracle cloud.

Fixes #17762.
2021-05-27 23:26:42 -07:00
Mateusz Mandera
6a8586e989 upload: Mention new difference between sanitize_name and slugify.
In Django 3.2 slugify strips trailing dashes and underscores:
0382ecfe02

sanitize_name doesn't so this difference should be documented like the
others.
2021-05-03 08:36:22 -07:00
Mateusz Mandera
389c7bdb5a upload: Fix docstring and regex in sanitize_name regarding underscore.
Underscore character is already covered by \w, so _ in the regex is
redundant. Also the docstring is mildly incorrect - underscore already
is an allowed character by django's slugify (and always was) for the
aforementioned reason.
2021-05-03 08:36:22 -07:00
Ganesh Pawar
830f1fa8c5 upload: Refactor and add tests for ensure_avatar_image in upload.py.
`ensure_basic_avatar_image` and `ensure_medium_avatar_image` are
essentially the same thing, except a size parameter.
So, refactor them into a single function.

This doesn't introduce any functional changes.
2021-04-29 21:18:13 -07:00
Anders Kaseorg
e7ed907cf6 python: Convert deprecated Django ugettext alias to gettext.
django.utils.translation.ugettext is a deprecated alias of
django.utils.translation.gettext as of Django 3.0, and will be removed
in Django 4.0.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-04-15 18:01:34 -07:00
Anders Kaseorg
6e4c3e41dc python: Normalize quotes with Black.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-02-12 13:11:19 -08:00
Anders Kaseorg
11741543da python: Reformat with Black, except quotes.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-02-12 13:11:19 -08:00
ryanreh99
dfa7ce5637 uploads: Support non-AWS S3-compatible server.
Boto3 does not allow setting the endpoint url from
the config file. Thus we create a django setting
variable (`S3_ENDPOINT_URL`) which is passed to
service clients and resources of `boto3.Session`.

We also update the uploads-backend documentation
and remove the config environment variable as now
AWS supports the SIGv4 signature format by default.
And the region name is passed as a parameter instead
of creating a config file for just this value.

Fixes #16246.
2020-10-28 21:59:07 -07:00
ryanreh99
1c370a975c refactor: Access a bucket by calling zerver.lib.uploads.get_bucket. 2020-10-28 21:52:08 -07:00
Anders Kaseorg
72d6ff3c3b docs: Fix more capitalization issues.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-10-23 11:46:55 -07:00
Cody Piersall
5dab6e9d31 emoji-upload: Fix transparency issues on GIF emoji upload.
This preserves the alpha layer on GIF images that need to be resized
before being uploaded.  Two important changes occur here:

1. The new frame is a *copy* of the original image, which preserves the
   GIF info.
2. The disposal method of the original GIF is preserved.  This
   essentially determines what state each frame of the GIF starts from
   when it is drawn; see PIL's docs:
   https://pillow.readthedocs.io/en/stable/handbook/image-file-formats.html#saving
   for more info.

This resolves some but not all of the test cases in #16370.
2020-10-11 16:23:07 -07:00
akshatdalton
52c411df8a emoji: Add padding around the gif on GIF emoji upload.
Replaced ImageOps.fit by ImageOps.pad, in zerver/lib/upload.py, which
returns a sized and padded version of the image, expanded to fill the
requested aspect ratio and size.
Fixes part of #16370.
2020-10-06 17:28:02 -07:00
Anders Kaseorg
faf600e9f5 urls: Remove unused URL names and shorten others.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-09-22 10:46:28 -07:00
Anders Kaseorg
ddf8ec33df upload: Strip leading slash from deleted S3 export paths.
Previously, S3UploadBackend.delete_export_tarball failed to strip the
leading ‘/’ from the export path.  This mistake is now caught by Moto
1.3.15.  I expect it caused deletion failures in the real S3, although
I haven’t verified this.

We store export_path in the audit log with a leading ‘/’, but the
actual S3 keys do not have a leading ‘/’.  Changing either system
would require a migration.  So the new convention is that the
variables named ‘export_path’ have a leading ‘/’, while variables
named ‘path_id’ or ‘key’ do not.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-09-13 20:59:09 -07:00
Anders Kaseorg
b7b7475672 python: Use standard secrets module to generate random tokens.
There are three functional side effects:

• Correct an insignificant but mathematically offensive bias toward
repeated characters in generate_api_key introduced in commit
47b4283c4b4c70ecde4d3c8de871c90ee2506d87; its entropy is increased
from 190.52864 bits to 190.53428 bits.

• Use the base32 alphabet in confirmation.models.generate_key; its
entropy is reduced from 124.07820 bits to the documented 120 bits, but
now it uses 1 syscall instead of 24.

• Use the base32 alphabet in get_bigbluebutton_url; its entropy is
reduced from 51.69925 bits to 50 bits, but now it uses 1 syscall
instead of 10.

(The base32 alphabet is A-Z 2-7.  We could probably replace all of
these with plain secrets.token_urlsafe, since I expect most callers
can handle the full urlsafe_b64 alphabet A-Z a-z 0-9 - _ without
problems.)

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-09-09 15:52:57 -07:00
Anders Kaseorg
f91d287447 python: Pre-fix a few spots for better Black formatting.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-09-03 17:51:09 -07:00
Clara Dantas
05bf72a75c attachments: Add is_web_public field.
This commit adds the is_web_public field in the AbstractAttachment
class. This is useful when validating user access to the attachment,
as otherwise we would have to make a query in the db to check if
that attachment was sent in a message in a web-public stream or not.
2020-08-12 17:26:03 -07:00
Tim Abbott
6130a61be0 export: Only print .s with percent_callback to console.
The S3 data export tool's upload code path uses this nice boto
callback feature for showing a progress bar, which is nice for the
management command.  It's spammy/broken in production and the backend
tests, so we change percent_callback to be a parameter passed in so
that it can only be used in the contexts where it makes sense.
2020-07-30 13:14:53 -07:00
Tim Abbott
0b6ebb4fbb upload: Remove unused get_realm_for_filename. 2020-06-18 17:55:13 -07:00
Tim Abbott
5962d1ea14 upload: Avoid fetching bucket objects repeatedly.
This takes of advantage of saving the bucket object on the
UploadBackend class to deduplicate a bunch of redundant code getting
buckets.
2020-06-18 17:55:13 -07:00
Wyatt Hoodes
2ef791fc21 upload.py: Support using non S3-providers.
With #14378, we regressed back to the state of that
prior to 7e0ea61b00.

We fix this by getting our avatar bucket on
object initialization, and use the appropriate means
of gathering the network location for the urls.

Fixes #14484.
2020-06-18 17:55:13 -07:00
Anders Kaseorg
365fe0b3d5 python: Sort imports with isort.
Fixes #2665.

Regenerated by tabbott with `lint --fix` after a rebase and change in
parameters.

Note from tabbott: In a few cases, this converts technical debt in the
form of unsorted imports into different technical debt in the form of
our largest files having very long, ugly import sequences at the
start.  I expect this change will increase pressure for us to split
those files, which isn't a bad thing.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-06-11 16:45:32 -07:00
Anders Kaseorg
69730a78cc python: Use trailing commas consistently.
Automatically generated by the following script, based on the output
of lint with flake8-comma:

import re
import sys

last_filename = None
last_row = None
lines = []

for msg in sys.stdin:
    m = re.match(
        r"\x1b\[35mflake8    \|\x1b\[0m \x1b\[1;31m(.+):(\d+):(\d+): (\w+)", msg
    )
    if m:
        filename, row_str, col_str, err = m.groups()
        row, col = int(row_str), int(col_str)

        if filename == last_filename:
            assert last_row != row
        else:
            if last_filename is not None:
                with open(last_filename, "w") as f:
                    f.writelines(lines)

            with open(filename) as f:
                lines = f.readlines()
            last_filename = filename
        last_row = row

        line = lines[row - 1]
        if err in ["C812", "C815"]:
            lines[row - 1] = line[: col - 1] + "," + line[col - 1 :]
        elif err in ["C819"]:
            assert line[col - 2] == ","
            lines[row - 1] = line[: col - 2] + line[col - 1 :].lstrip(" ")

if last_filename is not None:
    with open(last_filename, "w") as f:
        f.writelines(lines)

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-06-11 16:04:12 -07:00
Graham Bleaney
461d5b1a3e pysa: Introduce sanitizers, models, and inline marking safe.
This commit adds three `.pysa` model files: `false_positives.pysa`
for ruling out false positive flows with `Sanitize` annotations,
`req_lib.pysa` for educating pysa about Zulip's `REQ()` pattern for
extracting user input, and `redirects.pysa` for capturing the risk
of open redirects within Zulip code. Additionally, this commit
introduces `mark_sanitized`, an identity function which can be used
to selectively clear taint in cases where `Sanitize` models will not
work. This commit also puts `mark_sanitized` to work removing known
false postive flows.
2020-06-11 12:57:49 -07:00
Anders Kaseorg
67e7a3631d python: Convert percent formatting to Python 3.6 f-strings.
Generated by pyupgrade --py36-plus.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-06-10 15:02:09 -07:00
Anders Kaseorg
444fbbf964 python: Whitespace fixes from autopep8.
Generated by autopep8.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-06-08 15:21:30 -07:00
whoodes
cea7d713cd requirements: Upgrade boto to boto3.
Fixes: #3490

Contributors include:

Author:    whoodes <hoodesw@hawaii.edu>
Author:    zhoufeng1989 <zhoufengloop@gmail.com>
Author:    rht <rhtbot@protonmail.com>
2020-05-26 23:18:07 -07:00
Anders Kaseorg
bdc365d0fe logging: Pass format arguments to logging.
https://docs.python.org/3/howto/logging.html#optimization

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-05-02 10:18:02 -07:00
Anders Kaseorg
fead14951c python: Convert assignment type annotations to Python 3.6 style.
This commit was split by tabbott; this piece covers the vast majority
of files in Zulip, but excludes scripts/, tools/, and puppet/ to help
ensure we at least show the right error messages for Xenial systems.

We can likely further refine the remaining pieces with some testing.

Generated by com2ann, with whitespace fixes and various manual fixes
for runtime issues:

-    invoiced_through: Optional[LicenseLedger] = models.ForeignKey(
+    invoiced_through: Optional["LicenseLedger"] = models.ForeignKey(

-_apns_client: Optional[APNsClient] = None
+_apns_client: Optional["APNsClient"] = None

-    notifications_stream: Optional[Stream] = models.ForeignKey('Stream', related_name='+', null=True, blank=True, on_delete=CASCADE)
-    signup_notifications_stream: Optional[Stream] = models.ForeignKey('Stream', related_name='+', null=True, blank=True, on_delete=CASCADE)
+    notifications_stream: Optional["Stream"] = models.ForeignKey('Stream', related_name='+', null=True, blank=True, on_delete=CASCADE)
+    signup_notifications_stream: Optional["Stream"] = models.ForeignKey('Stream', related_name='+', null=True, blank=True, on_delete=CASCADE)

-    author: Optional[UserProfile] = models.ForeignKey('UserProfile', blank=True, null=True, on_delete=CASCADE)
+    author: Optional["UserProfile"] = models.ForeignKey('UserProfile', blank=True, null=True, on_delete=CASCADE)

-    bot_owner: Optional[UserProfile] = models.ForeignKey('self', null=True, on_delete=models.SET_NULL)
+    bot_owner: Optional["UserProfile"] = models.ForeignKey('self', null=True, on_delete=models.SET_NULL)

-    default_sending_stream: Optional[Stream] = models.ForeignKey('zerver.Stream', null=True, related_name='+', on_delete=CASCADE)
-    default_events_register_stream: Optional[Stream] = models.ForeignKey('zerver.Stream', null=True, related_name='+', on_delete=CASCADE)
+    default_sending_stream: Optional["Stream"] = models.ForeignKey('zerver.Stream', null=True, related_name='+', on_delete=CASCADE)
+    default_events_register_stream: Optional["Stream"] = models.ForeignKey('zerver.Stream', null=True, related_name='+', on_delete=CASCADE)

-descriptors_by_handler_id: Dict[int, ClientDescriptor] = {}
+descriptors_by_handler_id: Dict[int, "ClientDescriptor"] = {}

-worker_classes: Dict[str, Type[QueueProcessingWorker]] = {}
-queues: Dict[str, Dict[str, Type[QueueProcessingWorker]]] = {}
+worker_classes: Dict[str, Type["QueueProcessingWorker"]] = {}
+queues: Dict[str, Dict[str, Type["QueueProcessingWorker"]]] = {}

-AUTH_LDAP_REVERSE_EMAIL_SEARCH: Optional[LDAPSearch] = None
+AUTH_LDAP_REVERSE_EMAIL_SEARCH: Optional["LDAPSearch"] = None

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-04-22 11:02:32 -07:00
Mateusz Mandera
4018dcb8e7 upload: Include filename at the end of temporary access URLs. 2020-04-20 10:25:48 -07:00
Tim Abbott
0ccc0f02ce upload: Support requesting a temporary unauthenticated URL.
This is be useful for the mobile and desktop apps to hand an uploaded
file off to the system browser so that it can render PDFs (Etc.).

The S3 backend implementation is simple; for the local upload backend,
we use Django's signing feature to simulate the same sort of 60-second
lifetime token.

Co-Author-By: Mateusz Mandera <mateusz.mandera@protonmail.com>
2020-04-17 09:08:10 -07:00
Tim Abbott
7f582b3861 upload: Increase the lifetime of signed upload URLs.
For some mobile use cases, 15 seconds is potentially too short for a
busy+slow device to open a browser and fetch the URL.  60 seconds is
plenty, and doesn't carry a materially increased security risk.
2020-04-17 09:08:10 -07:00
Anders Kaseorg
c734bbd95d python: Modernize legacy Python 2 syntax with pyupgrade.
Generated by `pyupgrade --py3-plus --keep-percent-format` on all our
Python code except `zthumbor` and `zulip-ec2-configure-interfaces`,
followed by manual indentation fixes.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-04-09 16:43:22 -07:00
Anders Kaseorg
7ff9b22500 docs: Convert many http URLs to https.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-03-26 21:35:32 -07:00
Vishnu Ks
af3a37b58b upload: Refactor out realm_avatar_and_logo_path function. 2020-02-03 14:09:05 -08:00
Tim Abbott
7ccc8373e2 bugdown: Fix logic for extracting attachment path_id.
In 3892a8afd8, we restructured the
system for managing uploaded files to a much cleaner model where we
just do parsing inside bugdown.

That new model had potentially buggy handling of cases around both
relative URLs and URLS starting with `realm.host`.

We address this by further rewriting the handling of attachments to
avoid regular expressions entirely, instead relying on urllib for
parsing, and having bugdown output `path_id` values, so that there's
no need for any conversions between formats outside bugdowm.

The check_attachment_reference_change function for processing message
updates is significantly simplified in the process.

The new check on the hostname has the side effect of requiring us to
fix some previously weird/buggy test data.

Co-Author-By: Anders Kaseorg <anders@zulipchat.com>
Co-Author-By: Rohitt Vashishtha <aero31aero@gmail.com>
2019-12-12 20:30:26 -08:00