Commit Graph

204 Commits

Author SHA1 Message Date
Anders Kaseorg
924d530292 ruff: Fix N813 camelcase imported as lowercase.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-11-16 09:29:11 -08:00
Lauryn Menard
aa796af0a8 upload: Remove mimetype url parameter in get_file_info.
This `mimetype` parameter was introduced in c4fa29a and its last
usage removed in 5bab2a3. This parameter was undocumented in the
OpenAPI endpoint documentation for `/user_uploads`, therefore
there shouldn't be client implementations that rely on it's
presence.

Removes the `request.GET` call for the `mimetype` parameter and
replaces it by getting the `content_type` value from the file,
which is an instance of Django's `UploadedFile` class and stores
that file metadata as a property.

If that returns `None` or an empty string, then we try to guess
the `content_type` from the filename, which is the same as the
previous behaviour when `mimetype` was `None` (which we assume
has been true since it's usage was removed; see above).

If unable to guess the `content_type` from the filename, we now
fallback to "application/octet-stream", instead of an empty string
or `None` value.

Also, removes the specific test written for having `mimetype` as
a url parameter in the request, and replaces it with a test that
covers when we try to guess `content_type` from the filename.
2022-08-08 16:06:09 -07:00
Anders Kaseorg
2508b579a6 upload: Replace boto3.Session with boto3.session.Session.
boto3-stubs seems to have dropped the former for some reason.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-07-30 06:46:34 -07:00
Zixuan James Li
e68fb802f4 upload: Replace File with UploadedFile.
Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2022-07-29 14:09:12 -07:00
Zixuan James Li
f42465319b upload: Refactor file size out of get_file_info.
We have already checked the size of the file in `upload_file_backend`.
This is the only caller of `upload_message_image_from_request`, and
indirectly the only caller of `get_file_info`. There is no need to
retrieve this information again.

Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2022-07-29 14:09:12 -07:00
Alex Vandiver
0830d5e7ea emoji: Write "original" file before attempting resize.
Resizing emoji can fail, especially for animated GIFs; in such cases,
it is useful to have the original data on hand, to be able to dissect
the failure.
2022-07-06 17:20:40 -07:00
Zixuan James Li
40b4da8f58 emoji: Add none checks for uploaded file name.
Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2022-06-23 19:25:48 -07:00
Anders Kaseorg
e230ea2598 actions: Split out zerver.actions.uploads.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-04-14 17:14:32 -07:00
Alex Vandiver
4f93b4b6e4 uploads: Skip the outgoing proxy if S3_KEY is unset.
When the credentials are provided by dint of being run on an EC2
instance with an assigned Role, we must be able to fetch the instance
metadata from IMDS -- which is precisely the type of internal-IP
request that Smokescreen denies.

While botocore supports a `proxies` argument to the `Config` object,
this is not actually respected when making the IMDS queries; only the
environment variables are read from.  See
https://github.com/boto/botocore/issues/2644

As such, implement S3_SKIP_PROXY by monkey-patching the
`botocore.utils.should_bypass_proxies` function, to allow requests to
IMDS to be made without Smokescreen impeding them.

Fixes #20715.
2022-03-24 10:21:35 -07:00
Alex Vandiver
abed174b12 uploads: Add an endpoint which forces a download.
This is most useful for images hosted in S3, which are otherwise
always displayed in the browser.
2022-03-22 15:05:02 -07:00
Alex Vandiver
95892a5ed3 emoji: Support animated PNGs. 2022-03-15 12:47:21 -07:00
Alex Vandiver
96a5fa9d78 upload: Fix resizing non-animated images.
5dab6e9d31 began honoring the list of disposals for every frame.
Unfortunately, passing a list of disposals for a non-animated image
raises an exception:
```
  File "zerver/lib/upload.py", line 212, in resize_emoji
    image_data = resize_gif(im, size)
  File "zerver/lib/upload.py", line 165, in resize_gif
    frames[0].save(
  File "[...]/PIL/Image.py", line 2212, in save
    save_handler(self, fp, filename)
  File "[...]/PIL/GifImagePlugin.py", line 605, in _save
    _write_single_frame(im, fp, palette)
  File "[...]/PIL/GifImagePlugin.py", line 506, in _write_single_frame
    _write_local_header(fp, im, (0, 0), flags)
  File "[...]/PIL/GifImagePlugin.py", line 647, in _write_local_header
    disposal = int(im.encoderinfo.get("disposal", 0))
TypeError: int() argument must be a string, a bytes-like object or a
number, not 'list'
```

`check_add_realm_emoji` calls this as:

```
    try:
        is_animated = upload_emoji_image(image_file, emoji_file_name, a
uthor)
        emoji_uploaded_successfully = True
    finally:
        if not emoji_uploaded_successfully:
            realm_emoji.delete()
            return None
        # ...
```

This is equivalent to dropping _all_ exceptions silently.  As such,
Zulip has silently rejected all non-animated images larger than 64x64
since 5dab6e9d31.

Adjust to only pass a single disposal if there are no additional
frames.  Add a test for non-animated images, which requires also
fixing the incidental bug that all GIF images were being recorded as
animated, regardless of if they had more than 1 frame or not.
2022-02-17 12:19:47 -08:00
Tim Abbott
edf16cb861 upload: Mark migration code as nocoverage.
We want to add tests here, but it's more important to fix main failing
CI.
2022-02-11 10:37:58 -08:00
Mateusz Mandera
30ac291eba emoji: Add migration to reupload all RealmEmoji and ensure .author.
Fixes #19732.
2022-02-10 17:45:31 -08:00
Mateusz Mandera
fe61243cfe upload: Don't access emoji_file.name attribute upload_emoji_image.
The S3 backend implementation of upload_emoji_image was accessing
emoji_file.name - which is redundant because emoji_file_name already
gets passed in and can be used, and an object of type IO[bytes] may not
have the .name attribute. Spotted by @Fingel.
2022-02-09 11:26:39 -08:00
Mateusz Mandera
4102816240 upload: Pass the target realm to create_attachment.
The target realm was not being passed to create_attachment in
upload_message_file implementations. This was a bug in the edge-case of
cross-realm messages - in particular, causing a bug in the email
gateway:
When an email with an attachment is sent, the message is mirrored to
Zulip with Email Gateway Bot as the message sender and uploader of the
attachment. Due to the realm not being passed to create_attachment, the
Attachment would get created with .realm being the system bot realm,
making the attachment inaccessible under some conditions due to failing
the following condition check (that's expected to pass, provided that
the .realm is set correctly):
```
    if (
        attachment.is_realm_public
        and attachment.realm == user_profile.realm
        and user_profile.can_access_public_streams()
    ):
        # Any user in the realm can access realm-public files
        return True
```
2022-01-27 17:23:44 -08:00
Anders Kaseorg
78e54a0d7a python: Replace deprecated jinja2.utils.Markup with markupsafe.Markup.
Fixes “DeprecationWarning: 'jinja2.Markup' is deprecated and will be
removed in Jinja 3.1. Import 'markupsafe.Markup' instead.”

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-01-13 14:22:48 -08:00
Tim Abbott
22b5e105e6 upload: Remove incorrect animated GIF asserts.
GIF files can be `.GIF`, and also we determine the file format by
inspecting the image data, so there's no reason to have this
assertion.

(The code for serving still images does not rely on the file being a
GIF.)
2021-12-16 16:13:00 -08:00
Anders Kaseorg
58920affd4 python: Remove re.UNICODE flag (redundant in Python 3).
https://docs.python.org/3/library/re.html#re.A

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-10-22 13:42:29 -07:00
Anders Kaseorg
3bd3173b1f avatar: Remove ?x=x kludge.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-10-14 12:47:43 -07:00
Riken Shah
8c31e6f96e emoji: Add backend changes to support still image for animated emojis.
Now, when we add a custom animated emoji to the realm
we also save a still image of it (1st frame of the gif). So
we can avoid showing an animated emoji every time.
2021-09-12 07:13:04 +00:00
rht
6ff659d199 upload: Extract generate_message_upload_path helper.
This helper will let us avoid copying this logic in the data import
code path.
2021-09-02 16:31:08 -07:00
PIG208
04f5f25478 typing: Replace File with IO[bytes]. 2021-08-20 06:02:28 -07:00
Anders Kaseorg
1bdb7b1141 mypy: Add boto3-stubs.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-08-09 20:32:19 -07:00
Anders Kaseorg
5c90522e69 mypy: Add types-Pillow.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-08-09 20:32:19 -07:00
Anders Kaseorg
14f0594795 upload: Replace exif_rotate with Pillow exif_transpose.
Fixes #18599.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-08-09 20:32:19 -07:00
Anders Kaseorg
ad5f0c05b5 python: Remove default "utf8" argument for encode(), decode().
Partially generated by pyupgrade.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-08-02 15:53:52 -07:00
PIG208
7d1c475f69 typing: Use assertions for function arguments.
Utilize the assert_is_not_None helper to eliminate errors of
'Argument x to "Foo" has incompatible type "Optional[Bar]"...'
2021-07-26 14:48:45 -07:00
Anders Kaseorg
fb3ddf50d4 python: Fix mypy no_implicit_reexport errors.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-07-16 14:02:31 -07:00
Mateusz Mandera
b9a8fb4453 upload: Deduplicate logic for public upload url creation.
get_public_upload_root_url and construct_public_upload_url_base were
both doing basically the same thing in the same. We deduplicate this,
making them share the same code, using the approach from
get_public_upload_root_url of using urljoin.

Using a format string is not a great idea, as it doesn't handle the case
of the URL already having parts that will be interpreted as format
string metacharacters. On the downside, this approach negatively affects
performance:

```
...: s = time.time()
...: for i in range(0, 250):
...:     r = u.get_public_upload_url("foo")
...: print(time.time()-s)
0.020366191864013672
```

up from 0.001 before this change.
2021-07-02 08:05:53 -07:00
Mateusz Mandera
85e19b2bde upload: Use URL manipulation for get_public_upload_url logic.
This is much faster than calling generate_presigned_url each time.

```
In [3]: t = time.time()
   ...: for i in range(250):
   ...:     x = u.get_public_upload_url("foo")
   ...: print(time.time()-t)
0.0010945796966552734
```
2021-06-22 09:35:56 -07:00
Mateusz Mandera
e883ab057f upload: Cache the boto client to improve performance.
Fixes #18915

This was very slow, causing performance issues. After investigating,
generate_presigned_url is the cheap part of this, but the
session.client() call is expensive - so that's what we should cache.

Before the change:
```
In [4]: t = time.time()
   ...: for i in range(250):
   ...:     x = u.get_public_upload_url("foo")
   ...: print(time.time()-t)
6.408717393875122
```

After:
```
In [4]: t = time.time()
   ...: for i in range(250):
   ...:     x = u.get_public_upload_url("foo")
   ...: print(time.time()-t)
0.48990607261657715
```

This is not good enough to avoid doing something ugly like replacing
generate_presigned_url with some manual URL manipulation, but it's a
helpful structure that we may find useful with further refactoring.
2021-06-22 09:35:19 -07:00
Alex Vandiver
721546dfc0 subdomains: Extend "static" to include resources hosted on S3.
This causes avatars and emoji which are hosted by Zulip in S3 (or
compatible) servers to no longer go through camo.  Routing these
requests through camo does not add any privacy benefit (as the request
logs there go to the Zulip admins regardless), and may break emoji
imported from Slack before 1bf385e35f,
which have `application/octet-stream` as their stored Content-Type.
2021-06-08 15:28:10 -07:00
Tim Abbott
9f2daeee45 upload: Use get_public_upload_url for export tarballs too.
This deduplicates the code so that we now just have one function for
constructing S3 URLs.
2021-05-27 23:26:45 -07:00
ryanreh99
5a4aecfc40 s3 uploads: Refactor to access objects via get_public_upload_url.
Our current logic only allows S3 block storage providers whose
upload URL matches with the format used by AWS. This also allows
other styles such as the "virtual host" format used by Oracle cloud.

Fixes #17762.
2021-05-27 23:26:42 -07:00
Mateusz Mandera
6a8586e989 upload: Mention new difference between sanitize_name and slugify.
In Django 3.2 slugify strips trailing dashes and underscores:
0382ecfe02

sanitize_name doesn't so this difference should be documented like the
others.
2021-05-03 08:36:22 -07:00
Mateusz Mandera
389c7bdb5a upload: Fix docstring and regex in sanitize_name regarding underscore.
Underscore character is already covered by \w, so _ in the regex is
redundant. Also the docstring is mildly incorrect - underscore already
is an allowed character by django's slugify (and always was) for the
aforementioned reason.
2021-05-03 08:36:22 -07:00
Ganesh Pawar
830f1fa8c5 upload: Refactor and add tests for ensure_avatar_image in upload.py.
`ensure_basic_avatar_image` and `ensure_medium_avatar_image` are
essentially the same thing, except a size parameter.
So, refactor them into a single function.

This doesn't introduce any functional changes.
2021-04-29 21:18:13 -07:00
Anders Kaseorg
e7ed907cf6 python: Convert deprecated Django ugettext alias to gettext.
django.utils.translation.ugettext is a deprecated alias of
django.utils.translation.gettext as of Django 3.0, and will be removed
in Django 4.0.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-04-15 18:01:34 -07:00
Anders Kaseorg
6e4c3e41dc python: Normalize quotes with Black.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-02-12 13:11:19 -08:00
Anders Kaseorg
11741543da python: Reformat with Black, except quotes.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-02-12 13:11:19 -08:00
ryanreh99
dfa7ce5637 uploads: Support non-AWS S3-compatible server.
Boto3 does not allow setting the endpoint url from
the config file. Thus we create a django setting
variable (`S3_ENDPOINT_URL`) which is passed to
service clients and resources of `boto3.Session`.

We also update the uploads-backend documentation
and remove the config environment variable as now
AWS supports the SIGv4 signature format by default.
And the region name is passed as a parameter instead
of creating a config file for just this value.

Fixes #16246.
2020-10-28 21:59:07 -07:00
ryanreh99
1c370a975c refactor: Access a bucket by calling zerver.lib.uploads.get_bucket. 2020-10-28 21:52:08 -07:00
Anders Kaseorg
72d6ff3c3b docs: Fix more capitalization issues.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-10-23 11:46:55 -07:00
Cody Piersall
5dab6e9d31 emoji-upload: Fix transparency issues on GIF emoji upload.
This preserves the alpha layer on GIF images that need to be resized
before being uploaded.  Two important changes occur here:

1. The new frame is a *copy* of the original image, which preserves the
   GIF info.
2. The disposal method of the original GIF is preserved.  This
   essentially determines what state each frame of the GIF starts from
   when it is drawn; see PIL's docs:
   https://pillow.readthedocs.io/en/stable/handbook/image-file-formats.html#saving
   for more info.

This resolves some but not all of the test cases in #16370.
2020-10-11 16:23:07 -07:00
akshatdalton
52c411df8a emoji: Add padding around the gif on GIF emoji upload.
Replaced ImageOps.fit by ImageOps.pad, in zerver/lib/upload.py, which
returns a sized and padded version of the image, expanded to fill the
requested aspect ratio and size.
Fixes part of #16370.
2020-10-06 17:28:02 -07:00
Anders Kaseorg
faf600e9f5 urls: Remove unused URL names and shorten others.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-09-22 10:46:28 -07:00
Anders Kaseorg
ddf8ec33df upload: Strip leading slash from deleted S3 export paths.
Previously, S3UploadBackend.delete_export_tarball failed to strip the
leading ‘/’ from the export path.  This mistake is now caught by Moto
1.3.15.  I expect it caused deletion failures in the real S3, although
I haven’t verified this.

We store export_path in the audit log with a leading ‘/’, but the
actual S3 keys do not have a leading ‘/’.  Changing either system
would require a migration.  So the new convention is that the
variables named ‘export_path’ have a leading ‘/’, while variables
named ‘path_id’ or ‘key’ do not.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-09-13 20:59:09 -07:00
Anders Kaseorg
b7b7475672 python: Use standard secrets module to generate random tokens.
There are three functional side effects:

• Correct an insignificant but mathematically offensive bias toward
repeated characters in generate_api_key introduced in commit
47b4283c4b4c70ecde4d3c8de871c90ee2506d87; its entropy is increased
from 190.52864 bits to 190.53428 bits.

• Use the base32 alphabet in confirmation.models.generate_key; its
entropy is reduced from 124.07820 bits to the documented 120 bits, but
now it uses 1 syscall instead of 24.

• Use the base32 alphabet in get_bigbluebutton_url; its entropy is
reduced from 51.69925 bits to 50 bits, but now it uses 1 syscall
instead of 10.

(The base32 alphabet is A-Z 2-7.  We could probably replace all of
these with plain secrets.token_urlsafe, since I expect most callers
can handle the full urlsafe_b64 alphabet A-Z a-z 0-9 - _ without
problems.)

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-09-09 15:52:57 -07:00
Anders Kaseorg
f91d287447 python: Pre-fix a few spots for better Black formatting.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-09-03 17:51:09 -07:00