Commit Graph

59 Commits

Author SHA1 Message Date
Alex Vandiver
085d137871 upload: Rename attachment_vips_source, as it's not just for vips_source. 2025-07-29 10:01:40 -07:00
Anders Kaseorg
6006ba4c44 upload: Make closest_thumbnail_format take an HttpRequest.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2025-07-17 14:37:11 -07:00
Anders Kaseorg
432ebd9a36 tests: Consume more streaming responses.
Fixes warnings like “ResourceWarning: unclosed file <_io.FileIO
name='/srv/zulip/var/4fc6d348-ccfe-43fa-8a2e-5b4ff5a15a66/test-backend/run_2304691/worker_7/test_uploads/files/2/d9/KYUsAKJ_g45NA7CojVbRW0C4/zulip.jpeg'
mode='rb' closefd=True>” with warnings enabled.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2025-02-05 13:16:12 -08:00
Alex Vandiver
98362de185 models: Add content_type to ImageAttachment.
This means that only ImageAttachment row needs to be fetched, and
removes the need to pass around an extra parameter.  This
denormalization is safe, since in general Attachment rows are
read-only, so we are not concerned with drift between the Attachment
and ImageAttachment tables.

We cannot make content_type non-null, since while the both the
`content_type` column in Attachment and populating that from requests
predates the ImageAttachment table, we have both backfilled
ImageAttachment rows to consider, and imports may also leave files
with no `content_type`.  Any backfill of currently-null `content_type`
values will thus need to update both tables.

This change fixes a race condition when importing. ImageAttachment
rows are imported before rendering Messages, which are both before
importing Attachment rows; if the thumbnailing finished after the
Message was imported but before Attachment rows were imported, then
the re-rendering step would not know the image's content-type.
2025-01-31 14:29:57 -08:00
Alex Vandiver
8bd8a33dd2 thumbnail: Show the first few frames of large animated images.
71406ac767 switched the IMAGE_BOMB_TOTAL_PIXELS cutoff for what
images we preview to include the number of frames in the calculation.
While accurate to the implementation (thumbnailing a 1k-frame animation is
prohibitive, even a small resolutions), this was a behaviour change
from without thumbnailing -- animated gifs did not display inline at
all anymore.

Switch to thumbnailing as many frames as we can fit into a pixel-based
animated thumbnailing threshold, with a minimum of three (to be able
to convey that the image is actually animated).  Smaller-resolution
images will hence get more frames in their preview.  This also allows
the standard animate-on-hover or always-animate behaviour to be true
to their configurations, without confusing edge cases.

Fixes: #32609.
2025-01-15 09:56:19 -08:00
Alex Vandiver
230bae17bb thumbnail: Generate a transcoded high-res version of HEIC/TIFF images.
If the content-type of the image is not in INLINE_MIME_TYPES, then we
do not expect browsers to be able to display it.  This behaviour is
particularly confusing because the thumbnail will render properly,
since that will be in the more widely-supported WebP format, but the
lightbox will show a broken image.

In these cases, generate a high-resolution (4032x3024) "thumbnail"
which clients can choose to use instead.  This thumbnail format is not
in the listed in the server's advertised thumbnail size list, because
it is not reliably generated for every image.

The transcoded thumbnail format is set on the `img` tag if it is
generated, and the original content-type is always passed to the
client, so it can decide how or if to render the original image.  This
content-type is as the _original uploader_ specified it, so may be
incorrect.

The transcoded image is not animated, even if the original was.  HEIC
files can nominally be animated, but in testing libvips was not able
to correctly recognize them as such.  TIFF files are parsed as being
"animated," with one page per frame; this is of dubious utility, so
we merely transcode the first page.  Always generating a static
transcoded image serves to also limit the computational time spent.

THUMBNAIL_OUTPUT_FORMATS is switched to be a tuple to ensure that it
is not accidentally mutated.
2025-01-09 09:10:28 -08:00
Alex Vandiver
9a1f78db22 thumbnail: Support checking for images from streaming sources.
We may not always have trivial access to all of the bytes of the
uploaded file -- for instance, if the file was uploaded previously, or
by some other process.  Downloading the entire image in order to check
its headers is an inefficient use of time and bandwidth.

Adjust `maybe_thumbnail` and dependencies to potentially take a
`pyvips.Source` which supports streaming data from S3 or disk.  This
allows making the ImageAttachment row, if deemed appropriate, based on
only a few KB of data, and not the entire image.
2024-09-17 12:51:30 -07:00
Anders Kaseorg
91ade25ba3 python: Simplify with str.removeprefix, str.removesuffix.
These are available in Python ≥ 3.9.
https://docs.python.org/3/library/stdtypes.html#str.removeprefix

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2024-09-03 12:30:16 -07:00
Alex Vandiver
c726d2ec01 thumbnail: Do not Camo old thumbor URLs; serve images directly.
Providing a signed Camo URL for arbitrary URLs opened the server up to
being an open redirector.  Return 403 if the URL is not a user upload,
and the backend image if it is.  Since we do not have ImageAttachment
rows for uploads at a time we wrote `/thumbnail?` URLs, return the
full-size content.
2024-07-24 16:04:34 -07:00
Alex Vandiver
e4a8304f57 thumbnail: Store the post-orientation-transformation dimensions.
Modern browsers respect the EXIF orientation information of images,
applying rotation and/or mirroring as specified in those tags.  The
the `width="..."` and `height="..."` tags are to size the image
_after_ applying those orientation transformations.

The `.width` and `.height` properties of libvips' images are _before_
any transformations are applied.  Since we intend to use these to hint
to rendering clients the size that the image should be _rendered at_,
change to storing (and providing to clients) the dimensions of the
rendered image, not the stored bytes.
2024-07-24 09:56:42 -07:00
Alex Vandiver
e7ac62aad7 tests: Add a test that the thumb is actually the expected size. 2024-07-24 09:56:42 -07:00
Alex Vandiver
71406ac767 thumbnail: Factor frames into account for IMAGE_BOMB_TOTAL_PIXELS. 2024-07-21 18:41:59 -07:00
Alex Vandiver
aacf28f7e3 test_classes: Extract a thumbnailing output format helper. 2024-07-21 18:41:59 -07:00
Anders Kaseorg
d574200423 tests: Consume streaming responses.
Fixes warnings like “ResourceWarning: unclosed file <_io.FileIO
name='/srv/zulip/var/044e5d44-87aa-4c43-abbb-28a144fa6654/test-backend/run_1238680/worker_0/test_uploads/files/thumbnail/2/1e/jmUuDhQC8WlaSRCuc0zQyx7D/img.tif/100x75.webp'
mode='rb' closefd=True>” with warnings enabled.

deque(…, 0) is an efficient way to consume an iterator documented at
https://docs.python.org/3/library/itertools.html#itertools-recipes
under consume.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2024-07-19 09:20:56 -07:00
Alex Vandiver
4351cc5914 thumbnail: Move get_image_thumbnail_path and split_thumbnail_path. 2024-07-18 13:50:28 -07:00
Alex Vandiver
6c624805ce upload: Return the closest-rendered thumbnail. 2024-07-16 13:22:15 -07:00
Alex Vandiver
d121a80b78 upload: Serve thumbnailed images. 2024-07-16 13:22:15 -07:00
Alex Vandiver
2e38f426f4 upload: Generate thumbnails when images are uploaded.
A new table is created to track which path_id attachments are images,
and for those their metadata, and which thumbnails have been created.
Using path_id as the effective primary key lets us ignore if the
attachment is archived or not, saving some foreign key messes.

A new worker is added to observe events when rows are added to this
table, and to generate and store thumbnails for those images in
differing sizes and formats.
2024-07-16 13:22:15 -07:00
Alex Vandiver
7aa5bb233d tests: Clarify tests in test_thumbnail.py. 2024-07-16 13:22:15 -07:00
Vector73
d21ee6fa23 api: Deprecate uri and add url parameter in "/user_uploads" endpoint. 2024-07-14 22:32:36 -07:00
Alex Vandiver
544d3df057 thumbnail: Stop applying MAX_EMOJI_GIF_FILE_SIZE_BYTES before resizing.
b14a33c659 attempted to make the 128k limit apply _after_ resizing,
but left this check, which examines the pre-resized image size.
2024-07-12 13:26:47 -07:00
Alex Vandiver
fa28e3aa0f tests: Split up test_upload.EmojiTest into test_thumbnail. 2024-07-12 13:26:47 -07:00
Alex Vandiver
0dbe111ab3 test_helpers: Switch add/remove_ratelimit to a contextmanager.
Failing to remove all of the rules which were added causes action at a
distance with other tests.  The two methods were also only used by
test code, making their existence in zerver.lib.rate_limiter clearly
misplaced.

This fixes one instance of a mis-balanced add/remove, which caused
tests to start failing if run non-parallel and one more anonymous
request was added within a rate-limit-enabled block.
2023-06-12 12:55:27 -07:00
AcKindle3
b0ef8f0822 test: Replace occurences of uri with url.
In all the tests files, replaced all occurences of `uri` with `url`
appeared in comments, local variablles, function names and their callers.
2023-04-08 16:27:55 -07:00
Zixuan James Li
c34ac1fcd4 typing: Access url via key "Location" instead of attribute "url".
This is a part of #18777.

Signed-off-by: Zixuan James Li <359101898@qq.com>
2022-05-30 11:59:47 -07:00
Anders Kaseorg
6331a314d4 Correctly hyphenate “non-”.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-04-27 22:10:31 -07:00
Anders Kaseorg
d58fece832 Correctly hyphenate “web-public”.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-04-27 22:10:31 -07:00
Aman Agrawal
b799ec32b0 upload: Allow rate limited access to spectators for uploaded files.
We allow spectators access to uploaded files in web public streams
but rate limit the daily requests to 1000 per file by default.
2022-03-24 10:50:00 -07:00
Anders Kaseorg
405bc8dabf requirements: Remove Thumbor.
Thumbor and tc-aws have been dragging their feet on Python 3 support
for years, and even the alphas and unofficial forks we’ve been running
don’t seem to be maintained anymore.  Depending on these projects is
no longer viable for us.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-05-06 20:07:32 -07:00
Anders Kaseorg
6e4c3e41dc python: Normalize quotes with Black.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-02-12 13:11:19 -08:00
Anders Kaseorg
11741543da python: Reformat with Black, except quotes.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-02-12 13:11:19 -08:00
Anders Kaseorg
4e9d587535 python: Pass query parameters as a dict when making GET requests.
This provides automatic URL-encoding.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-10-27 13:47:02 -07:00
Anders Kaseorg
31d0141a30 python: Close opened files.
Fixes various instances of ‘ResourceWarning: unclosed file’ with
python -Wd.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-10-26 12:31:30 -07:00
Anders Kaseorg
72d6ff3c3b docs: Fix more capitalization issues.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-10-23 11:46:55 -07:00
Anders Kaseorg
61d0417e75 python: Replace ujson with orjson.
Fixes #6507.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-08-11 10:55:12 -07:00
Anders Kaseorg
768f9f93cd docs: Capitalize Markdown consistently.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-08-11 10:23:06 -07:00
Mohit Gupta
08e74558a9 refactor: Rename remaining bugdown word to markdown in .py files.
This commit is part of series of commits aimed at renaming bugdown to
markdown.
2020-06-29 15:03:20 -07:00
Anders Kaseorg
5dc9b55c43 python: Manually convert more percent-formatting to f-strings.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-06-14 23:27:22 -07:00
Anders Kaseorg
365fe0b3d5 python: Sort imports with isort.
Fixes #2665.

Regenerated by tabbott with `lint --fix` after a rebase and change in
parameters.

Note from tabbott: In a few cases, this converts technical debt in the
form of unsorted imports into different technical debt in the form of
our largest files having very long, ugly import sequences at the
start.  I expect this change will increase pressure for us to split
those files, which isn't a bad thing.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-06-11 16:45:32 -07:00
Anders Kaseorg
69730a78cc python: Use trailing commas consistently.
Automatically generated by the following script, based on the output
of lint with flake8-comma:

import re
import sys

last_filename = None
last_row = None
lines = []

for msg in sys.stdin:
    m = re.match(
        r"\x1b\[35mflake8    \|\x1b\[0m \x1b\[1;31m(.+):(\d+):(\d+): (\w+)", msg
    )
    if m:
        filename, row_str, col_str, err = m.groups()
        row, col = int(row_str), int(col_str)

        if filename == last_filename:
            assert last_row != row
        else:
            if last_filename is not None:
                with open(last_filename, "w") as f:
                    f.writelines(lines)

            with open(filename) as f:
                lines = f.readlines()
            last_filename = filename
        last_row = row

        line = lines[row - 1]
        if err in ["C812", "C815"]:
            lines[row - 1] = line[: col - 1] + "," + line[col - 1 :]
        elif err in ["C819"]:
            assert line[col - 2] == ","
            lines[row - 1] = line[: col - 2] + line[col - 1 :].lstrip(" ")

if last_filename is not None:
    with open(last_filename, "w") as f:
        f.writelines(lines)

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-06-11 16:04:12 -07:00
Anders Kaseorg
67e7a3631d python: Convert percent formatting to Python 3.6 f-strings.
Generated by pyupgrade --py36-plus.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-06-10 15:02:09 -07:00
Anders Kaseorg
c734bbd95d python: Modernize legacy Python 2 syntax with pyupgrade.
Generated by `pyupgrade --py3-plus --keep-percent-format` on all our
Python code except `zthumbor` and `zulip-ec2-configure-interfaces`,
followed by manual indentation fixes.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-04-09 16:43:22 -07:00
Steve Howell
1b16693526 tests: Limit email-based logins.
We now have this API...

If you really just need to log in
and not do anything with the actual
user:

    self.login('hamlet')

If you're gonna use the user in the
rest of the test:

    hamlet = self.example_user('hamlet')
    self.login_user(hamlet)

If you are specifically testing
email/password logins (used only in 4 places):

    self.login_by_email(email, password)

And for failures uses this (used twice):

    self.assert_login_failure(email)
2020-03-11 17:10:22 -07:00
Steve Howell
c235333041 test performance: Pass in users to api_* helpers.
This reduces query counts in some cases, since
we no longer need to look up the user again. In
particular, it reduces some noise when we
count queries for O(N)-related tests.

The query count is usually reduced by 2 per
API call.  We no longer need to look up Realm
and UserProfile.  In most cases we are saving
these lookups for the whole tests, since we
usually already have the `user` objects for
other reasons.  In a few places we are simply
moving where that query happens within the
test.

In some places I shorten names like `test_user`
or `user_profile` to just be `user`.
2020-03-11 14:18:29 -07:00
Anders Kaseorg
8e37862b69 CVE-2019-19775: Close open redirect in thumbnail view.
This closes an open redirect vulnerability, one case of which was
found by Graham Bleaney and Ibrahim Mohamed using Pysa.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-12-12 17:29:20 -08:00
Anders Kaseorg
643bd18b9f lint: Fix code that evaded our lint checks for string % non-tuple.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-04-23 15:21:37 -07:00
Anders Kaseorg
3127fb4dbd zerver/tests: Remove unused imports.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
2019-02-02 17:43:03 -08:00
Aditya Bansal
26c6ef1834 thumbnails: Fix bug with use of filters in thumbnail generation.
We used to add sharpen filter for all the image sizes whereas it was
intended for resized images only which would have been smoothened
out a bit by the resize operation.
This unnecessary use of the filter used to result in weird issues
with full size images.
For example: Image located at this url:-
http://arqex.com/wp-content/uploads/2015/02/trees.png
When rendered in full size would have just boundaries visible.
2019-01-04 19:06:01 +05:30
rht
a1ff44a230 refactor: Add a helper function to create s3 buckets.
This refactor makes upgrading boto to boto3 easier.
Based on 43d2f6286c
2018-12-07 13:58:11 -08:00
Aditya Bansal
f90f701f03 camo: Change CAMO_URI setting value for test suite.
This is a preparatory commit which will help us with removing camo.
In the upcoming commits we introduce a new endpoint which is based
out on the setting CAMO_URI. Since camo could have been hosted on
a different server as well from the main Zulip server, this change
will help us realise in tests how that scenerio might be dealt with.
2018-10-26 16:51:54 -07:00