Messages are rendered outside of a transaction, for performance
reasons, and then sent inside of one. This opens thumbnailing up to a
race where the thumbnails have not yet been written when the message
is rendered, but the message has not been sent when thumbnailing
completes, causing `rewrite_thumbnailed_images` to be a no-op and the
message being left with a spinner which never resolves.
Explicitly lock and use he ImageAttachment data inside the
message-sending transaction, to rewrite the message content with the
latest information about the existing thumbnails.
Despite the thumbnailing worker taking a lock on Message rows to
update them, this does not lead to deadlocks -- the INSERT of the
Message rows happens in a transaction, ensuring that either the
message rending blocks the thumbnailing until the Message row is
created, or that the `rewrite_thumbnailed_images` and Message INSERT
waits until thumbnailing is complete (and updated no Message rows).
(cherry picked from commit 6f20c15ae9)
requests transforms the base urllib3 exceptions into
requests.RequestExceptions -- but only within code that it is
running. When parsing the streaming body in fetch_open_graph_image,
the read itself (inside lxml) may trigger urllib3 to raise its own
timeout error -- which escapes the current catch of
requests.RequestExceptions.
Catch both requests and urllib3 exceptions.
If an invalid timezone (such as +32h) was provided, the
timestamp.astimezone call would throw an exception, causing the
message send to fail. Replace that with a user-facing error.
Fixes#19658.
In #23380, we are changing all occurrences of uri with url in order to
follow the latest URL standard. Previous PRs #25038 and #25045 has
replaced the occurences of uri that has no direct relation with realm.
This commit changes just the model property, which has no API
compatibility concerns.
This timeout strategy using asynchronous exceptions has a number of
safety caveats (read the docstring!!) and should only be used in very
specific circumstances.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
soupsieve is a heavy-weight dependency, and Tornado pulls it in by way
of markdown rendering; since we are only using it for a very simple
process, perform that manually.
Per CSS spec[^1]:
> In quoted <string> url()s, only newlines and the character used to
> quote the string need to be escaped.
[^1]: https://drafts.csswg.org/css-values/#urls
`<time:1234567890123>` causes a "signed integer is greater than
maximum" exception from dateutil.parser; datetime also cannot handle
it ("year 41091 is out of range") but that is a ValueError which is
already caught.
Catch the OverflowError thrown by dateutil.
549dd8a4c4 changed the regex that we build to contain whitespace for
readability, and strip that back out before returning it.
Unfortunately, this also serves to strip out whitespace in the source
linkifier, causing it to not match expected strings.
Revert 549dd8a4c4.
Fixes: #27854.
Now, the topic wildcard mention follows the following
rules:
* If the topic has less than 15 participants , anyone
can use @ topic mentions.
* For more than 15, the org setting 'wildcard_mention_policy'
determines who can use @ topic mentions.
Earlier, topic wildcard mentions followed the same restriction
as stream wildcard mentions, which was incorrect.
Fixes part of #27700.
This reverts commit 091e2f177b.
This version of python_to_js_linkifier fails for at least some real
linkifiers. We'll likely re-introduce this after a bit more debugging.
Since the server-side implementation no longer uses look-ahead
or (more importantly) look-behind, it is possible to exactly implement
in Javascript. This removes a common class which would prevent local
echo.
This requires reworking the topic linking algorithm, to march the
server's as well. The tests and behaviour are adjusted in so doing --
previously, the JS implementation would have linked `#foo` with a
`foo` regex on the linkifier, but the server implementation would not
have.
Previously, when a deactivated user was mentioned, he wasn't
rendered as a Pill. This is because the dataset for validating mentions
only included active users, which is fixed by removing that filter.
To allow only silent mentions of them, an extra is_active property
added to FullNameInfo class, which is populated from the query,
which tells if user is deactivated. This is used to convert any
mentions of them to silent mentions in the backend markdown.
Fixes#26857
When searching for links inside a topic name, the question mark (?)
was used to split the topic. If a URL had a query after the URL
(e.g., "?foo=bar"), then the query was trimmed from the URL.
Removing the question mark from `basic_link_splitter` is sufficient
to fix this issue. The `get_web_link_regex` function then removes
the trailing punctuation if any, including literal question marks.
Fixes#26368.
Fixes#11767.
Previously multi-character emoji sequences weren't matched in the
emoji regex, so we'd convert the characters to separate images,
breaking the intended display.
This change allows us to match the full emoji sequence, and
therefore show the correct image.
This commit completes the notifications part of the @topic
wildcard mention feature.
Notifications are sent to the topic participants for the
@topic wildcard mention.
The active realm emoji are just a subset of all your
realm emoji, so just use a single cache entry per
realm.
Cache misses should be very infrequent per realm.
If a realm has lots of deactivated realm emoji, then
there's a minor expense to deserialize them, but that
is gonna be dwarfed by all the other more expensive
operations in message-send.
I also renamed the two related functions. I erred on
the side of using somewhat verbose names, as we don't
want folks to confuse the two use cases. Fortunately
there are somewhat natural affordances to use one or
the other, and mypy helps too.
Finally, I use realm_id instead of realm in places
where we don't need the full Realm object.
This commit adds a boolean field `mentions_topic_wildcard`
to the `MessageRenderingResult` dataclass.
The field is set to true only if message rendering determines
the message has an actual topic wildcard mention in it (and not,
e.g., topic wildcard mention syntax inside a code block).
The rendered content for topic wildcard mention is
'<span class="topic-mention">{wildcard}</span>'.
The 'topic-mention' class is the identifier for the wildcard
mention being a topic wildcard mention.
We don't use 'data-user-id="*"' and "user-mention" class for
topic wildcard mentions and eventually plan to remove them for
stream wildcard mentions too in a separate mini-project.
Updates find_proper_insertion_index to check for the inline image
classes as matching at least one of the classes in the element's
attrib["class"] so that cases where an inline preview image has
multiple classes, like YouTube video previews, will have the
correct insertion index.
Fixes#26186.