Commit Graph

177 Commits

Author SHA1 Message Date
PieterCK
d5e28bcd28 slack_import: Fix thread conversion condition.
Currently, threads in Slack direct messages will increment the
`thread_counter` variable inside the thread conversion logic. Since we
don't treat thread messages in Slack DMs differently than any other DM,
threads in DM will only falsely increment the thread topic names in
channels.

This adds a condition that checks if the Slack message is a DM or not
before executing the thread conversion logic.
2025-03-25 16:38:21 -07:00
Elsa Kihlberg Gawell
845f0d40e1 import_data: Make sure converted DMs don't have topic name.
Previously, `build_message` sets a message's topic name to the given
topic name, regardless of whether the message was a direct message (DM)
or a group direct message (GDM).

This change adds the `is_private` parameter to `build_message`. If
`is_private` is `True`, the `topic_name` will be overridden to an empty
string (""). Consequently, this also updates the third-party importers
to pass this parameter when calling `build_message`.

Co-authored-by: Pieter CK <pieterceka123@gmail.com>
2025-03-25 16:38:21 -07:00
Alex Vandiver
47d55c4b6f messages: Add an is_channel_message flag. 2025-03-18 09:34:11 -07:00
Anders Kaseorg
88a8087243 ruff: Fix PYI019 Use Self instead of custom TypeVar.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2025-03-14 17:31:50 -07:00
Alex Vandiver
33539568ae slack: Ensure a newline before attachment links.
The content `look![image](https://example.com)` does not render as a
link, nor an image upload (were it to `/user_uploads/...`).  The
`![...](...)` syntax is intended for inline images, but unsupported in
Zulip, and as such does not link or render as _anything_.

Ensure a newline between message content and any attachments.
2025-01-31 14:29:57 -08:00
Mateusz Mandera
f81e514d07 slack: Fetch workspace users from /users.list in the correct manner.
1. Fetching from the `/users.list` endpoint is supposed to use
   pagination. Slack will return at most 1000 results in a single
   request. This means that our Slack import system hasn't worked
   properly for workspaces with more than 1000 users. Users after the
   first 1000 would be considered by our tool as mirror dummies and thus
   created with is_active=False,is_mirror_dummy=True.
   Ref https://api.slack.com/methods/users.list

2. Workspaces with a lot of users, and therefore requiring the use of
   paginated requests to fetch them all, might also get us to run into
   Slack's rate limits, since we'll be doing repeating requests to the
   endpoint.
   Therefore, the API fetch needs to also handle rate limiting errors
   correctly.
   Per, https://api.slack.com/apis/rate-limits#headers, we can just read
   the retry-after header from the rsponse and wait the indicated number
   of seconds before repeating the requests. This is an easy approach to
   implement, so that's what we go with here.
2025-01-24 16:41:53 -08:00
Anders Kaseorg
653b0b0436 ruff: Partially reformat Python with Ruff 0.9 (2025 style).
These are the changes that are backwards compatible with the 2024
style.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2025-01-14 09:42:16 -08:00
PieterCK
a746be807f slack_import: Make check_token_access more flexible.
Previously, the `check_token_access` function had a hardcoded
`required_parameters` variable because it was only used in the Slack
data importer. This commit refactors `required_parameters` into a
function parameter, enabling the function to check a Slack token’s scope
for other purposes, such as Slack webhook integration.

Additionally, this commit changes the Slack API call in
`check_token_access` from `teams.info` to `api.test`. The endpoint is
better suited for this purpose since we're only checking a token’s scope
using the response header here.
2024-12-18 16:11:31 -08:00
PieterCK
f988412394 slack_data_import: Support converting integration bot users.
Currently, we're unable to convert messages from Slack's integration
bots because this message subtype doesn't come from a Slack "user", that
is they don't have a Slack user profile.

This is a preparatory change to support converting Slack's integration
bot messages. This commit artificially creates Slack user data from the
integration bot's "profile" so that we can create a corresponding Zulip
user for them.

Part of #31311.
2024-12-16 13:09:57 -08:00
PieterCK
10946caa3d slack_data_import: Update how Slack user avatars are processed.
Previously, the Slack export converter can only process Slack's avatar
URL from Slack's "ca.slack-edge.com" server, which looks like this:

https://ca.slack-edge.com/T0CDRA6HM3P-U06NABE26M9-1173e04f818e-512

This commit adds support for converting any public downloadable image
URLs.

This is done to support importing Slack's integration bots and their
messages, which typically have PNG type file url:

https://avatars.slack-edge.com/2024-05-01/7057208497908_a4351f6deb91094eac4c_72.png
2024-12-16 13:09:57 -08:00
PieterCK
0d7199b22e data_import: Add migration status file to converted exports.
This commit updates all third-party importer tools (Slack, Mattermost,
and Rocket Chat) in the `zerver/data_import` directory to also output a
migration_status.json file in their output tarball.

This is required because all importable tarball will be checked for
migration compatibility during import.

Fixes #28443.
2024-11-08 15:52:45 -08:00
Mateusz Mandera
420849ff6a slack: Call the correct resize_* function when importing realm icon.
For resizing the icon.png files, we use resize_avatar, not resize_logo.
This is pretty confusing - sure, for icons we use the same function as
for avatars, but we should have a proper name for the function called in
the icon context. So this commit also adds resize_realm_icon, and
changes the calls to resize_avatar in icon contexts to
resize_realm_icon.
2024-11-08 15:43:18 -08:00
PieterCK
6289a551aa data_import: Add email validation to third-party data converters.
This commit makes the third-party data converters check for invalid user
emails. If it finds any, it’ll raise an Exception and show an error
message with all the bad emails listed out.

Fixes: #31783
2024-10-15 16:04:43 -07:00
Alex Vandiver
2c51824b7d slack_import: Strip port from "domain_name".
This lets slack conversions be done on development hosts, which have a
trailing :9991 on their EXTERNAL_HOST; otherwise, we generate fake
emails like `imported-slack-bot@host.name:9991` which fail to
validate.
2024-09-26 12:01:11 -07:00
Alex Vandiver
e68096c907 slack: Protect against zip bombs.
A file which unpacks to more than 10x its original size is suspect,
particularly if that results in an uncompressed size > 1GB.
2024-09-26 12:01:11 -07:00
Alex Vandiver
6f7c14c9ec slack: Check that the archive is shaped the way we expect.
This is some minor protection against malicious zipfiles (e.g. many
very deep directories to chew up inodes), in addition to validation.
2024-09-26 12:01:11 -07:00
Alex Vandiver
d9f868a163 slack: Clean up expanded zipfiles more consistently. 2024-09-26 12:01:11 -07:00
roanster007
c6a06d4684 direct_message_group: Add new group_size field.
This commit adds a new `group_size` field to the `DirectMessageGroup`
model, and backfills its value to each of the existing direct message
groups.

Fixes part of #25713
2024-08-23 11:09:41 -07:00
Anders Kaseorg
e3a191b99b ruff: Fix FURB154 Use of repeated consecutive global, nonlocal.
This is a preview rule, not yet enabled by default.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2024-07-14 13:53:18 -07:00
Anders Kaseorg
b96feb34f6 ruff: Fix SIM117 Use a single with statement with multiple contexts.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2024-07-14 13:48:32 -07:00
Anders Kaseorg
0fa5e7f629 ruff: Fix UP035 Import from collections.abc, typing instead.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2024-07-13 22:28:22 -07:00
Anders Kaseorg
531b34cb4c ruff: Fix UP007 Use X | Y for type annotations.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2024-07-13 22:28:22 -07:00
Anders Kaseorg
e08a24e47f ruff: Fix UP006 Use list instead of List for type annotation.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2024-07-13 22:28:22 -07:00
Alex Vandiver
0442e95276 emoji: Use a non-predictable filename.
We use a truncated SHA256 of the id and a server-side secret to make
emoji have non-guessable filenames, while also making collisions
unlikely.

We also adjust the Slack import to use the same SHA-based name,
instead of taking the same name as it had in Slack.
2024-07-12 13:26:47 -07:00
Alex Vandiver
5ae34dc42b slack: Store the content-type of realm icons. 2024-07-11 07:31:39 -07:00
roanster007
52692a6448 refactor: Rename huddle to direct_message_group in non API.
This commit performs a sweep on the first batch of non API
files to rename "huddle" to "direct_message_group`.

It also renames variables and methods of type -
"huddle_message" to "group_direct_message".

This is a part of #28640
2024-07-04 07:56:31 -07:00
Alex Vandiver
17fb23746f upload: Move methods into zerver.lib.upload from .base. 2024-06-26 16:43:11 -07:00
Alex Vandiver
0153d6dbcd thumbnailing: Move resizing functions into zerver.lib.thumbnail. 2024-06-20 23:06:08 -04:00
Anders Kaseorg
b545abe1e2 typos: Fix typos caught by mwic.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2024-05-20 13:55:00 -07:00
Lauryn Menard
b714bd9eec help: Rename and redirect set-default-streams-for-new-users for channel. 2024-05-03 13:02:20 -07:00
John Lu
a5cf0ec526 refactor: Replace HUDDLE with DIRECT_MESSAGE_GROUP.
Replaced HUDDLE attribute with DIRECT_MESSAGE_GROUP using VS Code search,
part of a general renaming of the object class.

Fixes part of #28640.

Co-authored-by: JohnLu2004 <JohnLu10212004@gmail.com>
2024-03-21 16:39:33 -07:00
Anders Kaseorg
53e80c41ea ruff: Fix SIM113 Use enumerate() for index variable in for loop.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2024-02-02 10:30:45 -08:00
Anders Kaseorg
712917b2c9 ruff: Fix RUF019 Unnecessary key check before dictionary access.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2024-02-02 10:30:45 -08:00
Alex Vandiver
1517601e9d import: Merge duplicate slack email addresses.
It is possible to have multiple users with the same email address --
for instance, when two users are guests in shared channels via two
different other Slack instances.

Combine those Slack user-ids into one Zulip user, by their user-id;
otherwise, we run into problems during import due to duplicate keys.
2024-01-22 16:34:59 -08:00
Alex Vandiver
09146b1b8f import: Show slack user-ids. 2024-01-22 16:34:59 -08:00
Alex Vandiver
7cc4b023f2 import: Support shared users in huddles/DMs.
1e5c49ad82 added support for shared channels -- but some users may
only currently exist in DMs or MPIMs, and not in channel membership.

Walk the list of MPIM subscriptions and messages, as well as DM users,
and add any such users to the set of mirror dummy users.
2024-01-22 16:34:59 -08:00
Anders Kaseorg
8a7916f21a python: Consistently use from…import for datetime.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-12-05 12:01:18 -08:00
Anders Kaseorg
4cb2eded68 typos: Fix typos caught by typos.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-10-09 11:55:16 -07:00
Alex Vandiver
c3cf1cd9d8 data_import: Skip Slack "Posts" which we don't otherwise support. 2023-09-29 12:34:07 -07:00
Anders Kaseorg
2665a3ce2b python: Elide unnecessary list wrappers.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-09-13 12:41:23 -07:00
Anders Kaseorg
c4748298bb ruff: Fix PERF102 Using only the keys/values of a dict.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-08-07 17:23:55 -07:00
Anders Kaseorg
c2c96eb0cf python: Annotate type aliases with TypeAlias.
This is not strictly necessary but it’s clearer and improves mypy’s
error messages.

https://docs.python.org/3/library/typing.html#typing.TypeAlias
https://mypy.readthedocs.io/en/stable/kinds_of_types.html#type-aliases

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-08-07 10:02:49 -07:00
Anders Kaseorg
50e6cba1af ruff: Fix UP032 Use f-string instead of format call.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-07-19 16:14:59 -07:00
Steve Howell
67cdf1a7b4 emojis: Use get_emoji_data.
The previous function was poorly named, asked for a
Realm object when realm_id sufficed, and returned a
tuple of strings that had different semantics.

I also avoid calling it duplicate times in a couple
places, although it was probably rarely the case that
both invocations actually happened if upstream
validations were working.

Note that there is a TypedDict called EmojiInfo, so I
chose EmojiData here.  Perhaps a better name would be
TinyEmojiData or something.

I also simplify the reaction tests with a verify
helper.
2023-07-17 09:35:53 -07:00
Alex Vandiver
21aeb4a040 slack: Handle the special case of permissions denied on team.info call.
This is a follow-up to 4c8915c8e4, for
the case when the `team:read` permission is missing, which causes the
`team.info` call itself to fail.  The error message supplies
information about the provided and missing permissions -- but it also
still sends the `X-OAuth-Scopes` header which we normall read, so we can
use that as normal.
2023-06-27 11:04:41 -07:00
Alex Vandiver
4c8915c8e4 slack: Provide more information when a Slack token fails to validate. 2023-06-23 11:09:45 -07:00
rht
1c84f02f57 slack import: Convert threads to nicely named Zulip topics.
Fixes #9006.
2023-05-30 16:35:19 -07:00
Anders Kaseorg
9797de52a0 ruff: Fix RUF010 Use conversion in f-string.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-05-26 22:09:18 -07:00
Alex Vandiver
567d1d54e7 upload: Rename upload_message_file to use word "attachment".
For consistency with the table, which is named Attachment.
2023-03-02 16:36:19 -08:00
Alex Vandiver
fe654b76b7 data_import: Stop tar'ing up converted data.
`./manage.py import` does not take a tarball; it takes a directory.
Making a separate tarball is a waste of CPU time and disk, as it is
never used.

This was included in the commit of the initial Slack conversion code
in 5b37c5562b and propagated from there into every conversion tool.

Remove the unnecessary tarball creation.
2023-02-26 17:42:01 -08:00