Commit Graph

185 Commits

Author SHA1 Message Date
Aman Agrawal
4cca5652e3 slack_import: Pipe file processing error message to the user.
When the slack import fails due to invalid zip file being uploaded,
we take user back to the file upload page with an appropriate
error message.
2025-06-16 10:46:25 -07:00
Aman Agrawal
b57b783dd8 slack: Don't show error code to users.
We log the error internally and only show invalid token as the
error message.
2025-05-28 17:18:07 -07:00
Aman Agrawal
68372f8e03 slack: Change invalid token error message. 2025-05-28 17:18:07 -07:00
PieterCK
0dfb709152 slack_data_import: Support converting integration bot messages.
Integration bot messages in Slack may include "blocks" and
"attachments," which are Slack's messaging features.
Currently, these messages aren't processed when converting
Slack export data.

This commit adds support for converting integration bot
messages, as well as other Slack messages containing "blocks"
and "attachments".

Message payload with the block type `rich_text` is skipped because all
messages sent by users have this format.

Fixes #31162.

[1]=https://docs.slack.dev/reference/block-kit/blocks/rich-text-block/
2025-05-23 14:27:31 -07:00
Mateusz Mandera
a52bc4d71b slack: Handle integration bots with missing data.
We encountered the following two new cases with integration bots in
Slack imports:
1. Bots without the image_72 field in their data. Such bots should fall
   back to gravatar.
2. Bots whose bot_id is the sender of certain messages, but querying the
   bots.info endpoint returns bot_not_found error. We should create
   dummy accounts in place of such bots.
2025-05-16 13:06:28 -07:00
Aman Agrawal
136c0f1c44 registration: Enable import from slack using realm registration form.
Co-authored-by: Alex Vandiver <alexmv@zulip.com>
Co-authored-by: Tim Abbott <tabbott@zulip.com>
2025-05-14 13:24:38 -07:00
userAdityaa
354a16fb0a migration: Rename 'populate_db' Client to 'ZulipDataImport'.
This commit:

* Creates a migration to rename any existing Client with
name="populate_db" to "ZulipDataImport".
* Updates populate_db.py to use ZulipDataImport for new
message creation

These changes should make code to identify imported messages
considerably more readable.

Fixes #33909.
2025-05-08 12:18:34 -07:00
Elsa Kihlberg Gawell
845f0d40e1 import_data: Make sure converted DMs don't have topic name.
Previously, `build_message` set a message's topic name to the given
topic name, regardless of whether the message was a direct message (DM)
or a group direct message (GDM).

This change adds the `is_private` parameter to `build_message`. If
`is_private` is `True`, the `topic_name` will be overridden to an empty
string (""). Consequently, this also updates the third-party importers
to pass this parameter when calling `build_message`.

Co-authored-by: Pieter CK <pieterceka123@gmail.com>
2025-03-25 16:38:21 -07:00
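The `is_private` behavior described in 845f0d40e1 can be illustrated with a minimal sketch. The function name `build_message_sketch`, its reduced parameter list, and the returned dict keys are assumptions made for illustration; the real `build_message` takes more arguments and is not reproduced here.

```python
from typing import Any


def build_message_sketch(
    topic_name: str,
    content: str,
    is_private: bool,
) -> dict[str, Any]:
    """Hypothetical, reduced stand-in for `build_message`."""
    # DMs and group DMs carry no topic, so any supplied name is discarded.
    if is_private:
        topic_name = ""
    # The key names below are illustrative, not the importer's actual schema.
    return {"topic_name": topic_name, "content": content}
```

A caller converting a DM would pass `is_private=True` and could then rely on the topic being empty regardless of what the source export contained.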
Alex Vandiver
9b4b53ef29 slack: Mark content-type of imported attachments. 2025-01-31 14:29:57 -08:00
Alex Vandiver
33539568ae slack: Ensure a newline before attachment links.
The content `look![image](https://example.com)` does not render as a
link, nor an image upload (were it to `/user_uploads/...`).  The
`![...](...)` syntax is intended for inline images, but unsupported in
Zulip, and as such does not link or render as _anything_.

Ensure a newline between message content and any attachments.
2025-01-31 14:29:57 -08:00
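A rough sketch of the newline rule from 33539568ae, assuming attachment links are appended to the message content as a string. The helper name `append_attachment_link` is hypothetical and is not taken from the importer.

```python
def append_attachment_link(content: str, link_markdown: str) -> str:
    """Append an attachment link, keeping it on its own line.

    Hypothetical helper: it guarantees that `link_markdown` never
    directly follows message text such as "look!".
    """
    if not content:
        return link_markdown
    if content.endswith("\n"):
        return content + link_markdown
    return content + "\n" + link_markdown
```

With this shape, `look!` followed by a link becomes two lines, so the trailing `!` can no longer combine with `[image](...)` into the unsupported inline-image syntax.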
Mateusz Mandera
f81e514d07 slack: Fetch workspace users from /users.list in the correct manner.
1. Fetching from the `/users.list` endpoint is supposed to use
   pagination. Slack will return at most 1000 results in a single
   request. This means that our Slack import system hasn't worked
   properly for workspaces with more than 1000 users. Users after the
   first 1000 would be considered by our tool as mirror dummies and thus
   created with is_active=False,is_mirror_dummy=True.
   Ref https://api.slack.com/methods/users.list

2. Workspaces with a lot of users, and therefore requiring the use of
   paginated requests to fetch them all, might also get us to run into
   Slack's rate limits, since we'll be doing repeated requests to the
   endpoint.
   Therefore, the API fetch needs to also handle rate limiting errors
   correctly.
   Per https://api.slack.com/apis/rate-limits#headers, we can just read
   the retry-after header from the response and wait the indicated number
   of seconds before repeating the requests. This is an easy approach to
   implement, so that's what we go with here.
2025-01-24 16:41:53 -08:00
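The two points in f81e514d07 (cursor pagination plus waiting on the retry-after header) can be sketched as one loop. This is an illustration under stated assumptions, not the importer's code: `fetch_all_users` and the injected `get_page` callable are hypothetical, `get_page` is assumed to return a `(status_code, headers, json_body)` tuple with lowercase header names, and the actual HTTP call is left out so the loop can be exercised without network access.

```python
import time
from typing import Any, Callable, Optional

# (status_code, headers, json_body) as returned by the hypothetical get_page.
Response = tuple[int, dict[str, str], dict[str, Any]]


def fetch_all_users(
    get_page: Callable[[Optional[str]], Response],
    sleep: Callable[[float], None] = time.sleep,
) -> list[dict[str, Any]]:
    """Collect every member from a cursor-paginated users.list endpoint."""
    members: list[dict[str, Any]] = []
    cursor: Optional[str] = None
    while True:
        status, headers, body = get_page(cursor)
        if status == 429:
            # Rate limited: wait the number of seconds Slack indicates,
            # then retry the same page with the same cursor.
            sleep(int(headers.get("retry-after", "1")))
            continue
        if not body.get("ok"):
            raise RuntimeError(f"users.list failed: {body.get('error')}")
        members.extend(body["members"])
        # An empty or missing next_cursor means this was the last page.
        cursor = body.get("response_metadata", {}).get("next_cursor") or None
        if cursor is None:
            return members
```

Injecting `get_page` and `sleep` keeps the sketch testable; a real caller would wrap an HTTP request to `/users.list` that passes the cursor along.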
PieterCK
a746be807f slack_import: Make check_token_access more flexible.
Previously, the `check_token_access` function had a hardcoded
`required_parameters` variable because it was only used in the Slack
data importer. This commit refactors `required_parameters` into a
function parameter, enabling the function to check a Slack token’s scope
for other purposes, such as Slack webhook integration.

Additionally, this commit changes the Slack API call in
`check_token_access` from `teams.info` to `api.test`. The endpoint is
better suited for this purpose since we're only checking a token’s scope
using the response header here.
2024-12-18 16:11:31 -08:00
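The idea in a746be807f of checking a token's scope from a response header, with the required scopes supplied by the caller, might look roughly like the following. The helper name `missing_scopes` is hypothetical, the header value is assumed to be the comma-separated `X-OAuth-Scopes` string that Slack returns, and the scope names in the usage note are examples only.

```python
def missing_scopes(oauth_scopes_header: str, required_scopes: set[str]) -> set[str]:
    """Return the required scopes that the token does not grant.

    `oauth_scopes_header` is assumed to be a comma-separated scope list.
    """
    granted = {
        scope.strip()
        for scope in oauth_scopes_header.split(",")
        if scope.strip()
    }
    return required_scopes - granted
```

For example, `missing_scopes("users:read, team:read", {"users:read", "emoji:read"})` evaluates to `{"emoji:read"}`, which a caller could use to build an error message listing what the token lacks.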
PieterCK
f988412394 slack_data_import: Support converting integration bot users.
Currently, we're unable to convert messages from Slack's integration
bots because this message subtype doesn't come from a Slack "user"; that
is, these bots don't have a Slack user profile.

This is a preparatory change to support converting Slack's integration
bot messages. This commit artificially creates Slack user data from the
integration bot's "profile" so that we can create a corresponding Zulip
user for them.

Part of #31311.
2024-12-16 13:09:57 -08:00
PieterCK
10946caa3d slack_data_import: Update how Slack user avatars are processed.
Previously, the Slack export converter could only process Slack's avatar
URL from Slack's "ca.slack-edge.com" server, which looks like this:

https://ca.slack-edge.com/T0CDRA6HM3P-U06NABE26M9-1173e04f818e-512

This commit adds support for converting any public downloadable image
URLs.

This is done to support importing Slack's integration bots and their
messages, which typically have a PNG file URL:

https://avatars.slack-edge.com/2024-05-01/7057208497908_a4351f6deb91094eac4c_72.png
2024-12-16 13:09:57 -08:00
PieterCK
0d7199b22e data_import: Add migration status file to converted exports.
This commit updates all third-party importer tools (Slack, Mattermost,
and Rocket Chat) in the `zerver/data_import` directory to also output a
migration_status.json file in their output tarball.

This is required because all importable tarballs will be checked for
migration compatibility during import.

Fixes #28443.
2024-11-08 15:52:45 -08:00
PieterCK
6289a551aa data_import: Add email validation to third-party data converters.
This commit makes the third-party data converters check for invalid user
emails. If it finds any, it’ll raise an Exception and show an error
message with all the bad emails listed out.

Fixes: #31783
2024-10-15 16:04:43 -07:00
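The check described in 6289a551aa (collect every invalid email, then raise once with all of them listed) could be sketched as follows. The function name `check_user_emails`, the simple regular expression, the use of `ValueError`, and the message wording are all assumptions for illustration; the converters' actual validator and exception are not shown in the commit message.

```python
import re

# Deliberately simple pattern used only for this sketch; a real
# validator would be stricter.
_EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def check_user_emails(emails: list[str]) -> None:
    """Raise a single error that lists every malformed email."""
    bad_emails = [email for email in emails if not _EMAIL_RE.match(email)]
    if bad_emails:
        raise ValueError(
            "Invalid email(s), please fix and try again: " + ", ".join(bad_emails)
        )
```

Reporting all bad addresses in one exception lets the person running the conversion fix the export in a single pass instead of rerunning it once per bad entry.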
Alex Vandiver
d9f868a163 slack: Clean up expanded zipfiles more consistently. 2024-09-26 12:01:11 -07:00
Lauryn Menard
d431a5aad6 audit-log: Move user group event types to AuditLogEventType enum.
Event types moved: USER_GROUP_CREATED, USER_GROUP_DELETED,
USER_GROUP_DIRECT_USER_MEMBERSHIP_ADDED,
USER_GROUP_DIRECT_USER_MEMBERSHIP_REMOVED,
USER_GROUP_DIRECT_SUBGROUP_MEMBERSHIP_ADDED,
USER_GROUP_DIRECT_SUBGROUP_MEMBERSHIP_REMOVED,
USER_GROUP_DIRECT_SUPERGROUP_MEMBERSHIP_ADDED,
USER_GROUP_DIRECT_SUPERGROUP_MEMBERSHIP_REMOVED,
USER_GROUP_NAME_CHANGED, USER_GROUP_DESCRIPTION_CHANGED,
USER_GROUP_GROUP_BASED_SETTING_CHANGED
2024-09-09 11:50:13 -07:00
Lauryn Menard
10d161638e audit-log: Move subscription event types to AuditLogEventType enum.
Event types moved: SUBSCRIPTION_CREATED, SUBSCRIPTION_ACTIVATED,
SUBSCRIPTION_DEACTIVATED, SUBSCRIPTION_PROPERTY_CHANGED.
2024-09-09 11:50:13 -07:00
Lauryn Menard
56c8cbde1e audit-log: Move realm event types to AuditLogEventType enum.
Event types moved: REALM_CREATED, REALM_DEFAULT_USER_SETTINGS_CHANGED,
REALM_ORG_TYPE_CHANGED, REALM_DOMAIN_ADDED, REALM_DOMAIN_CHANGED,
REALM_DOMAIN_REMOVED, REALM_PLAYGROUND_ADDED, REALM_PLAYGROUND_REMOVED,
REALM_LINKIFIER_ADDED, REALM_LINKIFIER_CHANGED, REALM_LINKIFIER_REMOVED,
REALM_EMOJI_ADDED, REALM_EMOJI_REMOVED, REALM_LINKIFIERS_REORDERED,
REALM_IMPORTED
2024-09-09 11:50:13 -07:00
Lauryn Menard
d2c32f23db audit-log: Move realm event types to AuditLogEventType enum.
Event types moved: REALM_DEACTIVATED, REALM_REACTIVATED, REALM_SCRUBBED,
REALM_PLAN_TYPE_CHANGED, REALM_LOGO_CHANGED, REALM_EXPORTED,
REALM_PROPERTY_CHANGED, REALM_ICON_SOURCE_CHANGED, REALM_DISCOUNT_CHANGED,
REALM_SPONSORSHIP_APPROVED, REALM_BILLING_MODALITY_CHANGED,
REALM_REACTIVATION_EMAIL_SENT, REALM_SPONSORSHIP_PENDING_STATUS_CHANGED,
REALM_SUBDOMAIN_CHANGED
2024-09-09 11:50:13 -07:00
Anders Kaseorg
3f29bc42b1 ruff: Fix B905 zip() without an explicit strict= parameter.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2024-07-13 22:28:22 -07:00
Anders Kaseorg
0fa5e7f629 ruff: Fix UP035 Import from collections.abc, typing instead.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2024-07-13 22:28:22 -07:00
Anders Kaseorg
e08a24e47f ruff: Fix UP006 Use list instead of List for type annotation.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2024-07-13 22:28:22 -07:00
Alex Vandiver
5ae34dc42b slack: Store the content-type of realm icons. 2024-07-11 07:31:39 -07:00
roanster007
52692a6448 refactor: Rename huddle to direct_message_group in non API.
This commit performs a sweep on the first batch of non API
files to rename "huddle" to "direct_message_group".

It also renames variables and methods of type -
"huddle_message" to "group_direct_message".

This is a part of #28640
2024-07-04 07:56:31 -07:00
Prakhar Pratyush
00474608c5 zulip_update: Send group DM for realm imported from other product.
When the export is NOT generated by another Zulip server,
while importing:
* Set the 'zulip_update_announcements_level' to the latest level
as we don't want to send all the older update messages to them.

* Send a group DM to admins, suggesting that they configure the
stream in order to avoid missing future update messages.

Fixes #29041.
2024-05-08 17:05:59 -07:00
Ujjawal Modi
706c380971 audit_log: Update audit log entries created while creating a group.
Earlier an extra audit log entry of type
USER_GROUP_GROUP_BASED_SETTING_CHANGED was made when a new user
group is created. This commit updates the code to not create
that audit log entry.

There is no need to create this entry as we would still
have the required data from the "OLD_VALUE" field in the
audit log entry created when changing the setting and this
also makes it consistent with the entries created for
other operations like stream creation.
2024-04-19 10:18:45 -07:00
Alex Vandiver
7cc4b023f2 import: Support shared users in huddles/DMs.
1e5c49ad82 added support for shared channels -- but some users may
only currently exist in DMs or MPIMs, and not in channel membership.

Walk the list of MPIM subscriptions and messages, as well as DM users,
and add any such users to the set of mirror dummy users.
2024-01-22 16:34:59 -08:00
Anders Kaseorg
cd96193768 models: Extract zerver.models.realms.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-12-16 22:08:44 -08:00
Prakhar Pratyush
dd8a33f03e import_realm: Create audit log with user count data.
This commit creates a RealmAuditLog entry with a new event_type
'RealmAuditLog.REALM_IMPORTED' after the realm is reactivated.

It contains user count data (using realm_user_count_by_role)
stored in extra_data.

This helps to have accurate user count data for the billing
system if someone tries to sign up just after doing an import.
2023-12-11 15:03:24 -08:00
Anders Kaseorg
223b626256 python: Use urlsplit instead of urlparse.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-12-05 13:03:07 -08:00
Mateusz Mandera
ff81e6bf32 import_util: Remove uuid and uuid_owner_secret from realm dict. 2023-10-18 11:00:49 -07:00
Anders Kaseorg
c4748298bb ruff: Fix PERF102 Using only the keys/values of a dict.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-08-07 17:23:55 -07:00
Zixuan James Li
e9e18454d2 user_groups: Populate membership audit logs during realm creation.
This tracks user group membership changes when the realm is first set
up, either through an import or not. This happens when we add users to
the system user groups by their roles.

For an imported realm, we do extra handling when the data doesn't include
user groups. This gets audited as well.
2023-07-13 11:55:38 -07:00
Zixuan James Li
3349ac9f86 user_groups: Audit UserGroup group based setting changes.
This adds audit log entries when any group based setting of a user group
is updated. We store both the old and new values in extra_data, along
with the name of that setting. Entries populated during user group creation
are hardcoded to track "can_mention_group".

Potentially we can adjust "set_defaults_for_group_settings" so that it
populates realm audit logs with it, but that is out of scope for this change.

We use an atomic transaction so that the audit logs are committed
together with the updates.

Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2023-07-11 08:56:55 -07:00
Zixuan James Li
3035854dca user_groups: Audit UserGroup supergroup memberships changes.
This is mostly the same as tracking subgroup changes, except that now
modified_user_group is the subgroup.

Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2023-07-11 08:56:55 -07:00
Zixuan James Li
ad698d597a user_groups: Audit UserGroup subgroup memberships changes.
It's worth noting that instead of adding another field to the
RealmAuditLog model, we store the modified subgroup ids in extra_data as
a JSON encoded dict with the key "subgroup_ids". We don't create audit
log entries for supergroup changes at this point.

Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2023-07-11 08:56:55 -07:00
Zixuan James Li
63f5936207 user_groups: Audit UserGroup creation.
We also create RealmAuditLog entries for the initial memberships that
get added along with the creation of a UserGroup. System user groups are
not created with members so no audit logs are populated for that.

Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2023-07-11 08:56:55 -07:00
Alex Vandiver
21aeb4a040 slack: Handle the special case of permissions denied on team.info call.
This is a follow-up to 4c8915c8e4, for
the case when the `team:read` permission is missing, which causes the
`team.info` call itself to fail.  The error message supplies
information about the provided and missing permissions -- but it also
still sends the `X-OAuth-Scopes` header which we normally read, so we can
use that as normal.
2023-06-27 11:04:41 -07:00
Alex Vandiver
4c8915c8e4 slack: Provide more information when a Slack token fails to validate. 2023-06-23 11:09:45 -07:00
Alex Vandiver
1b2ba4e09d test_slack_importer: Switch to xoxb tokens, which is what we accept. 2023-06-23 11:09:45 -07:00
rht
1c84f02f57 slack import: Convert threads to nicely named Zulip topics.
Fixes #9006.
2023-05-30 16:35:19 -07:00
Mateusz Mandera
ffa3aa8487 auth: Rewrite data model for tracking enabled auth backends.
So far, we've used the BitField .authentication_methods on Realm
for tracking which backends are enabled for an organization. This
however made it a pain to add new backends (requiring altering the
column and a migration - particularly troublesome if someone wanted to
create their own custom auth backend for their server).

Instead this will be tracked through the existence of the appropriate
rows in the RealmAuthenticationMethods table.
2023-04-18 09:22:56 -07:00
Alex Vandiver
fe654b76b7 data_import: Stop tar'ing up converted data.
`./manage.py import` does not take a tarball; it takes a directory.
Making a separate tarball is a waste of CPU time and disk, as it is
never used.

This was included in the commit of the initial Slack conversion code
in 5b37c5562b and propagated from there into every conversion tool.

Remove the unnecessary tarball creation.
2023-02-26 17:42:01 -08:00
Anders Kaseorg
df001db1a9 black: Reformat with Black 23.
Black 23 enforces some slightly more specific rules about empty line
counts and redundant parenthesis removal, but the result is still
compatible with Black 22.

(This does not actually upgrade our Python environment to Black 23
yet.)

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-02-02 10:40:13 -08:00
Alex Vandiver
92c8c17190 import: Add the UTF-8 flag on file entries in zipfiles from Slack.
Fixes: #22533.
2023-01-31 16:07:48 -08:00
Anders Kaseorg
8f7a7877fe python: Clean up janky URL matching code with urlsplit.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-01-18 17:25:46 -05:00
Mateusz Mandera
cefed552f6 test_slack_importer: Add assertion about message count.
This will help catch any future regression that might lead the import
tool to fail to import messages into the correct realm.
2022-10-07 10:10:01 -07:00
rht
a7cff0f091 Slack import: Translate emoji name to codepoint using iamcal data.
Because Slack emoji naming is different from Zulip's.
According to https://emojipedia.org/slack/, Slack's emoji shortcodes are
derived from https://github.com/iamcal/emoji-data.
There are probably some deviations from that dataset, but this PR should
at least catch the ones that are identical to iamcal's.
2022-09-17 12:04:07 -07:00