Commit Graph

294 Commits

Author SHA1 Message Date
Alex Vandiver
755cb7d854 export: Move all queries, when possible, to iterators.
This reduces overall memory usage for large exports.
2025-10-01 11:21:34 -07:00
Alex Vandiver
cf33119348 export: Remove export-most-recent symlink.
The only callsite of do_export_realm calls `rmtree` on the output
path, which means this symlink is always dangling.  Since realms can
also be exported by end-users, following it would always be a race
condition, anyways.

Remove it.
2025-10-01 11:21:34 -07:00
Sahil Batra
764f4aa2e0 groups: Use realm_for_sharding for limiting NamedUserGroup queries.
For get and filter queries of NamedUserGroup, realm_for_sharding
field is used instead of realm field, as directly using
realm_for_sharding field on NamedUserGroup makes the query faster
than using realm present on the base UserGroup table.
2025-09-23 12:15:53 -07:00
Mateusz Mandera
6ab30fcced import: Set Stream.subscriber_count for imported channels.
The import code wasn't setting subscriber_count at all, resulting in a
value of 0 when dealing with imports from 3rd party apps.
We run this unconditionally, also for imports from Zulip, since this
ensures we won't inherit incorrect values, if the data we're import has
them - e.g. due to some bugs that affected the server the data came from.
2025-08-29 11:58:17 -07:00
Mateusz Mandera
a61d849e37 ldap: Implement external auth id auth+sync.
Fixes #24104.
2025-07-09 15:31:17 -07:00
Aditya Kumar Kasaudhan
c5f126c6ff navigation_views: Add backend for navigation views in left sidebar.
Fixes part of #32077.
2025-05-23 16:25:08 -07:00
Sahil Batra
202bebda89 streams: Add folder foreign key to Stream table.
This commit also adds "folder_id" field in stream and subscription
objects.

Fixes part of #31972.
2025-05-20 13:25:06 -07:00
Sahil Batra
350f6a1fa1 streams: Add ChannelFolder table.
Fixes part of #31972.
2025-05-20 13:25:06 -07:00
bedo
c0a9ca8e9a tests: Pass update_fields to all stream.save().
A prep PR to 34308.

Explicitly pass the fields to be updated,
This increases performance but most importantly
prevents overwriting the db-saved value of
"subscriber_count" field (added in an upcoming PR)
with the in-memory default value of 0,
since "subscriber_count" will only be updted
via the db.

Migrate some tests to use do_ functions instead of
direclty modifying the state.
2025-04-15 10:28:18 -07:00
Mateusz Mandera
9864eee029 export: Add detailed tests for export of public vs private data.
Adds detailed tests for the work in the prior commits fixing the
treatment of private data in various tables in exports with consent and
public exports.
2025-03-28 17:44:58 -07:00
PieterCK
719f8db654 migration_status: Refactor parse_migration_status.
This refactors `parse_migration_status` to copy the algorithm of
Django's `showmigrations` command instead of parsing its output. This is
done so that the code is not susceptible to breaking changes if Django
modifies showmigrations's implementation.

The previous `parse_migration_status` logic has been repurposed into a
test utility function (`prase_showmigrations`). It is used to verify
that the new `parse_migration_status` generates output identical to the
actual `showmigrations` command.

The `test_clean_up_migration_status_json` is removed because
`test_parse_migration_status` has covered that behavior.
2025-03-20 10:57:54 -07:00
Prakhar Pratyush
c70a3ffb0a populate_db: Mark onboarding steps as seen for existing users.
This commit updates populate_db to mark all onboarding steps as seen
for existing users to avoid unnecessary popups during development.

Developers should create a new user in development environment
to test the onboarding steps.
2025-03-13 14:38:16 -07:00
Mateusz Mandera
c031cf9275 import: Fix export/import of SavedSnippet.
This table's export and import weren't working:
1. It didn't have a Config in export.py, so it wasn't exported at all.
2. Its `date_created` wasn't registered in `DATE_FIELDS`.
3. It wasn't registered in `ID_MAPS` in import_realm.py, so having any
   SavedSnippets in the export would cause the import to fail with an
   exception.
4. It was missing a `fix_datetime_fields` call in its import codepath.
2025-03-10 13:07:56 -07:00
Mateusz Mandera
acb1731bf9 export: Fix export of child tables when exporting mirror dummy users.
Without this change, the child tables of UserProfile didn't get their
objects exported if those objects were tied to a mirror dummy user.

For example, a `Recipient` of type `PERSONAL`, or the associated
`Subscription` would not get exported. Same for other tables with
foreign keys to `UserProfile` - such as `UserPresence`.

This happened because the Configs for the export are defined as follows:

```python
user_profile_config = Config(
    custom_tables=[
        "zerver_userprofile",
        "zerver_userprofile_mirrordummy",
    ],
    # set table for children who treat us as normal parent
    table="zerver_userprofile",
    virtual_parent=realm_config,
    custom_fetch=custom_fetch_user_profile,
)

user_subscription_config = Config(
    table="_user_subscription",
    model=Subscription,
    normal_parent=user_profile_config,
    filter_args={"recipient__type": Recipient.PERSONAL},
    include_rows="user_profile_id__in",
)

Config(
    table="_user_recipient",
    model=Recipient,
    virtual_parent=user_subscription_config,
    id_source=("_user_subscription", "recipient"),
)
```

while in `export_from_config` we have:
```python
    elif config.normal_parent:
        # In this mode, our current model is figuratively Article,
        # and normal_parent is figuratively Blog, and
        # now we just need to get all the articles
        # contained by the blogs.
        model = config.model
        assert parent is not None
        assert parent.table is not None
        assert config.include_rows is not None
        parent_ids = {r["id"] for r in response[parent.table]}
```

This meant that when processing a table with
`normal_parent=user_profile_config`, the `parent_ids` above would only
have the ids of `UserProfile` objects under the `zerver_userprofile` key in
the exported data - completely missing those in
`zerver_userprofile_mirrordummy`.
2025-03-10 13:07:56 -07:00
PieterCK
2d6426100f import-export: Rework how we write migration_status.json.
The current `get_migration_by_app` has a rather naive approach to
compiling the migration status of a realm, which has led to issues like
#32826. Specifically, those flaws are:

- it does not report the complete state of the migration status of the
exporting servers, only the applied migration.
- it shows both the replaced and the squashed migrations. This would be
a problem if we decide to clean up old migration files we've
squashed(replaced) and import a slightly older realm with those still in
disk. `check_migration_status` would complain of incompatibility even
though those migration files don't matter (they are replaced, after
all).
- it does not clean up ancient/stale applied migrations (for reference,
see how `check-database-compatibility` cleans those)

This commit attempts to write a better `migration_status.json` by
parsing the output of `showmigrations` instead.

This is because Django's `showmigrations` has a lot more logic and
validations baked into it than previously thought. Ones that we care
about are:

- it does validations to make sure app names are valid
- it doesn't list replaced migrations and only squashed one
- it takes into account migrations in disk(`MigrationsLoader`) vs
applied migrations (`MigrationsRecorder`)

Which would resolve the first two points highlighted above.
2025-01-24 17:08:37 -08:00
PieterCK
4db7ea2296 migration_status: Add parse_migration_status.
This commit adds `parse_migration_status`, which takes in the string
output of `showmigrations` and parse it into key-value pair of installed
apps and a list of its migration status.

This is a prep commit to rework the check migrations function of
import/export which will parse the output of `showmigrations` to write
the `migration_status.json` file.
2025-01-24 17:08:37 -08:00
PieterCK
0b2f5c638d export_test: Prevent migration status fixtures from going stale.
Currently if for what ever reason one decided to change how
`migration_status.json` is written, the check migrations tests in
`test_import_export.py` will happily just use the old and potentially
stale migration status test fixtures in
`fixtures/applied_migrations_fixtures` against other stale fixtures and
run the check migrations tests in a bubble.

This adds an assertion in `verify_migration_status_json` that makes sure
all the migration status fixtures we use in the tests resembles the
actual `migration_status.json` file our export tool will write.
2025-01-24 17:08:37 -08:00
Alex Vandiver
230bae17bb thumbnail: Generate a transcoded high-res version of HEIC/TIFF images.
If the content-type of the image is not in INLINE_MIME_TYPES, then we
do not expect browsers to be able to display it.  This behaviour is
particularly confusing because the thumbnail will render properly,
since that will be in the more widely-supported WebP format, but the
lightbox will show a broken image.

In these cases, generate a high-resolution (4032x3024) "thumbnail"
which clients can choose to use instead.  This thumbnail format is not
in the listed in the server's advertised thumbnail size list, because
it is not reliably generated for every image.

The transcoded thumbnail format is set on the `img` tag if it is
generated, and the original content-type is always passed to the
client, so it can decide how or if to render the original image.  This
content-type is as the _original uploader_ specified it, so may be
incorrect.

The transcoded image is not animated, even if the original was.  HEIC
files can nominally be animated, but in testing libvips was not able
to correctly recognize them as such.  TIFF files are parsed as being
"animated," with one page per frame; this is of dubious utility, so
we merely transcode the first page.  Always generating a static
transcoded image serves to also limit the computational time spent.

THUMBNAIL_OUTPUT_FORMATS is switched to be a tuple to ensure that it
is not accidentally mutated.
2025-01-09 09:10:28 -08:00
PieterCK
639b291f30 import: Sort realms migrations status before checking.
This commit sorts the list of migrations to ensure the same
sets of migrations don't somehow ordered differently. This
has been reported to happen when importing Zulip Cloud to
self-host (zulip-cloud-current).

The order of migrations listed in `migration_status.json`
don't actually matter, migrations specify which other
migrations they depend on.

Fixes #32826.
2025-01-02 18:43:16 -08:00
Shubham Padia
4b3d1a5aac streams: Creator should be able to administer new channels.
There are cases when importing from slack where the stream creator can
technically be none, that is why we have named the default group string
to `stream_creator_or_nobody`. If stream creator is not present, we
default back to nobody. See
https://chat.zulip.org/#narrow/channel/3-backend/topic/Default.20can_administer_channel_group.20for.20imported.20realms/near/1983634
for mode details.
2024-12-03 18:38:25 -08:00
jitendra-ky
ca14366e38 import: Replace generic Exception with CommandError.
This change improves error handling in `do_import_realm` by replacing
the use of a generic Exception with CommandError. The updated approach
provides clearer, user-friendly error messages when there is a version
mismatch between the exported data and the Zulip server.

Fixes #32292.
2024-11-18 18:35:14 -08:00
Mateusz Mandera
1f21b7437c test_import_export: Don't hard-code ZULIP_VERSION in fixtures.
Otherwise, these tests fail if ZULIP_VERSION is different locally from
what's hard-coded. Use a placeholder instead and replace dynamically in
a helper function.
2024-11-10 19:12:39 -08:00
PieterCK
a9838d8089 import: Verify exported realm's migration compatibility.
When transferring a realm to a server that has a different set of
applied migrations (different Zulip versions), there is a chance that
the imported data formats appear to be compatible but data invariants
could still be violated.

This commit adds an assertion during the import process to verify
that both the exported realm and the importing server have matching
Zulip versions and have a compatible set of migrations.
2024-11-08 15:52:45 -08:00
PieterCK
40bcb4b42b export: Add migration status file to export tarball.
This commit updates the export process to write the migration status of
the realm as a JSON file to be included in the export tarball.

This is a preparatory step for adding an assertion to ensure that the
importing and exporting realms have a compatible set of applied
migrations.
2024-11-08 15:52:45 -08:00
Mateusz Mandera
da4443f392 thumbnail: Make thumbnailing work with data import.
We didn't have thumbnailing for images coming from data import and this
commit adds the functionality.

There are a few fundamental issues that the implementation needs to
solve.

1. The images come from an untrusted source and therefore we don't want
   to just pass them through to thumbnailing without checking. For that
   reason, we cannot just import ImageAttachment rows from the export
   data, even for zulip=>zulip imports.
   The right way to process images is to pass them to maybe_thumbail(),
   which runs libvips_check_image() on them to verify we're okay with
   thumbnailing, creates ImageAttachment rows for them and sends them
   to the thumbnailing queue worker. This approach lets us handle both
   zulip=>zulip and 3rd party=>zulip imports in the same way,

2. There is a somewhat circular dependency between the Message,
   Attachment and ImageAttachment import process:

- ImageAttachments would ideally be created after importing
  Attachments, but they need to already exist at the time of Message
  import. Otherwise, the markdown processor doesn't know it has to add
  HTML for image previews to messages that reference images. This would
  mean that messages imported from 3rd party tools don't get image
  previews.
- Attachments only get created after Message import however, due to the
  many-to-many relationship between Message and Attachment.

This is solved by fixing up some data of Attachments pre-emptively, such
as the path_ids. This gives us the necessary information for creating
ImageAttachments before importing Messages.

While we generate ImageAttachment rows synchronously, the actual
thumbnailing job is sent to the queue worker. Theoretically, the worker
could be very backlogged and not process the thumbnails anytime soon.
This is fine - if the app is loaded and tries to display a message with
such a not-yet-generated thumbnail, the code in `serve_file` will
generate the thumbnails synchronously on the fly and the user will see
the image preview displayed normally. See:

1b47134d0d/zerver/views/upload.py (L333-L342)
2024-10-24 10:32:51 -07:00
Shubham Padia
b305ca14dd user_groups: Add add_can_members_group to user group.
The default value for this field that we wanted to have was that group
itlself. But we are deferring that to later in order to reach the point
of switching over to the groups system sooner. Till then, we will use
`group_creator` as the default. See
https://chat.zulip.org/#narrow/stream/101-design/topic/Group.20add.20members.20dropdown/near/1952904
for more details.

For migration plan details, see
https://chat.zulip.org/#narrow/stream/101-design/topic/Group.20add.20members.20dropdown/near/1952902

The increase in query count from 7 to 9 in the query count test for
creating a user group is because of group_creator being the default for
the new field.
2024-10-11 16:31:18 -07:00
Prakhar Pratyush
07dcee36b2 export_realm: Add RealmExport model.
Earlier, we used to store the key data related to realm exports
in RealmAuditLog. This commit adds a separate table to store
those data.

It includes the code to migrate the concerned existing data in
RealmAuditLog to RealmExport.

Fixes part of #31201.
2024-10-04 12:06:35 -07:00
Shubham Padia
12ebd97f1f settings: Add group_creator as default for can_manage_group.
We create an unnamed user group with just the group creator as it's
member when trying to set the default. The pattern I've followed across
most of the acting_user additions is to just put the user declared
somewhere before the check_add_user_group and see if the test passes.
If it does not, then I'll look at what kind of user it needs to be set
to `acting_user`.
2024-10-01 17:35:14 -07:00
Prakhar Pratyush
65f465562f export_realm: Remove the 'react on consent message' approach.
For exporting full with consent:

* Earlier, a message advertising users to react with thumbs up
  was sent and later used to determine the users who consented.

* Now, we no longer need to send such a message. This commit
  updates the logic to use `allow_private_data_export` user-setting
  to determine users who consented.

Fixes part of #31201.
2024-09-24 14:32:42 -07:00
Alex Vandiver
903bfb31e6 upload: Provide the frontend with the less-modified filename. 2024-09-09 12:40:17 -07:00
Lauryn Menard
d431a5aad6 audit-log: Move user group event types to AuditLogEventType enum.
Event types moved: USER_GROUP_CREATED, USER_GROUP_DELETED
USER_GROUP_DIRECT_USER_MEMBERSHIP_ADDED,
USER_GROUP_DIRECT_USER_MEMBERSHIP_REMOVED,
USER_GROUP_DIRECT_SUBGROUP_MEMBERSHIP_ADDED,
USER_GROUP_DIRECT_SUBGROUP_MEMBERSHIP_REMOVED,
USER_GROUP_DIRECT_SUPERGROUP_MEMBERSHIP_ADDED,
USER_GROUP_DIRECT_SUPERGROUP_MEMBERSHIP_REMOVED,
USER_GROUP_NAME_CHANGED, USER_GROUP_DESCRIPTION_CHANGED,
USER_GROUP_GROUP_BASED_SETTING_CHANGED
2024-09-09 11:50:13 -07:00
Lauryn Menard
df1e9093a9 audit-log: Move stream/channel event types to AuditLogEventType enum.
Renamed event types below in the enum class to use channel instead of
stream.

Event types moved: STREAM_CREATED, STREAM_DEACTIVATED, STREAM_NAME_CHANGED
STREAM_REACTIVATED, STREAM_MESSAGE_RETENTION_DAYS_CHANGED
STREAM_PROPERTY_CHANGED, STREAM_GROUP_BASED_SETTING_CHANGED
2024-09-09 11:50:13 -07:00
Lauryn Menard
56c8cbde1e audit-log: Move realm event types to AuditLogEventType enum.
Event types moved: REALM_CREATED, REALM_DEFAULT_USER_SETTINGS_CHANGED
REALM_ORG_TYPE_CHANGED, REALM_DOMAIN_ADDED, REALM_DOMAIN_CHANGED
REALM_DOMAIN_REMOVED, REALM_PLAYGROUND_ADDED, REALM_PLAYGROUND_REMOVED
REALM_LINKIFIER_ADDED, REALM_LINKIFIER_CHANGED, REALM_LINKIFIER_REMOVED
REALM_EMOJI_ADDED, REALM_EMOJI_REMOVED, REALM_LINKIFIERS_REORDERED
REALM_IMPORTED
2024-09-09 11:50:13 -07:00
Lauryn Menard
d2c32f23db audit-log: Move realm event types to AuditLogEventType enum.
Event types moved: REALM_DEACTIVATED, REALM_REACTIVATED, REALM_SCRUBBED
REALM_PLAN_TYPE_CHANGED, REALM_LOGO_CHANGED, REALM_EXPORTED
REALM_PROPERTY_CHANGED, REALM_ICON_SOURCE_CHANGED, REALM_DISCOUNT_CHANGED
REALM_SPONSORSHIP_APPROVED, REALM_BILLING_MODALITY_CHANGED
REALM_REACTIVATION_EMAIL_SENT, REALM_SPONSORSHIP_PENDING_STATUS_CHANGED
REALM_SUBDOMAIN_CHANGED
2024-09-09 11:50:13 -07:00
Lauryn Menard
e5daa3470f audit-log: Move user event types to AuditLogEventType enum.
Event types moved: USER_CREATED, USER_ACTIVATED, USER_DEACTIVATED
USER_REACTIVATED, USER_ROLE_CHANGED, USER_DELETED
USER_DELETED_PRESERVING_MESSAGES
2024-09-09 11:50:13 -07:00
Mateusz Mandera
5476340b52 import: Export and import .original emoji files correctly.
The export tool was only exporting the already-thumbnailed emoji file,
omitting the original one. Now we make sure to export the .original file
too, like we do for avatars, and make the import tool process it
directly, to thumbnail it directly and generate a still in the case of
animated emojis.

Otherwise, the imported realm wouldn't have the <emoji>.png.original
file that we generally expect to have accessible, and stills for
animated emojis were completely missing.
2024-08-21 16:30:19 -07:00
Prakhar Pratyush
6838a7302d test_import_export: Add test coverage for OnboardingUserMessage. 2024-07-22 10:26:33 -07:00
Mateusz Mandera
4a93149435 settings: Rework how push notifications service is configured.
Instead of the PUSH_NOTIFICATIONS_BOUNCER_URL and
SUBMIT_USAGE_STATISTICS settings, we want servers to configure
individual ZULIP_SERVICE_* settings, while maintaining backward
compatibility with the old settings. Thus, if all the new
ZULIP_SERVICE_* are at their default False value, but the legacy
settings are activated, they need to be translated in computed_settings
to the modern way.
2024-07-17 17:14:06 -07:00
Anders Kaseorg
6412c2d630 ruff: Fix FURB142 Use of set.add() in a for loop.
This is a preview rule, not yet enabled by default.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2024-07-14 13:52:59 -07:00
Anders Kaseorg
48202389b8 ruff: Bump target-version from py38 to py310.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2024-07-13 22:28:22 -07:00
Anders Kaseorg
0fa5e7f629 ruff: Fix UP035 Import from collections.abc, typing instead.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2024-07-13 22:28:22 -07:00
Anders Kaseorg
531b34cb4c ruff: Fix UP007 Use X | Y for type annotations.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2024-07-13 22:28:22 -07:00
Anders Kaseorg
e08a24e47f ruff: Fix UP006 Use list instead of List for type annotation.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2024-07-13 22:28:22 -07:00
Alex Vandiver
62a0611ddb emoji: Pass down content-type, rather than guessing from extension. 2024-07-12 13:26:47 -07:00
Alex Vandiver
ff90e5355f upload: Pass down content-type of realm icon/logo to backend.
This saves having to try to re-derive it from the file extension,
which may be ".original" in some cases.
2024-07-11 07:31:39 -07:00
roanster007
02d0566dc5 refactor: Rename Huddle Django model class to DirectMessageGroup.
This commit renames the "Huddle" Django model class to
"DirectMessageGroup", while maintaining the same table --
"zerver_huddle".

Fixes part of #28640.
2024-07-07 21:31:30 -07:00
Alex Vandiver
f52a93bc14 upload: Stop requiring callers pass in the file size.
This can be calculated because we have the contents.
2024-07-07 14:40:07 -07:00
Alex Vandiver
e29a455b2d avatars: Encode version into the filename.
Hash the salt, user-id, and now avatar version into the filename.
This allows the URL contents to be immutable, and thus to be marked as
immutable and cacheable.  Since avatars are served unauthenticated,
hashing with a server-side salt makes the current and past avatars not
enumerable.

This requires plumbing the current (or future) avatar version through
various parts of the upload process.

Since this already requires a full migration of current avatars, also
take the opportunity to fix the missing `.png` on S3 uploads (#12852).

We switch from SHA-1 to SHA-256, but truncate it such that avatar URL
data does not substantially increase in size.

Fixes: #12852.
2024-07-07 14:40:07 -07:00
roanster007
52692a6448 refactor: Rename huddle to direct_message_group in non API.
This commit performs a sweep on the first batch of non API
files to rename "huddle" to "direct_message_group`.

It also renames variables and methods of type -
"huddle_message" to "group_direct_message".

This is a part of #28640
2024-07-04 07:56:31 -07:00
Alex Vandiver
08b24484d1 upload: Remove redundant acting_user_profile argument.
This argument, effectively added in 9eb47f108c, was never actually
used.
2024-06-26 16:43:11 -07:00