zulip

mirror of https://github.com/zulip/zulip.git synced 2025-10-24 16:43:57 +00:00

Author	SHA1	Message	Date
Mateusz Mandera	ff876d2df4	export: Treat is_mirror_dummy=True users as consenting. As explained in the comment added to the function, in terms of privacy concerns, it is fine to export all data for these accounts. And it is important to do - so that exporting an organization which was originally imported e.g. from Slack doesn't result in excessively limited data for accounts that were mirror dummies and never "activated" themselves.	2025-03-28 16:52:44 -07:00
Mateusz Mandera	3c43603607	export: Treat deactivated user with consent enabled as consenting. Prior to this, deactivated users were presumed to be non-consenting to private data export, regardless of their setting.	2025-03-28 16:52:44 -07:00
Mateusz Mandera	3c1fae1707	export: Fix get_consented_user_ids to also account for bots. Now that we severely limited the way that non-consenting users get exported, we need to start to consider bots as consenting when appropriate - otherwise the exported bot accounts will be unusable after importing.	2025-03-28 16:52:44 -07:00
Mateusz Mandera	e57b6719fa	export: Scrub RealmAuditLog rows where modified_user is non-consenting.	2025-03-28 16:52:44 -07:00
Mateusz Mandera	9da4eeaa94	export: Don't export real email of users unless accessible to admins. An administrator shouldn't be able to bypass a user's setting to hide their email address from everyone, including admins. Therefore, we should overwrite the delivery_email for such users during export - unless the user consented to have their private data exported. The notable consequence of this is that such user accounts will become completely inaccessible after importing this data to a new server, due to not having a functional email address on record. These accounts will only be possible to reclaim via a manual intervention to change the email address on the `UserProfile` by server administrators.	2025-03-28 16:52:44 -07:00
Mateusz Mandera	13303fd916	export: Plumb consented_user_ids to export_usermessage_batch in a file. This allows us to get rid of the call to `get_consented_user_ids` in `fetch_usermessages`. Now it's only called at the beginning of the export, eliminating the redundant db query and also resolving the potential for data consistency issues, if some users change their consent setting after the export starts. Now the full export process operates with a single snapshot of these consenting user ids. These ids need to be plumbed through via a file rather than normal arg passing, because this is a separate management command, run in subprocesses during the export.	2025-03-28 16:52:44 -07:00
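The file-based plumbing described in 13303fd916 can be illustrated with a minimal sketch. It assumes a JSON file in the export output directory; the filename `consented_user_ids.json` and both helper names are hypothetical and are not taken from the Zulip codebase.

```python
import json
import os
import tempfile


def write_consented_user_ids(output_dir: str, user_ids: set[int]) -> str:
    # Hypothetical filename; the real export may use a different name or format.
    path = os.path.join(output_dir, "consented_user_ids.json")
    with open(path, "w") as f:
        # Sets are not JSON-serializable, so store a sorted list.
        json.dump(sorted(user_ids), f)
    return path


def read_consented_user_ids(path: str) -> set[int]:
    # A subprocess would call this instead of re-querying the database,
    # so every worker sees the same snapshot taken at export start.
    with open(path) as f:
        return set(json.load(f))


# Round trip: the parent writes once, a worker reads the same snapshot.
with tempfile.TemporaryDirectory() as output_dir:
    path = write_consented_user_ids(output_dir, {9, 1, 5})
    snapshot = read_consented_user_ids(path)
```

Writing the ids once and reading them in each subprocess is what gives the export a single consistent view, even if a user changes their consent setting while the export is running.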
Mateusz Mandera	747e73470e	export: Reset settings to default for users not in exportable_user_ids. These users didn't consent to having their private data exported. Therefore, correct handling of these users should involve scrubbing their settings to just match the realm defaults.	2025-03-28 16:52:44 -07:00
Mateusz Mandera	ceb32a7285	export: Use exportable_user_ids arg to plumb through consenting users. Instead of making repeated calls to get_consented_user_ids, we can just fetch it (mostly) once and put it in `context["exportable_user_ids"]`. This is essentially what the (unused until now) exportable_user_ids logic was added for after all. The added, intended, effect of this is that non-consenting users will now get exported as mirror dummy accounts, due to the handling of non-exportable users in `custom_fetch_user_profile`. The remaining additional call to `get_consented_user_ids` is in `fetch_usermessages`. This one is tricky as this function gets called in subprocesses via `zerver/management/commands/export_usermessage_batch.py` management command invoked by the export process. It requires passing the `exportable_user_ids` in some other way. This can be dealt with in upcoming commits.	2025-03-28 16:52:44 -07:00
Mateusz Mandera	8b9516fb0b	export: Only export Client objects needed by the data being exported. We shouldn't export the entire Client table - it includes Clients for all the realms on the server, completely unrelated to the realm we're exporting. Since these contain parts of the UserAgents used by the users, we should treat these as private data and only export the Clients that the specific data we're exporting "knows" about.	2025-03-28 16:52:44 -07:00
Mateusz Mandera	3a0de29f5d	export: Don't export miscellaneous private data of non-consenting users.	2025-03-28 16:52:43 -07:00
PieterCK	719f8db654	migration_status: Refactor `parse_migration_status`. This refactors `parse_migration_status` to copy the algorithm of Django's `showmigrations` command instead of parsing its output. This is done so that the code is not susceptible to breaking changes if Django modifies showmigrations's implementation. The previous `parse_migration_status` logic has been repurposed into a test utility function (`prase_showmigrations`). It is used to verify that the new `parse_migration_status` generates output identical to the actual `showmigrations` command. The `test_clean_up_migration_status_json` is removed because `test_parse_migration_status` has covered that behavior.	2025-03-20 10:57:54 -07:00
Mateusz Mandera	c031cf9275	import: Fix export/import of SavedSnippet. This table's export and import weren't working: 1. It didn't have a Config in export.py, so it wasn't exported at all. 2. Its `date_created` wasn't registered in `DATE_FIELDS`. 3. It wasn't registered in `ID_MAPS` in import_realm.py, so having any SavedSnippets in the export would cause the import to fail with an exception. 4. It was missing a `fix_datetime_fields` call in its import codepath.	2025-03-10 13:07:56 -07:00
Mateusz Mandera	acb1731bf9	export: Fix export of child tables when exporting mirror dummy users. Without this change, the child tables of UserProfile didn't get their objects exported if those objects were tied to a mirror dummy user. For example, a `Recipient` of type `PERSONAL`, or the associated `Subscription` would not get exported. Same for other tables with foreign keys to `UserProfile` - such as `UserPresence`. This happened because the Configs for the export are defined as follows:

```python
user_profile_config = Config(
    custom_tables=[
        "zerver_userprofile",
        "zerver_userprofile_mirrordummy",
    ],
    # set table for children who treat us as normal parent
    table="zerver_userprofile",
    virtual_parent=realm_config,
    custom_fetch=custom_fetch_user_profile,
)

user_subscription_config = Config(
    table="_user_subscription",
    model=Subscription,
    normal_parent=user_profile_config,
    filter_args={"recipient__type": Recipient.PERSONAL},
    include_rows="user_profile_id__in",
)

Config(
    table="_user_recipient",
    model=Recipient,
    virtual_parent=user_subscription_config,
    id_source=("_user_subscription", "recipient"),
)
```

while in `export_from_config` we have:

```python
elif config.normal_parent:
    # In this mode, our current model is figuratively Article,
    # and normal_parent is figuratively Blog, and
    # now we just need to get all the articles
    # contained by the blogs.
    model = config.model
    assert parent is not None
    assert parent.table is not None
    assert config.include_rows is not None
    parent_ids = {r["id"] for r in response[parent.table]}
```

This meant that when processing a table with `normal_parent=user_profile_config`, the `parent_ids` above would only have the ids of `UserProfile` objects under the `zerver_userprofile` key in the exported data - completely missing those in `zerver_userprofile_mirrordummy`.	2025-03-10 13:07:56 -07:00
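The gap described in acb1731bf9 can be reproduced with plain dictionaries. This is only a sketch: the `response` data and both helper functions are hypothetical, and the union shown here is one plausible way to close the gap, not necessarily the change the commit made.

```python
# Hypothetical exported data, keyed by table name as in the commit message.
response = {
    "zerver_userprofile": [{"id": 1}, {"id": 2}],
    "zerver_userprofile_mirrordummy": [{"id": 3}],
}


def parent_ids_single_table(response: dict, table: str) -> set[int]:
    # Mirrors the quoted logic: only one table's rows are consulted.
    return {r["id"] for r in response[table]}


def parent_ids_all_tables(response: dict, tables: list[str]) -> set[int]:
    # Assumed alternative: collect ids from every table the parent writes to.
    ids: set[int] = set()
    for table in tables:
        ids |= {r["id"] for r in response.get(table, [])}
    return ids


before = parent_ids_single_table(response, "zerver_userprofile")
after = parent_ids_all_tables(
    response, ["zerver_userprofile", "zerver_userprofile_mirrordummy"]
)
```

With the single-table lookup, user 3 (the mirror dummy) is absent from `before`, so child rows such as its personal `Recipient` would be filtered out; `after` includes it.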
PieterCK	2d6426100f	import-export: Rework how we write `migration_status.json`. The current `get_migration_by_app` has a rather naive approach to compiling the migration status of a realm, which has led to issues like #32826. Specifically, those flaws are:
- it does not report the complete state of the migration status of the exporting server, only the applied migrations.
- it shows both the replaced and the squashed migrations. This would be a problem if we decide to clean up old migration files we've squashed (replaced) and import a slightly older realm with those still on disk. `check_migration_status` would complain of incompatibility even though those migration files don't matter (they are replaced, after all).
- it does not clean up ancient/stale applied migrations (for reference, see how `check-database-compatibility` cleans those).
This commit attempts to write a better `migration_status.json` by parsing the output of `showmigrations` instead. This is because Django's `showmigrations` has a lot more logic and validation baked into it than previously thought. The parts that we care about are:
- it validates that app names are valid
- it lists only the squashed migrations, not the ones they replace
- it takes into account migrations on disk (`MigrationsLoader`) vs applied migrations (`MigrationsRecorder`)
These would resolve the first two points highlighted above.	2025-01-24 17:08:37 -08:00
PieterCK	4db7ea2296	migration_status: Add `parse_migration_status`. This commit adds `parse_migration_status`, which takes in the string output of `showmigrations` and parses it into key-value pairs mapping each installed app to a list of its migration statuses. This is a prep commit to rework the check migrations function of import/export, which will parse the output of `showmigrations` to write the `migration_status.json` file.	2025-01-24 17:08:37 -08:00
sujal shah	771d3b1434	invites: Enable adding users to user groups during invitations. This commit allows users to be assigned to custom groups when inviting them to join Zulip, similar to how channels are handled. The implementation follows a similar pattern for adding pills, ensuring consistency, as user groups and channels are parallel in nature. Fixes #24365.	2024-11-26 11:26:34 -08:00
Prakhar Pratyush	175104ea01	streams: Add 'ChannelEmailAddress' model. This commit removes the 'email_token' field from the 'Stream' model, introduces a new model 'ChannelEmailAddress', and backfills it. This is a prep work towards our plan to generate unique channel emails for different users, which can be used to check post permissions at message send time.	2024-11-21 14:53:28 -08:00
PieterCK	40bcb4b42b	export: Add migration status file to export tarball. This commit updates the export process to write the migration status of the realm as a JSON file to be included in the export tarball. This is a preparatory step for adding an assertion to ensure that the importing and exporting realms have a compatible set of applied migrations.	2024-11-08 15:52:45 -08:00
Mateusz Mandera	da4443f392	thumbnail: Make thumbnailing work with data import. We didn't have thumbnailing for images coming from data import and this commit adds the functionality. There are a few fundamental issues that the implementation needs to solve.
1. The images come from an untrusted source and therefore we don't want to just pass them through to thumbnailing without checking. For that reason, we cannot just import ImageAttachment rows from the export data, even for zulip=>zulip imports. The right way to process images is to pass them to maybe_thumbnail(), which runs libvips_check_image() on them to verify we're okay with thumbnailing, creates ImageAttachment rows for them and sends them to the thumbnailing queue worker. This approach lets us handle both zulip=>zulip and 3rd party=>zulip imports in the same way.
2. There is a somewhat circular dependency between the Message, Attachment and ImageAttachment import process:
- ImageAttachments would ideally be created after importing Attachments, but they need to already exist at the time of Message import. Otherwise, the markdown processor doesn't know it has to add HTML for image previews to messages that reference images. This would mean that messages imported from 3rd party tools don't get image previews.
- Attachments only get created after Message import however, due to the many-to-many relationship between Message and Attachment.
This is solved by fixing up some data of Attachments pre-emptively, such as the path_ids. This gives us the necessary information for creating ImageAttachments before importing Messages. While we generate ImageAttachment rows synchronously, the actual thumbnailing job is sent to the queue worker. Theoretically, the worker could be very backlogged and not process the thumbnails anytime soon. This is fine - if the app is loaded and tries to display a message with such a not-yet-generated thumbnail, the code in `serve_file` will generate the thumbnails synchronously on the fly and the user will see the image preview displayed normally. See: `1b47134d0d/zerver/views/upload.py (L333-L342)`	2024-10-24 10:32:51 -07:00
Prakhar Pratyush	55f97cd06f	realm_export: Add support to create full data export via /export/realm. Earlier, only public data export was possible via `POST /export/realm` endpoint. This commit adds support to create full data export with member consent via that endpoint. Also, this adds a 'export_type' parameter to the dictionaries in `realm_export` event type and `GET /export/realm` response. Fixes part of #31201.	2024-10-11 13:20:42 -07:00
Prakhar Pratyush	07dcee36b2	export_realm: Add RealmExport model. Earlier, we used to store the key data related to realm exports in RealmAuditLog. This commit adds a separate table to store those data. It includes the code to migrate the concerned existing data in RealmAuditLog to RealmExport. Fixes part of #31201.	2024-10-04 12:06:35 -07:00
Prakhar Pratyush	5d9eb4e358	realm_export: Save stats in '.json' format instead of '.txt'. This commit updates code to store the realm export stats in json format instead of plain text. This will help in storing the stats as JsonField in RealmExport table.	2024-10-04 12:06:35 -07:00
Anders Kaseorg	184c0203f3	upload: Lazily import boto3. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2024-09-24 16:38:37 -07:00
Vector73	9e4e85e140	saved_snippets: Add backend for saved snippets. Part of #31227.	2024-09-24 15:27:58 -07:00
Prakhar Pratyush	65f465562f	export_realm: Remove the 'react on consent message' approach. For full exports with consent: * Earlier, a message asking users to react with a thumbs up was sent and later used to determine the users who consented. * Now, we no longer need to send such a message. This commit updates the logic to use the `allow_private_data_export` user setting to determine which users consented. Fixes part of #31201.	2024-09-24 14:32:42 -07:00
Alex Vandiver	ce0df00e44	export: Notify all realm admins on realm export.	2024-09-23 10:02:43 -07:00
Alex Vandiver	7afe6800f7	export: Use relative paths, include more data.	2024-09-23 10:02:43 -07:00
Alex Vandiver	e125ad823d	exports: Add a separate bucket for realm exports. This allows finer-grained access control and auditing. The links generated also expire after one week, and the suggested configuration is that the underlying data does as well. Co-authored-by: Prakhar Pratyush <prakhar@zulip.com>	2024-09-20 15:43:49 -07:00
Alex Vandiver	91ac5c3c8b	export: Log before the compression step, which can be slow.	2024-09-20 15:43:49 -07:00
sujal shah	614caf111e	user_groups: Add `creator` and `date_created` field in user groups. This commit introduced 'creator' and 'date_created' fields in user groups, allowing users to view who created the groups and when. Both fields can be null for groups without creator data.	2024-09-13 18:44:58 -07:00
Lauryn Menard	d2c32f23db	audit-log: Move realm event types to AuditLogEventType enum. Event types moved: REALM_DEACTIVATED, REALM_REACTIVATED, REALM_SCRUBBED, REALM_PLAN_TYPE_CHANGED, REALM_LOGO_CHANGED, REALM_EXPORTED, REALM_PROPERTY_CHANGED, REALM_ICON_SOURCE_CHANGED, REALM_DISCOUNT_CHANGED, REALM_SPONSORSHIP_APPROVED, REALM_BILLING_MODALITY_CHANGED, REALM_REACTIVATION_EMAIL_SENT, REALM_SPONSORSHIP_PENDING_STATUS_CHANGED, REALM_SUBDOMAIN_CHANGED	2024-09-09 11:50:13 -07:00
Mateusz Mandera	5476340b52	import: Export and import .original emoji files correctly. The export tool was only exporting the already-thumbnailed emoji file, omitting the original one. Now we make sure to export the .original file too, like we do for avatars, and make the import tool process it directly, to thumbnail it directly and generate a still in the case of animated emojis. Otherwise, the imported realm wouldn't have the <emoji>.png.original file that we generally expect to have accessible, and stills for animated emojis were completely missing.	2024-08-21 16:30:19 -07:00
roanster007	7b3e163d55	refactor: Rename `huddle` to `direct_message_group` in non api files. This commit completes rename of "huddle" to "direct_message_group" in all the non API files. Part of #28640	2024-07-31 23:25:56 -07:00
Alex Vandiver	2e38f426f4	upload: Generate thumbnails when images are uploaded. A new table is created to track which path_id attachments are images, and for those their metadata, and which thumbnails have been created. Using path_id as the effective primary key lets us ignore if the attachment is archived or not, saving some foreign key messes. A new worker is added to observe events when rows are added to this table, and to generate and store thumbnails for those images in differing sizes and formats.	2024-07-16 13:22:15 -07:00
Prakhar Pratyush	7d379e00b0	export: Fix 'OnboardingUserMessage' table not being exported. Earlier, the export tool was logging a warning: "??? NO DATA EXPORTED FOR TABLE zerver_onboardingusermessage!!!" This bug was due to not configuring a Config object for 'OnboardingUserMessage' in 'get_realm_config()'. This commit fixes the bug to export the table properly.	2024-07-16 09:36:02 -07:00
Anders Kaseorg	1e9b6445a9	ruff: Fix PLR6104 Use `+=` to perform an augmented assignment directly. This is a preview rule, not yet enabled by default. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2024-07-14 13:49:51 -07:00
Anders Kaseorg	0fa5e7f629	ruff: Fix UP035 Import from `collections.abc`, `typing` instead. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2024-07-13 22:28:22 -07:00
Anders Kaseorg	531b34cb4c	ruff: Fix UP007 Use `X \| Y` for type annotations. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2024-07-13 22:28:22 -07:00
Anders Kaseorg	e08a24e47f	ruff: Fix UP006 Use `list` instead of `List` for type annotation. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2024-07-13 22:28:22 -07:00
Alex Vandiver	d5a4941691	django: Switch to .alias() instead of .annotate() where possible. When using the sub-expression purely for filtering, and not for accessing the value in the resultset, .alias() is potentially faster since it does not pull the value in as well.	2024-07-11 09:26:23 -07:00
roanster007	02d0566dc5	refactor: Rename `Huddle` Django model class to `DirectMessageGroup`. This commit renames the "Huddle" Django model class to "DirectMessageGroup", while maintaining the same table -- "zerver_huddle". Fixes part of #28640.	2024-07-07 21:31:30 -07:00
Alex Vandiver	e29a455b2d	avatars: Encode version into the filename. Hash the salt, user-id, and now avatar version into the filename. This allows the URL contents to be immutable, and thus to be marked as immutable and cacheable. Since avatars are served unauthenticated, hashing with a server-side salt makes the current and past avatars not enumerable. This requires plumbing the current (or future) avatar version through various parts of the upload process. Since this already requires a full migration of current avatars, also take the opportunity to fix the missing `.png` on S3 uploads (#12852). We switch from SHA-1 to SHA-256, but truncate it such that avatar URL data does not substantially increase in size. Fixes: #12852.	2024-07-07 14:40:07 -07:00
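The filename scheme described in e29a455b2d (hash the salt, user id, and avatar version with SHA-256, then truncate) might look roughly like the sketch below. The function name, the order and separator of the hashed fields, and the truncation length of 40 hex characters are all assumptions made for illustration; the commit message does not specify them.

```python
import hashlib


def avatar_filename_hash(salt: str, user_id: int, version: int, length: int = 40) -> str:
    # Assumed payload layout; the real code may combine the fields differently.
    payload = f"{user_id}:{version}:{salt}".encode()
    # SHA-256 yields 64 hex characters; truncating keeps URLs close to
    # the size they had with SHA-1 (40 hex characters).
    return hashlib.sha256(payload).hexdigest()[:length]


v1 = avatar_filename_hash("server-side-salt", user_id=42, version=1)
v2 = avatar_filename_hash("server-side-salt", user_id=42, version=2)
```

Because the version is part of the hashed input, each new avatar gets a new filename, so the content behind any given URL never changes and can be cached as immutable. The server-side salt keeps the names from being guessed from the user id and version alone.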
Prakhar Pratyush	fb836a4f0a	onboarding: Add 'OnboardingUserMessage' model. This prep commit adds a new OnboardingUserMessage model that will be used to mark the new onboarding messages for new users as unread and the first message of each onboarding topic as starred. This table won't include the old onboarding messages.	2024-07-05 15:39:32 -07:00
roanster007	52692a6448	refactor: Rename `huddle` to `direct_message_group` in non API. This commit performs a sweep on the first batch of non-API files to rename "huddle" to "direct_message_group". It also renames variables and methods of the form "huddle_message" to "group_direct_message". This is a part of #28640	2024-07-04 07:56:31 -07:00
Mateusz Mandera	b1d50b511c	presence: Handle PresenceSequence in the export/import system.	2024-06-02 22:08:28 -07:00
Prakhar Pratyush	cc793612f0	export: Create REALM_EXPORTED audit log for exports via shell. Earlier, we were creating RealmAuditLog with REALM_EXPORTED event_type when export of public data took place via organization settings panel. We were not creating the audit log when the export was executed via shell i.e './manage.py export'. This commit creates the audit log in that case too. It will help during import to distinguish readily between imports from another Zulip server vs imports from another product.	2024-05-08 16:16:37 -07:00
Sahil Batra	c9a7c13ea7	import: Add code to support import and export of NamedUserGroups.	2024-04-26 17:03:09 -07:00
Sahil Batra	71b601cf5a	groups: Create NamedUserGroup table.	2024-04-26 17:03:09 -07:00
Anders Kaseorg	a82a3eb4d7	ruff: Fix UP033 Use `@functools.cache`. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2024-04-01 18:32:52 -07:00
John Lu	a5cf0ec526	refactor: Replace HUDDLE with DIRECT_MESSAGE_GROUP. Replaced HUDDLE attribute with DIRECT_MESSAGE_GROUP using VS Code search, part of a general renaming of the object class. Fixes part of #28640. Co-authored-by: JohnLu2004 <JohnLu10212004@gmail.com>	2024-03-21 16:39:33 -07:00

409 Commits