This commit updates code to use "\x07" as value for
"subject" field of Message objects for DMs and group
DMs, so that we have a unique value for DMs and group
DMs which cannot be used for channel messages.
This avoids storing an empty string as the topic for DMs,
since the empty string is also the topic of "general chat"
channel messages. The large number of DMs in a realm led the
PostgreSQL query planner to believe that there are far more
"general chat" messages than there actually are, and so to
generate bad query plans for operations like fetching
"general chat" messages in a stream or moving messages
to and from the "general chat" topic.
This change is made for ArchivedMessage and
ScheduledMessage objects as well.
Note that the clients still get "subject" value as
an empty string "".
This commit also adds tests for checking that "\x07"
cannot be used as topic for channel messages.
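A minimal sketch of the two rules described above, assuming hypothetical
names (`DM_TOPIC_SENTINEL`, `check_channel_topic`, `subject_for_client`)
that are not taken from the actual code:

```python
# Hypothetical names; the real constants and helpers may differ.
DM_TOPIC_SENTINEL = "\x07"  # stored "subject" value for DMs and group DMs


def check_channel_topic(topic: str) -> str:
    """Reject the DM sentinel as the topic of a channel message."""
    if topic == DM_TOPIC_SENTINEL:
        raise ValueError("Invalid topic for a channel message")
    return topic


def subject_for_client(stored_subject: str, is_channel_message: bool) -> str:
    """Return the "subject" value that a client would see."""
    if not is_channel_message:
        # DMs store the sentinel, but clients still receive "".
        return ""
    return stored_subject
```

The empty string stays valid for channel messages, so only the sentinel
itself is rejected in this sketch.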
Fixes #34360.
This commit removes a misleading comment regarding
'zerver_message_edit_history_id' index.
We added the index in 0679 to use in 0680 but later the 0680 migration
was reworked resulting in the index not being used in 0680.
We didn't drop the index as we expect it to be helpful for other
things.
The comment was misleading, so it is removed.
Indexes on topic ("subject") are polluted by the existence of DMs,
which all have empty topics, and as such skew the statistics greatly.
This is particularly important given the new use of the empty topic
for the "general chat" function -- left as-is, the database makes bad
query plans because it believes the topic is vastly more common than
it actually is.
We move the old indexes to a new name with `_all`, and
recreate (concurrently) the same indexes but with a condition on
is_channel_message. These new indexes remain unused until the
query-building logic adds limits on is_channel_message; see the
following commit.
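As a rough illustration of a partial index, the sketch below uses
Python's built-in `sqlite3` module as a stand-in for PostgreSQL. The
table layout and index name are assumptions made for the example, and
SQLite's planner and statistics differ from PostgreSQL's; only the
shape of the index condition carries over.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE message ("
    " id INTEGER PRIMARY KEY,"
    " recipient_id INTEGER NOT NULL,"
    " subject TEXT NOT NULL,"
    " is_channel_message INTEGER NOT NULL)"
)
# Partial index: only channel-message rows are indexed, so DM rows
# contribute no entries for the empty topic.
conn.execute(
    "CREATE INDEX message_recipient_subject"
    " ON message (recipient_id, subject)"
    " WHERE is_channel_message = 1"
)
conn.executemany(
    "INSERT INTO message (recipient_id, subject, is_channel_message)"
    " VALUES (?, ?, ?)",
    [(1, "", 1), (1, "design", 1), (2, "", 0), (3, "", 0), (4, "", 0)],
)
# A query can use the partial index only if its WHERE clause implies
# the index condition, hence the explicit is_channel_message limit.
rows = conn.execute(
    "SELECT id FROM message"
    " WHERE recipient_id = 1 AND subject = '' AND is_channel_message = 1"
).fetchall()
```

This is why the new indexes do nothing until the queries themselves
include the is_channel_message condition.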
Zulip now supports empty string as a valid topic name.
For clients predating this feature, such messages appear
in the "general chat" topic. Messages sent to "general chat" are
stored in the database as having a "" topic.
This commit adds a migration to rename the existing
"general chat" topic in the database to "".
Fixes parts of #32996.
This means that only the ImageAttachment row needs to be fetched, and
removes the need to pass around an extra parameter. This
denormalization is safe, since in general Attachment rows are
read-only, so we are not concerned with drift between the Attachment
and ImageAttachment tables.
We cannot make content_type non-null: while both the `content_type`
column in Attachment and populating it from requests predate the
ImageAttachment table, we have backfilled ImageAttachment rows to
consider, and imports may also leave files with no `content_type`.
Any backfill of currently-null `content_type` values will thus need
to update both tables.
This change fixes a race condition when importing. ImageAttachment
rows are imported before rendering Messages, which are both before
importing Attachment rows; if the thumbnailing finished after the
Message was imported but before Attachment rows were imported, then
the re-rendering step would not know the image's content-type.
This commit is a part of the work to support empty string
as a topic name.
Previously, empty string was not a valid topic name.
Adds an `empty_topic_name` client capability to allow a client
to specify whether it supports empty string as a topic name.
Adds backward compatibility for:
- `subject` field in the `message` event type
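A minimal sketch of how such backward compatibility could look,
assuming a hypothetical helper `subject_for_message_event` and
assuming that clients without the capability are shown "general chat"
in place of the empty topic:

```python
# Hypothetical helper; the fallback name is an assumption for this sketch.
FALLBACK_TOPIC_NAME = "general chat"


def subject_for_message_event(topic: str, empty_topic_name: bool) -> str:
    """Pick the `subject` value to send in a `message` event."""
    if topic == "" and not empty_topic_name:
        # Older clients cannot display an empty topic name.
        return FALLBACK_TOPIC_NAME
    return topic
```

Clients that declare the capability receive the stored value unchanged.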
Removes the deprecated `user` object from reaction objects returned by
the API, as it is redundant given the `user_id` field and is no longer
used by any clients.
A new table is created to track which path_id attachments are images,
and for those their metadata, and which thumbnails have been created.
Using path_id as the effective primary key lets us ignore if the
attachment is archived or not, saving some foreign key messes.
A new worker is added to observe events when rows are added to this
table, and to generate and store thumbnails for those images in
differing sizes and formats.
Previously the bot sent bot commands whenever an undefined message
was sent by the user. This commit intends to fix the problem so that
the bot will only respond to the first message it does not understand
and not reply to any future undefined messages.
Fixes part of #30049.
This prep commit adds a new OnboardingUserMessage model
that will be used to mark the new onboarding messages
for new users as unread and the first message of each
onboarding topic as starred.
This table won't include the old onboarding messages.
Migrate all `ids` of anything which does not have a foreign key from
the Message or UserMessage table (and would thus require walking
those) to be `bigint`. This is done by removing explicit
`BigAutoField`s, trading them for explicit `AutoField`s on the tables
to not be migrated, while updating `DEFAULT_AUTO_FIELD` to the new
default.
In general, the tables adjusted in this commit are small tables -- at
least compared to Messages and UserMessages.
Many-to-many tables without their own model class are adjusted by a
custom Operation, since they do not automatically pick up migrations
when `DEFAULT_AUTO_FIELD` changes[^1].
Note that this does multiple scans over tables to update foreign
keys[^2]. Large installs may wish to hand-optimize this using the
output of `./manage.py sqlmigrate` to join multiple `ALTER TABLE`
statements into one, to speed up the migration. This is unfortunately
not possible to do generically, as constraint names may differ between
installations.
This leaves the following primary keys as non-`bigint`:
- `auth_group.id`
- `auth_group_permissions.id`
- `auth_permission.id`
- `django_content_type.id`
- `django_migrations.id`
- `otp_static_staticdevice.id`
- `otp_static_statictoken.id`
- `otp_totp_totpdevice.id`
- `two_factor_phonedevice.id`
- `zerver_archivedmessage.id`
- `zerver_client.id`
- `zerver_message.id`
- `zerver_realm.id`
- `zerver_recipient.id`
- `zerver_userprofile.id`
[^1]: https://code.djangoproject.com/ticket/32674
[^2]: https://code.djangoproject.com/ticket/24203
Previously, when the operand of the id operator was greater than
2147483647, the server raised an internal error. This is because
2147483647 is the maximum value of a PostgreSQL `integer`.
This is fixed by raising a BadNarrowOperatorError when the
id operand is larger than 2147483647.
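The check can be sketched as follows. The exception class here is a
local stand-in for illustration, and `check_id_operand` and
`MAX_POSTGRES_INT` are hypothetical names:

```python
MAX_POSTGRES_INT = 2147483647  # 2**31 - 1, largest 4-byte signed integer


class BadNarrowOperatorError(Exception):
    """Stand-in for the real exception class, for this sketch only."""


def check_id_operand(operand: int) -> int:
    """Reject id operands that cannot fit in a PostgreSQL integer."""
    if operand > MAX_POSTGRES_INT:
        raise BadNarrowOperatorError("id operand is out of range")
    return operand
```

Validating before the query is built turns the failure into a client
error instead of a database error.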
For endpoints with a type parameter to indicate whether a message is
a direct or stream message, adds support for passing "channel" as a
value for stream messages.
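One way to accept the new value is to normalize it at the edge of the
endpoint. This is a sketch under the assumption that "channel" is
treated as an alias and mapped to "stream" internally;
`normalize_message_type` is a hypothetical name:

```python
def normalize_message_type(message_type: str) -> str:
    """Treat "channel" as an alias for "stream"; pass other values through."""
    if message_type == "channel":
        return "stream"
    return message_type
```

Existing clients that send "stream" are unaffected by this mapping.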
Part of the stream-to-channel rename project.
Calling `.select_related()` with no arguments joins through every
possible table, recursively. In this case, this currently produces a
query which joins through forty-three tables.
This is rather inefficient, particularly for what is a very common
call which should be very fast.
No callsite depends on having prefetched any joined table on the
object; drop all of the joins.
By default, `SELECT FOR UPDATE` will also lock any rows which are
`JOIN`ed into the selected rows; in the case of UserMessage rows, this
can mean arbitrary Message rows.
Since the messages themselves are not being changed, it is not
necessary to lock them -- and doing so may lead to deadlocks, in the
case that the UserMessage row is locked for update before the Message,
and some other request has already taken a read lock on the Message
and is blocked on the UserMessage write lock.
Change `select_for_update_query` to explicitly only lock UserMessage.