models: Add is_private flag to UserMessage and add index for it.

The is_private flag is intended to be set if the recipient type is
'private' (1) or 'huddle' (3); otherwise, i.e. if it is 'stream' (2),
it should be unset.

This commit adds a database index for the is_private flag (which we'll
need in order to use it). That index is used to reset the flag on any
rows where it was already set. Those stale values exist because the
is_me_message flag was previously removed without clearing its values.

For now, the is_private flag is always 0, since the really hard part of
this migration is clearing the unspecified previous state; future
commits will make the flag actually do something.

History: Migration rewritten significantly by tabbott to ensure it
runs in only 3 minutes on chat.zulip.org.  A key detail in making that
work was to ensure that we use the new index for the queries to find
rows to update (which currently requires the `order_by` and `limit`
clauses).
Shubham Padia
2018-06-24 20:19:18 +05:30
committed by Tim Abbott
parent 28589c5563
commit bf6dc4472b
6 changed files with 113 additions and 11 deletions


@@ -15,6 +15,17 @@ in bursts.
access full history, even before they joined the stream. access full history, even before they joined the stream.
- Added support for announcement-only streams. - Added support for announcement-only streams.
**Upgrade notes:**
* Zulip 1.9 contains a significant database migration that can take
several minutes to run. The upgrade process automatically minimizes
disruption by running this migration first, before beginning the
user-facing downtime. However, if you'd like to watch the downtime
phase of the upgrade closely, we recommend
[running them first manually](../production/expensive-migrations.html)
as well as the usual trick of
[doing an apt upgrade first](../production/maintain-secure-upgrade.html#applying-system-updates).
**Full feature changelog:** **Full feature changelog:**
- Added an organization setting for message deletion time limits. - Added an organization setting for message deletion time limits.
- Added an organization setting to control who can edit topics. - Added an organization setting to control who can edit topics.


@@ -4,8 +4,8 @@
# Running expensive migrations early # Running expensive migrations early
Zulip 1.7 contains some significant database migrations that can take Zulip 1.7 and 1.9 each contain some significant database migrations
several minutes to run. that can take several minutes to run.
The upgrade process automatically minimizes disruption by running The upgrade process automatically minimizes disruption by running
these first, before beginning the user-facing downtime. However, if these first, before beginning the user-facing downtime. However, if
@@ -19,6 +19,14 @@ can run them manually before starting the upgrade:
Postgres database. Postgres database.
3. In the postgres shell, run the following commands: 3. In the postgres shell, run the following commands:
CREATE INDEX CONCURRENTLY
zerver_usermessage_is_private_message_id
ON zerver_usermessage (user_profile_id, message_id)
WHERE (flags & 2048) != 0;
(This first migration, `zerver_usermessage_is_private_message_id`, is
the only one new in Zulip 1.9).
CREATE INDEX CONCURRENTLY CREATE INDEX CONCURRENTLY
zerver_usermessage_mentioned_message_id zerver_usermessage_mentioned_message_id
ON zerver_usermessage (user_profile_id, message_id) ON zerver_usermessage (user_profile_id, message_id)
@@ -44,13 +52,13 @@ can run them manually before starting the upgrade:
ON zerver_usermessage (user_profile_id, message_id) ON zerver_usermessage (user_profile_id, message_id)
WHERE (flags & 1) = 0; WHERE (flags & 1) = 0;
4. These will take some time to run, during which the server will These will take some time to run, during which the server will
continue to serve user traffic as usual with no disruption. Once continue to serve user traffic as usual with no disruption. Once they
they finish, you can proceed with installing Zulip 1.7. finish, you can proceed with installing Zulip 1.7.
To help you estimate how long these will take on your server: count To help you estimate how long these will take on your server: count
the number of UserMessage rows, with `select COUNT(*) from zerver_usermessage;` the number of UserMessage rows, with `select COUNT(*) from zerver_usermessage;`
at the `./manage.py dbshell` prompt. At the time these migrations at the `./manage.py dbshell` prompt. At the time these migrations
were run on chat.zulip.org, it had 75M UserMessage rows; the first 4 were run on chat.zulip.org, it had 75M UserMessage rows; the first 5
indexes took about 1 minute each to create, and the final, indexes took about 1 minute each to create, and the final,
"unread_message" index took more like 10 minutes. "unread_message" index took more like 10 minutes.

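The timing figures quoted above can be turned into a rough estimate for another server. The sketch below is only illustrative: it assumes index build time scales linearly with the number of UserMessage rows, which the documentation does not claim, and `estimate_index_minutes` is a hypothetical helper name.

```python
def estimate_index_minutes(row_count: int) -> float:
    """Rough estimate, ASSUMING build time scales linearly with row count."""
    reference_rows = 75_000_000     # UserMessage rows on chat.zulip.org
    # Five indexes at about 1 minute each, plus about 10 minutes for the
    # final "unread_message" index.
    reference_minutes = 5 * 1 + 10
    return reference_minutes * row_count / reference_rows


if __name__ == "__main__":
    # Plug in the result of `select COUNT(*) from zerver_usermessage;`
    print(estimate_index_minutes(15_000_000))
```

Treat the result as an order-of-magnitude guide only; disk speed and server load are not modeled.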

@@ -98,6 +98,7 @@ usermessage_index_migrations = [
"[ ] 0095_index_unread_user_messages", "[ ] 0095_index_unread_user_messages",
"[ ] 0098_index_has_alert_word_user_messages", "[ ] 0098_index_has_alert_word_user_messages",
"[ ] 0099_index_wildcard_mentioned_user_messages", "[ ] 0099_index_wildcard_mentioned_user_messages",
"[ ] 0177_user_message_add_and_index_is_private_flag",
] ]
# Our next optimization is to check whether any migrations are needed # Our next optimization is to check whether any migrations are needed
# before we start the critical section of the restart. This saves # before we start the critical section of the restart. This saves


@@ -88,6 +88,14 @@ def create_indexes() -> None:
where_clause='WHERE (flags & 8) != 0 OR (flags & 16) != 0', where_clause='WHERE (flags & 8) != 0 OR (flags & 16) != 0',
) )
# copied from 0177
create_index_if_not_exist(
index_name='zerver_usermessage_is_private_message_id',
table_name='zerver_usermessage',
column_string='user_profile_id, message_id',
where_clause='WHERE (flags & 2048) != 0',
)
class Command(ZulipBaseCommand): class Command(ZulipBaseCommand):
help = """Create concurrent indexes for large tables.""" help = """Create concurrent indexes for large tables."""


@@ -0,0 +1,78 @@
# -*- coding: utf-8 -*-
# Generated by Django 1.11.13 on 2018-06-14 13:39
from __future__ import unicode_literals

import sys
import bitfield.models

from django.db import migrations, transaction
from zerver.lib.migrate import create_index_if_not_exist # nolint
from django.db.backends.postgresql_psycopg2.schema import DatabaseSchemaEditor
from django.db.migrations.state import StateApps
from django.db.models import F

def reset_is_private_flag(
        apps: StateApps, schema_editor: DatabaseSchemaEditor) -> None:
    UserMessage = apps.get_model("zerver", "UserMessage")
    UserProfile = apps.get_model("zerver", "UserProfile")
    user_profile_ids = UserProfile.objects.all().order_by("id").values_list("id", flat=True)
    # We only need to do this because previous migration
    # zerver/migrations/0100_usermessage_remove_is_me_message.py
    # didn't clean the field after removing it.

    i = 0
    total = len(user_profile_ids)
    print("Setting default values for the new flag...")
    sys.stdout.flush()
    for user_id in user_profile_ids:
        while True:
            # Ideally, we'd just do a single database query per user.
            # Unfortunately, Django doesn't use the fancy new index on
            # is_private that we just generated if we do that,
            # resulting in a very slow migration that could take hours
            # on a large server. We address this issue by doing a bit
            # of hackery to generate the SQL just right (with an
            # `ORDER BY` clause that forces using the new index).
            flag_set_objects = UserMessage.objects.filter(user_profile__id = user_id).extra(
                where=["flags & 2048 != 0"]).order_by("message_id")[0:1000]
            user_message_ids = flag_set_objects.values_list("id", flat=True)
            count = UserMessage.objects.filter(id__in=user_message_ids).update(
                flags=F('flags').bitand(~UserMessage.flags.is_private))
            if count < 1000:
                break

        i += 1
        if (i % 50 == 0 or i == total):
            percent = round((i / total) * 100, 2)
            print("Processed %s/%s %s%%" % (i, total, percent))
            sys.stdout.flush()

class Migration(migrations.Migration):
    atomic = False

    dependencies = [
        ('zerver', '0176_remove_subscription_notifications'),
    ]

    operations = [
        migrations.AlterField(
            model_name='archivedusermessage',
            name='flags',
            field=bitfield.models.BitField(['read', 'starred', 'collapsed', 'mentioned', 'wildcard_mentioned', 'summarize_in_home', 'summarize_in_stream', 'force_expand', 'force_collapse', 'has_alert_word', 'historical', 'is_private'], default=0),
        ),
        migrations.AlterField(
            model_name='usermessage',
            name='flags',
            field=bitfield.models.BitField(['read', 'starred', 'collapsed', 'mentioned', 'wildcard_mentioned', 'summarize_in_home', 'summarize_in_stream', 'force_expand', 'force_collapse', 'has_alert_word', 'historical', 'is_private'], default=0),
        ),
        migrations.RunSQL(
            create_index_if_not_exist(
                index_name='zerver_usermessage_is_private_message_id',
                table_name='zerver_usermessage',
                column_string='user_profile_id, message_id',
                where_clause='WHERE (flags & 2048) != 0',
            ),
            reverse_sql='DROP INDEX zerver_usermessage_is_private_message_id;'
        ),
        migrations.RunPython(reset_is_private_flag,
                             reverse_code=migrations.RunPython.noop),
    ]
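
The batching pattern in `reset_is_private_flag` can be tried outside Django. The following is a minimal sketch, not the migration itself: it uses Python's built-in `sqlite3` in place of Postgres and the Django ORM, and the `usermessage` table, the `usermessage_is_private` index name, and the `clear_flag_in_batches` helper are hypothetical names chosen for illustration. It also shows where the `2048` mask comes from, on the assumption (consistent with the `flags & 1`, `8`, and `16` masks used elsewhere in this diff) that the flag at position i in the list owns bit `1 << i`.

```python
import sqlite3

# Flag order as listed in the BitField definition; position i owns bit 1 << i.
ALL_FLAGS = ['read', 'starred', 'collapsed', 'mentioned', 'wildcard_mentioned',
             'summarize_in_home', 'summarize_in_stream', 'force_expand',
             'force_collapse', 'has_alert_word', 'historical', 'is_private']
IS_PRIVATE = 1 << ALL_FLAGS.index('is_private')  # 2048
BATCH_SIZE = 1000


def clear_flag_in_batches(conn: sqlite3.Connection, user_id: int) -> int:
    """Clear the is_private bit for one user, at most BATCH_SIZE rows per UPDATE."""
    total = 0
    while True:
        cur = conn.execute(
            """
            UPDATE usermessage
            SET flags = flags & ?
            WHERE id IN (
                SELECT id FROM usermessage
                WHERE user_profile_id = ? AND (flags & ?) != 0
                ORDER BY message_id
                LIMIT ?
            )
            """,
            (~IS_PRIVATE, user_id, IS_PRIVATE, BATCH_SIZE),
        )
        total += cur.rowcount
        # A short batch means no flagged rows remain for this user.
        if cur.rowcount < BATCH_SIZE:
            return total


# Hypothetical stand-in for zerver_usermessage, with a matching partial index.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE usermessage (id INTEGER PRIMARY KEY, "
    "user_profile_id INTEGER, message_id INTEGER, flags INTEGER)"
)
conn.execute(
    "CREATE INDEX usermessage_is_private "
    "ON usermessage (user_profile_id, message_id) WHERE (flags & 2048) != 0"
)
# User 1 has 2500 rows with the stale bit set (plus 'read'); user 2 has none.
rows = [(1, m, IS_PRIVATE | 1) for m in range(2500)]
rows += [(2, m, 1) for m in range(10)]
conn.executemany(
    "INSERT INTO usermessage (user_profile_id, message_id, flags) VALUES (?, ?, ?)",
    rows,
)

cleared = clear_flag_in_batches(conn, 1)
remaining = conn.execute(
    "SELECT COUNT(*) FROM usermessage WHERE (flags & 2048) != 0"
).fetchone()[0]
print(cleared, remaining)
```

Whether the query planner actually picks the partial index depends on the database engine; the sketch only demonstrates the loop-until-short-batch logic and the bit arithmetic, not the Postgres performance behavior described in the commit message.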


@@ -1523,13 +1523,9 @@ class Reaction(models.Model):
# though each row is only 4 integers. # though each row is only 4 integers.
class AbstractUserMessage(models.Model): class AbstractUserMessage(models.Model):
user_profile = models.ForeignKey(UserProfile, on_delete=CASCADE) # type: UserProfile user_profile = models.ForeignKey(UserProfile, on_delete=CASCADE) # type: UserProfile
# WARNING: We removed the previously-final flag,
# is_me_message, without clearing any values it might have had in
# the database. So when we next add a flag, you need to do a
# migration to set it to 0 first
ALL_FLAGS = ['read', 'starred', 'collapsed', 'mentioned', 'wildcard_mentioned', ALL_FLAGS = ['read', 'starred', 'collapsed', 'mentioned', 'wildcard_mentioned',
'summarize_in_home', 'summarize_in_stream', 'force_expand', 'force_collapse', 'summarize_in_home', 'summarize_in_stream', 'force_expand', 'force_collapse',
'has_alert_word', "historical"] 'has_alert_word', "historical", "is_private"]
flags = BitField(flags=ALL_FLAGS, default=0) # type: BitHandler flags = BitField(flags=ALL_FLAGS, default=0) # type: BitHandler
class Meta: class Meta: