Commit Graph

2007 Commits

Author SHA1 Message Date
Steve Howell
b4346d0276 performance: Extract subscribers/peers in bulk.
We replace get_peer_user_ids_for_stream_change
with two bulk functions to get peers and/or
subscribers.

Note that we have three codepaths that care about
peers:

    subscribing existing users:
        we need to tell peers about new subscribers
        we need to tell subscribed user about old subscribers

    unsubscribing existing users:
        we only need to tell peers who unsubscribed

    subscribing new user:
        we only need to tell peers about the new user
        (right now we generate send_event
        calls to tell the new user about existing
        subscribers, but this is a waste
        of effort that we will fix soon)

The two bulk functions are this:

    bulk_get_subscriber_peer_info
    bulk_get_peers

They have some overlap in the implementation,
but there are some nuanced differences that are
described in the comments.

Looking up peers/subscribers in bulk leads to some
nice optimizations.

We will save some memchached traffic if you are
subscribing to multiple public streams.

We will save a query in the remove-subscriber
case if you are only dealing with private streams.
2020-10-15 15:12:01 -07:00
Steve Howell
94e41c71f9 refactor: Use set of ids for altered users. 2020-10-15 15:12:01 -07:00
Steve Howell
b894597fa3 refactor: Use sets of stream_ids for helper args. 2020-10-15 15:12:01 -07:00
Steve Howell
3889554977 refactor: Extract send_peer_remove_events. 2020-10-15 15:12:01 -07:00
Tim Abbott
bf66e9c4ab actions: Add transaction.atomic to bulk_add_subs_to_db_with_logging.
This will ensure that we always fully execute the database part of
modifying subscription objects.  In particular, this should prevent
invariant failures like #16347 where Subscription objects were created
without corresponding RealmAuditLog entries.

Fixes #16347.
2020-10-14 11:06:00 -07:00
Steve Howell
5728149e94 performance: Streamline query to add subscribers.
We don't need the select_related('user_profile')
optimization any more, because we just keep
track of user info in our own data structures.

In this codepath we are never actually modifying
users; we just occasionally need their ids or
emails.

This can be a pretty substantive improvement if
you are adding a bunch of users to a stream
who each have a bunch of their own subscriptions.

We could also limit the number of full rows in this
query by adding an extra hop to the DB just to
get colors (using values_list), and then only get
full sub info for the streams that we're adding, rather
than getting every single subscription, in full, for each user.

Apart from finding what colors the user has already
used, the only other reason we need all the columns
in Subscription here is to handle streams that
need to be reactivated.  Otherwise we could do
only("id", "active", "recipient_id", "user_profile_id")
or similar.  Fortunately, Subscription isn't
an overly wide table; it's mostly bool fields.

But by far the biggest thing to avoid is bringing
in all the extra user_profile data.

We have pretty good coverage on query counts here,
so I think this fix is pretty low risk.
2020-10-14 11:03:07 -07:00
Steve Howell
116a441bc5 refactor: Introduce SubInfo class.
This class removes a lot of the annoying tuples
we were passing around.

Also, by including the user everywhere, which
is easily available to us when we make instances
of SubInfo, it sets the stage to remove
select_related('user_profile').
2020-10-14 10:53:10 -07:00
Steve Howell
febef45e38 minor: Add comments to do_get_streams. 2020-10-14 10:53:10 -07:00
Steve Howell
a9356508ca events: Stop sending occupy/vacate events.
We used to send occupy/vacate events when
either the first person entered a stream
or the last person exited.

It appears that our two main apps have never
looked at these events.  Instead, it's
generally the case that clients handle
events related to stream creation/deactivation
and subscribe/unsubscribe.

Note that we removed the apply_events code
related to these events.  This doesn't affect
the webapp, because the webapp doesn't care
about the "streams" field in do_events_register.

There is a theoretical situation where a
third party client could be the victim of
a race where the "streams" data includes
a stream where the last subscriber has left.
I suspect in most of those situations it
will be harmless, or possibly even helpful
to the extent that they'll learn about
streams that are in a "quasi" state where
they're activated but not occupied.

We could try to patch apply_event to
detect when subscriptions get added
or removed. Or we could just make the
"streams" piece of do_events_register
not care about occupy/vacate semantics.
I favor the latter, since it might
actually be what users what, and it will
also simplify the code and improve
performance.
2020-10-14 10:53:10 -07:00
Steve Howell
598601e8fc stream events: Prevent spurious events.
If a user asks to be subscribed to a stream
that they are already subscribed to, then
that stream won't be in new_stream_user_ids,
and we won't need to send an event for it.

This change makes that happen more automatically.
2020-10-13 11:28:17 -07:00
Steve Howell
18771099e4 performance: Introduce new_stream_user_ids.
Let
    U = number of users to subscribe
    S = number of streams to subscribe

We were technically doing N^3 amount of work
when we sent certain events, or to be more
precise, U * S * S amount of work.  For each
stream, we were looping through a list of tuples
of size U * S to find the users for the stream.

In practice either U or S is usually 1, so the
performance gains here are probably negligible,
especially since the constant factors here
were just slinging around Python data.

But the code is actually more readable now, so
it's a double win.
2020-10-13 11:28:17 -07:00
Steve Howell
ebb605319b refactor: Rename stream_map to recipient_id_to_stream.
I want to make a new dict called stream_id_to_stream,
and stream_map would be confusing.
2020-10-13 11:28:17 -07:00
Steve Howell
b502957184 refactor: Extract new_recipient_ids local.
We rename needs_new_sub (which sounds like
a boolean!) to new_recipient_ids, and we
calculate it explicitly within the loop, so
that we don't need to worry as much about
subsequent passes through the loop mutating it.

This allows us to also remove recipient_ids,
which in turn lets us remove recipients_map,
albeit with a small tweak for stream_map.

I also introduce the my_subs local, which
I use to more directly populate used_colors,
as well as using it as the loop var.
2020-10-13 11:28:17 -07:00
Steve Howell
766892d8aa import: Reuse get_last_message_id() helper. 2020-10-13 11:28:17 -07:00
Steve Howell
9df9934ed6 refactor: Pass realm to bulk_add_subscriptions.
I think it's important that the callers understand
that bulk_add_subscriptions assumes all streams
are being created within a single realm, so I make
it an explicit parameter.

This may be overkill--I would also be happy if we
just included the assertions from this commit.
2020-10-13 11:28:17 -07:00
Steve Howell
efc931a671 minor: Extract realm local. 2020-10-13 11:28:17 -07:00
Steve Howell
b2d0a2efb9 refactor: Extract send_subscription_add_events.
This function now does all the work that we used
to do with notify_subscriptions_added happening
inside a loop.

There's a small fine-tuning here, where we only
get recent traffic on streams that we're actually
sending events for.
2020-10-13 11:28:17 -07:00
Steve Howell
223ce83a0a refactor: Clean up call to notify_subscriptions_added.
We now just pass in all_subscribers_by_stream, rather
than a callback.

We also move sub_tuples_by_user closer to the
loop where we call notify_subscriptions_added.
2020-10-13 11:28:17 -07:00
Steve Howell
811426b345 Extract send_stream_creation_events_for_private_streams.
We can probably avoid passing in users here.
2020-10-12 16:40:37 -07:00
Steve Howell
1cfaef0d1a refactor: Simplify pick_color logic.
This removes the need to jankily mutate
the active flag in the caller, and we don't
need to mutate our subs_by_user either.
2020-10-12 16:40:37 -07:00
Steve Howell
13569ff97a refactor: Eliminate new_subs.
We now just process new subs for a user immediately
within the loop.
2020-10-12 16:40:37 -07:00
Steve Howell
8c70fbde78 refactor: Use subs_to_add in return value.
The subs_to_add is directly related to a var
called new_subs, which I hope to eliminate
soon.
2020-10-12 16:40:37 -07:00
Steve Howell
1afca3d430 minor: Extract local for stream. 2020-10-12 16:40:37 -07:00
Steve Howell
84aa1389d8 Extract bulk_add_subs_to_db_with_logging.
This is a trivial code extraction.
2020-10-12 16:40:37 -07:00
Steve Howell
3ff9ce78ea refactor: Extract send_peer_add_events. 2020-10-12 16:40:37 -07:00
Anders Kaseorg
aabef3d9be python: Catch specific exceptions from orjson.
Followup to #16120.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-10-11 16:11:41 -07:00
sahil839
6c473ed75f message: Call build_message_send_dict from check_message.
We call build_message_send_dict from check_message instead of
do_send_messages.

This is a prep commit for adding a new setting for handling
wildcard mentions in large streams.
2020-09-29 17:18:04 -07:00
sahil839
f1a5fbaeb0 message: Extract build_message_send_dict function.
We extract the loop for building message dict in
do_send_messages in a separate function named
build_message_send_dict.

This is a prep commit for moving the code for building
of message dict in check_message.
2020-09-29 16:50:47 -07:00
sahil839
0514ba7ecb message: Add 'links_for_embed' to message_dict.
There is a bug where we send event for even
those messages which do not have embedded links
as we are using single set 'links_for_embed' to
check whether we have to send event for
embedded links or not.

This commit fixes the bug by adding 'links_for_embed'
in message dict itself and send the event only
if that message has embedded links.
2020-09-29 16:50:47 -07:00
Steve Howell
2c496d9afd mypy: Fix do_send_user_group_update_event. 2020-09-29 16:49:10 -07:00
Vishnu KS
367c792968 actions: Downgrade realm before scrubbing. 2020-09-28 15:37:49 -07:00
Vishnu KS
0d30f59c97 billing: downgrade_now -> downgrade_now_without_creating_additional_invoice. 2020-09-28 15:37:49 -07:00
Tim Abbott
99396b25a6 MessageDict: Add a bit of docstring documentation. 2020-09-28 11:50:02 -07:00
Tim Abbott
90ff62aabc actions: Rename message local variable to message_dict.
This is a preparatory refactor to make it easy to see the changes
using `git show` in the next commit.
2020-09-28 11:14:59 -07:00
sahil839
ae74f8aafb actions: Remove unnecessary comment in do_send_messages function.
This commit removes the unnecessary comment which was added in
9454683108, when we were using message.get() for keys which
were also passed as args in do_send_messages, but there are no
such keys in the current code.
2020-09-28 10:58:35 -07:00
sahil839
76c75fea92 actions: Remove unnecessary line from do_send_messages.
This commit removes the unnecessary line of code to get
rendered_content from message dict sent by check_message
when it actually does not inlcude 'rendered_content' key.

This line was added in 9454683108, but now we do not send
rendered_content in the message dict as we render the message
in do_send_messages itself.
2020-09-28 10:58:35 -07:00
Aman Agrawal
2bc3924672 move_topic_to_stream: Allow moving to/between/from private streams.
Fixes #16284.

Most of the work for this was done when we implemented correct
behavior for guest users, since they treat public streams like private
streams anyway.

The general method involves moving the messages to the new stream with
special care of UserMessage.

We delete UserMessages for subs who are losing access to the message.
For private streams with protected history, we also create UserMessage
elements for users who are not present in the old stream, since that's
important for those users to access the moved messages.
2020-09-14 15:00:55 -07:00
Anders Kaseorg
3b301f522b python: Tweak some magic trailing commas to avoid Black bugs.
https://github.com/psf/black/issues/1658
https://github.com/psf/black/issues/1671

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-09-03 17:51:09 -07:00
Anders Kaseorg
f91d287447 python: Pre-fix a few spots for better Black formatting.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-09-03 17:51:09 -07:00
Anders Kaseorg
bef46dab3c python: Prefer kwargs form of dict.update.
For less inflation by Black.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-09-03 17:51:09 -07:00
Vishnu KS
6bbcb622e1 onboarding: Move send_welcome_bot_response to onboarding. 2020-09-03 17:41:08 -07:00
Anders Kaseorg
a610bd19a1 python: Simplify away various unnecessary lists and list comprehensions.
Loosely inspired by the flake8-comprehensions plugin.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-09-02 11:15:41 -07:00
Anders Kaseorg
ab120a03bc python: Replace unnecessary intermediate lists with generators.
Mostly suggested by the flake8-comprehension plugin.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-09-02 11:15:41 -07:00
Mateusz Mandera
9b50c49ea7 streams: Mark all messages as read when deactivating a stream.
The query to finds and marks all unread UserMessages in the stream as read
can be quite expensive, so we'll move that work to the deferred_work
queue and split it into batches.

Fixes #15770.
2020-09-01 11:24:27 -07:00
Alex Vandiver
81893c9dbb actions: Invalid flag operation is a user error. 2020-08-29 11:38:59 -04:00
orientor
372e010dbb events: Add op field to update_message_flags events.
`update_message_flags` events used `operation` instead of `op`, the
latter being the standard field used in other events. So add `op`
field to `update_message_flags` and mark `operation` as deprecated,
so that it can be removed later.
2020-08-24 12:42:03 -07:00
Clara Dantas
05bf72a75c attachments: Add is_web_public field.
This commit adds the is_web_public field in the AbstractAttachment
class. This is useful when validating user access to the attachment,
as otherwise we would have to make a query in the db to check if
that attachment was sent in a message in a web-public stream or not.
2020-08-12 17:26:03 -07:00
Alex Vandiver
153f16ee6a links: Flatten the set into a list before serializing into the queue.
orjson does not transparently do this set-to-list translation, unlike
ujson.
2020-08-12 11:42:24 -07:00
Anders Kaseorg
61d0417e75 python: Replace ujson with orjson.
Fixes #6507.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-08-11 10:55:12 -07:00
Anders Kaseorg
768f9f93cd docs: Capitalize Markdown consistently.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-08-11 10:23:06 -07:00