275 Commits

Author SHA1 Message Date
Rishi Gupta
37bdc7c010 analytics: Remove COUNT_STATS['messages_sent:hour'].
Having both messages_sent:hour and messages_sent:is_bot:day is confusing,
since a single messages_sent:is_bot:hour would have a superset of the
information and take less total space. This commit and its parent together
replace the two stats with a single messages_sent:is_bot:hour.
2017-01-17 15:54:57 -08:00
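The "superset" claim in 37bdc7c010 can be illustrated with a small sketch. The row layout `(end_time, is_bot, count)` and the sample values are assumptions made for illustration, not the actual schema, and bucketing days by the calendar date of `end_time` is a simplification.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical hourly rows for messages_sent:is_bot:hour: (end_time, is_bot, count).
rows = [
    (datetime(2017, 1, 16, 9), False, 5),
    (datetime(2017, 1, 16, 9), True, 2),
    (datetime(2017, 1, 16, 10), False, 3),
    (datetime(2017, 1, 17, 9), True, 4),
]


def totals_per_hour(rows):
    """Recover a messages_sent:hour series by summing over the is_bot subgroups."""
    out = defaultdict(int)
    for end_time, _is_bot, count in rows:
        out[end_time] += count
    return dict(out)


def per_day_by_is_bot(rows):
    """Recover a messages_sent:is_bot:day series by summing the hours of each day."""
    out = defaultdict(int)
    for end_time, is_bot, count in rows:
        out[(end_time.date(), is_bot)] += count
    return dict(out)
```

Both of the removed stats fall out of the single hourly, subgroup'd stat by summation, which is why keeping all three would be redundant.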
Rishi Gupta
b593ac9d7c analytics: Change messages_sent:is_bot to hourly frequency.
In preparation for replacing messages_sent.
2017-01-17 15:54:57 -08:00
Rishi Gupta
68fcb4152f analytics: Remove interval field from *Count tables.
Includes a database migration. The interval field was originally there to
facilitate time aggregation (e.g. aggregate_hour_to_day), but we now do such
aggregations in views code or in the frontend.
2017-01-17 15:54:57 -08:00
Rishi Gupta
a8f2ebb443 analytics: Include interval in COUNT_STATS property names. 2017-01-17 15:54:57 -08:00
Rishi Gupta
c466036c80 analytics: Remove unneeded references to interval from test_counts.py. 2017-01-17 15:54:57 -08:00
Rishi Gupta
12d277d4f4 analytics: Change messages_sent:client stat to daily frequency.
A few reasons:
* Our two other subgroup'd message stats in UserCount are at CountStat.DAY
  frequency (messages_sent:is_bot and messages_sent:message_type).
* Keeping this stat at hourly frequency would likely double the size of our
  analytics table, given the current stats. (Counterpoint: if there are
  roughly as many active streams as active users, and we keep
  messages_sent_to_stream:is_bot at hourly frequency, then maybe this stat
  is only a 30% or 50% increase.)
* We're currently only showing this on the frontend as a pie chart anyway.
2017-01-17 15:54:57 -08:00
Rishi Gupta
690002aef8 analytics: Add fixtures for several CountStats. 2017-01-17 15:54:57 -08:00
Rishi Gupta
2710a944e8 analytics: Refactor fixture creation to make it more general.
Also less verbose, in preparation for adding a bunch more fixtures.
2017-01-17 15:54:57 -08:00
Rishi Gupta
1f4a4e5e26 analytics: Force --clear-existing-data option in populate_analytics_db.
Makes more sense for a fixture generating script to just clear the existing
data every time.
2017-01-17 15:54:57 -08:00
Rishi Gupta
680e7f75e1 analytics: Change generate_time_series_data argument from length to days.
Previously, this function seemed ambivalent about whether it was generating
a series of abstract data points or a series of data points that would
correspond to times. Switch firmly to the latter, so e.g. if the frequency
changes, so will the length of the output sequence.
2017-01-17 15:54:57 -08:00
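A minimal sketch of the idea in 680e7f75e1, assuming a simplified signature: only the function name and the `days` argument come from the commit message, while `frequency`, `max_value`, and `seed` are hypothetical parameters added for illustration.

```python
import random


def generate_time_series_data(days=100, frequency="day", max_value=10, seed=0):
    """Return one data point per hour or per day covering `days` days.

    Hypothetical sketch: the output length follows from the time span and
    the frequency, rather than being passed in directly.
    """
    points_per_day = {"hour": 24, "day": 1}[frequency]
    rng = random.Random(seed)
    return [rng.randint(0, max_value) for _ in range(days * points_per_day)]
```

With this shape, switching a stat from daily to hourly changes the length of the series without the caller having to recompute it.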
Rishi Gupta
3712fda30d analytics: Ensure fixture data points are non-negative. 2017-01-17 15:54:57 -08:00
Rishi Gupta
ecfc336a15 analytics: Add views for remaining /stats graphs. 2017-01-17 15:54:57 -08:00
Rishi Gupta
73c0c4c52e analytics/views.py: Increase efficiency of get_time_series_by_subgroup.
This may not have been a performance problem in practice, but the function
was originally making a database query for each subgroup (instead of just a
single query getting data for all the subgroups).

Also removed the filter against the interval column, which will soon not be
needed (interval will be uniquely determined by the property).
2017-01-17 15:54:57 -08:00
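The change described in 73c0c4c52e can be sketched with the standard-library `sqlite3` module. The `realmcount` table, its columns, and the function name below are simplified assumptions, not the actual analytics schema or the real `get_time_series_by_subgroup`.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE realmcount (property TEXT, subgroup TEXT, end_time TEXT, value INTEGER)"
)
conn.executemany(
    "INSERT INTO realmcount VALUES (?, ?, ?, ?)",
    [
        ("messages_sent:is_bot", "false", "2017-01-16", 8),
        ("messages_sent:is_bot", "true", "2017-01-16", 2),
        ("messages_sent:is_bot", "false", "2017-01-17", 5),
    ],
)


def time_series_by_subgroup(conn, prop, subgroups):
    """Fetch every subgroup in one query, then split the rows in Python."""
    series = {subgroup: defaultdict(int) for subgroup in subgroups}
    query = "SELECT subgroup, end_time, value FROM realmcount WHERE property = ?"
    for subgroup, end_time, value in conn.execute(query, (prop,)):
        if subgroup in series:
            series[subgroup][end_time] += value
    return {subgroup: dict(values) for subgroup, values in series.items()}


result = time_series_by_subgroup(conn, "messages_sent:is_bot", ["false", "true"])
```

The number of database round trips stays at one regardless of how many subgroups the stat has.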
Rishi Gupta
d873902755 analytics/views.py: Refactor get_messages_sent_by_humans_and_bots.
Refactor out the reusable parts, since we're about to add several more
views.
2017-01-17 15:54:57 -08:00
Rishi Gupta
3a72b5cda9 analytics: Rename messages_sent_to_realm.
Several additional stats in the pipeline that also relate to messages sent
to the realm.
2017-01-17 15:54:57 -08:00
Rishi Gupta
cdb1c96169 analytics tests: Refactor assertCountEquals calls to be more readable. 2017-01-17 15:54:57 -08:00
Rishi Gupta
59d50c3a47 analytics tests: Make it easy to refer to users in test realm. 2017-01-17 15:54:57 -08:00
Rishi Gupta
54e66e6079 analytics: Add remaining backend tests in TestCountStats. 2017-01-17 15:54:57 -08:00
aakash-cr7
b373f2ef0f analytics: Add backend test for messages_sent_to_stream:is_bot. 2017-01-17 15:54:57 -08:00
Amy Liu
10c0c2b16d analytics: Add backend tests for messages_sent:message_type. 2017-01-17 15:54:57 -08:00
Rishi Gupta
f30b174199 analytics: Set property and interval defaults in assertCountEquals. 2017-01-17 15:54:57 -08:00
Rishi Gupta
a563a15f88 analytics: Make TestCountStats tests more robust.
Adds two things to TestCountStats.setUp():
* A realm with no messages, that generally should not show up in *Count
  tables,
* Users/streams/messages created at 0, 1, 61, and 1441 (just over a day)
  minutes ago (previously was 0, 60), to better test the start_time/end_time
  in the queries, and the frequency/interval setting in the CountStats.
2017-01-17 15:54:57 -08:00
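Why the offsets in a563a15f88 are useful can be seen in a small sketch. It assumes that a count covers the half-open interval `[end_time - length, end_time)`; that boundary convention, the `in_interval` helper, and the value of `TIME_ZERO` are assumptions for illustration.

```python
from datetime import datetime, timedelta

TIME_ZERO = datetime(2016, 3, 14, 4, 0)  # hypothetical reference time


def in_interval(event_time, end_time, length):
    """True if event_time falls in the half-open interval [end_time - length, end_time)."""
    return end_time - length <= event_time < end_time


offsets = [0, 1, 61, 1441]  # minutes before TIME_ZERO, as in the commit message
events = {m: TIME_ZERO - timedelta(minutes=m) for m in offsets}

in_last_hour = [m for m, t in events.items() if in_interval(t, TIME_ZERO, timedelta(hours=1))]
in_last_day = [m for m, t in events.items() if in_interval(t, TIME_ZERO, timedelta(days=1))]
```

Under that assumption each offset lands on a different side of some boundary: 0 sits exactly on the end of the interval, 1 is inside the last hour, 61 is outside the hour but inside the day, and 1441 is outside the day.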
Rishi Gupta
e94bc8f142 analytics tests: Autogenerate names for create* functions. 2017-01-17 15:54:57 -08:00
Amy Liu
f7ce76fb63 analytics: Add create_stream_with_recipient and create_huddle_with_recipient.
This commit replaces AnalyticsTestCase.create_stream with
create_stream_with_recipient and adds the method
create_huddle_with_recipient.
2017-01-17 15:54:57 -08:00
Rishi Gupta
f375caed46 /activity: Fix URL route for analytics.views.get_realm_activity.
analytics.views.get_realm_activity was taking a 'realm_str', but the URL
route was expecting a 'realm'. Changed the URL route to take a 'realm_str'.
2017-01-12 15:21:06 -08:00
Rishi Gupta
3f2a002c6e analytics/lib/counts.py: Fix one of the COUNT_STATS definitions.
Fixes an error in the definition of
COUNT_STATS['messages_sent_to_stream:is_bot']. The CountStat needs a
group_by argument since it is supposed to group by UserProfile.is_bot.
2017-01-10 20:41:07 -08:00
Rishi Gupta
977f5b9178 analytics/lib/counts.py: Fix error in count_message_type_by_user_query.
This query counts the number of messages each user has sent, subgroup'd by
whether the message was a private_message (PM or sent to a huddle), sent to
a 'private_stream', or sent to a 'public_stream'.

We need to join on zerver_stream to find out whether stream messages were
sent to public streams or private streams, but it needs to be a LEFT JOIN
rather than a JOIN so that we preserve the messages sent to non-streams.
2017-01-10 20:41:07 -08:00
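The JOIN versus LEFT JOIN point in 977f5b9178 can be reproduced with `sqlite3`. The two tables below are heavily simplified stand-ins (hypothetical columns, no recipient table, no per-user grouping) rather than the real zerver schema or the real count_message_type_by_user_query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stream (id INTEGER PRIMARY KEY, invite_only INTEGER);
    CREATE TABLE message (id INTEGER PRIMARY KEY, sender TEXT, stream_id INTEGER);
    INSERT INTO stream VALUES (1, 0), (2, 1);
    INSERT INTO message VALUES
        (1, 'alice', 1),     -- public stream
        (2, 'alice', 2),     -- private stream
        (3, 'alice', NULL),  -- PM or huddle: no stream row
        (4, 'alice', NULL);
""")

QUERY = """
    SELECT
        CASE
            WHEN message.stream_id IS NULL THEN 'private_message'
            WHEN stream.invite_only = 1 THEN 'private_stream'
            ELSE 'public_stream'
        END AS message_type,
        COUNT(*)
    FROM message
    {join} stream ON message.stream_id = stream.id
    GROUP BY message_type
"""

# An inner JOIN drops rows with no matching stream; a LEFT JOIN keeps them.
inner = dict(conn.execute(QUERY.format(join="JOIN")).fetchall())
left = dict(conn.execute(QUERY.format(join="LEFT JOIN")).fetchall())
```

With the inner JOIN the `private_message` subgroup silently disappears from the result, which matches the kind of error the commit describes.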
Rishi Gupta
6374596a77 analytics: Add initial fixture for testing views. 2017-01-10 17:48:07 -08:00
Tim Abbott
3f8d4193da lint: Fix % comprehensions being used without a tuple. 2017-01-09 11:45:11 -08:00
Rishi Gupta
ac29928d91 Remove domain from analytics management commands. 2017-01-09 11:26:08 -08:00
Rishi Gupta
e14f575979 Remove domain from analytics/views.py. 2017-01-09 11:26:08 -08:00
Rishi Gupta
552d626ef2 analytics: Fix FillState.last_modified not being updated.
We were updating FillState with FillState.objects.filter(..).update(..),
which does not update the last_modified field (which has auto_now=True).
The correct incantation is the save() method of the actual FillState
object.
2017-01-08 23:36:34 -08:00
Rishi Gupta
190d320afa analytics: Change CountStat.property from Text to str. 2017-01-08 17:24:51 -08:00
Rishi Gupta
a07757c127 analytics/views: Fix query in get_messages_sent_to_realm. 2017-01-08 17:24:51 -08:00
Rishi Gupta
f8962d521d analytics: Fix uses of 'interval' in arguments and variable names.
interval refers to a time interval, whereas frequency means something
closer to 'hourly' or 'daily'.

Currently, interval can have values 'hour', 'day', or 'gauge', and frequency
can only have values 'hour' and 'day'.
2017-01-08 17:24:51 -08:00
Rishi Gupta
f5899dd14b analytics: Add lib/ function to drop all analytics tables. 2017-01-08 17:24:51 -08:00
Rishi Gupta
73dc904e9c analytics: Move time_range from views.py to lib/time_utils.py 2017-01-08 17:24:51 -08:00
Tommy Ip
28abfca565 analytics: Fix bare except clause. 2017-01-08 16:25:22 -08:00
Rishi Gupta
2b0a7fd0ba Rename models.get_realm_by_string_id to get_realm.
Finishes the refactoring started in c1bbd8d. The goal of the refactoring is
to change the argument to get_realm from a Realm.domain to a
Realm.string_id. The steps were:

* Add a new function, get_realm_by_string_id.

* Change all calls to get_realm to use get_realm_by_string_id instead.

* Remove get_realm.

* (This commit) Rename get_realm_by_string_id to get_realm.

Part of a larger migration to remove the Realm.domain field entirely.
2017-01-04 17:12:23 -08:00
Rishi Gupta
605361ec86 makemessages: Fix string with unnamed arguments in analytics/views.py. 2016-12-30 16:52:24 -08:00
Rishi Gupta
9e5325a164 Add /stats page with basic stats graph.
Adds a new url route and a new json endpoint.
2016-12-29 14:20:13 -08:00
Rishi Gupta
31efe858ef Clean up imports in analytics/views.py. 2016-12-29 14:20:13 -08:00
Rishi Gupta
717afcb408 Remove calls to get_realm in preparation for its deprecation.
Also removes two calls to email_to_domain.
2016-12-26 17:53:32 -08:00
Rishi Gupta
c7c0e36508 analytics: Add InstallationCount checks to prototype TestCountStat.
This was enabled by commit 41e8ee3, where we moved TIME_ZERO to before the
realms created by populate_db.py.

Also removes the stub for TestAggregates, since the remaining thing to be
tested was the aggregation from RealmCount to InstallationCount, and the end
to end checks provided by the TestCountStat tests should be sufficient.
2016-12-20 12:03:23 -08:00
Rishi Gupta
dbc94d0fc0 analytics: Remove test for no longer supported behavior.
In a previous design, there was no FillState table, and one could run any
CountStat at any time. This is no longer supported.

This test was making sure that if one ran a CountStat at a certain hour, and
then ran it at a previous hour, the old rows would still be there.
2016-12-20 12:03:23 -08:00
Rishi Gupta
e09aaf1020 analytics: Remove tests that will be subsumed by TestCountStats. 2016-12-20 12:03:23 -08:00
Rishi Gupta
6748b72ccc analytics: Remove tests now covered by test_active_users_by_is_bot. 2016-12-20 12:03:23 -08:00
Rishi Gupta
2211b8b102 analytics: Change count_message_by_stream to join on UserProfile.
It seems unlikely we will need count_message_by_stream without the
UserProfile table in the future, so write count_message_by_stream_and_is_bot
in the usual query form and replace count_message_by_stream with it.
This also has the benefit of shortening our list of "special case" queries
from two to one.

The pathways of the removed test will be covered more thoroughly in the new
TestCountStats tests.
2016-12-20 12:03:23 -08:00
Rishi Gupta
6992f9784c analytics: Update TestCountStat prototype. 2016-12-20 12:03:23 -08:00
Rishi Gupta
c6a6c871ee analytics: Change TIME_ZERO in tests to be in the past. 2016-12-20 12:03:23 -08:00