Files
zulip/puppet/zulip_ops/files/graphite/aggregation-rules.conf
Tim Abbott 36e336edc3 puppet: Rename zulip_internal to zulip_ops.
The old "zulip_internal" name was from back when Zulip, Inc. had two
distributions of Zulip, the enterprise distribution in puppet/zulip/
and the "internal" SAAS distribution in puppet/zulip_internal.  I
think the name is a bit confusing in the new fully open-source Zulip
work, so we're replacing it with "zulip_ops".  I don't think the new
name is perfect, but it's better.

In the following commits, we'll delete a bunch of pieces of Zulip,
Inc.'s infrastructure that don't exist anymore and thus are no longer
useful (e.g. the old Trac configuration), with the goal of cleaning
the repository of as much unnecessary content as possible.
2016-10-16 19:23:27 -07:00

61 lines
3.1 KiB
Plaintext

# The form of each line in this file should be as follows:
#
# output_template (frequency) = method input_pattern
#
# This will capture any received metrics that match 'input_pattern'
# for calculating an aggregate metric. The calculation will occur
# every 'frequency' seconds and the 'method' can specify 'sum' or
# 'avg'. The name of the aggregate metric will be derived from
# 'output_template' filling in any captured fields from 'input_pattern'.
#
# For example, if you're metric naming scheme is:
#
# <env>.applications.<app>.<server>.<metric>
#
# You could configure some aggregations like so:
#
# <env>.applications.<app>.all.requests (60) = sum <env>.applications.<app>.*.requests
# <env>.applications.<app>.all.latency (60) = avg <env>.applications.<app>.*.latency
#
# As an example, if the following metrics are received:
#
# prod.applications.apache.www01.requests
# prod.applications.apache.www01.requests
#
# They would all go into the same aggregation buffer and after 60 seconds the
# aggregate metric 'prod.applications.apache.all.requests' would be calculated
# by summing their values.
#
# Note that any time this file is modified, it will be re-read automatically.
# NOTE: If you use the `sum` aggregation method, make sure the aggregation period is
# 5 seconds unless you know what you are doing. statsd pushes to carbon
# every 5 seconds (see local.js), so aggregating over a longer period of time
# will inflate the output value
# Aggregate all per-bucket remote cache stats into a generic hit/miss stat
stats.<app>.cache.all.hit (5) = sum stats.<app>.cache.*.hit
stats.<app>.cache.all.miss (5) = sum stats.<app>.cache.*.miss
# Aggregate all per-bucket remote cache stats counts into a generic hit/miss stat
stats_counts.<app>.cache.all.hit (5) = sum stats_counts.<app>.cache.*.hit
stats_counts.<app>.cache.all.miss (5) = sum stats_counts.<app>.cache.*.miss
# Aggregate all per-domain active stats to overall active stats
stats.gauges.<app>.users.active.all.<bucket> (5) = sum stats.gauges.<app>.users.active.*.<bucket>
stats.gauges.<app>.users.reading.all.<bucket> (5) = sum stats.gauges.<app>.users.reading.*.<bucket>
# Aggregate all per-realm end-to-end send stats to overall
stats.timers.<app>.endtoend.send_time.all.<type> (5) = sum stats.timers.<app>.endtoend.send_time.*.<type>
stats.timers.<app>.endtoend.receive_time.all.<type> (5) = sum stats.timers.<app>.endtoend.receive_time.*.<type>
stats.timers.<app>.endtoend.displayed_time.all.<type> (5) = sum stats.timers.<app>.endtoend.displayed_time.*.<type>
# Aggregate all per-realm narrow timing stats
stats.timers.<app>.narrow.initial_core.all.<type> (5) = sum stats.timers.<app>.narrow.initial_core.*.<type>
stats.timers.<app>.narrow.initial_free.all.<type> (5) = sum stats.timers.<app>.narrow.initial_free.*.<type>
stats.timers.<app>.narrow.network.all.<type> (5) = sum stats.timers.<app>.narrow.network.*.<type>
# Do the same for unnarrow times
stats.timers.<app>.unnarrow.initial_core.all.<type> (5) = sum stats.timers.<app>.unnarrow.initial_core.*.<type>
stats.timers.<app>.unnarrow.initial_free.all.<type> (5) = sum stats.timers.<app>.unnarrow.initial_free.*.<type>