Using pg_ctlcluster leaves systemctl thinking the process aborted, and
not all environments (e.g., Docker) have systemctl.
(cherry picked from commit 7a8a8f5f23)
This code dates back to 57b52310639a; however, since
`postgresql-common` version 153 (February 2014), this has been handled
by `postgresql-common` itself, via a post-install trigger which calls
`pg_updatedicts` for each new PostgreSQL version.
(cherry picked from commit 9def655564)
This uses the same technique as in 840884ec89, to apply only select
parts of the Puppet configuration. This is more correct, and simpler,
than attempting to chop out some base Puppet roles and hack around
the `purge => true` supervisor.d configuration.
(cherry picked from commit e13f82f048)
This provides access logging metrics to Prometheus. For cardinality
reasons, we cannot (nor would we want to) put every request path into
its own label value -- but we do separate out the most-frequent access
paths (as well as some low-frequency but high-interest ones) into
their own label values.
In order to differentiate accesses to https://zulip.com/ from
https://example.zulipchat.com/ (both of which appear at path `/`), we
use a `grok_exporter.realm_names_regex` value in `zulip.conf`, which
is expected to be set to match the hostnames of all possible realms.
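For example, a deployment hosting the two hostnames above might set
something like the following (an illustrative value, assuming
zulip.conf's usual `[section] key` layout; the regex itself is not a
recommendation):

    [grok_exporter]
    realm_names_regex = (zulip\.com|[a-z0-9-]+\.zulipchat\.com)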
(cherry picked from commit 840fa74854)
Unilaterally adding the port can cause CSRF failures when the port is
a default port, and thus optional. Switch to providing the exact
`Host` header that the original request contained.
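In nginx terms, the change is along these lines (a sketch, not the
literal diff):

    # Before: unconditionally re-appends the port, even when it is
    # the default and thus absent from the client's own Host header:
    proxy_set_header Host $host:$server_port;
    # After: pass the client's Host header through unmodified:
    proxy_set_header Host $http_host;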
(cherry picked from commit 5f783ed5ad)
This prevents a thundering herd for videos -- if a very large video
is posted to a channel with many active clients, all of them
simultaneously request it, in order to render the in-feed preview
image. While these requests come with a `Range` header, which is
intended to limit the request to just the first couple of megabytes,
nginx ignores this header when making its own request to the upstream
-- so that it can obtain and cache the whole file locally. This
results in multiple competing requests for the whole content from S3,
all racing to store the content in the cache.
Use cache slicing to split the content cache into chunks of 5MB; the
cache is filled one slice at a time, as needed based on the byte
ranges that clients request. Clients making requests without a
`Range` header are provided with the content transparently stitched
together from the individual slices.
The slice size of 5MB was chosen to encompass more than 95% of file
uploads (saving an extra trip to the origin), while remaining large
enough to provide video thumbnails from a single slice, and small
enough to not take too much time to obtain from the upstream.
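A minimal sketch of the nginx `slice`-module configuration this
describes (the directives are nginx's; the cache validity value is
illustrative):

    slice 5m;
    # Each slice is requested from the upstream with its own Range
    # header, and cached under a key that includes that range:
    proxy_set_header Range $slice_range;
    proxy_cache_key $uri$is_args$args$slice_range;
    # S3 answers each slice request with 206 Partial Content, so 206
    # responses must be cacheable as well:
    proxy_cache_valid 200 206 1h;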
(cherry picked from commit 23e8eb5c7c)
Some deployments choose to wrap Zulip's nginx in an outer proxy -- for
example, to do custom TLS termination. In such deployments, the outer
proxy is routing to `127.0.0.1:80`; b4fb22ba1b breaks these
configurations, as it switches the `127.0.0.1:80` listener to only
serving `/api/internal/` paths.
Switch to serving the whole application over `127.0.0.1:80`.
(cherry picked from commit e2e0c72a80)
The `$host` nginx variable is _not_ the unadulterated `Host`
header (which would be `$http_host`) -- it is that header, *without
the port*, with a fallback to the `server_name` which processed the
request.
This means that backend services are not aware of the port that the
request came in on, unless they derive that from reading
`nginx_listen_port` in `/etc/zulip/zulip.conf`, or similar.
Specifically, this caused `tusd`, on deploys with a non-standard
`nginx_listen_port`, to generate a `Location` header which left off
the port; the subsequent request to retrieve metadata about the
just-uploaded file thus appeared cross-origin, and failed its CORS
check.
Add the port to the `Host` header we pass to `tusd` and other backend
services.
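Concretely, this is along the lines of (a sketch; the real
configuration derives the port from `nginx_listen_port`):

    proxy_set_header Host $host:$server_port;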
(cherry picked from commit 4e26705fbc)
This makes the `localhost.d` directory less of a lie, and decreases
the chances that local reconfigurations will break the 127.0.0.1:80
server which is used for IPC.
In cases where `nginx_http_only` is enabled, we respect
`nginx_listen_port`, so as to not attempt to bind to port 80 if the
administrator was explicitly trying to avoid that.
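For illustration (the `[application_server] http_only` spelling below
is an assumption about how `nginx_http_only` appears in zulip.conf,
and the port value is arbitrary):

    [application_server]
    http_only = true
    nginx_listen_port = 8080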
(cherry picked from commit b4fb22ba1b)
This generalizes from `thumbnail_workers` to include any other queue.
However, we choose to additionally document only
`email_senders_workers`, since other queues are not guaranteed to
work correctly with multiple consumers.
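For example (assuming, as with `thumbnail_workers`, that this lives
in zulip.conf's `[application_server]` section; the count is
illustrative):

    [application_server]
    email_senders_workers = 4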
This is already installed on a lot of systems, and is used indirectly
when upgrading Zulip from Git.
We previously removed this in
263212decf, I believe due to an incorrect understanding that only
`makemessages` needed it.
`deliver_scheduled_emails` tries to deliver the email synchronously,
and if it fails, it retries after 10 seconds. Since it does not track
retries, and always tries the earliest-scheduled-but-due message
first, the worker will not make forward progress if there is a
persistent failure with that message, and will retry indefinitely.
This can result in excessive network or email delivery charges from
the remote SMTP server.
Switch to delivering emails via a new queue worker. The
`deliver_scheduled_emails` job now serves only to pull deferred jobs
out of the table once they are due, insert them into RabbitMQ, and
then delete them. This limits the potential for head-of-queue
failures to failures inserting into RabbitMQ, which is more reasonable
than failures speaking to a complex external system we do not control.
Retries and any connections to the SMTP server are left to the
RabbitMQ consumer.
We build a new RabbitMQ queue, rather than use the existing
`email_senders` queue, because that queue is expected to be reasonably
low-latency, for things like missed-message notifications. The
`send_future_email` codepath which inserts into ScheduledEmails is
also (ab)used for digest emails, which are extremely bursty in their
frequency -- and a large burst could significantly delay emails behind
it in the queue.
The new queue is explicitly only for messages which were not
initiated by user actions (e.g., invitation reminders, digests, new
account follow-ups), and which are thus not latency-sensitive.
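Schematically, the new split looks like this (a Python sketch of the
design, not Zulip's implementation; the in-memory stand-ins for
ScheduledEmails and RabbitMQ, and `send_via_smtp`, are all
hypothetical):

    import time
    from queue import Queue

    scheduled: list[tuple[float, dict]] = []  # stands in for ScheduledEmails rows
    deferred_email_queue: Queue = Queue()     # stands in for the RabbitMQ queue

    def send_via_smtp(payload: dict) -> None:
        ...  # hypothetical: speak to the remote SMTP server

    def scan_and_enqueue() -> None:
        # The scheduler's only failure mode is now enqueueing; a bad
        # message cannot block the head of the table indefinitely.
        now = time.time()
        for row in [r for r in scheduled if r[0] <= now]:
            deferred_email_queue.put(row[1])  # insert into the queue...
            scheduled.remove(row)             # ...then delete the row

    def consume() -> None:
        # Delivery, SMTP connections, and any retry policy live here,
        # entirely on the consumer side.
        while True:
            send_via_smtp(deferred_email_queue.get())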
Fixes: #32463.
In some cases, it is not possible to configure the load-balancer to
add an `X-Forwarded-Proto` header. If Zulip is serving its traffic over
HTTP, it will rightly error out, since it cannot guarantee that its
response will be served over an encrypted connection.
Add a new `loadbalancer.rejects_http_requests` setting, which serves
as a way for the operator to swear that the load-balancer will *never*
serve responses from Zulip over an unencrypted connection. In most
cases, this is because the load-balancer is configured to have port 80
always serve an HTTP 301 redirect to the same URL over HTTPS.
Properly configuring the proxy to send `X-Forwarded-Proto` is always a
better solution than using this configuration parameter, so use of
this should be viewed as a last resort.
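Concretely, in zulip.conf (following the `section.key` spelling
above):

    [loadbalancer]
    rejects_http_requests = true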
Zulip instances without a database included, like the Docker image,
would fail to use TLS properly, since `TLS_REQCERT` was not set in
`/etc/ldap/ldap.conf`. While there are a few other ways we could fix
this, just installing libldap-common on app frontend instances seems
like a good solution: it has no impact on other Zulip systems, and it
was already being installed indirectly, via a "Recommends"-level apt
dependency of the PostgreSQL server package.
Fixes zulip/docker-zulip#454.
The nginx-to-uwsgi timeout defaults to 60s, which is exactly the same
as the current "harakiri" timeout configured in uwsgi (which limits
how long a request can run before the worker is terminated). This
causes a race: if nginx hits its 60s before uwsgi does, we return a
504; otherwise, we get a 502.
Make the nginx-to-uwsgi timeout explicit, and shorten the "harakiri"
timeout to be explicitly less than that. Document the 60s timeout,
which all outer reverse proxies must be set _longer than_ in order
to have proper "onion" timeouts.
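A sketch of the resulting nesting (the 60s nginx-side value is the
documented one; the harakiri value is illustrative, chosen only to be
strictly less):

    # nginx, in the uwsgi proxy configuration:
    uwsgi_read_timeout 60s;

    # uwsgi.ini:
    harakiri = 55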
Currently, it handles two hook types: 'pre-create' (which verifies
that the user is authenticated and that the file size is within the
limit) and 'pre-finish' (which creates an attachment row).
No secret is shared between Django and tusd for authentication of the
hooks endpoints, because none is necessary -- tusd forwards the
end-user's credentials, and the hook checks them like it would any
end-user request. An end-user gaining access to the endpoint would be
able to do no more harm than via tusd or the normal file upload API.
Regardless, the previous commit has restricted access to the endpoint
at the nginx layer.
Co-authored-by: Brijmohan Siyag <brijsiyag@gmail.com>