Initial commit

This commit is contained in:
JamesEisele
2024-09-19 09:59:14 -06:00
parent 653047eb0b
commit 6cc435c5a1
10 changed files with 659 additions and 2 deletions

.gitignore vendored Normal file

@@ -0,0 +1,166 @@
.DS_Store
grafana/
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class
# C extensions
*.so
# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST
# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec
# Installer logs
pip-log.txt
pip-delete-this-directory.txt
# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/
# Translations
*.mo
*.pot
# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal
# Flask stuff:
instance/
.webassets-cache
# Scrapy stuff:
.scrapy
# Sphinx documentation
docs/_build/
# PyBuilder
.pybuilder/
target/
# Jupyter Notebook
.ipynb_checkpoints
# IPython
profile_default/
ipython_config.py
# pyenv
# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
# .python-version
# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock
# poetry
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
#poetry.lock
# pdm
# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
# in version control.
# https://pdm.fming.dev/latest/usage/project/#working-with-version-control
.pdm.toml
.pdm-python
.pdm-build/
# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/
# Celery stuff
celerybeat-schedule
celerybeat.pid
# SageMath parsed files
*.sage.py
# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/
# Spyder project settings
.spyderproject
.spyproject
# Rope project settings
.ropeproject
# mkdocs documentation
/site
# mypy
.mypy_cache/
.dmypy.json
dmypy.json
# Pyre type checker
.pyre/
# pytype static type analyzer
.pytype/
# Cython debug symbols
cython_debug/
# PyCharm
# JetBrains specific template is maintained in a separate JetBrains.gitignore that can
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/

README.md

@@ -1,2 +1,140 @@
# Grafana-Loki-batteries-included-Docker-setup
"Batteries included" Docker setup for Grafana Loki with syslog support.
# Overview
With this deployment you'll have:
- A full logging stack that ingests syslog via syslog-ng, relays syslog messages to Alloy, sends them to Loki for storage, and displays them within Grafana.
- The necessary prerequisites to use the [Explore Logs plugin](https://grafana.com/blog/2024/04/09/find-your-logs-data-with-explore-logs-no-logql-required/) within Grafana for easier log review.
## TL;DR
If you want to skip the background and just get Loki up and running, you can either go to the [Configuration section](#configuration) below or reference the relevant `*-docker-compose.yml` file alongside the individual service config files.
# Why
The purpose of this repo is to establish a simple stack with straightforward service configs to ingest syslog data into Loki, while also giving greater context to the configuration process. It took me longer than I wanted to get an MVP for Loki up and running based on available guides, including Grafana's own documentation and tutorials. More often than not, these sources were outdated in sneaky ways, glossed over critical config items, or even misrepresented service capabilities.
Grafana recommends deploying Loki to production with k8s, and the lack of consistent Docker documentation backs that up. However, that level of complexity isn't something I need for my own environments, and it isn't something I'd want to spin up just to test a log platform.
With so many moving parts that must be in a base functional state before you can troubleshoot your own stack, getting started with Loki can be overly tedious.
# Considerations
## Docker Compose vs. Swarm deployment
Generally speaking, there's little difference between deploying a regular `docker-compose.yml` file via Docker Compose and deploying it to a Docker Swarm environment with `docker stack deploy`. The reason I'm calling out the differences in configuration approaches between the two is mostly around remote storage considerations.
Outside of the service configuration files that get read into syslog-ng and Alloy*, you'll have to store your log files somewhere outside your Docker host if you want to utilize Swarm's service provisioning across multiple nodes.
*It might make sense to rely on Docker Secrets and/or Docker Configs for these services' config files. I've avoided exploring that here, as I've found their implementations too restrictive considering there are better tools to handle this if you want to take on extra complexity with something like HashiCorp Vault.
## Grafana Alloy vs. Grafana Agent vs. Promtail
Grafana offers three ways to ingest logs:
1. **Grafana Alloy**: collects logs plus everything else.
2. **Grafana Agent**: collects logs plus everything else. Deprecated and replaced by Alloy.
3. **Grafana Promtail**: a feature-complete log collector.
For this stack, we're going with Alloy, as it's going to be better supported going forward, even if we don't need the whole kitchen sink it provides.
With that said, it was initially a lot easier to wrap my head around deploying syslog for Loki using Promtail, so I've included the relevant config files and Docker service info in this repo for reference.
## Alloy syslog
- Alloy only plays nice with RFC 5424 syslog messages, which can cause headaches. When you encounter these cases, you'll need to relay syslog messages through a service like syslog-ng or rsyslog in front of Alloy that ingests the problematic logs, re-formats them to RFC 5424, and sends them over TCP.
- Regarding syslog handling, more background can be found in the [Promtail docs about the Syslog Receiver](https://grafana.com/docs/loki/latest/send-data/promtail/scraping/#syslog-receiver). The same considerations from Promtail apply to Alloy. It's *highly* recommended you read this to understand certain limitations around how Loki has to handle incoming syslog data.
- To overcome these limitations, we'll use a syslog-ng service to forward re-formatted syslog data to Alloy.
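To make the RFC 5424 requirement above concrete, here's a minimal Python sketch of what a relayed message looks like on the wire. It builds an RFC 5424 message body and wraps it in octet-counting TCP framing (RFC 6587), roughly what syslog-ng emits with `flags(syslog-protocol)`. The hostname, app name, and message are made-up illustration values:

```python
from datetime import datetime, timezone

def build_rfc5424_frame(hostname, app, msg, facility=1, severity=6):
    """Build a minimal RFC 5424 syslog message with octet-counted TCP framing."""
    pri = facility * 8 + severity  # PRI = facility * 8 + severity
    ts = datetime.now(timezone.utc).strftime('%Y-%m-%dT%H:%M:%S.%f+00:00')
    # <PRI>VERSION TIMESTAMP HOSTNAME APP-NAME PROCID MSGID STRUCTURED-DATA MSG
    body = f'<{pri}>1 {ts} {hostname} {app} - - - {msg}'
    # Octet-counting framing: "<byte-count> <message>"
    return f'{len(body.encode())} {body}'

frame = build_rfc5424_frame('host-1', 'test-app', 'hello loki')
print(frame)
```

The leading byte count is what lets the receiver split messages on a TCP stream without guessing at newlines, which is where non-5424 senders tend to break.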
# Prerequisites
## Docker compose
If you're going to deploy the stack outside of a Docker swarm instance, you'll need Docker Compose set up. It should come included if you've installed Docker based on Docker's own [install guide](https://docs.docker.com/engine/install/ubuntu/) for your relevant OS.
```shell
$ docker compose version
Docker Compose version v2.29.2
```
## Docker swarm
If you want to quickly test this stack in Swarm but don't have a swarm already deployed, you can run it on a single-node swarm:
```shell
$ docker swarm init
Swarm initialized: current node (qqicqjamshajxxut69baiqq93) is now a manager.
...
$ docker node ls
ID                            HOSTNAME   STATUS   AVAILABILITY   MANAGER STATUS   ENGINE VERSION
qqicqjamshajxxut69baiqq93 *   host-1     Ready    Active         Leader           27.1.2
```
## Config files
Loki, Alloy, and syslog-ng each rely on their own configuration files, which specify things like log schema, endpoint ports and addresses, and log labels. To get started, you'll only need to change the endpoints that data gets sent to, depending on whether you're deploying this via regular Docker Compose or in a swarm.
- Loki: `loki-config.yml`
- Alloy: `config.alloy`
- syslog-ng: `syslog-ng.conf`
# Configuration
1. Clone this repo wherever you plan to run it and `cd` into the project directory:
```bash
$ git clone https://github.com/JamesEisele/grafana-loki-log-stack.git
$ cd grafana-loki-log-stack
grafana-loki-log-stack $
```
2. Review your Loki config.
- You shouldn't need to update any of the provided defaults, as Loki will automatically ingest logs sent its way.
- More info can be found on the [official docs page for it](https://grafana.com/docs/loki/latest/configure/). For our assumed use case, we've loosely followed their "local configuration example".
3. Review your Alloy config.
- In the `loki.write "default"` section, you'll need to update the endpoint `url` if your Loki container won't be running on the same host as Alloy. In a swarm setup, you could get by with a swarm manager's IP, but consider a keepalived virtual IP shared across your swarm managers if you have more than one manager.
4. Review your syslog-ng config file `syslog-ng.conf`.
- Similar to Step 3, you'll need to update the `destination d_alloy` address so that it's pointing to the same host IP or address that's running your Alloy container.
5. Review your `*-docker-compose.yml` file which will define how Docker will deploy your services.
- Included in the repo are two separate examples:
1. `local-docker-compose.yml`: a compose file meant to be used with the `docker compose up` command
2. `swarm-docker-compose.yml`: a compose file meant to be used in a Docker swarm environment (`docker stack deploy`).
The primary difference between these two files is that the swarm compose file defines volumes that are located on a remote NFS share.
6. Run the compose file using the `-f` flag to specify where your compose file is located:
- Vanilla docker compose: `docker compose -f local-docker-compose.yml up -d`
- Swarm: `docker stack deploy -c /mnt/docker-swarm/stacks/swarm-docker-compose.yml loki-logs`. In this example, we've saved the relevant compose and config files on a remote NFS share that's mounted on the swarm manager we're running the command from. This gives flexibility if you have multiple managers you want to share the compose file with at once.
7. Check service status:
- Vanilla compose: `docker ps`
- Swarm: `docker service ls`
If you need to troubleshoot a specific service, you can either show its logs with `docker logs <container name>` or exec into the container to inspect things further.
8. Log in to Grafana with the default credentials `admin`/`admin` to configure Loki as a datasource. Since we're not using Docker's built-in network, you'll need to specify your actual host's IP address (e.g., `10.72.5.29`). Verify that when you save the datasource, the integrated test in Grafana shows that it is reachable.
9. (Optional) Test that syslog-ng is properly relaying TCP and UDP syslog messages to Alloy. You'll need Python 3.7 or higher installed (check with `python3 --version`):
```shell
# Set up a virtual environment (Linux variant):
$ python3 -m venv venv
$ source venv/bin/activate
(venv) $ pip install -r requirements.txt
# Run the script to send the syslog messages, passing your syslog-ng host with the `-l` flag:
(venv) $ python3 syslog-test.py -l syslog-ng.example.com
> Sent syslog message via TCP to syslog-ng.example.com:514
> Sent syslog message via UDP to syslog-ng.example.com:514
```
You can now log in to your Grafana instance and, under the "Explore" > "Logs" section in the sidebar, you should see your two test log messages show up under your Loki datasource:
![Screenshot of the Grafana web interface showing successful test syslog messages sent from the syslog-test.py script.](/media/syslog-py_test.png?raw=true)
10. Shut down the stack:
- Vanilla docker compose: `docker compose -f local-docker-compose.yml down`
- Swarm: `docker stack rm loki-logs`
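As an alternative to the syslog test above, you can also sanity-check Loki directly, bypassing syslog-ng and Alloy, by POSTing a line to Loki's push API (`/loki/api/v1/push`). A minimal sketch; the `localhost:3100` URL and the label names are assumptions to adjust for your deployment:

```python
import json
import time
import urllib.request

def build_push_payload(labels, line):
    # Loki push payloads: one or more streams, each with a label set and
    # a list of [timestamp_ns, line] pairs (timestamps as strings).
    return {'streams': [{'stream': labels,
                         'values': [[str(time.time_ns()), line]]}]}

def push_to_loki(base_url, payload):
    req = urllib.request.Request(
        f'{base_url}/loki/api/v1/push',
        data=json.dumps(payload).encode(),
        headers={'Content-Type': 'application/json'})
    with urllib.request.urlopen(req) as resp:
        return resp.status  # Loki answers 204 No Content on success.

payload = build_push_payload({'job': 'manual-test', 'host': 'host-1'}, 'hello loki')
# push_to_loki('http://localhost:3100', payload)  # Uncomment with Loki running.
```

If the pushed line shows up in Grafana but your syslog messages don't, the problem is in the syslog-ng/Alloy relay path rather than Loki itself.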
# Swarm specific configuration items
If you're deploying this stack via Swarm, the easiest way to get up and running is to store your named volumes and config files on a remote NFS share that's accessible from all nodes. You might also put your `swarm-docker-compose.yml` file on an NFS share accessible to all managers.
```shell
sudo mkdir /mnt/swarm/volumes/grafana-data
sudo mkdir /mnt/swarm/volumes/loki-data
sudo mkdir /mnt/swarm/volumes/loki-config
sudo nano /mnt/swarm/volumes/loki-config/loki-config.yml
sudo mkdir /mnt/swarm/volumes/alloy-config/
sudo nano /mnt/swarm/volumes/alloy-config/config.alloy
sudo mkdir /mnt/swarm/volumes/syslog-ng-config/
sudo nano /mnt/swarm/volumes/syslog-ng-config/syslog-ng.conf
```
As mentioned above, it might make sense for you to deploy this stack with Docker [Secrets](https://docs.docker.com/engine/swarm/secrets/) and/or Docker [Configs](https://docs.docker.com/engine/swarm/configs/) to streamline deployments.
# Sources
- [xtavras's GitHub gists](https://gist.github.com/xtavras) for the Python syslog testing used here to verify syslog-ng relay functionality with Promtail.
- [Convert a Promtail config to an Alloy config](https://grafana.com/docs/alloy/latest/set-up/migrate/from-promtail/).

config.alloy Normal file

@@ -0,0 +1,53 @@
loki.source.syslog "syslog" {
  listener {
    address               = "0.0.0.0:1514"
    idle_timeout          = "1m0s"
    label_structured_data = true
    labels                = {
      job = "syslog",
    }
    max_message_length    = 0
  }
  forward_to    = [loki.write.default.receiver]
  relabel_rules = discovery.relabel.syslog.rules
}

discovery.relabel "syslog" {
  targets = []

  rule {
    source_labels = ["__syslog_message_hostname"]
    target_label  = "host"
  }
  rule {
    source_labels = ["__syslog_message_severity"]
    target_label  = "severity"
  }
  rule {
    source_labels = ["__syslog_message_app_name"]
    target_label  = "app"
  }
}

local.file_match "alloy_syslog_relay_varlogs" {
  path_targets = [{
    __address__ = "localhost",
    __path__    = "/var/log/*log",
    job         = "alloy-varlogs",
  }]
}

loki.source.file "alloy_syslog_relay_varlogs" {
  targets               = local.file_match.alloy_syslog_relay_varlogs.targets
  forward_to            = [loki.write.default.receiver]
  legacy_positions_file = "/tmp/positions.yaml"
}

loki.write "default" {
  endpoint {
    url = "http://localhost:3100/loki/api/v1/push"
  }
  external_labels = {}
}

local-docker-compose.yml Normal file

@@ -0,0 +1,51 @@
services:
  grafana:
    image: grafana/grafana:11.2.0
    user: '1000'
    ports:
      - 3000:3000/tcp
    volumes:
      - grafana-data:/var/lib/grafana
    environment:
      - GF_INSTALL_PLUGINS=https://storage.googleapis.com/integration-artifacts/grafana-lokiexplore-app/grafana-lokiexplore-app-latest.zip;grafana-lokiexplore-app
    restart: unless-stopped

  loki:
    image: grafana/loki:3.1.1
    ports:
      - 3100:3100
    volumes:
      - loki-data:/loki
      - ./loki-config.yml:/etc/loki/local-config.yml:ro
    command: -config.file=/etc/loki/local-config.yml
    restart: unless-stopped

  alloy:
    image: grafana/alloy:v1.3.1
    ports:
      - 1514:1514/tcp # syslog ingestion
      - 1514:1514/udp # syslog ingestion
      - 12345:12345   # management UI
    volumes:
      - ./config.alloy:/etc/alloy/config.alloy
    command: >
      run --disable-reporting --server.http.listen-addr=0.0.0.0:12345 --storage.path=/var/lib/alloy/data /etc/alloy/config.alloy
    depends_on:
      - loki
    restart: unless-stopped

  syslog-ng:
    image: balabit/syslog-ng:4.8.0
    volumes:
      - ./syslog-ng.conf:/etc/syslog-ng/syslog-ng.conf
    ports:
      - 514:514/udp # Syslog UDP
      - 514:601/tcp # Syslog TCP
      # - 6514:6514/tcp # Syslog TLS # Out of scope for this project.
    depends_on:
      - alloy
    restart: unless-stopped

volumes:
  grafana-data:
  loki-data:

loki-config.yml Normal file

@@ -0,0 +1,61 @@
# Sourced primarily from:
# https://raw.githubusercontent.com/grafana/loki/v3.1.1/cmd/loki/loki-local-config.yaml
auth_enabled: false

limits_config:
  allow_structured_metadata: true # Set for Explore Logs plugin
  volume_enabled: true            # Set for Explore Logs plugin

server:
  http_listen_port: 3100
  grpc_listen_port: 9096

common:
  ring:
    instance_addr: 127.0.0.1
    kvstore:
      store: inmemory
  path_prefix: /tmp/loki
  replication_factor: 1

query_range:
  results_cache:
    cache:
      embedded_cache:
        enabled: true
        max_size_mb: 100

schema_config:
  configs:
    - from: 2020-10-24
      store: tsdb
      object_store: filesystem
      schema: v13
      index:
        prefix: index_
        period: 24h

storage_config:
  tsdb_shipper:
    active_index_directory: /tmp/loki/index
    cache_location: /tmp/loki/index_cache
  filesystem:
    directory: /tmp/loki/chunks

# Set Loki alerting evaluation criteria and config items.
ruler:
  alertmanager_url: http://localhost:9093

# protobuf > JSON for performance.
frontend:
  encoding: protobuf
  address: 0.0.0.0

pattern_ingester:
  enabled: true # Set for Explore Logs plugin

# By default, Loki will send anonymous, but uniquely-identifiable usage and configuration
# analytics to Grafana Labs.
analytics:
  reporting_enabled: false

media/syslog-py_test.png Normal file
Binary file not shown.

requirements.txt Normal file

@@ -0,0 +1,3 @@
pytz==2024.2
rfc5424-logging-handler==1.4.3
tzlocal==5.2

swarm-docker-compose.yml Normal file

@@ -0,0 +1,93 @@
# sudo mkdir /mnt/swarm/volumes/grafana-data
# sudo mkdir /mnt/swarm/volumes/loki-data
# sudo mkdir /mnt/swarm/volumes/loki-config
# sudo nano /mnt/swarm/volumes/loki-config/loki-config.yml
# sudo mkdir /mnt/swarm/volumes/alloy-config/
# sudo nano /mnt/swarm/volumes/alloy-config/config.alloy
# sudo mkdir /mnt/swarm/volumes/syslog-ng-config/
# sudo nano /mnt/swarm/volumes/syslog-ng-config/syslog-ng.conf
services:
  grafana:
    image: grafana/grafana:11.2.0
    user: '1000'
    ports:
      - 3000:3000/tcp
    volumes:
      - grafana-data:/var/lib/grafana
    environment:
      - GF_INSTALL_PLUGINS=https://storage.googleapis.com/integration-artifacts/grafana-lokiexplore-app/grafana-lokiexplore-app-latest.zip;grafana-lokiexplore-app

  loki:
    image: grafana/loki:3.1.1
    ports:
      - 3100:3100
    volumes:
      - loki-config:/docker-config:ro
      - loki-data:/loki
    command: -config.file=/docker-config/loki-config.yml

  alloy:
    image: grafana/alloy:v1.3.1
    ports:
      - 1514:1514/tcp # syslog ingestion
      - 1514:1514/udp # syslog ingestion
      - 12345:12345   # management UI
    volumes:
      - alloy-config:/docker-config:ro
    command: >
      run --disable-reporting --server.http.listen-addr=0.0.0.0:12345 --storage.path=/var/lib/alloy/data /docker-config/config.alloy
    depends_on:
      - loki

  syslog-ng:
    image: balabit/syslog-ng:4.8.0
    volumes:
      - syslog-ng-config:/docker-config:ro
    ports:
      - 514:514/udp # Syslog UDP
      - 514:601/tcp # Syslog TCP
      # - 6514:6514/tcp # Syslog TLS # Out of scope for this project.
    command: >
      -f /docker-config/syslog-ng.conf
    depends_on:
      - alloy

volumes:
  grafana-data:
    driver: local
    driver_opts:
      type: "nfs"
      o: "addr=10.25.100.220,rw"
      device: ":/mnt/swarm/volumes/grafana-data"
  loki-config:
    driver: local
    driver_opts:
      type: "nfs"
      o: "addr=10.25.100.220,rw"
      device: ":/mnt/swarm/volumes/loki-config"
  loki-data:
    driver: local
    driver_opts:
      type: "nfs"
      o: "addr=10.25.100.220,rw"
      device: ":/mnt/swarm/volumes/loki-data"
  alloy-config:
    driver: local
    driver_opts:
      type: "nfs"
      o: "addr=10.25.100.220,rw"
      device: ":/mnt/swarm/volumes/alloy-config/"
  syslog-ng-config:
    driver: local
    driver_opts:
      type: "nfs"
      o: "addr=10.25.100.220,rw"
      device: ":/mnt/swarm/volumes/syslog-ng-config/"

syslog-ng.conf Normal file

@@ -0,0 +1,30 @@
@version: 4.2

# Source to receive logs from remote syslog clients (UDP & TCP).
# We use `network()` over `syslog()` here as `syslog()` alone handles
# framing differently, which will result in errors and missing data.
# network(transport(tcp) port(514) flags(syslog-protocol));
source s_sys {
    syslog(transport(udp) port(514));
    network(transport(tcp) port(601) flags(syslog-protocol));
};

# Source to read local logs.
source s_local {
    internal();
};

# Log destination to forward to Alloy.
# Change "alloy.example.com" to the host IP running Alloy and
# adjust the syslog port as needed based on your Alloy config.
destination d_alloy {
    syslog("alloy.example.com" transport("tcp") port(1514));
};

# Ship local log files & received syslog messages to Alloy.
log {
    source(s_sys);
    source(s_local);
    destination(d_alloy);
};

syslog-test.py Normal file

@@ -0,0 +1,62 @@
'''
This script sends test syslog messages over TCP and UDP to syslog servers
passed on run. Run `python3 syslog-test.py --help` for more info.

Based on GitHub user xtavras's scripts:
- https://gist.github.com/xtavras/be13760713e2a9ee1a8bdae2ed6d2565
- https://gist.github.com/xtavras/4a01f7d1f94237a4abcdfb02074453c1
'''
import argparse
import logging
import socket

from rfc5424logging import Rfc5424SysLogHandler


def main():
    args = parse_arguments()
    for host in args.list:
        send_syslog_msg(host=host, port=514, protocol='tcp')
        send_syslog_msg(host=host, port=514, protocol='udp')


def parse_arguments():
    parser = argparse.ArgumentParser(
        prog='syslog-test.py',
        usage='%(prog)s [options]\nexample: python3 %(prog)s --list 127.0.0.1',
        description='Send test TCP and UDP syslog messages to remote syslog servers.')
    parser.add_argument('-l', '--list', nargs='+', default=[],  # One or more values.
                        help='List of IPs or hostnames of syslog servers to send messages to.')
    args = parser.parse_args()
    if not args.list:
        # parser.error() would exit the script, so warn and fall back instead.
        print('No syslog host specified. Run with the -h flag for more info. Defaulting to "127.0.0.1"...')
        args.list = ['127.0.0.1']
    return args


def send_syslog_msg(host, port, protocol):
    if protocol == 'tcp':
        logger = logging.getLogger('syslog-tcp-test-script')
        handler = Rfc5424SysLogHandler(address=(host, port), socktype=socket.SOCK_STREAM)
    elif protocol == 'udp':
        logger = logging.getLogger('syslog-udp-test-script')
        handler = Rfc5424SysLogHandler(address=(host, port), socktype=socket.SOCK_DGRAM)
    else:
        print(f' > Unrecognized syslog protocol {protocol}')
        return
    handler.setLevel(logging.DEBUG)
    logger.addHandler(handler)
    logger.warning(f'this is a {protocol.upper()} test', extra={'msgid': 1})
    print(f' > Sent syslog message via {protocol.upper()} to {host}:{port}')


if __name__ == '__main__':
    main()