2017-07-29

Email notifications with Prometheus Alertmanager (and Postfix)

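Last time I installed Prometheus server and node_exporter on the same node and confirmed that graphs were being collected, so this time I want to try email notifications. Sending mail requires installing Alertmanager, a separate component dedicated to sending out alerts, and connecting it to Prometheus server. So the setup looks like this:

Monitoring server (hostname: promhost)
- Prometheus server, Alertmanager, node_exporter (agent)
Monitored server (hostname: targethost)
- node_exporter (agent)

I turned the node from last time into the monitoring server and installed the agent on a new, separate node. Installing node_exporter was covered last time, so here I'll note how to install Alertmanager.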

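Installing Alertmanager

Like Prometheus server and node_exporter, it's written in Go, so installation is easy. Following last time, I install it to /opt/alertmanager/.

The latest version at the time of writing is v0.8.0 (other versions are on the releases page).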

$ curl -LO https://github.com/prometheus/alertmanager/releases/download/v0.8.0/alertmanager-0.8.0.linux-amd64.tar.gz

$ tar xzf alertmanager-0.8.0.linux-amd64.tar.gz
$ sudo mv alertmanager-0.8.0.linux-amd64 /opt/alertmanager
$ sudo chmod 755 /opt/alertmanager
$ sudo chown -R root:root /opt/alertmanager

Done.
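
If you want a different release, the download URL can be built from the version number. This is only a minimal sketch: it assumes other releases follow the same URL layout as the v0.8.0 tarball above, and alertmanager_url is a made-up helper name, not part of any tool.

```shell
# Build the release tarball URL for a given Alertmanager version.
# The layout is copied from the v0.8.0 download above; other versions
# are assumed (not verified) to follow the same pattern.
alertmanager_url() {
    version="$1"   # e.g. 0.8.0, without the leading "v"
    printf 'https://github.com/prometheus/alertmanager/releases/download/v%s/alertmanager-%s.linux-amd64.tar.gz\n' \
        "$version" "$version"
}
```

It could then be used as curl -LO "$(alertmanager_url 0.8.0)".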

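Configuring Alertmanager

Create /etc/alertmanager/alertmanager.yml (the Alertmanager config file). Note that /global/smtp_require_tls is set to false. When delivering through something like a local Postfix that doesn't have TLS support enabled, mail could not be sent unless this value was specified.

References
- Google Groups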
- Consider a global config option for require_tls · Issue #433 · prometheus/alertmanager · GitHub

global:
  # The smarthost and SMTP sender used for mail notifications.
  smtp_smarthost: 'localhost:25'
  smtp_require_tls: false
  smtp_from: 'Alertmanager <alertmanager@localhost.localdomain>'

# The root route on which each incoming alert enters.
route:
  # When a new group of alerts is created by an incoming alert, wait at
  # least 'group_wait' to send the initial notification.
  # This way ensures that you get multiple alerts for the same group that start
  # firing shortly after another are batched together on the first
  # notification.
  group_wait: 30s

  # When the first notification was sent, wait 'group_interval' to send a batch
  # of new alerts that started firing for that group.
  group_interval: 5m

  # If an alert has successfully been sent, wait 'repeat_interval' to
  # resend them.
  repeat_interval: 3h

  # A default receiver
  receiver: default

receivers:
- name: 'default'
  email_configs:
  - to: 'alertmanager@localhost.localdomain'

Create /etc/systemd/system/alertmanager.service (the systemd unit file for Alertmanager).

[Unit]
Description=Alertmanager for Prometheus
After=network.target

[Service]
Type=simple
EnvironmentFile=-/etc/default/alertmanager
ExecStart=/opt/alertmanager/alertmanager $OPTIONS
PrivateTmp=true
WorkingDirectory=/opt/alertmanager

[Install]
WantedBy=multi-user.target

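Create /etc/default/alertmanager (the environment file referenced by systemd). If -storage.path isn't specified here, Alertmanager tries to create its data directory as data under the current directory. Under systemd, a service runs with / as its working directory unless WorkingDirectory is set, so a /data directory would get created. The unit file above sets WorkingDirectory just in case, because I was worried that something besides the -storage.path data might also get written to the current directory.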

OPTIONS="-config.file /etc/alertmanager/alertmanager.yml -storage.path /var/lib/alertmanager"

Defining alerting rules in Prometheus

That's still not enough to get alerts mailed out. So where do the alert conditions go? They are loaded through the Prometheus server config file (/etc/prometheus/prometheus.yml if you followed last time).

rule_files:
  - /etc/prometheus/alert.rules

Add lines like these. Also add the nodes to monitor.

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets:
        - 'localhost:9090'
        - 'promhost:9100'     # monitoring server (localhost:9100 would also work)
        - 'targethost:9100'   # monitored server

The diff from last time is as follows.

--- /etc/prometheus/prometheus.yml.old  2017-07-29 22:53:32.319944850 +0900
+++ /etc/prometheus/prometheus.yml      2017-07-29 22:44:24.542404336 +0900
@@ -11,8 +11,7 @@

 # Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
 rule_files:
-  # - "first.rules"
-  # - "second.rules"
+  - /etc/prometheus/alert.rules

 # A scrape configuration containing exactly one endpoint to scrape:
 # Here it's Prometheus itself.
@@ -24,4 +23,7 @@
     # scheme defaults to 'http'.

     static_configs:
-      - targets: ['localhost:9090']
+      - targets:
+        - 'localhost:9090'
+        - 'promhost:9100'
+        - 'targethost:9100'

And the whole file, just in case.

# my global config
global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

  # Attach these labels to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
      monitor: 'codelab-monitor'

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  - /etc/prometheus/alert.rules

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets:
        - 'localhost:9090'
        - 'promhost:9100'
        - 'targethost:9100'

After updating prometheus.yml, create /etc/prometheus/alert.rules.

# Alert for any instance that is unreachable for >5 minutes.
ALERT InstanceDown
  IF up == 0
  FOR 5m
  LABELS { severity = "page" }
  ANNOTATIONS {
    summary = "Instance {{ $labels.instance }} down",
    description = "{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 5 minutes.",
  }

# Alert for any instance that have a median request latency >1s.
ALERT APIHighRequestLatency
  IF api_http_request_latencies_second{quantile="0.5"} > 1
  FOR 1m
  ANNOTATIONS {
    summary = "High request latency on {{ $labels.instance }}",
    description = "{{ $labels.instance }} has a median request latency above 1s (current value: {{ $value }}s)",
  }

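The above is straight from the official documentation. With these rules, mail is sent when node_exporter on a monitored node dies or the node itself goes down (but one piece is still missing).

Connecting Prometheus server to Alertmanager

To make Prometheus server aware of Alertmanager, add -alertmanager.url to its command-line arguments. I created /etc/default/prometheus last time, so it goes there.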

OPTIONS="-config.file=/etc/prometheus/prometheus.yml -storage.local.path=/var/lib/prometheus -web.console.libraries=/etc/prometheus/console_libraries -web.console.templates=/etc/prometheus/consoles -alertmanager.url=http://localhost:9093"

Append it at the end as shown above and restart.

$ sudo systemctl restart prometheus

Mail should now be going out.

Dealing with error messages

I ran into an error message along the way, so here is how I dealt with it.

Error on notify: Cancelling notify retry due to unrecoverable error: parsing from addresses: mail: missing phrase

This error message showed up in systemctl status -l alertmanager.

It appears when a mail address given in /global/smtp_from, /receivers/email_configs/to, and so on is invalid. You need to write it like smtp_from: 'Alertmanager <alertmanager@localhost.localdomain>' or like smtp_from: alertmanager@localhost.localdomain (even for localhost, don't forget the part after the @).

References
- mail: missing phrase · Issue #624 · prometheus/alertmanager · GitHub
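
To catch this kind of mistake before restarting Alertmanager, a rough pre-check along the following lines might help. It is only a sketch: looks_like_mail_address is a hypothetical helper, and it is much looser than the address parser Alertmanager actually uses, so passing this check does not guarantee the address will be accepted.

```shell
# Rough sanity check for values used in smtp_from / to.
# Accepts 'user@domain' and 'Name <user@domain>'.
# NOTE: this is NOT the parser Alertmanager uses, just an approximation.
looks_like_mail_address() {
    addr="$1"
    # For the 'Name <addr>' form, keep only the part between < and >.
    case "$addr" in
        *"<"*">") addr="${addr##*"<"}"; addr="${addr%">"}" ;;
    esac
    # Require exactly one '@' with non-empty text on both sides.
    case "$addr" in
        *@*@*) return 1 ;;
        ?*@?*) return 0 ;;
        *)     return 1 ;;
    esac
}
```

For example, looks_like_mail_address 'alertmanager' returns non-zero because the @ part is missing, while looks_like_mail_address 'alertmanager@localhost.localdomain' succeeds.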

2017-07-27

Installing Prometheus on CentOS 7

Prometheus

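Creating the VM

Note: if you're here for Prometheus, feel free to skip this part. It's more of a record for myself.

First I create a KVM VM for testing. I initially tried launching virt-manager via RLogin + Xming, but it wouldn't connect to Xming and never opened. I was stuck on this for a while, until I realized that cloning an existing VM means I don't have to create one from scratch. Come to think of it, that's how I always create VMs. So, apparently the following command (virt-clone) does the clone.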

$ sudo virt-clone --original <source VM name> --name <new VM name> --file /var/lib/libvirt/images/<new image file name>

Reference article: KVMのクローンを利用して新規仮想マシンを作成する : アジャイル株式会社

The clone succeeded. Before booting it, I register the IP address to be assigned via DHCP on the DNS server (that is, the RTX810).

[rtx810]# ip host <hostname> xxx.xxx.xxx.xxx
[rtx810]# dhcp scope bind xx xxx.xxx.xxx.xxx 52:54:00:xx:xx:xx
[rtx810]# save

The MAC address can be looked up with sudo virsh domiflist <VM name> (reference article: qemu - libvirt: fetch ipv4 address from guest - Stack Overflow).
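
If you only want the MAC address out of that output, something like the following could work. It's a sketch under the assumption that the MAC appears as a whitespace-separated field of the form xx:xx:xx:xx:xx:xx; extract_macs is a name I made up for illustration.

```shell
# Print MAC addresses found in `virsh domiflist` output read from stdin.
# Assumes each MAC is its own whitespace-separated field made of six
# colon-separated hex pairs (17 characters in total).
extract_macs() {
    awk '{
        for (i = 1; i <= NF; i++)
            if (length($i) == 17 && $i ~ /^[0-9A-Fa-f][0-9A-Fa-f](:[0-9A-Fa-f][0-9A-Fa-f])+$/)
                print $i
    }'
}
```

Usage would be along the lines of sudo virsh domiflist <VM name> | extract_macs.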

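Once the RTX810 is configured, run sudo virsh start <VM name>. I kept ping running in another terminal, confirmed that the IP address was assigned, and logged in.

Installing Prometheus

Now for the Prometheus install itself. Frankly, from here on I just followed PrometheusをCentOS7.3＆Docker上にインストールしてみた - Qiita from its section 「３．Prometheusインストール」 onward (that article is a procedure for Docker, whereas I'm on KVM).

First, download the latest release from the GitHub releases page (v1.7.1 at the time of writing). It's distributed as a binary, so all you do is unpack it. Almost too easy...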

$ curl -LO https://github.com/prometheus/prometheus/releases/download/v1.7.1/prometheus-1.7.1.linux-amd64.tar.gz
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   624    0   624    0     0    765      0 --:--:-- --:--:-- --:--:--   764
100 16.3M  100 16.3M    0     0  2508k      0  0:00:06  0:00:06 --:--:-- 3385k
$ tar xzf prometheus-1.7.1.linux-amd64.tar.gz
$ cd prometheus-1.7.1.linux-amd64/
$ ls
LICENSE  NOTICE  console_libraries  consoles  prometheus  prometheus.yml  promtool
$ file prometheus
prometheus: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, not stripped
$ ./prometheus --help
usage: prometheus [<args>]


   -version false
      Print version information.

   -config.file "prometheus.yml"
      Prometheus configuration file name.

 == ALERTMANAGER ==

The rest of the help output is omitted (it was 219 lines long).

Starting it is easy too. Just run it with no arguments.

$ ./prometheus
INFO[0000] Starting prometheus (version=1.7.1, branch=master, revision=3afb3fffa3a29c3de865e1172fb740442e9d0133)  source="main.go:88"
INFO[0000] Build context (go=go1.8.3, user=root@0aa1b7fc430d, date=20170612-11:44:05)  source="main.go:89"
INFO[0000] Host details (Linux 3.10.0-514.26.2.el7.x86_64 #1 SMP Tue Jul 4 15:04:05 UTC 2017 x86_64 <hostname> (none))  source="main.go:90"
INFO[0000] Loading configuration file prometheus.yml     source="main.go:252"
INFO[0000] Loading series map and head chunks...         source="storage.go:428"
INFO[0000] 0 series loaded.                              source="storage.go:439"
INFO[0000] Starting target manager...                    source="targetmanager.go:63"
INFO[0000] Listening on :9090                            source="web.go:259"

(Press Ctrl-C here to stop it)

^CWARN[0032] Received SIGTERM, exiting gracefully...       source="main.go:234"
INFO[0032] See you next time!                            source="main.go:241"
INFO[0032] Stopping target manager...                    source="targetmanager.go:77"
INFO[0032] Stopping rule manager...                      source="manager.go:388"
INFO[0032] Rule manager stopped.                         source="manager.go:394"
INFO[0032] Stopping notification handler...              source="notifier.go:418"
INFO[0032] Stopping local storage...                     source="storage.go:457"
INFO[0032] Stopping maintenance loop...                  source="storage.go:459"
INFO[0032] Maintenance loop stopped.                     source="storage.go:1458"
INFO[0032] Stopping series quarantining...               source="storage.go:463"
INFO[0032] Series quarantining stopped.                  source="storage.go:1907"
INFO[0032] Stopping chunk eviction...                    source="storage.go:467"
INFO[0032] Chunk eviction stopped.                       source="storage.go:1153"
INFO[0032] Checkpointing in-memory metrics and chunks...  source="persistence.go:633"
INFO[0032] Done checkpointing in-memory metrics and chunks in 20.712234ms.  source="persistence.go:665"
INFO[0032] Checkpointing fingerprint mappings...         source="persistence.go:1526"
INFO[0032] Done checkpointing fingerprint mappings in 3.875662ms.  source="persistence.go:1549"
INFO[0032] Local storage stopped.                        source="storage.go:484"

Now that we know how to start it, let's put everything in its proper place. This time I'm placing it in /opt/prometheus.

$ sudo mv prometheus-1.7.1.linux-amd64/ /opt/prometheus  # move the directory extracted earlier
$ sudo chmod 755 /opt/prometheus
$ sudo chown -R root:root /opt/prometheus

$ sudo mkdir /etc/prometheus  # create a config directory for prometheus
$ sudo ln -s /opt/prometheus/console_libraries /etc/prometheus/console_libraries  # set up a few symlinks
$ sudo ln -s /opt/prometheus/consoles /etc/prometheus/consoles
$ sudo ln -s /opt/prometheus/prometheus.yml /etc/prometheus/prometheus.yml

$ sudo vim /etc/default/prometheus
# /var/lib/prometheus (the data directory is created automatically if it doesn't exist)
OPTIONS="-config.file=/etc/prometheus/prometheus.yml -storage.local.path=/var/lib/prometheus -web.console.libraries=/etc/prometheus/console_libraries -web.console.templates=/etc/prometheus/consoles"

$ sudo vim /etc/systemd/system/prometheus.service  # same settings as in the reference article (thanks)
[Unit]
Description=Prometheus Service
After=network.target

[Service]
Type=simple
EnvironmentFile=-/etc/default/prometheus
ExecStart=/opt/prometheus/prometheus $OPTIONS
PrivateTmp=true
# (Added 2017/08/12 12:36) It uses a fair amount of memory, so it may be worth letting it restart automatically
#Restart=always

[Install]
WantedBy=multi-user.target

$ sudo systemctl daemon-reload  # I added a unit file; not sure this is required, but just in case
$ sudo systemctl start prometheus  # start it
$ sudo systemctl status -l prometheus
● prometheus.service - Prometheus Service
   Loaded: loaded (/etc/systemd/system/prometheus.service; disabled; vendor preset: disabled)
   Active: active (running) since 木 2017-07-27 00:29:01 JST; 15min ago
 Main PID: 1323 (prometheus)
   CGroup: /system.slice/prometheus.service
           └─1323 /opt/prometheus-1.7.1/prometheus -config.file=/etc/prometheus/prometheus.yml -storage.local.path=/var/lib/prometheus -web.console.libraries=/etc/prometheus/console_libraries -web.console.templates=/etc/prometheus/consoles

 7月 27 00:29:01 <hostname> prometheus[1323]: time="2017-07-27T00:29:01+09:00" level=info msg="Loading series map and head chunks..." source="storage.go:428"
 7月 27 00:29:01 <hostname> prometheus[1323]: time="2017-07-27T00:29:01+09:00" level=info msg="546 series loaded." source="storage.go:439"
 7月 27 00:29:01 <hostname> prometheus[1323]: time="2017-07-27T00:29:01+09:00" level=info msg="Starting target manager..." source="targetmanager.go:63"
 7月 27 00:29:01 <hostname> prometheus[1323]: time="2017-07-27T00:29:01+09:00" level=info msg="Listening on :9090" source="web.go:259"
 7月 27 00:34:01 <hostname> prometheus[1323]: time="2017-07-27T00:34:01+09:00" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:633"
 7月 27 00:34:01 <hostname> prometheus[1323]: time="2017-07-27T00:34:01+09:00" level=info msg="Done checkpointing in-memory metrics and chunks in 21.002274ms." source="persistence.go:665"
 7月 27 00:39:01 <hostname> prometheus[1323]: time="2017-07-27T00:39:01+09:00" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:633"
 7月 27 00:39:01 <hostname> prometheus[1323]: time="2017-07-27T00:39:01+09:00" level=info msg="Done checkpointing in-memory metrics and chunks in 20.588386ms." source="persistence.go:665"
 7月 27 00:44:01 <hostname> prometheus[1323]: time="2017-07-27T00:44:01+09:00" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:633"
 7月 27 00:44:01 <hostname> prometheus[1323]: time="2017-07-27T00:44:01+09:00" level=info msg="Done checkpointing in-memory metrics and chunks in 20.26329ms." source="persistence.go:665"

It comes with a Web UI, so as the "Listening on :9090" line says, you can view the admin screen by accessing port 9090.

[Screenshot: Prometheus Web UI on port 9090]

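Installing node_exporter (the agent) on the node to monitor

Reference article: 【入門】PrometheusでサーバやDockerコンテナのリソース監視 | Pocketstudio.jp log3

node_exporter also works by just fetching it with curl and unpacking it. The latest release at the time of writing was v0.14.0; other versions are on the releases page.

Since this is a test, I install it on the same node as Prometheus server.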

$ curl -LO https://github.com/prometheus/node_exporter/releases/download/v0.14.0/node_exporter-0.14.0.linux-amd64.tar.gz
$ tar xzf node_exporter-0.14.0.linux-amd64.tar.gz
$ cd node_exporter-0.14.0.linux-amd64/
$ ls
LICENSE  NOTICE  node_exporter
$ ./node_exporter
INFO[0000] Starting node_exporter (version=0.14.0, branch=master, revision=840ba5dcc71a084a3bc63cb6063003c1f94435a6)  source="node_exporter.go:140"
INFO[0000] Build context (go=go1.7.5, user=root@bb6d0678e7f3, date=20170321-12:12:54)  source="node_exporter.go:141"
INFO[0000] No directory specified, see --collector.textfile.directory  source="textfile.go:57"
INFO[0000] Enabled collectors:                           source="node_exporter.go:160"
INFO[0000]  - conntrack                                  source="node_exporter.go:162"
INFO[0000]  - stat                                       source="node_exporter.go:162"
INFO[0000]  - uname                                      source="node_exporter.go:162"
INFO[0000]  - vmstat                                     source="node_exporter.go:162"
INFO[0000]  - wifi                                       source="node_exporter.go:162"
INFO[0000]  - entropy                                    source="node_exporter.go:162"
INFO[0000]  - infiniband                                 source="node_exporter.go:162"
INFO[0000]  - netdev                                     source="node_exporter.go:162"
INFO[0000]  - netstat                                    source="node_exporter.go:162"
INFO[0000]  - sockstat                                   source="node_exporter.go:162"
INFO[0000]  - diskstats                                  source="node_exporter.go:162"
INFO[0000]  - meminfo                                    source="node_exporter.go:162"
INFO[0000]  - textfile                                   source="node_exporter.go:162"
INFO[0000]  - zfs                                        source="node_exporter.go:162"
INFO[0000]  - mdadm                                      source="node_exporter.go:162"
INFO[0000]  - time                                       source="node_exporter.go:162"
INFO[0000]  - edac                                       source="node_exporter.go:162"
INFO[0000]  - filefd                                     source="node_exporter.go:162"
INFO[0000]  - filesystem                                 source="node_exporter.go:162"
INFO[0000]  - hwmon                                      source="node_exporter.go:162"
INFO[0000]  - loadavg                                    source="node_exporter.go:162"
INFO[0000] Listening on :9100                            source="node_exporter.go:186"

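So let's make this one startable via systemctl as well. I mostly copy-paste the prometheus unit file from earlier. (The ExecStart path below assumes the extracted directory has been moved to /opt/node_exporter, the same way as for prometheus.)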

$ sudo vim /etc/systemd/system/node_exporter.service
[Unit]
Description=Node exporter for prometheus
After=network.target

[Service]
Type=simple
EnvironmentFile=-/etc/default/node_exporter
ExecStart=/opt/node_exporter/node_exporter $OPTIONS
PrivateTmp=true

[Install]
WantedBy=multi-user.target

$ sudo vim /etc/default/node_exporter  # create a nearly empty environment file
OPTIONS=

$ sudo systemctl daemon-reload
$ sudo systemctl start node_exporter
$ sudo systemctl status -l node_exporter
● node_exporter.service - Node exporter for prometheus
   Loaded: loaded (/etc/systemd/system/node_exporter.service; disabled; vendor preset: disabled)
   Active: active (running) since 木 2017-07-27 01:10:16 JST; 3min 5s ago
 Main PID: 1568 (node_exporter)
   CGroup: /system.slice/node_exporter.service
           └─1568 /opt/node_exporter/node_exporter

 7月 27 01:10:16 <hostname> node_exporter[1568]: time="2017-07-27T01:10:16+09:00" level=info msg=" - zfs" source="node_exporter.go:162"
 7月 27 01:10:16 <hostname> node_exporter[1568]: time="2017-07-27T01:10:16+09:00" level=info msg=" - diskstats" source="node_exporter.go:162"
 7月 27 01:10:16 <hostname> node_exporter[1568]: time="2017-07-27T01:10:16+09:00" level=info msg=" - filefd" source="node_exporter.go:162"
 7月 27 01:10:16 <hostname> node_exporter[1568]: time="2017-07-27T01:10:16+09:00" level=info msg=" - loadavg" source="node_exporter.go:162"
 7月 27 01:10:16 <hostname> node_exporter[1568]: time="2017-07-27T01:10:16+09:00" level=info msg=" - meminfo" source="node_exporter.go:162"
 7月 27 01:10:16 <hostname> node_exporter[1568]: time="2017-07-27T01:10:16+09:00" level=info msg=" - netdev" source="node_exporter.go:162"
 7月 27 01:10:16 <hostname> node_exporter[1568]: time="2017-07-27T01:10:16+09:00" level=info msg=" - textfile" source="node_exporter.go:162"
 7月 27 01:10:16 <hostname> node_exporter[1568]: time="2017-07-27T01:10:16+09:00" level=info msg=" - edac" source="node_exporter.go:162"
 7月 27 01:10:16 <hostname> node_exporter[1568]: time="2017-07-27T01:10:16+09:00" level=info msg=" - uname" source="node_exporter.go:162"
 7月 27 01:10:16 <hostname> node_exporter[1568]: time="2017-07-27T01:10:16+09:00" level=info msg="Listening on :9100" source="node_exporter.go:186"

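Now that it starts via systemctl, it's time to edit the prometheus.yml config file, which I hadn't touched at all so far, and add the node monitoring settings. For reference, the original config file looks like this.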

# my global config
global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

  # Attach these labels to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
      monitor: 'codelab-monitor'

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first.rules"
  # - "second.rules"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets: ['localhost:9090']

Change it like this.

--- /etc/prometheus/prometheus.yml.org  2017-07-27 01:21:56.237724564 +0900
+++ /etc/prometheus/prometheus.yml      2017-07-27 01:21:58.481728041 +0900
@@ -24,4 +24,4 @@
     # scheme defaults to 'http'.

     static_configs:
-      - targets: ['localhost:9090']
+      - targets: ['localhost:9090', 'localhost:9100']

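Written this way, Prometheus server issues HTTP GET requests to http://localhost:9100/metrics to collect data. Incidentally, as you can see, localhost:9090 is already listed, which means Prometheus server itself also exposes its data at http://localhost:9090/metrics. Since it already exposes its own data, I wondered whether there was any point in adding the node_exporter host this time, but items starting with "node_" aren't exposed by Prometheus server at all, and the values you can get turned out to be quite different. Items that do overlap are, naturally, shown separately per node like this.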

[Screenshot: graph of an overlapping metric shown separately per node]
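
To see for yourself which metric names each endpoint exposes, a small filter like the one below could be piped after curl. This is a sketch, not something from the Prometheus tooling: metric_names_with_prefix is a hypothetical helper, and it assumes the plain text format in which each sample line starts with the metric name followed by either { or a space.

```shell
# List the distinct metric names starting with a given prefix,
# reading Prometheus text-format metrics from stdin.
metric_names_with_prefix() {
    prefix="$1"
    # Drop comment lines (# HELP / # TYPE), cut each line at the first
    # '{' or space to keep only the metric name, then filter and dedupe.
    grep -v '^#' | sed 's/[{ ].*$//' | grep "^${prefix}" | sort -u
}
```

For example, curl -s http://localhost:9100/metrics | metric_names_with_prefix node_ lists the node_ metrics from node_exporter, and running the same filter against http://localhost:9090/metrics lets you compare with what Prometheus server exposes.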

To check whether the config file is valid, use the promtool command.

$ sudo /opt/prometheus/promtool check-config /etc/prometheus/prometheus.yml
Checking /etc/prometheus/prometheus.yml
  SUCCESS: 0 rule files found

Wrapping up

This time was mainly about installation, so I don't really understand the internals yet, but I'll leave it here for now.

2017-07-23

Making the Bash on Windows shortcut load .bash_profile

WSL

This is the shortcut in question. In my case it was installed at C:\Users\<username>\AppData\Roaming\Microsoft\Windows\Start Menu\Programs\Bash on Ubuntu on Windows.lnk.

[Screenshot: the Bash on Ubuntu on Windows shortcut]

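The conclusion: put [ -f ~/.bashrc ] && . ~/.bashrc in ~/.profile and change the target of the BoW shortcut to C:\Windows\System32\bash.exe ~ --login (that is, add --login), and ~/.profile gets loaded properly.

Just adding --login makes the console lose its colors. Here is why:

- ~/.bashrc sets the PS1 environment variable to color the prompt
- ~/.bashrc is only read when the shell is an interactive shell (-i) and was not started as a login shell (--login) *1
  - When BoW is started from the shortcut, it is interactive by default *2
  - This is why the colors were there at first without doing anything
- With --login it becomes a login shell, so ~/.bashrc is no longer read automatically
- Therefore ~/.profile has to load ~/.bashrc itself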

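~~By the way, I say .profile rather than .bash_profile because apparently that's what Ubuntu uses.~~ Regardless of Ubuntu, /etc/profile is read first, and then ~/.bash_profile, ~/.bash_login, and ~/.profile are searched in that order and the first one found is read (thanks, @kariya_mitsuru).*3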
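
The lookup order can be mimicked with a few lines of shell, which is handy for checking which file a login shell would end up reading in a given home directory. This is a sketch that only reproduces the order described above; first_login_startup_file is a made-up helper name and it does not ask bash itself.

```shell
# Print the first per-user login startup file that exists and is readable
# in the given home directory, using the order from man bash:
# ~/.bash_profile, ~/.bash_login, ~/.profile.
first_login_startup_file() {
    home_dir="$1"
    for name in .bash_profile .bash_login .profile; do
        if [ -f "$home_dir/$name" ] && [ -r "$home_dir/$name" ]; then
            printf '%s\n' "$home_dir/$name"
            return 0
        fi
    done
    return 1   # none of the three files is present
}
```

Running first_login_startup_file "$HOME" shows whether a stray ~/.bash_profile or ~/.bash_login is shadowing ~/.profile, in which case the line added to ~/.profile would never run.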

*1:“When an interactive shell that is not a login shell is started, bash reads and executes commands from /etc/bash.bashrc and ~/.bashrc, if these files exist. This may be inhibited by using the --norc option. The --rcfile file option will force bash to read and execute commands from file instead of /etc/bash.bashrc and ~/.bashrc.” / from the INVOCATION section of man 1 bash

*2:“An interactive shell is one started without non-option arguments and without the -c option whose standard input and error are both connected to terminals (as determined by isatty(3) ), or one started with the -i option.” / from the INVOCATION section of man 1 bash

*3:“When bash is invoked as an interactive login shell, or as a non-interactive shell with the --login option, it first reads and executes commands from the file /etc/profile, if that file exists. After reading that file, it looks for ~/.bash_profile, ~/.bash_login, and ~/.profile, in that order, and reads and executes commands from the first one that exists and is readable.” / from the INVOCATION section of man 1 bash
