Added openbsd blog article on softraids

author: Julien Dessaux 2021-04-30 14:39:18 +0200
committer: Julien Dessaux 2021-04-30 14:55:27 +0200
commit: 816673ab604e2432765f64e5ad5e87ec108566bc (patch)
tree: 84305139e6c1517cc8bc2f05f635fd3a2d9cc304
parent: Added a book article (diff)
download: www-816673ab604e2432765f64e5ad5e87ec108566bc.tar.gz
www-816673ab604e2432765f64e5ad5e87ec108566bc.tar.bz2
www-816673ab604e2432765f64e5ad5e87ec108566bc.zip
1 files changed, 74 insertions, 0 deletions
diff --git a/content/blog/OpenBSD/softraid_monitoring.md b/content/blog/OpenBSD/softraid_monitoring.md
new file mode 100644
index 0000000..77adfc3
--- /dev/null
+++ b/content/blog/OpenBSD/softraid_monitoring.md
@@ -0,0 +1,74 @@
+---
+title: OpenBSD softraid monitoring
+date: 2021-04-30
+description: How to properly check a software raid array on OpenBSD
+tags:
+  - OpenBSD
+---
+
+## Introduction
+
+I have reinstalled my nas recently from gentoo to OpenBSD and was amazed once again at how elegant OpenBSD is. The softraid setup was simple thanks to the wonderful [faq](https://www.openbsd.org/faq/faq14.html#softraid). The only thing I changed is that I used a raid5 with 3 disks, but the last line of the faq about the monitoring left the matter as an exercise to the reader.
+
+## Softraid monitoring
+
+I had a hard time figuring out how to properly monitor the state of the array without relying on parsing the output of `bioctl` but at last here it is in all its elegance :
+{{< highlight sh >}}
+root@nas:~# sysctl hw.sensors.softraid0
+hw.sensors.softraid0.drive0=online (sd4), OK
+{{< /highlight >}}
+
+I manually failed one drive (with `bioctl -O /dev/sd2a sd4`) then rebuilt it (with `bioctl -R /dev/sd2a sd4)`... then failed two drives in order to have examples of all possible outputs. Here they are if you are interested :
+{{< highlight sh >}}
+root@nas:~# sysctl hw.sensors.softraid0
+hw.sensors.softraid0.drive0=degraded (sd4), WARNING
+{{< /highlight >}}
+
+{{< highlight sh >}}
+root@nas:~# sysctl hw.sensors.softraid0
+hw.sensors.softraid0.drive0=rebuilding (sd4), WARNING
+{{< /highlight >}}
+
+{{< highlight sh >}}
+root@nas:~# sysctl -a |grep -i softraid
+hw.sensors.softraid0.drive0=failed (sd4), CRITICAL
+{{< /highlight >}}
+
+## Nagios check
+
+I am still using nagios on my personal infrastructure, here is the check I wrote if you are interested :
+
+{{< highlight perl >}}
+#!/usr/bin/env perl
+###############################################################################
+#     \_o<     WARNING : This file is being managed by ansible!      >o_/     #
+#     ~~~~                                                           ~~~~     #
+###############################################################################
+
+use strict;
+use warnings;
+
+##### Arguments processing #####
+use Getopt::Long;
+my $diskname;
+my $usage = "Usage: $0 [OPTIONS]
+OPTIONS:
+    -d DEVICE_NAME, --device-name=DEVICE_NAME : device name to inspect.";
+GetOptions("device-name=s" => \$diskname) or die $usage;
+die "You must provide a device-name\n\n$usage" unless $diskname;
+
+##### Softraid Check #####
+my %output = (
+        "code" => 3,
+        "status" => "UNKNOWN",
+);
+if (`uname` eq "OpenBSD\n") {
+        $output{status} = $1 if `sysctl hw.sensors.$diskname.drive0` =~ /=(.*)$/ or do { $!=3; die "UNKNOWN Failed to get sysctl hw.sensors.$diskname.drive0" };
+        $output{code} = 0 if ($output{status} =~ /OK$/);
+        $output{code} = 1 if ($output{status} =~ /WARNING$/);
+        $output{code} = 2 if ($output{status} =~ /CRITICAL$/);
+}
+
+print $output{status};
+exit $output{code};
+{{< /highlight >}}
author	Julien Dessaux	2021-04-30 14:39:18 +0200
committer	Julien Dessaux	2021-04-30 14:55:27 +0200
commit	816673ab604e2432765f64e5ad5e87ec108566bc (patch)
tree	84305139e6c1517cc8bc2f05f635fd3a2d9cc304
parent	Added a book article (diff)
download	www-816673ab604e2432765f64e5ad5e87ec108566bc.tar.gz www-816673ab604e2432765f64e5ad5e87ec108566bc.tar.bz2 www-816673ab604e2432765f64e5ad5e87ec108566bc.zip