Source:
    https://projects.bytemark.co.uk/projects/custodian

Copyright:
    Copyright (c) 2012 Bytemark Computer Consulting Ltd

Licence:
    This program is free software; you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation; either version 2 of the License, or
    (at your option) any later version.




About Custodian
---------------

Custodian is a simple, scalable, and reliable protocol-tester that allows
a number of services to be tested across a network.

The core design is based upon a work queue, which is manipulated by
two main scripts:

  custodian-enqueue
    * A parser that reads a list of hosts and tests to apply.  These
      tests are broken down into individual jobs, serialized and stored
      in a central queue.

  custodian-dequeue
    * A tool that pulls jobs from the queue, executing them in turn, and
      raises/clears alerts based upon the result of the test.
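
The enqueue/dequeue split described above can be sketched roughly as
follows.  This is only an illustration: the in-memory deque stands in for
the central queue, and the job fields and the function names (enqueue,
dequeue, run_test) are assumptions, not Custodian's actual interface.

```python
import json
from collections import deque


def enqueue(queue, host, test, alert):
    """Serialize one job and append it to the queue (hypothetical job layout)."""
    queue.append(json.dumps({"host": host, "test": test, "alert": alert}))


def dequeue(queue, run_test):
    """Pull jobs in turn, run each one, and decide what to do with its alert.

    run_test is a caller-supplied function that returns True when the test
    passes.  The return value is a list of (alert, action) pairs.
    """
    results = []
    while queue:
        job = json.loads(queue.popleft())
        passed = run_test(job)
        # A failing test raises the alert; a passing test clears it.
        results.append((job["alert"], "clear" if passed else "raise"))
    return results
```

Because each job is serialized independently, several dequeue processes
could in principle share one queue, which is where the scalability of the
design comes from.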




Implementation
--------------

  In brief, the parser accepts four distinct kinds of line:


  1. Comments
  ------------
  Comments are lines that are blank or which begin with the comment-character ("#").
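
  A minimal sketch of that rule, assuming the comment-character must be the
  very first character of the line (the function name is_comment is
  hypothetical):

```python
def is_comment(line):
    """Return True for blank lines and for lines that begin with '#'."""
    return line.strip() == "" or line.startswith("#")
```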


  2. Macro Definitions
  ---------------------
  There are two types of macros:

     FOO_HOSTS are 1.2.3.4 and 2.3.4.5 and 4.5.6.6.
     FOO_HOSTS are fetched from https://admin.bytemark.co.uk/network/monitor_ips/routers.

  We accept each of these, with the caveat that macro-names must match the regular
  expression ^[0-9A-Z_]+$.  Note that it is an error to redefine an existing macro-name.
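
  The two macro forms could be recognised along these lines.  This is a
  sketch under stated assumptions: the function name parse_macro and the
  ("hosts", ...) / ("url", ...) tuples are made up for illustration, and the
  URL is only recorded here, not fetched.

```python
import re

# Macro names are one or more digits, capitals, or underscores.
FETCHED = re.compile(r"^([0-9A-Z_]+)\s+are\s+fetched\s+from\s+(\S+?)\.?$")
LITERAL = re.compile(r"^([0-9A-Z_]+)\s+are\s+(.+?)\.?$")


def parse_macro(line, macros):
    """Store a macro definition in `macros`; return False if line is not one."""
    line = line.strip()
    match = FETCHED.match(line)
    if match:
        name, value = match.group(1), ("url", match.group(2))
    else:
        match = LITERAL.match(line)
        if not match:
            return False
        hosts = re.split(r"\s+and\s+", match.group(2))
        name, value = match.group(1), ("hosts", hosts)
    if name in macros:
        # Redefining an existing macro-name is an error.
        raise ValueError("macro %s is already defined" % name)
    macros[name] = value
    return True
```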


  3.  Service Tests
  -----------------
  Service tests are best explained by several examples:

     SWITCHES must run ssh otherwise 'Bytemark networking infrastructure: switch'.
     mirror.bytemark.co.uk must run ftp on 21 otherwise 'Bytemark Mirror: FTP failure'.

  The general case is:

     url|ip|hostname|macro must run XXX (extra args) otherwise 'alert'.

  We use a class-factory to instantiate tests, so the service name being tested corresponds
  directly to the protocol-tester in our source tree.
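
  One way to picture the class-factory is a registry keyed by service name.
  The sketch below is illustrative only: the class names, the registry, and
  the regular expression are assumptions, and the real protocol-testers are
  looked up in the source tree rather than in a dictionary.

```python
import re

# target, service name, optional extra args, quoted alert text.
SERVICE = re.compile(
    r"^(\S+)\s+must\s+run\s+(\S+)\s*(.*?)\s*otherwise\s+'([^']*)'\.?$"
)


class ProtocolTest:
    registry = {}  # service name -> tester class

    def __init_subclass__(cls, service=None, **kwargs):
        super().__init_subclass__(**kwargs)
        if service:
            ProtocolTest.registry[service] = cls

    def __init__(self, target, args, alert):
        self.target, self.args, self.alert = target, args, alert

    @classmethod
    def create(cls, line):
        """Parse one service-test line and instantiate the matching tester."""
        match = SERVICE.match(line.strip())
        if not match:
            raise ValueError("not a service test: %r" % line)
        target, service, args, alert = match.groups()
        if service not in cls.registry:
            raise ValueError("no protocol-tester for %r" % service)
        return cls.registry[service](target, args, alert)


class SSHTest(ProtocolTest, service="ssh"):
    pass


class FTPTest(ProtocolTest, service="ftp"):
    pass
```

  With this layout, supporting a new service only means adding a new class;
  the parser itself does not change.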


  4. Ping Tests
  -------------
  Ping tests are of the form:

     FOO must ping otherwise 'alert text'.
     example.vm.bytemark.co.uk must ping otherwise 'alert text'.

  These are a simplification of the service tests, because the only real difference
  is that we write "must ping" rather than "must run ping" - to that end we silently
  rewrite any line which reads:

    (.*) must ping (.*)

  This becomes:

    $1 must run ping $2

  This allows the line to be parsed by the previous service-test rules.
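
  The rewrite amounts to a single regular-expression substitution.  A minimal
  sketch (the function name rewrite_ping is hypothetical):

```python
import re

PING = re.compile(r"^(.*) must ping (.*)$")


def rewrite_ping(line):
    """Turn 'X must ping Y' into 'X must run ping Y'; leave other lines alone."""
    return PING.sub(r"\1 must run ping \2", line)
```

  Lines that do not contain "must ping" pass through unchanged, so the
  rewrite can be applied to every line before the service-test rules run.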