custodian.git - A distributed protocol tester.

Age	Commit message	Author
2018-02-22	Allow tests to set a subject.	Steve Kemp
	This updates the parser, globally, to allow: .... with subject 'xxx'
2018-02-22	Removed obsolete code.	Steve Kemp
	The intention of this series of changes is to allow subjects to be replaced for specific tests. The idea of replacement replaced the idea of a custom-prefix - so I've removed that code before proceeding.
2017-09-20	Always ensure we send a trailing \n to graphite.	Steve Kemp
	This is required for the metrics to be submitted correctly.
2017-09-20	Update our metric-submission.	Steve Kemp
	We'll want to handle timeouts more cleanly now, and use TCP.
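The two metric commits above (TCP submission with timeout handling, and the trailing newline) can be sketched as follows. This is a minimal illustration and not custodian's own code: the method names `graphite_line` and `submit_metric`, the metric path, and the five-second default are assumptions. The line layout `<path> <value> <timestamp>\n` is Graphite's plaintext protocol.

```ruby
require 'socket'
require 'timeout'

# Build one line of Graphite's plaintext protocol.
# The trailing "\n" terminates the metric; without it the line is incomplete.
def graphite_line(path, value, time = Time.now)
  "#{path} #{value} #{time.to_i}\n"
end

# Hypothetical sender: writes a single metric over TCP and reports
# success as true/false instead of letting a timeout propagate.
def submit_metric(host, port, path, value, timeout_secs = 5)
  Timeout.timeout(timeout_secs) do
    socket = TCPSocket.new(host, port)
    begin
      socket.write(graphite_line(path, value))
    ensure
      socket.close
    end
  end
  true
rescue Timeout::Error, SystemCallError, SocketError
  false
end
```

A caller might use `submit_metric('graphite.example.com', 2003, 'custodian.example', 1)`, where the hostname is a placeholder and 2003 is Graphite's usual plaintext port.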
2017-09-20	Added a User-Agent to our HTTP/HTTPS checks.	Steve Kemp

2017-08-08	Moved case statement outside timeout block. [13-catch-bogus-dns]	Steve Kemp
	Also removed a redundant `begin`.
2017-08-08	Use a case-statement for both kinds of IP-matching.	Steve Kemp

2017-08-08	Sanity-check DNS on a per-protocol basis.	Steve Kemp
	When a lookup of IPv4 addresses fails we confirm the failure; similarly, when/if IPv6 lookups fail we confirm that before raising the alert.
2017-08-08	Updated to move ignore-dns-failure code into routine.	Steve Kemp
	That is then tested when resolve-errors are handled.
2017-08-08	Ignore bogus DNS results.	Steve Kemp
	We've had a problem for the past few weeks (?) where we see false DNS errors when making http/https requests with `curb`/`libcurl`.
	To resolve these issues properly we're going to have to rewrite the code to avoid the current gem. However that is considerable work because of the hole we've backed ourselves into - wanting to test both IPv4 and IPv6 "properly". We'll have to duplicate that work if we use `net/http`, or even more so if we use `open3` and exec `curl -4|-6 ..`
	For the moment this commit changes how things are handled to deal with the issue we see - which doesn't solve the problem but will mask it.
	When custodian runs a test it will return a status-code:
	* Custodian::TestResult::TEST_FAILED
	  * The test failed, such that an alert should be raised.
	* Custodian::TestResult::TEST_PASSED
	  * The test succeeded, such that any previous alert should be cleared.
	* Custodian::TestResult::TEST_SKIPPED
	  * Nothing should be done.
	As the failure we see is very specific - an exception is thrown of the type `Curl::Err::HostResolutionError` - we can catch that and return `TEST_SKIPPED`. That means that there will be no (urgent) alert.
	Obviously the potential risk of swallowing all DNS-failures is that a domain might expire and we'd never know. So we'll do a little better than merely skipping the test if there are DNS failures:
	* If we see a DNS failure.
	* Then we try to look up the host as an A & AAAA record.
	* If that succeeds we decide the issue was bogus.
	* If that fails then the host legitimately doesn't resolve so we raise an alert.
	To recap:
	* If a host fails normally - bogus status-code, or missing text - we behave as we did in the past.
	* Only in the case of a DNS-error from curb/curl do we go down this horrid path.
	* Where we try to confirm the error, and swallow it if false.
	This closes #13.
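The confirm-then-swallow decision described in this commit might look roughly like the sketch below. It is an illustration under assumptions, not the code from the commit: the method names `host_resolves?` and `classify_dns_failure` are hypothetical, the symbols `:skipped` and `:failed` stand in for `Custodian::TestResult::TEST_SKIPPED` and `TEST_FAILED`, and the lookup uses Ruby's standard `resolv` library.

```ruby
require 'resolv'

# Hypothetical helper: true if the host has at least one A or AAAA record.
# The resolver is injectable so the logic can be exercised without a network.
def host_resolves?(host, resolver = Resolv::DNS.new)
  types = [Resolv::DNS::Resource::IN::A, Resolv::DNS::Resource::IN::AAAA]
  types.any? { |type| !resolver.getresources(host, type).empty? }
end

# Intended to be called only after curb has raised a host-resolution error.
# :skipped stands in for TEST_SKIPPED (the error was bogus, raise nothing);
# :failed stands in for TEST_FAILED (the host really does not resolve).
def classify_dns_failure(host, resolver = Resolv::DNS.new)
  host_resolves?(host, resolver) ? :skipped : :failed
end
```

In use, a caller would rescue `Curl::Err::HostResolutionError` around the HTTP request and return the result of `classify_dns_failure` for the host under test.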
2017-07-13	Alert in more detail on DNS failures.	Steve Kemp

2017-07-11	Updated to log the exact DNS error. [13-log-dns-errors]	Steve Kemp
	This is part of #13.
2017-04-10	Remove username/password prior to testing URL with curb.	Steve Kemp

2017-04-10	Use standard URL username/password holders. [10-support-http-basic-auth]	Steve Kemp
	Rather than:
		with auth 'username:password'
	We use:
		http://user:pass@example.com/
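Taken together, the two commits above imply that credentials are read from the URL and then removed before the URL is handed to curb. A minimal sketch of that step, assuming a hypothetical helper named `split_credentials` (custodian's own implementation is not shown in this log):

```ruby
require 'uri'

# Hypothetical helper: returns [url_without_credentials, user, password].
# URLs without a user:pass@ section are returned unchanged.
def split_credentials(url)
  uri = URI.parse(url)
  return [url, nil, nil] if uri.userinfo.nil?

  # Remove the first "user:pass@" occurrence, which sits in the authority part.
  stripped = url.sub("#{uri.userinfo}@", '')
  [stripped, uri.user, uri.password]
end

clean_url, user, password = split_credentials('http://user:pass@example.com/')
puts clean_url
```

The username and password would then be supplied to the HTTP client separately as BASIC-Authentication credentials.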
2017-03-28	Support HTTP BASIC-Authentication.	Steve Kemp
	Supply this like so: http://example.com/ must run http with auth 'username:passw0rd' with status 200 otherwise 'failure'
2017-03-27	First stab at allowing custom SSL expiry days	James Hannah

2017-03-16	Use the subject-prefix if it is present.	Steve Kemp

2017-03-16	Added helper for reading a custom-prefix.	Steve Kemp
	This will allow classification (by human eyes) of raised-alerts.
2016-12-19	Show host/port when TCP timeout occurs.	Steve Kemp
	This is a failure case which is not 100% clear. This closes #4.
2016-11-03	Send the server-name-indicator (SNI) when falling back to legacy. [3-send-sni-when-falling-back-to-openssl]	Steve Kemp
	If ruby-based SSL negotiation fails then we fall back to invoking (horridly!) openssl directly. Until now this didn't send the SNI hostname to connect to, so it could only test the first/default SSL site that was listening upon a given IP address. This commit updates things such that we send the correct hostname, from the URL under-test.
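As an illustration of sending SNI from the command line, `openssl s_client` accepts a `-servername` option alongside `-connect`. The sketch below only builds such an argument list from a URL; the helper name `openssl_sni_command` is hypothetical and the exact invocation custodian uses is not shown in this log.

```ruby
require 'uri'

# Hypothetical helper: argument list for an openssl invocation that
# connects to the URL's host and port and sends the hostname as SNI.
def openssl_sni_command(url)
  uri = URI.parse(url)
  ['openssl', 's_client',
   '-connect', "#{uri.host}:#{uri.port}",
   '-servername', uri.host]
end

puts openssl_sni_command('https://example.com/').join(' ')
```

Passing the arguments as an array (for example to `Open3.capture2`) avoids shell quoting problems with unusual hostnames.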
2016-07-18	Fall back to using `openssl` if we can't get certificates.	Steve Kemp
	Since the ruby version available to wheezy doesn't support TLS 1.2, fetching the certificate from remote HTTPS servers will fail if that is all that is available. If we hit that condition, and only that one, we'll fall back to invoking `openssl` natively. This will allow us to monitor expiration-time for remote SSL certificates, but the downside is that we no longer receive the bundle that the remote server might send - so we cannot validate the signature chain. This closes #2.
2016-07-13	Update error message for validation-failu	Steve Kemp

2016-07-13	Retry SSL checks on negotiation failure. [release-0.29]	Steve Kemp
	This prevents an endless loop.
2016-04-22	Updated to fix the last remaining rubocop warnings.	Steve Kemp
	This involved silencing a few issues that were judged to be minor, and changing various whitespaces and function-calls. The most obvious example was changing this:
		assert(ret.kind_of? Array)
	To this:
		assert(ret.kind_of?(Array))
2016-04-22	More rubocop fixups.	Steve Kemp
	These are again mostly based around whitespace-changes.
2016-04-22	More rubocop fixes.	Steve Kemp

2016-04-22	Fixed up more rubocop warnings.	Steve Kemp
	Again these were whitespace-related.
2016-04-22	More updates to silence rubocop style-guides.	Steve Kemp
	These warnings were largely whitespace-based.
2016-04-22	Readded file.	Steve Kemp
	It was required after all.
2016-04-22	Removed obsolete file.	Steve Kemp

2016-04-22	Deleted trailing whitespace.	Steve Kemp
	Made minor formatting cleanups
2016-04-22	Simplified the parsing of the TFTP URI.	Steve Kemp

2016-04-21	added tftp protocol test	James F. Carter

2016-04-21	added a simple tftp utility	James F. Carter

2016-02-10	Don't allow limiting protocol on HTTP/HTTPS tests.	root
	We cannot allow HTTP/HTTPS to be limited by protocol, such as IPv4-only or IPv6-only. Raise an error in the parser if this is attempted. Added test-case to confirm, and this closes #12488.
2016-02-10	Adjusted greediness of regex in http with content	Patrick J Cherry
	It should match the next occurrence of the opening quote type, not the last.
2016-02-10	Adjusted http with content string parsing.	Patrick J Cherry
	It now matches "can't match" and 'he said "ha!"'. Added tests.
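The two quoting fixes above (non-greedy matching, and closing on the same quote character that opened the string) can be illustrated with a back-reference. This is a sketch of the technique only; the helper name `expected_content` and the pattern are assumptions, not custodian's actual parser.

```ruby
# Hypothetical helper: extract the text given after "with content".
# (["']) captures the opening quote, (.*?) matches lazily, and \1 requires
# the same quote character to close the string.
def expected_content(line)
  match = line.match(/\swith\s+content\s+(["'])(.*?)\1/)
  match && match[2]
end

puts expected_content(%q(http://example.com/ must run http with content "can't match"))
puts expected_content(%q(http://example.com/ must run http with content 'he said "ha!"'))
```

Because the match is lazy, a later quoted clause such as `otherwise 'failure'` is not swallowed into the expected content.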
2016-01-18	Updated the queue-handling.	Steve Kemp
	We now use a zset to store our pending tests. This means that jobs are only in the queue once - no duplicates are allowed. This closes #12428.
2016-01-11	Allow expected-test to be double-quoted.	Steve Kemp
	This changes the parser from only allowing this:
		http://example.com/ must run http with content 'reserved'.
	To allowing both of these:
		http://example.com/ must run http with content "reserved".
		http://example.com/ must run http with content 'reserved'.
2015-12-18	Updated to use the right form of counting for the set.	Steve Kemp

2015-12-18	Fixed the name of the string.	Steve Kemp

2015-12-18	Updated to revert to a set with no ordering.	Steve Kemp
	This is more reliable, albeit potentially racy and with the failure case that a job might be readded twice.
2015-12-18	Return values using a reverse-score-range.	Steve Kemp
	This prevents starvation, by ensuring that we pull tests out in a FIFO fashion - by virtue of the timestamp.
2015-12-18	Removed references and support for beanstalkd.	Steve Kemp
	The beanstalkd queue was used in the past, and we later added support for Redis via a simple abstraction layer. But we have not tested or used beanstalkd for over a year, and the client-libraries are no longer available as native Debian packages. With that in mind we've excised the code, although left the abstraction-class in-place.
2015-12-18	Removed debugging print.	Steve Kemp

2015-12-18	Removed the diagnostic output of the test-scores	Steve Kemp

2015-12-18	Use a sorted set for tests in our queue.	Steve Kemp
	This ensures that all tests always run, and we have an ordering.
2015-12-17	Treat our Redis queue as a set.	Steve Kemp
	This means that tests will only ever be enqueued once, regardless of how many times they are parsed.
	In the past we could have a configuration file that read:
		test1 .. test2 .. test3 ..
	Parsing/adding this file would result in a queue looking like so:
		test1 .. test2 .. test3 .. test1 .. test2 .. test3 .. test1 .. test2 .. test3 ..
	Now the queue will ALWAYS look like this:
		test1 .. test2 .. test3 ..
	In the normal course of events this won't matter, as the processing loop will look like so:
	* Add new jobs every minute.
	* Worker runs the jobs.
	In the case of a failing job though the test might take 2.5 minutes and that will cause the queue to back up. (2.5 minutes because a test is repeated 5 times before a fail is announced, and the timeout is 30 seconds. These values can and should be tweaked.)
	With the new method, even if the queue is slowly draining, the queue will never grow to contain hundreds of events; it will just be "topped up", not "overflowing".
	Thanks to James Hannah for the suggestion, and James Lawrie for the patience.
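The difference between the old list-like queue and the set-based queue can be shown with an in-memory model. This sketch uses Ruby's `Set` in place of Redis, and the helper name `enqueue_cycles` is hypothetical; it only demonstrates the deduplication described above, not custodian's queue code.

```ruby
require 'set'

# Hypothetical model: enqueue the same parsed tests for several cycles
# into a list (old behaviour) and into a set (new behaviour).
def enqueue_cycles(tests, cycles)
  list_queue = []
  set_queue = Set.new
  cycles.times do
    tests.each do |test|
      list_queue << test # duplicates accumulate on every parse
      set_queue << test  # re-adding an existing member changes nothing
    end
  end
  [list_queue, set_queue]
end

list_queue, set_queue = enqueue_cycles(['test1 ..', 'test2 ..', 'test3 ..'], 3)
puts "list: #{list_queue.size} entries, set: #{set_queue.size} entries"
```

With three tests parsed three times, the list holds nine entries while the set still holds three, which is the "topped up" rather than "overflowing" behaviour.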
2015-12-02	Added swordfish range - 5.28.56.0/21	Steve Kemp

2015-11-30	Don't do SHA1 signature testing by default.	Steve Kemp