Age | Commit message | Author |
|
|
|
Resolve "Custodian temporary DNS failure"
Closes #13
See merge request !11
|
|
Also removed a redundant `begin`.
|
|
|
|
When an IPv4 address lookup fails we confirm the failure;
similarly, if an IPv6 lookup fails we confirm that too before
raising the alert.
|
|
That is then tested when resolve-errors are handled.
|
|
|
|
We've had a problem for the past few weeks (?) where we see
false DNS errors when making http/https requests with `curb`/`libcurl`.
To resolve these issues properly we're going to have to rewrite
the code to avoid the current gem. However that is considerable work
because of the hole we've backed ourselves into - wanting to test both
IPv4 and IPv6 "properly". We'll have to duplicate that work if
we use `net/http`, or even more so if we use `open3` and exec
`curl -4|-6 ..`
For the moment this commit changes how things are handled to deal
with the issue we see - which doesn't solve the problem but will
mask it.
When custodian runs a test it will return a status-code:
* Custodian::TestResult::TEST_FAILED
  * The test failed, such that an alert should be raised.
* Custodian::TestResult::TEST_PASSED
  * The test succeeded, such that any previous alert should be cleared.
* Custodian::TestResult::TEST_SKIPPED
  * Nothing should be done.
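A minimal sketch of how a caller might act on these three status codes. The symbols below are hypothetical stand-ins for the `Custodian::TestResult` constants, and `action_for` is an illustrative name rather than anything in custodian itself:

```ruby
# Hypothetical stand-ins for the Custodian::TestResult constants.
module TestResult
  TEST_FAILED  = :failed
  TEST_PASSED  = :passed
  TEST_SKIPPED = :skipped
end

# Map a status code to the action described in the list above.
# The returned symbols are illustrative names only.
def action_for(status)
  case status
  when TestResult::TEST_FAILED  then :raise_alert
  when TestResult::TEST_PASSED  then :clear_alert
  when TestResult::TEST_SKIPPED then :do_nothing
  else raise ArgumentError, "unknown status: #{status.inspect}"
  end
end
```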
As the failure we see is very very specific - an exception is thrown
of the type `Curl::Err::HostResolutionError` - we can catch that
and return `TEST_SKIPPED`. That means that there will be no
(urgent) alert.
Obviously the potential risk of swallowing all DNS-failures is that
a domain might expire and we'd never know. So we'll do a little
better than merely skipping the test if there are DNS failures:
* If we see a DNS failure:
  * We try to look up the host as an A & AAAA record.
  * If that succeeds we decide the issue was bogus.
  * If that fails then the host legitimately doesn't resolve so we raise an alert.
To recap:
* If a host fails normally - bogus status-code, or missing text - we behave as we did in the past.
* Only in the case of a DNS-error from curb/curl do we go down this horrid path,
  where we try to confirm the error, and swallow it if false.
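The confirmation step could be sketched roughly as follows, using the standard `resolv` library. This is an illustration under assumptions, not the code from the commit: `confirm_dns_failure`, the `lookup:` parameter, and the result symbols are hypothetical names, and the sketch assumes that a successful answer for either record type is enough to treat the original error as bogus.

```ruby
require 'resolv'

# Hypothetical stand-ins for the Custodian::TestResult constants.
TEST_FAILED  = :test_failed
TEST_SKIPPED = :test_skipped

# Default lookup: ask DNS for records of the given type.
# Returns an array of resource records (empty if nothing was found).
DEFAULT_LOOKUP = lambda do |host, type|
  Resolv::DNS.open { |dns| dns.getresources(host, type) }
end

# Called after the HTTP client has reported a resolution error.
# Re-checks the host as A and AAAA records:
#   * any answer  -> the original error was bogus, skip the test
#   * no answers  -> the host really does not resolve, fail the test
# Assumption: one successful record type is enough to call it bogus.
def confirm_dns_failure(host, lookup: DEFAULT_LOOKUP)
  types = [Resolv::DNS::Resource::IN::A, Resolv::DNS::Resource::IN::AAAA]
  resolved = types.any? do |type|
    begin
      !lookup.call(host, type).empty?
    rescue Resolv::ResolvError
      false
    end
  end
  resolved ? TEST_SKIPPED : TEST_FAILED
end
```

In the real test the call would sit inside a `rescue Curl::Err::HostResolutionError` handler; the `lookup:` parameter exists only so the logic can be exercised without touching the network.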
This closes #13.
|
|
Alert in more detail on DNS failures.
See merge request !10
|
|
|
|
Updated to log the exact DNS error.
See merge request !9
|
|
This is part of #13.
|
|
Resolve "The redis view of "known_tests" is often out-of-date"
Closes #12
See merge request !8
|
|
|
|
This will prune old tests from the `redis`-alerter - if that alerter
isn't used this will be harmless.
|
|
|
|
Resolve "We should support HTTP-basic auth for HTTP-based status-checks."
Closes #10
See merge request !7
|
|
Rather than:
with auth 'username:password'
We use:
http://user:pass@example.com/
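As a rough illustration of the URL-embedded form, credentials can be pulled out of the URL and turned into a basic-auth header. This sketch uses the standard `net/http` library rather than the `curb` gem custodian uses, purely to stay within the standard library; `build_request` is a hypothetical helper name.

```ruby
require 'uri'
require 'net/http'

# Build a GET request for the URL, copying any user:pass@ credentials
# from the URL into an HTTP basic-auth header.
# Note: URI returns the userinfo as written, so percent-encoded
# characters in the username or password are not decoded here.
def build_request(url)
  uri = URI.parse(url)
  request = Net::HTTP::Get.new(uri.request_uri)
  request.basic_auth(uri.user, uri.password) if uri.user
  request
end
```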
|
|
|
|
Supply this like so:
http://example.com/ must run http with auth 'username:passw0rd' with status 200 otherwise 'failure'
|
|
Allow tests to specify the number of days before an expiring SSL certificate will generate a warning
See merge request !5
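A sketch of the threshold check this describes, assuming the certificate's expiry time is already in hand (for example from `OpenSSL::X509::Certificate#not_after`). The helper names and the `warn_days` parameter are hypothetical, not taken from custodian:

```ruby
SECONDS_PER_DAY = 86_400

# Whole days remaining until the certificate expires.
# Negative once the certificate has already expired.
def days_until_expiry(not_after, now = Time.now)
  ((not_after - now) / SECONDS_PER_DAY).floor
end

# True when the certificate expires in fewer than warn_days days.
def expiry_warning?(not_after, warn_days, now = Time.now)
  days_until_expiry(not_after, now) < warn_days
end
```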
|
|
|
|
Resolve "gitlab-ci should run the test-cases."
Closes #9
See merge request !6
|
|
|
|
They fail.
|
|
|
|
|
|
Resolve "Allow subject-lines to be prefixed with a custom string."
See merge request !4
|
|
|
|
|
|
This will allow classification (by human eyes) of raised-alerts.
|
|
Move to gitlab-CI.
Closes #6
See merge request !3
|
|
|
|
|
|
This closes #6.
|
|
|
|
Show host/port when TCP timeout occurs.
This is a failure case which is not 100% clear.
This closes #4.
See merge request !2
|
|
This is a failure case which is not 100% clear.
This closes #4.
|
|
|
|
Send the server-name-indicator (SNI) when falling back to legacy.
If ruby-based SSL negotiation fails then we fall back to invoking
(horridly!) openssl directly. Until now this didn't send the SNI
hostname to connect to, so it could only test the first/default SSL site
that was listening upon a given IP address.
This commit updates things such that we send the correct hostname,
from the URL under-test.
Closes #3
See merge request !1
|
|
If ruby-based SSL negotiation fails then we fall back to invoking
(horridly!) openssl directly. Until now this didn't send the SNI
hostname to connect to, so it could only test the first/default SSL site
that was listening upon a given IP address.
This commit updates things such that we send the correct hostname,
from the URL under-test.
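For illustration, the hostname from the URL can be passed to `openssl s_client` through its `-servername` option, which sets the SNI name. The sketch below only builds the argument list; `openssl_command` is a hypothetical helper and the exact command custodian runs may differ.

```ruby
require 'uri'

# Build an `openssl s_client` invocation that connects to the host and
# port taken from the URL and sends the same hostname as the SNI name.
def openssl_command(url)
  uri = URI.parse(url)
  ['openssl', 's_client',
   '-connect', "#{uri.host}:#{uri.port}",
   '-servername', uri.host]
end
```

The resulting array could then be handed to something like `Open3.capture2e(*openssl_command(url), stdin_data: '')`, which avoids passing the hostname through a shell.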
|
|
|
|
Since the ruby version available to wheezy doesn't support TLS 1.2,
fetching the certificate from a remote HTTPS server will fail if
TLS 1.2 is all that the server offers.
If we hit that condition, and only that one, we'll fall back to
invoking `openssl` natively. This will allow us to monitor
expiration-time for remote SSL certificates, but the downside is
that we no longer receive the bundle that the remote server might
send - so we cannot validate the signature chain.
This closes #2.
|
|
|
|
|
|
This prevents an endless loop.
|
|
This involved silencing a few issues that were judged to be minor,
and changing various whitespaces and function-calls. The most
obvious example was changing this:
assert(ret.kind_of? Array)
To this:
assert(ret.kind_of?(Array))
|
|
These are again mostly based around whitespace-changes.
|
|
|
|
Again these were whitespace-related.
|