author Jason Ackley <jason@ackley.net> 2018-05-02 08:38:56 -0500
committer Jason Ackley <jason@ackley.net> 2018-05-02 08:38:56 -0500
commit ca4aa7c815c5ec9880c0b141cbe8e02f51406b7b
tree bdf4c6ef4b6e4fd42032ad84bca9e50a17d15958 /lib/oxidized
parent dba1f023ce6b53e4e353ca0c9ccc88facdad796f
Quick-fix: repair some logic for @jobs_done.
This small adjustment increments @jobs_done only when a node is pulled successfully or when its retries are exceeded and the node is abandoned for that cycle.

Previously @jobs_done was incremented as soon as process() was called, before it was known whether the node completed OK or failed and required a retry. On a retry the node is requeued for processing, so @jobs_done was incremented multiple times per node (up to the retries count per node for a downed node). This caused @jobs_done to fall out of sync with reality.

One of the main impacts is on when the :nodes_done hook gets called. The hook could fire mid-cycle and then not fire at the 'real' end of the interval, which is the intent of :nodes_done. The next time it fired would be when @jobs_done caught back up to @nodes.count, in the NEXT cycle.
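The counting drift described above can be illustrated with a small standalone simulation. This is only a sketch and not Oxidized code: `count_jobs_done`, the `fixed:` flag, the `RETRIES` value of 3, and the sample `cycle` data are all hypothetical names and values chosen for illustration.

```ruby
# Hypothetical simulation of the counter logic; none of this is Oxidized API.
RETRIES = 3 # assumed retry limit, for illustration only

# node_attempts holds, per node, the job statuses seen during one cycle.
# fixed: false counts every processed job (the old behaviour);
# fixed: true counts only a success or a final give-up (the new behaviour).
def count_jobs_done(node_attempts, fixed:)
  jobs_done = 0
  node_attempts.each do |attempts|
    retry_count = 0
    attempts.each do |status|
      jobs_done += 1 unless fixed   # old: incremented on every call
      if status == :success
        jobs_done += 1 if fixed     # new: node finished successfully
      elsif retry_count < RETRIES
        retry_count += 1            # node is requeued, not finished yet
      else
        jobs_done += 1 if fixed     # new: retries exhausted, node abandoned
      end
    end
  end
  jobs_done
end

cycle = [
  [:success],
  [:fail, :fail, :fail, :fail], # first attempt plus three retries
  [:success]
]

old_count = count_jobs_done(cycle, fixed: false)
new_count = count_jobs_done(cycle, fixed: true)
puts "nodes: #{cycle.size}, old: #{old_count}, new: #{new_count}"
```

With three nodes and one downed node, the old scheme reaches 6 while the new scheme reaches 3, so only the new count matches the node count at the end of the cycle.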
Diffstat (limited to 'lib/oxidized')
-rw-r--r-- lib/oxidized/worker.rb | 7
1 files changed, 6 insertions, 1 deletions
diff --git a/lib/oxidized/worker.rb b/lib/oxidized/worker.rb
index 692b060..5d5fc01 100644
--- a/lib/oxidized/worker.rb
+++ b/lib/oxidized/worker.rb
@@ -42,9 +42,9 @@ module Oxidized
       node.stats.add job
       @jobs.duration job.time
       node.running = false
-      @jobs_done += 1 # needed for worker_done event
       if job.status == :success
+        @jobs_done += 1 # needed for :nodes_done hook
         Oxidized.Hooks.handle :node_success, :node => node,
                                              :job => job
         msg = "update #{node.name}"
@@ -66,6 +66,11 @@ module Oxidized
           msg += ", retry attempt #{node.retry}"
           @nodes.next node.name
         else
+          # Only increment the @jobs_done when we give up retries for a node (or success).
+          # As it would otherwise cause @jobs_done to be incremented with generic retries.
+          # This would cause :nodes_done hook to desync from running at the end of the nodelist and
+          # be fired when the @jobs_done > @nodes.count (could be mid-cycle on the next cycle).
+          @jobs_done += 1
           msg += ", retries exhausted, giving up"
           node.retry = 0
           Oxidized.Hooks.handle :node_fail, :node => node,