Class: ScheduleManager

Inherits:
Object
Defined in:
lib/schedule_manager.rb

Overview

Class with methods related to managing update schedules.

Class Method Details

.decrement_update_interval(feed) ⇒ Object

Decrement the interval between updates of the passed feed. The current interval is decremented by 10%, down to the minimum set in the application configuration.


# File 'lib/schedule_manager.rb', line 109

def self.decrement_update_interval(feed)
  new_interval = (feed.fetch_interval_secs * 0.9).round
  min = Feedbunch::Application.config.min_update_interval
  new_interval = min if new_interval < min

  # Add up to +/- 1 minute to the update interval, to add some entropy and distribute updates more evenly over time.
  entropy = schedule_entropy
  new_interval += entropy.seconds

  # Decrement the update interval saved in the database
  Rails.logger.debug "Decrementing update interval of feed #{feed.id} - #{feed.title} to #{new_interval} seconds"
  feed.update fetch_interval_secs: new_interval

  # Actually decrement the update interval
  set_scheduled_update feed.id, feed.fetch_interval_secs
end
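
The reduce-then-clamp step in the listing above can be tried in isolation. This is a minimal sketch, not the application's code: `decremented_interval` is a hypothetical helper, and the minimum is passed in as a plain argument rather than read from `Feedbunch::Application.config.min_update_interval`. It leaves out the entropy and the database update.

```ruby
# Hypothetical stand-alone helper mirroring the first three lines of
# decrement_update_interval; no entropy, no persistence.
def decremented_interval(current_secs, min_secs)
  new_interval = (current_secs * 0.9).round
  # Never go below the configured minimum.
  [new_interval, min_secs].max
end

decremented_interval(3600, 900) # => 3240
decremented_interval(950, 900)  # => 900 (855 would fall below the minimum)
```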

.fix_scheduled_updates ⇒ Object

For each available feed in the database, ensure that the next update of the feed is scheduled.

If a feed is found with no scheduled update, one is added.

After invoking this method all available feeds are guaranteed to have their next update scheduled.


# File 'lib/schedule_manager.rb', line 13

def self.fix_scheduled_updates
  Rails.logger.debug 'Fixing scheduled feed updates'

  queue = Sidekiq::Queue.new 'update_feeds'
  queued_ids = queue.select{|job| job.klass == 'ScheduledUpdateFeedWorker'}.map{|job| job.args[0]}

  scheduled_set = Sidekiq::ScheduledSet.new
  scheduled_ids = scheduled_set.select{|job| job.klass == 'ScheduledUpdateFeedWorker'}.map{|job| job.args[0]}

  retry_set = Sidekiq::RetrySet.new
  retry_ids = retry_set.select{|job| job.klass == 'ScheduledUpdateFeedWorker'}.map{|job| job.args[0]}

  workers = Sidekiq::Workers.new
  worker_ids = workers.select{|process_id, thread_id, work| work['payload']['class'] == 'ScheduledUpdateFeedWorker'}.map{|process_id, thread_id, work| work['payload']['args'][0]}

  feeds_unscheduled = []

  Feed.where(available: true).find_each do |feed|
    # count how many update workers there are for each feed in Sidekiq
    schedule_count = feed_schedule_count feed.id, queued_ids, scheduled_ids, retry_ids, worker_ids
    Rails.logger.debug "Update schedule for feed #{feed.id} - #{feed.title} present #{schedule_count} times"

    # if a feed has no update schedule, add it to the array of feeds to be fixed
    if schedule_count == 0
      Rails.logger.warn "Missing schedule for feed #{feed.id} - #{feed.title}"
      feeds_unscheduled << feed
    elsif schedule_count > 1
      # there should be one scheduled update for each feed.
      # If a feed has more than one scheduled update, remove all updates for the feed and add it to the array of feeds to be fixed
      Rails.logger.warn "Feed #{feed.id} - #{feed.title} is scheduled more than one time, removing all scheduled updates to re-add just one"
      unschedule_feed_updates feed.id
      feeds_unscheduled << feed
    end
  end

  if feeds_unscheduled.length > 0
    Rails.logger.warn "A total of #{feeds_unscheduled.length} feeds are missing their update schedules. Adding missing schedules."
    feeds_unscheduled.each do |feed|
      add_missing_schedule feed
    end
  end
end
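
The listing calls `feed_schedule_count`, whose body is not shown on this page. The sketch below is an assumption about what such a count could look like: it sums how often a feed id appears across the id lists, then classifies the result the same way the `if`/`elsif` above does. The names `schedule_count` and `classify` are hypothetical.

```ruby
# Hypothetical counting helper: how many times does feed_id appear
# across the queued, scheduled, retry and in-progress id lists?
def schedule_count(feed_id, *id_lists)
  id_lists.sum { |ids| ids.count(feed_id) }
end

# Mirrors the branches in fix_scheduled_updates: zero means a missing
# schedule, more than one means duplicates that must be cleared.
def classify(count)
  case count
  when 0 then :missing
  when 1 then :ok
  else :duplicated
  end
end

queued_ids    = [7, 3]
scheduled_ids = [7]
retry_ids     = []
worker_ids    = [1]

classify(schedule_count(7, queued_ids, scheduled_ids, retry_ids, worker_ids)) # => :duplicated
classify(schedule_count(3, queued_ids, scheduled_ids, retry_ids, worker_ids)) # => :ok
classify(schedule_count(9, queued_ids, scheduled_ids, retry_ids, worker_ids)) # => :missing
```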

.increment_update_interval(feed) ⇒ Object

Increment the interval between updates of the passed feed. The current interval is incremented by 10% up to the maximum set in the application configuration.


# File 'lib/schedule_manager.rb', line 131

def self.increment_update_interval(feed)
  new_interval = (feed.fetch_interval_secs * 1.1).round
  max = Feedbunch::Application.config.max_update_interval
  new_interval = max if new_interval > max

  # Add up to +/- 1 minute to the update interval, to add some entropy and distribute updates more evenly over time.
  entropy = schedule_entropy
  new_interval += entropy.seconds

  # Increment the update interval saved in the database
  Rails.logger.debug "Incrementing update interval of feed #{feed.id} - #{feed.title} to #{new_interval} seconds"
  feed.update fetch_interval_secs: new_interval

  # Actually increment the update interval
  set_scheduled_update feed.id, feed.fetch_interval_secs
end
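
The body of `schedule_entropy` is not shown on this page. Going only by the comment in the listing ("up to +/- 1 minute"), the sketch below assumes it returns an integer number of seconds between -60 and 60. `jitter_secs` and `incremented_interval` are hypothetical names used for illustration.

```ruby
# Assumption: the jitter is an integer number of seconds in -60..60.
def jitter_secs(rng = Random.new)
  rng.rand(-60..60)
end

# Mirrors the order of operations in increment_update_interval:
# grow by 10%, cap at the maximum, then add the jitter.
def incremented_interval(current_secs, max_secs, jitter)
  capped = [(current_secs * 1.1).round, max_secs].min
  capped + jitter
end

incremented_interval(3600, 21_600, 30)    # => 3990
incremented_interval(21_000, 21_600, -10) # => 21590
```

Because the jitter is added after the cap, the stored interval can end up slightly past the configured limit if the jitter really spans a minute in either direction.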

.schedule_first_update(feed_id) ⇒ Object

Schedule the first update of a feed. Takes as argument the id of the feed whose first update will be scheduled.

The update is scheduled to run a random number of minutes, between 0 and 60, after this method is invoked. This spaces feed updates more or less evenly, or at least randomly, over time, so that the server load from updates is spread out and affects user experience as little as possible.

After each update, the worker schedules the next update of the feed. The worker tries to adapt the scheduling to the rate at which new entries appear in the feed.


# File 'lib/schedule_manager.rb', line 68

def self.schedule_first_update(feed_id)
  delay = Random.rand 61
  Rails.logger.info "Scheduling updates of feed #{feed_id} every hour, starting #{delay} minutes from now at #{Time.zone.now + delay.minutes}"
  set_scheduled_update feed_id, delay.minutes
end
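
`Random.rand 61` returns an integer from 0 to 60 inclusive, which is where the "between 0 and 60" range comes from. The sketch below shows this with plain Ruby and no Rails helpers. `first_update_delay_minutes` and `first_update_at` are hypothetical names, and the time arithmetic uses seconds in place of ActiveSupport's `delay.minutes`.

```ruby
# Integer delay in minutes, 0..60 inclusive (rand(61) excludes 61 itself).
def first_update_delay_minutes(rng = Random.new)
  rng.rand(61)
end

# Plain-Ruby equivalent of `Time.zone.now + delay.minutes`:
# Time + Integer adds seconds.
def first_update_at(now, delay_minutes)
  now + delay_minutes * 60
end

delay = first_update_delay_minutes
puts "First update in #{delay} minutes, at #{first_update_at(Time.now, delay)}"
```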

.unschedule_feed_updates(feed_id) ⇒ Object

Unschedule (that is, remove from scheduling) future updates of the passed feed. Takes as argument the id of the feed; scheduled updates for other feeds are unaffected.

Normally, when this method is invoked the feed also has to be marked as unavailable. After invoking this method, if the feed is marked as unavailable, it won't be updated again. However, if it is still marked as available, periodic updates will start running again the next time FixSchedulesWorker runs (normally daily). Because of this, invoking this method is not enough to make a feed stop updating for good: the “available” flag of the feed must be set to false as well.


# File 'lib/schedule_manager.rb', line 85

def self.unschedule_feed_updates(feed_id)
  Rails.logger.info "Unscheduling updates of feed #{feed_id}"

  queue = Sidekiq::Queue.new 'update_feeds'
  queue_jobs = queue.select {|job| job.klass == 'ScheduledUpdateFeedWorker' && job.args[0] == feed_id}
  Rails.logger.info "Feed #{feed_id} update found in 'update_feeds' queue #{queue_jobs.size} times, deleting" if queue_jobs.size > 0
  queue_jobs.each {|job| job.delete}

  scheduled_set = Sidekiq::ScheduledSet.new
  scheduled_job = scheduled_set.select {|job| job.klass == 'ScheduledUpdateFeedWorker' && job.args[0] == feed_id}
  Rails.logger.info "Feed #{feed_id} update scheduled #{scheduled_job.size} times, deleting" if scheduled_job.size > 0
  scheduled_job.each {|job| job.delete}

  retrying = Sidekiq::RetrySet.new
  retrying_job = retrying.select {|job| job.klass == 'ScheduledUpdateFeedWorker' && job.args[0] == feed_id}
  Rails.logger.info "Feed #{feed_id} update marked for retrying #{retrying_job.size} times, deleting" if retrying_job.size > 0
  retrying_job.each {|job| job.delete}
end
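
The same selection predicate is applied to the queue, the scheduled set and the retry set. The sketch below runs that predicate against in-memory stand-ins to show why jobs for other feeds and other worker classes are left alone. `StubJob` and `jobs_for_feed` are hypothetical; real Sidekiq entries are deleted with `job.delete`, as in the listing.

```ruby
# Stand-in for a Sidekiq job entry; only the two fields the filter reads.
StubJob = Struct.new(:klass, :args)

# Same predicate as in unschedule_feed_updates: match on worker class
# and on the feed id passed as the first job argument.
def jobs_for_feed(jobs, feed_id)
  jobs.select { |job| job.klass == 'ScheduledUpdateFeedWorker' && job.args[0] == feed_id }
end

jobs = [
  StubJob.new('ScheduledUpdateFeedWorker', [1]),
  StubJob.new('ScheduledUpdateFeedWorker', [2]),
  StubJob.new('OtherWorker', [1])
]

matching  = jobs_for_feed(jobs, 1)
remaining = jobs - matching

puts "Would delete #{matching.size} job(s), leaving #{remaining.size}"
```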