Class: OPMLImporter

Inherits:
Object
Defined in:
lib/opml_importer.rb

Overview

This class manages the import of subscription data from another feed aggregator into Feedbunch.

Constant Summary

FOLDER = 'opml_imports'

Class constant for the directory in which OPML export files will be saved.

Class Method Summary

Class Method Details

.enqueue_import_job(file, user) ⇒ Object

This method extracts subscription data from an uploaded OPML file and saves it as an (unzipped) OPML file in the filesystem. Afterwards it enqueues a background job to import those subscriptions into the user's account.

Receives as arguments the file uploaded by the user and the user who requested the import.

Optionally the file can be a zip archive; this is the format one gets when exporting from Google.
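How the method tells a zip archive from a plain OPML file is not shown on this page. As a minimal sketch of one possible approach, the helper below checks whether the upload starts with the zip local file header signature (the bytes `PK\x03\x04`). The name `zip_archive?` is hypothetical and is not part of OPMLImporter.

```ruby
require 'stringio'

# Hypothetical helper, not part of OPMLImporter: guesses whether an uploaded
# IO holds a zip archive by comparing its first four bytes with the zip
# local file header signature.
def zip_archive?(io)
  header = io.read(4)
  io.rewind # leave the IO positioned at the start for the real reader
  header == "PK\x03\x04".b
end

plain  = StringIO.new('<?xml version="1.0"?><opml version="1.0"></opml>')
zipped = StringIO.new("PK\x03\x04rest-of-archive".b)

puts zip_archive?(plain)
puts zip_archive?(zipped)
```

A signature check like this only identifies the container format. An empty archive starts with a different signature, and a real implementation would still need a zip library to extract the OPML entry.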

If any error is raised during importing, this method raises an OpmlImportError, to ensure that the user is always redirected to the start page, instead of being left at a blank HTTP 500 page.
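The rescue-and-reraise behaviour described above can be sketched in isolation. In this sketch `OpmlImportError` is a stand-in defined locally so the example runs on its own, and `with_import_error_wrapping` and `handle_upload` are hypothetical names. The real controller code is not part of this page.

```ruby
# Stand-in for the application's error class, defined here only so the
# sketch is self-contained.
class OpmlImportError < StandardError; end

# Hypothetical wrapper: any failure inside the block is logged and then
# surfaces to the caller as a single, known error type.
def with_import_error_wrapping
  yield
rescue StandardError => e
  warn "import failed: #{e.message}"
  raise OpmlImportError
end

# A caller (for example a controller action) only needs to rescue one class
# to decide where to send the user.
def handle_upload
  with_import_error_wrapping { raise ArgumentError, 'not an OPML file' }
  :import_enqueued
rescue OpmlImportError
  :redirect_to_start_page
end

puts handle_upload
```

Funnelling every failure into one error class keeps the caller's rescue clause small, at the cost of hiding the original exception type from it. That is why the original message is logged before re-raising.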


# File 'lib/opml_importer.rb', line 25

def self.enqueue_import_job(file, user)
  Rails.logger.info "User #{user.id} - #{user.email} requested import of a data file"
  # Destroy the current import job state for the user. This in turn triggers a deletion of any associated import failure data.
  user.opml_import_job_state&.destroy
  user.create_opml_import_job_state state: OpmlImportJobState::RUNNING

  subscription_data = read_data_file file
  filename = "feedbunch_import_#{Time.zone.now.to_i}.opml"
  Feedbunch::Application.config.uploads_manager.save user.id, FOLDER, filename, subscription_data

  Rails.logger.info "Enqueuing Import Subscriptions Job for user #{user.id} - #{user.email}, OPML file #{filename}"
  ImportOpmlWorker.perform_async filename, user.id
  return nil
rescue => e
  Rails.logger.error "Error trying to read OPML data from file uploaded by user #{user.id} - #{user.email}"
  Rails.logger.error e.message
  Rails.logger.error e.backtrace
  user.opml_import_job_state&.destroy
  user.create_opml_import_job_state state: OpmlImportJobState::ERROR
  raise OpmlImportError.new
end

.process_opml(filename, user) ⇒ Object

Process an OPML file with subscriptions for a user, and then delete it.

Receives as arguments:

  • the name of the file, including path from Rails.root (e.g. 'uploads/1371321122.opml')

  • the user who is importing the file

The file is retrieved using the currently configured uploads_manager (from the filesystem or from Amazon S3).
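To illustrate the OPML layout this method expects (feeds directly under `body`, plus feeds grouped one level deep inside folder outlines), here is a small sketch. It uses REXML instead of Nokogiri so that it needs only libraries shipped with Ruby. The helper `feed_entries` and the sample document are hypothetical, and unlike the method itself the sketch returns entries in document order rather than listing all unfoldered feeds first.

```ruby
require 'rexml/document'

# Hypothetical helper, not part of OPMLImporter. Returns [url, folder_title]
# pairs; folder_title is nil for feeds that are not inside a folder.
def feed_entries(opml_xml)
  body = REXML::Document.new(opml_xml).root.elements['body']
  entries = []
  body.elements.each('outline') do |node|
    if node.attributes['type'] == 'rss'
      # Top-level feed: only counted when it carries an xmlUrl attribute
      entries << [node.attributes['xmlUrl'], nil] if node.attributes['xmlUrl']
    else
      # Any other top-level outline is treated as a folder
      title = node.attributes['title'] || node.attributes['text']
      node.elements.each('outline') do |child|
        next unless child.attributes['type'] == 'rss' && child.attributes['xmlUrl']
        entries << [child.attributes['xmlUrl'], title]
      end
    end
  end
  entries
end

sample = <<~OPML
  <?xml version="1.0" encoding="UTF-8"?>
  <opml version="1.0">
    <head><title>Subscriptions</title></head>
    <body>
      <outline type="rss" text="Loose feed" xmlUrl="http://example.com/loose.xml"/>
      <outline title="News" text="News">
        <outline type="rss" text="Feed A" xmlUrl="http://example.com/a.xml"/>
        <outline text="No URL here"/>
      </outline>
    </body>
  </opml>
OPML

feed_entries(sample).each { |url, folder| puts "#{url} -> #{folder.inspect}" }
```

Folders that contain no feed outlines contribute nothing to the result, which matches the method's behaviour of ignoring such nodes.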


# File 'lib/opml_importer.rb', line 56

def self.process_opml(filename, user)
  # Open file and check if it actually exists
  xml_contents = Feedbunch::Application.config.uploads_manager.read user.id, FOLDER, filename
  if xml_contents == nil
    Rails.logger.error "Trying to import for user #{user.id} from non-existing OPML file: #{filename}"
    raise OpmlImportError.new
  end

  # Parse OPML file (it's actually XML)
  begin
    docXml = Nokogiri::XML(xml_contents) {|config| config.strict}
  rescue Nokogiri::XML::SyntaxError => e
    Rails.logger.error "Trying to parse malformed XML file #{filename}"
    raise e
  end

  # Count total number of feeds
  total_feeds = count_total_feeds docXml
  # Check that the file was actually an OPML file with feeds
  if total_feeds == 0
    Rails.logger.error "Trying to import for user #{user.id} from OPML file: #{filename} but file contains no feeds"
    raise OpmlImportError.new
  end
  # Update total number of feeds, so user can see progress.
  user.opml_import_job_state.update total_feeds: total_feeds

  # Arrays that will be passed to ImportSubscriptionsWorker
  urls = []
  folder_ids = []

  # Process feeds that are not in a folder
  docXml.xpath('/opml/body/outline[@type="rss" and @xmlUrl]').each do |feed_node|
    folder_ids << nil
    urls << feed_node['xmlUrl']
  end

  # Process feeds in folders
  docXml.xpath('/opml/body/outline[not(@type="rss")]').each do |folder_node|
    # Ignore <outline> nodes which contain no feeds
    if folder_node.xpath('./outline[@type="rss" and @xmlUrl]').present?
      folder_title = folder_node['title'] || folder_node['text']
      folder = import_folder folder_title, user
      folder_node.xpath('./outline[@type="rss" and @xmlUrl]').each do |feed_node|
        folder_ids << folder.id
        urls << feed_node['xmlUrl']
      end
    end
  end

  # Enqueue set of workers with sidekiq-superworker to import each individual feed
  ImportSubscriptionsWorker.perform_async user.opml_import_job_state.id, urls, folder_ids

  return nil
end