Class: Scrapinghub::Jobs

Inherits:
Object
Includes:
Contracts, HTTParty
Defined in:
lib/scrapinghub/jobs.rb

Constructor Details

#initialize(api_key:) ⇒ Object

Initialize a new Jobs API client

Parameters:

  • api_key (String)

    Scrapinghub API key



# File 'lib/scrapinghub/jobs.rb', line 18

def initialize(api_key:)
  @api_key = api_key
end

Instance Method Details

#delete(args) ⇒ Kleisli::Left, Kleisli::Right

Delete one or more jobs.

Parameters:

  • project (Fixnum)

    the project’s numeric ID

  • job (String, Array<String>)

    the ID(s) of the job(s) to delete

Returns:

  • (Kleisli::Left)

    if the API responds with a status other than 200 (e.g. bad authentication) or a low-level exception is raised (e.g. the host is down); carries the failed response or the exception.

  • (Kleisli::Right)

    if the API responds with status 200; carries the response.



# File 'lib/scrapinghub/jobs.rb', line 141

def delete(args)
  options = { body: args, basic_auth: { username: @api_key } }
  Try { self.class.post("/api/jobs/delete.json", options) }.to_either >-> response {
    if response.code == 200
      Right(response)
    else
      Left(response)
    end
  }
end
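
The Left/Right contract above is shared by every method in this class, so a caller typically branches on the result type. The following is a minimal sketch of that pattern, not the library's implementation: `Left`, `Right`, `FakeResponse`, and `wrap_request` are hypothetical stand-ins defined here so the snippet runs without the kleisli or httparty gems, and the `value` reader exists only for illustration.

```ruby
# Hypothetical stand-ins for Kleisli's Left/Right and an HTTP response,
# defined only so this sketch runs without external gems.
Left  = Struct.new(:value)
Right = Struct.new(:value)
FakeResponse = Struct.new(:code, :body)

# Mirrors the shape of the methods in this class: a raised exception or
# a non-200 response becomes a Left, a 200 response becomes a Right.
def wrap_request
  response = yield
  response.code == 200 ? Right.new(response) : Left.new(response)
rescue StandardError => e
  Left.new(e)
end

ok      = wrap_request { FakeResponse.new(200, '{"status": "ok"}') }
denied  = wrap_request { FakeResponse.new(403, "Authentication failed") }
offline = wrap_request { raise IOError, "host is down" }

# A caller can branch on the result type.
[ok, denied, offline].each do |result|
  case result
  when Right then puts "success: #{result.value.body}"
  when Left  then puts "failure: #{result.value.inspect}"
  end
end
```

Note that a Left may hold either a response object or an exception, so failure handling should not assume that a status code is available.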

#list(args) ⇒ Kleisli::Left, Kleisli::Right

Retrieve information about jobs.

Parameters:

  • project (Fixnum)

    the project’s numeric ID

  • job (String, Array<String>)

    (optional) ID(s) of specific jobs to retrieve

  • spider (String)

    (optional) a spider name (only jobs belonging to this spider will be returned)

  • state (String)

    (optional) return only jobs with this state. Valid values: “pending”, “running”, “finished”

  • has_tag (String, Array<String>)

    (optional) return only jobs containing the given tag(s)

  • lacks_tag (String, Array<String>)

    (optional) return only jobs not containing the given tag(s)

  • count (Fixnum)

    (optional) maximum number of jobs to return

Returns:

  • (Kleisli::Left)

    if the API responds with a status other than 200 (e.g. bad authentication) or a low-level exception is raised (e.g. the host is down); carries the failed response or the exception.

  • (Kleisli::Right)

    if the API responds with status 200; carries the response.



# File 'lib/scrapinghub/jobs.rb', line 48

def list(args)
  options = { query: args, basic_auth: { username: @api_key } }
  Try { self.class.get("/api/jobs/list.json", options) }.to_either >-> response {
    if response.code == 200
      Right(response)
    else
      Left(response)
    end
  }
end
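
Since `list` forwards its argument hash unchanged as the query string, it can help to assemble and check the filters before calling it. The sketch below assumes a hypothetical helper, `list_args`, and a hypothetical constant, `VALID_STATES`; neither is part of the library. It drops unset filters and rejects states other than the three documented above.

```ruby
# Hypothetical constant: the state values listed in the parameter docs.
VALID_STATES = %w[pending running finished].freeze

# Hypothetical helper: builds the hash to pass to #list, omitting
# filters that were not given and rejecting undocumented states.
def list_args(project:, job: nil, spider: nil, state: nil,
              has_tag: nil, lacks_tag: nil, count: nil)
  unless state.nil? || VALID_STATES.include?(state)
    raise ArgumentError, "unknown state: #{state.inspect}"
  end

  { project: project, job: job, spider: spider, state: state,
    has_tag: has_tag, lacks_tag: lacks_tag, count: count }.compact
end

args = list_args(project: 123, spider: "example", state: "finished", count: 10)
p args
```

The resulting hash could then be passed as `jobs.list(args)`, where `jobs` is an instance of this class.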

#schedule(args) ⇒ Kleisli::Left, Kleisli::Right

Schedule a job.

Parameters:

  • project (Fixnum)

    the project’s numeric ID

  • spider (String)

    the spider name

  • add_tag (String, Array<String>)

    (optional) add tag(s) to the job

  • priority (Fixnum)

    (optional) set the job priority: possible values range from 0 (lowest priority) to 4 (highest priority), default is 2

  • extra (Hash)

    (optional) extra parameters passed as spider arguments

Returns:

  • (Kleisli::Left)

    if the API responds with a status other than 200 (e.g. bad authentication) or a low-level exception is raised (e.g. the host is down); carries the failed response or the exception.

  • (Kleisli::Right)

    if the API responds with status 200; carries the response.



# File 'lib/scrapinghub/jobs.rb', line 78

def schedule(args)
  extra = args.delete(:extra) || {}
  options = { body: args.merge(extra), basic_auth: { username: @api_key } }
  Try { self.class.post("/api/schedule.json", options) }.to_either >-> response {
    if response.code == 200
      Right(response)
    else
      Left(response)
    end
  }
end
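
How `extra` is folded into the request body can be seen without making an HTTP call. The sketch below mirrors the first two lines of `schedule` in a hypothetical helper, `build_schedule_body`, which is not part of the library. It also shows that `args.delete(:extra)` mutates the hash it receives, so passing a copy may be safer when the caller reuses the hash.

```ruby
# Hypothetical helper mirroring the first two lines of #schedule:
# remove :extra from the arguments and merge its entries at top level.
def build_schedule_body(args)
  extra = args.delete(:extra) || {}
  args.merge(extra)
end

args = {
  project: 123,
  spider: "example",
  priority: 3,
  extra: { category: "books" } # sent as a spider argument
}

# Pass a shallow copy so the caller's hash keeps its :extra key.
body = build_schedule_body(args.dup)

p body
p args.key?(:extra)
```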

#stop(args) ⇒ Kleisli::Left, Kleisli::Right

Stop one or more running jobs.

Parameters:

  • project (Fixnum)

    the project’s numeric ID

  • job (String)

    the ID of a job to stop

Returns:

  • (Kleisli::Left)

    if the API responds with a status other than 200 (e.g. bad authentication) or a low-level exception is raised (e.g. the host is down); carries the failed response or the exception.

  • (Kleisli::Right)

    if the API responds with status 200; carries the response.



# File 'lib/scrapinghub/jobs.rb', line 163

def stop(args)
  options = { body: args, basic_auth: { username: @api_key } }
  Try { self.class.post("/api/jobs/stop.json", options) }.to_either >-> response {
    if response.code == 200
      Right(response)
    else
      Left(response)
    end
  }
end

#update(args) ⇒ Kleisli::Left, Kleisli::Right

Update information about jobs.

Parameters:

  • project (Fixnum)

    the project’s numeric ID

  • job (String, Array<String>)

    (optional) ID(s) of specific jobs to update

  • spider (String)

    (optional) update only jobs belonging to this spider

  • state (String)

    (optional) update only jobs with this state. Valid values: “pending”, “running”, “finished”

  • has_tag (String, Array<String>)

    (optional) update only jobs containing the given tag(s)

  • lacks_tag (String, Array<String>)

    (optional) update only jobs not containing the given tag(s)

  • add_tag (String, Array<String>)

    (optional) tag(s) to add to the matched jobs

  • remove_tag (String, Array<String>)

    (optional) tag(s) to remove from the matched jobs

Returns:

  • (Kleisli::Left)

    if the API responds with a status other than 200 (e.g. bad authentication) or a low-level exception is raised (e.g. the host is down); carries the failed response or the exception.

  • (Kleisli::Right)

    if the API responds with status 200; carries the response.



# File 'lib/scrapinghub/jobs.rb', line 119

def update(args)
  options = { body: args, basic_auth: { username: @api_key } }
  Try { self.class.post("/api/jobs/update.json", options) }.to_either >-> response {
    if response.code == 200
      Right(response)
    else
      Left(response)
    end
  }
end