Class: Gitlab::GithubImport::UserFinder
- Inherits: Object
  - Object
  - Gitlab::GithubImport::UserFinder
- Includes: ExclusiveLeaseHelpers, Utils::StrongMemoize
- Defined in: lib/gitlab/github_import/user_finder.rb
Overview
Class that can be used for finding a GitLab user ID based on a GitHub user ID or username.
Any found user IDs are cached in Redis to reduce the number of SQL queries executed over time. Valid keys are refreshed upon access so frequently used keys stick around.
Lookups are cached even if no ID was found to remove the need for querying the database when most queries are not going to return results anyway.
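The caching of misses described above can be sketched with an in-memory Hash standing in for Redis. This is only an illustration under stated assumptions: `LookupCache`, `fetch_id`, and the Hash used as a "database" are hypothetical names and are not part of UserFinder.

```ruby
# Minimal sketch of caching lookups even when nothing was found.
# A Hash stands in for Redis; LookupCache and fetch_id are hypothetical.
class LookupCache
  attr_reader :queries

  def initialize(database)
    @database = database # hypothetical stand-in: GitHub ID => GitLab ID
    @store = {}
    @queries = 0
  end

  # Returns the GitLab ID or nil, querying the "database" at most once per key.
  def fetch_id(github_id)
    key = format('github-import/user-finder/user-id/%s', github_id)

    # Hash#key? distinguishes "cached as not found" from "never looked up".
    return @store[key] if @store.key?(key)

    @queries += 1
    @store[key] = @database[github_id] # a nil result is cached too
  end
end

cache = LookupCache.new({ 1 => 100 })
cache.fetch_id(1) # found in the database, then cached
cache.fetch_id(2) # not found, but the miss is cached
cache.fetch_id(2) # served from the cache, no second query
puts cache.queries # prints 2
```

The sketch omits expiry and the refresh-on-access behaviour, which depend on the real cache backend.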
Constant Summary
- ID_CACHE_KEY =
  The base cache key to use for caching user IDs for a given GitHub user ID.
  'github-import/user-finder/user-id/%s'
- ID_FOR_EMAIL_CACHE_KEY =
  The base cache key to use for caching user IDs for a given GitHub email address.
  'github-import/user-finder/id-for-email/%s'
- EMAIL_FOR_USERNAME_CACHE_KEY =
  The base cache key to use for caching the Email addresses of GitHub usernames.
  'github-import/user-finder/email-for-username/%s'
- USERNAME_ETAG_CACHE_KEY =
  The base cache key to use for caching the user ETAG response headers
  'github-import/user-finder/user-etag/%s'
- EMAIL_FETCHED_FOR_PROJECT_CACHE_KEY =
  The base cache key to store whether an email has been fetched for a project
  'github-import/user-finder/%{project}/email-fetched/%{username}'
- SOURCE_NAME_CACHE_KEY =
  'github-import/user-finder/%{project}/source-name/%{username}'
- EMAIL_API_CALL_LOGGING_MESSAGE =
  { true => 'Fetching email from GitHub with ETAG header', false => 'Fetching email from GitHub' }.freeze
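The cache key constants are Ruby format strings. As a small sketch, the positional `%s` keys can be filled with `String#%` (as the method listings on this page do) and the named `%{...}` keys with `Kernel#format`. The values 42, 7, and 'alice' are made up for illustration, and what the real code passes for `project` is an assumption not shown on this page.

```ruby
# Two of the constants, copied from the summary above.
ID_CACHE_KEY = 'github-import/user-finder/user-id/%s'
EMAIL_FETCHED_FOR_PROJECT_CACHE_KEY =
  'github-import/user-finder/%{project}/email-fetched/%{username}'

# Positional placeholder: String#% substitutes the single argument.
id_key = ID_CACHE_KEY % 42

# Named placeholders: format takes the values as keyword-style arguments.
fetched_key = format(EMAIL_FETCHED_FOR_PROJECT_CACHE_KEY, project: 7, username: 'alice')

puts id_key      # prints github-import/user-finder/user-id/42
puts fetched_key # prints github-import/user-finder/7/email-fetched/alice
```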
Constants included from ExclusiveLeaseHelpers
ExclusiveLeaseHelpers::FailedToObtainLockError
Instance Attribute Summary
-
#client ⇒ Object
readonly
Returns the value of attribute client.
-
#project ⇒ Object
readonly
Returns the value of attribute project.
Instance Method Summary
-
#author_id_for(object, author_key: :author) ⇒ Object
Returns the GitLab user ID of an object's author.
- #cached_id_for_github_email(email) ⇒ Object
- #cached_id_for_github_id(id) ⇒ Object
- #email_for_github_username(username) ⇒ String, Nil
-
#fetch_source_name_from_github(username) ⇒ String
Retrieves the name of the user associated with a specified GitHub username.
-
#find(id, username) ⇒ Object
Returns the GitLab ID for the given GitHub ID or username.
-
#find_from_cache(id, email = nil) ⇒ Object
Finds a user ID from the cache for a given GitHub ID or Email.
-
#find_id_from_database(id, email) ⇒ Object
Finds a GitLab user ID from the database for a given GitHub user ID or Email.
-
#id_for_github_email(email) ⇒ Object
Queries and caches the GitLab user ID for a GitHub email, if one was found.
-
#id_for_github_id(id) ⇒ Object
If importing from github.com, queries and caches the GitLab user ID for a GitHub user ID, if one was found.
-
#initialize(project, client) ⇒ UserFinder
constructor
project - An instance of Project
client - An instance of Gitlab::GithubImport::Client.
- #query_id_for_github_email(email) ⇒ Object
- #query_id_for_github_id(id) ⇒ Object
-
#read_id_from_cache(key) ⇒ Object
Reads an ID from the cache.
-
#source_user(user) ⇒ Object
Returns the GitLab user ID from placeholder or reassigned_to user.
-
#source_user_accepted?(user) ⇒ Boolean
Returns true if the GitLab user has accepted their reassignment status or if UCM is not enabled.
-
#user_id_for(user, ghost: true) ⇒ Integer, NilClass
Returns the GitLab user ID for a GitHub user.
Methods included from ExclusiveLeaseHelpers
Constructor Details
#initialize(project, client) ⇒ UserFinder
project - An instance of Project
client - An instance of Gitlab::GithubImport::Client
    # File 'lib/gitlab/github_import/user_finder.rb', line 44
    def initialize(project, client)
      @project = project
      @client = client
    end
Instance Attribute Details
#client ⇒ Object (readonly)
Returns the value of attribute client.
    # File 'lib/gitlab/github_import/user_finder.rb', line 18
    def client
      @client
    end
#project ⇒ Object (readonly)
Returns the value of attribute project.
    # File 'lib/gitlab/github_import/user_finder.rb', line 18
    def project
      @project
    end
Instance Method Details
#author_id_for(object, author_key: :author) ⇒ Object
Returns the GitLab user ID of an object's author.
If the object has no author ID we'll use the ID of the GitLab ghost user.
object - An instance of Hash or a Github::Representer
    # File 'lib/gitlab/github_import/user_finder.rb', line 54
    def author_id_for(object, author_key: :author)
      user_info = case author_key
                  when :actor
                    object[:actor]
                  when :review_requester
                    object[:review_requester]
                  else
                    object ? object[:author] : nil
                  end

      # TODO when improved user mapping is released we can refactor everything below to just
      # user_id_for(user_info)
      id = user_id_for(user_info, ghost: true)

      if id
        [id, true]
      else
        [project.creator_id, false]
      end
    end
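As the listing shows, the method returns a two-element Array: the resolved ID and a boolean saying whether a user was actually found, with the project creator as the fallback. The sketch below mirrors that shape outside of GitLab. `resolve_author`, the `lookup` lambda, and all IDs are hypothetical stand-ins, not part of UserFinder.

```ruby
# Hypothetical stand-in for author_id_for: picks the user hash by key,
# then returns [id, found] with a fallback to the project creator.
def resolve_author(object, author_key:, lookup:, creator_id:)
  user_info = case author_key
              when :actor then object[:actor]
              when :review_requester then object[:review_requester]
              else object ? object[:author] : nil
              end

  id = lookup.call(user_info)
  id ? [id, true] : [creator_id, false]
end

known = { 'alice' => 10 } # made-up login => GitLab ID mapping
lookup = ->(user) { user && known[user[:login]] }
issue = { author: { login: 'alice' }, actor: { login: 'bob' } }

author_id, author_found = resolve_author(issue, author_key: :author, lookup: lookup, creator_id: 1)
actor_id, actor_found = resolve_author(issue, author_key: :actor, lookup: lookup, creator_id: 1)

p [author_id, author_found] # prints [10, true]
p [actor_id, actor_found]   # prints [1, false]
```

Callers can destructure the pair and use the boolean to decide, for example, whether to note the original author in the imported text.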
#cached_id_for_github_email(email) ⇒ Object
    # File 'lib/gitlab/github_import/user_finder.rb', line 233
    def cached_id_for_github_email(email)
      read_id_from_cache(ID_FOR_EMAIL_CACHE_KEY % email)
    end
#cached_id_for_github_id(id) ⇒ Object
    # File 'lib/gitlab/github_import/user_finder.rb', line 229
    def cached_id_for_github_id(id)
      read_id_from_cache(ID_CACHE_KEY % id)
    end
#email_for_github_username(username) ⇒ String, Nil
    # File 'lib/gitlab/github_import/user_finder.rb', line 190
    def email_for_github_username(username)
      email = read_email_from_cache(username)

      if email.blank? && !email_fetched_for_project?(username)
        in_lock(lease_key(username), sleep_sec: 0.2.seconds, retries: 30) do |retried|
          # when retried, check the cache again as the other process that had the lease may have fetched the email
          if retried
            email = read_email_from_cache(username)

            # early return if the other process fetched a non-empty email. If the email is empty, we'll attempt to
            # fetch it again in the lines below, but using the ETAG cached by the other process which won't count to
            # the rate limit.
            next email if email.present?
          end

          # If an ETAG is available, make an API call with the ETAG.
          # Only make a rate-limited API call if the ETAG is not available and the email is nil.
          etag = read_etag_from_cache(username)
          email = fetch_email_from_github(username, etag: etag) || email

          cache_email!(username, email)
          cache_etag!(username) if email.blank? && etag.nil?

          # If a non-blank email is cached, we don't need the ETAG or project check caches.
          # Otherwise, indicate that the project has been checked.
          if email.present?
            clear_caches!(username)
          else
            set_project_as_checked!(username)
          end
        end
      end

      email.presence
    rescue ::Octokit::NotFound
      cache_email!(username, '')
      nil
    end
#fetch_source_name_from_github(username) ⇒ String
Retrieves the name of the user associated with a specified GitHub username.
To prevent multiple concurrent requests for the same user, an exclusive lock is used. The name is cached to avoid multiple calls to GitHub.
    # File 'lib/gitlab/github_import/user_finder.rb', line 124
    def fetch_source_name_from_github(username)
      in_lock(lease_key(username), sleep_sec: 0.2.seconds, retries: 30) do |retried|
        if retried
          source_name = read_source_name_from_cache(username)

          next source_name if source_name.present?
        end

        begin
          user = client.user(username)
          source_name = user.fetch(:name, username)
        rescue ::Octokit::NotFound => error
          log("GitHub user not found. #{error.message}", username: username)
          source_name = username
        end

        cache_source_name(username, source_name)

        source_name
      end
    end
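The lock-then-recheck-cache idea can be sketched in plain Ruby with a Mutex in place of the distributed lease. This is a simplified stand-in, not the real mechanism: `SourceNameCache` and its `fetcher` lambda are hypothetical, and unlike the method above (which re-reads the cache only when the lock attempt was retried) the sketch checks the cache on every call.

```ruby
# Sketch: serialize lookups for a username and cache the result so the
# remote source is asked only once. A Mutex stands in for the exclusive lease.
class SourceNameCache
  attr_reader :fetches

  def initialize(fetcher)
    @fetcher = fetcher # hypothetical callable: username => name or nil
    @cache = {}
    @lock = Mutex.new
    @fetches = 0
  end

  def source_name(username)
    @lock.synchronize do
      # Another thread may have filled the cache while we waited for the lock.
      cached = @cache[username]
      next cached if cached

      @fetches += 1
      # Fall back to the username when no name is available.
      @cache[username] = @fetcher.call(username) || username
    end
  end
end

names = { 'alice' => 'Alice Example' } # made-up data
cache = SourceNameCache.new(->(login) { names[login] })

threads = 5.times.map { Thread.new { cache.source_name('alice') } }
results = threads.map(&:value)

puts results.uniq.inspect # prints ["Alice Example"]
puts cache.fetches        # prints 1
```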
#find(id, username) ⇒ Object
Returns the GitLab ID for the given GitHub ID or username.
id - The ID of the GitHub user.
username - The username of the GitHub user.
    # File 'lib/gitlab/github_import/user_finder.rb', line 151
    def find(id, username)
      email = email_for_github_username(username)
      cached, found_id = find_from_cache(id, email)

      return found_id if found_id

      # We only want to query the database if necessary. If previous lookups
      # didn't yield a user ID we won't query the database again until the
      # keys expire.
      find_id_from_database(id, email) unless cached
    end
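The method relies on destructuring the `[cached, found_id]` pair, including the one-element `[false]` case where `found_id` becomes nil. The sketch below spells out the three outcomes. `next_step` and its symbols are hypothetical and only label the branches taken in the listing above.

```ruby
# Hypothetical helper that names the branch `find` would take for a given
# [cached, found_id] pair returned by the cache lookup.
def next_step(pair)
  cached, found_id = pair # a missing second element destructures to nil

  return :use_cached_id if found_id

  cached ? :skip_database : :query_database
end

p next_step([true, 42])  # prints :use_cached_id
p next_step([true, nil]) # prints :skip_database
p next_step([false])     # prints :query_database
```

The middle case is the cached miss: the key exists but holds no ID, so the database is left alone until the key expires.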
#find_from_cache(id, email = nil) ⇒ Object
Finds a user ID from the cache for a given GitHub ID or Email.
    # File 'lib/gitlab/github_import/user_finder.rb', line 164
    def find_from_cache(id, email = nil)
      id_exists, id_for_github_id = cached_id_for_github_id(id)

      return [id_exists, id_for_github_id] if id_for_github_id

      # Just in case no Email address could be retrieved (for whatever reason)
      return [false] unless email

      cached_id_for_github_email(email)
    end
#find_id_from_database(id, email) ⇒ Object
Finds a GitLab user ID from the database for a given GitHub user ID or Email.
    # File 'lib/gitlab/github_import/user_finder.rb', line 177
    def find_id_from_database(id, email)
      id_for_github_id(id) || id_for_github_email(email)
    end
#id_for_github_email(email) ⇒ Object
Queries and caches the GitLab user ID for a GitHub email, if one was found.
    # File 'lib/gitlab/github_import/user_finder.rb', line 255
    def id_for_github_email(email)
      gitlab_id = query_id_for_github_email(email) || nil

      Gitlab::Cache::Import::Caching.write(ID_FOR_EMAIL_CACHE_KEY % email, gitlab_id)
    end
#id_for_github_id(id) ⇒ Object
If importing from github.com, queries and caches the GitLab user ID for a GitHub user ID, if one was found.
When importing from GitHub Enterprise, do not query the user by GitHub ID, since we only have users' GitHub IDs from github.com.
    # File 'lib/gitlab/github_import/user_finder.rb', line 242
    def id_for_github_id(id)
      gitlab_id =
        if project.github_enterprise_import?
          nil
        else
          query_id_for_github_id(id)
        end

      Gitlab::Cache::Import::Caching.write(ID_CACHE_KEY % id, gitlab_id)
    end
#query_id_for_github_email(email) ⇒ Object
    # File 'lib/gitlab/github_import/user_finder.rb', line 265
    def query_id_for_github_email(email)
      User.by_any_email(email).pick(:id)
    end
#query_id_for_github_id(id) ⇒ Object
    # File 'lib/gitlab/github_import/user_finder.rb', line 261
    def query_id_for_github_id(id)
      User.by_provider_and_extern_uid(:github, id).select(:id).first&.id
    end
#read_id_from_cache(key) ⇒ Object
Reads an ID from the cache.
The return value is an Array with two values:
- A boolean indicating if the key was present or not.
- The ID as an Integer, or nil in case no ID could be found.
    # File 'lib/gitlab/github_import/user_finder.rb', line 275
    def read_id_from_cache(key)
      value = Gitlab::Cache::Import::Caching.read(key)
      exists = !value.nil?
      number = value.to_i

      # The cache key may be empty to indicate a previously looked up user for
      # which we couldn't find an ID.
      [exists, number > 0 ? number : nil]
    end
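The conversion from a raw cache value to the `[exists, id]` pair can be tried in isolation. In this sketch `parse_cached_id` is a hypothetical extraction of the last three lines of the method, and the assumption that the cache returns nil for a missing key and a String otherwise is inferred from the listing rather than stated on this page.

```ruby
# Hypothetical extraction of the value handling in read_id_from_cache.
# Assumes the cache yields nil for a missing key and a String otherwise.
def parse_cached_id(value)
  exists = !value.nil?
  number = value.to_i # nil.to_i and ''.to_i are both 0

  [exists, number > 0 ? number : nil]
end

p parse_cached_id(nil)  # prints [false, nil]  (key absent)
p parse_cached_id('')   # prints [true, nil]   (cached "no user found")
p parse_cached_id('42') # prints [true, 42]    (cached GitLab user ID)
```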
#source_user(user) ⇒ Object
Returns the GitLab user ID from placeholder or reassigned_to user.
    # File 'lib/gitlab/github_import/user_finder.rb', line 97
    def source_user(user)
      source_user = source_user_mapper.find_source_user(user[:id])

      return source_user if source_user

      source_user_mapper.find_or_create_source_user(
        source_name: fetch_source_name_from_github(user[:login]),
        source_username: user[:login],
        source_user_identifier: user[:id]
      )
    end
#source_user_accepted?(user) ⇒ Boolean
Returns true if the GitLab user has accepted their reassignment status or if UCM is not enabled.
    # File 'lib/gitlab/github_import/user_finder.rb', line 110
    def source_user_accepted?(user)
      return true unless user_mapping_enabled?
      return true if map_to_personal_namespace_owner?

      source_user(user).accepted_status?
    end
#user_id_for(user, ghost: true) ⇒ Integer, NilClass
Returns the GitLab user ID for a GitHub user. Can return nil if ghost is false.
The ghost: false argument is used to avoid assigning ghost users as assignees or reviewers.
    # File 'lib/gitlab/github_import/user_finder.rb', line 83
    def user_id_for(user, ghost: true)
      # user[:login] == 'ghost' here refers to the Github username
      if user.nil? || user[:login].nil? || user[:login] == 'ghost'
        return ghost ? GithubImport.ghost_user_id(project.organization_id) : nil
      end

      return find(user[:id], user[:login]) unless user_mapping_enabled?
      return project.root_ancestor.owner_id if map_to_personal_namespace_owner?

      source_user(user).mapped_user_id
    end
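The effect of the `ghost:` flag can be illustrated without the rest of the importer. In the sketch below, `resolve_user_id`, the `ghost_id:` parameter, and the ID 999 are hypothetical; the block stands in for whichever lookup path the real method takes for a regular user.

```ruby
# Hypothetical stand-in for the guard at the top of user_id_for.
# Missing users and GitHub's 'ghost' login map to the ghost ID or to nil;
# every other user is handed to the block, which represents the real lookup.
def resolve_user_id(user, ghost_id:, ghost: true)
  if user.nil? || user[:login].nil? || user[:login] == 'ghost'
    return ghost ? ghost_id : nil
  end

  yield user
end

lookup = ->(user) { user[:id] + 100 } # made-up mapping for the example

author = resolve_user_id(nil, ghost_id: 999, &lookup)
assignee = resolve_user_id({ id: 3, login: 'ghost' }, ghost_id: 999, ghost: false, &lookup)
regular = resolve_user_id({ id: 5, login: 'alice' }, ghost_id: 999, &lookup)

p author   # prints 999
p assignee # prints nil
p regular  # prints 105
```

Passing `ghost: false` is what lets a caller skip an assignee or reviewer entirely instead of assigning the ghost user.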