Module: Gitlab::PathRegex

Extended by:
PathRegex
Included in:
PathRegex
Defined in:
lib/gitlab/path_regex.rb

Constant Summary collapse

TOP_LEVEL_ROUTES =

All routes that appear on the top level must be listed here. This will make sure that groups cannot be created with these names as these routes would be masked by the paths already in place.

Example:

 /api/api-project

the path `api` shouldn't be allowed because it would be masked by `api/*`
%w[
  -
  .well-known
  404.html
  422.html
  500.html
  502.html
  503.html
  abuse_reports
  admin
  api
  apple-touch-icon-precomposed.png
  apple-touch-icon.png
  assets
  autocomplete
  dashboard
  deploy.html
  explore
  favicon.ico
  favicon.png
  files
  groups
  health_check
  help
  import
  invites
  jwt
  login
  oauth
  profile
  projects
  public
  robots.txt
  s
  search
  sent_notifications
  sitemap
  sitemap.xml
  sitemap.xml.gz
  slash-command-logo.png
  snippets
  unsubscribes
  uploads
  users
  v2
].freeze
PROJECT_WILDCARD_ROUTES =

NOTE: Do not add new items to this list unless necessary as this will cause conflicts with existing namespaced routes for groups or projects. See docs.gitlab.com/ee/development/routing.html#project-routes

This list should contain all words following `/*namespace_id/:project_id` in routes that contain a second wildcard.

Example:

/*namespace_id/:project_id/badges/*ref/build

If `badges` was allowed as a project/group name, we would not be able to access the `badges` route for those projects:

Consider a namespace with path `foo/bar` and a project called `badges`. The route to the build badge would then be `/foo/bar/badges/badges/master/build.svg`

When accessing this path the route would be matched to the `badges` path with the following params:

- namespace_id: `foo`
- project_id: `bar`
- ref: `badges/master`

Failing to find the project, this would result in a 404.

By rejecting `badges` the router can count on the fact that `badges` will be preceded by the `namespace/project`.

%w[
  -
  badges
  blame
  blob
  builds
  commits
  create
  create_dir
  edit
  environments/folders
  files
  find_file
  gitlab-lfs/objects
  info/lfs/objects
  new
  preview
  raw
  refs
  tree
  update
  wikis
].freeze
GROUP_ROUTES =

NOTE: Do not add new items to this list unless necessary as this will cause conflicts with existing namespaced routes for groups or projects. See docs.gitlab.com/ee/development/routing.html#group-routes

These are all the paths that follow `/groups/*id/ or `/groups/*group_id` We need to reject these because we have a `/groups/*id` page that is the same as the `/*id`.

If we would allow a subgroup to be created with the name `activity` then this group would not be accessible through `/groups/parent/activity` since this would map to the activity-page of its parent.

%w[
  -
].freeze
ILLEGAL_PROJECT_PATH_WORDS =
PROJECT_WILDCARD_ROUTES
ILLEGAL_GROUP_PATH_WORDS =
(PROJECT_WILDCARD_ROUTES | GROUP_ROUTES).freeze
PATH_START_CHAR =

The namespace regex is used in JavaScript to validate usernames in the “Register” form. However, Javascript does not support the negative lookbehind assertion (?<!) that disallows usernames ending in `.git` and `.atom`. Since this is a non-trivial problem to solve in Javascript (heavily complicate the regex, modify view code to allow non-regex validations, etc), `NAMESPACE_FORMAT_REGEX_JS` serves as a Javascript-compatible version of `NAMESPACE_FORMAT_REGEX`, with the negative lookbehind assertion removed. This means that the client-side validation will pass for usernames ending in `.atom` and `.git`, but will be caught by the server-side validation.

'[a-zA-Z0-9_\.]'
PATH_REGEX_STR =
PATH_START_CHAR + '[a-zA-Z0-9_\-\.]*'
NAMESPACE_FORMAT_REGEX_JS =
PATH_REGEX_STR + '[a-zA-Z0-9_\-]|[a-zA-Z0-9_]'
NO_SUFFIX_REGEX =
/(?<!\.git|\.atom)/.freeze
NAMESPACE_FORMAT_REGEX =
/(?:#{NAMESPACE_FORMAT_REGEX_JS})#{NO_SUFFIX_REGEX}/.freeze
PROJECT_PATH_FORMAT_REGEX =
/(?:#{PATH_REGEX_STR})#{NO_SUFFIX_REGEX}/.freeze
FULL_NAMESPACE_FORMAT_REGEX =
%r{(#{NAMESPACE_FORMAT_REGEX}/){,#{Namespace::NUMBER_OF_ANCESTORS_ALLOWED}}#{NAMESPACE_FORMAT_REGEX}}.freeze

Instance Method Summary collapse

Instance Method Details

#archive_formats_regexObject


225
226
227
228
# File 'lib/gitlab/path_regex.rb', line 225

def archive_formats_regex
  #                           |zip|tar|    tar.gz    |         tar.bz2         |
  @archive_formats_regex ||= /(zip|tar|tar\.gz|tgz|gz|tar\.bz2|tbz|tbz2|tb2|bz2)/.freeze
end

#container_image_blob_sha_regexObject


261
262
263
# File 'lib/gitlab/path_regex.rb', line 261

def container_image_blob_sha_regex
  @container_image_blob_sha_regex ||= %r{[\w+.-]+:?\w+}.freeze
end

#container_image_regexObject


257
258
259
# File 'lib/gitlab/path_regex.rb', line 257

def container_image_regex
  @container_image_regex ||= %r{([\w\.-]+\/){0,1}[\w\.-]+}.freeze
end

#full_namespace_path_regexObject


195
196
197
# File 'lib/gitlab/path_regex.rb', line 195

def full_namespace_path_regex
  @full_namespace_path_regex ||= %r{\A#{full_namespace_route_regex}/\z}
end

#full_namespace_route_regexObject


157
158
159
160
161
162
163
164
165
166
167
168
169
170
# File 'lib/gitlab/path_regex.rb', line 157

def full_namespace_route_regex
  @full_namespace_route_regex ||= begin
    illegal_words = Regexp.new(Regexp.union(ILLEGAL_GROUP_PATH_WORDS).source, Regexp::IGNORECASE)

    single_line_regexp %r{
      #{root_namespace_route_regex}
      (?:
        /
        (?!#{illegal_words}/)
        #{NAMESPACE_FORMAT_REGEX}
      )*
    }x
  end
end

#full_project_git_path_regexObject


203
204
205
# File 'lib/gitlab/path_regex.rb', line 203

def full_project_git_path_regex
  @full_project_git_path_regex ||= %r{\A\/?(?<namespace_path>#{full_namespace_route_regex})\/(?<project_path>#{project_route_regex})\.git\z}
end

#full_project_path_regexObject


199
200
201
# File 'lib/gitlab/path_regex.rb', line 199

def full_project_path_regex
  @full_project_path_regex ||= %r{\A#{full_namespace_route_regex}/#{project_route_regex}/\z}
end

#full_snippets_repository_path_regexObject


253
254
255
# File 'lib/gitlab/path_regex.rb', line 253

def full_snippets_repository_path_regex
  %r{\A(#{personal_snippet_repository_path_regex}|#{project_snippet_repository_path_regex})\z}
end

#git_reference_regexObject


230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
# File 'lib/gitlab/path_regex.rb', line 230

def git_reference_regex
  # Valid git ref regex, see:
  # https://www.kernel.org/pub/software/scm/git/docs/git-check-ref-format.html

  @git_reference_regex ||= single_line_regexp %r{
    (?!
       (?# doesn't begins with)
       \/|                    (?# rule #6)
       (?# doesn't contain)
       .*(?:
          [\/.]\.|            (?# rule #1,3)
          \/\/|               (?# rule #6)
          @\{|                (?# rule #8)
          \\                  (?# rule #9)
       )
    )
    [^\000-\040\177~^:?*\[]+  (?# rule #4-5)
    (?# doesn't end with)
    (?<!\.lock)               (?# rule #1)
    (?<![\/.])                (?# rule #6-7)
  }x
end

#namespace_format_messageObject


211
212
213
214
# File 'lib/gitlab/path_regex.rb', line 211

def namespace_format_message
  "can contain only letters, digits, '_', '-' and '.'. " \
  "Cannot start with '-' or end in '.', '.git' or '.atom'." \
end

#namespace_format_regexObject


207
208
209
# File 'lib/gitlab/path_regex.rb', line 207

def namespace_format_regex
  @namespace_format_regex ||= /\A#{NAMESPACE_FORMAT_REGEX}\z/.freeze
end

#project_path_format_messageObject


220
221
222
223
# File 'lib/gitlab/path_regex.rb', line 220

def project_path_format_message
  "can contain only letters, digits, '_', '-' and '.'. " \
  "Cannot start with '-', end in '.git' or end in '.atom'" \
end

#project_path_format_regexObject


216
217
218
# File 'lib/gitlab/path_regex.rb', line 216

def project_path_format_regex
  @project_path_format_regex ||= /\A#{PROJECT_PATH_FORMAT_REGEX}\z/.freeze
end

#project_route_regexObject


172
173
174
175
176
177
178
179
180
181
# File 'lib/gitlab/path_regex.rb', line 172

def project_route_regex
  @project_route_regex ||= begin
    illegal_words = Regexp.new(Regexp.union(ILLEGAL_PROJECT_PATH_WORDS).source, Regexp::IGNORECASE)

    single_line_regexp %r{
      (?!(#{illegal_words})/)
      #{PROJECT_PATH_FORMAT_REGEX}
    }x
  end
end

#repository_git_route_regexObject


187
188
189
# File 'lib/gitlab/path_regex.rb', line 187

def repository_git_route_regex
  @repository_git_route_regex ||= /#{repository_route_regex}\.git/.freeze
end

#repository_route_regexObject


183
184
185
# File 'lib/gitlab/path_regex.rb', line 183

def repository_route_regex
  @repository_route_regex ||= /#{full_namespace_route_regex}|#{personal_snippet_repository_path_regex}/.freeze
end

#repository_wiki_git_route_regexObject


191
192
193
# File 'lib/gitlab/path_regex.rb', line 191

def repository_wiki_git_route_regex
  @repository_wiki_git_route_regex ||= /#{full_namespace_route_regex}\.wiki\.git/.freeze
end

#root_namespace_route_regexObject


146
147
148
149
150
151
152
153
154
155
# File 'lib/gitlab/path_regex.rb', line 146

def root_namespace_route_regex
  @root_namespace_route_regex ||= begin
    illegal_words = Regexp.new(Regexp.union(TOP_LEVEL_ROUTES).source, Regexp::IGNORECASE)

    single_line_regexp %r{
      (?!(#{illegal_words})/)
      #{NAMESPACE_FORMAT_REGEX}
    }x
  end
end