Class: Linguist::Repository

Inherits:
Object
Defined in:
lib/linguist/repository.rb

Overview

A Repository is an abstraction of a Grit::Repo or a basic file system tree. It holds a list of paths pointing to Blobish objects.

Its primary purpose is to gather language statistics across the entire project.

Constant Summary

MAX_TREE_SIZE = 100_000


Constructor Details

#initialize(repo, commit_oid, max_tree_size = MAX_TREE_SIZE) ⇒ Repository

Public: Initialize a new Repository to be analyzed for language data

repo - a Linguist::Source::Repository object; any other object is wrapped in a Linguist::Source::RuggedRepository for backward compatibility
commit_oid - the sha1 of the commit that will be analyzed; this is usually the master branch
max_tree_size - the maximum tree size to consider for analysis (default: MAX_TREE_SIZE)

Returns a Repository

Raises:

  • (TypeError) if commit_oid is not a String


33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
# File 'lib/linguist/repository.rb', line 33

def initialize(repo, commit_oid, max_tree_size = MAX_TREE_SIZE)
  @repository = if repo.is_a? Linguist::Source::Repository
    repo
  else
    # Allow this for backward-compatibility purposes
    Linguist::Source::RuggedRepository.new(repo)
  end
  @commit_oid = commit_oid
  @max_tree_size = max_tree_size

  @old_commit_oid = nil
  @old_stats = nil

  raise TypeError, 'commit_oid must be a commit SHA1' unless commit_oid.is_a?(String)
end

Instance Attribute Details

#repository ⇒ Object (readonly)

Returns the value of attribute repository.



# File 'lib/linguist/repository.rb', line 12

def repository
  @repository
end

Class Method Details

.incremental(repo, commit_oid, old_commit_oid, old_stats, max_tree_size = MAX_TREE_SIZE) ⇒ Object

Public: Create a new Repository based on the stats of an existing one

Returns a Repository with old_commit_oid and old_stats already loaded



# File 'lib/linguist/repository.rb', line 18

def self.incremental(repo, commit_oid, old_commit_oid, old_stats, max_tree_size = MAX_TREE_SIZE)
  repo = self.new(repo, commit_oid, max_tree_size)
  repo.load_existing_stats(old_commit_oid, old_stats)
  repo
end

Instance Method Details

#breakdown_by_file ⇒ Object

Public: Return the language breakdown of this repository by file

Returns a map of language names => [filenames…]



# File 'lib/linguist/repository.rb', line 105

def breakdown_by_file
  @file_breakdown ||= begin
    breakdown = Hash.new { |h,k| h[k] = Array.new }
    cache.each do |filename, (language, _)|
      breakdown[language] << filename.dup.force_encoding("UTF-8").scrub
    end
    breakdown
  end
end
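
The grouping step can be tried without a real repository. The following is a minimal sketch, assuming a hand-written hash in the filename => [language, size] shape that #cache is documented to return; the file names, sizes, and the standalone helper are made up for illustration and are not part of Linguist.

```ruby
# Stand-in for Repository#cache: filename => [language, size].
# These entries are invented sample data.
sample_cache = {
  "lib/app.rb"   => ["Ruby", 1200],
  "lib/util.rb"  => ["Ruby", 300],
  "web/index.js" => ["JavaScript", 258]
}

# Hypothetical standalone helper that mirrors the loop in breakdown_by_file.
def breakdown_by_file(cache)
  # Unknown languages start with an empty list that is stored in the hash.
  breakdown = Hash.new { |h, k| h[k] = [] }
  cache.each do |filename, (language, _size)|
    # String keys of a Hash are frozen and force_encoding mutates its
    # receiver, so work on a copy; scrub replaces invalid byte sequences.
    breakdown[language] << filename.dup.force_encoding("UTF-8").scrub
  end
  breakdown
end

result = breakdown_by_file(sample_cache)
```

The size element of each cache entry is ignored here; only the language and the filename matter for this view.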

#cache ⇒ Object

Public: Return the cached results of the analysis

This is a per-file breakdown that can be passed to other instances of Linguist::Repository to perform incremental scans

Returns a map of filename => [language, size]



# File 'lib/linguist/repository.rb', line 121

def cache
  @cache ||= begin
    if @old_commit_oid == @commit_oid
      @old_stats
    else
      compute_stats(@old_commit_oid, @old_stats)
    end
  end
end
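
The reuse-or-recompute decision above can be sketched in isolation. This is an illustration only: cached_stats and the recompute lambda are hypothetical stand-ins, the lambda plays the role of the compute_stats call (whose implementation is not shown on this page), and the commit ids and stats are invented.

```ruby
# Hypothetical stand-in for the branch inside Repository#cache.
def cached_stats(commit_oid, old_commit_oid, old_stats, recompute)
  if old_commit_oid == commit_oid
    old_stats                                  # same commit: reuse as-is
  else
    recompute.call(old_commit_oid, old_stats)  # different commit: rescan
  end
end

calls = 0
# Plays the role of compute_stats; returns a made-up fresh result.
recompute = lambda do |_old_oid, _old_stats|
  calls += 1
  { "lib/app.rb" => ["Ruby", 1500] }
end

old_stats = { "lib/app.rb" => ["Ruby", 1200] }

same  = cached_stats("abc123", "abc123", old_stats, recompute)
fresh = cached_stats("def456", "abc123", old_stats, recompute)
```

In this sketch the recompute step runs only for the second call, where the analyzed commit differs from the previously analyzed one.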

#current_tree ⇒ Object

Raises:

  • (NotImplementedError) unless the underlying repository is a Linguist::Source::RuggedRepository


# File 'lib/linguist/repository.rb', line 136

def current_tree
  raise NotImplementedError, "current_tree is deprecated" unless repository.is_a? Linguist::Source::RuggedRepository
  repository.get_tree(@commit_oid)
end

#language ⇒ Object

Public: Get primary Language of repository.

Returns a language name



# File 'lib/linguist/repository.rb', line 88

def language
  @language ||= begin
    primary = languages.max_by { |(_, size)| size }
    primary && primary[0]
  end
end
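
A small sketch of the selection logic, assuming a hash in the shape that #languages returns; primary_language is a hypothetical standalone helper and the numbers are sample values.

```ruby
# Sample language totals in the shape returned by #languages.
languages = { "Ruby" => 46_319, "JavaScript" => 258 }

# Hypothetical helper mirroring the body of Repository#language.
def primary_language(languages)
  # max_by yields [name, size] pairs and returns the winning pair,
  # or nil when the hash is empty.
  primary = languages.max_by { |(_, size)| size }
  # The && guard turns "no languages" into nil instead of an error.
  primary && primary[0]
end

top  = primary_language(languages)
none = primary_language({})
```

The guard matters for repositories where no file is attributed to a language: the result is nil in that case rather than a NoMethodError.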

#languages ⇒ Object

Public: Returns a breakdown of language stats.

Examples

# => { 'Ruby' => 46319,
       'JavaScript' => 258 }

Returns a Hash of language names and Integer size values.



# File 'lib/linguist/repository.rb', line 75

def languages
  @sizes ||= begin
    sizes = Hash.new { 0 }
    cache.each do |_, (language, size)|
      sizes[language] += size
    end
    sizes
  end
end
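
To see how the per-file cache collapses into per-language totals, here is a minimal sketch with invented cache entries; language_sizes is a hypothetical standalone helper, not a Linguist method.

```ruby
# Stand-in for Repository#cache: filename => [language, size].
sample_cache = {
  "lib/app.rb"   => ["Ruby", 1200],
  "lib/util.rb"  => ["Ruby", 300],
  "web/index.js" => ["JavaScript", 258]
}

# Hypothetical helper mirroring the loop in Repository#languages.
def language_sizes(cache)
  # The default block returns 0 without storing it; the += below is
  # what actually creates the key.
  sizes = Hash.new { 0 }
  cache.each do |_filename, (language, size)|
    sizes[language] += size
  end
  sizes
end

sizes = language_sizes(sample_cache)
```

Because the default block does not store its value, merely reading a missing language returns 0 without adding a key to the result.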

#load_existing_stats(old_commit_oid, old_stats) ⇒ Object

Public: Load the results of a previous analysis on this repository to speed up the new scan.

The new analysis is performed incrementally, so that it only takes into account the files changed since the last time the repository was scanned.

old_commit_oid - the sha1 of the commit that was previously analyzed
old_stats - the result of the previous analysis, obtained by calling Repository#cache on the old repository

Returns nothing



# File 'lib/linguist/repository.rb', line 61

def load_existing_stats(old_commit_oid, old_stats)
  @old_commit_oid = old_commit_oid
  @old_stats = old_stats
  nil
end

#read_index ⇒ Object

Raises:

  • (NotImplementedError) unless the underlying repository is a Linguist::Source::RuggedRepository


# File 'lib/linguist/repository.rb', line 131

def read_index
  raise NotImplementedError, "read_index is deprecated" unless repository.is_a? Linguist::Source::RuggedRepository
  repository.set_attribute_source(@commit_oid)
end

#size ⇒ Object

Public: Get the total size of the repository.

Returns a byte size Integer



# File 'lib/linguist/repository.rb', line 98

def size
  @size ||= languages.inject(0) { |s,(_,v)| s + v }
end
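
As a quick check of the fold above, the following sketch sums a sample hash in the shape returned by #languages; the values are illustrative only.

```ruby
# Sample language totals in the shape returned by #languages.
languages = { "Ruby" => 46_319, "JavaScript" => 258 }

# Same fold as in Repository#size: each element is a [name, bytes] pair,
# and only the byte count is added to the running sum.
total = languages.inject(0) { |sum, (_, bytes)| sum + bytes }

# For a plain hash like this one, summing the values gives the same number.
same_total = languages.values.sum
```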