Class: DICOM::Anonymizer

Inherits:
Object
  • Object
Includes:
Logging
Defined in:
lib/dicom/anonymizer.rb,
lib/dicom/deprecated.rb

Overview

Note:

For a thorough introduction to the concept of DICOM anonymization, please refer to The DICOM Standard, Part 15: Security and System Management Profiles, Annex E: Attribute Confidentiality Profiles. For guidance on settings for individual data elements, please refer to DICOM PS 3.15, Annex E, Table E.1-1: Application Level Confidentiality Profile Attributes.

This is a convenience class for handling the anonymization (de-identification) of DICOM files.

Instance Attribute Summary

Instance Method Summary

Methods included from Logging

included, #logger

Constructor Details

#initialize(options = {}) ⇒ Anonymizer

Note:

To customize logging behaviour, refer to the Logging module documentation.

Creates an Anonymizer instance.

Examples:

Create an Anonymizer instance and increase the log output

a = Anonymizer.new
a.logger.level = Logger::INFO

Perform anonymization using the audit trail feature

a = Anonymizer.new(:audit_trail => 'trail.json')
a.enumeration = true
a.write_path = '//anonymized/'
a.anonymize('//dicom/today/')

Parameters:

  • options (Hash) (defaults to: {})

    the options to create an anonymizer instance with

Options Hash (options):

  • :audit_trail (String)

    the path to an audit trail file (if the file already contains audit data, it is loaded and reused in the current anonymization)

  • :blank (Boolean)

    toggles whether to set the values of anonymized elements as empty instead of some generic value

  • :delete_private (Boolean)

    toggles whether private elements are to be deleted

  • :encryption (TrueClass, Digest::Class)

    if set as true, the default hash function (MD5) will be used for representing DICOM values in an audit file. Otherwise a Digest class can be given, e.g. Digest::SHA256

  • :enumeration (Boolean)

    toggles whether (some) elements get enumerated values (to enable post-anonymization re-identification)

  • :logger_level (Fixnum)

    the logger level which is applied to DObject operations during anonymization (defaults to Logger::FATAL)

  • :random_file_name (Boolean)

    toggles whether anonymized files will be given random file names when rewritten (in combination with the :write_path option)

  • :recursive (Boolean)

    toggles whether to anonymize on all sub-levels of the DICOM object tag hierarchies

  • :uid (Boolean)

    toggles whether UIDs will be replaced with custom generated UIDs (beware that to preserve UID relations in studies/series, the audit_trail feature must be used)

  • :uid_root (String)

    an organization (or custom) UID root to use when replacing UIDs

  • :write_path (String)

    a directory where the anonymized files are re-written (if not specified, files are overwritten)



# File 'lib/dicom/anonymizer.rb', line 68

def initialize(options={})
  # Transfer options to attributes:
  @blank = options[:blank]
  @delete_private = options[:delete_private]
  @enumeration = options[:enumeration]
  @logger_level = options[:logger_level] || Logger::FATAL
  @random_file_name = options[:random_file_name]
  @recursive = options[:recursive]
  @uid = options[:uid]
  @uid_root = options[:uid_root] ? options[:uid_root] : UID_ROOT
  @write_path = options[:write_path]
  # Array of folders to be processed for anonymization:
  @folders = Array.new
  # Folders that will be skipped:
  @exceptions = Array.new
  # Data elements which will be anonymized (the array will hold a list of tag strings):
  @tags = Array.new
  # Default values to use on anonymized data elements:
  @values = Array.new
  # Which data elements will have enumeration applied, if requested by the user:
  @enumerations = Array.new
  # We use a Hash to store information from DICOM files if enumeration is desired:
  @enum_old_hash = Hash.new
  @enum_new_hash = Hash.new
  # All the files to be anonymized will be put in this array:
  @files = Array.new
  @prefixes = Hash.new
  # Setup audit trail if requested:
  if options[:audit_trail]
    @audit_trail_file = options[:audit_trail]
    if File.exist?(@audit_trail_file) && File.size(@audit_trail_file) > 2
      # Load the pre-existing audit trail from file:
      @audit_trail = AuditTrail.read(@audit_trail_file)
    else
      # Start from scratch with an empty audit trail:
      @audit_trail = AuditTrail.new
    end
    # Set up encryption if indicated:
    if options[:encryption]
      require 'digest'
      if options[:encryption].respond_to?(:hexdigest)
        @encryption = options[:encryption]
      else
        @encryption = Digest::MD5
      end
    end
  end
  # Set the default data elements to be anonymized:
  set_defaults
end
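
The :encryption handling above can be tried in isolation. What follows is a minimal standalone sketch, not library code: pick_digest is a hypothetical helper that mirrors the constructor's check, where any object responding to hexdigest is used as given and any other truthy value falls back to Digest::MD5. In the constructor itself this branch only runs when :audit_trail is also given.

```ruby
require 'digest'

# Hypothetical helper (not part of ruby-dicom) mirroring the constructor's
# :encryption handling.
def pick_digest(option)
  return nil unless option
  option.respond_to?(:hexdigest) ? option : Digest::MD5
end

md5    = pick_digest(true)            # falls back to Digest::MD5
sha256 = pick_digest(Digest::SHA256)  # used as given

# Digest classes expose hexdigest as a class method:
puts md5.hexdigest('Doe^John')
puts sha256.hexdigest('Doe^John')
```

Passing a Digest class rather than true is therefore the only way to choose a hash function other than MD5.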

Instance Attribute Details

#audit_trail ⇒ Object (readonly)

An AuditTrail instance used for this anonymization (if specified).



# File 'lib/dicom/anonymizer.rb', line 18

def audit_trail
  @audit_trail
end

#audit_trail_file ⇒ Object (readonly)

The file name used for the AuditTrail serialization (if specified).



# File 'lib/dicom/anonymizer.rb', line 20

def audit_trail_file
  @audit_trail_file
end

#blank ⇒ Object

A boolean that, if set as true, will cause all anonymized tags to be blank instead of getting some generic value.



# File 'lib/dicom/anonymizer.rb', line 22

def blank
  @blank
end

#delete ⇒ Object (readonly)

A hash of elements (represented by tag keys) that will be deleted from the DICOM objects on anonymization.



# File 'lib/dicom/anonymizer.rb', line 24

def delete
  @delete
end

#delete_private ⇒ Object

A boolean that if set as true, will make the anonymization delete all private tags.



# File 'lib/dicom/anonymizer.rb', line 26

def delete_private
  @delete_private
end

#encryption ⇒ Object (readonly)

The cryptographic hash function to be used for encrypting DICOM values recorded in an audit trail file.



# File 'lib/dicom/anonymizer.rb', line 28

def encryption
  @encryption
end

#enumeration ⇒ Object

A boolean that, if set as true, will cause all anonymized tags to get enumerated values, to enable post-anonymization re-identification by the user.



# File 'lib/dicom/anonymizer.rb', line 30

def enumeration
  @enumeration
end

#logger_level ⇒ Object (readonly)

The logger level which is applied to DObject operations during anonymization (defaults to Logger::FATAL).



# File 'lib/dicom/anonymizer.rb', line 32

def logger_level
  @logger_level
end

#random_file_name ⇒ Object

A boolean that if set as true will cause all anonymized files to be written with random file names (if write_path has been specified).



# File 'lib/dicom/anonymizer.rb', line 34

def random_file_name
  @random_file_name
end

#recursive ⇒ Object

A boolean that if set as true, will cause the anonymization to run on all levels of the DICOM file tag hierarchy.



# File 'lib/dicom/anonymizer.rb', line 36

def recursive
  @recursive
end

#uid ⇒ Object

A boolean indicating whether or not UIDs shall be replaced when executing the anonymization.



# File 'lib/dicom/anonymizer.rb', line 38

def uid
  @uid
end

#uid_root ⇒ Object

The DICOM UID root to use when generating new UIDs.



# File 'lib/dicom/anonymizer.rb', line 40

def uid_root
  @uid_root
end

#write_path ⇒ Object

The path where the anonymized files will be saved. If this value is not set, the original DICOM files will be overwritten.



# File 'lib/dicom/anonymizer.rb', line 42

def write_path
  @write_path
end

Instance Method Details

#==(other) ⇒ Boolean Also known as: eql?

Checks for equality.

Other and self are considered equivalent if they are of compatible types and their attributes are equivalent.

Parameters:

  • other

    an object to be compared with self.

Returns:

  • (Boolean)

    true if self and other are considered equivalent



# File 'lib/dicom/anonymizer.rb', line 127

def ==(other)
  if other.respond_to?(:to_anonymizer)
    other.send(:state) == state
  end
end
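
The equality check above relies on a private state method that is not shown in this listing. The following standalone sketch uses a hypothetical Settings class (not part of ruby-dicom) and assumes that state is an array of the attributes that define equality. It shows how ==, eql? and hash fit together under that assumption.

```ruby
# Hypothetical stand-in class (not part of ruby-dicom) showing the same
# equality pattern: compare a private state array and derive hash from it.
class Settings
  def initialize(blank, tags)
    @blank = blank
    @tags = tags
  end

  def ==(other)
    if other.respond_to?(:to_settings)
      other.send(:state) == state
    end
  end
  alias_method :eql?, :==

  def hash
    state.hash
  end

  def to_settings
    self
  end

  private

  # Assumed shape: the attributes that define equality, collected in an array.
  def state
    [@blank, @tags]
  end
end

a = Settings.new(true, ['0010,0010'])
b = Settings.new(true, ['0010,0010'])

p(a == b)               # equal attributes compare as equal
p(a == 'text')          # incompatible type: the if falls through and yields nil
p({ a => 1 }.key?(b))   # eql? and hash agree, so b finds a's entry
```

With this shape, a comparison against an incompatible object yields nil rather than false, which is still falsy in a conditional.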

#add_exception(path) ⇒ Object

Deprecated.

Use Anonymizer#anonymize instead.

Adds an exception folder which will be avoided when anonymizing.

Examples:

Adding a folder

a.add_exception("/home/dicom/tutorials/")

Parameters:

  • path (String)

    a path that will be avoided

Raises:

  • (ArgumentError)


# File 'lib/dicom/deprecated.rb', line 12

def add_exception(path)
  # Deprecation warning:
  logger.warn("The '#add_exception' method of the Anonymization class has been deprecated! Please use the '#anonymize' method with a dataset argument instead.")
  raise ArgumentError, "Expected String, got #{path.class}." unless path.is_a?(String)
  if path
    # Remove last character if the path ends with a file separator:
    path.chop! if path[-1..-1] == File::SEPARATOR
    @exceptions << path
  end
end
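
The trailing-separator handling above uses chop!, which modifies the string it is called on. The following standalone sketch uses a hypothetical normalize helper (not part of ruby-dicom) to show that the caller's own string is changed in place, so a frozen string would raise an error at that point.

```ruby
# Hypothetical helper (not part of ruby-dicom) reproducing the
# trailing-separator handling in add_exception.
def normalize(path)
  path.chop! if path[-1..-1] == File::SEPARATOR
  path
end

dir = "/home/dicom/tutorials/".dup   # dup guarantees an unfrozen String
normalize(dir)
puts dir   # the caller's string no longer ends with the separator
```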

#add_folder(path) ⇒ Object

Deprecated.

Use Anonymizer#anonymize instead.

Adds a folder whose files will be anonymized.

Examples:

Adding a folder

a.add_folder("/home/dicom")

Parameters:

  • path (String)

    a path that will be included in the anonymization

Raises:

  • (ArgumentError)


# File 'lib/dicom/deprecated.rb', line 30

def add_folder(path)
  # Deprecation warning:
  logger.warn("The '#add_folder' method of the Anonymization class has been deprecated! Please use the '#anonymize' method with a dataset argument instead.")
  raise ArgumentError, "Expected String, got #{path.class}." unless path.is_a?(String)
  @folders << path
end

#anonymize(data) ⇒ Array<DObject>

Anonymizes the given DICOM data with the settings of this Anonymizer instance.

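Parameters:

  • data

    the DICOM data to anonymize, e.g. a directory path (as in the constructor example) or DICOM objects
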
Returns:

  • (Array<DObject>)

    an array of the anonymized DICOM objects



# File 'lib/dicom/anonymizer.rb', line 140

def anonymize(data)
  dicom = prepare(data)
  if @tags.length > 0
    dicom.each do |dcm|
      anonymize_dcm(dcm)
      # Write DICOM object to file unless it was passed to the anonymizer as an object:
      write(dcm) unless dcm.was_dcm_on_input
    end
  else
    logger.warn("No tags have been selected for anonymization. Aborting anonymization.")
  end
  # Reset the ruby-dicom log threshold to its original level:
  logger.level = @original_level
  # Save the audit trail (if used):
  @audit_trail.write(@audit_trail_file) if @audit_trail
  logger.info("Anonymization complete.")
  dicom
end
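
The method above restores logger.level from @original_level, which is set elsewhere (not shown in this listing). If an exception is raised during processing, that reset line is not reached. The sketch below shows the same save-and-restore idea using ensure, which is a different technique from the one in the source. It uses only the standard Logger class, and with_log_level is a hypothetical helper.

```ruby
require 'logger'
require 'stringio'

# Hypothetical helper (not part of ruby-dicom): run a block with a temporary
# log threshold and always restore the previous one, even on error.
def with_log_level(logger, level)
  original = logger.level
  logger.level = level
  yield
ensure
  logger.level = original
end

log = Logger.new(StringIO.new)
log.level = Logger::INFO

with_log_level(log, Logger::FATAL) do
  log.info('suppressed while the block runs')
end

puts log.level == Logger::INFO
```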

#delete_tag(tag) ⇒ Object

Specifies that the given tag is to be completely deleted from the anonymized DICOM objects.

Examples:

Completely delete the Patient’s Name tag from the DICOM files

a.delete_tag('0010,0010')

Parameters:

  • tag (String)

    a data element tag

Raises:

  • (ArgumentError)


# File 'lib/dicom/anonymizer.rb', line 166

def delete_tag(tag)
  raise ArgumentError, "Expected String, got #{tag.class}." unless tag.is_a?(String)
  raise ArgumentError, "Expected a valid tag of format 'GGGG,EEEE', got #{tag}." unless tag.tag?
  @delete[tag] = true
end
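
The two guard clauses above recur in every tag-based method of this class. The sketch below reproduces them in standalone form. The library's String#tag? is not shown in this listing, so valid_tag? is a hypothetical stand-in that assumes the 'GGGG,EEEE' format means four hexadecimal digits, a comma, and four more hexadecimal digits.

```ruby
# Hypothetical stand-in for the library's String#tag? check (the real
# implementation is not shown here): four hex digits, a comma, four hex digits.
def valid_tag?(tag)
  tag.is_a?(String) && tag.match?(/\A\h{4},\h{4}\z/)
end

# Hypothetical helper combining the two guard clauses.
def check_tag!(tag)
  raise ArgumentError, "Expected String, got #{tag.class}." unless tag.is_a?(String)
  raise ArgumentError, "Expected a valid tag of format 'GGGG,EEEE', got #{tag}." unless valid_tag?(tag)
  tag
end

check_tag!('0010,0010')    # passes and returns the tag

begin
  check_tag!('00100010')   # missing comma
rescue ArgumentError => e
  puts e.message
end
```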

#enum(tag) ⇒ Boolean, NilClass

Checks the enumeration status of this tag.

Parameters:

  • tag (String)

    a data element tag

Returns:

  • (Boolean, NilClass)

    the enumeration status of the tag, or nil if the tag has no match

Raises:

  • (ArgumentError)


# File 'lib/dicom/anonymizer.rb', line 177

def enum(tag)
  raise ArgumentError, "Expected String, got #{tag.class}." unless tag.is_a?(String)
  raise ArgumentError, "Expected a valid tag of format 'GGGG,EEEE', got #{tag}." unless tag.tag?
  pos = @tags.index(tag)
  if pos
    return @enumerations[pos]
  else
    logger.warn("The specified tag (#{tag}) was not found in the list of tags to be anonymized.")
    return nil
  end
end

#execute ⇒ Object

Deprecated.

Use Anonymizer#anonymize instead.

Executes the anonymization process.

This method is run when all settings have been finalized for the Anonymizer instance.



# File 'lib/dicom/deprecated.rb', line 43

def execute
  # Deprecation warning:
  logger.warn("The '#execute' method of the Anonymization class has been deprecated! Please use the '#anonymize' method instead.")
  # FIXME: This method has grown way too lengthy. It needs to be refactored one of these days.
  # Search through the folders to gather all the files to be anonymized:
  logger.info("Initiating anonymization process.")
  start_time = Time.now.to_f
  logger.info("Searching for files...")
  load_files
  logger.info("Done.")
  if @files.length > 0
    if @tags.length > 0
      logger.info(@files.length.to_s + " files have been identified in the specified folder(s).")
      if @write_path
        # Determine the write paths, as anonymized files will be written to a separate location:
        logger.info("Processing write paths...")
        process_write_paths
        logger.info("Done")
      else
        # Overwriting old files:
        logger.warn("Separate write folder not specified. Existing DICOM files will be overwritten.")
        @write_paths = @files
      end
      # If the user wants enumeration, we need to prepare variables for storing
      # existing information associated with each tag:
      create_enum_hash if @enumeration
      # Start the read/update/write process:
      logger.info("Initiating read/update/write process. This may take some time...")
      # Monitor whether every file read/write was successful:
      all_read = true
      all_write = true
      files_written = 0
      files_failed_read = 0
      begin
        require 'progressbar'
        pbar = ProgressBar.new("Anonymizing", @files.length)
      rescue LoadError
        pbar = nil
      end
      # Temporarily increase the log threshold to suppress messages from the DObject class:
      anonymizer_level = logger.level
      logger.level = Logger::FATAL
      @files.each_index do |i|
        pbar.inc if pbar
        # Read existing file to DICOM object:
        dcm = DObject.read(@files[i])
        if dcm.read?
          # Extract the data element parents to investigate for this DICOM object:
          parents = element_parents(dcm)
          parents.each do |parent|
            # Anonymize the desired tags:
            @tags.each_index do |j|
              if parent.exists?(@tags[j])
                element = parent[@tags[j]]
                if element.is_a?(Element)
                  if @blank
                    value = ""
                  elsif @enumeration
                    old_value = element.value
                    # Only launch enumeration logic if there is an actual value to the data element:
                    if old_value
                      value = enumerated_value(old_value, j)
                    else
                      value = ""
                    end
                  else
                    # Use the value that has been set for this tag:
                    value = @values[j]
                  end
                  element.value = value
                end
              end
            end
            # Delete elements marked for deletion:
            @delete.each_key do |tag|
              parent.delete(tag) if parent.exists?(tag)
            end
          end
          # General DICOM object manipulation:
          # Add a Patient Identity Removed attribute (as per
          # DICOM PS 3.15, Annex E, E.1.1 De-Identifier, point 6):
          dcm.add(Element.new('0012,0062', 'YES'))
          # Delete (and replace) the File Meta Information (as per
          # DICOM PS 3.15, Annex E, E.1.1 De-Identifier, point 7):
          dcm.delete_group('0002')
          # Handle UIDs if requested:
          replace_uids(parents) if @uid
          # Delete private tags?
          dcm.delete_private if @delete_private
          # Write DICOM file:
          dcm.write(@write_paths[i])
          if dcm.written?
            files_written += 1
          else
            all_write = false
          end
        else
          all_read = false
          files_failed_read += 1
        end
      end
      pbar.finish if pbar
      # Finished anonymizing files. Reset the log threshold:
      logger.level = anonymizer_level
      # Print elapsed time and status of anonymization:
      end_time = Time.now.to_f
      logger.info("Anonymization process completed!")
      if all_read
        logger.info("All files in the specified folder(s) were SUCCESSFULLY read to DICOM objects.")
      else
        logger.warn("Some files were NOT successfully read (#{files_failed_read} files). If some folder(s) contain non-DICOM files, this is expected.")
      end
      if all_write
        logger.info("All DICOM objects were SUCCESSFULLY written as DICOM files (#{files_written} files).")
      else
        logger.warn("Some DICOM objects were NOT successfully written to file. You are advised to investigate the result (#{files_written} files successfully written).")
      end
      @audit_trail.write(@audit_trail_file) if @audit_trail
      elapsed = (end_time-start_time).to_s
      logger.info("Elapsed time: #{elapsed[0..elapsed.index(".")+1]} seconds")
    else
      logger.warn("No tags were selected for anonymization. Aborting.")
    end
  else
    logger.warn("No files were found in specified folders. Aborting.")
  end
end
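
The inner loop above picks the replacement value with a fixed precedence: blank wins over enumeration, and enumeration wins over the value set for the tag. The standalone sketch below isolates that decision. replacement_for is a hypothetical helper, and the block passed to it stands in for the library's enumerated_value method, which is not shown in this listing. The numbering scheme in the example is likewise an assumption for illustration only.

```ruby
# Hypothetical helper (not part of ruby-dicom) reproducing the precedence in
# execute: blank wins over enumeration, which wins over the fixed value.
# The block stands in for the library's enumerated_value, which is not shown.
def replacement_for(old_value, fixed_value, blank: false, enumeration: false)
  if blank
    ''
  elsif enumeration
    # Only enumerate when the element actually holds a value:
    old_value ? yield(old_value) : ''
  else
    fixed_value
  end
end

# Assumed enumeration scheme: each distinct original value gets its own number.
numbered = Hash.new { |seen, value| seen[value] = "Patient#{seen.size + 1}" }

p replacement_for('Doe^John', 'Patient', blank: true, enumeration: true) { |v| numbered[v] }
p replacement_for('Doe^John', 'Patient', enumeration: true) { |v| numbered[v] }
p replacement_for(nil, 'Patient', enumeration: true) { |v| numbered[v] }
p replacement_for('Doe^John', 'Patient')
```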

#hash ⇒ Fixnum

Note:

Two objects with the same attributes will have the same hash code.

Computes a hash code for this object.

Returns:

  • (Fixnum)

    the object’s hash code



# File 'lib/dicom/anonymizer.rb', line 195

def hash
  state.hash
end

#print ⇒ Object

Deprecated.

Prints to screen a list of the tags currently selected for anonymization, along with the replacement values that will be used and their enumeration status.



# File 'lib/dicom/deprecated.rb', line 174

def print
  logger.warn("Anonymizer#print is deprecated.")
  # Extract the string lengths which are needed to make the formatting nice:
  names = Array.new
  types = Array.new
  tag_lengths = Array.new
  name_lengths = Array.new
  type_lengths = Array.new
  value_lengths = Array.new
  @tags.each_index do |i|
    name, vr = LIBRARY.name_and_vr(@tags[i])
    names << name
    types << vr
    tag_lengths[i] = @tags[i].length
    name_lengths[i] = names[i].length
    type_lengths[i] = types[i].length
    value_lengths[i] = @values[i].to_s.length unless @blank
    value_lengths[i] = '' if @blank
  end
  # To give the printed output a nice format we need to check the string lengths of some of these arrays:
  tag_maxL = tag_lengths.max
  name_maxL = name_lengths.max
  type_maxL = type_lengths.max
  value_maxL = value_lengths.max
  # Format string array for print output:
  lines = Array.new
  @tags.each_index do |i|
    # Configure empty spaces:
    s = ' '
    f1 = ' '*(tag_maxL-@tags[i].length+1)
    f2 = ' '*(name_maxL-names[i].length+1)
    f3 = ' '*(type_maxL-types[i].length+1)
    f4 = ' ' if @blank
    f4 = ' '*(value_maxL-@values[i].to_s.length+1) unless @blank
    if @enumeration
      enum = @enumerations[i]
    else
      enum = ''
    end
    if @blank
      value = ''
    else
      value = @values[i]
    end
    tag = @tags[i]
    lines << tag + f1 + names[i] + f2 + types[i] + f3 + value.to_s + f4 + enum.to_s
  end
  # Print to screen:
  lines.each do |line|
    puts line
  end
end

#remove_tag(tag) ⇒ Object

Removes a tag from the list of tags that will be anonymized.

Examples:

Do not anonymize the Patient’s Name tag

a.remove_tag('0010,0010')

Parameters:

  • tag (String)

    a data element tag

Raises:

  • (ArgumentError)


# File 'lib/dicom/anonymizer.rb', line 205

def remove_tag(tag)
  raise ArgumentError, "Expected String, got #{tag.class}." unless tag.is_a?(String)
  raise ArgumentError, "Expected a valid tag of format 'GGGG,EEEE', got #{tag}." unless tag.tag?
  pos = @tags.index(tag)
  if pos
    @tags.delete_at(pos)
    @values.delete_at(pos)
    @enumerations.delete_at(pos)
  end
end

#set_tag(tag, options = {}) ⇒ Object

Sets the anonymization settings for the specified tag. If the tag is already present in the list of tags to be anonymized, its settings are updated, and if not, a new tag entry is created.

Examples:

Set the anonymization settings of the Patient’s Name tag

a.set_tag('0010,0010', :value => 'MrAnonymous', :enum => true)

Parameters:

  • tag (String)

    a data element tag

  • options (Hash) (defaults to: {})

    the anonymization settings for the specified tag

Options Hash (options):

  • :value (String, Integer, Float)

    the replacement value to be used when anonymizing this data element. Defaults to the pre-existing value for tags already in the list, and to '' for new tags.

  • :enum (Boolean)

    specifies if enumeration is to be used for this tag. Defaults to the pre-existing value and false for new tags.

Raises:

  • (ArgumentError)


# File 'lib/dicom/anonymizer.rb', line 226

def set_tag(tag, options={})
  raise ArgumentError, "Expected String, got #{tag.class}." unless tag.is_a?(String)
  raise ArgumentError, "Expected a valid tag of format 'GGGG,EEEE', got #{tag}." unless tag.tag?
  pos = @tags.index(tag)
  if pos
    # Update existing values:
    @values[pos] = options[:value] if options[:value]
    @enumerations[pos] = options[:enum] if options[:enum] != nil
  else
    # Add new elements:
    @tags << tag
    @values << (options[:value] ? options[:value] : default_value(tag))
    @enumerations << (options[:enum] ? options[:enum] : false)
  end
end
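
The update-or-create behaviour above can be seen in a standalone sketch. TagSettings is a hypothetical stand-in (not part of ruby-dicom) that keeps the same three parallel arrays. New tags default to '' here, whereas the library calls its own default_value(tag), which is not shown in this listing. The settings reader is also an assumption, added only to inspect the result.

```ruby
# Hypothetical stand-in (not part of ruby-dicom) for the three parallel arrays
# that set_tag maintains. New tags default to '' here; the library calls its
# own default_value(tag), which is not shown.
class TagSettings
  def initialize
    @tags = []
    @values = []
    @enumerations = []
  end

  def set_tag(tag, options = {})
    pos = @tags.index(tag)
    if pos
      # Update existing entry:
      @values[pos] = options[:value] if options[:value]
      @enumerations[pos] = options[:enum] unless options[:enum].nil?
    else
      # Append a new entry to all three arrays so the indices stay aligned:
      @tags << tag
      @values << (options[:value] || '')
      @enumerations << (options[:enum] ? options[:enum] : false)
    end
  end

  # Returns [value, enum] for a tag, or nil if the tag is not registered.
  def settings(tag)
    pos = @tags.index(tag)
    pos && [@values[pos], @enumerations[pos]]
  end
end

s = TagSettings.new
s.set_tag('0010,0010', value: 'MrAnonymous', enum: true)
s.set_tag('0010,0010', enum: false)   # value is kept, enum is updated
p s.settings('0010,0010')
s.set_tag('0010,0010', value: nil)    # a nil (or false) :value is ignored
p s.settings('0010,0010')
```

Because the update is guarded by a truthiness check on :value, an existing tag's value cannot be reset to nil or false through set_tag, while :enum can be switched back to false.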

#to_anonymizer ⇒ Anonymizer

Returns self.

Returns:

  • (Anonymizer)

    self



# File 'lib/dicom/anonymizer.rb', line 246

def to_anonymizer
  self
end

#value(tag) ⇒ String, ...

Note:

If enumeration is selected for a string type tag, a number will be appended in addition to the string that is returned here.

Gives the value which will be used when anonymizing this tag.

Parameters:

  • tag (String)

    a data element tag

Returns:

  • (String, Integer, Float, NilClass)

    the replacement value for the specified tag, or nil if the tag is not matched

Raises:

  • (ArgumentError)


# File 'lib/dicom/anonymizer.rb', line 258

def value(tag)
  raise ArgumentError, "Expected String, got #{tag.class}." unless tag.is_a?(String)
  raise ArgumentError, "Expected a valid tag of format 'GGGG,EEEE', got #{tag}." unless tag.tag?
  pos = @tags.index(tag)
  if pos
    return @values[pos]
  else
    logger.warn("The specified tag (#{tag}) was not found in the list of tags to be anonymized.")
    return nil
  end
end