Class: DICOM::Anonymizer
- Inherits:
-
Object
- Object
- DICOM::Anonymizer
- Includes:
- Logging
- Defined in:
- lib/dicom/anonymizer.rb,
lib/dicom/deprecated.rb
Overview
This is a convenience class for handling the anonymization (de-identification) of DICOM files.
Note: For a thorough introduction to the concept of DICOM anonymization, please refer to The DICOM Standard, Part 15: Security and System Management Profiles, Annex E: Attribute Confidentiality Profiles. For guidance on settings for individual data elements, please refer to DICOM PS 3.15, Annex E, Table E.1-1: Application Level Confidentiality Profile Attributes.
Instance Attribute Summary
-
#audit_trail ⇒ Object
readonly
An AuditTrail instance used for this anonymization (if specified).
-
#audit_trail_file ⇒ Object
readonly
The file name used for the AuditTrail serialization (if specified).
-
#blank ⇒ Object
A boolean that, if set to true, causes all anonymized tags to be blank instead of receiving a generic value.
-
#delete ⇒ Object
readonly
A hash of elements (represented by tag keys) that will be deleted from the DICOM objects on anonymization.
-
#delete_private ⇒ Object
A boolean that, if set to true, makes the anonymization delete all private tags.
-
#encryption ⇒ Object
readonly
The cryptographic hash function to be used for encrypting DICOM values recorded in an audit trail file.
-
#enumeration ⇒ Object
A boolean that, if set to true, causes all anonymized tags to get enumerated values, to enable post-anonymization re-identification by the user.
-
#logger_level ⇒ Object
readonly
The logger level which is applied to DObject operations during anonymization (defaults to Logger::FATAL).
-
#random_file_name ⇒ Object
A boolean that, if set to true, causes all anonymized files to be written with random file names (if write_path has been specified).
-
#recursive ⇒ Object
A boolean that, if set to true, causes the anonymization to run on all levels of the DICOM file tag hierarchy.
-
#uid ⇒ Object
A boolean indicating whether or not UIDs shall be replaced when executing the anonymization.
-
#uid_root ⇒ Object
The DICOM UID root to use when generating new UIDs.
-
#write_path ⇒ Object
The path where the anonymized files will be saved.
Instance Method Summary
-
#==(other) ⇒ Boolean
(also: #eql?)
Checks for equality.
-
#add_exception(path) ⇒ Object
deprecated
Deprecated.
Use Anonymizer#anonymize instead.
-
#add_folder(path) ⇒ Object
deprecated
Deprecated.
Use Anonymizer#anonymize instead.
-
#anonymize(data) ⇒ Array<DObject>
Anonymizes the given DICOM data with the settings of this Anonymizer instance.
-
#delete_tag(tag) ⇒ Object
Specifies that the given tag is to be completely deleted from the anonymized DICOM objects.
-
#enum(tag) ⇒ Boolean, NilClass
Checks the enumeration status of this tag.
-
#execute ⇒ Object
deprecated
Deprecated.
Use Anonymizer#anonymize instead.
-
#hash ⇒ Fixnum
Computes a hash code for this object.
-
#initialize(options = {}) ⇒ Anonymizer
constructor
Creates an Anonymizer instance.
-
#print ⇒ Object
Prints to screen a list of which tags are currently selected for anonymization along with the replacement values that will be used and enumeration status.
-
#remove_tag(tag) ⇒ Object
Removes a tag from the list of tags that will be anonymized.
-
#set_tag(tag, options = {}) ⇒ Object
Sets the anonymization settings for the specified tag.
-
#to_anonymizer ⇒ Anonymizer
Returns self.
-
#value(tag) ⇒ String, ...
Gives the value which will be used when anonymizing this tag.
Methods included from Logging
Constructor Details
#initialize(options = {}) ⇒ Anonymizer
Creates an Anonymizer instance.
Note: To customize logging behaviour, refer to the Logging module documentation.
# File 'lib/dicom/anonymizer.rb', line 68
def initialize(options={})
  # Transfer options to attributes:
  @blank = options[:blank]
  @delete_private = options[:delete_private]
  @enumeration = options[:enumeration]
  @logger_level = options[:logger_level] || Logger::FATAL
  @random_file_name = options[:random_file_name]
  @recursive = options[:recursive]
  @uid = options[:uid]
  @uid_root = options[:uid_root] ? options[:uid_root] : UID_ROOT
  @write_path = options[:write_path]
  # Array of folders to be processed for anonymization:
  @folders = Array.new
  # Folders that will be skipped:
  @exceptions = Array.new
  # Data elements which will be anonymized (the array will hold a list of tag strings):
  @tags = Array.new
  # Default values to use on anonymized data elements:
  @values = Array.new
  # Which data elements will have enumeration applied, if requested by the user:
  @enumerations = Array.new
  # We use a Hash to store information from DICOM files if enumeration is desired:
  @enum_old_hash = Hash.new
  @enum_new_hash = Hash.new
  # All the files to be anonymized will be put in this array:
  @files = Array.new
  @prefixes = Hash.new
  # Setup audit trail if requested:
  if options[:audit_trail]
    @audit_trail_file = options[:audit_trail]
    if File.exist?(@audit_trail_file) && File.size(@audit_trail_file) > 2
      # Load the pre-existing audit trail from file:
      @audit_trail = AuditTrail.read(@audit_trail_file)
    else
      # Start from scratch with an empty audit trail:
      @audit_trail = AuditTrail.new
    end
    # Set up encryption if indicated:
    if options[:encryption]
      require 'digest'
      if options[:encryption].respond_to?(:hexdigest)
        @encryption = options[:encryption]
      else
        @encryption = Digest::MD5
      end
    end
  end
  # Set the default data elements to be anonymized:
  set_defaults
end
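The constructor picks the digest for audit trail values by duck typing: an :encryption option that responds to hexdigest is used as given, and any other truthy value falls back to Digest::MD5. In the listing this branch is only reached when :audit_trail is also given. The sketch below isolates that selection; pick_digest is a hypothetical helper for illustration and is not part of ruby-dicom.

```ruby
require 'digest'

# Hypothetical helper mirroring the constructor's branch: use the given
# object if it can produce a hex digest, otherwise fall back to MD5.
def pick_digest(option)
  return nil unless option
  option.respond_to?(:hexdigest) ? option : Digest::MD5
end

sha = pick_digest(Digest::SHA256)  # a digest class is used as given
md5 = pick_digest(true)            # any other truthy value selects MD5

# Hashing a value, as might be done before recording it in an audit trail:
puts sha.hexdigest('Doe^John').length  # 64 hex characters for SHA-256
puts md5.hexdigest('Doe^John').length  # 32 hex characters for MD5
```

Passing a digest class explicitly avoids relying on the MD5 fallback.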
Instance Attribute Details
#audit_trail ⇒ Object (readonly)
An AuditTrail instance used for this anonymization (if specified).
# File 'lib/dicom/anonymizer.rb', line 18
def audit_trail
  @audit_trail
end
#audit_trail_file ⇒ Object (readonly)
The file name used for the AuditTrail serialization (if specified).
# File 'lib/dicom/anonymizer.rb', line 20
def audit_trail_file
  @audit_trail_file
end
#blank ⇒ Object
A boolean that, if set to true, causes all anonymized tags to be blank instead of receiving a generic value.
# File 'lib/dicom/anonymizer.rb', line 22
def blank
  @blank
end
#delete ⇒ Object (readonly)
A hash of elements (represented by tag keys) that will be deleted from the DICOM objects on anonymization.
# File 'lib/dicom/anonymizer.rb', line 24
def delete
  @delete
end
#delete_private ⇒ Object
A boolean that, if set to true, makes the anonymization delete all private tags.
# File 'lib/dicom/anonymizer.rb', line 26
def delete_private
  @delete_private
end
#encryption ⇒ Object (readonly)
The cryptographic hash function to be used for encrypting DICOM values recorded in an audit trail file.
# File 'lib/dicom/anonymizer.rb', line 28
def encryption
  @encryption
end
#enumeration ⇒ Object
A boolean that, if set to true, causes all anonymized tags to get enumerated values, to enable post-anonymization re-identification by the user.
# File 'lib/dicom/anonymizer.rb', line 30
def enumeration
  @enumeration
end
#logger_level ⇒ Object (readonly)
The logger level which is applied to DObject operations during anonymization (defaults to Logger::FATAL).
# File 'lib/dicom/anonymizer.rb', line 32
def logger_level
  @logger_level
end
#random_file_name ⇒ Object
A boolean that, if set to true, causes all anonymized files to be written with random file names (if write_path has been specified).
# File 'lib/dicom/anonymizer.rb', line 34
def random_file_name
  @random_file_name
end
#recursive ⇒ Object
A boolean that, if set to true, causes the anonymization to run on all levels of the DICOM file tag hierarchy.
# File 'lib/dicom/anonymizer.rb', line 36
def recursive
  @recursive
end
#uid ⇒ Object
A boolean indicating whether or not UIDs shall be replaced when executing the anonymization.
# File 'lib/dicom/anonymizer.rb', line 38
def uid
  @uid
end
#uid_root ⇒ Object
The DICOM UID root to use when generating new UIDs.
# File 'lib/dicom/anonymizer.rb', line 40
def uid_root
  @uid_root
end
#write_path ⇒ Object
The path where the anonymized files will be saved. If this value is not set, the original DICOM files will be overwritten.
# File 'lib/dicom/anonymizer.rb', line 42
def write_path
  @write_path
end
Instance Method Details
#==(other) ⇒ Boolean Also known as: eql?
Checks for equality.
Other and self are considered equivalent if they are of compatible types and their attributes are equivalent.
# File 'lib/dicom/anonymizer.rb', line 127
def ==(other)
  if other.respond_to?(:to_anonymizer)
    other.send(:state) == state
  end
end
#add_exception(path) ⇒ Object
Deprecated. Use Anonymizer#anonymize instead.
Adds an exception folder which will be avoided when anonymizing.
# File 'lib/dicom/deprecated.rb', line 12
def add_exception(path)
  # Deprecation warning:
  logger.warn("The '#add_exception' method of the Anonymization class has been deprecated! Please use the '#anonymize' method with a dataset argument instead.")
  raise ArgumentError, "Expected String, got #{path.class}." unless path.is_a?(String)
  if path
    # Remove last character if the path ends with a file separator:
    path.chop! if path[-1..-1] == File::SEPARATOR
    @exceptions << path
  end
end
#add_folder(path) ⇒ Object
Deprecated. Use Anonymizer#anonymize instead.
Adds a folder whose files will be anonymized.
# File 'lib/dicom/deprecated.rb', line 30
def add_folder(path)
  # Deprecation warning:
  logger.warn("The '#add_folder' method of the Anonymization class has been deprecated! Please use the '#anonymize' method with a dataset argument instead.")
  raise ArgumentError, "Expected String, got #{path.class}." unless path.is_a?(String)
  @folders << path
end
#anonymize(data) ⇒ Array<DObject>
Anonymizes the given DICOM data with the settings of this Anonymizer instance.
# File 'lib/dicom/anonymizer.rb', line 140
def anonymize(data)
  dicom = prepare(data)
  if @tags.length > 0
    dicom.each do |dcm|
      anonymize_dcm(dcm)
      # Write DICOM object to file unless it was passed to the anonymizer as an object:
      write(dcm) unless dcm.was_dcm_on_input
    end
  else
    logger.warn("No tags have been selected for anonymization. Aborting anonymization.")
  end
  # Reset the ruby-dicom log threshold to its original level:
  logger.level = @original_level
  # Save the audit trail (if used):
  @audit_trail.write(@audit_trail_file) if @audit_trail
  logger.info("Anonymization complete.")
  dicom
end
#delete_tag(tag) ⇒ Object
Specifies that the given tag is to be completely deleted from the anonymized DICOM objects.
# File 'lib/dicom/anonymizer.rb', line 166
def delete_tag(tag)
  raise ArgumentError, "Expected String, got #{tag.class}." unless tag.is_a?(String)
  raise ArgumentError, "Expected a valid tag of format 'GGGG,EEEE', got #{tag}." unless tag.tag?
  @delete[tag] = true
end
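The tag-based methods reject any argument that is not a String of the form 'GGGG,EEEE'. The library performs that check through its own String#tag? extension, whose implementation is not shown on this page. As a rough illustration only, the check could look like the following, where valid_tag? and TAG_FORMAT are hypothetical names and the exact rules (for example case handling) are an assumption.

```ruby
# Hypothetical stand-in for the String#tag? check: four hex digits,
# a comma, and four more hex digits.
TAG_FORMAT = /\A\h{4},\h{4}\z/

def valid_tag?(tag)
  tag.is_a?(String) && tag.match?(TAG_FORMAT)
end

puts valid_tag?('0010,0010')  # true
puts valid_tag?('00100010')   # false (missing comma)
puts valid_tag?(nil)          # false (not a String)
```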
#enum(tag) ⇒ Boolean, NilClass
Checks the enumeration status of this tag.
# File 'lib/dicom/anonymizer.rb', line 177
def enum(tag)
  raise ArgumentError, "Expected String, got #{tag.class}." unless tag.is_a?(String)
  raise ArgumentError, "Expected a valid tag of format 'GGGG,EEEE', got #{tag}." unless tag.tag?
  pos = @tags.index(tag)
  if pos
    return @enumerations[pos]
  else
    logger.warn("The specified tag (#{tag}) was not found in the list of tags to be anonymized.")
    return nil
  end
end
#execute ⇒ Object
Deprecated. Use Anonymizer#anonymize instead.
Executes the anonymization process.
This method is run when all settings have been finalized for the Anonymizer instance.
# File 'lib/dicom/deprecated.rb', line 43
def execute
  # Deprecation warning:
  logger.warn("The '#execute' method of the Anonymization class has been deprecated! Please use the '#anonymize' method instead.")
  # FIXME: This method has grown way too lengthy. It needs to be refactored one of these days.
  # Search through the folders to gather all the files to be anonymized:
  logger.info("Initiating anonymization process.")
  start_time = Time.now.to_f
  logger.info("Searching for files...")
  load_files
  logger.info("Done.")
  if @files.length > 0
    if @tags.length > 0
      logger.info(@files.length.to_s + " files have been identified in the specified folder(s).")
      if @write_path
        # Determine the write paths, as anonymized files will be written to a separate location:
        logger.info("Processing write paths...")
        process_write_paths
        logger.info("Done")
      else
        # Overwriting old files:
        logger.warn("Separate write folder not specified. Existing DICOM files will be overwritten.")
        @write_paths = @files
      end
      # If the user wants enumeration, we need to prepare variables for storing
      # existing information associated with each tag:
      create_enum_hash if @enumeration
      # Start the read/update/write process:
      logger.info("Initiating read/update/write process. This may take some time...")
      # Monitor whether every file read/write was successful:
      all_read = true
      all_write = true
      files_written = 0
      files_failed_read = 0
      begin
        require 'progressbar'
        pbar = ProgressBar.new("Anonymizing", @files.length)
      rescue LoadError
        pbar = nil
      end
      # Temporarily increase the log threshold to suppress messages from the DObject class:
      anonymizer_level = logger.level
      logger.level = Logger::FATAL
      @files.each_index do |i|
        pbar.inc if pbar
        # Read existing file to DICOM object:
        dcm = DObject.read(@files[i])
        if dcm.read?
          # Extract the data element parents to investigate for this DICOM object:
          parents = element_parents(dcm)
          parents.each do |parent|
            # Anonymize the desired tags:
            @tags.each_index do |j|
              if parent.exists?(@tags[j])
                element = parent[@tags[j]]
                if element.is_a?(Element)
                  if @blank
                    value = ""
                  elsif @enumeration
                    old_value = element.value
                    # Only launch enumeration logic if there is an actual value to the data element:
                    if old_value
                      value = enumerated_value(old_value, j)
                    else
                      value = ""
                    end
                  else
                    # Use the value that has been set for this tag:
                    value = @values[j]
                  end
                  element.value = value
                end
              end
            end
            # Delete elements marked for deletion:
            @delete.each_key do |tag|
              parent.delete(tag) if parent.exists?(tag)
            end
          end
          # General DICOM object manipulation:
          # Add a Patient Identity Removed attribute (as per
          # DICOM PS 3.15, Annex E, E.1.1 De-Identifier, point 6):
          dcm.add(Element.new('0012,0062', 'YES'))
          # Delete (and replace) the File Meta Information (as per
          # DICOM PS 3.15, Annex E, E.1.1 De-Identifier, point 7):
          dcm.delete_group('0002')
          # Handle UIDs if requested:
          replace_uids(parents) if @uid
          # Delete private tags?
          dcm.delete_private if @delete_private
          # Write DICOM file:
          dcm.write(@write_paths[i])
          if dcm.written?
            files_written += 1
          else
            all_write = false
          end
        else
          all_read = false
          files_failed_read += 1
        end
      end
      pbar.finish if pbar
      # Finished anonymizing files. Reset the log threshold:
      logger.level = anonymizer_level
      # Print elapsed time and status of anonymization:
      end_time = Time.now.to_f
      logger.info("Anonymization process completed!")
      if all_read
        logger.info("All files in the specified folder(s) were SUCCESSFULLY read to DICOM objects.")
      else
        logger.warn("Some files were NOT successfully read (#{files_failed_read} files). If some folder(s) contain non-DICOM files, this is expected.")
      end
      if all_write
        logger.info("All DICOM objects were SUCCESSFULLY written as DICOM files (#{files_written} files).")
      else
        logger.warn("Some DICOM objects were NOT successfully written to file. You are advised to investigate the result (#{files_written} files successfully written).")
      end
      @audit_trail.write(@audit_trail_file) if @audit_trail
      elapsed = (end_time-start_time).to_s
      logger.info("Elapsed time: #{elapsed[0..elapsed.index(".")+1]} seconds")
    else
      logger.warn("No tags were selected for anonymization. Aborting.")
    end
  else
    logger.warn("No files were found in specified folders. Aborting.")
  end
end
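The listing saves the current log threshold, raises it to Logger::FATAL while files are processed, and restores it afterwards. If an exception escaped the loop, the restore line would not run. A minimal sketch of the same idea with an ensure clause follows; with_log_level is a hypothetical helper and not part of ruby-dicom, and it uses only Ruby's standard Logger.

```ruby
require 'logger'
require 'stringio'

# Hypothetical helper: raise the threshold for the duration of the block
# and restore it afterwards, even if the block raises.
def with_log_level(logger, level)
  original = logger.level
  logger.level = level
  yield
ensure
  logger.level = original
end

out = StringIO.new
log = Logger.new(out)
log.level = Logger::INFO

with_log_level(log, Logger::FATAL) do
  log.warn('suppressed while the threshold is FATAL')
end
log.info('visible again')

puts out.string.include?('suppressed')  # false
puts out.string.include?('visible')     # true
```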
#hash ⇒ Fixnum
Computes a hash code for this object.
Note: Two objects with the same attributes will have the same hash code.
# File 'lib/dicom/anonymizer.rb', line 195
def hash
  state.hash
end
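Both #== and #hash delegate to a private state method, which keeps the two consistent so that equal objects also work as interchangeable Hash keys. The sketch below shows the pattern on a hypothetical Settings class that is not part of ruby-dicom. Unlike the #== listing, which returns nil for incompatible objects, this version returns false.

```ruby
# Hypothetical class: equality and hashing are both derived from one
# private #state array.
class Settings
  def initialize(blank, uid)
    @blank = blank
    @uid = uid
  end

  def ==(other)
    other.respond_to?(:to_settings) && other.send(:state) == state
  end
  alias_method :eql?, :==

  def hash
    state.hash
  end

  def to_settings
    self
  end

  private

  # The attributes that define identity.
  def state
    [@blank, @uid]
  end
end

a = Settings.new(true, false)
b = Settings.new(true, false)
puts a == b          # true
puts({ a => 1 }[b])  # 1, because eql? and hash agree
```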
#print ⇒ Object
Prints to screen a list of which tags are currently selected for anonymization along with the replacement values that will be used and enumeration status.
# File 'lib/dicom/deprecated.rb', line 174
def print
  logger.warn("Anonymizer#print is deprecated.")
  # Extract the string lengths which are needed to make the formatting nice:
  names = Array.new
  types = Array.new
  tag_lengths = Array.new
  name_lengths = Array.new
  type_lengths = Array.new
  value_lengths = Array.new
  @tags.each_index do |i|
    name, vr = LIBRARY.name_and_vr(@tags[i])
    names << name
    types << vr
    tag_lengths[i] = @tags[i].length
    name_lengths[i] = names[i].length
    type_lengths[i] = types[i].length
    value_lengths[i] = @values[i].to_s.length unless @blank
    value_lengths[i] = '' if @blank
  end
  # To give the printed output a nice format we need to check the string lengths of some of these arrays:
  tag_maxL = tag_lengths.max
  name_maxL = name_lengths.max
  type_maxL = type_lengths.max
  value_maxL = value_lengths.max
  # Format string array for print output:
  lines = Array.new
  @tags.each_index do |i|
    # Configure empty spaces:
    s = ' '
    f1 = ' '*(tag_maxL-@tags[i].length+1)
    f2 = ' '*(name_maxL-names[i].length+1)
    f3 = ' '*(type_maxL-types[i].length+1)
    f4 = ' ' if @blank
    f4 = ' '*(value_maxL-@values[i].to_s.length+1) unless @blank
    if @enumeration
      enum = @enumerations[i]
    else
      enum = ''
    end
    if @blank
      value = ''
    else
      value = @values[i]
    end
    tag = @tags[i]
    lines << tag + f1 + names[i] + f2 + types[i] + f3 + value.to_s + f4 + enum.to_s
  end
  # Print to screen:
  lines.each do |line|
    puts line
  end
end
#remove_tag(tag) ⇒ Object
Removes a tag from the list of tags that will be anonymized.
# File 'lib/dicom/anonymizer.rb', line 205
def remove_tag(tag)
  raise ArgumentError, "Expected String, got #{tag.class}." unless tag.is_a?(String)
  raise ArgumentError, "Expected a valid tag of format 'GGGG,EEEE', got #{tag}." unless tag.tag?
  pos = @tags.index(tag)
  if pos
    @tags.delete_at(pos)
    @values.delete_at(pos)
    @enumerations.delete_at(pos)
  end
end
#set_tag(tag, options = {}) ⇒ Object
Sets the anonymization settings for the specified tag. If the tag is already present in the list of tags to be anonymized, its settings are updated, and if not, a new tag entry is created.
# File 'lib/dicom/anonymizer.rb', line 226
def set_tag(tag, options={})
  raise ArgumentError, "Expected String, got #{tag.class}." unless tag.is_a?(String)
  raise ArgumentError, "Expected a valid tag of format 'GGGG,EEEE', got #{tag}." unless tag.tag?
  pos = @tags.index(tag)
  if pos
    # Update existing values:
    @values[pos] = options[:value] if options[:value]
    @enumerations[pos] = options[:enum] if options[:enum] != nil
  else
    # Add new elements:
    @tags << tag
    @values << (options[:value] ? options[:value] : default_value(tag))
    @enumerations << (options[:enum] ? options[:enum] : false)
  end
end
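The tag settings live in three parallel arrays (@tags, @values, @enumerations) that share an index, and #set_tag either updates the entry at an existing index or appends to all three. The following miniature reproduces that bookkeeping in isolation. TagList and its 'Default' placeholder are hypothetical; the real class derives the default from default_value(tag), which is not shown on this page.

```ruby
# Hypothetical miniature of the tag bookkeeping: three parallel arrays kept
# in step, with update-or-append semantics as in #set_tag.
class TagList
  def initialize
    @tags = []
    @values = []
    @enumerations = []
  end

  def set_tag(tag, options = {})
    pos = @tags.index(tag)
    if pos
      # Update only the settings that were actually supplied:
      @values[pos] = options[:value] if options[:value]
      @enumerations[pos] = options[:enum] unless options[:enum].nil?
    else
      @tags << tag
      @values << (options[:value] || 'Default')  # placeholder default
      @enumerations << (options[:enum] || false)
    end
  end

  def value(tag)
    pos = @tags.index(tag)
    pos && @values[pos]
  end

  def enum(tag)
    pos = @tags.index(tag)
    pos && @enumerations[pos]
  end
end

list = TagList.new
list.set_tag('0010,0010', value: 'Patient', enum: true)
list.set_tag('0010,0010', value: 'Anonymous')  # updates value, keeps enum
puts list.value('0010,0010')  # Anonymous
puts list.enum('0010,0010')   # true
```

One consequence of the update branch, in the listing as well as in this sketch, is that an existing value cannot be reset to nil or false through #set_tag, because a falsy :value is treated as "not supplied".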
#to_anonymizer ⇒ Anonymizer
Returns self.
# File 'lib/dicom/anonymizer.rb', line 246
def to_anonymizer
  self
end
#value(tag) ⇒ String, ...
Gives the value which will be used when anonymizing this tag.
Note: If enumeration is selected for a string type tag, a number will be appended to the string that is returned here.
# File 'lib/dicom/anonymizer.rb', line 258
def value(tag)
  raise ArgumentError, "Expected String, got #{tag.class}." unless tag.is_a?(String)
  raise ArgumentError, "Expected a valid tag of format 'GGGG,EEEE', got #{tag}." unless tag.tag?
  pos = @tags.index(tag)
  if pos
    return @values[pos]
  else
    logger.warn("The specified tag (#{tag}) was not found in the list of tags to be anonymized.")
    return nil
  end
end
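With enumeration enabled, the replacement written to a file is the base value plus a number, so that distinct original values stay distinguishable after anonymization. The library's enumerated_value method is not shown on this page, so the numbering scheme below is an assumption. ValueEnumerator is a hypothetical class that only illustrates the idea of mapping each distinct original value to a stable numbered replacement.

```ruby
# Hypothetical sketch of enumeration: each distinct original value gets the
# base replacement plus a running number, and repeats reuse that number.
class ValueEnumerator
  def initialize(base)
    @base = base
    @seen = {}  # original value => enumerated replacement
  end

  def replacement_for(original)
    @seen[original] ||= "#{@base}#{@seen.length + 1}"
  end
end

names = ValueEnumerator.new('Patient')
puts names.replacement_for('Doe^John')  # Patient1
puts names.replacement_for('Roe^Jane')  # Patient2
puts names.replacement_for('Doe^John')  # Patient1 again
```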