Class: Datadog::DI::Serializer Private

Inherits:
Object
Defined in:
lib/datadog/di/serializer.rb

Overview

This class is part of a private API. You should avoid using this class if possible, as it may be removed or be changed in the future.

Serializes a captured snapshot to primitive types, which are subsequently serialized to JSON and sent to the backend.

This class performs the actual removal of sensitive values from the snapshots. It uses Redactor to determine which values are sensitive and need to be removed.

Serializer should normally not invoke user (application) code, so that its performance stays predictable. However, objects like ActiveRecord models cannot be usefully serialized into primitive types without custom logic (for example, the attributes are more than 3 levels down from the top-level object, and 3 is the default capture depth, so they would not be captured at all). To accommodate complex objects, an extension mechanism permits registration of serializer callbacks for arbitrary types. Applications and libraries defining such serializer callbacks should take great care to keep performance predictable and to avoid exceptions, infinite loops, and similar issues.

All serialization methods take the names of the variables being serialized in order to be able to redact values.

The result of serialization should not reference parameter values when the values are mutable (currently, this only applies to string values). Serializer will duplicate such mutable values, so that if method arguments are captured at entry and then modified during method execution, the serialized values from entry are correctly preserved. Alternatively, we could pass a parameter to the serialization methods which would control whether values are duplicated. This may be more efficient but there would be additional overhead from passing this parameter all the time and the API would get more complex.
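The duplication behavior described above can be illustrated with a minimal sketch. The helper `capture_string` is hypothetical and only mirrors the idea (duplicate mutable strings, reuse frozen ones); it is not part of the library.

```ruby
# Hypothetical helper: duplicate a mutable string so that later mutation
# of the original does not change the captured value. Frozen strings
# cannot be mutated, so they are returned as-is without copying.
def capture_string(value)
  value.frozen? ? value : value.dup
end

arg = +"hello"                 # unary plus yields a mutable string
captured = capture_string(arg) # value as seen at method entry
arg << " world"                # the method body mutates its argument

puts captured                  # prints "hello"
puts arg                       # prints "hello world"

frozen = "constant".freeze
puts capture_string(frozen).equal?(frozen) # prints "true" (no copy made)
```

Without the `dup`, `captured` and `arg` would be the same object, and the entry snapshot would show the value as it was after the mutation.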

Constant Summary

@@flat_registry = []

This class variable is part of a private API. You should avoid using this class variable if possible, as it may be removed or be changed in the future.

Third-party library integration / custom serializers.

Dynamic instrumentation has limited payload sizes, and for efficiency reasons it is not desirable to transmit data to Datadog that will never contain useful information. Additionally, due to depth limits, desired data may not even be included in payloads when serialized with the default, naive serializer. Therefore, custom objects like ActiveRecord model instances may need custom serializers.

CUSTOMER NOTE: The API for defining custom serializers is not yet finalized. Please create an issue at github.com/datadog/dd-trace-rb/issues describing the object(s) you wish to serialize so that we can ensure your use case will be supported as the library evolves.

Note that the current implementation does not permit defining a serializer for a particular class, which is the simplest use case. This is because the library itself does not need this functionality yet, and it would not help for ActiveRecord models (which derive from a common base class but are all of different classes) or for Mongoid models (which have no common base class at all but include a standard Mongoid module).
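As a rough illustration of how a condition-based registry can match objects that share a module rather than a class, here is a self-contained sketch. `SketchSerializer`, `Document`, and `User` are hypothetical stand-ins written for this example; only the shape of `register` and of the callback invocation is taken from the source shown on this page. Since the API is not finalized, treat this as an assumption-laden sketch, not as usage guidance.

```ruby
# Hypothetical stand-in for the registry mechanism; NOT the dd-trace-rb class.
class SketchSerializer
  @@flat_registry = []

  def self.register(condition: nil, &block)
    @@flat_registry << {condition: condition, proc: block}
  end

  def serialize_value(value, name: nil, depth: 3)
    # The first entry whose condition matches handles the value.
    @@flat_registry.each do |entry|
      if (condition = entry[:condition]) && condition.call(value)
        return entry.fetch(:proc).call(self, value, name: nil, depth: depth)
      end
    end
    # Greatly simplified fallback for everything else.
    {type: value.class.name, value: value.to_s}
  end
end

# Hypothetical marker module, playing the role of an ODM mixin.
module Document
  def attributes
    @attributes ||= {}
  end
end

class User
  include Document

  def initialize(name)
    attributes[:name] = name
  end
end

# Match on the included module, since such models share no base class.
SketchSerializer.register(condition: ->(value) { Document === value }) do |serializer, value, name:, depth:|
  entries = value.attributes.map do |k, v|
    [serializer.serialize_value(k, depth: depth - 1),
      serializer.serialize_value(v, name: k, depth: depth - 1)]
  end
  {type: value.class.name, entries: entries}
end

result = SketchSerializer.new.serialize_value(User.new("alice"))
p result
```

The callback surfaces the attributes hash directly at the top level of the result, which is the kind of shortcut that keeps the interesting data within the capture depth.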

Constructor Details

#initialize(settings, redactor, telemetry: nil) ⇒ Serializer

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Returns a new instance of Serializer.



# File 'lib/datadog/di/serializer.rb', line 68

def initialize(settings, redactor, telemetry: nil)
  @settings = settings
  @redactor = redactor
  @telemetry = telemetry
end

Instance Attribute Details

#redactor ⇒ Object (readonly)

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.



# File 'lib/datadog/di/serializer.rb', line 75

def redactor
  @redactor
end

#settings ⇒ Object (readonly)

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.



# File 'lib/datadog/di/serializer.rb', line 74

def settings
  @settings
end

#telemetry ⇒ Object (readonly)

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.



# File 'lib/datadog/di/serializer.rb', line 76

def telemetry
  @telemetry
end

Class Method Details

.register(condition: nil, &block) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.



# File 'lib/datadog/di/serializer.rb', line 64

def self.register(condition: nil, &block)
  @@flat_registry << {condition: condition, proc: block}
end

Instance Method Details

#serialize_args(args, kwargs, depth: settings.dynamic_instrumentation.max_capture_depth, attribute_count: settings.dynamic_instrumentation.max_capture_attribute_count) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Serializes positional and keyword arguments to a method, as obtained by a method probe.

The UI supports only a single argument list and does not distinguish between positional and keyword arguments. Positional arguments are therefore converted to keyword arguments (“arg1”, “arg2”, …) and listed ahead of the real keyword arguments.



# File 'lib/datadog/di/serializer.rb', line 85

def serialize_args(args, kwargs,
  depth: settings.dynamic_instrumentation.max_capture_depth,
  attribute_count: settings.dynamic_instrumentation.max_capture_attribute_count)
  counter = 0
  combined = args.each_with_object({}) do |value, c|
    counter += 1
    # Conversion to symbol is needed here to put args ahead of
    # kwargs when they are merged below.
    c[:"arg#{counter}"] = value
  end.update(kwargs)
  serialize_vars(combined, depth: depth, attribute_count: attribute_count)
end
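The merging step can be tried in isolation. The helper `combine_args` below is a hypothetical extraction of the first half of `serialize_args`, without the serialization call, so that the resulting key order is visible.

```ruby
# Hypothetical helper: name positional arguments arg1, arg2, ... and
# merge keyword arguments after them. Ruby hashes preserve insertion
# order, so positional arguments come first in the result.
def combine_args(args, kwargs)
  counter = 0
  args.each_with_object({}) do |value, combined|
    counter += 1
    combined[:"arg#{counter}"] = value
  end.update(kwargs)
end

combined = combine_args([10, "x"], {verbose: true})
p combined.keys # prints [:arg1, :arg2, :verbose]
```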

#serialize_value(value, name: nil, depth: settings.dynamic_instrumentation.max_capture_depth, attribute_count: nil, type: nil) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Serializes a single named value.

The name is needed to perform sensitive data redaction.

In some cases, the value being serialized does not have a name (for example, it is the return value of a method). In this case name can be nil.

Returns a data structure composed only of values of basic types (integers, strings, arrays, hashes).

Respects string length, collection size and traversal depth limits.
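To make the string length limit concrete, here is a simplified sketch of string handling. The helper `serialize_string` and the limit of 5 are assumptions chosen for the example; in the library the limit comes from `settings.dynamic_instrumentation.max_capture_string_length`.

```ruby
# Hypothetical limit, for illustration only.
MAX_STRING_LENGTH = 5

# Simplified sketch: truncate over-long strings and record the original
# size; otherwise duplicate mutable strings so the capture is stable.
def serialize_string(value, max: MAX_STRING_LENGTH)
  serialized = {type: "String"}
  if value.length > max
    serialized.update(truncated: true, size: value.length)
    value = value[0...max] # slicing already produces a new string
  elsif !value.frozen?
    value = value.dup
  end
  serialized.update(value: value)
end

p serialize_string("hi")
# prints {type: "String", value: "hi"} (older Rubies print {:type=>"String", ...})
p serialize_string("hello world")[:size] # prints 11
```

A truncated value keeps the original length in `size`, so a reader can tell how much was cut.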



# File 'lib/datadog/di/serializer.rb', line 122

def serialize_value(value, name: nil,
  depth: settings.dynamic_instrumentation.max_capture_depth,
  attribute_count: nil,
  type: nil)
  attribute_count ||= settings.dynamic_instrumentation.max_capture_attribute_count
  cls = type || value.class
  begin
    if redactor.redact_type?(value)
      return {type: class_name(cls), notCapturedReason: "redactedType"}
    end

    if name && redactor.redact_identifier?(name)
      return {type: class_name(cls), notCapturedReason: "redactedIdent"}
    end

    @@flat_registry.each do |entry|
      if (condition = entry[:condition]) && condition.call(value)
        serializer_proc = entry.fetch(:proc)
        return serializer_proc.call(self, value, name: nil, depth: depth)
      end
    end

    serialized = {type: class_name(cls)}
    case value
    when NilClass
      serialized.update(isNull: true)
    when Integer, Float, TrueClass, FalseClass
      serialized.update(value: value.to_s)
    when Time
      # This path also handles DateTime values although they do not need
      # to be explicitly added to the case statement.
      serialized.update(value: value.iso8601)
    when Date
      serialized.update(value: value.to_s)
    when String, Symbol
      need_dup = false
      value = if String === value
        # This is the only place where we duplicate the value, currently.
        # All other values are immutable primitives (e.g. numbers).
        # However, do not duplicate if the string is frozen, or if
        # it is later truncated.
        need_dup = !value.frozen?
        value
      else
        value.to_s
      end
      max = settings.dynamic_instrumentation.max_capture_string_length
      if value.length > max
        serialized.update(truncated: true, size: value.length)
        value = value[0...max]
        need_dup = false
      end
      value = value.dup if need_dup
      serialized.update(value: value)
    when Array
      if depth < 0
        serialized.update(notCapturedReason: "depth")
      else
        max = settings.dynamic_instrumentation.max_capture_collection_size
        if max != 0 && value.length > max
          serialized.update(notCapturedReason: "collectionSize", size: value.length)
          # same steep failure with array slices.
          # https://github.com/soutaro/steep/issues/1219
          value = value[0...max] || []
        end
        entries = value.map do |elt|
          serialize_value(elt, depth: depth - 1)
        end
        serialized.update(elements: entries)
      end
    when Hash
      if depth < 0
        serialized.update(notCapturedReason: "depth")
      else
        max = settings.dynamic_instrumentation.max_capture_collection_size
        cur = 0
        entries = []
        value.each do |k, v|
          if max != 0 && cur >= max
            serialized.update(notCapturedReason: "collectionSize", size: value.length)
            break
          end
          cur += 1
          entries << [serialize_value(k, depth: depth - 1), serialize_value(v, name: k, depth: depth - 1)]
        end
        serialized.update(entries: entries)
      end
    else
      if depth < 0
        serialized.update(notCapturedReason: "depth")
      else
        fields = {}
        cur = 0

        # MRI and JRuby 9.4.5+ preserve instance variable definition
        # order when calling #instance_variables. Previous JRuby versions
        # did not preserve order and returned the variables in arbitrary
        # order.
        #
        # The arbitrary order is problematic because 1) when there are
        # fewer instance variables than capture limit, the order in which
        # the variables are shown in UI will change from one capture to
        # the next and generally will be arbitrary to the user, and
        # 2) when there are more instance variables than capture limit,
        # *which* variables are captured will also change meaning user
        # looking at the UI may have "new" instance variables appear and
        # existing ones disappear as they are looking at multiple captures.
        #
        # For consistency, we should have some kind of stable order of
        # instance variables on all supported Ruby runtimes, so that the UI
        # stays consistent. Given that initial implementation of Ruby DI
        # does not support JRuby, we don't handle JRuby's lack of ordering
        # of #instance_variables here, but if JRuby is supported in the
        # future this may need to be addressed.
        ivars = value.instance_variables

        ivars.each do |ivar|
          if cur >= attribute_count
            serialized.update(notCapturedReason: "fieldCount", fields: fields)
            break
          end
          cur += 1
          fields[ivar] = serialize_value(value.instance_variable_get(ivar), name: ivar, depth: depth - 1)
        end
        serialized.update(fields: fields)
      end
    end
    serialized
  rescue => exc
    telemetry&.report(exc, description: "Error serializing")
    {type: class_name(cls), notSerializedReason: exc.to_s}
  end
end
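The depth handling in the Array branch can be seen in a reduced form. The sketch below uses a hypothetical `serialize_nested` helper that covers only arrays and scalars; it omits redaction, collection size limits, and everything else the real method does.

```ruby
# Hypothetical reduced serializer: arrays recurse with depth - 1, and an
# array reached with a negative depth is reported instead of traversed.
def serialize_nested(value, depth:)
  serialized = {type: value.class.name}
  case value
  when Array
    if depth < 0
      serialized.update(notCapturedReason: "depth")
    else
      elements = value.map { |elt| serialize_nested(elt, depth: depth - 1) }
      serialized.update(elements: elements)
    end
  else
    serialized.update(value: value.to_s)
  end
end

result = serialize_nested([1, [2, [3]]], depth: 1)
p result[:elements][1][:elements][1]
# The innermost array is reported with notCapturedReason "depth".
```

Note that scalars are still captured at any depth in this sketch; only containers are cut off, which matches the structure of the case statement above.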

#serialize_vars(vars, depth: settings.dynamic_instrumentation.max_capture_depth, attribute_count: settings.dynamic_instrumentation.max_capture_attribute_count) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Serializes variables captured by a line probe.

These are normally local variables that exist on a particular line of executed code.



# File 'lib/datadog/di/serializer.rb', line 102

def serialize_vars(vars,
  depth: settings.dynamic_instrumentation.max_capture_depth,
  attribute_count: settings.dynamic_instrumentation.max_capture_attribute_count)
  vars.each_with_object({}) do |(k, v), agg|
    agg[k] = serialize_value(v, name: k, depth: depth, attribute_count: attribute_count)
  end
end
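Because each variable is passed together with its name, redaction can be decided per identifier. The following sketch shows the idea with a hypothetical `redact_identifier?` and a made-up list of sensitive names; the real decision is delegated to Redactor, whose rules are not shown on this page.

```ruby
# Hypothetical list of sensitive identifiers, for illustration only.
SENSITIVE_NAMES = %w[password secret token].freeze

def redact_identifier?(name)
  SENSITIVE_NAMES.include?(name.to_s.downcase)
end

# Simplified sketch of serializing a hash of local variables by name.
def serialize_vars_sketch(vars)
  vars.each_with_object({}) do |(k, v), agg|
    agg[k] = if redact_identifier?(k)
      {type: v.class.name, notCapturedReason: "redactedIdent"}
    else
      {type: v.class.name, value: v.to_s}
    end
  end
end

snapshot = serialize_vars_sketch({user: "alice", password: "hunter2"})
p snapshot[:password] # the value "hunter2" does not appear in the result
```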