Microsoft Application Insights Output Plugin for Logstash
This project is a plugin for Logstash.
This plugin must be installed on top of the Logstash core pipeline; it is not a stand-alone program.
This plugin outputs events to Microsoft Application Insights Analytics open schema tables.
Plugin Features
Supported Logstash Versions
- Logstash 2.3.2
- Logstash 2.3.4
- Logstash 2.4.0
- Logstash 5.0.0
Note:
- x64 Ruby for Windows is known to have some compatibility issues.
- The plugin depends on the azure-storage gem, which depends on the nokogiri gem; nokogiri doesn't support Ruby 2.2+ on Windows.
Setting up
Install Logstash
- Download logstash from https://www.elastic.co/downloads/logstash
Install logstash-output-application_insights output plugin
One command installation:
bin/logstash-plugin install "logstash-output-application_insights"
Create configuration file
Example (input from files output Application Insights):
input {
file {
path => "/../files/*"
start_position => "beginning"
}
}
filter {
# some filters here
}
output {
application_insights {
instrumentation_key => "5a6714a3-ec7b-4999-ab96-232f1da92059"
table_id => "c24394e1-f077-420e-8a25-ef6fdf045938"
storage_account_name_key => [ "my-storage-account", "pfrYTwPgKyYNfKBY2QdF+v5sbgx8/eAQp+FFkGpPBnkMDE1k+ZNK3r3qIPqqw8UsOIUqaF3dXBdPDouGJuxNXQ==" ]
#
# set to true if you want to allow Microsoft to collect telemetry data about this process
enable_telemetry_to_microsoft => false
}
}
Run Logstash
bin/logstash -f 'file://localhost/../your-config-file'
Installation options
One command installation:
bin/logstash-plugin install "logstash-output-application_insights"
If the above does not work, or you would like to patch the code, here is a workaround to install this plugin within your Logstash:
- Check out/clone microsoft/logstash-output-application-insights code from github https://github.com/Microsoft/logstash-output-application-insights
Option 1: Run in a local Logstash clone
Edit the Logstash Gemfile and add the logstash-output-application-insights plugin path:
gem "logstash-output-application-insights", :path => "/../logstash-output-application-insights"
Install the plugin from the Logstash home
bin/logstash-plugin install --no-verify
Option 2: Run in an installed Logstash
Build your plugin gem
gem build logstash-output-application-insights.gemspec
Install the plugin from the Logstash home
bin/logstash-plugin install ./logstash-output-application-insights-<version>.gem
Configuration parameters
storage_account_name_key
Array of pairs of storage_account_name and an array of access_keys. No default. At least one pair is required. If not defined, values will be taken (if they exist) from the environment variables AZURE_STORAGE_ACCOUNT and AZURE_STORAGE_ACCESS_KEY. Examples:
storage_account_name_key => [ "my-storage-account", "pfrYTwPgKyYNfKBY2QdF+v5sbgx8/eAQp+FFkGpPBnkMDE1k+ZNK3r3qIPqqw8UsOIUqaF3dXBdPDouGJuxNXQ==" ]
storage_account_name_key => [ ["my-storage-account1", "key1"], ["my-storage-account2", "key2"], ["my-storage-account3", "key3"] ]
storage_account_name_key => [ ["my-storage-account1", ["key11", "key12"]], ["my-storage-account1", "key2"], ["my-storage-account1", ["key3"]] ]
Note: the storage account must be of "General purpose" kind.
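The accepted forms above can be reduced to one canonical shape; a minimal sketch, assuming the helper name and the environment-variable fallback wiring are illustrative rather than the plugin's actual internals:

```ruby
# Illustrative sketch: normalize the accepted storage_account_name_key
# forms into an array of [account_name, [access_keys]] pairs, falling
# back to the AZURE_STORAGE_ACCOUNT / AZURE_STORAGE_ACCESS_KEY
# environment variables when the property is not set.
def normalize_accounts(value = nil)
  value ||= [ENV["AZURE_STORAGE_ACCOUNT"], ENV["AZURE_STORAGE_ACCESS_KEY"]]
  pairs = value.first.is_a?(Array) ? value : [value]   # one pair or a list of pairs
  pairs.map { |account, keys| [account, Array(keys)] } # keys: one string or an array
end
```

For example, `normalize_accounts(["acct", "key"])` yields `[["acct", ["key"]]]`, the same shape as the multi-account forms.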
azure_storage_table_prefix
A prefix for the Azure storage table names used by this Logstash instance. Default: host name. It is recommended that each Logstash instance have a unique prefix, to avoid confusion and loss of tracking, although sharing tables won't damage proper execution. If not set, the host name is used (non-alphanumeric characters are removed and the result is converted to lowercase), if a host name is available. The prefix string may contain only alphanumeric characters, is case sensitive, and must start with a letter. Example:
azure_storage_table_prefix => "myprefix"
azure_storage_container_prefix
A prefix for the Azure storage container names used by this Logstash instance. Default: host name. It is recommended that each Logstash instance have a unique prefix, to avoid confusion and loss of tracking, although sharing containers won't damage proper execution. If not set, the host name is used (non-alphanumeric characters are removed and the result is converted to lowercase), if a host name is available. The prefix string may contain only alphanumeric characters and dashes (double dashes are not allowed), and is case insensitive. Example:
azure_storage_container_prefix => "myprefix"
azure_storage_blob_prefix
A prefix for the Azure storage blob names used by this Logstash instance. Default: host name. Each Logstash instance MUST have a unique prefix, to avoid loss of data! If not set, the host name is used (non-alphanumeric characters are removed and the result is converted to lowercase), if a host name is available. The string may include only characters that are allowed in a valid URL. Example:
azure_storage_blob_prefix => "myprefix"
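The host-name fallback used by the three prefix parameters (non-alphanumeric characters removed, result downcased) can be sketched as follows; the helper name is illustrative:

```ruby
# Illustrative sketch of the documented host-name normalization used
# for default prefixes: strip non-alphanumeric characters, downcase.
def default_prefix(host_name)
  host_name.gsub(/[^A-Za-z0-9]/, "").downcase
end

default_prefix("My-Host.local")  # => "myhostlocal"
```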
instrumentation_key
Default Application Insights Analytics instrumentation_key. No default. It is used only when the key is not specified in the tables property associated with a table_id, or as a field or metadata field in the event. Example:
instrumentation_key => "5A6714A3-EC7B-4999-AB96-232F1DA92059"
table_id
Default Application Insights Analytics table_id. No default. It is used only when not specified as a field or metadata field in the event. Example:
table_id => "C24394E1-F077-420E-8A25-EF6FDF045938"
table_columns
Specifies the list of event fields to serialize; fields not listed are ignored. No default (all event fields). If not specified, all fields in the event are serialized and their order is kept. The order is essential in the case of CSV serialization. Example:
table_columns => [ "EventLogID", "AppName", "EnvironmentName", "ActivityID", "EventID", "Severity", "Title" ]
case_insensitive_columns
If set to true, event fields are treated as case insensitive. Default: false (case sensitive). Example:
case_insensitive_columns => true
blob_max_bytesize
Advanced, internal; should not be set. Default: 4 GB. The Azure storage maximum blob size is 195 GB (= 50,000 blocks * 4 MB). Example:
blob_max_bytesize => 4000000000
blob_max_events
Specifies the maximum number of events in one blob. Default: 1,000,000 events. Setting it too low may improve latency but will reduce ingestion performance; setting it too high may increase latency up to the maximum delay, but ingestion will be more efficient and the network load will be lower. Example:
blob_max_events => 1000000
blob_max_delay
Specifies the maximum latency time, in seconds. Default: 60 seconds. The latency time is measured from the time an event arrives until it is committed to Azure storage and Application Insights is notified. The total latency may be higher, as this is not the full ingestion flow. Example:
blob_max_delay => 3600
blob_serialization
Specifies the blob serialization to create. Default: "json". Currently 2 types are supported: "csv" and "json". Example:
blob_serialization => "json"
io_retry_delay
Specifies the interval of time, in seconds, between retries due to IO failures. Example:
io_retry_delay => 0.5
io_max_retries
Specifies the number of retries on IO failures before giving up and moving to the next available option. Example:
io_max_retries => 3
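Together, io_retry_delay and io_max_retries describe a simple retry loop; a sketch of the policy (not the plugin's actual code):

```ruby
# Illustrative retry loop: retry an IO operation up to max_retries
# times, sleeping retry_delay seconds between attempts, then give up
# (letting the caller move on to the next available option).
def with_io_retries(max_retries: 3, retry_delay: 0.5)
  attempts = 0
  begin
    yield
  rescue IOError
    attempts += 1
    raise if attempts > max_retries
    sleep(retry_delay)
    retry
  end
end
```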
blob_retention_time
Specifies the retention time of the blob in the container after Application Insights Analytics is notified. Default: 604,800 seconds (1 week). Once the retention time expires, the blob is deleted from the container. Example:
blob_retention_time => 604800
blob_access_expiry_time
Specifies the time Application Insights Analytics has access to the notified blobs. Default: 86,400 seconds (1 day). Blob access is limited with a SAS URL. Example:
blob_access_expiry_time => 86400
csv_default_value
Specifies the string used as the value in a CSV record when the field does not exist in the event. Default: "". Example:
csv_default_value => "-"
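In CSV serialization, csv_default_value fills the positions of fields missing from the event, keeping the column layout stable; a sketch (the helper, the column list, and the event hash are illustrative):

```ruby
require "csv"

# Illustrative sketch: serialize one event to a CSV record, emitting
# csv_default_value for columns that are missing from the event.
def to_csv_record(event, table_columns, csv_default_value = "")
  table_columns.map { |c| event.key?(c) ? event[c] : csv_default_value }.to_csv
end

to_csv_record({ "AppName" => "app1", "Severity" => "WARN" },
              ["EventLogID", "AppName", "Severity"], "-")
# => "-,app1,WARN\n"
```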
serialized_event_field
Specifies a serialized event field name; if it exists in the current event, its value is taken as-is as the serialized event. No default. Example:
serialized_event_field => "serializedMessage"
logger_level
Specifies the log level. Valid values are: DEBUG, INFO, WARN, ERROR, FATAL, UNKNOWN. Default: "INFO". Example:
logger_level => "INFO"
logger_files
Specifies the list of targets for the log; may include files, devices, "stdout" and "stderr". Default: "logstash-output-application-insights.log". Example:
logger_files => [ "c:/logstash/dev/runtime/log/logstash-output-application-insights.log", "stdout" ]
logger_progname
Specifies the program name that will be displayed in each log record. Default: "AI". It should be modified only if another plugin uses the same program name. Example:
logger_progname => "MSAI"
logger_shift_size
Specifies the maximum logfile size. No default (no size limit). Only applies when shift age is a number. Not supported on Windows. Example (1 MB):
logger_shift_size => 1048576
logger_shift_age
Specifies the number of old logfiles to keep, or the frequency of rotation ("daily", "weekly" or "monthly"). No default (never rotate). Not supported on Windows. Examples:
logger_shift_age => "weekly"
logger_shift_age => 5
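logger_shift_size and logger_shift_age mirror the rotation parameters of Ruby's standard Logger (a sketch assuming the plugin delegates to it; the filename is illustrative):

```ruby
require "logger"

# Ruby's standard Logger rotation: keep 5 old logfiles, shifting when
# the current file reaches 1 MB (equivalent to logger_shift_age => 5
# with logger_shift_size => 1048576).
logger = Logger.new("plugin.log", 5, 1_048_576)
logger.level    = Logger::INFO
logger.progname = "AI"
logger.info("plugin started")
logger.close
```

Time-based rotation works the same way: `Logger.new("plugin.log", "weekly")` corresponds to `logger_shift_age => "weekly"`.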
resurrect_delay
Specifies the time interval, in seconds, between tests that check whether a storage account came back to life after it stopped responding. Default: 10 seconds. Example (half second):
resurrect_delay => 0.5
flow_control_suspend_bytes
Specifies the high water mark for the flow control that is used to avoid an out-of-memory crash. Default: 52,428,800 bytes (50 MB). Once memory consumption reaches the high water mark, the plugin stops accepting events until memory drops below the low water mark. Example (200 MB):
flow_control_suspend_bytes => 209715200
flow_control_resume_bytes
Specifies the low water mark for the flow control that is used to avoid an out-of-memory crash. Default: 41,820,160 bytes (40 MB). Once memory consumption reaches the high water mark, the plugin stops accepting events until memory drops below the low water mark. Example (10 MB):
flow_control_resume_bytes => 10455040
flow_control_delay
Specifies the amount of time, in seconds, that the flow control suspends receiving events. Default: 1 second. This allows GC and flushing of events to Azure storage before checking whether memory is below the low water mark. Example (half second):
flow_control_delay => 0.5
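The three flow-control parameters form a classic high/low water-mark loop; a sketch under the assumption that buffered bytes can be polled (the helper and the buffer object are illustrative):

```ruby
# Illustrative water-mark flow control: once buffered bytes reach the
# suspend (high) mark, stop accepting events and sleep in
# flow_control_delay intervals until memory drops below the resume
# (low) mark (GC and flushes to Azure storage run in the meantime).
def wait_for_resume(buffer, suspend_bytes:, resume_bytes:, delay: 1)
  return unless buffer.bytesize >= suspend_bytes
  sleep(delay) while buffer.bytesize >= resume_bytes
end
```

The gap between the two marks prevents the plugin from oscillating rapidly between suspended and resumed states.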
ca_file
File path of the CA file; required only if you have an issue with SSL (see OpenSSL). No default. Example:
ca_file => "/path/to/cafile.crt"
enable_telemetry_to_microsoft
When set to true, telemetry about the plugin will be sent to Microsoft. Default: false. Set it to true only if you want to allow Microsoft to collect telemetry data about this process. Example:
enable_telemetry_to_microsoft => true
disable_cleanup
When set to true, storage cleanup won't be done by the plugin (it should then be done by some other means, or by another Logstash process with this flag enabled). Default: false. Example:
disable_cleanup => true
disable_compression
When set to true, blobs won't be compressed (beware: this will require more storage, more memory and more bandwidth). Default: false. Example:
disable_compression => true
delete_not_notified_blobs
When set to true, non-notified blobs are deleted; if not set, they are copied to the orphan-blobs container. Default: false. Example:
delete_not_notified_blobs => true
validate_notification
When set to true, access to Application Insights will be validated at initialization, and if validation fails, the Logstash process will abort. Default: false. Example:
validate_notification => true
validate_storage
When set to true, access to Azure storage for each of the configured accounts will be validated at initialization, and if validation fails, the Logstash process will abort. Default: false. Example:
validate_storage => true
save_notified_blobs_records
When set to true, notified blob records are saved in the state table for as long as the blobs are retained in their containers. Default: false. Used for troubleshooting. Example:
save_notified_blobs_records => true
disable_notification
When set to true, notification is not sent to Application Insights, but the plugin behaves as if it were notified. Default: false. Used for troubleshooting. Example:
disable_notification => true
disable_blob_upload
When set to true, events are not uploaded and blobs are not committed, but the plugin behaves as if they were uploaded and committed. Default: false. Used for troubleshooting. Example:
disable_blob_upload => true
disable_truncation
When set to true, event fields won't be truncated to a maximum of 1 MB (beware: the maximum allowed byte size per field is 1 MB, so setting this to true just wastes bandwidth and storage). Default: false. Used for troubleshooting. Example:
disable_truncation => true
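The 1 MB per-field limit mentioned above can be sketched as a byte-level slice (the helper is illustrative, not the plugin's code):

```ruby
MAX_FIELD_BYTES = 1_048_576  # 1 MB per-field limit

# Illustrative sketch: truncate a string field to 1 MB of bytes
# unless disable_truncation is set.
def truncate_field(value, disable_truncation: false)
  return value if disable_truncation || !value.is_a?(String)
  value.byteslice(0, MAX_FIELD_BYTES)
end
```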
stop_on_unknown_io_errors
When set to true, the process will stop if an unknown IO error is detected. Default: false. Used for troubleshooting. Example:
stop_on_unknown_io_errors => true
azure_storage_host_suffix
When set, an alternative storage service will be used. Default: "core.windows.net". Example:
azure_storage_host_suffix => "core.windows.net"
application_insights_endpoint
When set, blob-ready notifications are sent to an alternative endpoint. Default: "https://dc.services.visualstudio.com/v2/track". Example:
application_insights_endpoint => "https://dc.services.visualstudio.com/v2/track"
notification_version
Advanced, internal; should not be set. The only currently valid value is 1. Example:
notification_version => 1
tables
Allows support for multiple tables, each configured with its own parameters, using the global parameters as defaults. It is required only if the plugin needs to support multiple tables. tables is a hash, where each key is a table_id and each value is a hash of table-specific properties whose default values are the global properties. The specific properties are: instrumentation_key, table_columns, blob_max_delay, csv_default_value, serialized_event_field, blob_serialization, csv_separator. Template:
tables => { "table_id1" => { properties }, "table_id2" => { properties } }
Examples:
tables => { "6f29a89e-1385-4317-85af-3ac1cea48058" => { "instrumentation_key" => "76c3b8e9-dfc6-4afd-8d4c-3b02fdadb19f", "blob_max_delay" => 60 } }
tables => { "6f29a89e-1385-4317-85af-3ac1cea48058" => { "instrumentation_key" => "76c3b8e9-dfc6-4afd-8d4c-3b02fdadb19f", "blob_max_delay" => 60 }
"2e1b46aa-56d2-4e13-a742-d0db516d66fc" => { "instrumentation_key" => "76c3b8e9-dfc6-4afd-8d4c-3b02fdadb19f", "blob_max_delay" => 120, "ext" => "csv", "serialized_event_field" => "message" }
}
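The per-table properties override the global ones; the lookup can be sketched as a hash merge (the helper and the globals hash are illustrative):

```ruby
# Illustrative sketch: resolve the effective properties for a
# table_id, using the global settings as defaults and the per-table
# hash as overrides.
def table_properties(tables, table_id, global_defaults)
  global_defaults.merge(tables.fetch(table_id, {}))
end

globals = { "instrumentation_key" => "global-key", "blob_max_delay" => 60 }
tables  = { "6f29a89e-1385-4317-85af-3ac1cea48058" => { "blob_max_delay" => 120 } }
table_properties(tables, "6f29a89e-1385-4317-85af-3ac1cea48058", globals)
# => { "instrumentation_key" => "global-key", "blob_max_delay" => 120 }
```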
Environment variables
AZURE_STORAGE_ACCOUNT
Specifies the Azure storage account name. It is used by the plugin to fill the account name part of the storage_account_name_key property if it is missing. Example:
AZURE_STORAGE_ACCOUNT="my-storage-account"
AZURE_STORAGE_ACCESS_KEY
Specifies the Azure storage account access key. It is used by the plugin to fill the key part of the storage_account_name_key property if it is missing. Example:
AZURE_STORAGE_ACCESS_KEY="pfrYTwPgKyYNfKBY2QdF+v5sbgx8/eAQp+FFkGpPBnkMDE1k+ZNK3r3qIPqqw8UsOIUqaF3dXBdPDouGJuxNXQ=="
Setting up Http/Https Proxy
If you use a proxy server or firewall, you may need to set the HTTP_PROXY and/or HTTPS_PROXY environment variables in order to access Azure storage and Application Insights. Examples:
HTTP_PROXY=http://proxy.example.org
HTTPS_PROXY=https://proxy.example.org
If the proxy server requires a user name and password, include them in the following form:
HTTP_PROXY=http://username:[email protected]
If the proxy server uses a port other than 80, include the port number:
HTTP_PROXY=http://username:[email protected]:8080
Setting up SSL certificates
When using SSL/HTTPS, login or authentication may require a Certificate Authority (CA) certificate. If the required certificate is not already bundled in the system, it may be configured in the plugin (see ca_file above). Example:
ca_file => "/path/to/cafile.crt"
Getting Started for Contributors
If you would like to become an active contributor to this project please follow the instructions provided in CONTRIBUTING.md and DEVELOPER.md
Provide Feedback
If you encounter any bugs with the library please file an issue in the Issues section of the project.
Code of Conduct
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.