Custom Monitoring using StatsD
Overview
StatsD is a popular standard for developing infrastructure and application plugins. A wide suite of standard plugins is available from the StatsD community.
The sfAgent StatsD plugin integrates with a StatsD client in the following way:
- Runs a daemon that listens on a UDP port for data sent by the StatsD client and accumulates all metrics received in the last N seconds (the flushInterval); see the client-side sketch after this list
- Translates the data from statsd format to SnappyFlow’s format
- Forwards the data to SnappyFlow with necessary tags
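For reference, the snippet below is a minimal sketch of what a StatsD client puts on the wire. It assumes the agent is listening on the default UDP port 8125 on the same host; the metric shown is taken from the examples later on this page.

import socket

# A StatsD client simply sends plain-text datagrams over UDP.
# sfAgent's statsd plugin listens on this port (8125 by default) and
# accumulates everything received during one flush interval.
STATSD_HOST, STATSD_PORT = "127.0.0.1", 8125

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
# Gauge datagram in the form <namespace>.<prefix>.<metric>:<value>|<metricType>
sock.sendto(b"Kafka1.General.numTopic:5|g", (STATSD_HOST, STATSD_PORT))
sock.close()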
Prerequisites
- Create a rules file for the StatsD client, or contact support@snappyflow.io to have a rules file created for a specific StatsD client.
Configuration
Manually add the configuration shown below to config.yaml under the /opt/sfagent/ directory.
key: <profile_key>
tags:
  Name: <name>
  appName: <app_name>
  projectName: <project_name>
metrics:
  plugins:
    - name: statsd
      enabled: true
      config:
        port: 8125
        flushInterval: 30
        ruleFile: /path/to/statsd-rules/file
port: The UDP port on which the StatsD client sends metrics. sfAgent runs a StatsD server listening on this port for the UDP datagrams. The default value is 8125.
flushInterval: The interval, in seconds, over which the plugin accumulates metrics. All metrics received in the last N seconds are sent to SnappyFlow as a single document.
ruleFile: Path to the user-generated StatsD rules file. Contact support@snappyflow.io to have a rules file created for a specific StatsD client.
Operating Instructions
Validate the StatsD configuration and the rules with the command below. This command must be run after any change to the StatsD rules file, followed by a restart of the sfAgent service.
sudo /opt/sfagent/sfagent -check-statsd
Creating Rules File
statsd metrics are expected in the format shown below
namespace.prefix.[type.]metric:value|metricType
Example
ClusterA.Kafka1.Topic1.Lag:500|g
In this case:
namespace = ClusterA
prefix = Kafka1
type = Topic1
metric = Lag
value = 500
metricType = g (gauge)
The field type is optional. If this field is present, it will produce a nested JSON; otherwise the resulting JSON will be flat.
Example
Kafka1.General.numTopic:5|g
In this case:
namespace = Kafka1
prefix = General
metric = numTopic
value = 5
metricType = g (gauge)
In special cases where the namespace is not present and the metrics start directly with the prefix, set namespace: none.
Supported datatypes are float, double, long, and integer.
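As a rough illustration of how these datagrams break down, the sketch below splits the two examples above into their fields. It is not sfAgent's actual parser; it assumes well-formed input and ignores the optional tag section described under Tagging.

def parse_datagram(datagram):
    # Split "<name>:<value>|<metricType>" into name, value and metric type.
    name, rest = datagram.split(":", 1)
    value, metric_type = rest.split("|", 1)
    parts = name.split(".")
    parsed = {"namespace": parts[0], "prefix": parts[1],
              "value": value, "metricType": metric_type}
    if len(parts) == 4:
        # Optional "type" segment present -> the resulting JSON is nested.
        parsed["type"], parsed["metric"] = parts[2], parts[3]
    else:
        # No "type" segment -> the resulting JSON is flat.
        parsed["metric"] = parts[2]
    return parsed

print(parse_datagram("ClusterA.Kafka1.Topic1.Lag:500|g"))  # nested form
print(parse_datagram("Kafka1.General.numTopic:5|g"))       # flat form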
Rule to create nested json: "NESTED"
Syntax
<json_key> = NESTED(namespace: <namespace>, prefix: <prefix_name>, key: <type_key>, metric: [<list of metrics along with datatypes>])
<json_key>: key of the final nested JSON.
<namespace>: this rule is applied to all metrics having this namespace.
<prefix_name>: this rule is applied to all metrics having this prefix.
<type_key>: adds a key:value pair in the nested JSON.
metric: specifies all the metrics to collect for this prefix, along with their datatypes.
Example
DB.host1.disk1.readLatency:20|g
DB.host1.disk1.writeLatency:50|g
DB.host1.disk2.readLatency:25|g
DB.host1.disk2.writeLatency:45|g
Rule
latency = NESTED(namespace: DB, prefix: host1, key: diskName, metric:[readLatency:float, writeLatency:float])
Output
"latency": [
{
"diskName": disk1,
"readLatency":20,
"writeLatency": 50
},
{
"diskName": disk2,
"readLatency":25,
"writeLatency": 45
}
]
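To make the grouping concrete, the following is a rough Python approximation of what the NESTED rule does with those datagrams. It is for explanation only and is not sfAgent's implementation.

from collections import defaultdict

datagrams = [
    "DB.host1.disk1.readLatency:20|g",
    "DB.host1.disk1.writeLatency:50|g",
    "DB.host1.disk2.readLatency:25|g",
    "DB.host1.disk2.writeLatency:45|g",
]

# Group metrics by the "type" segment (disk1, disk2, ...) and emit one
# JSON object per group, keyed by the rule's "key" field (diskName).
groups = defaultdict(dict)
for d in datagrams:
    name, rest = d.split(":", 1)
    value = float(rest.split("|", 1)[0])
    _namespace, _prefix, type_value, metric = name.split(".")
    groups[type_value][metric] = value

latency = [{"diskName": k, **v} for k, v in groups.items()]
# latency == [{"diskName": "disk1", "readLatency": 20.0, "writeLatency": 50.0},
#             {"diskName": "disk2", "readLatency": 25.0, "writeLatency": 45.0}]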
Rule to create flat json: "FLAT"
Syntax
<json_key> = FLAT(namespace: <namespace>, prefix: <prefix_name>, metric: <metric_name>)
<namespace>: this rule is applied to all metrics having this namespace.
<prefix_name>: this rule is applied to all metrics having this prefix.
<metric_name>: specifies all the metrics to collect for this prefix.
Example
Kafka1.System.cpuutil:10|g
Kafka1.System.ramutil:20|g
Rule
computeMetrics = FLAT(namespace: Kafka1, prefix: System, metric: [cpuutil:float, ramutil:float])
Output
"cpuutil": 10,
“ramutil”:20
"RENDER" Rule:
The extraction rules above extract a set of metrics from StatsD datagrams. These extracted metrics are grouped together into documents and shipped to SnappyFlow. The RENDER rule describes how metrics are grouped into a documentType.
Syntax
RENDER(_documentType: <doctype>, m1, m2,…mn) where m1..mn can be metric names or Rule names
Example
RENDER(_documentType: system, computeMetrics, latency) will create a document of documentType system:
{
  "plugin": "statsd",
  "documentType": "system",
  "cpuutil": 10,
  "ramutil": 20,
  "latency": [
    {
      "diskName": "disk1",
      "readLatency": 20,
      "writeLatency": 50
    },
    {
      "diskName": "disk2",
      "readLatency": 25,
      "writeLatency": 45
    }
  ]
}
Tagging
The sfAgent StatsD plugin is capable of parsing and forwarding the tags contained in the StatsD metric datagrams. Tags are expressed in different formats depending on whether the intended destination is Datadog, Influx, or Graphite.
Add the TAGTYPE rule in the StatsD rules file to enable the parsing. The default TAGTYPE is None, i.e. no custom tags are present. In each of the formats below, the tags are recognized and passed forward into SnappyFlow documents; a client-side sketch of the three formats follows the samples.
TAGTYPE = Datadog
Sample metric:
Cluster1.kafka1.cpuUtil:35|c|#_tag_appName:testApp1,_tag_projectName:apmProject,_documentType:cpuStats
TAGTYPE = Influx
Sample metric:
Cluster1.Kafka1.cpuUtil,_tag_appName=testApp1,_tag_projectName=apmProject,_documentType=cpuStats:35|c
TAGTYPE = Graphite
Sample metric:
Cluster1.Kafka1.cpuUtil;_tag_appName=testApp1;_tag_projectName=apmProject;_documentType=cpuStats:35|c
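The snippet below sketches how a client could append the same tags in each of the three formats. The tag keys and values follow the samples above, and the formatting is illustrative rather than a reference implementation.

metric, value, metric_type = "Cluster1.Kafka1.cpuUtil", 35, "c"
tags = {"_tag_appName": "testApp1",
        "_tag_projectName": "apmProject",
        "_documentType": "cpuStats"}

# Datadog style: "|#key:value,key:value" is appended after the sample.
datadog = f"{metric}:{value}|{metric_type}|#" + ",".join(f"{k}:{v}" for k, v in tags.items())

# Influx style: ",key=value" pairs are appended to the metric name.
influx = metric + "".join(f",{k}={v}" for k, v in tags.items()) + f":{value}|{metric_type}"

# Graphite style: ";key=value" pairs are appended to the metric name.
graphite = metric + "".join(f";{k}={v}" for k, v in tags.items()) + f":{value}|{metric_type}"

print(datadog)
print(influx)
print(graphite)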
Sidekiq Use-case
This section shows how to monitor Sidekiq using StatsD with sfAgent.
Description
We will use a simple Ruby on Rails application that shows data about endangered sharks.
- Two Sidekiq workers are configured: one to add the sharks' data and another to remove it, named AddEndangeredWorker and RemoveEndangeredWorker respectively.
- A Sidekiq StatsD client is also configured to collect the metrics.
- For this example, sidekiq-statsd by phstc is used as the client.
Installation
Skip this part if the statsd client is already configured.
- Follow this documentation to set up the Ruby on Rails application, if needed.
- To add the StatsD client:
  - Create a new file sidekiq.rb under config/initializers/ and add the configuration specified here.
  - Install the sidekiq-statsd gem (https://github.com/phstc/sidekiq-statsd#installation) and run the application.
Sample Metrics
Metrics are generated upon worker activation in the application.
Add endangered worker metrics
production.worker.AddEndangeredWorker.processing_time:1113|ms
production.worker.AddEndangeredWorker.success:1|c
production.worker.enqueued:0|g
production.worker.retry_set_size:0|g
production.worker.processed:69|g
production.worker.failed:0|g
production.worker.queues.default.enqueued:0|g
production.worker.queues.default.latency:0|g
Remove endangered worker metrics
production.worker.RemoveEndangeredWorker.processing_time:1472|ms
production.worker.RemoveEndangeredWorker.success:1|c
production.worker.enqueued:0|g
production.worker.retry_set_size:0|g
production.worker.processed:107|g
production.worker.failed:0|g
production.worker.queues.default.enqueued:0|g
production.worker.queues.default.latency:0|g
Rules
Follow the Rules User Guide section to understand the rules.
TAGTYPE = None
worker = NESTED(namespace: production, prefix: worker, key: worker_name, metric:[processing_time:double, success:float])
queues = NESTED(namespace: production, prefix: worker.queues, key: queue_name, metric:[enqueued:float, latency:float])
processedJobs = FLAT(namespace: production, prefix: worker, metric: processed:integer)
RENDER(_documentType: sidekiq, worker, queues, processedJobs)
sfAgent Configuration
Content of /opt/sfagent/config.yaml. The rules file is /opt/sfagent/statsd-rules.txt.
key: <profile_key>
tags:
  Name: <instance-name>
  appName: <app-name>
  projectName: <project-name>
metrics:
  plugins:
    - name: statsd
      enabled: true
      config:
        port: 8125
        flushInterval: 10
        ruleFile: '/opt/sfagent/statsd-rules.txt'
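To verify the setup end-to-end without waiting for the workers to run, the sample datagrams from the previous section can be replayed to the agent's StatsD port. This is a test sketch that assumes sfAgent is running locally with the configuration above.

import socket

SAMPLES = [
    b"production.worker.RemoveEndangeredWorker.processing_time:1472|ms",
    b"production.worker.RemoveEndangeredWorker.success:1|c",
    b"production.worker.processed:107|g",
    b"production.worker.queues.default.enqueued:0|g",
    b"production.worker.queues.default.latency:0|g",
]

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
for datagram in SAMPLES:
    # Each datagram lands on the UDP port configured above (8125) and is
    # processed at the next flush (flushInterval: 10 seconds).
    sock.sendto(datagram, ("127.0.0.1", 8125))
sock.close()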
Output
{
  "_documentType": "sidekiq",
  "_tag_Name": "vm",
  "queues": [
    {
      "latency": 0,
      "queue_name": "default",
      "enqueued": 0
    }
  ],
  "_plugin": "statsD",
  "processedJobs": 107,
  "worker": [
    {
      "processing_time": 1472,
      "worker_name": "RemoveEndangeredWorker",
      "success": 1
    },
    {
      "processing_time": 1113,
      "worker_name": "AddEndangeredWorker",
      "success": 1
    }
  ],
  "_tag_projectName": "statsDProject",
  "_tag_uuid": "080027957dd8",
  "time": 1616132931981,
  "_tag_appName": "statsDApp"
}