dsc.conf

NAME
DESCRIPTION
CONFIGURATION
DATASETS
INDEXERS AND FILTERS
IP INDEXERS
IP FILTERS
DNS INDEXERS
DNS FILTERS
QNAME FILTERS
PARAMETERS
FILE NAMING CONVENTIONS
DATA FORMATS
GEOIP
EXAMPLE
FILES
SEE ALSO
AUTHORS
BUGS

NAME

dsc.conf − Configuration for the DNS Statistics Collector

DESCRIPTION

The file dsc.conf contains defaults for the program dsc(1). Each line is a configuration option and may have arguments in the form option value;. Comment lines must have a hash sign (#) in the first column.

Since dsc(1) version 2.2.0, a configuration line may not be split across lines with CR/LF, and backslash-quoted characters (\quote) are not understood.

CONFIGURATION

local_address IP [ MASK / BITS ] ;

Specifies the DNS server’s local IP address with an optional mask/bits for local networks. It is used to determine the direction of an IP packet: sending, receiving, or other. You may specify multiple local addresses by repeating the local_address line any number of times.

run_dir PATH ;

A directory that will become dsc’s current directory after it starts. Output files will be written here, as will any core dumps.

minfree_bytes BYTES ;

If the filesystem where dsc writes its output files does not have at least this much free space, then dsc will not write the files. This prevents dsc from filling up the filesystem. The files that would have been written are simply lost and cannot be recovered. Output files will be written again when the filesystem has the necessary free space.

pid_file " FILE " ;

The file where dsc will store its process id.

bpf_program " RULE " ;

A Berkeley Packet Filter program string. You may use this to further restrict the traffic seen but note that dsc currently has one indexer that looks at all IP packets. If you specify something like udp port 53 that indexer will not work.

However, if you want to monitor multiple DNS servers with separate dsc instances on one collector box, then you may need to use bpf_program to make sure that each dsc process sees only the traffic it should see.

Note that this directive must go before the interface directive because dsc makes only one pass through the configuration file and the BPF filter is set when the interface is initialized.

pcap_buffer_size NUMBER ;

Set the buffer size (in bytes) for pcap. Increasing this may help if you see packets dropped by the kernel, but increasing it too much may have other side effects.

Note that this directive must go before the interface directive because dsc makes only one pass through the configuration file and the pcap buffer size is set when the interface is initialized.

pcap_thread_timeout MILLISECONDS ;

Set the internal timeout that pcap-thread uses when waiting for packets. The default is 100 ms.

Note that this directive must go before the interface directive.

drop_ip_fragments ;

Drop all packets that are fragments.

Note that this directive must go before the interface directive.

interface IFACE | FILE ;

The interface name to sniff packets from or a pcap file to read packets from. You may specify multiple interfaces.

qname_filter NAME FILTER ;

This directive allows you to define custom filters to match query names in DNS messages. Please see section QNAME FILTERS for more information.

dataset NAME TYPE LABEL:FIRST LABEL:SECOND FILTERS [ PARAMETERS ] ;

This directive is the heart of dsc. However, it is also the most complex (see section DATASETS).

See section EXAMPLE for a good set of predefined datasets, which is also installed as dsc.conf.sample.

bpf_vlan_tag_byte_order TYPE ;

dsc knows about VLAN tags. Some operating systems (FreeBSD-4.x) have a bug whereby the VLAN tag id is byte-swapped. Valid values for this directive are host and net (the default). Set this to host if you suspect your operating system has the VLAN tag byte order bug.

match_vlan ID [ ID ... ] ;

A white-space separated list of VLAN identifiers. If set, only the packets belonging to these VLANs are analyzed.

statistics_interval SECONDS ;

Specify how often dsc should write statistics. The default is 60 seconds.

no_wait_interval ;

Do not wait for interval synchronization before starting to capture. Normally dsc sleeps for time() % statistics_interval to align with the minute (one minute was previously the fixed interval). If you change the interval to more than a minute, you can use this option to begin capturing right away.

output_format FORMAT ;

Specify the output format. This directive can be given multiple times to output in more than one format. The default output format is XML; see sections DATA FORMATS and FILE NAMING CONVENTIONS.

dump_reports_on_exit ;

Dump any remaining report before exiting.

NOTE: Timing in the data files will be off!

geoip_v4_dat " FILE " [ OPTION ... ] ;

Specify the GeoIP dat file to open for IPv4 country lookup, see section GEOIP for options.

geoip_v6_dat " FILE " [ OPTION ... ] ;

Specify the GeoIP dat file to open for IPv6 country lookup, see section GEOIP for options.

geoip_asn_v4_dat " FILE " [ OPTION ... ] ;

Specify the GeoIP dat file to open for IPv4 AS number lookup, see section GEOIP for options.

geoip_asn_v6_dat " FILE " [ OPTION ... ] ;

Specify the GeoIP dat file to open for IPv6 AS number lookup, see section GEOIP for options.

client_v4_mask NETMASK ;

Set the IPv4 MASK for client_subnet INDEXERS.

client_v6_mask NETMASK ;

Set the IPv6 MASK for client_subnet INDEXERS.
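To illustrate what the two mask directives mean for the client_subnet indexer, the sketch below applies a netmask to a client address with Python's ipaddress module. The helper name mask_client is hypothetical and this is not dsc's own code; it only shows the bitwise AND that such a mask implies, using the mask values from the EXAMPLE section.

```python
import ipaddress

def mask_client(addr: str, netmask: str) -> str:
    """Return addr with the bits outside netmask cleared (hypothetical helper)."""
    ip = ipaddress.ip_address(addr)
    mask = ipaddress.ip_address(netmask)
    if ip.version != mask.version:
        raise ValueError("address and mask must be the same IP version")
    # Bitwise AND keeps only the part of the address selected by the mask.
    return str(type(ip)(int(ip) & int(mask)))

# Documentation addresses, masked with the values from the EXAMPLE section.
print(mask_client("192.0.2.77", "255.255.255.0"))
print(mask_client("2001:db8::1234:5678",
                  "ffff:ffff:ffff:ffff:ffff:ffff:0000:0000"))
```

With these masks, all clients in the same IPv4 /24 or IPv6 /96 collapse into a single dataset value.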

DATASETS

A dataset is a 2-D array of counters. For example, you might have a dataset with “Query Type” along one dimension and “Query Name Length” on the other. The result is a table that shows the distribution of query name lengths for each query type.

A dataset has the following format:

dataset NAME TYPE LABEL:FIRST LABEL:SECOND FILTERS [ PARAMETERS ] ;

NAME

The name of the dataset. It must be unique and is used in the filename for the output files.

TYPE

The protocol layer. Available layers are ip and dns.

LABEL

The label of the dimension.

FIRST

The indexer for the first dimension, see the INDEXERS sections.

SECOND

The indexer for the second dimension, see the INDEXERS sections.

FILTERS

One or more filters, see the FILTERS sections.

PARAMETERS

Zero or more parameters, see section PARAMETERS.
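The field layout described above can be made concrete with a small parser. This is a minimal sketch, assuming whitespace-separated fields and a trailing semicolon as in the EXAMPLE section; the function parse_dataset and the dictionary keys are hypothetical names, and dsc's real parser may accept or reject input differently.

```python
def parse_dataset(line: str) -> dict:
    """Split one dataset line into its documented fields (illustrative only)."""
    words = line.strip().rstrip(";").split()
    if len(words) < 6 or words[0] != "dataset":
        raise ValueError("not a dataset line")
    name, layer, first, second, filters = words[1:6]
    # Each dimension is written as LABEL:INDEXER.
    label1, indexer1 = first.split(":", 1)
    label2, indexer2 = second.split(":", 1)
    # Anything after FILTERS is treated as an optional key=value parameter.
    params = dict(p.split("=", 1) for p in words[6:])
    return {
        "name": name,
        "type": layer,
        "first": (label1, indexer1),
        "second": (label2, indexer2),
        "filters": filters.split(","),
        "params": params,
    }

line = "dataset qtype_vs_tld dns Qtype:qtype TLD:tld queries-only,popular-qtypes max-cells=200;"
for key, value in parse_dataset(line).items():
    print(key, value)
```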

INDEXERS AND FILTERS

An indexer is simply a function that transforms the attributes of an IP/DNS message into an array index. For some attributes the transformation is straightforward. For example, the “Query Type” indexer simply extracts the query type value from a DNS message and uses this 16-bit value as the array index.

Other attributes are slightly more complicated. For example, the “TLD” indexer extracts the TLD of the QNAME field of a DNS message and maps it to an integer. The indexer maintains a simple internal table of TLD-to-integer mappings. The actual integer values are unimportant because the TLD strings, not the integers, appear in the resulting data.

When you specify an indexer on a dataset line, you must provide both the name of the indexer and a label. The label appears as an attribute in the output.

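For example, the following line labels the first dimension Foo (indexer foo) and the second dimension Bar (indexer bar), so Foo and Bar are the names that appear in the output:

dataset the_dataset dns Foo:foo Bar:bar queries-only;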

IP INDEXERS

dsc includes only minimal support for collecting IP-layer stats. Mostly we are interested in finding out the mix of IP protocols received by the DNS server. It can also show us if/when the DNS server is the subject of a denial-of-service attack.

ip_direction

One of three values: sent, recv or else. Direction is determined based on the setting for local_address in the configuration file.

ip_proto

The IP protocol type, e.g.: tcp, udp or icmp. Note that the bpf_program setting affects all traffic seen. If the program contains the word “udp” then you won’t see any counts for non-UDP traffic.

ip_version

The IP version number, e.g.: 4 or 6. Can be used to compare how much traffic comes in via IPv6 compared to IPv4.
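The relationship between local_address and the ip_direction indexer can be sketched as follows. The classification rule here is an assumption (source address local means sent, otherwise destination address local means recv, otherwise else); the document only states that direction is derived from local_address. The function name and the addresses are illustrative.

```python
import ipaddress

def ip_direction(src: str, dst: str, local_nets) -> str:
    """Classify a packet as sent, recv or else (assumed rule, not dsc's code)."""
    s, d = ipaddress.ip_address(src), ipaddress.ip_address(dst)

    def is_local(ip):
        # Only compare addresses and networks of the same IP version.
        return any(ip.version == n.version and ip in n for n in local_nets)

    if is_local(s):
        return "sent"
    if is_local(d):
        return "recv"
    return "else"

# Equivalent of: local_address 127.0.0.1; local_address ::1;
#                local_address 192.168.0.0 24;
local = [
    ipaddress.ip_network("127.0.0.1"),
    ipaddress.ip_network("::1"),
    ipaddress.ip_network("192.168.0.0/24"),
]

print(ip_direction("192.168.0.5", "198.51.100.7", local))
print(ip_direction("198.51.100.7", "192.168.0.5", local))
print(ip_direction("198.51.100.7", "203.0.113.9", local))
```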

IP FILTERS

Currently there is only one IP protocol filter: any. It includes all received packets.

DNS INDEXERS

certain_qnames

This indexer isolates the two most popular query names seen by DNS root servers: localhost and [a-m].root-servers.net.

client_subnet

Groups DNS messages together by the subnet of the client’s IP address. The subnet is masked by /24 for IPv4 and by /96 for IPv6. We use this to make datasets with large, diverse client populations more manageable and to provide a small amount of privacy and anonymization.

client

The IP (v4 and v6) address of the DNS client.

server

The IP (v4 and v6) address of the DNS server.

country

The country code of the IP (v4 and v6), see section GEOIP.

asn

The AS (autonomous system) number of the IP (v4 and v6), see section GEOIP.

do_bit

This indexer has only two values: 0 or 1. It indicates whether or not the “DO” bit is set in a DNS query. According to RFC 3225: Setting the DO bit to one in a query indicates to the server that the resolver is able to accept DNSSEC security RRs.

edns_version

The EDNS version number, if any, in a DNS query. EDNS Version 0 is documented in RFC 2671.

edns_bufsiz

The EDNS buffer size, grouped into chunks of 512 (0-511, 512-1023, etc.).

idn_qname

This indexer has only two values: 0 or 1. It returns 1 when the first QNAME in the DNS message question section is an internationalized domain name (i.e., containing non-ASCII characters). Such QNAMEs begin with the string xn--. This convention is documented in RFC 3490.

msglen

The overall length (size) of the DNS message.

null

A “no-op” indexer that always returns the same value. This can be used to effectively turn the 2-D table into a 1-D array.

opcode

The DNS message opcode is a four-bit field. QUERY is the most common opcode. Additional currently defined opcodes include: IQUERY, STATUS, NOTIFY, and UPDATE.

qclass

The DNS message query class (QCLASS) is a 16-bit value. IN is the most common query class. Additional currently defined query class values include: CHAOS, HS, NONE, and ANY.

qname

The full QNAME string from the first (and usually only) QNAME in the question section of a DNS message.

qnamelen

The length of the first (and usually only) QNAME in a DNS message question section. Note this is the “expanded” length if the message happens to take advantage of DNS message “compression”.

qtype

The query type (QTYPE) for the first QNAME in the DNS message question section. Well-known query types include: A, AAAA, A6, CNAME, PTR, MX, NS, SOA, and ANY.

query_classification

A stateless classification of “bogus” queries:

non-auth-tld

When the TLD is not one of the IANA-approved TLDs.

root-servers.net

A query for a root server IP address.

localhost

A query for the localhost IP address.

a-for-root

An “A” query for the DNS root (.).

a-for-a

An “A” query for an IPv4 address.

rfc1918-ptr

A PTR query for an RFC 1918 address.

funny-class

A query with an unknown/undefined query class.

funny-qtype

A query with an unknown/undefined query type.

src-port-zero

When the UDP message’s source port equals zero.

malformed

A malformed DNS message that could not be entirely parsed.

rcode

The RCODE value in a DNS response. The most common response codes are 0 (NO ERROR) and 3 (NXDOMAIN).

rd_bit

This indexer returns 1 if the RD (recursion desired) bit is set in the query. Usually only stub resolvers set the RD bit. Usually authoritative servers do not offer recursion to their clients.

tc_bit

This indexer returns 1 if the TC (truncated) bit is set (in a response). An authoritative server sets the TC bit when the entire response won’t fit into a UDP message.

tld

The TLD of the first QNAME in a DNS message’s question section.

second_ld

The Second LD of the first QNAME in a DNS message’s question section.

third_ld

The Third LD of the first QNAME in a DNS message’s question section.

transport

Indicates whether the DNS message is carried via UDP or TCP.

dns_ip_version

The IP version number that carried the DNS message.

dns_source_port

The source port of the DNS message.

dns_sport_range

The source port of the DNS message, grouped into chunks of 1024 (0-1023, 1024-2047, etc.).

qr_aa_bits

The qr_aa_bits indexer may be useful when dsc is monitoring an authoritative name server. It distinguishes DNS messages by each combination of the QR and AA bits, so a dataset using it counts the messages received with each combination. Normally the authoritative name server should receive only queries. If the name server is the target of a DNS reflection attack, it will probably receive DNS responses, which have the QR bit set.
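Two of the indexers above, edns_bufsiz and dns_sport_range, group values into fixed-size chunks. The arithmetic can be sketched as below. The helper name chunk_label is hypothetical, and the "low-high" label format is an assumption modeled on the ranges quoted in the descriptions, not a statement about dsc's exact output.

```python
def chunk_label(value: int, size: int) -> str:
    """Return the 'low-high' range of the fixed-size chunk containing value."""
    low = (value // size) * size
    return f"{low}-{low + size - 1}"

# EDNS buffer sizes fall into chunks of 512.
print(chunk_label(1232, 512))
print(chunk_label(4096, 512))

# Source ports fall into chunks of 1024.
print(chunk_label(53, 1024))
print(chunk_label(53211, 1024))
```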

DNS FILTERS

You must specify one or more of the following filters (separated by commas) on the dataset line. Note that multiple filters are ANDed together. That is, they narrow the input stream, rather than broaden it.

any

The no-op filter, counts all messages.

queries-only

Count only DNS query messages. A query is a DNS message where the QR bit is set to 0.

replies-only

Count only DNS response messages. A response is a DNS message where the QR bit is set to 1.

nxdomains-only

Count only NXDOMAIN responses.

popular-qtypes

Count only DNS messages where the query type is one of: A, NS, CNAME, SOA, PTR, MX, AAAA, A6, ANY.

idn-only

Count only DNS messages where the query name is in the internationalized domain name format.

aaaa-or-a6-only

Count only DNS messages where the query type is AAAA or A6.

root-servers-net-only

Count only DNS messages where the query name is within the root-servers.net domain.

chaos-class

Count only DNS messages where QCLASS is equal to CHAOS (3). The CHAOS class is generally used for only the special hostname.bind and version.bind queries.

priming-query

Count only DNS messages where the query type is NS and QNAME is “.”.

servfail-only

Count only SERVFAIL responses.

authentic-data-only

Count only DNS messages where the AD bit is set.
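Because comma-separated filters are ANDed, each additional filter can only shrink the set of counted messages. The sketch below models a few of the filters listed above as predicates over a simplified message dictionary. The dictionary keys (qr, qtype, qclass) and the function passes are hypothetical; dsc works on parsed DNS messages, not on Python dictionaries.

```python
# A few of the documented filters, expressed as predicates (illustrative only).
FILTERS = {
    "any": lambda m: True,
    "queries-only": lambda m: m["qr"] == 0,
    "replies-only": lambda m: m["qr"] == 1,
    "aaaa-or-a6-only": lambda m: m["qtype"] in ("AAAA", "A6"),
    "chaos-class": lambda m: m["qclass"] == 3,
}

def passes(message: dict, filter_spec: str) -> bool:
    """True only if the message satisfies every filter in the comma list."""
    return all(FILTERS[name](message) for name in filter_spec.split(","))

query = {"qr": 0, "qtype": "AAAA", "qclass": 1}
print(passes(query, "queries-only"))
print(passes(query, "queries-only,aaaa-or-a6-only"))
print(passes(query, "queries-only,chaos-class"))
```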

QNAME FILTERS

Defines a custom QNAME-based filter for DNS messages. If you refer to this named filter on a dataset line, then only queries or replies for matching QNAMEs will be counted. The FILTER argument is a regular expression that is matched against the QNAME. For example:

qname_filter WWW-Only ^www. ;
dataset qtype dns All:null Qtype:qtype queries-only,WWW-Only ;
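One detail of the pattern above is worth seeing in action: an unescaped dot matches any character, so ^www. also matches names such as wwwx.example.com. The sketch below uses Python's re module as a stand-in for whatever regular expression engine dsc is built with, which is an assumption; whether dsc accepts the bracket form shown here is not stated in this document.

```python
import re

loose = re.compile(r"^www.")     # the pattern from the example above
strict = re.compile(r"^www[.]")  # bracket expression: a literal dot only

for qname in ("www.example.com", "wwwx.example.com", "mail.example.com"):
    print(qname, bool(loose.search(qname)), bool(strict.search(qname)))
```

A bracket expression is used instead of a backslash because, as noted in DESCRIPTION, backslash-quoted characters are not understood in the configuration file.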

PARAMETERS

dsc currently supports the following optional parameters:

min-count=NN

Cells with counts less than NN are not included in the output. Instead, they are aggregated into the special values -:SKIPPED:- and -:SKIPPED_SUM:-. This helps reduce the size of datasets with a large number of small counts.

max-cells=NN

A different, perhaps better, way of limiting the size of a dataset. Instead of trying to determine an appropriate min-count value in advance, max-cells allows you to put a limit on the number of cells to include for the second dataset dimension. If the dataset has 9 possible first-dimension values, and you specify a max-cells value of 100, then the dataset will not have more than 900 total values. The cell values are sorted and the top max-cells values are output. Values that fall below the limit are aggregated into the special -:SKIPPED:- and -:SKIPPED_SUM:- entries.
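The effect of the two parameters on one row of a dataset (the second-dimension cells for a single first-dimension value) can be sketched as follows. The function names are hypothetical and the code is not taken from dsc. It assumes that the special entries are only emitted when something was actually skipped, and it follows the meaning given in DATA FORMATS: -:SKIPPED:- is the number of skipped cells and -:SKIPPED_SUM:- is the sum of their counts.

```python
def apply_min_count(cells: dict, min_count: int) -> dict:
    """Keep cells with count >= min_count and aggregate the rest."""
    kept = {k: v for k, v in cells.items() if v >= min_count}
    skipped = [v for v in cells.values() if v < min_count]
    if skipped:
        kept["-:SKIPPED:-"] = len(skipped)      # how many cells were dropped
        kept["-:SKIPPED_SUM:-"] = sum(skipped)  # total of their counts
    return kept

def apply_max_cells(cells: dict, max_cells: int) -> dict:
    """Keep the max_cells largest cells and aggregate the rest."""
    ranked = sorted(cells.items(), key=lambda kv: kv[1], reverse=True)
    kept = dict(ranked[:max_cells])
    skipped = [count for _, count in ranked[max_cells:]]
    if skipped:
        kept["-:SKIPPED:-"] = len(skipped)
        kept["-:SKIPPED_SUM:-"] = sum(skipped)
    return kept

row = {"com": 900, "net": 300, "org": 40, "xyz": 3, "dev": 2}
print(apply_min_count(row, 10))
print(apply_max_cells(row, 2))
```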

FILE NAMING CONVENTIONS

The filename is in the format:
${timestamp}.dscdata.${format}

For example:
1154649660.dscdata.xml
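When post-processing output files, the naming convention above can be split back into its parts. This is a minimal sketch with a hypothetical helper name, parse_dsc_filename; it assumes the timestamp component is in Unix seconds, as used for start_time and stop_time in the data files.

```python
from datetime import datetime, timezone

def parse_dsc_filename(name: str):
    """Split '<timestamp>.dscdata.<format>' into (timestamp, format)."""
    stamp, marker, fmt = name.split(".")
    if marker != "dscdata":
        raise ValueError(f"unexpected file name: {name}")
    return int(stamp), fmt

ts, fmt = parse_dsc_filename("1154649660.dscdata.xml")
# Render the Unix timestamp as a UTC date and time for readability.
print(datetime.fromtimestamp(ts, tz=timezone.utc).isoformat(), fmt)
```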

DATA FORMATS

XML

A dataset XML file has the following structure:

JSON

A dataset JSON file has the following structure:

{
"name": "dataset-name",
"start_time": unix-seconds,
"stop_time": unix-seconds,
"dimensions": [ "Label1", "Label2" ],
"data": [
{
"Label1": "D1-V1",
"Label2": [
{ "val": "D2-V1", "count": N1 },
{ "val": "D2-V2", "count": N2 },
{ "val": "D2-V3", "count": N3 }
]
},
{
"Label1": "D1-V2-base64",
"base64": true,
"Label2": [
{ "val": "D2-V1", "count": N1 },
{ "val": "D2-V2-base64", "base64": true, "count": N2 },
{ "val": "D2-V3", "count": N3 }
]
}
]
}

dataset-name, Label1, and Label2 come from the dataset definition.

The start_time and stop_time attributes are given in Unix seconds. They are normally 60 seconds apart. dsc usually starts a new measurement interval on 60-second boundaries. That is:

stop_time mod 60 == 0
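The boundary rule above can be worked through in a few lines. The function name next_boundary is hypothetical and the sketch only illustrates the arithmetic of aligning to an interval; it is not taken from dsc.

```python
def next_boundary(now: int, interval: int = 60) -> int:
    """Return the first multiple of interval that is strictly after now."""
    return now - (now % interval) + interval

now = 1154649615                   # 15 seconds past an interval boundary
stop_time = next_boundary(now)
print(stop_time, stop_time % 60)   # the remainder is 0 on a boundary
```

If now already lies on a boundary, this sketch returns the following one.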

The Label1 attributes (D1-V1, D1-V2) are values for the first dimension indexer. Similarly, the Label2 attributes (D2-V1, D2-V2, D2-V3) are values for the second dimension indexer. For some indexers these values are numeric, for others they are strings. If the value contains certain non-printable characters, the string is base64-encoded and the optional base64 attribute is set to 1/true.

There are two special vals that help keep large datasets down to a reasonable size: -:SKIPPED:- and -:SKIPPED_SUM:-. These may be present on datasets that use the min-count and max-cells parameters (see section PARAMETERS). -:SKIPPED:- is the number of cells that were not included in the output. -:SKIPPED_SUM:- is the sum of the counts for all the skipped cells.

Note that “one-dimensional datasets” still use two dimensions in the output. The first dimension type and value will be “All”.

The count values are always integers. If the count for a particular tuple is zero, it should not be included in the output.

Note that the contents of the output do not indicate where it came from. In particular, the server and node that it came from are not present.
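A consumer of the JSON format described in this section might flatten a dataset into (first value, second value, count) tuples. The sketch below is one way to do that, including the base64 handling. The sample document, its values, and the function names are made up for illustration; they follow the JSON structure shown earlier but are not real dsc output.

```python
import base64
import json

# Made-up sample following the documented JSON structure.
SAMPLE = """
{
  "name": "qtype_vs_tld",
  "start_time": 1154649600,
  "stop_time": 1154649660,
  "dimensions": ["Qtype", "TLD"],
  "data": [
    {"Qtype": "1", "TLD": [
      {"val": "com", "count": 12},
      {"val": "Zm9vAGJhcg==", "base64": true, "count": 1}
    ]}
  ]
}
"""

def decode(entry: dict, key: str):
    """Return entry[key], base64-decoded to bytes when the entry is flagged."""
    value = entry[key]
    if entry.get("base64"):
        return base64.b64decode(value)
    return value

def rows(doc: dict):
    """Yield (first value, second value, count) for every cell."""
    first, second = doc["dimensions"]
    for outer in doc["data"]:
        v1 = decode(outer, first)
        for cell in outer[second]:
            yield v1, decode(cell, "val"), cell["count"]

for row in rows(json.loads(SAMPLE)):
    print(row)
```

Decoded base64 values are returned as bytes rather than text, since they may contain non-printable characters.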

GEOIP

Country code and AS number lookup is available using the MaxMind GeoIP Legacy API if it was enabled during compilation.

Multiple options can be given for each database. They map directly to the options for GeoIP_open() but without the GEOIP_ prefix, for example:

geoip_v4_dat "/usr/local/share/GeoIP/GeoIP.dat" STANDARD MEMORY_CACHE;
geoip_asn_v6_dat "/usr/local/share/GeoIP/GeoIPASNumv6.dat" MEMORY_CACHE;

GeoIP documentation says:

STANDARD

Read database from file system. This uses the least memory.

MEMORY_CACHE

Load database into memory. Provides faster performance but uses more memory.

CHECK_CACHE

Check for updated database. If database has been updated, reload file handle and/or memory cache.

INDEX_CACHE

Cache only the most frequently accessed index portion of the database, resulting in faster lookups than GEOIP_STANDARD, but less memory usage than GEOIP_MEMORY_CACHE. This is useful for larger databases such as GeoIP Legacy Organization and GeoIP Legacy City. Note: for GeoIP Legacy Country, Region and Netspeed databases, GEOIP_INDEX_CACHE is equivalent to GEOIP_MEMORY_CACHE.

MMAP_CACHE

Load database into mmap shared memory. MMAP is not available for 32bit Windows.

EXAMPLE

local_address 127.0.0.1;
local_address ::1;
#local_address 127.0.0.0 255.0.0.0;
#local_address 192.168.0.0 24;
#local_address 10.0.0.0 8;

run_dir "/var/lib/dsc";

minfree_bytes 5000000;

pid_file "/run/dsc.pid";

# Example filters
#
#bpf_program "udp port 53";
#bpf_program "tcp port 53 or udp port 53";

# Use this to see only DNS *queries*
#
#bpf_program "udp dst port 53 and udp[10:2] & 0x8000 = 0";

#pcap_buffer_size 4194304;
#pcap_thread_timeout 100;
#drop_ip_fragments;
interface eth0;

dataset qtype dns All:null Qtype:qtype queries-only;
dataset rcode dns All:null Rcode:rcode replies-only;
dataset opcode dns All:null Opcode:opcode queries-only;
dataset rcode_vs_replylen dns Rcode:rcode ReplyLen:msglen replies-only;
dataset client_subnet dns All:null ClientSubnet:client_subnet queries-only max-cells=200;
dataset qtype_vs_qnamelen dns Qtype:qtype QnameLen:qnamelen queries-only;
dataset qtype_vs_tld dns Qtype:qtype TLD:tld queries-only,popular-qtypes max-cells=200;
dataset certain_qnames_vs_qtype dns CertainQnames:certain_qnames Qtype:qtype queries-only;
dataset client_subnet2 dns Class:query_classification ClientSubnet:client_subnet queries-only max-cells=200;
dataset client_addr_vs_rcode dns Rcode:rcode ClientAddr:client replies-only max-cells=50;
dataset chaos_types_and_names dns Qtype:qtype Qname:qname chaos-class,queries-only;
#dataset country_code dns All:null CountryCode:country queries-only;
#dataset asn_all dns IPVersion:dns_ip_version ASN:asn queries-only max-cells=200;
dataset idn_qname dns All:null IDNQname:idn_qname queries-only;
dataset edns_version dns All:null EDNSVersion:edns_version queries-only;
dataset edns_bufsiz dns All:null EDNSBufSiz:edns_bufsiz queries-only;
dataset do_bit dns All:null D0:do_bit queries-only;
dataset rd_bit dns All:null RD:rd_bit queries-only;
dataset idn_vs_tld dns All:null TLD:tld queries-only,idn-only;
dataset ipv6_rsn_abusers dns All:null ClientAddr:client queries-only,aaaa-or-a6-only,root-servers-net-only max-cells=50;
dataset transport_vs_qtype dns Transport:transport Qtype:qtype queries-only;
dataset client_port_range dns All:null PortRange:dns_sport_range queries-only;
#dataset second_ld_vs_rcode dns Rcode:rcode SecondLD:second_ld replies-only max-cells=50;
#dataset third_ld_vs_rcode dns Rcode:rcode ThirdLD:third_ld replies-only max-cells=50;
dataset direction_vs_ipproto ip Direction:ip_direction IPProto:ip_proto any;
#dataset dns_ip_version_vs_qtype dns IPVersion:dns_ip_version Qtype:qtype queries-only;
#dataset priming_queries dns Transport:transport EDNSBufSiz:edns_bufsiz priming-query,queries-only;
#dataset priming_responses dns All:null ReplyLen:msglen priming-query,replies-only;
#dataset qr_aa_bits dns Direction:ip_direction QRAABits:qr_aa_bits any;
#dataset servfail_qname dns ALL:null Qname:qname servfail-only,replies-only;
#dataset ad_qname dns ALL:null Qname:qname authentic-data-only,replies-only;

#statistics_interval 60;
#no_wait_interval;
output_format XML;
#output_format JSON;

#geoip_v4_dat "/usr/share/GeoIP/GeoIP.dat" STANDARD MEMORY_CACHE MMAP_CACHE;
#geoip_v6_dat "/usr/share/GeoIP/GeoIPv6.dat";
#geoip_asn_v4_dat "/usr/share/GeoIP/GeoIPASNum.dat" MEMORY_CACHE;
#geoip_asn_v6_dat "/usr/share/GeoIP/GeoIPASNumv6.dat" MEMORY_CACHE;

#client_v4_mask 255.255.255.0;
#client_v6_mask ffff:ffff:ffff:ffff:ffff:ffff:0000:0000;

FILES

/usr/local/etc/dsc/dsc.conf
/usr/local/etc/dsc/dsc.conf.sample

AUTHORS

Jerry Lundström, DNS-OARC
Duane Wessels, Measurement Factory / Verisign
Ken Keys, Cooperative Association for Internet Data Analysis
Sebastian Castro, New Zealand Registry Services

Maintained by DNS-OARC

https://www.dns-oarc.net/tools/dsc

BUGS

For issues and feature requests please use:

https://github.com/DNS-OARC/dsc/issues

For questions and help please use:

dsc@dns-oarc.net