Database Fields

This tutorial describes the various counters and traffic views in the Traffic Sentinel database, and provides details on the individual fields.

Overview

The Traffic Sentinel database is used by most Traffic Sentinel pages, and it can also be accessed directly through the JavaScript API. It divides into information about the current state of the system, accessed via the Network class, and a set of per-minute historical views, accessed via the Query class. This tutorial focuses on the latter. The query views are presented as if they were flat tables, ordered by time, containing many fields (columns). The most important views are:

  • traffic - network traffic flows (with automatic de-duplication)

    (sourceaddress,destinationaddress,serverport,frames,bytes,...)

  • ifcounters - network interface counters

    (ifinoctets,ifoutoctets,ifinerrors,ifindiscards,iftype,ifspeed,ifstatus,...)

  • host - server, hypervisor, VM, container and application counters

    (hostname,osrelease,load_one,cpu_user,mem_used,diskpartmax,v_hostname,v_uuid,vcpu_pc,tcp_retrans_segs,http_method_get,...)

  • application - application transaction records

    (app_name,app_operation,httpurl,httphost,memcachekey,op_count,op_duration,op_bytes,...)

  • events - event log

    (timestamp,id,name,type,value,severity,address,ifindex,comment,...)

Agents

Most agents are embedded in vendor hardware products such as switches, routers and load-balancers and send sFlow or NetFlow/IPFIX feeds. Traffic Sentinel can also use SNMP to supplement these feeds by collecting additional measurements. The increasing availability of open-source software agents that send standard sFlow feeds is driving visibility into servers, hypervisors, VMs, containers, and applications. The agent field is available in most views.

Datasources

In the standard sFlow data model, every measurement comes from a particular datasource defined by agent IP address, ds_class and ds_index, and written as agent>ds_class:ds_index. For example the interface counters for interface 17 on switch 10.1.2.3 come from datasource 10.1.2.3>0:17 (or just 10.1.2.3>17). The Sentinel database reflects the sFlow data model so in most views the field datasource is available.
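
The datasource notation above can be taken apart with a few lines of JavaScript. This is a minimal sketch, not part of the Traffic Sentinel API: the helper name parseDatasource is hypothetical, and it assumes that the short form agent>ds_index implies ds_class 0, as the 10.1.2.3>17 example suggests.

```javascript
// Hypothetical helper: split an sFlow datasource string into its parts.
// Accepts "agent>ds_class:ds_index" and the short form "agent>ds_index",
// where a missing ds_class is assumed to be 0.
function parseDatasource(text) {
  const gt = text.indexOf(">");
  if (gt < 0) {
    throw new Error("not a datasource: " + text);
  }
  const agent = text.slice(0, gt);
  const parts = text.slice(gt + 1).split(":");
  // Expect one or two purely numeric parts after the ">".
  if (parts.length > 2 || parts.some((p) => !/^\d+$/.test(p))) {
    throw new Error("not a datasource: " + text);
  }
  const dsClass = parts.length === 2 ? Number(parts[0]) : 0;
  const dsIndex = Number(parts[parts.length - 1]);
  return { agent, dsClass, dsIndex };
}

console.log(parseDatasource("10.1.2.3>0:17"));
console.log(parseDatasource("10.1.2.3>17"));
```

Both calls describe the same datasource: agent 10.1.2.3, class 0, index 17.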

Keys and Values

Each field is either a key (e.g. ipsource) or a value (e.g. bytes). This affects how they are treated in a query. If the query selects "ipsource,bytes" then it will sum the bytes for each unique IP source address found. If the key is used with a function such as "count(ipsource)" or "first(ipsource)" then it behaves as a value.

Value fields can also be aggregated in different ways: "sum(bytes)" is the default, but "rate(bytes)", "mean(bytes)", "max(bytes)", "min(bytes)" and "sdev(bytes)" can be requested instead. Some of these aggregation functions take multiple arguments, such as "max(ifinoctets ifoutoctets)", "sum(ifinucasts ifinmulticasts ifinbroadcasts)" and "ratio(bytes frames)". For details see advanced scripting.
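
To make the key/value distinction concrete, the following sketch imitates the grouping behavior on a small in-memory array. It is only an illustration of the semantics described above and does not call the Traffic Sentinel query engine; the rows, the aggregateByKey helper and the aggregators table are all made up for the example.

```javascript
// Illustrative only: mimics how selecting "ipsource,bytes" groups rows by the
// key and sums the value. The rows and helper names are invented.
const rows = [
  { ipsource: "10.0.0.1", bytes: 1000 },
  { ipsource: "10.0.0.2", bytes: 400 },
  { ipsource: "10.0.0.1", bytes: 500 },
];

// A few aggregators, named after the query functions mentioned in the text.
const aggregators = {
  sum: (values) => values.reduce((a, b) => a + b, 0),
  max: (values) => Math.max(...values),
  min: (values) => Math.min(...values),
};

function aggregateByKey(rows, key, value, fn = "sum") {
  // Collect the values seen for each distinct key.
  const groups = new Map();
  for (const row of rows) {
    if (!groups.has(row[key])) groups.set(row[key], []);
    groups.get(row[key]).push(row[value]);
  }
  // Reduce each group with the chosen aggregator.
  const result = {};
  for (const [k, values] of groups) result[k] = aggregators[fn](values);
  return result;
}

console.log(aggregateByKey(rows, "ipsource", "bytes"));        // sum is the default
console.log(aggregateByKey(rows, "ipsource", "bytes", "max"));
```

With these sample rows the default call yields 1500 bytes for 10.0.0.1 and 400 for 10.0.0.2, while the "max" call yields 1000 and 400.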

Missing Fields

When a field is missing it will have the value "null" so you can test for it with a filter such as macsource=null. By default a query will ignore rows where one of the keys is null, but you can override this by setting query.nullsinkeys=true in a scripted query, or by asking for something like first(macsource) or last(macsource) in the select list so that the field is not treated as a key in the query.
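
The effect of null keys can be sketched in the same in-memory style. This is an assumption-laden model rather than the real implementation: the sumByKey helper is hypothetical, and its nullsInKeys parameter merely borrows its name from the query.nullsinkeys setting mentioned above.

```javascript
// Illustrative only: models the default of dropping rows whose key is null,
// and a flag (named after query.nullsinkeys) that keeps them instead.
function sumByKey(rows, key, value, nullsInKeys = false) {
  const totals = new Map();
  for (const row of rows) {
    const k = row[key] ?? null;            // treat a missing field as null
    if (k === null && !nullsInKeys) continue;
    totals.set(k, (totals.get(k) ?? 0) + row[value]);
  }
  return totals;
}

// Made-up rows: two of them carry no macsource.
const sample = [
  { macsource: "001122334455", bytes: 100 },
  { macsource: null, bytes: 40 },
  { bytes: 60 },
];

console.log(sumByKey(sample, "macsource", "bytes"));        // null rows skipped
console.log(sumByKey(sample, "macsource", "bytes", true));  // null rows kept
```

In the first call only the row with a MAC address contributes; in the second the two null rows are combined under a single null key.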

Packet-oriented feeds such as sFlow from switches and routers allow Traffic Sentinel to populate a wide range of fields, including layer-2 fields such as macsource and vlansource; layer 3-4 fields such as ipsource and tcpsourceport; fields carried inside a tunnel such as ipsource.1; packet-size-distribution counts such as sizejumbo; and deep-decode fields such as icmpunreachableport. However, when the feed is NetFlow/IPFIX it will typically be restricted to a subset of the outer layer 3-4 fields.

Certain fields such as sourcename and destinationname are missing from the most recent minutes because they depend on an external DNS lookup. They appear in the database a few minutes behind.

traffic fields

The full list of traffic fields is accessible here. The traffic fields can be divided into a number of distinct categories:

  1. Polymorphic Fields

    Some fields adapt their type on a flow-by-flow basis. The sourceaddress and destinationaddress fields can take the form of a MAC address, an IP address, an IPv6 address, or even an AppleTalk, DECnet or Fibre Channel address, depending on the flow being represented. Similarly, the sourceport and destinationport fields can appear as "UDP:6343" or "TCP:443" depending on the top-layer protocol detected (the protocol field). The fields clientaddress, serveraddress, clientport and serverport complete this category. They are described below.

  2. Protocol-Specific Fields

    If your query is for a particular protocol layer only then instead of using the polymorphic fields you can use the corresponding protocol-specific fields. Their names begin with their protocol. So ipsource will only ever be an IPv4 address or null, and ip6source will only ever be an IPv6 address or null. Other examples of protocol-specific fields are macsource, ipprotocol, tcpsourceport.

  3. Client-Server Fields

    The client-server fields adapt to the measured servicedirection. The service direction may be inferred by Traffic Sentinel in a number of different ways, such as by taking into account the protocolPriorities.txt config file and learning service-ports from application sFlow feeds. So in some rows the clientaddress field may be the same as the sourceaddress field and in other rows it may be the same as the destinationaddress field. To understand the relationships it can be helpful to query and examine the full set of fields with:

    query.select = "sourceaddress,destinationaddress,sourceport,destinationport," +
                   "servicedirection,clientaddress,serveraddress,clientport,serverport,frames";
    

    Additional client-server key fields include clientzone, clientgroup, serverzone and servergroup. They correspond in the same way to the sourcezone, sourcegroup, destinationzone and destinationgroup fields, which are described next.

  4. Zone/Group Fields

    The Traffic Sentinel configuration allows IP and IPv6 CIDRs to be used to carve IP address space into named <zone> and <group> sections. Overlapping CIDRs are captured by the longest-prefix match. These lookups are applied dynamically at query time when a zone/group field is selected. The fields are sourcezone,sourcegroup, destinationzone,destinationgroup and their client-server equivalents. If an address falls outside all CIDRs in the configuration it takes the special value "EXTERNAL". So you can filter to select off-site traffic like this:

    sourcezone!=EXTERNAL & destinationzone=EXTERNAL
    

    To understand these fields better it helps to create a custom Traffic>TopN chart and set the keys to sourcegroup and destinationgroup. Then you can refine your classification by adding more CIDRs to the configuration.

    A custom grouping by CIDR can be specified at query time using the "mask()" function. See advanced scripting.

    The zone/group hierarchy is also used to separate sFlow agents, using <agent> and <agentrange> sections as well as CIDRs. This gives rise to the database fields zone and group, which are available in most views. However, it is important to remember that zone and group are very different from sourcezone and sourcegroup. A filter such as group='Building C' will select all traffic through switches in Building C regardless of the source and destination addresses in the flows, while sourcegroup='Building C' will select traffic from the CIDRs in Building C, as seen by any switch in the network.

  5. Tunnel Fields

    When the monitoring feed is sFlow, Traffic Sentinel may encounter and decode tunnel encapsulations such as GRE, VXLAN and Geneve. In that case there will be a second "inner" IP protocol layer. These inner fields are accessible as fields with a subscript, e.g. macsource.1, ipsource.1, ipprotocol.1, serverport.1.

  6. DNS Lookup Fields

    Traffic Sentinel makes DNS requests to try to learn the names of the busiest sources of traffic. When those lookups succeed, the sourcename, destinationname, clientname and servername fields are populated. However, the lookups can take time to complete, so these fields do not appear in the database until a few minutes after "now". The derived fields sourcedomain and destinationdomain can also be queried. They are equivalent to the functions domain(sourcename 2) and domain(destinationname 2).

  7. Protocol Classification Fields

    Following the servicedirection classification, another configuration file protocolgroups.txt is then used to assign the identified service traffic to a user-defined category such as "file-transfer" or "web". This category appears as the protocolgroup field. So a particular flow may have protocol=TCP, serverport=TCP:443 and protocolgroup=web.

    If a network firewall or probe supplies a separate protocol classification string as part of its measurement feed, then it will appear as classification (like serverport) and/or classificationgroup (like protocolgroup).

  8. Country Lookup Fields

    The configuration file GeoIP.csv is used as a lookup to map IP addresses to their country of allocation. This results in the sourcecountry, destinationcountry, clientcountry and servercountry fields.

  9. Joined Host Fields

    When host-sFlow agents are running on the servers and hypervisors in the network, their metadata can be joined into the traffic database. This means you can ask for traffic by uuidsource or uuiddestination. These fields can also be used to bring in metadata from elsewhere. For example, lookup(uuidsource cloud.csv 4) will fetch the 4th field from the metadata file cloud.csv, which could be the OpenStack project name.

  10. Location Fields

    It is sometimes convenient to tie traffic to the port location of its source. These fields merge in the discovered end-host locations and make them available. sourcelocation and destinationlocation report the discovered port location of the source and destination address respectively. The sourcelocationagent, sourcelocationgroup and sourcelocationzone fields are derived from that. The macsourcelocation and macdestinationlocation fields operate the same way but specifically for the outer MAC layer fields macsource and macdestination.
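
The longest-prefix matching described under Zone/Group Fields, including the "EXTERNAL" fallback, can be sketched as follows. This is a simplified IPv4-only model and not the product's implementation: the CIDR table, the "Campus" group name and the helper names ipv4ToInt and lookupGroup are all invented for illustration.

```javascript
// Illustrative only: IPv4 longest-prefix match over a made-up CIDR table.
const cidrGroups = [
  { cidr: "10.0.0.0/8", group: "Campus" },
  { cidr: "10.1.0.0/16", group: "Building C" },
];

// Convert dotted-quad IPv4 text to an unsigned 32-bit integer.
function ipv4ToInt(address) {
  return address.split(".").reduce((acc, octet) => acc * 256 + Number(octet), 0);
}

function lookupGroup(address, table = cidrGroups) {
  const ip = ipv4ToInt(address);
  let best = null;
  for (const entry of table) {
    const [base, bitsText] = entry.cidr.split("/");
    const bits = Number(bitsText);
    // Compare the leading `bits` bits by dividing away the host part.
    const size = 2 ** (32 - bits);
    const matches = Math.floor(ip / size) === Math.floor(ipv4ToInt(base) / size);
    // Keep the match with the longest prefix.
    if (matches && (best === null || bits > best.bits)) {
      best = { bits, group: entry.group };
    }
  }
  // Addresses outside every CIDR take the special value EXTERNAL.
  return best === null ? "EXTERNAL" : best.group;
}

console.log(lookupGroup("10.1.2.3"));   // Building C
console.log(lookupGroup("10.9.9.9"));   // Campus
console.log(lookupGroup("192.0.2.1"));  // EXTERNAL
```

The address 10.1.2.3 falls inside both CIDRs, and the /16 wins because it is the longer prefix.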

ifcounters fields

The full list of ifcounters fields is accessible here. The ifcounters fields can also be divided into a number of distinct categories:

  1. Interface Keys

    The key fields relating to interfaces are mostly those of the SNMP MIB II interfaces table. The fields ifindex, ifname, ifdescr, ifalias, iftype, ifspeed have the same meaning as they do there except that ifspeed is a 64-bit integer in bits/sec. The ifstatus field combines ifAdminStatus and ifOperStatus into one number, or you can use ifadminup, ifadmindown, ifoperup and ifoperdown as separate boolean fields that take the value 1 or 0. The field interface is available (e.g. 10.1.2.3>17), as are the common datasource, agent, zone and group fields.

    Note that some versions may require the global.prefs setting "EnableIFN=YES" before the ifname, ifdescr and ifalias fields are available.

  2. Interface Counters

    Most of the fields in the ifcounters view fall into this category: per-minute counts of frames, bytes, broadcasts, multicasts, errors and discards for ingress and egress. The naming convention follows the SNMP MIB-II interfaces table, for example ifinoctets, ifoutoctets, ifinucasts and ifoutucasts. Ethernet interfaces may also offer Ethernet-specific counts such as fcserrors and symbolerrors, and Wi-Fi interfaces may offer Wi-Fi-specific counts such as wifiwepundecryptable and wifidiscardedfragments.

  3. Utilization Fields

    The derived field ifinutilization will compute "(rate(ifinoctets) x 100) / ifspeed". The ifoututilization will do the same for the egress direction. However it is common to want the max of these two so instead of asking for max(ifinutilization ifoututilization) you can just ask for rate(inoututilsecs) to get the same answer more efficiently.

    For utilization in Bits/sec, use rate(ifinoctets) and rate(ifoutoctets) then multiply the answer by 8.

  4. Switch Agent Metrics

    The ifcounters view also contains metrics that apply to the whole switch. They appear under datasource index dsindex=0x3FFFFFFF, meaning internal interface, or under dsindex=0, meaning whole agent. They include CPU fields such as cpu60, memory fields such as totalmemory and freememory, and environmental fields such as temperature and fans_failed.

    Some switch agents now report the full suite of host-sFlow fields, so they are also represented in the host view, described below.
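
The arithmetic behind the Utilization Fields can be worked through on a single per-minute sample. This sketch rests on two assumptions that are not stated in the text: that rate() means "per second" over a 60-second interval, and that the octet rate is converted to bits before it is compared with ifspeed so that the units match. The helper names and the sample numbers are made up.

```javascript
// Illustrative only: turns a per-minute octet count into bits/sec and a
// percentage of ifspeed. Assumes a 60-second interval and an octets-to-bits
// conversion before dividing by ifspeed (which is in bits/sec).
const INTERVAL_SECONDS = 60;

function bitsPerSecond(octetsInInterval) {
  return (octetsInInterval / INTERVAL_SECONDS) * 8;
}

function utilizationPercent(octetsInInterval, ifSpeedBitsPerSec) {
  return (bitsPerSecond(octetsInInterval) * 100) / ifSpeedBitsPerSec;
}

// Made-up sample: one minute of counters on a 1 Gbit/s interface.
const sample = { ifinoctets: 750000000, ifoutoctets: 1500000000, ifspeed: 1e9 };
const inPct = utilizationPercent(sample.ifinoctets, sample.ifspeed);
const outPct = utilizationPercent(sample.ifoutoctets, sample.ifspeed);

console.log(bitsPerSecond(sample.ifinoctets)); // ingress rate in bits/sec
console.log(Math.max(inPct, outPct));          // busier direction, in percent
```

Under these assumptions the sample interface carries 100 Mbit/s inbound and 200 Mbit/s outbound, so the busier direction is at about 20 percent.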

host fields

The full list of host fields is accessible here. They can be divided into the following categories:

  1. Key Performance Indicators

    The host view includes a rich set of standard performance indicators measuring load_average, CPU utilization, memory utilization, swapping activity, disk I/O activity, disk performance, disk utilization, network I/O, and TCP/UDP performance. Agents exist to report these numbers from a wide range of Operating Systems, implemented as open-source freeware.

  2. Host Keys

    The host view includes a rich set of keys that allows the data to be joined with network traffic data and also with orchestration-layer metadata (e.g. OpenStack, OVN). A host-sFlow agent will send these keys as part of the feed. The hostname, os_release and uuid fields identify the host, and the agent also sends the MAC address for each network adaptor.

  3. Virtual Machines and Containers

    When the host is a hypervisor running VMs or containers, the host-sFlow agent will also supply basic performance counters for each VM separately (e.g. vcpu_pc, vmem_total, vdiskcapacity). The key fields v_uuid and v_hostname are also available. If the hypervisor does not support host-sFlow, then running the host-sFlow agent on every VM is also possible. If the virtual switch supports sFlow, then enabling it will help to extend visibility into the traffic patterns on all vports.

  4. Java JVMs

    A special set of fields is populated when a Java JVM is instrumented with sFlow. For example, fields such as jvm_heap_used and jvm_heap_max allow profiling of the memory footprint, and jvm_gc_ms exposes the time spent garbage collecting.

  5. Application Sub-Agent Counters

    Independent sub-agents can be used to supplement the host-sFlow data. A sub-agent monitoring a web server application (e.g. apache, tomcat, nginx, node.js) will send the same standard HTTP sFlow as from a load-balancer, and those counters will appear in the host view. For example: http_method_get and http_status_4xx. For details, see this list of host-sFlow extensions.

application fields

The full list of application transaction fields is accessible here. The application view drives the Services>Trend pages. The fields divide into these categories:

  1. Common Transaction Fields

    Some fields are available for all kinds of application transactions. The layer-4 socket information is always supplied, so the ipclient, ipserver, clientport and serverport fields are available. The value fields include op_count (the total number of transactions), op_duration (the transaction times in milliseconds), op_reqbytes (bytes requested, client to server) and op_respbytes (response bytes).

  2. HTTP transactions

    Agents that send HTTP sFlow will send random 1-in-N sampled transactions. It is those transactions that appear in the application view (and are used by the Services>Trend>HTTP page). Fields include httpurl, httphost, httpuseragent and httpxff (the X-Forwarded-For header field). The derived fields httpxffip1 and httpxffcountry1 can be used to mine the X-Forwarded-For field for details on the original requesting client. Referrer details are accessible via httprefhost and httprefpath. HTTP sFlow is sent by some commercial load-balancer products, and agents exist for some open-source solutions.

  3. Memcached transactions

    The memcache protocol is often critical to data-center performance, so it is useful to track patterns of cache hits and cache misses. When memcached is instrumented with sFlow support then all the memcached "stats" are available in the host view, and this application view contains 1-in-N sampled transactions. So fields such as memcachecommand, memcachekey and memcachestatus are populated, along with the layer-4 socket information. Services>Trend>Memcached provides interactive browsing of the data.

  4. Generic Client-Server transactions

    Client-server applications that do not have protocol-specific sFlow structures can still be monitored using the generic client-server "enterprise" sFlow structures, as defined here. In addition to the common fields described above, freeform string fields are defined for app_name, app_operation, app_attributes, app_status, app_error, app_initiator and app_target. In turn, these operations can be set against a backdrop "context" in app_ctxt_name, app_ctxt_operation and app_ctxt_attributes.

events fields

The full list of events fields is accessible here. The events view can be used to aggregate event counts, or it can be used as a time-ordered log. The difference depends on whether the id field is included as a key in the query. Because id is a unique identifier for the event, including it turns the query into a log. The fields name, type, value, url and comment are freeform string fields whose meaning depends on the event type. The timestamp field records the time the event was raised.
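
The difference between aggregated counts and a log can be seen by counting a few in-memory events with and without id among the keys. This is a sketch of the idea only: the event rows, the event names and the countBy helper are made up, and nothing here calls the Traffic Sentinel API.

```javascript
// Illustrative only: counts rows per key tuple. The event rows are invented.
const events = [
  { id: 1, name: "threshold", severity: 3 },
  { id: 2, name: "threshold", severity: 3 },
  { id: 3, name: "linkdown", severity: 5 },
];

function countBy(rows, keys) {
  const counts = new Map();
  for (const row of rows) {
    // Build a composite key from the selected fields.
    const k = keys.map((name) => row[name]).join(",");
    counts.set(k, (counts.get(k) ?? 0) + 1);
  }
  return counts;
}

console.log(countBy(events, ["name"]));        // two entries: aggregated counts
console.log(countBy(events, ["id", "name"]));  // three entries: one per event
```

Because every id is unique, adding it to the keys prevents any two events from being merged, which is what makes the result read as a log.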
