Advanced Scripting

This tutorial describes advanced scripting capabilities that greatly increase the scalability and speed of queries made using Traffic Sentinel's scripting mechanism. Before proceeding with this tutorial it is worth studying the Scripting Queries tutorial since it introduces the scripting capabilities of Traffic Sentinel.

Note: This tutorial describes features that were added in Traffic Sentinel 4.0. If you are running an older version of Traffic Sentinel you will need to upgrade before you can use these features.

Multi-Query

The multi-query mechanism runs multiple queries in parallel. Since disk access is typically the bottleneck on a multicore server, all the queries in a multi-query typically complete in almost the same time that it would take to complete just one of the queries.

For example, the following script takes an array of IP addresses and finds the top 5 servers and top 5 services accessed by each address. The script requires two queries for each address, for a total of 6 queries.

var hosts    = ["10.0.0.50","10.0.0.51","10.0.0.52"];
var interval = "today";
var n        = 5;

var report = Report.current();
var q, filter, t;
for each (var host in hosts) {
  report.heading("Client: " + host);
  filter = "ipclient = " + host;
  q = Query.topN("historytrmx",
                 "serveraddress,bytes",
                 filter,
                 interval,
                 "bytes",
                 n);
  t = q.run();
  report.table(t);

  q = Query.topN("historytrmx",
                 "serverport,bytes",
                 filter,
                 interval,
                 "bytes",
                 n);
  t = q.run();
  report.table(t);
}

The following script combines the six queries into a single multi-query, giving close to a 6 times improvement in execution speed.

var hosts    = ["10.0.0.50","10.0.0.51","10.0.0.52"];
var interval = "today";
var n        = 5;

var selects  = [];
var filters  = [];

var q = Query.topN("historytrmx",
                   selects,
                   filters,
                   interval,
                   "bytes",
                   n);
q.multiquery = true;

for each (var host in hosts) {
  var filter = "ipclient = " + host;

  selects.push("serveraddress,bytes");
  filters.push(filter);

  selects.push("serverport,bytes");
  filters.push(filter);
}

var tables = q.run();

var report = Report.current();
var i = 0;
for each (var host in hosts) {
  report.heading("Client: " + host);

  report.table(tables[i++]);
  report.table(tables[i++]);
}

A multi-query is specified by using arrays for one or more of the query properties. In addition, the multiquery property must be set to true. When a multi-query is run, it returns an array of results corresponding to the array(s) of parameter values. The only query parameters that cannot be specified as arrays are view, interval, and multiquery.
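
Because the results come back as an array that corresponds to the parameter arrays, it can help to record what each position means while the arrays are being built. The following is a minimal sketch in plain JavaScript; buildMultiQueryParams and its labels array are hypothetical names used for illustration and are not part of the Traffic Sentinel API.

```javascript
// Hypothetical helper (not a Traffic Sentinel function): builds the parallel
// select/filter arrays for a multi-query and records which (host, select)
// pair each array position corresponds to.
function buildMultiQueryParams(hosts, selectsPerHost) {
  var selects = [];
  var filters = [];
  var labels  = [];
  for (var h = 0; h < hosts.length; h++) {
    for (var s = 0; s < selectsPerHost.length; s++) {
      selects.push(selectsPerHost[s]);
      filters.push("ipclient = " + hosts[h]);
      labels.push({ host: hosts[h], select: selectsPerHost[s] });
    }
  }
  return { selects: selects, filters: filters, labels: labels };
}

var params = buildMultiQueryParams(
  ["10.0.0.50", "10.0.0.51", "10.0.0.52"],
  ["serveraddress,bytes", "serverport,bytes"]);
```

In a script like the one above, params.selects and params.filters would be passed to Query.topN, and params.labels[i] would then describe tables[i], assuming the results are returned in the same order as the parameter arrays.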

Result Streaming

Typically when you run a query you get a table of results. If the query generates a large table, the table will take up a lot of memory on the server. Result streaming is a way to process the query results row by row as they are generated, rather than requiring the entire table to be stored.

For example, suppose you want a query to return all IP addresses that sent more than 100 MBytes of traffic. You don't know how many addresses exceed the threshold, so you can't use the query truncate parameter to limit the results. Instead, you need to ask for traffic totals for every address and apply the threshold to identify the addresses you are interested in. The following script shows how result streaming can be used to avoid generating the large table of results.

var interval = "today";
var threshold = 100000000;

var q = Query.topN("historytrmx",
                   "ipsource,bytes",
                   null,
                   interval,
                   "bytes",
                   100000);

var t = q.run(
  function(row,table) {
    if(row[0] && row[1] >= threshold) table.addRow(row);
  }
);
t.printCSV(true);

Result streaming is invoked when a function is provided as an argument to the Query run() method. The function is then applied to each row of data. In this example, the function applies the threshold and only adds a row to the result table if the byte count meets or exceeds the threshold.
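
Since the row function is ordinary JavaScript, it can be written as a reusable function and checked against sample rows before it is used in a query. The sketch below assumes the same function(row, table) form as the example above; makeThresholdFilter and mockTable are hypothetical names, and mockTable is only a stand-in that mimics the addRow method.

```javascript
// Hypothetical factory: returns a row function of the form function(row, table)
// that keeps only rows with an address and a byte count at or above threshold.
function makeThresholdFilter(threshold) {
  return function(row, table) {
    // row[0] is the address, row[1] is the byte count
    if (row[0] && row[1] >= threshold) table.addRow(row);
  };
}

// Stand-in for the table that run() passes to the function; only addRow is mimicked.
var mockTable = {
  rows: [],
  addRow: function(row) { this.rows.push(row); }
};

var keepLarge = makeThresholdFilter(100000000);
var sampleRows = [
  ["10.0.0.50", 250000000],  // kept: above the threshold
  [null,        900000000],  // dropped: no address
  ["10.0.0.51", 42]          // dropped: below the threshold
];
for (var i = 0; i < sampleRows.length; i++) keepLarge(sampleRows[i], mockTable);
```

In a Traffic Sentinel script the same function would be passed directly to the query, for example q.run(makeThresholdFilter(threshold)).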

Note: The truncate value was set to 100000. A truncate value of -1 would have allowed any number of results to be returned. However, it is strongly recommended that a truncate value be set, even if it is very large, since it establishes a limit on the amount of memory that the query will need to use to store intermediate results.

Result streaming can be applied to a multi-query. The following example calculates the total bytes sent or received by each address and applies the threshold to the total.

var interval = "today";
var threshold = 100000000;

var q = Query.topN("historytrmx",
                   ["ipsource,bytes","ipdestination,bytes"],
                   null,
                   interval,
                   "bytes",
                   100000);
q.multiquery = true;

var totals = {};
var t = q.run(
  function(row,table) {
    if(row[0]) {
      if(totals[row[0]]) totals[row[0]] += row[1];
      else totals[row[0]] = row[1];
    }
  }
);

var result = Table.create(["Address","Bytes"],["address","double"]);
for (var addr in totals) {
   if(totals[addr] >= threshold) result.addRow([addr,totals[addr]]);
}
result.sort(1);
result.printCSV(true);

Finally, when using a multi-query, you may provide an array of functions as the argument to the Query run() method in order to apply a different function to each query.
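
As an illustration, the sent and received totals from the previous example could be kept separate by giving each query its own function. This is a sketch only: accumulateInto is a hypothetical helper, and it is an assumption that the functions are matched to the queries by array position. The calls at the end simulate rows by hand rather than running a query.

```javascript
var sent     = {};
var received = {};

// Hypothetical helper: returns a row function that adds row[1] (bytes)
// to the running total for row[0] (address) in the given totals object.
function accumulateInto(totals) {
  return function(row, table) {
    if (row[0]) totals[row[0]] = (totals[row[0]] || 0) + row[1];
  };
}

// One function per query, listed in the same order as the select array
// ["ipsource,bytes","ipdestination,bytes"] (assumed to match by position).
var callbacks = [accumulateInto(sent), accumulateInto(received)];

// Simulated rows, standing in for what each query would stream.
callbacks[0](["10.0.0.50", 100], null);
callbacks[0](["10.0.0.50", 50], null);
callbacks[1](["10.0.0.50", 7], null);
```

In a script, the array would be passed in place of the single function, as in q.run(callbacks).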
