HumanlogQL - Reference

The HumanLog query language (HumanlogQL) is a pipeline-based language meant to make exploring data easy and flexible. It is used for all aspects of querying data:

  • for historical queries
  • for stream queries
  • to author dashboard panels
  • to write alerting rules

The language is largely inspired by the Kusto query language, developed by Microsoft and used by various other analytical engines. We chose this language to help users reuse their existing knowledge (for those who already know Kusto), and to avoid forcing them to learn yet another single-domain language.

No sunk cost

Who wants to learn a query language that works with only one platform?

We want to make usage of humanlog as easy and valuable as possible. If you end up using another KustoQL-based technology later in your observability journey, the familiarity gained with HumanlogQL will serve you well.

High level concept

A query is a series of pipeline operations. Each operation in the pipeline is separated by a | character. You start by specifying an optional datasource (i.e. logs or traces), then pipe it through a series of operations.

logs | <operator_0> | <operator_1> | ... | <operator_N>

The datasource is optional. If not specified, it will be assumed to be logs.

<operator_0> | <operator_1> | ... | <operator_N>
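
For instance, the following two queries are equivalent (the filter operator and the msg column also appear in the examples at the end of this reference):

filter msg != ""

// same query, with the datasource spelled out
logs | filter msg != ""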
Nerdy fact: the empty query

The empty query:

// no query

is equivalent to querying the logs datasource for all rows.

// functional equivalent
logs | filter true

Querying in humanlog involves two important concepts: tables and rows. Like in other languages, you can use operators and functions.

  • A table is like a database table: it has columns with names and types, and rows. Tables are also referred to as tabular data.
  • A row is an entry in a table: for each table column, the row will contain a scalar value.

Functions and operators that operate on a whole table at once are called:

  • tabular operators: operators that work on a set of table rows (either the entire table, or a subset of it). Example: filter is a tabular operator that decides which input rows make it to the output.
  • aggregate functions: functions that work on a set of values inside table rows (either all rows in the table, or a subset of them). Example: count() is an aggregate function that counts the total number of rows in its input and emits that count as its output.
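
The two kinds compose naturally: a tabular operator can narrow the table down, and an aggregate function can then compute a value over the remaining rows (applied here through the summarize operator, which also appears in the examples at the end of this reference):

// keep only rows with a non-empty message, then count them
logs | filter msg != "" | summarize count()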

Functions and operators that operate on specific scalar values inside a table row are called:

  • scalar operators: operate with one row in scope at a time. They can operate on columns in the row. Think of your usual ==, !=, +, -, contains and so on. These are either binary or unary operators, like in most other programming languages. Example: msg == "hello world" uses the == operator and references the msg column and a "hello world" string literal.
  • scalar functions: operate with one row in scope at a time. They can reference columns in the row. Think of functions like strcat("hello", "world") which concatenates strings. These are essentially the same as operators, but are always invoked with a function name and parentheses, and can take 0-to-many arguments. Example: len(msg) > 0 uses the len function, which computes the length of its input. For strcat("hello", ", ", "world"), strcat is the function, which takes a variable number of string-typed arguments.
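
As a quick sketch, both kinds can appear as the condition of a filter (this assumes contains is used infix, like the other scalar operators listed above):

// scalar operator: keep rows whose message contains "error"
logs | filter msg contains "error"

// scalar function: keep rows with a non-empty message
logs | filter len(msg) > 0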

In our earlier example:

logs | <operator_0> | <operator_1> | ... | <operator_N>

Each <operator_N> in the pipeline is composed of a tabular operator followed by that operator's arguments. Each operator has its own syntax. You can find a complete list of all tabular operators here.

Pipeline

As you saw before, a query contains 0-to-many tabular operators separated by |. Like in UNIX pipes, the output of each operator is the input to the next. The first operator in the pipeline takes its input from the specified datasource (logs if not specified).
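
As a rough sketch of how data flows through a pipeline (using the request_id column from the examples at the end of this reference, and assuming contains can be used infix like the other scalar operators):

// request IDs with the most "error" messages first
logs
| filter msg contains "error"
| summarize errors=count() by request_id
| sort by errors desc

Here filter receives the full logs table, summarize receives only the rows that passed the filter, and sort by receives one row per request_id.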

Logs and Spans

Humanlog has ✨special✨ support for specific types of tabular data.

  • Logs
  • Spans (in distributed tracing, spans make up a trace).
  • ... more special types may come later.

The original datatype of a datasource is well known, so if the tabular operators do not change the shape of the data (no columns are removed or added), humanlog detects that the datatype is unchanged and provides an enhanced presentation for it. In concrete terms:

Logs get prettified: just like in the CLI, your logs will be printed in easy-to-read (in our opinion) fashion.

<insert example of pretty logs>

Spans get prettified: special elements are rendered for spans, which makes them easier to dig into.

<insert example of pretty spans>
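
As a rough sketch of the shape rule above: a query that only filters presumably keeps the logs shape intact, so the prettified rendering still applies, whereas an aggregation produces a brand new table and falls back to plain tabular rendering:

// shape preserved: the output rows are still logs
logs | filter msg != ""

// shape changed: the output is a new single-column table, no longer detected as logs
logs | summarize count()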

If you have requests for special presentation modes, please let us know!

Tables

As with logs and spans, we try to detect certain interesting data shapes and render them in a useful manner. For instance, when we detect data with timestamps as the first column and histograms as the second column, we will attempt to render the data as a heatmap.

<insert example of heatmap>

Examples

Filter for logs with messages:

logs | filter msg != ""

Count of spans in the last hour:

traces | where time > ago(1h) | summarize count()

Trace IDs for the traces containing the most spans, in the last hour:

traces
| filter time > ago(1h)
| summarize span_count=count() by trace_id
| sort by span_count desc

Find all logs with a specific request ID:

where request_id == "<some_uuid>"