Introduction
Bronto supports a subset of SQL for searching and analyzing log data.
⚠️ This is not full ANSI SQL — advanced features such as subqueries, joins, and complex expressions are not supported. The focus is on simple WHERE-style filtering with comparison, boolean, and pattern-matching operators.
The search bar corresponds to the WHERE
clause of a SELECT
SQL statement, and can be used to filter the results of your query. For example, you can search your firewall logs for users accessing a specific destination with
direction='outbound' AND destination_address='52.214.86.65'
which retrieves all events that have an attribute named direction
with a value equal to 'outbound'
and an attribute destination_address
equal to '52.214.86.65'
.
The single quotes in the above example are important, as otherwise “outbound
” would be interpreted as an attribute rather than a string value.
Bronto automatically parses the attributes in your log data, regardless of whether that log data is structured, semi-structured, or completely unstructured. You can view the attributes for a log event in the inspector when you click on a log event.
Reserved Attributes
Attributes that begin with @
are reserved attributes that contain some special information:
attribute | | Description |
---|
@raw | | The full text of a log event in its raw form. |
@origin | | The ip address from which Bronto received the event. |
Implicit Expansion of Bare Words
To make free-text search queries simple and concise, Bronto provides a syntactic sugar that is not part of standard SQL which means that simply typing error
in the search bar will match all log events that contain “error” in their text.
Specifically: any character string that appears on its own is automatically expanded into an ILIKE
expression that checks whether the string appears anywhere in the log event.
To search for any event that contains "username"
as well as either "500"
or "404"
, you can query with
username AND (500 OR 404)
In this example, the query filter is automatically expanded to
@raw ILIKE '%username%' AND (@raw ILIKE '%500%' OR @raw ILIKE '%404%')
Quoted and unquoted strings behave the same way in this context, so quotes may be omitted (unless the search pattern contains special characters).
Boolean Operators
Boolean operators combine the result of other expressions that return a value of TRUE
, FALSE
or UNKNOWN
.
Boolean Operator | Example | Description |
---|
AND | expr1 AND expr2 | Combines two expressions and returns TRUE when both expressions are TRUE . |
OR | expr1 OR expr2 | Combines two expressions and returns TRUE when either expressions is TRUE . |
NOT | NOT expr | Negates the result of the expression. |
Comparison Operators
Comparison operators compare one expression or value with another, and return a value of TRUE
, FALSE
or UNKNOWN
.
Comparison Operator | Example | Description |
---|
[NOT] LIKE | attribute LIKE '%value%' | Returns TRUE if the left operand matches the specified pattern (right operand). The pattern can include regular characters or the % wildcard, which matches 0 or more of any character. For example, attribute LIKE 'foo%' returns any log events that start with “foo”. The regular characters specified in the pattern are case sensitive and must match exactly. This operator is only applicable to the STRING data type. The result is negated if NOT appears before the LIKE operator. |
[NOT] ILIKE | attribute ILIKE '%value%' | Same behaviour as the LIKE operator, uppercase/lowercase differences are ignored when matching a pattern. |
= | attribute = value | Compares the equality of two expressions. Returns TRUE if both expressions are equal to each other. |
<> | attribute <> value | Compares the equality of two expressions. Returns TRUE if the left operand is not equal to the right operand. |
!= | attribute != value | Same as the <> operator. Compares the equality of two expressions. Returns TRUE if the left operand is not equal to the right operand. |
< | attribute < value | Compares two expressions. Returns TRUE if the left operand has a lower value than the right operand, otherwise the result is False . |
<= | attribute <= value | Compares two expressions. Returns TRUE if the left operand has a value lower than or equal to the right operand, otherwise the result is False . |
> | attribute > value | Compares two expressions. Returns TRUE if the left operand has a greater value than the right operand, otherwise the result is False . |
>= | attribute >= value | Compares two expressions. Returns TRUE if the left operand has a value greater than or equal to the right operand, otherwise the result is False . |
IS [NOT] NULL | attribute IS NULL | Returns TRUE when the expression evaluates to NULL , for example, myAttribute IS NULL will return TRUE if a log event does not contain an attribute with the name myAttribute . The IS NOT NULL operator is equivalent to NOT <expression> IS NULL . |
Operator Precedence
When an expression has multiple operators, the operator precedence determines the sequence of operations. Operators have the precedence level shown in the table below, with level 1
being the highest precedence, and 4
being the lowest. An operator with a higher precedence is evaluated before an operator with a lower precedence.
Level | Operators |
---|
1 | = , != , <> , < , > , <= , >= |
2 | [NOT] LIKE , [NOT] ILIKE , IS [NOT] NULL |
3 | NOT |
4 | AND |
5 | OR |
Use parentheses to override the defined operator precedence in an expression.
Quoting & Escaping
In SQL mode, different types of quotes serve distinct purposes to ensure your queries work correctly.
Quote Types
Quote Type | Purpose | Example |
---|
Single quotes ('' ) | STRING literals (values) | 'error' |
Double quotes ("" ) | Attribute names | "status_code" |
Using Quotes Correctly
STRING Literals
Always use single quotes around values:
'error'
'timeout exceeded'
'ireland'
Attributes
Wrap attributes in double quotes:
"status_code"
"geo_country"
"request-time"
Unquoted strings of characters are always be interpreted as an attributes.
However, if your attribute contains any of the following special characters, you must wrap it in double quotes:
=, !, <, >, -, +, *, /, %, ^, &, |
For example:
In this example, "request-time"
refers to an attribute that includes a dash (-
).
Escaping Quotes
Quotation marks (both single '
and double "
) can be escaped by doubling them up, for example 'O''Brien'
is a STRING
literal with an apostrophe (O'Brien
).
Data Types & Conversions
Supported Data Types
Any expression in a query has a related data type which defines how they will be treated by operators.
Data type | Description | Literal Value |
---|
NUMERIC | Exact number data type for integer data. Valid for integers in the range -2^63 to 2^63-1 . | Any unquoted sequence of digits. |
DOUBLE | Approximate number data type for decimal data. Precise to 15 significant decimal digits. | Any unquoted sequence of digits with a single . character anywhere in that sequence. |
STRING | Data type for variable length character strings that can consist of letters, numbers, and symbols. | Any sequence of characters surrounded by single quotes. |
Implicit Data Type Conversions
Bronto automatically converts a value from one data type to another when such a conversion makes sense, for example, if your log event has an attribute duration=0.1291
it will match both duration<1
(LONG
literal) and duration<'1'
(STRING
literal).
The rules for implicit data type conversion for binary operators are as follows:
Precedence | Condition | Implicit Conversion |
---|
1 | Both sides are attributes. | No implicit conversion takes place. |
2 | One side is of type DOUBLE and the other side can be converted to DOUBLE . | Other side converted to DOUBLE . |
3 | One side is of type LONG and the other side can be converted to LONG . | Other side converted to LONG . |
4 | One side is an attribute with a STRING value and the other side is of type LONG or DOUBLE . | No implicit conversion takes place. Due to the schemaless nature of Bronto it is possible for an attribute to hold values with differing datatypes. This rule exists to avoid lexicographical STRING comparison when a numerical comparison was intended, for example, this means that if your log event has attribute duration equal to 'null' then duration>100 will NOT return TRUE (as it would under a lexicographical string comparison). |
5 | One side is of type STRING . | Other side is converted to STRING . |
6 | None of the previous conditions match. | No implicit conversion takes place. |
String & Text Functions
String functions perform operations on a string input value and return a string or numeric value.
Bronto supports the following string functions.
Function | Example | Description |
---|
LOWER | LOWER('Error') | Returns a string after converting any uppercase characters to lowercase. The function can be applied to a constant string or a column. |
Regular Expressions
Regular expression operators are currently an early access feature which is disabled by default. Reach out to support if you would like to enable this feature for your account.
Regular expressions provide a powerful and flexible way to perform pattern matching. Compared to simple string matching, regex provides rich wildcards such as .
(any character), \d
(any number), \s
(any whitespace), and [xyz]
(any character that is either x
, y
or z
), and repetition operators: *
matches a sequence of zero or more string; +
matches one or more; ?
matches zero or none. For example, the regex cat*
matches cat
followed by any characters (or none at all). For example,
-
cat.*
matches "catapult"
-
cat.*
matches "The cat sat"
-
h[aeiou]llo
matches any of "hallo", "hello", "hollo"
-
(?i)cat
matches "cat"
, "CAT"
, "cAt"
((?i)
is the case-insensitive inline modifier)
-
[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}
matches email addresses.
The precise regex syntax supported By Bronto is RE2, for which you can find the specification here. RE2 does not support some features provided by other regex variants (e.g., PCRE), such as positive & negative lookahead/lookbehind.
When using regex to query your log data, the regex pattern does not need to match the entire event exactly. For example, the pattern colou?r
will match the log event “My favourite colour is red” - it is not necessary to write .*colou?r.*
.
Operator | Example | Description |
---|
[NOT] REGEXP | attribute REGEXP '197\.149\.[12][0-59][0-9]\.[0-2].*' | Returns TRUE if the left operand matches the specified regex pattern (right operand). This operator is only applicable to the STRING data type. The result is negated if NOT appears before the REGEXP operator. |
[NOT] RLIKE | attribute RLIKE '197\.149\.[12][0-59][0-9]\.[0-2].*' | Same behaviour as the REGEXP operator. |
~ | attribute ~ value | Same behaviour as the REGEXP operator. |
!~ | attribute !~ value | Same behaviour as the NOT REGEXP operator. |
~* | attribute ~* value | Same behaviour as the REGEXP operator, but with a case insensitive modifier. |
!~* | attribute !~* value | Same behaviour as the REGEXP operator, but with a case insensitive modifier. |
Attribute Parsing With Regex Capture Groups
If Bronto has not automatically parsed the attributes in your data (for example, if you have an unstructured custom log format) and you have not set up the GROK parsing rules for your data source, then you can use named regex capture groups to parse attributes in your data.
A named capture groups is declared by using the following syntax:
where PATTERN
is a regex pattern, and name
denotes the attribute you want to parse.
For example, suppose you are working with this unstructured custom log:
20171223-22:15:36:154|Step_ExtSDM|30002312|calculateCaloriesWithCache totalCalories=127032
20171223-22:15:36:157|Step_ExtSDM|30002312|calculateAltitudeWithCache totalAltitude=240
20171223-22:15:36:157|Step_StandReportReceiver|30002312|REPORT : 7019 5011 150346 240
20171223-22:15:36:356|Step_LSC|30002312|onStandStepChanged 3592
20171223-22:15:36:365|Step_LSC|30002312|onExtend:1514038537000 1 0 4
20171223-22:15:36:657|Step_SPUtils|30002312| getTodayTotalDetailSteps = 1514038440000##7019##548365##8661##13406##27180468
20171223-22:15:36:657|Step_SPUtils|30002312|setTodayTotalDetailSteps=1514038440000##7020##548365##8661##13501##27180976
20171223-22:15:36:668|Step_ExtSDM|30002312|calculateCaloriesWithCache totalCalories=127054
20171223-22:15:36:672|Step_ExtSDM|30002312|calculateAltitudeWithCache totalAltitude=240
20171223-22:15:36:673|Step_StandReportReceiver|30002312|REPORT : 7020 5012 150368 240
20171223-22:15:37:351|Step_LSC|30002312|onStandStepChanged 3593
Each event has some structure - there are individual fields delimited by the ”|” character, specifically:
-
the timestamp,
-
the component name,
-
the process id,
-
an unstructured message.
We can parse these attributes at query-time by running the following regex query:
@raw REGEXP '^(?<timestamp>.*?)\|(?<component_name>.*?)\|(?<pid>.*?)\|(?<message>.*?)$'
Named capture groups may only be used:
-
In aggregate functions, e.g.,
count
, min
, max
.
-
In the group by clause.
-
For column selection.
Named capture groups cannot be reused as part of the query filter, for example it is not possible to write a query likepid=30002312 AND @raw REGEXP '^(?<timestamp>.*?)\|(?<component_name>.*?)\|(?<pid>.*?)\|(?<message>.*?)$'
Now suppose that we are specifically interested in the log events which read REPORT
:
20171223-22:15:36:157|Step_StandReportReceiver|30002312|REPORT : 7019 5011 150346 240
20171223-22:15:36:673|Step_StandReportReceiver|30002312|REPORT : 7020 5012 150368 240
which contain additional information on:
-
the number of steps,
-
the report number,
-
the total calories burned,
-
the altitude.
And suppose we want to find the maximum altitude across some time period. First we could restrict the filter to match only the events which contain “REPORT”:
@raw LIKE '%REPORT%' AND @raw REGEXP '^(?<timestamp>.*?)\|(?<component_name>.*?)\|(?<pid>.*?)\|(?<message>.*?)$'
And then we could use nested capture groups to further parse the message, by changing the (?<message>.*?)
element to
(?<message>REPORT : (?<num_steps>\d+) (?<report_num>\d+) (?<total_cals_burned>\d+) (?<altitude>\d+))
so that the full query is now
@raw LIKE '%REPORT%' AND @raw REGEXP '^(?<timestamp>.*?)\|(?<component_name>.*?)\|(?<pid>.*?)\|(?<message>REPORT : (?<num_steps>\d+) (?<report_num>\d+) (?<total_cals_burned>\d+) (?<altitude>\d+))$'
With this regex filter, the new attributes “timestamp”, “report_num”, “total_cals_burned”, and “altitude” become available, and we can easily apply a max(altitude)
function to our query.