YARA-L known issues and limitations
This document describes the known issues and limitations in YARA-L.
Outcome aggregations with repeated field unnesting
When a rule refers to a repeated field on an event variable and that repeated field contains more than one element, each element is unnested into a separate event row.
For example, the two IP address strings in the repeated field target.ip on event $e in the following rule are unnested into two instances of event $e, each with a different target.ip element.
rule outbound_ip_per_app {
meta:
events:
$e.principal.application = $app
match:
$app over 10m
outcome:
$outbound_ip_count = count($e.target.ip) // yields 2.
condition:
$e
}
Event record before repeated field unnesting
The following table shows the event record before repeated field unnesting:
metadata.id | principal.application | target.ip |
---|---|---|
aaaaaaaaa | Google SecOps | [192.0.2.20, 192.0.2.28] |
Event records after repeated field unnesting
The following table shows the event records after repeated field unnesting:
metadata.id | principal.application | target.ip |
---|---|---|
aaaaaaaaa | Google SecOps | 192.0.2.20 |
aaaaaaaaa | Google SecOps | 192.0.2.28 |
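The unnesting shown in the tables above can be modeled with a minimal Python sketch. It is an illustration only, not the rule engine's implementation: the `unnest` helper and the dictionary representation of the event are hypothetical.

```python
def unnest(event, field):
    """Return one copy of `event` per element of the repeated `field`.

    Hypothetical helper: models unnesting, not the actual rule engine.
    """
    return [{**event, field: value} for value in event[field]]


event = {
    "metadata.id": "aaaaaaaaa",
    "principal.application": "Google SecOps",
    "target.ip": ["192.0.2.20", "192.0.2.28"],
}

rows = unnest(event, "target.ip")

# count($e.target.ip) sees one row per unnested element.
outbound_ip_count = len(rows)
```

In this model, each row carries the same metadata.id and principal.application, and only target.ip differs between rows.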
When a rule refers to a repeated field that is a child of another repeated field, like security_results.action, unnesting happens at both the parent field level and the child field level. The resulting set of instances unnested from a single event is the Cartesian product of the elements in the parent field and the elements in the child field.
In the following example rule, event $e, which has two repeated values on security_results and two repeated values on each security_results.actions, is unnested into four instances.
rule security_action_per_app {
meta:
events:
$e.principal.application = $app
match:
$app over 10m
outcome:
$security_action_count = count($e.security_results.actions) // yields 4.
condition:
$e
}
Event record before repeated field unnesting
The following table shows the event record before repeated field unnesting:
metadata.id | principal.application | security_results |
---|---|---|
aaaaaaaaa | Google SecOps | [ { actions: [ ALLOW, FAIL ] }, { actions: [ CHALLENGE, BLOCK ] } ] |
Event records after repeated field unnesting
The following table shows the event records after repeated field unnesting:
metadata.id | principal.application | security_results.actions |
---|---|---|
aaaaaaaaa | Google SecOps | ALLOW |
aaaaaaaaa | Google SecOps | FAIL |
aaaaaaaaa | Google SecOps | CHALLENGE |
aaaaaaaaa | Google SecOps | BLOCK |
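Two-level unnesting can be sketched the same way. The following Python model is illustrative only; the `unnest_nested` helper and the dictionary layout are hypothetical and simply reproduce the four rows in the table above.

```python
def unnest_nested(event, parent, child):
    """Unnest a repeated `child` field inside a repeated `parent` field.

    Hypothetical helper: emits one row per (parent element, child element).
    """
    rows = []
    for element in event[parent]:
        for value in element[child]:
            # Copy the non-repeated fields and attach a single child value.
            row = {k: v for k, v in event.items() if k != parent}
            row[f"{parent}.{child}"] = value
            rows.append(row)
    return rows


event = {
    "metadata.id": "aaaaaaaaa",
    "principal.application": "Google SecOps",
    "security_results": [
        {"actions": ["ALLOW", "FAIL"]},
        {"actions": ["CHALLENGE", "BLOCK"]},
    ],
}

rows = unnest_nested(event, "security_results", "actions")

# count($e.security_results.actions) sees one row per unnested value.
security_action_count = len(rows)
```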
This unnesting behavior can produce unexpected outcome aggregations when a rule references one or more repeated fields whose parent field is also a repeated field. Non-distinct aggregations like sum(), array(), and count() cannot account for the duplicate values that unnesting produces on other fields of the same event. In the following example rule, event $e has a single hostname, google.com, but the outcome $hostnames aggregates over four unnested instances of the same event $e, each with a duplicate principal.hostname value. This outcome yields four hostnames instead of one due to the unnesting of repeated values on security_results.action.
rule security_action_per_app {
meta:
events:
$e.principal.application = $app
match:
$app over 10m
outcome:
$hostnames = array($e.principal.hostname) // yields 4.
$security_action_count = count($e.security_results.action) // yields 4.
condition:
$e
}
Event record before repeated field unnesting
The following table shows the event record before repeated field unnesting:
metadata.id | principal.application | principal.hostname | security_results |
---|---|---|---|
aaaaaaaaa | Google SecOps | google.com | [ { action: [ ALLOW, FAIL ] }, { action: [ CHALLENGE, BLOCK ] } ] |
Event records after repeated field unnesting
The following table shows the event records after repeated field unnesting:
metadata.id | principal.application | principal.hostname | security_results.action |
---|---|---|---|
aaaaaaaaa | Google SecOps | google.com | ALLOW |
aaaaaaaaa | Google SecOps | google.com | FAIL |
aaaaaaaaa | Google SecOps | google.com | CHALLENGE |
aaaaaaaaa | Google SecOps | google.com | BLOCK |
Workaround
Aggregations that ignore duplicate values or eliminate duplicate values are not affected by this unnesting behavior. Use the distinct version of an aggregation if you're encountering unexpected outcome values due to unnesting.
The following aggregations are not affected by the unnesting behavior described previously.
max()
min()
array_distinct()
count_distinct()
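To see why the distinct aggregations are unaffected, the following Python sketch models the four unnested rows from the hostname example and compares the two kinds of aggregation. The list and set operations are stand-ins for the YARA-L functions named in the comments, not their implementation.

```python
# Four unnested instances of the same event; only the action differs.
rows = [
    {"principal.hostname": "google.com", "security_results.action": action}
    for action in ["ALLOW", "FAIL", "CHALLENGE", "BLOCK"]
]

# Stand-in for array(): keeps every value, including duplicates.
hostnames = [row["principal.hostname"] for row in rows]

# Stand-in for array_distinct(): duplicates collapse.
# sorted() is used here only to make the result deterministic.
hostnames_distinct = sorted(set(hostnames))

# Stand-ins for count() and count_distinct().
hostname_count = len(hostnames)
hostname_count_distinct = len(set(hostnames))

# Stand-ins for max() and min(): duplicates do not change the result.
hostname_max = max(hostnames)
hostname_min = min(hostnames)
```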
Outcome aggregations with multiple event variables
If a rule contains multiple event variables, the aggregation includes a separate item for each combination of events included in the detection. For example, consider the following rule, run against the events listed after it:
events:
$e1.field = $e2.field
$e2.somefield = $ph
match:
$ph over 1h
outcome:
$some_outcome = sum(if($e1.otherfield = "value", 1, 0))
condition:
$e1 and $e2
event1:
// UDM event 1
field="a"
somefield="d"
event2:
// UDM event 2
field="b"
somefield="d"
event3:
// UDM event 3
field="c"
somefield="d"
The sum is calculated over every combination of events, enabling you to use both event variables in the outcome value calculations. The following elements are used in the calculation:
1: $e1 = event1, $e2 = event2
2: $e1 = event1, $e2 = event3
3: $e1 = event2, $e2 = event1
4: $e1 = event2, $e2 = event3
5: $e1 = event3, $e2 = event1
6: $e1 = event3, $e2 = event2
This results in a potential maximum sum of 6, even though $e2 can only correspond to 3 distinct events.
This affects sum, count, and array. For count and array, using count_distinct or array_distinct can solve the issue, but there is currently no workaround for sum.
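The combinations listed above can be reproduced with a short Python sketch. It models only the pairing of events to $e1 and $e2, not the predicates in the events section, and the event names are placeholders.

```python
from itertools import permutations

events = ["event1", "event2", "event3"]

# Every ordered assignment of two different events to ($e1, $e2).
pairs = list(permutations(events, 2))

# sum(if(..., 1, 0)) contributes at most 1 per pair, so this is the
# largest value the sum could reach in this model.
max_sum = sum(1 for _ in pairs)

# The number of distinct events that $e2 is ever bound to.
distinct_e2 = {e2 for _, e2 in pairs}
```

Here `max_sum` is 6 while `distinct_e2` holds only 3 events, which is the gap that count_distinct and array_distinct close for count and array.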
Parentheses at the start of an expression
Using parentheses at the start of an expression triggers the following error:
parsing: error with token: ")"
invalid operator in events predicate
The following example would generate this type of error:
($event.metadata.ingested_timestamp.seconds -
$event.metadata.event_timestamp.seconds) / 3600 > 1
The following syntax variations return the same result, but with valid syntax:
$event.metadata.ingested_timestamp.seconds / 3600 -
  $event.metadata.event_timestamp.seconds / 3600 > 1

1 / 3600 * ($event.metadata.ingested_timestamp.seconds -
  $event.metadata.event_timestamp.seconds) > 1

1 < ($event.metadata.ingested_timestamp.seconds -
  $event.metadata.event_timestamp.seconds) / 3600
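As a sanity check on the algebra, the following Python sketch evaluates the original expression and the three variations with exact rational arithmetic. It assumes real-number division; how YARA-L itself rounds or types these operations is not modeled, and the function names and timestamp values are made up for illustration.

```python
from fractions import Fraction


def original(ingested, event):
    # (ingested - event) / 3600 > 1
    return Fraction(ingested - event) / 3600 > 1


def variant_a(ingested, event):
    # ingested / 3600 - event / 3600 > 1
    return Fraction(ingested, 3600) - Fraction(event, 3600) > 1


def variant_b(ingested, event):
    # 1 / 3600 * (ingested - event) > 1
    return Fraction(1, 3600) * (ingested - event) > 1


def variant_c(ingested, event):
    # 1 < (ingested - event) / 3600
    return 1 < Fraction(ingested - event) / 3600


# Hypothetical event timestamp with ingestion delays of 30, 60, and 120 minutes.
event_ts = 1_700_000_000
results = {
    delay: [f(event_ts + delay, event_ts)
            for f in (original, variant_a, variant_b, variant_c)]
    for delay in (1800, 3600, 7200)
}
```

For each delay, all four entries in `results` agree: only the 7200-second delay exceeds one hour.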
Index array in outcome requires aggregation for single values on repeated field
Array indexing in the outcome section still requires aggregation. For example, the following does not work:
outcome:
$principal_user_dept = $suspicious.principal.user.department[0]
However, you can save the output of the array index in a placeholder variable and use that variable in the outcome section as shown here:
events:
$principal_user_dept = $suspicious.principal.user.department[0]
outcome:
$principal_user_department = $principal_user_dept
OR condition with non-existence
If an OR condition is applied between two separate event variables and the rule matches on non-existence, the rule compiles successfully but can produce false positive detections. For example, the following rule syntax can match events having $event_a.field = "something" even though it shouldn't.
events:
not ($event_a.field = "something" or $event_b.field = "something")
condition:
$event_a and #event_b >= 0
The workaround is to separate the conditions into two blocks where each block only applies the filter to a single variable as shown here:
events:
not ($event_a.field = "something")
not ($event_b.field = "something")
condition:
$event_a and #event_b >= 0
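Under ordinary Boolean logic, the split form is equivalent to the combined form by De Morgan's law, since separate lines in the events section are joined with an implicit AND. The Python sketch below checks that equivalence over all truth values. It does not model how the rule engine evaluates non-existence across two event variables, which is where the combined form misbehaves.

```python
from itertools import product


def combined(a, b):
    # not (A or B): the form that can produce false positives in a rule.
    return not (a or b)


def split(a, b):
    # (not A) and (not B): one filter per event variable.
    return (not a) and (not b)


# Compare both forms for every combination of truth values.
equivalent = all(
    combined(a, b) == split(a, b)
    for a, b in product([False, True], repeat=2)
)
```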
Arithmetic with unsigned event fields
If you try to use an integer constant in an arithmetic operation with a UDM field whose type is an unsigned integer, you will get an error. For example:
events:
$total_bytes = $e.network.received_bytes * 2
The field udm.network.received_bytes is an unsigned integer. The error occurs because integer constants default to signed integers, which cannot be combined with unsigned integers in arithmetic operations.
The workaround is to force the integer constant to a float, which then works with the unsigned integer. For example:
events:
$total_bytes = $e.network.received_bytes * (2/1)