Text patterns
Text patterns are the most common type of pattern in YARA rules. They are plain text strings, as in the following example:
rule TextExample {
    strings:
        $text = "foobar"
    condition:
        $text
}
This is the simplest case: an ASCII-encoded, case-sensitive string. However, text patterns can have modifiers that alter the way in which the pattern will be interpreted. Those modifiers are appended at the end of the pattern definition, as discussed below.
“nocase” modifier
Text patterns in YARA are case-sensitive by default, but you can turn your
pattern into a case-insensitive one by appending the nocase modifier at the
end of the pattern definition:
rule CaseInsensitiveTextExample {
    strings:
        $text = "foobar" nocase
    condition:
        $text
}
With the nocase modifier the pattern “foobar” will match “Foobar”, “FOOBAR”,
and “fOoBaR”. This modifier can be used in conjunction with any modifier, except
base64, base64wide and xor.
“wide” modifier
The wide modifier can be used to search for strings encoded with two bytes per
character, something typical in many executable binaries.
For example, if the string “Borland” appears encoded as two bytes per
character (i.e. B\x00o\x00r\x00l\x00a\x00n\x00d\x00), then the following rule
will match:
rule WideCharTextExample1 {
    strings:
        $wide = "Borland" wide
    condition:
        $wide
}
However, keep in mind that this modifier just interleaves the ASCII codes of the
characters in the string with zeroes; it does not support true UTF-16 strings
containing non-English characters. If you want to search for strings in both
ASCII and wide form, you can use the ascii modifier in conjunction
with wide, in either order:
rule WideCharTextExample2 {
    strings:
        $wide_and_ascii = "Borland" wide ascii
    condition:
        $wide_and_ascii
}
The ascii modifier can appear alone, without an accompanying wide modifier,
but it’s not necessary to write it because in the absence of wide the string is
assumed to be ASCII by default.
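The interleaving performed by wide can be illustrated outside of YARA. The following Python sketch is not part of YARA; the helper name `to_wide` is hypothetical, and the treatment of the non-ASCII character as UTF-8 bytes is an assumption made only for illustration. It shows that the interleaved form coincides with UTF-16LE for ASCII text but not for other characters.

```python
def to_wide(pattern: bytes) -> bytes:
    """Interleave every byte of `pattern` with a zero byte (hypothetical helper)."""
    out = bytearray()
    for byte in pattern:
        out.append(byte)
        out.append(0x00)
    return bytes(out)


wide_borland = to_wide(b"Borland")

# For pure ASCII text the interleaved form equals the UTF-16LE encoding.
assert wide_borland == b"B\x00o\x00r\x00l\x00a\x00n\x00d\x00"
assert wide_borland == "Borland".encode("utf-16-le")

# Assumption: the non-ASCII character U+00F1 is stored as its UTF-8 bytes.
# Interleaving those bytes with zeroes does not yield its UTF-16LE encoding.
assert to_wide("\u00f1".encode("utf-8")) != "\u00f1".encode("utf-16-le")
```

This is why a wide pattern is reliable for ASCII text and should not be expected to find arbitrary UTF-16 strings.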
“xor” modifier
The xor modifier can be used to search for strings that are XORed with a
single byte.
The following rule will search for every single-byte XOR applied to the string "This program cannot" (including the plaintext string):
rule XorExample1 {
    strings:
        $xor = "This program cannot" xor
    condition:
        $xor
}
The above rule is logically equivalent to:
rule XorExample2 {
    strings:
        $xor_00 = "This program cannot"
        $xor_01 = "Uihr!qsnfs`l!b`oonu"
        $xor_02 = "Vjkq\"rpmepco\"acllmv"
        // Repeat for every single byte XOR
    condition:
        any of them
}
You can also combine the xor modifier with wide and ascii modifiers. For
example, to search for the wide and ascii versions of a string after every
single byte XOR has been applied you would use:
rule XorExample3 {
    strings:
        $xor = "This program cannot" xor wide ascii
    condition:
        $xor
}
The xor modifier is applied after the wide modifier. This means that using
xor and wide together results in the XOR also being applied to the interleaved
zero bytes. For example, the following two rules are logically equivalent:
rule XorExample4 {
    strings:
        $xor = "This program cannot" xor wide
    condition:
        $xor
}
rule XorExample4 {
    strings:
        $xor_00 = "T\x00h\x00i\x00s\x00 \x00p\x00r\x00o\x00g\x00r\x00a\x00m\x00 \x00c\x00a\x00n\x00n\x00o\x00t\x00"
        $xor_01 = "U\x01i\x01h\x01r\x01!\x01q\x01s\x01n\x01f\x01s\x01`\x01l\x01!\x01b\x01`\x01o\x01o\x01n\x01u\x01"
        $xor_02 = "V\x02j\x02k\x02q\x02\"\x02r\x02p\x02m\x02e\x02p\x02c\x02o\x02\"\x02a\x02c\x02l\x02l\x02m\x02v\x02"
        // Repeat for every single byte XOR operation.
    condition:
        any of them
}
If you want more control over the range of bytes used with the xor modifier,
use:
rule XorExample5 {
    strings:
        $xor = "This program cannot" xor(0x01-0xff)
    condition:
        $xor
}
The above example XORs the string with every byte from 0x01 to 0xff, inclusive, when searching. The general syntax is xor(minimum-maximum).
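To make the expansion concrete, here is a minimal Python sketch of what the xor modifier searches for, including the key range and the wide-then-XOR ordering described above. It is an illustration under stated assumptions, not YARA's implementation; `xor_variants` and `to_wide` are hypothetical helper names.

```python
from typing import Dict


def xor_variants(pattern: bytes, minimum: int = 0x00, maximum: int = 0xFF) -> Dict[int, bytes]:
    """Map each key in [minimum, maximum] to `pattern` XORed with that key."""
    if not 0x00 <= minimum <= maximum <= 0xFF:
        raise ValueError("expected 0x00 <= minimum <= maximum <= 0xff")
    return {key: bytes(b ^ key for b in pattern) for key in range(minimum, maximum + 1)}


def to_wide(pattern: bytes) -> bytes:
    """Interleave every byte of `pattern` with a zero byte."""
    return b"".join(bytes([b, 0x00]) for b in pattern)


plain = b"This program cannot"

# Plain `xor`: all 256 keys, key 0x00 being the plaintext itself.
all_keys = xor_variants(plain)
assert all_keys[0x00] == plain
assert all_keys[0x01] == b"Uihr!qsnfs`l!b`oonu"

# `xor(0x01-0xff)`: the same expansion without the plaintext.
restricted = xor_variants(plain, 0x01, 0xFF)
assert 0x00 not in restricted and len(restricted) == 255

# `xor wide`: the zero bytes are interleaved first, then XORed as well.
wide_then_xor = xor_variants(to_wide(plain))
assert wide_then_xor[0x01].startswith(b"U\x01i\x01h\x01r\x01")
```

Restricting the range to 0x01-0xff is a way to match only obfuscated copies of the string and ignore the plaintext.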
“fullword” modifier
Another modifier that can be applied to text patterns is fullword. This
modifier guarantees that the pattern will match only if it appears in the file
delimited by non-alphanumeric characters. For instance, the string “domain”, if
defined as fullword, doesn’t match “www.mydomain.com” but it
matches “www.my-domain.com” and “www.domain.com”.
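The delimiter rule can be approximated in a few lines of Python. This is only a sketch of the behavior described above for ASCII patterns, not YARA's matching code; the function name `fullword_match` is hypothetical, and the start and end of the data are assumed to count as delimiters.

```python
def fullword_match(pattern: bytes, data: bytes) -> bool:
    """Return True if `pattern` occurs in `data` with no ASCII letter or
    digit immediately before or after it."""
    start = data.find(pattern)
    while start != -1:
        end = start + len(pattern)
        # bytes.isalnum() is False for an empty slice, so an occurrence at
        # the very start or end of the data is treated as delimited.
        preceded = start > 0 and data[start - 1:start].isalnum()
        followed = data[end:end + 1].isalnum()
        if not preceded and not followed:
            return True
        start = data.find(pattern, start + 1)
    return False


assert not fullword_match(b"domain", b"www.mydomain.com")
assert fullword_match(b"domain", b"www.my-domain.com")
assert fullword_match(b"domain", b"www.domain.com")
```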
“base64” modifier
The base64 modifier can be used to search for strings that have been base64
encoded. A good explanation of the technique is at:
https://www.leeholmes.com/searching-for-content-in-base-64-strings/
The following rule will search for the three base64 permutations of the string “This program cannot”:
rule Base64Example1 {
    strings:
        $a = "This program cannot" base64
    condition:
        $a
}
This will cause YARA to search for these three permutations:
VGhpcyBwcm9ncmFtIGNhbm5vd
RoaXMgcHJvZ3JhbSBjYW5ub3
UaGlzIHByb2dyYW0gY2Fubm90
The base64wide modifier works just like the base64 modifier but the results
of the base64 modifier are converted to wide.
The interaction between base64 (or base64wide) and wide and ascii is as
you might expect. wide and ascii are applied to the string first, and then
the base64 and base64wide modifiers are applied. At no point is the
plaintext of the ascii or wide versions of the strings included in the
search. If you want to also include those you can put them in a secondary
pattern.
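The three permutations listed earlier can be reproduced with the technique from the linked article: encode the pattern at each of the three possible alignments within a 3-byte group, then discard the characters that also depend on neighboring bytes. The Python sketch below is an illustration of that technique, not YARA's source code; `base64_permutations` and `to_wide` are hypothetical names. It also shows the ordering described above, with wide applied before base64.

```python
import base64
from typing import List

# Leading characters to discard for each alignment: they mix bits of the
# pattern with bits of whatever bytes precede it in the encoded data.
LEADING_TRIM = (0, 2, 3)


def base64_permutations(pattern: bytes) -> List[str]:
    """Return the three alignment-dependent base64 forms of `pattern`."""
    permutations = []
    for offset in range(3):
        encoded = base64.b64encode(b"\x00" * offset + pattern).decode("ascii")
        body = encoded.rstrip("=")
        if len(body) != len(encoded):
            # Padding was present, so the last character also depends on
            # the bytes that would follow the pattern; discard it.
            body = body[:-1]
        permutations.append(body[LEADING_TRIM[offset]:])
    return permutations


def to_wide(pattern: bytes) -> bytes:
    """Interleave every byte of `pattern` with a zero byte."""
    return b"".join(bytes([b, 0x00]) for b in pattern)


assert base64_permutations(b"This program cannot") == [
    "VGhpcyBwcm9ncmFtIGNhbm5vd",
    "RoaXMgcHJvZ3JhbSBjYW5ub3",
    "UaGlzIHByb2dyYW0gY2Fubm90",
]

# `wide base64`: the zero bytes are interleaved first, then encoded.
wide_permutations = base64_permutations(to_wide(b"This program cannot"))
assert wide_permutations[0].startswith("VABoAGkA")
```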
The base64 and base64wide modifiers also support a custom alphabet. For
example:
rule Base64Example2 {
    strings:
        $a = "This program cannot" base64("!@#$%^&*(){}[].,|ABCDEFGHIJ\x09LMNOPQRSTUVWXYZabcdefghijklmnopqrstu")
    condition:
        $a
}
The alphabet must be 64 bytes long.
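A custom alphabet simply replaces the symbol assigned to each 6-bit value. As a rough illustration, the Python sketch below encodes data with the standard alphabet and then substitutes symbols position by position. The function name `encode_with_alphabet` is hypothetical, padding is dropped for simplicity, and how YARA applies the alphabet internally is not covered here.

```python
import base64

STANDARD_ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
)
# The alphabet from the rule above; \x09 is a tab character.
CUSTOM_ALPHABET = (
    "!@#$%^&*(){}[].,|ABCDEFGHIJ\x09LMNOPQRSTUVWXYZabcdefghijklmnopqrstu"
)


def encode_with_alphabet(data: bytes, alphabet: str) -> str:
    """Base64-encode `data`, mapping each standard symbol to `alphabet`."""
    if len(alphabet) != 64:
        raise ValueError("the alphabet must be 64 bytes long")
    table = str.maketrans(STANDARD_ALPHABET, alphabet)
    unpadded = base64.b64encode(data).decode("ascii").rstrip("=")
    return unpadded.translate(table)


assert len(CUSTOM_ALPHABET) == 64
custom = encode_with_alphabet(b"This program cannot", CUSTOM_ALPHABET)
# 'V' is symbol 21 of the standard alphabet; symbol 21 here is 'E'.
assert custom[0] == "E"
```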
The base64 and base64wide modifiers are only supported for text patterns
that are at least 3 bytes long. Using these modifiers with hex patterns,
regular expressions, or text patterns that are too short will cause a compiler
error. Also, the xor, fullword, and nocase modifiers used in combination
with base64 or base64wide will cause a compiler error.
Incompatibility notice
In YARA 4.x the base64 and base64wide modifiers can produce false positives.
For instance, the pattern "This program cannot" base64 can match both
“Dhis program cannow” and “This program cannot”. This issue has been solved
in YARA-X, but the drawback is that patterns shorter than 3 characters
don’t accept these modifiers.
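One plausible way to see how such a false positive arises, assuming the search relies only on trimmed permutations like the three listed earlier: the characters that are discarded carry some bits of the first and last bytes of the pattern, so those bits are never checked. “T” (0x54) and “D” (0x44) share their low four bits, and “t” (0x74) and “w” (0x77) share their high four bits. The Python sketch below checks this for the permutation at offset 1; it is an illustration, not a description of YARA 4.x internals, and `encoded_at_offset_1` is a hypothetical helper.

```python
import base64

# Second permutation from the list above (pattern at offset 1 in a group).
OFFSET_1_PERMUTATION = "RoaXMgcHJvZ3JhbSBjYW5ub3"


def encoded_at_offset_1(text: bytes) -> str:
    """Encode `text` preceded by one arbitrary byte, so that it starts at
    offset 1 within a 3-byte base64 group."""
    return base64.b64encode(b"\x00" + text).decode("ascii")


# The intended string contains the permutation...
assert OFFSET_1_PERMUTATION in encoded_at_offset_1(b"This program cannot")
# ...and so does a different string that agrees on the bits that are kept.
assert OFFSET_1_PERMUTATION in encoded_at_offset_1(b"Dhis program cannow")
```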