YARA is a pattern-matching tool for identifying and classifying malware. It lets you write rules that describe patterns (strings, byte sequences, regular expressions) found in malicious files. When a file matches a YARA rule, it is flagged as belonging to that malware family or type.

Is YARA like an antivirus?

YARA is a pattern-based detection tool, not a complete antivirus. On its own it neither monitors in real time nor blocks execution. Many antivirus and EDR products do, however, embed it as one of their detection engines: YARA detects, and the AV/EDR acts on the detection.

Where can I get ready-made YARA rules?

The main repositories are signature-base by Neo23x0 (Florian Roth), the Yara-Rules community rules, the Malpedia rules, and rules published by vendors (Elastic, Mandiant, CrowdStrike). Sandboxes such as Joe Sandbox and CAPE also generate YARA rules automatically.

Can I use YARA on Linux and Windows?

Yes. YARA is cross-platform, with native binaries for Windows, Linux, and macOS. It is also available as a Python library (yara-python) for scripting, and it ships with FLARE VM and REMnux.

What is the difference between YARA and Sigma?

YARA detects patterns in files and memory (binaries, documents, process memory). Sigma detects patterns in logs (Event Log, Sysmon, auditd). YARA is for analyzing samples; Sigma is for detection in a SIEM. They are complementary, not competing.

YARA Rules: Writing from Scratch to Effective Detection

A complete tutorial on writing YARA rules for malware detection: syntax, strings (text, hex, regex), conditions, metadata, and practical examples for detecting ransomware, RATs, packers, and malicious documents.

MalwareIntel Research · May 21, 2026 · 9 min read

Series: Analysis Environments, Part 9

YARA: the malware detection language

YARA is a tool every malware analyst needs to master. Created by Victor Alvarez (VirusTotal/Google), it lets you describe malware families with rules built from textual, binary, and logical patterns. A well-written YARA rule can detect many variants of a malware family with a single definition.

This article teaches how to write YARA rules from scratch, with practical examples for several types of malware.

Anatomy of a YARA rule

Basic structure

rule Nombre_De_La_Regla {
    meta:
        description = "What the rule detects"
        author = "Your name"
        date = "2026-05-21"
        hash = "sha256 of the reference sample"
        
    strings:
        $s1 = "text to search for" ascii
        $s2 = { 4D 5A 90 00 }
        $s3 = /regex[0-9]+pattern/
        
    condition:
        $s1 or ($s2 and $s3)
}

Sections

Section	Required	Description
`meta:`	No	Metadata: author, description, hashes, references
`strings:`	No	Patterns to search for: text, hex, regex
`condition:`	Yes	Logic that decides whether the rule matches

Types of strings

Text strings

$texto_ascii = "http://evil.com" ascii           // ASCII only
$texto_wide = "http://evil.com" wide             // Two bytes per character (UTF-16LE style, common on Windows)
$texto_both = "http://evil.com" ascii wide        // Both encodings
$texto_nocase = "password" nocase                 // Case-insensitive
$texto_full = "CreateRemoteThread" ascii nocase fullword  // Whole word only

fullword: matches only when the string is delimited by non-alphanumeric characters. This avoids false positives: "key" matches inside "monkey", but "key" fullword does not.
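To make the `wide` and `fullword` modifiers concrete, here is a minimal Python sketch that imitates them on raw bytes. It is an approximation for illustration, not YARA's implementation; the helper names `wide` and `fullword_match` are hypothetical, and the sketch assumes ASCII-only input.

```python
import re


def wide(text: str) -> bytes:
    """Bytes that a `wide` string looks for: each ASCII character followed by 0x00."""
    return b"".join(bytes([ord(c), 0]) for c in text)


def fullword_match(needle: bytes, data: bytes) -> bool:
    """Approximate `fullword`: no alphanumeric byte may touch the match on either side."""
    pattern = rb"(?<![A-Za-z0-9])" + re.escape(needle) + rb"(?![A-Za-z0-9])"
    return re.search(pattern, data) is not None


print(wide("MZ"))                            # b'M\x00Z\x00'
print(b"key" in b"monkey")                   # True  (plain substring match)
print(fullword_match(b"key", b"monkey"))     # False (preceded by a letter)
print(fullword_match(b"key", b"api-key=1"))  # True  (delimited by '-' and '=')
```

For ASCII text, `wide("evil")` produces the same bytes as `"evil".encode("utf-16-le")`, which is why `wide` strings catch the text that Windows programs commonly store as UTF-16.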

Hexadecimal strings

$hex_exact = { 4D 5A 90 00 03 00 00 00 }        // Exact MZ header
$hex_wildcard = { 4D 5A ?? ?? 03 00 }            // ?? = any byte
$hex_jump = { 4D 5A [2-4] 03 00 }               // [2-4] = 2 to 4 arbitrary bytes
$hex_alt = { 4D ( 5A | 5B ) 90 00 }             // Alternative: 5A or 5B

Wildcards (??) are useful for bytes that change between malware versions (offsets, timestamps, keys).
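One way to reason about wildcards, jumps, and alternatives is to translate a hex string into an equivalent byte-level regular expression. The sketch below does this for a subset of the syntax shown above. The function `hex_to_regex` is a hypothetical helper, tokens must be separated by spaces, and nibble wildcards (such as `4?`) and unbounded jumps are deliberately not handled.

```python
import re


def hex_to_regex(pattern: str) -> re.Pattern:
    """Translate a subset of YARA hex-string syntax into a compiled bytes regex."""
    parts = []
    for tok in pattern.strip("{} ").split():
        if tok == "(":
            parts.append(b"(?:")               # start of an alternative group
        elif tok in ("|", ")"):
            parts.append(tok.encode())
        elif tok == "??":
            parts.append(b".")                 # any single byte
        elif tok.startswith("["):
            lo, _, hi = tok.strip("[]").partition("-")
            parts.append(b".{%s,%s}" % (lo.encode(), (hi or lo).encode()))
        else:
            parts.append(re.escape(bytes.fromhex(tok)))  # literal byte
    # DOTALL so that "." also matches the newline byte 0x0A
    return re.compile(b"".join(parts), re.DOTALL)


wildcard = hex_to_regex("{ 4D 5A ?? ?? 03 00 }")
jump = hex_to_regex("{ 4D 5A [2-4] 03 00 }")
alt = hex_to_regex("{ 4D ( 5A | 5B ) 90 00 }")

print(bool(wildcard.search(b"MZ\x90\x00\x03\x00")))         # True
print(bool(jump.search(b"MZ" + b"\xaa" * 5 + b"\x03\x00")))  # False: gap of 5 exceeds [2-4]
print(bool(alt.search(b"M\x5b\x90\x00")))                    # True
```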

Regular expressions

$regex_ip = /[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}/
$regex_url = /https?:\/\/[a-zA-Z0-9\.\-]+\.[a-z]{2,}/
$regex_base64 = /[A-Za-z0-9+\/]{50,}={0,2}/      // Long Base64 run
$regex_email = /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/

Regular expressions in YARA are slower than text strings. Use them sparingly, and only when text or hex strings are not enough.
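A pattern like `$regex_ip` is intentionally loose: it also matches digit groups that are not valid IPv4 addresses. The sketch below reproduces the same expression with Python's `re` module (an assumption made for illustration, since YARA uses its own regex engine) and filters the hits with the standard `ipaddress` module. The sample buffer is invented.

```python
import ipaddress
import re

# Same expression as $regex_ip, applied to raw bytes
IP_RE = re.compile(rb"[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}")


def extract_ips(data: bytes) -> list:
    """Return only the regex hits that parse as real IPv4 addresses."""
    valid = []
    for match in IP_RE.finditer(data):
        candidate = match.group().decode("ascii")
        try:
            ipaddress.IPv4Address(candidate)
        except ValueError:
            continue  # e.g. an octet above 255
        valid.append(candidate)
    return valid


sample = b"connect 10.0.0.5:443 version 999.1.2.3456"
print(IP_RE.findall(sample))  # [b'10.0.0.5', b'999.1.2.345']
print(extract_ips(sample))    # ['10.0.0.5']
```

In a rule, the practical consequence is that a loose regex should be combined with other strings in the condition rather than trusted on its own.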

Conditions

Basic operators

condition:
    // Each line below is a separate example condition
    $s1                          // $s1 is present in the file
    $s1 and $s2                  // Both are present
    $s1 or $s2                   // At least one is present
    not $s1                      // $s1 is NOT present
    $s1 and not $s2              // $s1 is present, $s2 is not
    2 of ($s1, $s2, $s3)         // At least 2 of the 3
    any of ($s*)                 // Any string whose name starts with $s
    all of ($s*)                 // Every string whose name starts with $s
    3 of them                    // At least 3 of all defined strings
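The `N of (...)` family is easier to reason about as counting. The following sketch mimics it with plain substring checks. The helper `n_of`, the pattern names, and the sample buffer are all hypothetical, and real YARA evaluates these expressions inside its own engine.

```python
def n_of(n: int, patterns: dict, data: bytes, prefix: str = "") -> bool:
    """True when at least n patterns whose name starts with prefix occur in data."""
    hits = [name for name, value in patterns.items()
            if name.startswith(prefix) and value in data]
    return len(hits) >= n


strings = {"$s1": b"vssadmin", "$s2": b"bcdedit", "$s3": b"wbadmin"}
data = b"cmd /c vssadmin delete shadows & bcdedit /set recoveryenabled no"

print(n_of(2, strings, data))             # 2 of ($s1, $s2, $s3) -> True
print(n_of(1, strings, data, "$s"))       # any of ($s*)         -> True
print(n_of(len(strings), strings, data))  # all of them          -> False
```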

Position-based conditions

condition:
    $mz at 0                     // $mz at offset 0 (start of the file)
    $s1 in (0..1024)             // $s1 starts at an offset between 0 and 1024
    #s1 > 5                      // $s1 occurs more than 5 times
    @s1 < 100                    // First occurrence of $s1 is before byte 100
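The `at`, `in`, `#`, and `@` operators all derive from the list of offsets where a string matches. A rough Python equivalent, assuming plain byte strings and a made-up buffer (the helper `offsets` is hypothetical and only approximates YARA's matching):

```python
def offsets(pattern: bytes, data: bytes) -> list:
    """Every offset at which pattern starts in data."""
    found = []
    start = data.find(pattern)
    while start != -1:
        found.append(start)
        start = data.find(pattern, start + 1)
    return found


# 64 bytes of fake header followed by three copies of a command string
data = b"MZ" + b"\x00" * 62 + b"cmd.exe cmd.exe cmd.exe"
mz = offsets(b"MZ", data)
cmd = offsets(b"cmd.exe", data)

print(0 in mz)                            # $mz at 0          -> True
print(any(0 <= o <= 1024 for o in cmd))   # $cmd in (0..1024) -> True
print(len(cmd))                           # #cmd              -> 3
print(cmd[0])                             # @cmd              -> 64
```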

Size and header conditions

condition:
    filesize < 1MB               // File smaller than 1 MB
    filesize > 100KB and filesize < 5MB
    uint16(0) == 0x5A4D          // First 2 bytes = "MZ" (PE file), read as little-endian
    uint32(0) == 0x464C457F      // First 4 bytes = "\x7fELF" (ELF file), read as little-endian
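The constants look reversed because `uint16` and `uint32` read the bytes as little-endian integers. A short check with Python's `struct` module shows where `0x5A4D` and `0x464C457F` come from. The two header buffers are minimal stand-ins, not real files.

```python
import struct


def uint16(data: bytes, offset: int = 0) -> int:
    """Little-endian 16-bit read, analogous to YARA's uint16(offset)."""
    return struct.unpack_from("<H", data, offset)[0]


def uint32(data: bytes, offset: int = 0) -> int:
    """Little-endian 32-bit read, analogous to YARA's uint32(offset)."""
    return struct.unpack_from("<I", data, offset)[0]


pe_header = b"MZ\x90\x00"        # bytes 4D 5A ...
elf_header = b"\x7fELF\x02\x01"  # bytes 7F 45 4C 46 ...

print(hex(uint16(pe_header)))    # 0x5a4d
print(hex(uint32(elf_header)))   # 0x464c457f
```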

Practical examples

Rule 1: detecting generic ransomware

rule Ransomware_Generic {
    meta:
        description = "Detects generic ransomware indicators"
        author = "MalwareIntel"
        
    strings:
        // Shadow copy deletion
        $del1 = "vssadmin" ascii nocase
        $del2 = "delete shadows" ascii nocase
        $del3 = "wmic shadowcopy" ascii nocase
        $del4 = "bcdedit" ascii nocase
        $del5 = "recoveryenabled" ascii nocase
        
        // Ransom note keywords
        $note1 = "your files have been encrypted" ascii nocase
        $note2 = "bitcoin" ascii nocase
        $note3 = "decrypt" ascii nocase
        $note4 = ".onion" ascii
        $note5 = "ransom" ascii nocase
        
        // Crypto API usage
        $crypto1 = "CryptEncrypt" ascii
        $crypto2 = "CryptGenKey" ascii
        $crypto3 = "BCryptEncrypt" ascii
        
        // File enumeration
        $enum1 = "FindFirstFile" ascii
        $enum2 = "FindNextFile" ascii
        
    condition:
        uint16(0) == 0x5A4D and                     // Is a PE file
        filesize < 5MB and                            // Reasonable size
        (2 of ($del*) or 3 of ($note*)) and          // Ransomware indicators
        1 of ($crypto*) and                           // Uses cryptography
        all of ($enum*)                               // Enumerates files
}

Rule 2: detecting Agent Tesla (.NET infostealer)

rule AgentTesla_Infostealer {
    meta:
        description = "Detects Agent Tesla keylogger/infostealer"
        author = "MalwareIntel"
        
    strings:
        // .NET indicators
        $dotnet = "_CorExeMain" ascii
        
        // Agent Tesla specific strings
        $at1 = "logins.json" ascii wide
        $at2 = "Login Data" ascii wide
        $at3 = "\\Thunderbird\\Profiles" ascii wide
        $at4 = "\\FileZilla\\recentservers.xml" ascii wide
        $at5 = "smtp" ascii nocase
        $at6 = "keylog" ascii nocase
        
        // Screenshot capability
        $screen = "CopyFromScreen" ascii
        
        // Browser credential paths
        $chrome = "\\Google\\Chrome\\User Data" ascii wide
        $firefox = "\\Mozilla\\Firefox\\Profiles" ascii wide
        
    condition:
        uint16(0) == 0x5A4D and
        $dotnet and
        3 of ($at*) and
        ($screen or $chrome or $firefox)
}

Rule 3: detecting a Cobalt Strike beacon

rule CobaltStrike_Beacon {
    meta:
        description = "Detects Cobalt Strike beacon in memory or on disk"
        author = "MalwareIntel"
        
    strings:
        // Beacon config markers (decoded config, as typically seen in memory)
        $config = { 00 01 00 01 00 02 ?? ?? 00 02 00 01 00 02 }
        
        // Reflective loader
        $loader = "ReflectiveLoader" ascii
        
        // Default named pipes
        $pipe1 = "MSSE-" ascii
        $pipe2 = "postex_" ascii
        $pipe3 = "status_" ascii
        $pipe4 = "msagent_" ascii
        
        // Sleep mask
        $sleep = { 4C 8B 53 08 45 8B 0A 45 8B 52 04 }
        
    condition:
        $config or
        ($loader and 1 of ($pipe*)) or
        ($sleep and 1 of ($pipe*))
}

Rule 4: detecting an Office document with a malicious macro

rule Malicious_Office_Macro {
    meta:
        description = "Detects Office document with suspicious VBA macro"
        author = "MalwareIntel"
        
    strings:
        $ole = { D0 CF 11 E0 A1 B1 1A E1 }    // OLE header
        
        $auto1 = "Auto_Open" ascii
        $auto2 = "Document_Open" ascii
        $auto3 = "Workbook_Open" ascii
        
        $sus1 = "Shell" ascii                 // also matches inside "WScript.Shell", so $sus1 and $sus2 can fire together
        $sus2 = "WScript.Shell" ascii
        $sus3 = "powershell" ascii nocase
        $sus4 = "cmd.exe" ascii nocase
        $sus5 = "DownloadString" ascii nocase
        $sus6 = "URLDownloadToFile" ascii nocase
        
        $obf1 = "Chr(" ascii
        $obf2 = "ChrW(" ascii
        $obf3 = "StrReverse" ascii
        
    condition:
        $ole at 0 and
        1 of ($auto*) and
        (2 of ($sus*) or 3 of ($obf*))
}

Rule 5: detecting UPX-packed executables

rule UPX_Packed {
    meta:
        description = "Detects UPX packed executable"
        author = "MalwareIntel"
        
    strings:
        $upx1 = "UPX0" ascii
        $upx2 = "UPX1" ascii
        $upx3 = "UPX!" ascii
        $upx4 = "UPX2" ascii
        
    condition:
        uint16(0) == 0x5A4D and
        2 of ($upx*)
}

Running YARA rules

Command line

# Scan a single file
yara rules.yar sample.exe

# Scan a directory recursively
yara -r rules.yar /path/to/samples/

# Scan with multiple rule files
yara -r rule1.yar rule2.yar /path/to/samples/

# Show the strings that matched
yara -s rules.yar sample.exe

# Scan a running process's memory (by PID)
yara rules.yar [PID]

# Timeout in seconds (for rules with complex regex); -t selects a tag, -a sets the timeout
yara -a 60 rules.yar sample.exe
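When the command-line scanner is driven from a script, its default output (one `rule_name file_path` pair per matching line) can be grouped per file. The sketch below assumes that default format only, without `-s`, `-m`, or `-g`. The helper name and the sample output are hypothetical.

```python
def parse_yara_output(text: str) -> dict:
    """Group matched rule names by scanned file from default `yara` output."""
    results = {}
    for line in text.splitlines():
        rule, sep, path = line.strip().partition(" ")
        if not sep:
            continue  # blank or unexpected line
        results.setdefault(path, []).append(rule)
    return results


# Hypothetical output captured from `yara -r rules.yar /samples/`
sample_output = """UPX_Packed /samples/a.exe
Ransomware_Generic /samples/b.exe
UPX_Packed /samples/b.exe
"""

for path, rules in parse_yara_output(sample_output).items():
    print(path, rules)
```

Splitting on the first space only keeps file paths that contain spaces intact, since rule names cannot contain them.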

Python (yara-python)

import yara

# Compile rules from a file
rules = yara.compile(filepath='rules.yar')
# Or from a string (kept in a separate variable so it does not replace the rules above)
inline_rules = yara.compile(source='rule test { condition: true }')

# Scan a file
matches = rules.match(filepath='sample.exe')
for match in matches:
    print(f"Rule: {match.rule}")
    # The layout of match.strings depends on the yara-python version
    for s in match.strings:
        print(f"  String: {s}")

# Scan data already in memory
with open('sample.exe', 'rb') as f:
    matches = rules.match(data=f.read())

YARA rule repositories

Repository	Author	Content
signature-base	Neo23x0 (Florian Roth)	3000+ rules for malware, webshells, exploits
Yara-Rules	Community	Rules organized by category
Malpedia YARA	Fraunhofer FKIE	Rules per malware family
Elastic YARA	Elastic Security	Rules used by Elastic's security products
CAPE YARA	kevoreilly	Rules from the CAPE sandbox
InQuest YARA	InQuest Labs	Rules for malicious documents

# Download signature-base
git clone https://github.com/Neo23x0/signature-base

# Scan with every rule file (yara takes rule files, not a rules directory).
# Some rules in this repository may rely on external variables and fail to
# compile with plain yara; define them with -d or skip those files.
for r in signature-base/yara/*.yar; do yara "$r" sample.exe; done

Best practices

Practice	Reason
Always include `uint16(0) == 0x5A4D` in PE rules	Avoids false positives on non-PE files
Use `fullword` on short strings	Avoids partial matches
Include `filesize` in the condition	Narrows the scope and improves performance
Test rules against a dataset of legitimate files	Measures the false-positive rate
Include the reference sample's hash in `meta:`	Traceability
Do not rely on a single string	Malware can change one string between versions
Combine strings from different parts of the malware	More robust against partial changes
Document why each string matters	Eases future maintenance
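For the traceability practice, the reference hash can be computed once and pasted into the rule's `meta:` section. A small sketch using only the standard library follows. The function names are hypothetical, and the file is read in chunks so large samples do not need to fit in memory.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """SHA-256 of a file, read in chunks of chunk_size bytes."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def meta_hash_line(path: str) -> str:
    """Format the hash as a line ready to paste under `meta:`."""
    return f'        hash = "{sha256_of(path)}"'


# Example (path is a placeholder):
# print(meta_hash_line("sample.exe"))
```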

Common mistakes

Mistake	Fix
Rule too generic (many false positives)	Add more strings and stricter conditions
Rule too specific (detects only one sample)	Use hex wildcards and alternatives in the condition
Complex regex that takes too long	Simplify the regex; use text strings where possible
File format not checked	Add `uint16(0) == 0x5A4D` or `uint32(0) == 0x464C457F`
ASCII strings that also appear in legitimate files	Combine with more specific strings and AND conditions

MITRE ATT&CK mapping

YARA detects malware, not techniques directly. Rules can still be mapped to techniques according to the type of malware they detect:

YARA rule	ATT&CK technique
Ransomware (shadow copy deletion strings)	T1486, T1490
Keylogger (SetWindowsHookEx, GetAsyncKeyState)	T1056.001
C2 beacon (config patterns, named pipes)	T1071, T1573
Injection (CreateRemoteThread, VirtualAllocEx)	T1055
Packed binary (UPX, Themida markers)	T1027.002
Office macro (AutoOpen, Shell, WScript)	T1204.002, T1059.005
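In practice the mapping can live in a small lookup table that annotates scan results. The sketch below pairs the rule names used in this article with rows of the table above. That pairing is an assumption made for illustration, and `techniques_for` is a hypothetical helper.

```python
# Rule name -> ATT&CK technique IDs (pairing assumed for illustration)
ATTACK_MAP = {
    "Ransomware_Generic": ["T1486", "T1490"],
    "CobaltStrike_Beacon": ["T1071", "T1573"],
    "UPX_Packed": ["T1027.002"],
    "Malicious_Office_Macro": ["T1204.002", "T1059.005"],
}


def techniques_for(rule_names: list) -> list:
    """Deduplicated technique IDs for the matched rules, in first-seen order."""
    seen = []
    for name in rule_names:
        for technique in ATTACK_MAP.get(name, []):
            if technique not in seen:
                seen.append(technique)
    return seen


print(techniques_for(["UPX_Packed", "Ransomware_Generic", "Unknown_Rule"]))
# ['T1027.002', 'T1486', 'T1490']
```

An alternative is to store the technique IDs in each rule's `meta:` section, which keeps the mapping next to the detection logic.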

Recommended books

Practical Malware Analysis (Sikorski & Honig)

Learning Malware Analysis (Monnappa)