
Advanced YARA: PE Modules, Math, and Regex for Professional Detection

Advanced YARA: the PE module (imports, sections, timestamps), the math module (entropy, mean), the hash module, complex conditions with for loops, and professional techniques for writing robust rules with a low false-positive rate.

MalwareIntel Research · 7 min read
Series: Analysis Environments, Part 10

Beyond strings: detection by structure

Basic YARA rules look for strings and byte sequences. Advanced rules analyze the structure of the file: its imports, sections, timestamps, entropy, and metadata. This produces more robust detections that survive changes to strings, although not changes to the malware's architecture.

PE Module

Imports

import "pe"

rule Injection_Imports {
    meta:
        description = "PE with code injection capability based on imports"
    condition:
        pe.imports("kernel32.dll", "CreateRemoteThread") and
        pe.imports("kernel32.dll", "VirtualAllocEx") and
        pe.imports("kernel32.dll", "WriteProcessMemory")
}

rule Credential_Dumping_Imports {
    condition:
        pe.imports("advapi32.dll", "OpenProcessToken") and
        pe.imports("advapi32.dll", "AdjustTokenPrivileges") and
        (pe.imports("dbghelp.dll", "MiniDumpWriteDump") or
         pe.imports("kernel32.dll", "ReadProcessMemory"))
}

rule Crypto_Ransomware_Imports {
    condition:
        pe.imports("advapi32.dll", "CryptEncrypt") and
        pe.imports("advapi32.dll", "CryptGenKey") and
        pe.imports("kernel32.dll", "FindFirstFileW") and
        pe.imports("kernel32.dll", "FindNextFileW")
}

Number of imports

import "pe"

rule Few_Imports_Suspicious {
    meta:
        description = "PE importing from very few DLLs - possible packing or dynamic resolution"
    condition:
        // pe.number_of_imports counts imported DLLs, not individual functions
        pe.number_of_imports < 5 and
        pe.number_of_sections > 0 and
        uint16(0) == 0x5A4D
}

Sections

import "pe"

rule UPX_Sections {
    meta:
        description = "Detects UPX packed PE by section names"
    condition:
        pe.sections[0].name == "UPX0" and
        pe.sections[1].name == "UPX1"
}

rule Suspicious_Section_Names {
    meta:
        description = "PE with non-standard section names"
    condition:
        for any section in pe.sections : (
            section.name == ".vmp0" or    // VMProtect
            section.name == ".themida" or  // Themida
            section.name == ".enigma" or   // Enigma
            section.name == ".aspack" or   // ASPack
            section.name == ".nsp0"        // NSPack
        )
}

rule Writable_Code_Section {
    meta:
        description = "PE with writable .text section - self-modifying code"
    condition:
        for any section in pe.sections : (
            section.name == ".text" and
            (section.characteristics & pe.SECTION_MEM_WRITE) != 0  // IMAGE_SCN_MEM_WRITE (0x80000000)
        )
}
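To make the bitmask test in Writable_Code_Section easier to follow, here is a minimal Python sketch of how the section characteristics field decodes. The flag values are the standard PE/COFF constants; the `FLAG_NAMES` table and both helper functions are hypothetical names introduced only for this illustration.

```python
# PE/COFF section characteristic flags (standard values from the PE format).
IMAGE_SCN_CNT_CODE = 0x00000020
IMAGE_SCN_MEM_EXECUTE = 0x20000000
IMAGE_SCN_MEM_READ = 0x40000000
IMAGE_SCN_MEM_WRITE = 0x80000000

# Hypothetical lookup table, for illustration only.
FLAG_NAMES = {
    IMAGE_SCN_CNT_CODE: "CNT_CODE",
    IMAGE_SCN_MEM_EXECUTE: "MEM_EXECUTE",
    IMAGE_SCN_MEM_READ: "MEM_READ",
    IMAGE_SCN_MEM_WRITE: "MEM_WRITE",
}


def describe_flags(characteristics):
    """Return the names of the known flags set in a characteristics value."""
    return [name for flag, name in FLAG_NAMES.items() if characteristics & flag]


def is_writable(characteristics):
    """Same test the YARA rule performs: is the MEM_WRITE bit set?"""
    return (characteristics & IMAGE_SCN_MEM_WRITE) != 0


# A typical .text section is code + execute + read.
print(describe_flags(0x60000020))  # ['CNT_CODE', 'MEM_EXECUTE', 'MEM_READ']
print(is_writable(0x60000020))     # False
# The same section with the write bit added is what the rule flags.
print(is_writable(0xE0000020))     # True
```

A writable and executable code section is not proof of malice on its own, so in practice this check is usually combined with other indicators.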

rule Entry_Point_Not_In_Text {
    meta:
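        description = "Entry point outside the first section (usually .text) - possible packing"
    condition:
        // Assumes sections[0] is the code section; pe.entry_point is a file offset when scanning files
        pe.entry_point < pe.sections[0].raw_data_offset or
        pe.entry_point >= pe.sections[0].raw_data_offset + pe.sections[0].raw_data_size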
}

Timestamps

import "pe"

rule Future_Timestamp {
    meta:
        description = "PE with compilation timestamp in the future"
    condition:
        pe.timestamp > 1800000000  // 2027-01-15 08:00:00 UTC; adjust relative to the current Unix time
}

rule Zeroed_Timestamp {
    meta:
        description = "PE with zeroed timestamp - deliberately removed"
    condition:
        pe.timestamp == 0
}

rule Delphi_Compiled {
    meta:
        description = "PE carrying the fixed timestamp written by older Delphi (Borland) linkers"
    condition:
        pe.timestamp > 708992400 and pe.timestamp < 709000000
        // Older Delphi linkers stamp a constant date (June 19, 1992) instead of the build time
}
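The numeric thresholds in the timestamp rules are Unix epoch seconds, which are hard to read at a glance. The following sketch, using only the Python standard library, converts between epoch values and UTC dates so a threshold can be checked or chosen. The helper names `epoch_to_utc` and `utc_to_epoch` are hypothetical.

```python
from datetime import datetime, timezone


def epoch_to_utc(ts):
    """Convert Unix epoch seconds (as stored in the PE header) to a UTC datetime."""
    return datetime.fromtimestamp(ts, tz=timezone.utc)


def utc_to_epoch(year, month, day):
    """Convert a UTC calendar date (midnight) to Unix epoch seconds."""
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())


# Threshold used by Future_Timestamp.
print(epoch_to_utc(1800000000).isoformat())  # 2027-01-15T08:00:00+00:00
# Lower bound used by Delphi_Compiled.
print(epoch_to_utc(708992400).isoformat())   # 1992-06-19T22:20:00+00:00
# Picking a new threshold for a given date.
print(utc_to_epoch(2027, 1, 15))             # 1799971200
```

Because the Future_Timestamp threshold is a hard-coded constant, the rule stops being meaningful once that date has passed and needs periodic updating.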

Exports and DLL characteristics

import "pe"

rule DLL_With_Suspicious_Export {
    condition:
        pe.characteristics & pe.DLL and
        pe.exports("DllRegisterServer")
        // Export invoked by regsvr32; abused to run malicious DLLs
}

rule No_ASLR_No_DEP {
    meta:
        description = "PE without ASLR and DEP - compiled without security features"
    condition:
        uint16(0) == 0x5A4D and  // restrict the check to MZ/PE files
        not (pe.dll_characteristics & pe.DYNAMIC_BASE) and
        not (pe.dll_characteristics & pe.NX_COMPAT)
}

Digital signatures

import "pe"

rule Signed_But_Suspicious {
    meta:
        description = "Signed PE with suspicious characteristics"
    condition:
        pe.number_of_signatures > 0 and
        pe.number_of_imports < 10 and
        filesize < 100KB
        // Small, few imports, but signed = possibly a stolen certificate
}

Math Module: entropy

Detecting packing by entropy

import "math"
import "pe"

rule High_Entropy_PE {
    meta:
        description = "PE with high entropy sections - likely packed or encrypted"
    condition:
        uint16(0) == 0x5A4D and
        for any section in pe.sections : (
            math.entropy(section.raw_data_offset, section.raw_data_size) > 7.0
        )
}

rule All_Sections_High_Entropy {
    meta:
        description = "All PE sections have high entropy - heavily packed"
    condition:
        uint16(0) == 0x5A4D and
        pe.number_of_sections > 0 and
        for all section in pe.sections : (
            // Skip empty sections (YARA has no "implies" operator)
            section.raw_data_size == 0 or
            math.entropy(section.raw_data_offset, section.raw_data_size) > 6.5
        )
}

rule Packed_PE_Comprehensive {
    meta:
        description = "Comprehensive packed PE detection"
    condition:
        uint16(0) == 0x5A4D and
        (
            // High entropy in at least one section
            for any section in pe.sections : (
                math.entropy(section.raw_data_offset, section.raw_data_size) > 7.2
            )
        ) and
        (
            // Few imports OR suspicious section names
            pe.number_of_imports < 5 or
            for any section in pe.sections : (
                section.name matches /UPX|\.vmp|aspack|themida|enigma/
            )
        )
}

Whole-file entropy

import "math"

rule Encrypted_File {
    meta:
        description = "File with very high entropy - possibly encrypted"
    condition:
        math.entropy(0, filesize) > 7.8 and
        filesize > 1KB
}
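The thresholds used above (6.5, 7.0, 7.2, 7.8) sit on a scale from 0 to 8 bits per byte. As a rough illustration of where those numbers come from, here is a minimal Shannon entropy sketch in Python. It is assumed to approximate what `math.entropy` reports and is not YARA's implementation; `shannon_entropy` is a hypothetical helper name.

```python
import math
from collections import Counter


def shannon_entropy(data):
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # frequency of each byte value
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# A single repeated byte carries no information.
padding = b"\x00" * 4096
# Every byte value equally frequent: the maximum possible entropy.
uniform = bytes(range(256)) * 16
# Plain ASCII text falls well below the packing thresholds.
text = b"This program cannot be run in DOS mode. " * 100

for label, blob in (("padding", padding), ("uniform", uniform), ("text", text)):
    print(label, round(shannon_entropy(blob), 3))
```

Compressed or encrypted data approaches the uniform case, which is why values above roughly 7 are treated as a packing signal. Legitimate compressed resources reach the same range, so entropy alone tends to produce false positives.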

Other math functions

import "math"

rule Low_Mean_Byte_Value {
    meta:
        description = "File with unusual byte distribution"
    condition:
        math.mean(0, filesize) < 20 or math.mean(0, filesize) > 235
        // Extreme mean values indicate non-random data or padding
}

rule Serial_Correlation_High {
    meta:
        description = "High serial correlation - possible structure in encrypted data"
    condition:
        math.serial_correlation(0, filesize) > 0.5
}

Hash Module

import "hash"
import "pe"

rule Known_Malware_Section_Hash {
    meta:
        description = "Detects malware by hash of .text section"
    condition:
        for any section in pe.sections : (
            section.name == ".text" and
            hash.sha256(section.raw_data_offset, section.raw_data_size) == "abc123..."
        )
}

rule Known_Resource_Hash {
    meta:
        description = "Detects malware by hash of the whole file"
    condition:
        // Placeholder values: these are the MD5 and SHA-256 of empty input
        hash.md5(0, filesize) == "d41d8cd98f00b204e9800998ecf8427e" or
        hash.sha256(0, filesize) == "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
}
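The two digests in Known_Resource_Hash are the well-known MD5 and SHA-256 of zero bytes of input, so they should be treated as placeholders to replace with real sample hashes. The sketch below confirms this with Python's `hashlib` and shows how a digest could be produced for a real byte range. The `digests` helper and the sample bytes are hypothetical.

```python
import hashlib

# Values copied from the rule above.
RULE_MD5 = "d41d8cd98f00b204e9800998ecf8427e"
RULE_SHA256 = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"


def digests(data):
    """Return (md5, sha256) of a byte string as lowercase hex."""
    return hashlib.md5(data).hexdigest(), hashlib.sha256(data).hexdigest()


# Both rule values are the digests of empty input.
print(digests(b"") == (RULE_MD5, RULE_SHA256))  # True

# For a real rule, hash the bytes of interest (whole file or one section).
# The bytes here are made up for the example.
section_bytes = b"\x55\x8b\xec\x83\xec\x10"
md5_hex, sha256_hex = digests(section_bytes)
print(len(md5_hex), len(sha256_hex))  # 32 64
```

Write the hex strings in lowercase in the rule, since the comparison is against the lowercase string the hash module returns.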

ELF Module (Linux)

import "elf"

rule Mirai_ELF_Indicators {
    meta:
        description = "Detects Mirai-like ELF botnet"
    strings:
        $cred1 = "vizxv" ascii
        $cred2 = "xc3511" ascii
        $cred3 = "hi3518" ascii
    condition:
        elf.type == elf.ET_EXEC and
        (elf.machine == elf.EM_ARM or elf.machine == elf.EM_MIPS) and
        2 of ($cred*)
}

rule Static_Stripped_ELF {
    meta:
        description = "ELF without section headers - common in stripped or packed malware"
    condition:
        uint32(0) == 0x464C457F and
        elf.number_of_sections == 0  // No section headers (section table stripped)
}

dotnet Module (.NET)

import "dotnet"

rule DotNet_Obfuscated {
    meta:
        description = "Obfuscated .NET assembly"
    condition:
        dotnet.is_dotnet and
        for any resource in dotnet.resources : (
            resource.length > 50000
            // Large resources = possible embedded encrypted payload
        )
}

rule DotNet_ConfuserEx {
    meta:
        description = ".NET assembly obfuscated with ConfuserEx"
    strings:
        $confuser = "ConfuserEx" ascii wide
        $confuser2 = "Confuser.Core" ascii
    condition:
        dotnet.is_dotnet and 1 of ($confuser*)
}

Advanced conditions: for loops

Counting occurrences

rule Multiple_C2_URLs {
    strings:
        $url = /https?:\/\/[a-zA-Z0-9\.\-]+\.[a-z]{2,}\/[a-z]+\.php/
    condition:
        #url > 3  // More than 3 URLs matching the C2 pattern
}
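Before committing a regex to a rule, it can help to try the pattern on sample text. The sketch below uses the same pattern with Python's `re` module and mimics the `#url > 3` count. The sample URLs and the `count_c2_urls` helper are invented for the example, and Python counts non-overlapping matches, which may differ from YARA's counting in edge cases.

```python
import re

# Same pattern as the $url string in Multiple_C2_URLs.
C2_URL = re.compile(r"https?:\/\/[a-zA-Z0-9\.\-]+\.[a-z]{2,}\/[a-z]+\.php")


def count_c2_urls(text):
    """Count non-overlapping matches, roughly what #url reports."""
    return len(C2_URL.findall(text))


# Made-up sample data for illustration.
four_urls = (
    "http://a.example.com/gate.php "
    "https://b.example.net/panel.php "
    "http://c.example.org/index.php "
    "http://d.example.io/login.php"
)
three_urls = four_urls.rsplit(" ", 1)[0]

print(count_c2_urls(four_urls) > 3)   # True: the rule condition would hold
print(count_c2_urls(three_urls) > 3)  # False: exactly 3 is not "more than 3"
print(count_c2_urls("https://example.com/index.html"))  # 0: not a .php path
```

Note that the pattern only accepts lowercase letters in the script name, so a path such as `/gate2.php` or `/Gate.php` would not be counted.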

For loops over strings

import "pe"

rule Strings_In_Specific_Section {
    strings:
        $s1 = "CreateRemoteThread"
        $s2 = "VirtualAllocEx"
    condition:
        for all of ($s*) : (
            @ > pe.sections[0].raw_data_offset and
            @ < pe.sections[0].raw_data_offset + pe.sections[0].raw_data_size
        )
        // All strings must be in the first section (the first occurrence of each is checked)
}

Conditions with filesize ranges

rule Typical_RAT_Size {
    strings:
        $net = "connect" ascii
        $cmd = "shell" ascii
    condition:
        uint16(0) == 0x5A4D and
        filesize > 50KB and filesize < 2MB and
        all of them
}

Production rules: combining modules

Professional rule: packed PE with injection capability

import "pe"
import "math"

rule Packed_PE_With_Injection {
    meta:
        description = "Packed PE with code injection imports"
        author = "MalwareIntel"
        severity = "high"
        
    condition:
        uint16(0) == 0x5A4D and
        filesize < 5MB and
        
        // Packed: high entropy
        for any section in pe.sections : (
            math.entropy(section.raw_data_offset, section.raw_data_size) > 7.0
        ) and
        
        // Injection capability
        (
            pe.imports("kernel32.dll", "CreateRemoteThread") or
            pe.imports("kernel32.dll", "QueueUserAPC") or
            pe.imports("ntdll.dll", "NtCreateThreadEx")
        ) and
        
        // Memory allocation in remote process
        pe.imports("kernel32.dll", "VirtualAllocEx")
}

Professional rule: .NET infostealer

import "pe"
import "dotnet"

rule DotNet_Infostealer {
    meta:
        description = "Detects .NET infostealer (Agent Tesla, RedLine pattern)"
        
    strings:
        $browser1 = "\\Google\\Chrome\\User Data" ascii wide
        $browser2 = "\\Mozilla\\Firefox\\Profiles" ascii wide
        $browser3 = "Login Data" ascii wide
        $ftp = "\\FileZilla\\recentservers.xml" ascii wide
        $mail = "\\Thunderbird\\Profiles" ascii wide
        $crypto = "\\Electrum\\wallets" ascii wide
        
        $exfil1 = "smtp" ascii nocase
        $exfil2 = "api.telegram.org" ascii
        $exfil3 = "ftp://" ascii
        
    condition:
        dotnet.is_dotnet and
        filesize < 5MB and
        3 of ($browser*, $ftp, $mail, $crypto) and
        1 of ($exfil*)
}

Testing YARA rules

Checking for false positives

# Scan a directory of legitimate files
# (single quotes keep the backslashes literal in a Unix-style shell on Windows)
yara -r my_rules.yar 'C:\Windows\System32' 2>/dev/null | wc -l
# If > 0: you have false positives. Refine the rule.

# Scan a collection of known malware
yara -r my_rules.yar /malware/samples/ | wc -l
# Verify that the expected samples are detected

yaraQA (Florian Roth)

# Check rule quality
# (installation method and flags may vary by version; check the yaraQA README)
pip install yaraqa
yaraqa -r my_rules.yar
# Reports: rules without meta, weak conditions, possible issues

This content is intended exclusively for education and research in defensive cybersecurity. No malicious binaries or executable payloads are provided. Misuse of this information is the sole responsibility of the user.