Write-ups
DE-001:
TryHackMe - Introduction to Detection Engineering
What was the task?
Complete TryHackMe's "Introduction to Detection Engineering" room covering detection types, the detection engineering lifecycle, key frameworks (MITRE ATT&CK, Alerting and Detection Strategy (ADS), Detection Maturity Level (DML)), and applying the ADS Framework to an Active Directory privilege escalation detection scenario.
What did I observe?
Detection engineering operates on four detection types across two categories: Environment-based (Configuration and Modelling) and Threat-based (Indicator and Threat Behaviour). Configuration detection is easiest to implement in static environments but breaks in dynamic infrastructure. Indicator-based detection (Indicators of Compromise (IOCs), IP lists) is fastest to deploy but becomes obsolete as adversaries change infrastructure. Threat behaviour detection focuses on Tactics, Techniques and Procedures (TTPs) and withstands adversary evolution but requires extensive baseline data.
The Detection as Code approach treats detection rules like software with version control and Continuous Integration/Continuous Deployment (CI/CD) pipelines. Most Security Information and Event Management (SIEM) systems lack native version control for detection rules, which means changes are untracked and difficult to audit.
The ADS Framework mandates nine stages before production deployment: Goal, Categorisation, Strategy Abstract, Technical Context, Blind Spots & Assumptions, False Positives, Validation, Priority, Response. The Validation stage requires maintaining test scripts that generate true-positive events alongside detection rules.
Ryan Stillions' DML model shows most organisations operate at DML-1/DML-2 (atomic indicators and artifacts), which he describes as "chasing vapor trails". Higher maturity levels (DML-5+ for techniques, tactics, strategy) require progressively more contextual intelligence and abstraction.
The practical exercise demonstrated detecting unusual PowerShell host processes by monitoring for System.Management.Automation.dll loading into non-standard processes, a common Operational Security (OPSEC)-friendly technique adversaries use to avoid detection.
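As a sketch, this detection can be expressed as a Sigma rule over image-load telemetry (e.g. Sysmon Event ID 7). The field names follow common Sigma conventions for the image_load category, and the allow-list of known PowerShell hosts is illustrative, not exhaustive:

```yaml
title: PowerShell Engine Loaded by Non-Standard Process
status: experimental
description: Sketch - flags System.Management.Automation.dll loading outside known PowerShell hosts
logsource:
  category: image_load
  product: windows
detection:
  selection:
    ImageLoaded|endswith:
      - '\System.Management.Automation.dll'
      - '\System.Management.Automation.ni.dll'
  filter_known_hosts:
    Image|endswith:
      - '\powershell.exe'
      - '\powershell_ise.exe'
      - '\pwsh.exe'
  condition: selection and not filter_known_hosts
falsepositives:
  - Legitimate applications that embed the PowerShell engine
level: medium
```

In a real environment the filter list would need tuning, since legitimate software (management agents, IDEs) also hosts the PowerShell engine.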
What I learned?
Detection engineering is continuous iteration, not one-time rule deployment. Rules must be modified as attack vectors and environments change. Combining detection types creates stronger coverage than relying on a single approach, e.g., pairing baseline modelling with configuration detection reduces false positives.
The ADS Framework forces you to think through the entire detection lifecycle before writing rules. The "Blind Spots & Assumptions" stage is counterintuitive but critical: explicitly documenting where your detection can fail enables risk assessment and compensating controls. For the PowerShell example, assumptions included the endpoint tools running correctly, logs being forwarded, and the SIEM indexing them successfully.
Validation is non-negotiable. Every detection should have accompanying test commands that generate true-positive events. Without this, you're deploying unverified rules into production.
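A minimal sketch of what such a validation script can look like, using a synthetic log line as the true-positive event and a simplified match pattern (both are illustrative, not the room's actual rule):

```shell
# Generate a synthetic true-positive event and confirm the detection's
# match logic fires on it. In a real pipeline this event would be injected
# into the log source and the alert checked in the SIEM.
event='Jan 10 12:00:00 host sshd[999]: Failed password for invalid user test from 203.0.113.5 port 4444 ssh2'

if echo "$event" | grep -Eq 'Failed password for (invalid user )?[^ ]+ from [^ ]+ port [0-9]+'; then
  echo "validation: detection pattern matched the generated event"
else
  echo "validation FAILED: pattern did not match" >&2
fi
```

Keeping a script like this next to the rule in version control means every rule change can be re-validated automatically.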
The DML model revealed that most organisational resources concentrate on IOC feeds (DML-1), which are retroactive and dependent on the adversary's rate of change. Moving toward behaviour-based detection (DML-5+) provides better return on investment (ROI) but requires baseline data and more implementation effort.
Detection as Code (DaC) solves SIEM limitations by treating rules as version-controlled code. Sigma rules are vendor-agnostic and deployable across multiple SIEMs, which enables code reusability across detections.
What is the security-relevant takeaway?
Indicator-based detection (IP lists, hash values) is reactive and short-lived. Adversaries change infrastructure faster than organisations update indicator feeds. Investing disproportionately in DML-1/DML-2 approaches creates a maturity gap where detection always lags behind adversary capabilities.
Behaviour-based detection targeting TTPs provides sustainable defensive advantage because adversaries cannot easily change tactics without significant operational cost (Pyramid of Pain principle). However, this requires accurate baselines and cross-departmental collaboration to define normal behaviour.
Before deploying any detection rule, complete the ADS Framework validation stage with executable test commands. Untested detections generate alert fatigue when false positives fire, or worse, fail silently when real threats occur. Maintain validation scripts alongside detection rules in version control.
Map detection gaps proactively using MITRE ATT&CK during threat modelling, not reactively after incidents. Identify which TTPs your environment is vulnerable to, determine required data sources, then assess collection coverage. Missing log sources create blind spots that adversaries can exploit.
Platform
tryhackme.com/room/introtodetectionengineering
LOG-001:
TryHackMe - Intro to Logs
What was the task?
Complete TryHackMe's "Intro to Logs" room, a beginner module covering log types, formats, standards, collection, storage, retention, deletion and a hands-on exercise using rsyslog and Unix tools to collect, parse, consolidate and analyse logs from multiple sources.
What did I observe?
Logs from different sources come in incompatible formats, e.g. nginx uses Combined Log Format, syslog uses its own timestamping, and application logs use JSON or XML. Before any analysis is possible, these need to be normalised to a common timestamp format. The practical exercise used awk and sed to rewrite timestamps from each source into a sortable format, then sort and uniq to deduplicate.
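As a sketch of that normalisation step, assuming a synthetic nginx Combined Log Format line, a single awk invocation can rewrite the bracketed timestamp into a sortable ISO-like prefix:

```shell
# Synthetic Combined Log Format line (not from a real server).
line='203.0.113.5 - - [10/Jan/2026:12:00:00 +0000] "GET / HTTP/1.1" 200 512'

# Split on the square brackets to isolate the timestamp field,
# then map the month name to a number and emit a sortable prefix.
norm=$(echo "$line" | awk -F'[][]' '{
  split($2, t, "[/: ]")   # t[1]=day, t[2]=month name, t[3]=year, t[4..6]=hh mm ss
  m = (index("JanFebMarAprMayJunJulAugSepOctNovDec", t[2]) + 2) / 3
  printf "%s-%02d-%02dT%s:%s:%s %s", t[3], m, t[1], t[4], t[5], t[6], $0
}')
echo "$norm"
```

Each source format needs its own variant of this logic; once every line carries the same sortable prefix, sort and uniq work across all sources.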
Time synchronisation (NTP) is a prerequisite. Without it, correlating events across systems is unreliable. Log retention follows a Hot/Warm/Cold model: recent logs in fast-query storage, older logs in a data lake, archived logs in compressed cold storage. GDPR (General Data Protection Regulation) requires unnecessary data to be removed, but deletions must be deliberate and auditable, with backups taken where retention is legally required. rsyslog can be configured per service (e.g. isolating all sshd messages to a dedicated file) with a single config file in /etc/rsyslog.d/.
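A minimal per-service configuration can look like the fragment below, using rsyslog's legacy property-based filter syntax; the filename and target path are examples, not the room's exact values:

```
# /etc/rsyslog.d/40-sshd.conf  (filename illustrative)
# Route everything logged by the sshd program to its own file,
# then stop further processing so it is not duplicated elsewhere.
:programname, isequal, "sshd"  /var/log/sshd.log
& stop
```

After adding a file like this, rsyslog needs to be restarted (e.g. via systemctl) for the rule to take effect.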
What I learned?
Centralisation is not optional. Without it, attack timelines cannot be reconstructed
across systems. Parsing and normalisation are required before any analysis, as raw logs
from heterogeneous sources are not directly comparable.
A single awk pipeline can normalise, extract, and reformat logs, but the logic differs per source format and must be written individually. Filtering after consolidation (e.g. by IP) is an effective way to reduce scope and focus on relevant events.
Log integrity matters: taking a hash (sha256sum) during collection is required if logs may later be used as evidence.
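The hash-then-verify workflow is short; a sketch with a stand-in file (the path is illustrative):

```shell
# Stand-in for a collected log file.
printf 'line one\nline two\n' > /tmp/consolidated.log

# Record the integrity hash at collection time...
sha256sum /tmp/consolidated.log > /tmp/consolidated.log.sha256

# ...and verify it before analysis; prints "/tmp/consolidated.log: OK" on success.
sha256sum -c /tmp/consolidated.log.sha256
```

In an evidentiary context the hash file itself should be stored separately from the logs it protects.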
What is the security-relevant takeaway?
Effective detection depends on log collection working correctly before it depends on
analysis tools. Missing centralisation, incorrect time synchronisation, or insufficient
log retention silently break event correlation.
Log rotation is a security control, not housekeeping. Logs rotated too aggressively may remove data needed for investigation, while logs that are never rotated create availability and compliance risks. Retention policies must balance forensic needs, storage limits and European/national data protection requirements.
Platform
tryhackme.com/room/introtologs
AUTH-001:
Authentication Log Analysis Tool (soc-auth-triage)
What was the task?
Build a command-line tool to analyse system authentication logs and extract
SSH-related authentication events. The script should identify failed login
attempts, targeted usernames, time-based attack patterns and potential
compromises by correlating failed logins with subsequent successful ones
from the same source IP.
What did I observe?
On a production VPS, a small number of source IPs were responsible for the majority of failed login attempts, each registering hundreds of attempts within hours. The most targeted account was root, followed by common service accounts like mysql and backup.
Attack activity clustered in specific hours rather than being spread evenly, which is consistent with automated tools, not manual attempts. Modern Linux systems use RFC 3339 timestamps and log Connection closed ... [preauth] instead of the traditional Failed password pattern. On some minimal cloud images, /var/log/auth.log doesn't exist at all and logs are only available through journalctl.
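A pattern covering both the classic and the pre-auth failure formats can be sketched like this, using two synthetic sshd lines (real message wording varies between OpenSSH versions):

```shell
# Count lines matching either failure format; expect both synthetic lines to match.
matches=$(printf '%s\n' \
  'Jan 10 12:00:00 host sshd[1]: Failed password for root from 203.0.113.5 port 4444 ssh2' \
  'Jan 10 12:00:01 host sshd[2]: Connection closed by invalid user admin 203.0.113.9 port 4445 [preauth]' |
  grep -Ec 'Failed password|Connection closed .*\[preauth\]')
echo "failure events matched: $matches"
```

Matching only the classic pattern silently undercounts failures on systems where most probes never reach password authentication.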
What I learned?
Depending on configuration, systemd-journald may forward logs to files via rsyslog or keep them only in the journal (read with journalctl).
fail2ban blocks attempts before they appear as Failed password entries, so pattern matching must also handle Connection closed ... [preauth].
Correlating different event types (failed vs. successful) in Bash requires process substitution, < <(...), to avoid subshell variable scope issues.
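A sketch of that correlation, using a synthetic auth log (the field positions and message wording are assumptions; real logs vary per distro and OpenSSH version):

```shell
# Synthetic sample log for illustration only.
cat > /tmp/auth_sample.log <<'EOF'
Jan 10 03:12:01 vps sshd[101]: Failed password for root from 203.0.113.5 port 4444 ssh2
Jan 10 03:12:03 vps sshd[102]: Failed password for invalid user admin from 203.0.113.5 port 4445 ssh2
Jan 10 03:15:40 vps sshd[103]: Accepted password for deploy from 203.0.113.5 port 4446 ssh2
Jan 10 04:00:00 vps sshd[104]: Failed password for root from 198.51.100.7 port 5555 ssh2
EOF

# Count failures per source IP. Process substitution keeps the while loop
# in the current shell, so the associative array survives the loop.
declare -A failed
while read -r count ip; do failed[$ip]=$count; done < <(
  awk '/Failed password/ { for (i = 1; i <= NF; i++) if ($i == "from") print $(i+1) }' /tmp/auth_sample.log |
  sort | uniq -c)

# Flag IPs that failed repeatedly and also logged in successfully.
# Note: this sketch ignores event ordering; a real tool should compare timestamps.
while read -r ip; do
  [ -n "${failed[$ip]:-}" ] && echo "REVIEW: $ip had ${failed[$ip]} failed attempts plus a successful login"
done < <(awk '/Accepted (password|publickey)/ { for (i = 1; i <= NF; i++) if ($i == "from") print $(i+1) }' /tmp/auth_sample.log | sort -u)
```

Writing the reader loop in the current shell (rather than piping into it) is what allows the failed array to be queried afterwards.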
Key learning: in security log analysis, simple Unix tools (grep, awk, cut, sort, uniq) combined in clear pipelines are often more readable and maintainable than complex monolithic regular expressions.
Beyond the technical implementation, I learned the importance of being explicit about assumptions when analysing logs. Log formats, available fields and even the presence of log files differ between systems, so tooling must fail safely and make its limitations visible rather than silently producing incomplete results. This reinforced that log analysis is less about writing clever parsing logic and more about systematically reducing uncertainty while avoiding false conclusions.
What is the security-relevant takeaway?
Automated scanning is constant background noise on any public-facing server.
When reviewing authentication logs, the most relevant correlation to check first is whether repeated failed login attempts from the same source IP are later followed by a successful login, since that pattern may indicate a successful brute-force attack and warrants closer inspection.
Time-based clustering helps distinguish automated scanning from normal user behaviour. Repeated attempts against non-existent accounts usually indicate reconnaissance, where an attacker is testing which usernames exist before trying valid credentials.
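Extracting the probed non-existent usernames is a quick way to surface this reconnaissance; a sketch against a synthetic sample (real log wording varies):

```shell
# Synthetic sample containing "invalid user" failures.
cat > /tmp/auth_recon.log <<'EOF'
Jan 10 03:12:03 vps sshd[102]: Failed password for invalid user admin from 203.0.113.5 port 4445 ssh2
Jan 10 03:12:09 vps sshd[105]: Failed password for invalid user oracle from 203.0.113.5 port 4446 ssh2
Jan 10 03:12:15 vps sshd[106]: Failed password for invalid user admin from 198.51.100.7 port 4447 ssh2
EOF

# Rank the usernames attackers probed that do not exist on the host.
counts=$(awk '/invalid user/ { for (i = 1; i <= NF; i++) if ($i == "user") print $(i+1) }' /tmp/auth_recon.log |
  sort | uniq -c | sort -rn)
echo "$counts"
```

A username appearing here that matches a real naming convention in your organisation is a stronger signal than generic guesses like admin or oracle.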
Repository
github.com/daniel-ploetzl/soc-auth-triage