# ThreatHunter: SIEM‑Lite (project.md)

## Overarching Goals (Read This First)

ThreatHunter is a **SIEM‑lite visibility and reputation‑correlation system** designed to:

1. Run with **minimal endpoint footprint** (CPU, disk, memory).
2. Avoid constant streaming or always‑on services on endpoints.
3. **Process data locally** and upload only meaningful security signals (“hits”).
4. Integrate directly into **Syncro** as a first‑class **Security** alert/ticket.
5. Provide clear value to SMB customers while remaining technician‑grade internally.

Non-Goals (v1):

- Full packet capture or IDS/IPS
- Replacing EDR or enterprise SIEMs
- Continuous real‑time streaming agents
- Deep behavioral analytics

Positioning:

> “See the suspicious outbound traffic your computers are making, without installing heavy security software.”

---

## High‑Level Architecture

**Endpoint (PowerShell, scheduled via Syncro)**

- Collects Windows Firewall / WFP events (short‑lived connection friendly)
- Buffers events locally (30–60 minutes)
- Performs local correlation against a reputation hotlist
- Hashes executables **only on confirmed hits**
- Uploads only hit payloads to VPS

**VPS (ThreatHunter Server)**

- Receives hit payloads
- Stores and correlates hits per tenant/device
- Creates **Syncro Security tickets/alerts**
- Hosts ThreatHunter Web UI (tech + customer)

---

## Endpoint Design

### Execution Model

- PowerShell script
- Runs every 5 minutes via Syncro
- Exits after each run (no persistent service)

### Data Source

- Windows Filtering Platform / Firewall events from Security Event Log
- Captures allowed/blocked connection events including short‑lived connections

### Local Storage Layout

```
C:\ProgramData\ThreatHunter\
├── config.json
├── state.json
├── buffer\
│   └── events_YYYYMMDD_HH.jsonl
├── cache\
│   └── hashcache.json
└── logs\
    └── threathunter.log
```

### Buffering Rules

- Append‑only JSONL files
- Retain 30–60 minutes of data
- Prune by time, not size

### Processing Window

- Correlate when:
  - Oldest buffered event is ≥ 30 minutes old
  - OR buffer size threshold reached

---

## Local Correlation Logic

1. Read buffered events
2. Extract **unique remote IPs**
3. Load local hotlist into memory (once per run)
4. Match remote IPs against hotlist
5. For matches:
   - Group by `(remote_ip, process_name)`
   - Calculate first/last seen and count
6. Enrich:
   - Resolve executable path (best‑effort)
   - Compute SHA256 hash **only on hit**
   - Use local hash cache to avoid re‑hashing
7. Upload hit payload(s) to VPS

---

## Hotlist Strategy

- Approx. 80,000 IPv4 addresses
- Stored locally as `hotlist.json.gz`
- Loaded into memory once per processing run
- Updated periodically from VPS
- Integrity verified via hash/signature
- TTL enforced (fallback to server confirmation if stale)

Future upgrade:

- Bloom filter + server confirmation

---

## Hit Payload Schema (Endpoint → VPS)

```json
{
  "tenant_id": "...",
  "device_id": "...",
  "hostname": "...",
  "agent_version": "1.0",
  "hotlist_version": "2025-01-01",
  "hits": [
    {
      "remote_ip": "1.2.3.4",
      "remote_port": 443,
      "protocol": "tcp",
      "process_name": "bad.exe",
      "pid": 1234,
      "exe_path": "C:\\Path\\bad.exe",
      "sha256": "...",
      "first_seen_utc": "...",
      "last_seen_utc": "...",
      "hit_count": 42,
      "sample_events": []
    }
  ]
}
```

---

## VPS Responsibilities

### API

- `POST /api/v1/hits`
- Authenticate tenant/device
- Store hits in database

### Syncro Integration

- On new hit:
  - Create or update **Syncro ticket**
  - Category: **Security**
  - Tags: `Security`, `ThreatHunter`
- Deduplicate tickets using:
  `(tenant_id, device_id, remote_ip, process_name, sha256)` within time window

### Web UI

- Technician view: all tenants, full detail
- Customer view: tenant‑only, simplified fields
- Key pages:
  - Dashboard
  - Hit list
  - Hit detail
  - Device detail

---

## Milestones

- **M1**: Endpoint pipeline works end‑to‑end
- **M2**: VPS API + DB + basic UI
- **M3**: Syncro Security ticket creation
- **M4**: Tuning, allowlists, suppression, polish

---

## Definition of Done (v1)

- Endpoint impact is negligible
- Only hit data is uploaded
- Syncro Security tickets created reliably
- Techs can triage from Syncro and drill into ThreatHunter UI
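
---

## Appendix: Ticket Deduplication Sketch

The Syncro Integration section says tickets are deduplicated on `(tenant_id, device_id, remote_ip, process_name, sha256)` within a time window, but leaves the window and the mechanics open. The following is a minimal sketch of one way that rule could behave, not a definitive implementation. Everything beyond the key tuple is an assumption made for illustration: Python as the server language, the 24-hour `DEDUP_WINDOW`, the sliding-window behavior (each update extends the window), lowercasing `process_name`, treating a missing `sha256` as an empty string, and the names `TicketDeduplicator`, `OpenTicket`, and `decide`. The in-memory dictionary stands in for the VPS database, and the integer ticket IDs stand in for whatever identifier Syncro returns.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional, Tuple

# Hypothetical value: the spec only says "within time window".
DEDUP_WINDOW = timedelta(hours=24)

# (tenant_id, device_id, remote_ip, process_name, sha256), as listed in the spec.
DedupKey = Tuple[str, str, str, str, str]


@dataclass
class OpenTicket:
    ticket_id: int       # placeholder for the ID a ticketing system would return
    last_seen: datetime  # time of the most recent hit folded into this ticket
    hit_count: int       # running total of hit_count values from payloads


class TicketDeduplicator:
    """In-memory model of the dedup rule; a real server would back this with its DB."""

    def __init__(self, window: timedelta = DEDUP_WINDOW) -> None:
        self.window = window
        self._open: Dict[DedupKey, OpenTicket] = {}
        self._next_id = 1

    @staticmethod
    def key_for(payload: dict, hit: dict) -> DedupKey:
        return (
            payload["tenant_id"],
            payload["device_id"],
            hit["remote_ip"],
            hit["process_name"].lower(),  # assumption: compare names case-insensitively
            hit.get("sha256") or "",      # assumption: hash may be absent
        )

    def decide(
        self, payload: dict, hit: dict, now: Optional[datetime] = None
    ) -> Tuple[str, int]:
        """Return ("update", id) for a repeat inside the window, else ("create", id)."""
        now = now or datetime.now(timezone.utc)
        key = self.key_for(payload, hit)
        ticket = self._open.get(key)
        if ticket is not None and now - ticket.last_seen <= self.window:
            # Same key seen recently: fold this hit into the existing ticket.
            ticket.last_seen = now
            ticket.hit_count += hit.get("hit_count", 1)
            return ("update", ticket.ticket_id)
        # New key, or the previous ticket aged out of the window: open a new one.
        ticket = OpenTicket(self._next_id, now, hit.get("hit_count", 1))
        self._next_id += 1
        self._open[key] = ticket
        return ("create", ticket.ticket_id)
```

In this model, the `POST /api/v1/hits` handler would call `decide(payload, hit)` once per entry in `hits` and then either create a Syncro ticket or update the existing one. A fixed window measured from ticket creation would be an equally valid reading of the spec; the sliding window is used here only because it keeps a continuously active hit on a single ticket.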