ThreatHunter: SIEM-Lite (project.md)

Overarching Goals (Read This First)

ThreatHunter is a SIEM-lite visibility and reputation-correlation system designed to:

  1. Run with minimal endpoint footprint (CPU, disk, memory).
  2. Avoid constant streaming or always-on services on endpoints.
  3. Process data locally and upload only meaningful security signals (“hits”).
  4. Integrate directly into Syncro as a first-class Security alert/ticket.
  5. Provide clear value to SMB customers while remaining technician-grade internally.

Non-Goals (v1):

  • Full packet capture or IDS/IPS
  • Replacing EDR or enterprise SIEMs
  • Continuous real-time streaming agents
  • Deep behavioral analytics

Positioning:

“See the suspicious outbound traffic your computers are making — without installing heavy security software.”


High-Level Architecture

Endpoint (PowerShell, scheduled via Syncro)

  • Collects Windows Firewall / WFP events (short-lived connection friendly)
  • Buffers events locally (30-60 minutes)
  • Performs local correlation against a reputation hotlist
  • Hashes executables only on confirmed hits
  • Uploads only hit payloads to VPS

VPS (ThreatHunter Server)

  • Receives hit payloads
  • Stores and correlates hits per tenant/device
  • Creates Syncro Security tickets/alerts
  • Hosts ThreatHunter Web UI (tech + customer)

Endpoint Design

Execution Model

  • PowerShell script
  • Runs every 5 minutes via Syncro
  • Exits after each run (no persistent service)

Data Source

  • Windows Filtering Platform / Firewall events from Security Event Log
  • Captures allowed/blocked connection events, including short-lived connections

Local Storage Layout

C:\ProgramData\ThreatHunter\
├── config.json
├── state.json
├── buffer\
│   └── events_YYYYMMDD_HH.jsonl
├── cache\
│   └── hashcache.json
└── logs\
    └── threathunter.log

Buffering Rules

  • Append-only JSONL files
  • Retain 30-60 minutes of data
  • Prune by time, not size
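
The buffering rules above can be sketched as follows. This is an illustration in Python rather than the PowerShell the endpoint actually runs, and it makes several assumptions: the 60-minute retention constant, the function names, and the choice to prune whole hourly files are all hypothetical. Only the events_YYYYMMDD_HH.jsonl naming comes from the storage layout.

```python
import json
from datetime import datetime, timedelta, timezone
from pathlib import Path

# Assumption: use the upper bound of the 30-60 minute retention range.
RETENTION = timedelta(minutes=60)


def buffer_file(buffer_dir: Path, ts: datetime) -> Path:
    """Return the hourly buffer file for a timestamp (events_YYYYMMDD_HH.jsonl)."""
    return buffer_dir / f"events_{ts:%Y%m%d_%H}.jsonl"


def append_event(buffer_dir: Path, event: dict, ts: datetime) -> Path:
    """Append one event as a single JSON line; existing lines are never rewritten."""
    buffer_dir.mkdir(parents=True, exist_ok=True)
    path = buffer_file(buffer_dir, ts)
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(event) + "\n")
    return path


def prune_buffer(buffer_dir: Path, now: datetime) -> list[Path]:
    """Delete hourly files whose entire hour ended before the retention cutoff."""
    removed = []
    cutoff = now - RETENTION
    for path in sorted(buffer_dir.glob("events_*.jsonl")):
        stamp = path.stem.removeprefix("events_")
        hour_start = datetime.strptime(stamp, "%Y%m%d_%H").replace(tzinfo=timezone.utc)
        # Pruning is decided from the timestamp in the file name, never from file size.
        if hour_start + timedelta(hours=1) <= cutoff:
            path.unlink()
            removed.append(path)
    return removed
```

Because whole hourly files are deleted, this sketch can keep events for up to roughly two hours. Holding strictly to a 60-minute ceiling would need finer-grained file rotation or filtering by event timestamp at read time.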

Processing Window

  • Correlate when:
    • Oldest buffered event ≥ 30 minutes
    • OR buffer size threshold reached
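
A minimal sketch of that trigger, in Python for illustration only. The 30-minute age comes from the rule above; the byte threshold and the function name are assumptions, since the document does not say whether the size threshold is measured in bytes or in events.

```python
from datetime import datetime, timedelta, timezone

MIN_AGE = timedelta(minutes=30)       # from the processing-window rule
MAX_BUFFER_BYTES = 5 * 1024 * 1024    # hypothetical value; not specified in this document


def should_correlate(oldest_event_utc, buffer_bytes, now):
    """Return True when either trigger fires: event age or buffer size."""
    if oldest_event_utc is not None and now - oldest_event_utc >= MIN_AGE:
        return True
    return buffer_bytes >= MAX_BUFFER_BYTES
```

A run that finds neither condition met would simply append new events and exit, which keeps most 5-minute runs cheap.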

Local Correlation Logic

  1. Read buffered events
  2. Extract unique remote IPs
  3. Load local hotlist into memory (once per run)
  4. Match remote IPs against hotlist
  5. For matches:
    • Group by (remote_ip, process_name)
    • Calculate first/last seen and count
  6. Enrich:
    • Resolve executable path (best-effort)
    • Compute SHA-256 hash only on hit
    • Use local hash cache to avoid rehashing
  7. Upload hit payload(s) to VPS
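
Steps 2 through 6 could look roughly like the sketch below (Python for illustration; the agent itself is PowerShell). The event field time_utc, the function names, and the cache entry shape are hypothetical. Timestamps are assumed to be ISO 8601 UTC strings in one consistent format, so plain string comparison orders them correctly.

```python
import hashlib
from pathlib import Path


def group_hits(events, hotlist):
    """Group events whose remote_ip is on the hotlist by (remote_ip, process_name)."""
    groups = {}
    for ev in events:
        if ev["remote_ip"] not in hotlist:
            continue  # non-matching events never leave the endpoint
        key = (ev["remote_ip"], ev["process_name"])
        g = groups.setdefault(key, {
            "remote_ip": ev["remote_ip"],
            "process_name": ev["process_name"],
            "first_seen_utc": ev["time_utc"],
            "last_seen_utc": ev["time_utc"],
            "hit_count": 0,
        })
        g["first_seen_utc"] = min(g["first_seen_utc"], ev["time_utc"])
        g["last_seen_utc"] = max(g["last_seen_utc"], ev["time_utc"])
        g["hit_count"] += 1
    return list(groups.values())


def sha256_cached(exe_path, cache):
    """Hash a file, reusing the cached digest while size and mtime are unchanged."""
    st = Path(exe_path).stat()
    fingerprint = [st.st_size, st.st_mtime_ns]
    entry = cache.get(str(exe_path))
    if entry and entry["fingerprint"] == fingerprint:
        return entry["sha256"]
    digest = hashlib.sha256()
    with open(exe_path, "rb") as fh:
        # Read in 1 MiB chunks so large executables are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    cache[str(exe_path)] = {"fingerprint": fingerprint, "sha256": digest.hexdigest()}
    return digest.hexdigest()
```

Here cache is a plain dict that would be loaded from and saved back to cache\hashcache.json. Keying the cache on size and modification time is one possible invalidation rule, not something this document prescribes.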

Hotlist Strategy

  • Approx. 80,000 IPv4 addresses
  • Stored locally as hotlist.json.gz
  • Loaded into memory once per processing run
  • Updated periodically from VPS
  • Integrity verified via hash/signature
  • TTL enforced (fallback to server confirmation if stale)
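
One way the load, integrity check, and TTL check might fit together is sketched below in Python. The file layout (version, generated_utc, ips), the 24-hour TTL, and the function name are assumptions made for the example, and only the hash half of "hash/signature" is shown.

```python
import gzip
import hashlib
import json
from datetime import datetime, timedelta, timezone

HOTLIST_TTL = timedelta(hours=24)  # hypothetical; the document does not give a TTL


def load_hotlist(path, expected_sha256, now):
    """Verify, decompress, and load the hotlist; report whether it is past its TTL."""
    with open(path, "rb") as fh:
        raw = fh.read()
    # Integrity check on the compressed bytes, before any parsing.
    if hashlib.sha256(raw).hexdigest() != expected_sha256:
        raise ValueError("hotlist integrity check failed")
    doc = json.loads(gzip.decompress(raw))
    generated = datetime.fromisoformat(doc["generated_utc"])
    is_stale = now - generated > HOTLIST_TTL
    # A frozenset gives constant-time membership tests for the run's lookups.
    return frozenset(doc["ips"]), doc["version"], is_stale
```

When is_stale is true, the caller would fall back to server confirmation as described above rather than trusting local matches on their own.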

Future upgrade:

  • Bloom filter + server confirmation
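
For reference, a Bloom filter answers "definitely not present" or "possibly present", which is why it pairs with server confirmation: a local match would still need to be checked against the authoritative list. The sketch below is a generic textbook construction in Python using double hashing over a SHA-256 digest; the class name and parameters are illustrative and not part of the v1 design.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter: no false negatives, tunable false-positive rate."""

    def __init__(self, size_bits, num_hashes):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item):
        # Derive k bit positions from two 64-bit halves of one digest (double hashing).
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.size_bits for i in range(self.num_hashes)]

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))
```

At a 1% false-positive target a Bloom filter needs about 9.6 bits per entry with 7 hash functions, so roughly 80,000 addresses would fit in just under 100 KB.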

Hit Payload Schema (Endpoint → VPS)

{
  "tenant_id": "...",
  "device_id": "...",
  "hostname": "...",
  "agent_version": "1.0",
  "hotlist_version": "20250101",
  "hits": [
    {
      "remote_ip": "1.2.3.4",
      "remote_port": 443,
      "protocol": "tcp",
      "process_name": "bad.exe",
      "pid": 1234,
      "exe_path": "C:\\Path\\bad.exe",
      "sha256": "...",
      "first_seen_utc": "...",
      "last_seen_utc": "...",
      "hit_count": 42,
      "sample_events": []
    }
  ]
}
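
A server-side shape check for this payload might look like the following Python sketch. It assumes every field shown in the schema is required and typed as in the example, which the document does not state; since executable path resolution is best-effort, a real validator may need to allow exe_path and sha256 to be absent. The function and constant names are hypothetical.

```python
TOP_LEVEL_FIELDS = {
    "tenant_id": str, "device_id": str, "hostname": str,
    "agent_version": str, "hotlist_version": str, "hits": list,
}
HIT_FIELDS = {
    "remote_ip": str, "remote_port": int, "protocol": str,
    "process_name": str, "pid": int, "exe_path": str, "sha256": str,
    "first_seen_utc": str, "last_seen_utc": str,
    "hit_count": int, "sample_events": list,
}


def _check(obj, spec, prefix):
    """Return one error string per missing or wrongly typed field."""
    errors = []
    for name, typ in spec.items():
        if name not in obj:
            errors.append(f"{prefix}{name}: missing")
        elif not isinstance(obj[name], typ):
            errors.append(f"{prefix}{name}: expected {typ.__name__}")
    return errors


def validate_payload(payload):
    """Validate the top-level fields, then each entry of hits; empty list means valid."""
    errors = _check(payload, TOP_LEVEL_FIELDS, "")
    hits = payload.get("hits")
    if isinstance(hits, list):
        for i, hit in enumerate(hits):
            if isinstance(hit, dict):
                errors += _check(hit, HIT_FIELDS, f"hits[{i}].")
            else:
                errors.append(f"hits[{i}]: expected object")
    return errors
```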

VPS Responsibilities

API

  • POST /api/v1/hits
  • Authenticate tenant/device
  • Store hits in database

Syncro Integration

  • On new hit:
    • Create or update Syncro ticket
    • Category: Security
    • Tags: Security, ThreatHunter
  • Deduplicate tickets using: (tenant_id, device_id, remote_ip, process_name, sha256) within time window
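
The deduplication rule can be illustrated with a small in-memory Python sketch. The 24-hour window, the class and method names, and the decision to extend the window on every repeat hit are all assumptions; a real server would keep this state in its database and call the Syncro API where the comments indicate.

```python
from datetime import datetime, timedelta, timezone

DEDUP_WINDOW = timedelta(hours=24)  # hypothetical; the document only says "within time window"


def dedup_key(tenant_id, device_id, hit):
    """Build the five-part key named in the deduplication rule."""
    return (tenant_id, device_id, hit["remote_ip"], hit["process_name"], hit["sha256"])


class TicketDeduplicator:
    def __init__(self, window=DEDUP_WINDOW):
        self.window = window
        self._tickets = {}  # key -> (ticket_id, last_seen)

    def decide(self, key, now):
        """Return ("update", ticket_id) inside the window, otherwise ("create", None)."""
        entry = self._tickets.get(key)
        if entry and now - entry[1] <= self.window:
            # Repeat hit: refresh last_seen so the window slides forward.
            self._tickets[key] = (entry[0], now)
            return ("update", entry[0])
        return ("create", None)

    def record(self, key, ticket_id, now):
        """Remember the ticket created for this key (after the Syncro call succeeds)."""
        self._tickets[key] = (ticket_id, now)
```

Including sha256 in the key means a changed binary under the same process name opens a new ticket instead of being folded into the old one.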

Web UI

  • Technician view: all tenants, full detail
  • Customer view: tenant-only, simplified fields
  • Key pages:
    • Dashboard
    • Hit list
    • Hit detail
    • Device detail

Milestones

M1: Endpoint pipeline works end-to-end
M2: VPS API + DB + basic UI
M3: Syncro Security ticket creation
M4: Tuning, allowlists, suppression, polish


Definition of Done (v1)

  • Endpoint impact is negligible
  • Only hit data is uploaded
  • Syncro Security tickets created reliably
  • Techs can triage from Syncro and drill into ThreatHunter UI