# ThreatHunter: SIEM-Lite (project.md)
## Overarching Goals (Read This First)
ThreatHunter is a **SIEM-lite visibility and reputation-correlation system** designed to:
1. Run with **minimal endpoint footprint** (CPU, disk, memory).
2. Avoid constant streaming or always-on services on endpoints.
3. **Process data locally** and upload only meaningful security signals (“hits”).
4. Integrate directly into **Syncro** as a first-class **Security** alert/ticket.
5. Provide clear value to SMB customers while remaining technician-grade internally.

Non-Goals (v1):
- Full packet capture or IDS/IPS
- Replacing EDR or enterprise SIEMs
- Continuous real-time streaming agents
- Deep behavioral analytics

Positioning:
> “See the suspicious outbound traffic your computers are making — without installing heavy security software.”
---
## High-Level Architecture
**Endpoint (PowerShell, scheduled via Syncro)**
- Collects Windows Firewall / WFP events (captures short-lived connections)
- Buffers events locally (30-60 minutes)
- Performs local correlation against a reputation hotlist
- Hashes executables **only on confirmed hits**
- Uploads only hit payloads to VPS

**VPS (ThreatHunter Server)**
- Receives hit payloads
- Stores and correlates hits per tenant/device
- Creates **Syncro Security tickets/alerts**
- Hosts ThreatHunter Web UI (tech + customer)
---
## Endpoint Design
### Execution Model
- PowerShell script
- Runs every 5 minutes via Syncro
- Exits after each run (no persistent service)
### Data Source
- Windows Filtering Platform / Firewall events from Security Event Log
- Captures allowed/blocked connection events, including short-lived connections
### Local Storage Layout
```
C:\ProgramData\ThreatHunter\
├── config.json
├── state.json
├── buffer\
│   └── events_YYYYMMDD_HH.jsonl
├── cache\
│   └── hashcache.json
└── logs\
    └── threathunter.log
```
### Buffering Rules
- Append-only JSONL files
- Retain 30-60 minutes of data
- Prune by time, not size
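
The buffering rules above can be sketched as follows. This is an illustration in Python only; the endpoint itself is a PowerShell script. The function names `buffer_path`, `append_event`, and `prune_buffer`, the 60-minute retention default, and the assumption that file names encode UTC hours are all hypothetical and not part of the design.

```python
import json
from datetime import datetime, timedelta, timezone
from pathlib import Path


def buffer_path(buffer_dir: Path, ts: datetime) -> Path:
    """Return the hourly buffer file for a timestamp (events_YYYYMMDD_HH.jsonl)."""
    return buffer_dir / f"events_{ts:%Y%m%d_%H}.jsonl"


def append_event(buffer_dir: Path, event: dict, ts: datetime) -> Path:
    """Append one event as a single JSON line; existing lines are never rewritten."""
    buffer_dir.mkdir(parents=True, exist_ok=True)
    path = buffer_path(buffer_dir, ts)
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(event, separators=(",", ":")) + "\n")
    return path


def prune_buffer(
    buffer_dir: Path,
    now: datetime,
    retain: timedelta = timedelta(minutes=60),  # assumed upper bound of the 30-60 minute range
) -> list[Path]:
    """Delete hourly files whose whole hour ended at or before the retention cutoff."""
    cutoff = now - retain
    removed = []
    for path in sorted(buffer_dir.glob("events_*.jsonl")):
        # The file name encodes the start of the hour it covers (assumed UTC).
        hour_start = datetime.strptime(path.stem, "events_%Y%m%d_%H").replace(tzinfo=timezone.utc)
        if hour_start + timedelta(hours=1) <= cutoff:
            path.unlink()
            removed.append(path)
    return removed
```

Pruning whole hourly files by the time encoded in their names keeps the prune step independent of file size, which is one way to satisfy "prune by time, not size".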
### Processing Window
- Correlate when:
  - Oldest buffered event ≥ 30 minutes
  - OR buffer size threshold reached
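
A minimal sketch of the trigger condition above, in Python for illustration. The function name `should_correlate` and the 5 MiB size threshold are assumptions; the design does not specify a value for the buffer size threshold.

```python
from datetime import datetime, timedelta
from typing import Optional


def should_correlate(
    oldest_event: Optional[datetime],
    buffer_bytes: int,
    now: datetime,
    min_age: timedelta = timedelta(minutes=30),
    max_bytes: int = 5 * 1024 * 1024,  # assumed threshold, not from the design
) -> bool:
    """Return True when the buffer is due for a correlation pass."""
    if oldest_event is None:
        return False  # nothing buffered yet
    aged_out = now - oldest_event >= min_age
    too_large = buffer_bytes >= max_bytes
    return aged_out or too_large
```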
---
## Local Correlation Logic
1. Read buffered events
2. Extract **unique remote IPs**
3. Load local hotlist into memory (once per run)
4. Match remote IPs against hotlist
5. For matches:
   - Group by `(remote_ip, process_name)`
   - Calculate first/last seen and count
6. Enrich:
   - Resolve executable path (best-effort)
   - Compute SHA-256 hash **only on hit**
   - Use local hash cache to avoid rehashing
7. Upload hit payload(s) to VPS
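
The grouping and hashing steps above could look roughly like this, sketched in Python for illustration (the endpoint itself is PowerShell). The names `correlate` and `sha256_cached`, the event field `timestamp_utc`, and the cache key built from path, size, and modification time are assumptions. Timestamps are assumed to be ISO 8601 UTC strings in one consistent format, so they compare chronologically as plain strings.

```python
import hashlib
import os
from typing import Optional


def correlate(events: list[dict], hotlist: set[str]) -> list[dict]:
    """Steps 2-5: match unique remote IPs, then group hits by (remote_ip, process_name)."""
    matched_ips = {ev["remote_ip"] for ev in events} & hotlist
    groups: dict[tuple[str, str], dict] = {}
    for ev in events:
        if ev["remote_ip"] not in matched_ips:
            continue
        key = (ev["remote_ip"], ev["process_name"])
        ts = ev["timestamp_utc"]  # hypothetical field name
        group = groups.setdefault(key, {
            "remote_ip": key[0],
            "process_name": key[1],
            "first_seen_utc": ts,
            "last_seen_utc": ts,
            "hit_count": 0,
        })
        group["first_seen_utc"] = min(group["first_seen_utc"], ts)
        group["last_seen_utc"] = max(group["last_seen_utc"], ts)
        group["hit_count"] += 1
    return list(groups.values())


def sha256_cached(exe_path: str, cache: dict[str, str]) -> Optional[str]:
    """Step 6: hash an executable, reusing the cached digest while the file is unchanged."""
    try:
        st = os.stat(exe_path)
        key = f"{exe_path}|{st.st_size}|{st.st_mtime_ns}"
        if key not in cache:
            digest = hashlib.sha256()
            with open(exe_path, "rb") as fh:
                for chunk in iter(lambda: fh.read(1 << 20), b""):
                    digest.update(chunk)
            cache[key] = digest.hexdigest()
        return cache[key]
    except OSError:
        return None  # best-effort: missing or unreadable file
```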
---
## Hotlist Strategy
- Approx. 80,000 IPv4 addresses
- Stored locally as `hotlist.json.gz`
- Loaded into memory once per processing run
- Updated periodically from VPS
- Integrity verified via hash/signature
- TTL enforced (fall back to server confirmation if stale)

Future upgrade:
- Bloom filter + server confirmation
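
A sketch of loading and checking the hotlist, in Python for illustration. The internal layout of `hotlist.json.gz` is not specified here, so the `generated_utc` and `ips` fields, the function name `load_hotlist`, the 7-day TTL, and the idea that the expected SHA-256 digest is obtained separately from the VPS are all assumptions. Only the hash variant of "hash/signature" is shown.

```python
import gzip
import hashlib
import json
from datetime import datetime, timedelta


def load_hotlist(
    raw_gz: bytes,
    expected_sha256: str,
    now: datetime,
    max_age: timedelta = timedelta(days=7),  # assumed TTL, not from the design
) -> tuple[set[str], bool]:
    """Verify, decompress, and parse a hotlist; return (ip_set, is_stale)."""
    if hashlib.sha256(raw_gz).hexdigest() != expected_sha256:
        raise ValueError("hotlist integrity check failed")
    doc = json.loads(gzip.decompress(raw_gz))
    # Assumed layout: {"generated_utc": "<ISO 8601 with offset>", "ips": ["1.2.3.4", ...]}
    generated = datetime.fromisoformat(doc["generated_utc"])
    is_stale = now - generated > max_age
    # A set gives constant-time membership checks for the ~80,000 addresses.
    return set(doc["ips"]), is_stale
```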
---
## Hit Payload Schema (Endpoint → VPS)
```json
{
  "tenant_id": "...",
  "device_id": "...",
  "hostname": "...",
  "agent_version": "1.0",
  "hotlist_version": "20250101",
  "hits": [
    {
      "remote_ip": "1.2.3.4",
      "remote_port": 443,
      "protocol": "tcp",
      "process_name": "bad.exe",
      "pid": 1234,
      "exe_path": "C:\\Path\\bad.exe",
      "sha256": "...",
      "first_seen_utc": "...",
      "last_seen_utc": "...",
      "hit_count": 42,
      "sample_events": []
    }
  ]
}
```
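
A receiver might sanity-check incoming payloads against this schema before storing them. The sketch below is in Python for illustration and is not the server's actual validation. Which fields are required is an assumption: `pid`, `exe_path`, `sha256`, and `sample_events` are treated as optional because enrichment is best-effort. The name `validate_payload` is hypothetical.

```python
import ipaddress

# Assumed split between required and optional fields; the schema does not state it.
REQUIRED_TOP = ("tenant_id", "device_id", "hostname", "agent_version", "hotlist_version", "hits")
REQUIRED_HIT = ("remote_ip", "remote_port", "protocol", "process_name",
                "first_seen_utc", "last_seen_utc", "hit_count")


def validate_payload(payload: dict) -> list[str]:
    """Return a list of problems; an empty list means the payload looks well-formed."""
    errors = [f"missing field: {name}" for name in REQUIRED_TOP if name not in payload]
    hits = payload.get("hits", [])
    if not isinstance(hits, list):
        return errors + ["hits must be a list"]
    for i, hit in enumerate(hits):
        if not isinstance(hit, dict):
            errors.append(f"hits[{i}]: must be an object")
            continue
        errors += [f"hits[{i}]: missing field: {name}" for name in REQUIRED_HIT if name not in hit]
        if "remote_ip" in hit:
            try:
                ipaddress.IPv4Address(hit["remote_ip"])
            except ValueError:
                errors.append(f"hits[{i}]: remote_ip is not a valid IPv4 address")
        port = hit.get("remote_port")
        if port is not None and not (isinstance(port, int) and 1 <= port <= 65535):
            errors.append(f"hits[{i}]: remote_port out of range")
    return errors
```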
---
## VPS Responsibilities
### API
- `POST /api/v1/hits`
- Authenticate tenant/device
- Store hits in database
### Syncro Integration
- On new hit:
  - Create or update **Syncro ticket**
  - Category: **Security**
  - Tags: `Security`, `ThreatHunter`
- Deduplicate tickets using `(tenant_id, device_id, remote_ip, process_name, sha256)` within a time window
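
One way the deduplication rule could work is sketched below in Python. The class `TicketDeduplicator`, the function `dedup_key`, the 24-hour window, and the in-memory dictionary (standing in for the hits database) are assumptions. The window here is measured from when the ticket was created, which is only one possible reading of "within a time window".

```python
from datetime import datetime, timedelta


def dedup_key(tenant_id: str, device_id: str, hit: dict) -> tuple:
    """Build the deduplication tuple; sha256 may be absent when hashing failed."""
    return (tenant_id, device_id, hit["remote_ip"], hit["process_name"], hit.get("sha256"))


class TicketDeduplicator:
    """Decide whether a hit opens a new ticket or updates an existing one."""

    def __init__(self, window: timedelta = timedelta(hours=24)):  # assumed window
        self.window = window
        self._opened_at: dict[tuple, datetime] = {}

    def decide(self, key: tuple, now: datetime) -> str:
        opened = self._opened_at.get(key)
        if opened is not None and now - opened < self.window:
            return "update"  # same key seen inside the window
        self._opened_at[key] = now
        return "create"
```

A caller would then map `"create"` and `"update"` onto the corresponding Syncro ticket operations.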
### Web UI
- Technician view: all tenants, full detail
- Customer view: tenant-only, simplified fields
- Key pages:
  - Dashboard
  - Hit list
  - Hit detail
  - Device detail
---
## Milestones
- **M1**: Endpoint pipeline works end-to-end
- **M2**: VPS API + DB + basic UI
- **M3**: Syncro Security ticket creation
- **M4**: Tuning, allowlists, suppression, polish
---
## Definition of Done (v1)
- Endpoint impact is negligible
- Only hit data is uploaded
- Syncro Security tickets created reliably
- Techs can triage from Syncro and drill into ThreatHunter UI