Sloth Lee — a ninja sloth with nunchucks. The slowest ninja alive · watching your Discord

Feature deep-dive

AutoMod — calm-first moderation, deeper than you expected.

Most Discord bots' AutoMod is a list of regex toggles and a sensitivity slider. Sloth Lee's starts there and adds three more layers on top — and crucially, lets you review borderline calls instead of acting on them blind.

The four layers, in order

Layer 1 — Pattern matching

The fastest, most predictable layer. We ship with a baseline word list (slurs, common spam phrases, a curated set of known phishing URL patterns) and you add your own. Every rule has an action — log, warn, timeout, ban — and a condition (which channels, which roles are exempt).

This is where most servers spend their first day. It's predictable, easy to debug, and works without any AI. If someone says the F-slur in #general, layer 1 has them timed out before the other layers wake up.

Layer 2 — VADER sentiment

VADER (the Valence Aware Dictionary and sEntiment Reasoner) is a rules-based sentiment analyser tuned for short social-media text. We score every message's emotional valence and intensity. By itself, sentiment isn't moderation — but combined with pattern matching it identifies pile-ons: three or more negative-valence messages directed at the same user within 5 minutes trigger a review-queue entry.

Pile-ons are where text-based moderation usually fails. Each individual message is borderline; the cumulative effect is harassment. VADER catches the cumulative.
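The pile-on check can be sketched as a per-target sliding window over timestamps. In production the compound scores would come from the vaderSentiment package's SentimentIntensityAnalyzer; the negativity threshold below is an assumed value, and the count and window match the figures above:

```python
from collections import deque

NEG_THRESHOLD = -0.4   # assumed: VADER compound score below this counts as negative
WINDOW_SECONDS = 300   # the 5-minute pile-on window
PILE_ON_COUNT = 3      # negative messages needed to queue a review entry

# recent negative-message timestamps per targeted user
_recent: dict[str, deque] = {}

def record_message(target: str, compound: float, ts: float) -> bool:
    """Record a message aimed at `target`; return True when a pile-on triggers.

    `compound` is a VADER compound score in [-1, 1] — in production,
    SentimentIntensityAnalyzer().polarity_scores(text)["compound"].
    """
    if compound >= NEG_THRESHOLD:
        return False                       # not negative enough to count
    q = _recent.setdefault(target, deque())
    q.append(ts)
    while q and ts - q[0] > WINDOW_SECONDS:
        q.popleft()                        # drop messages outside the window
    return len(q) >= PILE_ON_COUNT
```

Each message is judged only by its score and its timing; the harassment signal emerges from the count, which is the point of the layer.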

Layer 3 — OpenAI moderation API

OpenAI's public moderation endpoint scores text on eleven categories: harassment, hate, self-harm, sexual, violence, etc. We hit it for messages that pattern matching flagged but didn't auto-act on, plus a 1% sample of everything else as a smoke check. Flagged messages go to the review queue.

OpenAI's endpoint is free and well-tuned. It's not infallible — no AI is — but combined with the first two layers it catches things rules-only systems miss (e.g., metaphorical threats, coded slurs, indirect harassment).
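The routing around the endpoint is the part worth sketching: which messages get sent, and what happens with the scores that come back. A hedged sketch — SAMPLE_RATE matches the 1% figure above, QUEUE_THRESHOLD is an assumed value, and the score dict mirrors the shape of the endpoint's per-category scores:

```python
import random

SAMPLE_RATE = 0.01      # 1% smoke-check of otherwise-clean messages
QUEUE_THRESHOLD = 0.5   # assumed score above which a category queues the message

def should_moderate(flagged_not_acted: bool, rng: random.Random) -> bool:
    """Send to the moderation endpoint if layer 1 flagged but didn't
    auto-act, or on a 1% random sample of everything else."""
    return flagged_not_acted or rng.random() < SAMPLE_RATE

def categories_to_queue(scores: dict[str, float]) -> list[str]:
    """Given per-category scores from a moderation response,
    return the categories worth a review-queue entry."""
    return [cat for cat, s in scores.items() if s >= QUEUE_THRESHOLD]
```

The sampling keeps per-call costs bounded while still catching drift: if the 1% sample starts flagging things layer 1 missed, that's a signal to tighten the patterns.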

Layer 4 — ML phishing detector

A small classifier trained on Discord-specific phishing patterns: fake Steam giveaways, “you got Nitro” DMs, fake support impersonations, NFT/crypto scams. Runs on every message containing a URL plus a 5% sample of others. Outputs a phish-probability score.

The classifier learns from confirmed phishing flags across all Sloth Lee servers (anonymised, opt-out). When one server flags a new scam pattern, the classifier picks it up; within hours every other server is protected. Network effects work in your favour here.
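The real detector is a trained classifier, but a hand-weighted feature score over the same Discord-phishing tells gives the flavour of what it outputs. Everything here — patterns, weights, function name — is hypothetical:

```python
import re

# Stand-in for the trained classifier: hand-weighted features over
# common Discord phishing tells. The real model learns its weights.
FEATURES = [
    (re.compile(r"free\s+nitro", re.I), 0.5),
    (re.compile(r"steam.*giveaway", re.I), 0.4),
    # look-alike domains: "discord"-ish hosts that aren't discord.com/.gg
    (re.compile(r"https?://[^\s]*discord[a-z]*\.(?!com|gg)[a-z]+", re.I), 0.6),
]

def phish_score(message: str) -> float:
    """Return a pseudo-probability in [0, 1] that `message` is phishing."""
    score = sum(w for pat, w in FEATURES if pat.search(message))
    return min(score, 1.0)
```

A trained model replaces the fixed weights with learned ones, which is what lets a new scam pattern flagged on one server raise scores everywhere else.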

The review queue — what makes this different

Layers 2-4 don't auto-act unless the score is extreme. Borderline calls go to a review queue your mods see in the dashboard:

  • The flagged message + surrounding context (5 messages before, 5 after).
  • Which layer flagged it + what score.
  • One-click actions: dismiss, warn, timeout, ban, escalate to senior mod.
  • Internal notes — invisible to the member, useful for the next mod who looks.

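A queue entry bundles exactly those four things. One possible shape, with illustrative field names (not Sloth Lee's actual data model):

```python
from dataclasses import dataclass, field

# Hypothetical shape of a review-queue entry; names are illustrative.
@dataclass
class QueueEntry:
    message_id: int
    content: str
    context: list[str]          # the 5 messages before + 5 after
    flagged_by: str             # "vader", "openai_moderation", or "phishing"
    score: float
    notes: list[str] = field(default_factory=list)  # mod-only internal notes

    def resolve(self, action: str) -> str:
        """One-click resolution from the dashboard."""
        assert action in {"dismiss", "warn", "timeout", "ban", "escalate"}
        return f"{action}:{self.message_id}"
```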
Most automated moderation either acts blindly (high false-positive rate, members get timed out for nothing) or doesn't act at all (mods drown in alert spam). The review queue is the middle path — the bot does the triage, your mods make the call.

Sensitivity tiers

Three preset sensitivities + custom. Each adjusts how aggressive the four layers are:

  • Low: layer 1 strict; layers 2-4 only log, never queue. For communities where mod judgement should always be primary.
  • Medium (default): layer 1 strict; layers 2-4 queue borderline; auto-act on extreme cases (immediate phishing, severe slurs).
  • High: layer 1 strict; layers 2-4 act more aggressively. Recommended for servers under active raid pressure or high spam volume.
  • Custom: you set thresholds per layer per channel per role. The granular version, available on the paid tier.
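The presets amount to a small decision table for layers 2-4. A sketch with assumed threshold values (the "extreme" cutoff is an assumption, not a documented number):

```python
# How layers 2-4 treat a flagged message per tier; layer 1 is strict in all.
TIERS = {
    "low":    {"layers_2_4": "log"},
    "medium": {"layers_2_4": "queue", "auto_act_extreme": True},
    "high":   {"layers_2_4": "act"},
}

def behaviour(tier: str, score: float, extreme: float = 0.95) -> str:
    """Decide what layers 2-4 do with a flagged message under `tier`."""
    cfg = TIERS[tier]
    if cfg["layers_2_4"] == "log":
        return "log"
    if cfg["layers_2_4"] == "act" or (cfg.get("auto_act_extreme") and score >= extreme):
        return "act"
    return "queue"
```

Custom tier would replace the three fixed rows with per-layer, per-channel, per-role thresholds, but the decision structure stays the same.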

What it doesn't do

  • Doesn't shadow-ban. Every action is visible in the audit log and the affected user is DM'd. We don't do invisible enforcement — it erodes trust.
  • Doesn't train on your messages. The ML classifier learns from confirmed phishing flags (anonymised). It doesn't train on the message bodies of regular conversation. Opting out of contributing flags is one click in dashboard settings.
  • Doesn't pretend to be human. AutoMod replies are clearly bot-flagged. We don't do "a friendly mod said your message broke the rules" — it's a bot, and it says so.

How this works on the free tier

Layer 1 (pattern matching) and layer 2 (VADER sentiment) are on the free tier. Review queue is on the free tier with a 50-item cap (resets daily). Layers 3 and 4 — OpenAI moderation + ML phishing — are on the paid tier; they cost us per-call so they aren't feasible to run for every free server.

Most communities are well-served by layers 1+2 alone. Layers 3+4 matter when you have a determined adversary (organised raids, sophisticated phishing) or scale (1k+ messages/day where mods can't hand-review everything).

Try it

The review queue is open after 5 minutes of setup.