Anomaly detection

How to use it

Enable monitoring — 50+ KPIs get day-of-week-aware baselines automatically. (Analytics → Anomaly detection)
Set your alert channel — in-app, email, or Slack.
Act on alerts — each includes the delta, direction, and a likely-cause hint.

⏱ ~5 min · 💳 Pro+ · 🎯 Problems caught in hours, not weeks

Why this matters for your business

Most stores discover problems too late. A pixel breaks on a site update, and you find out three weeks later when conversion has drifted enough to show up in monthly reports. A discount code leaks to a coupon site, and you find out when margin looks wrong in Tuesday's P&L review. A WhatsApp template gets paused by Meta, and you find out when customers complain by email. Each of these costs real money, and the cost compounds with every day you don't notice.

Anomaly detection is the watchdog that finds these problems within hours of them starting. Over 50 KPIs are monitored continuously — revenue, conversion, channel-specific engagement, fatigue posture, deliverability, ROAS, customer-acquisition cost, repeat rate — and the moment any of them deviates significantly from your shop's baseline, you get an alert with the delta, the direction, and the most likely cause.

The win isn't just speed; it's framing. Without anomaly detection, "sales were down this week" is a nervous question. With anomaly detection, "loyal-cohort engagement dropped 28% on iOS Safari starting Tuesday at 11:30 AM" is a diagnostic with a starting point. The investigation becomes 30 minutes instead of two days, and you fix the bug before the quarter is lost.

What this typically unlocks

Outcome	Typical result
Time from "something broke" → detection	2–4 hours vs. 1–3 weeks unmonitored
Revenue saved per quarter from caught issues	$30K–500K depending on scale
False-alarm rate	< 1 alert / week at default sensitivity
Diagnosed root causes per alert	~70% include automatic root-cause hint
Time spent on monthly metric reviews	−60% — review exceptions, not the whole dashboard
Confidence in flat dashboards	high — "no anomalies" is a real signal, not assumed

What you actually get

Continuous monitoring across four KPI families:

Family	Examples	Alert sensitivity
Revenue & orders	Daily revenue, conversion rate, AOV, refund rate, repeat rate	Medium (default)
Channel engagement	Email open/click, WhatsApp read/reply, SMS click, push CTR	Medium
Acquisition	New customers/day by source, CAC, first-product mix	Low (high noise)
Compliance & deliverability	Unsub rate, bounce rate, fatigue cap hits, opt-in rate	High (low noise, high stakes)

Each anomaly arrives with:

What changed — the metric, the direction, the magnitude
When it started — the precise hour the deviation began
How it compares — vs. last week, last 30d, same period last year
Where it's localized — segment, channel, product, region (if detectable)
Likely cause hint — common patterns matched (e.g. "matches profile of a broken pixel")
Suggested actions — what to check first

How it powers every part of your store

Anomaly type	What it lets you fix fast
Conversion rate drop	Broken checkout, slow page load, paywall popup that's too aggressive
Email open-rate drop	Deliverability issue, sender-reputation hit, list-quality decay
WhatsApp read-rate drop	Template paused by Meta, fatigue cap hit too often
AOV unexpected drop	Discount code leaked, free-shipping threshold change error
Refund-rate spike	Product quality issue, sizing mismatch, shipping breakage
Unsub-rate spike	Over-sending, bad list segmentation, broken unsubscribe link
New-customer drop	Acquisition channel paused, ad account suspended, pixel broken
CAC spike	Ad-cost rise, ROAS dropped, audience saturated
Fatigue-cap-hits spike	Too many concurrent journeys, campaign double-run
Opt-in rate drop	Storefront widget broken, consent text change

How it works (without the technical bits)

Baselines — robust to weekly cycles

Naive "compare to yesterday" alerts fire constantly because of weekly patterns (Saturdays don't look like Wednesdays). Our baselines use rolling median + median absolute deviation over the last 30 days, bucketed by day-of-week and hour-of-day. So:

Today at 11 AM is compared to the last 4 Tuesdays at 11 AM
Daily revenue today is compared to the last 30 same-weekdays
Holiday weeks are flagged and excluded from baseline (you don't want last week's Black Friday spike making this Monday look flat by comparison)

The result: alerts fire when something unusual happens, not when Tuesday isn't Wednesday.
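For readers who do want the technical bits, here is a minimal sketch of the idea described above: group past samples by (day-of-week, hour-of-day), then take the median and the median absolute deviation (MAD) of each bucket. The function names `build_baselines` and `deviation_in_bands` are hypothetical and not part of the product. The sketch also assumes that holiday samples were filtered out before the data is passed in, and that one MAD is the unit used to measure deviation.

```python
from collections import defaultdict
from datetime import datetime
from statistics import median


def build_baselines(samples: list[tuple[datetime, float]]) -> dict:
    """Return {(weekday, hour): (median, MAD)} from historical samples.

    Assumes holiday weeks were already removed from `samples`.
    """
    buckets = defaultdict(list)
    for ts, value in samples:
        # Tuesday 11 AM only ever gets compared with other Tuesdays at 11 AM.
        buckets[(ts.weekday(), ts.hour)].append(value)

    baselines = {}
    for key, values in buckets.items():
        center = median(values)
        mad = median([abs(v - center) for v in values])
        baselines[key] = (center, mad)
    return baselines


def deviation_in_bands(baselines: dict, ts: datetime, value: float) -> float:
    """How many MADs `value` sits away from the baseline of its own bucket."""
    center, mad = baselines[(ts.weekday(), ts.hour)]
    if mad == 0:
        # A perfectly flat history: any change at all is off the scale.
        return 0.0 if value == center else float("inf")
    return abs(value - center) / mad
```

Median and MAD are used here instead of mean and standard deviation because a single extreme day barely moves them, which is what keeps one odd Tuesday from distorting the next month of comparisons.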

Severity — what actually pages you

Three levels:

Severity	Trigger	Channel
P1 — Critical	> 4× sensitivity band; high-stakes KPIs (revenue, conversion, refunds)	In-app + email + SMS + Slack (if integrated)
P2 — Warning	2–4× band; medium-stakes KPIs	In-app + email + Slack
P3 — Info	1–2× band; informational	In-app only

You can override sensitivity per KPI. A merchant who runs flash sales might set "AOV anomalies" to lower sensitivity (it spikes naturally). A merchant who just changed shipping thresholds might set "AOV anomalies" to higher sensitivity for a week to catch unintended consequences.
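As a rough illustration of the table and the per-KPI override, the sketch below turns a deviation into a severity level. It is an assumption-laden reading of the table, not the platform's actual logic: the names `multiples_of_band`, `classify_severity`, and `sensitivity_scale` are hypothetical, and the rule that a non-high-stakes KPI tops out at P2 is our interpretation of how the "stakes" column combines with the band.

```python
from typing import Optional

# Channels per severity, copied from the table above.
CHANNELS = {
    "P1": ["in-app", "email", "sms", "slack"],
    "P2": ["in-app", "email", "slack"],
    "P3": ["in-app"],
}


def multiples_of_band(deviation: float, band: float, sensitivity_scale: float = 1.0) -> float:
    """Express a deviation as a multiple of the sensitivity band.

    `sensitivity_scale` is a hypothetical per-KPI knob: values above 1 widen
    the band (lower sensitivity), values below 1 narrow it (higher sensitivity).
    """
    return deviation / (band * sensitivity_scale)


def classify_severity(multiple: float, high_stakes: bool) -> Optional[str]:
    """Map a band multiple to P1 / P2 / P3, or None when inside the band."""
    if multiple < 1:
        return None
    if multiple > 4 and high_stakes:
        return "P1"
    if multiple >= 2:
        return "P2"  # assumption: KPIs that are not high-stakes cap at P2
    return "P3"
```

With these definitions, a flash-sale merchant who doubles the AOV band (`sensitivity_scale=2.0`) turns what would have been a 5× deviation into a 2.5× one, so the same swing arrives as a P2 instead of a P1.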

Suppression — knowing when not to alert

Three classes of suppression:

Self-suppression when the cause is known. Acknowledge an anomaly with a reason ("known pixel issue, fixing tomorrow") and the same anomaly won't re-fire for 24 hours.
Coordinated alert dedup. A single root cause that affects 10 KPIs (e.g. pixel break) fires one compound alert, not 10.
Quiet hours. You can set quiet hours on P3 (info) alerts; P1 always pages.
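The three suppression classes can be sketched in a few lines. Everything named here (`Suppressor`, `dedup_by_root_cause`, `held_by_quiet_hours`, and the `kpi` / `cause` keys) is hypothetical and only illustrates the behaviour described above. The sketch assumes quiet hours apply to P3 alerts only, since that is the only case the text spells out.

```python
from datetime import datetime, timedelta

ACK_WINDOW = timedelta(hours=24)  # acknowledged anomalies stay quiet for 24 hours


class Suppressor:
    """Remembers acknowledged anomalies and mutes repeats inside the window."""

    def __init__(self) -> None:
        self._acked: dict[str, datetime] = {}

    def acknowledge(self, key: str, at: datetime) -> None:
        self._acked[key] = at

    def is_suppressed(self, key: str, now: datetime) -> bool:
        acked_at = self._acked.get(key)
        return acked_at is not None and now - acked_at < ACK_WINDOW


def dedup_by_root_cause(alerts: list[dict]) -> list[dict]:
    """Collapse alerts that share a root cause into one compound alert each."""
    grouped: dict[str, list[str]] = {}
    for alert in alerts:
        grouped.setdefault(alert["cause"], []).append(alert["kpi"])
    return [{"cause": cause, "kpis": kpis} for cause, kpis in grouped.items()]


def held_by_quiet_hours(severity: str, hour: int, start: int, end: int) -> bool:
    """True when a P3 alert falls inside quiet hours; P1 is never held."""
    if severity != "P3":
        return False
    if start <= end:
        return start <= hour < end
    return hour >= start or hour < end  # window wraps past midnight
```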

Localization — narrowing the search

The most useful part of an anomaly alert is the where:

"Conversion rate dropped 28% — confined to:
  - Device: iOS Safari only
  - Time: starting 2026-05-09 11:30 AM
  - Affected pages: checkout (specifically /cart and /checkout)
  - Geography: not localized (worldwide)
  - Cohort: not localized (all lifecycle stages)"

That alert points at one investigation: a Safari-specific change to checkout deployed at 11:30 AM. Without localization, the investigation is "look at everything that happened this week." With it, the fix is in the next deploy.
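One simple way to picture localization: for each dimension (device, geography, cohort), compare the current rate of every segment with its baseline, and call the anomaly "localized" only when some segments dropped and others did not. The sketch below is an illustration under that assumption. The function name `localize`, the input shape, and the 15% `drop_threshold` are all hypothetical, and the sample numbers in the usage are invented for the example.

```python
def localize(baseline: dict, current: dict, drop_threshold: float = 0.15) -> dict:
    """Return {dimension: [segments that dropped]}.

    An empty list means "not localized": either nothing dropped on that
    dimension, or every segment dropped together.
    """
    findings = {}
    for dimension, base_rates in baseline.items():
        dropped = []
        for segment, base_rate in base_rates.items():
            cur_rate = current.get(dimension, {}).get(segment)
            if cur_rate is None or base_rate <= 0:
                continue  # no data for this segment, skip it
            relative_drop = (base_rate - cur_rate) / base_rate
            if relative_drop >= drop_threshold:
                dropped.append(segment)
        is_localized = 0 < len(dropped) < len(base_rates)
        findings[dimension] = dropped if is_localized else []
    return findings


# Illustrative conversion rates only; not real merchant data.
baseline = {
    "device": {"iOS Safari": 0.030, "Android Chrome": 0.028, "Desktop": 0.035},
    "geography": {"US": 0.030, "EU": 0.030},
}
current = {
    "device": {"iOS Safari": 0.010, "Android Chrome": 0.028, "Desktop": 0.034},
    "geography": {"US": 0.022, "EU": 0.022},
}
report = localize(baseline, current)
```

Here `report` singles out iOS Safari on the device dimension and leaves geography empty, because both regions fell by the same proportion.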

Likely-cause hints

The system maintains a library of anomaly patterns — combinations of KPI changes that almost always have one cause:

Pattern	Likely cause
Open rate ↓ across all channels + bounce rate ↑	Sender reputation / deliverability issue
Conversion ↓ + page-load-time ↑ on same device	Site performance regression
New-customer ↓ on one channel only	Channel acquisition paused or pixel broken
AOV ↓ + Stripe avg-discount ↑	Discount code leaked / over-redeemed
Refunds ↑ on one SKU	Product quality / shipping issue with that SKU
WA read-rate ↓ + WA send count = 0	Template paused by Meta or account suspended
Repeat rate ↓ + journey enrolment count = 0	Journey worker stuck / paused

When the alert matches a pattern, the cause hint shows up in the notification — saving you the diagnostic step.
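Conceptually, the pattern library is a lookup from a combination of KPI movements to a cause. The sketch below encodes a few rows of the table that way. The KPI keys (`open_rate`, `avg_discount`, and so on), the direction labels, and the first-match-wins rule are assumptions made for the example, not the product's internal schema.

```python
from typing import Optional

# (signature, likely cause) pairs taken from the table above.
PATTERNS = [
    ({"open_rate": "down", "bounce_rate": "up"},
     "Sender reputation / deliverability issue"),
    ({"conversion": "down", "page_load_time": "up"},
     "Site performance regression"),
    ({"aov": "down", "avg_discount": "up"},
     "Discount code leaked / over-redeemed"),
    ({"wa_read_rate": "down", "wa_send_count": "zero"},
     "Template paused by Meta or account suspended"),
    ({"repeat_rate": "down", "journey_enrolment_count": "zero"},
     "Journey worker stuck / paused"),
]


def likely_cause(observed: dict[str, str]) -> Optional[str]:
    """Return the first cause whose whole signature appears in `observed`."""
    for signature, cause in PATTERNS:
        if all(observed.get(kpi) == direction for kpi, direction in signature.items()):
            return cause
    return None  # no pattern matched, so the alert ships without a hint
```

Requiring the whole signature to match is what keeps the hints conservative: an AOV drop on its own produces no hint, while an AOV drop paired with a rising average discount does.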

Real merchant scenarios

Scenario A — Catches a broken pixel within 5 hours

Setup. Mid-market brand pushed a site redesign overnight. Storefront pixel reference URL changed; pixel stopped firing.

Detection at 4:12 AM. Anomaly fired:

P1: Pixel events ↓ 96% — starting 2026-05-08 23:18
   Localized: all pages, all devices
   Likely cause: pixel deployment issue
   Suggested: check storefront pixel install

Investigation took 8 minutes. Engineer rolled back pixel file at 4:25 AM. Total downtime: ~5 hours overnight.

Cost saved. Without detection, the pixel break would have been noticed when conversion-attribution looked off in Friday's weekly review — 4 days later. Estimated revenue impact:

Window	Pixel-fed campaigns affected	Estimated lost revenue
5 hours (caught)	0 — overnight, low traffic	~$0
4 days (uncaught)	8 retargeting campaigns	~$45K

The alert paid for the entire feature in one incident.

Scenario B — Discount code leaked to a coupon site

Setup. Brand shared "WELCOME20" with email subscribers. Someone posted it to a coupon aggregator. Redemption rate exploded.

Anomaly fired at 2:30 PM (4 hours after the leak):

P2: Discount-redemption rate ↑ 740% — starting 2026-05-08 10:32
   Affected code: WELCOME20
   Localized: traffic source — heavy "couponcode.com" referral
   Likely cause: discount code distribution leak
   Suggested: review redemption sources, consider code rotation

Action. Brand swapped the code (WELCOME20 → WELCOME20-NEW), sent the new code to opted-in customers only. Redemption returned to baseline within 2 hours.

Cost saved. Pre-rotation: ~$8,400/hour in unintended margin loss. Quick detection saved ~$45K.

Scenario C — Meta paused a WhatsApp template

Setup. Brand uses a WhatsApp template for cart recovery. Meta auto-paused the template due to a flag (the template included "FREE" in caps which Meta now classifies as promotional spam).

Detection at 9:18 AM (template paused at 9:02 AM):

P1: WhatsApp template send rate = 0 — starting 2026-05-08 09:02
   Affected template: cart_recovery_v2
   Localized: only this template; other templates fine
   Likely cause: template paused/rejected by provider
   Suggested: check WhatsApp Business Manager template status

Action. Marketing manager logged into WA Business Manager, confirmed pause, edited template (removed all-caps "FREE"), resubmitted. Approved 2 hours later.

Cost saved. Cart recovery normally drives ~$1,800/day at this brand. Detected within 16 minutes of breakage, fixed within 2.5 hours. Without detection: would have been noticed in weekly review (~$12K lost).

Scenario D — Catching a churn cohort 90 days early

Setup. Apparel brand. One specific cohort ("repeat buyers, ages 25–34, mobile") started showing engagement decline. Revenue unaffected so far (they were ordering on habit).

Anomaly at week 3 of the trend:

P3: Email engagement on cohort "repeat-25-34-mobile" ↓ 28% — starting 2026-04-12
   Localized: device — Android Chrome older versions
   Likely cause: rendering issue on older browsers or display bug
   Suggested: review recent template changes, check Android
   Chrome compatibility on older versions

Investigation. Email template change 3 weeks earlier had broken on a specific older Android browser (Chrome 110 and below). That cohort's primary device.

Fix shipped in 1 week. Engagement recovered. Revenue never dropped because the brand caught it during the engagement-leading-indicator phase, not the revenue-trailing-indicator phase.

This is the highest-leverage use of anomaly detection — leading indicators flag problems before lagging indicators (revenue) have to.

Scenario E — Subscription churn surge from one bug

Setup. Subscription box. Skip-this-month feature broke on a deploy — customers couldn't skip, so they auto-churned instead.

Detection at 6:40 AM (deploy at 2:13 AM):

P1: Subscription cancellation rate ↑ 11× — starting 2026-05-08 02:14
   Localized: customers who clicked "skip" on the dashboard
   Likely cause: subscription action / dashboard regression
   Suggested: check recent deploy + skip flow

Action. Engineering reverted the deploy at 7:02 AM. Re-activated the 47 customers who'd accidentally cancelled with an apologetic email + 1 free month.

Cost saved. 47 customers × ~$80/month × 6 months average remaining lifetime = ~$22.5K. Plus brand reputation. Plus the operational chaos that would have followed.

Scenario F — False alarm rate matters

Setup. Brand previously used a generic monitoring tool with naive "deviation from yesterday" alerts. Got 5–8 alerts per day, mostly noise. Stopped reading them.

On this platform. Anomaly detection uses day-of-week-aware baselines + severity classification + suppression. Result:

Period	Alerts received	Alerts actioned
Generic tool (week)	~30	~2
This platform (week)	~3	~3

The share of alerts actioned went from ~7% to ~100%. That's the difference between "noise" and "signal", and it's why suppression, severity classification, and day-of-week baselining matter so much.

Best practices

✅ Tune sensitivity per KPI. Default is good; some brands (flash-sale heavy) should reduce sensitivity on AOV, while others (deliverability-focused) should increase on bounce rate.

✅ Always set up Slack integration if you have one. P1 alerts hitting Slack are 10× more actionable than email-only.

✅ Read every P1 within 15 minutes of arrival. That's the operational discipline that makes the system worth it.

✅ Acknowledge anomalies with cause notes. "Pixel issue, deploying fix" lets the system suppress for 24h and avoids alert fatigue.

✅ Use info-only alerts for trend awareness. P3 anomalies on engagement metrics catch slow drifts that revenue-only monitoring misses.

✅ Run a quarterly "alert audit" — review the last 90 days of P1+P2 alerts. Adjust sensitivity on any noisy ones; investigate any that didn't fire when they should have.

❌ Don't disable alerts because they're inconvenient. If a KPI keeps firing, fix the underlying volatility (or tune sensitivity), don't silence the canary.

❌ Don't treat correlation as causation. "Conversion dropped and we changed the homepage" doesn't prove the homepage caused it. The localization tells you where to investigate, not why.

❌ Don't act on a single P3 alert. Info-level alerts can flicker; wait for a sustained pattern (3+ data points) before acting.

❌ Don't forget to re-enable alerts you suppressed. The 24h auto-resume covers most cases, but manually-disabled KPIs stay disabled until you re-enable them.

Plan tiers

Capability	Free	Starter	Pro	Agency	Enterprise
Daily revenue + order anomalies	—	✓	✓	✓	✓
Channel engagement anomalies	—	—	✓	✓	✓
Compliance / deliverability anomalies	—	—	✓	✓	✓
Custom KPI tracking	—	—	—	✓	✓
Day-of-week baseline	—	✓	✓	✓	✓
Severity classification	—	✓	✓	✓	✓
Localization (segment / device / channel)	—	—	✓	✓	✓
Likely-cause hints	—	—	✓	✓	✓
Slack integration	—	—	✓	✓	✓
SMS alerts (P1 only)	—	—	—	✓	✓
Multi-shop anomaly roll-up	—	—	—	✓	✓
Programmatic webhook on alert	—	—	✓	✓	✓

Frequently asked

How fast does an anomaly fire? Hourly KPIs fire within ~10 minutes of the deviating sample; daily KPIs fire within an hour of midnight UTC. P1 alerts notify immediately; P2/P3 batch up to 30 minutes.

Can I tune which KPIs are monitored? Yes — every default KPI can be enabled, disabled, or have its sensitivity tuned. Custom KPIs (Agency+) can be added with a formula; e.g. "ratio of cart-recovery sends to cart-abandonment events".

What if my brand has natural volatility (e.g. flash sales)? The system handles known events: schedule a flash sale and the baselines exclude it from "normal" calculations. For unknown volatility, reduce sensitivity on the affected KPI so that routine swings stop firing alerts.

Can anomalies trigger automated actions? Yes (Pro+) — anomaly webhook fires on any alert; you can wire that to "pause campaigns" / "notify on-call" / "trigger investigation flow". Common pattern: P1 anomaly on revenue → auto-pause all promotional sends until acknowledged.
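A sketch of the receiving side of that common pattern follows. The webhook payload format is not documented on this page, so the fields used here (`severity`, `kpi`, `acknowledged`) and the KPI name `daily_revenue` are purely hypothetical, as are the action labels. Check the real payload before building on this.

```python
import json


def plan_actions(raw_body: str) -> list[str]:
    """Decide what to do with one alert webhook body (hypothetical schema)."""
    alert = json.loads(raw_body)
    actions = []
    if alert.get("severity") == "P1":
        actions.append("notify_on_call")
        revenue_alert = alert.get("kpi") == "daily_revenue"  # assumed KPI name
        if revenue_alert and not alert.get("acknowledged", False):
            # Hold promotional sends until someone acknowledges the alert.
            actions.append("pause_promotional_sends")
    return actions
```

Keeping the decision in a pure function like this makes it easy to unit-test the routing rules separately from whatever HTTP framework receives the webhook.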

Does this catch everything? No detection system is perfect. The system is tuned for ~85% recall (catches 85% of real issues) at < 1 false alarm/week per KPI. The trade-off is intentional — false alarm rate above 1/week trains people to ignore alerts, which is worse than missing some.

Can I see historical anomaly patterns? Yes — the anomaly history tab shows every alert ever fired, with acknowledgment status, cause notes, and resolution. Useful for post-mortem reviews and pattern-spotting.

What's the difference between this and a metrics dashboard? Dashboards require you to look at them. Anomaly detection pages you when something matters. Both useful — but the proactive paging is what catches problems fast.
