You open Automation Studio on Monday morning and see it: a red “Error” status on an automation that was supposed to run all weekend. Customer welcome emails haven’t sent since Friday. Three days of new signups are sitting in a data extension, waiting for a journey that never triggered.
Sound familiar? You’re not alone. Here are the 7 most common SFMC automation failures we see, and exactly how to prevent each one.
1. File Drop Automations That Never Fire
What happens: A file drop automation is waiting for a file from an external system (CRM, data warehouse, FTP). The file never arrives, so the automation never starts. No error is logged because technically nothing “failed” — it just never ran.
How to prevent it: Monitor for absence of activity, not just errors. If a file drop automation that normally runs daily hasn’t triggered in 24 hours, you need an alert. This is where most manual monitoring fails — you can’t check for something that didn’t happen unless you’re tracking expected schedules.
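One practical way to catch "it never ran" is a dead-man's-switch: have the automation's final step write a heartbeat row into a small tracking data extension, and have an external script alert when that heartbeat goes stale. Here is a minimal Python sketch of the external check. It assumes a hypothetical "Automation_Heartbeats" DE (AutomationName and LastRunDate fields), an installed package with client credentials, and the documented REST rowset endpoint for reading DE rows; every name below is a placeholder to adapt to your instance.

```python
"""Dead-man's-switch check for a file drop automation.

Assumes the automation's last step writes a heartbeat row (AutomationName,
LastRunDate as an ISO-8601 UTC timestamp) into a hypothetical tracking DE.
Subdomain, credentials, and all object names are placeholders.
"""
from datetime import datetime, timedelta, timezone

import requests

SUBDOMAIN = "YOUR_SUBDOMAIN"
CLIENT_ID = "YOUR_CLIENT_ID"
CLIENT_SECRET = "YOUR_CLIENT_SECRET"
HEARTBEAT_DE_KEY = "Automation_Heartbeats"      # hypothetical tracking DE key
AUTOMATION_NAME = "Daily_CRM_File_Import"       # hypothetical automation name
EXPECTED_INTERVAL = timedelta(hours=24)         # the file normally lands daily


def get_token():
    """OAuth2 client-credentials flow against the SFMC auth endpoint."""
    resp = requests.post(
        f"https://{SUBDOMAIN}.auth.marketingcloudapis.com/v2/token",
        json={"grant_type": "client_credentials",
              "client_id": CLIENT_ID,
              "client_secret": CLIENT_SECRET},
        timeout=30,
    )
    resp.raise_for_status()
    body = resp.json()
    return body["access_token"], body["rest_instance_url"].rstrip("/")


def latest_heartbeat(token, rest_url):
    """Return the newest heartbeat timestamp for the automation, or None."""
    resp = requests.get(
        f"{rest_url}/data/v1/customobjectdata/key/{HEARTBEAT_DE_KEY}/rowset",
        headers={"Authorization": f"Bearer {token}"},
        timeout=30,
    )
    resp.raise_for_status()
    timestamps = []
    for item in resp.json().get("items", []):
        # Primary-key fields come back under "keys", the rest under "values";
        # merge both and lowercase the names so lookups are case-insensitive.
        fields = {k.lower(): v
                  for d in (item.get("keys", {}), item.get("values", {}))
                  for k, v in d.items()}
        if fields.get("automationname") == AUTOMATION_NAME:
            timestamps.append(
                datetime.fromisoformat(fields["lastrundate"]).replace(tzinfo=timezone.utc)
            )
    return max(timestamps, default=None)


if __name__ == "__main__":
    token, rest_url = get_token()
    last_run = latest_heartbeat(token, rest_url)
    if last_run is None or datetime.now(timezone.utc) - last_run > EXPECTED_INTERVAL:
        # Wire this up to whatever alerting you already use (email, Slack, PagerDuty).
        print(f"ALERT: {AUTOMATION_NAME} has no heartbeat since {last_run}")
    else:
        print(f"OK: {AUTOMATION_NAME} last ran at {last_run}")
```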
2. SQL Query Errors in Query Activities
What happens: A query activity references a field that was renamed, a data extension that was deleted, or uses syntax that worked in a previous SFMC release but now throws an error. The automation runs, the query fails, and downstream activities operate on stale or empty data.
How to prevent it: Test queries after any schema change. Monitor data extension row counts after query activities run — if a DE that normally has 50,000 rows suddenly has 0, the query likely failed. Automated monitoring can flag these anomalies instantly.
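If you want to script that row-count check yourself, the comparison is simple once you have the numbers. A minimal sketch, assuming you log each critical DE's post-query row count somewhere you can read back (the rowset endpoint from the previous sketch, or a COUNT(*) written to a tracking DE); the zero check and the 50% drop threshold are illustrative and worth tuning for your data.

```python
"""Row-count anomaly check for a DE populated by a query activity.

Only the comparison logic is shown; how you collect the counts is up to you.
Thresholds are illustrative assumptions.
"""
from statistics import median


def row_count_anomaly(current_count: int, recent_counts: list[int],
                      min_ratio: float = 0.5) -> str | None:
    """Return a human-readable reason if the current count looks wrong, else None."""
    if current_count == 0:
        return "row count is 0; the query activity likely failed or returned nothing"
    if recent_counts:
        baseline = median(recent_counts)
        if baseline > 0 and current_count < baseline * min_ratio:
            return (f"row count {current_count} is below {min_ratio:.0%} of the "
                    f"recent median ({baseline:.0f}); possible partial query failure")
    return None


if __name__ == "__main__":
    # Example: a DE that normally carries ~50,000 rows after the nightly query.
    history = [49800, 50210, 50050, 49950, 50120]
    for today in (50075, 12000, 0):
        reason = row_count_anomaly(today, history)
        print(today, "->", reason or "OK")
```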
3. Expired SFTP or API Credentials
What happens: File transfer activities fail because SFTP credentials expired or were rotated by IT. This is especially common in enterprise environments where security policies mandate credential rotation every 60-90 days.
How to prevent it: Maintain a credential rotation calendar and test connections proactively. When monitoring detects a file transfer failure, the alert should include enough context to immediately identify credential expiry as the likely cause.
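A scheduled connection test gives you the failure before the automation hits it. The sketch below uses the third-party paramiko library to attempt an SFTP login and list a directory; the host, port, and credentials are placeholders for whichever account your file transfer activities rely on (SFMC Enhanced FTP or an external server), and key-based authentication can be swapped in where policy requires it.

```python
"""Proactive SFTP credential check.

Host, username, and password are placeholders. Requires the third-party
paramiko library (pip install paramiko).
"""
import sys

import paramiko

SFTP_HOST = "your-sftp-host.example.com"   # placeholder hostname
SFTP_PORT = 22
SFTP_USER = "YOUR_SFTP_USER"
SFTP_PASSWORD = "YOUR_SFTP_PASSWORD"       # better: load from a secrets manager


def check_sftp() -> bool:
    """Try to authenticate and list the landing directory; True on success."""
    transport = None
    try:
        transport = paramiko.Transport((SFTP_HOST, SFTP_PORT))
        transport.connect(username=SFTP_USER, password=SFTP_PASSWORD)
        sftp = paramiko.SFTPClient.from_transport(transport)
        sftp.listdir(".")          # cheap operation that proves the login works
        return True
    except (paramiko.SSHException, OSError) as exc:
        print(f"ALERT: SFTP check failed ({exc}); credentials may have expired")
        return False
    finally:
        if transport is not None:
            transport.close()


if __name__ == "__main__":
    sys.exit(0 if check_sftp() else 1)
```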
4. Data Extension Schema Mismatches
What happens: An import activity fails because the source file has a new column, a changed column order, or a data type mismatch. This often happens when upstream systems change their export format without notifying the SFMC team.
How to prevent it: Set up validation checks that verify imported row counts match expectations. Monitor for partial imports — an automation might “succeed” but only import 100 of 10,000 expected rows because of a schema issue in row 101.
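When the source file lands somewhere you can reach before SFMC imports it, a pre-flight check catches the mismatch one step earlier. A minimal sketch, assuming a hypothetical daily signups CSV; the column names and row-count bounds are illustrative.

```python
"""Pre-import validation of a source CSV against an expected schema.

The file path, column names, and row-count bounds are illustrative; adjust
them to the export your upstream system actually produces.
"""
import csv
import sys

EXPECTED_COLUMNS = ["SubscriberKey", "EmailAddress", "FirstName", "SignupDate"]
MIN_ROWS, MAX_ROWS = 1_000, 100_000        # sanity bounds for a normal export


def validate_export(path: str) -> list[str]:
    """Return a list of problems found; an empty list means the file looks sane."""
    problems = []
    row_count = 0
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, None)
        if header is None:
            return ["file is empty"]
        if header != EXPECTED_COLUMNS:
            problems.append(f"unexpected header {header!r}; schema may have changed")
        for line_no, row in enumerate(reader, start=2):
            row_count += 1
            if len(row) != len(EXPECTED_COLUMNS):
                problems.append(f"line {line_no} has {len(row)} columns, "
                                f"expected {len(EXPECTED_COLUMNS)}")
                break                      # one bad row is enough to hold the import
    if not MIN_ROWS <= row_count <= MAX_ROWS:
        problems.append(f"row count {row_count} outside expected range "
                        f"{MIN_ROWS}-{MAX_ROWS}")
    return problems


if __name__ == "__main__":
    issues = validate_export(sys.argv[1] if len(sys.argv) > 1 else "daily_signups.csv")
    for issue in issues:
        print("ALERT:", issue)
    sys.exit(1 if issues else 0)
```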
5. Journey Builder Entry Source Depletion
What happens: A journey’s entry source data extension stops receiving new records. The journey shows “Running” but isn’t injecting anyone. From the Journey Builder UI, everything looks fine — you only notice when campaign metrics drop to zero.
How to prevent it: Monitor journey injection rates alongside entry source data extension populations. If the entry DE’s row count flatlines, or if the journey’s injection count drops below historical averages, trigger an alert. This requires looking at the system holistically, not just at individual components.
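The detection logic itself is small once you log one number per day. A minimal sketch, assuming you record daily injections (or daily new rows in the entry DE) via the rowset endpoint, a tracking DE, or an export; the zero check and the 25% threshold are illustrative.

```python
"""Flatline and drop detection for a journey's entry source.

`daily_new` is the number of new entry-DE rows (or journey injections) per
day, oldest first, logged however you track it. Thresholds are illustrative.
"""
from statistics import mean


def injection_alerts(daily_new: list[int], drop_ratio: float = 0.25) -> list[str]:
    """Return alerts if today's intake is zero or far below the trailing average."""
    alerts = []
    if not daily_new:
        return alerts
    today = daily_new[-1]
    if today == 0:
        alerts.append("no new records entered the journey today; "
                      "the entry source may have stopped receiving data")
    if len(daily_new) >= 8:
        baseline = mean(daily_new[-8:-1])      # trailing 7-day average
        if baseline > 0 and today < baseline * drop_ratio:
            alerts.append(f"today's intake ({today}) is below {drop_ratio:.0%} "
                          f"of the trailing average ({baseline:.0f})")
    return alerts


if __name__ == "__main__":
    # Example: a healthy week, then the entry source dries up.
    history = [1180, 1225, 1190, 1210, 1260, 1205, 1195, 0]
    for alert in injection_alerts(history) or ["OK: injections look normal"]:
        print(alert)
```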
6. Send Throttling and Deliverability Hits
What happens: An automation triggers a send to a larger-than-expected audience (e.g., a segmentation query returns too many results due to a missing WHERE clause). This blows through your hourly send limit, causes throttling on subsequent sends, and can damage your sender reputation with ISPs.
How to prevent it: Monitor send volumes against expected ranges. Flag any send where the audience size exceeds the historical average by more than 2x. This simple check can prevent accidental mass sends and the deliverability problems they cause.
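If your sends kick off from a script, or you log the segmentation count before each send fires, the 2x rule takes only a few lines. A minimal sketch, with the history of recent audience sizes assumed to come from your own send logs:

```python
"""Pre-send audience-size guard.

`recent_audiences` would be the audience sizes of the last few runs of the
same send, logged wherever you track them; the 2x multiplier mirrors the
rule of thumb above and is tunable.
"""
from statistics import mean


def audience_looks_safe(audience_size: int, recent_audiences: list[int],
                        max_multiplier: float = 2.0) -> bool:
    """Return False if the audience is suspiciously larger than its recent average."""
    if not recent_audiences:
        return True                       # no history yet; nothing to compare against
    baseline = mean(recent_audiences)
    if audience_size > baseline * max_multiplier:
        print(f"ALERT: audience of {audience_size:,} is more than "
              f"{max_multiplier}x the recent average ({baseline:,.0f}); "
              "check the segmentation query before sending")
        return False
    return True


if __name__ == "__main__":
    history = [18_500, 19_200, 18_900, 19_050]
    # A missing WHERE clause suddenly returns the whole subscriber base.
    print("proceed" if audience_looks_safe(450_000, history) else "hold the send")
```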
7. Triggered Send Definition Deactivation
What happens: A triggered send gets paused or deactivated — sometimes by a team member, sometimes by SFMC itself due to excessive errors. Journeys and automations that reference this triggered send continue to run, but no emails actually send. SFMC doesn’t alert on this.
How to prevent it: Regularly audit triggered send statuses. If a triggered send that handles critical communications (welcome emails, order confirmations, password resets) goes inactive, you need to know within minutes, not days.
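The status is exposed through the SOAP API's TriggeredSendDefinition object, so the audit can run on a schedule outside SFMC. Below is a sketch that retrieves each definition's name, key, and TriggeredSendStatus with a hand-built Retrieve envelope and flags any critical definition that is not Active. The subdomain, credentials, and critical keys are placeholders, and the envelope details follow the general pattern in the SOAP API documentation; verify them against your own instance before relying on this.

```python
"""Audit triggered send definitions and flag critical ones that are not Active.

Subdomain, credentials, and CRITICAL_KEYS are placeholders. The Retrieve
envelope and the TriggeredSendStatus property follow the SFMC SOAP API
pattern for TriggeredSendDefinition; verify against your instance.
"""
import xml.etree.ElementTree as ET

import requests

SUBDOMAIN = "YOUR_SUBDOMAIN"
CLIENT_ID = "YOUR_CLIENT_ID"
CLIENT_SECRET = "YOUR_CLIENT_SECRET"
CRITICAL_KEYS = {"Welcome_Email", "Order_Confirmation", "Password_Reset"}  # hypothetical

PARTNER_NS = "{http://exacttarget.com/wsdl/partnerAPI}"


def get_token() -> str:
    """OAuth2 client-credentials flow against the SFMC auth endpoint."""
    resp = requests.post(
        f"https://{SUBDOMAIN}.auth.marketingcloudapis.com/v2/token",
        json={"grant_type": "client_credentials",
              "client_id": CLIENT_ID,
              "client_secret": CLIENT_SECRET},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["access_token"]


def retrieve_triggered_sends(token: str) -> list[dict]:
    """Retrieve Name / CustomerKey / TriggeredSendStatus for all triggered sends."""
    soap_url = f"https://{SUBDOMAIN}.soap.marketingcloudapis.com/Service.asmx"
    envelope = f"""<?xml version="1.0" encoding="UTF-8"?>
<s:Envelope xmlns:s="http://www.w3.org/2003/05/soap-envelope"
            xmlns:a="http://schemas.xmlsoap.org/ws/2004/08/addressing">
  <s:Header>
    <a:Action s:mustUnderstand="1">Retrieve</a:Action>
    <a:To s:mustUnderstand="1">{soap_url}</a:To>
    <fueloauth xmlns="http://exacttarget.com">{token}</fueloauth>
  </s:Header>
  <s:Body>
    <RetrieveRequestMsg xmlns="http://exacttarget.com/wsdl/partnerAPI">
      <RetrieveRequest>
        <ObjectType>TriggeredSendDefinition</ObjectType>
        <Properties>Name</Properties>
        <Properties>CustomerKey</Properties>
        <Properties>TriggeredSendStatus</Properties>
      </RetrieveRequest>
    </RetrieveRequestMsg>
  </s:Body>
</s:Envelope>"""
    resp = requests.post(
        soap_url,
        data=envelope.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8", "SOAPAction": "Retrieve"},
        timeout=60,
    )
    resp.raise_for_status()
    # Collect each <Results> element's child values keyed by their local tag name.
    return [{child.tag.replace(PARTNER_NS, ""): child.text for child in node}
            for node in ET.fromstring(resp.content).iter(f"{PARTNER_NS}Results")]


if __name__ == "__main__":
    for d in retrieve_triggered_sends(get_token()):
        if d.get("CustomerKey") in CRITICAL_KEYS and d.get("TriggeredSendStatus") != "Active":
            print(f"ALERT: {d.get('Name')} ({d.get('CustomerKey')}) is "
                  f"{d.get('TriggeredSendStatus')}; critical emails are not sending")
```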
The Common Thread
Notice the pattern? Most of these failures are silent. SFMC won’t page you at 2 AM because a journey stopped injecting contacts. It won’t send a Slack message when an automation hasn’t run in 24 hours. It just… continues, quietly broken.
That’s why purpose-built monitoring exists. Martech Monitoring checks your automations, journeys, and data extensions on a schedule, and alerts you the moment something deviates from expected behavior. You can start monitoring for free — no credit card required.
Because the only thing worse than an SFMC failure is an SFMC failure nobody knows about.