Idempotent publishing: how to avoid duplicate social media posts

When retries and network failures happen, you can end up with duplicate posts. This article covers idempotency strategies, deduplication keys, and how to design publishing systems that are safe to retry.


The post that appeared twice

You build a publishing workflow. It works. Content goes from your system to X, LinkedIn, and Threads. You walk away.

Then one day you check your timeline and see the same post twice. Same text, same image, posted minutes apart. Your LinkedIn page has a duplicate. Threads has a duplicate. X has two identical tweets in a row.

Nothing in your system reported an error. Nothing explicitly asked to publish twice. But somewhere between a timeout, a retry, and an ambiguous API response, the post was created once — and then created again.

This is the duplicate post problem. It is not exotic. It is not a bug in the platform APIs. It is the natural consequence of publishing into systems you do not fully control, combined with the natural impulse to retry when something appears to fail.

Why duplicates happen

Duplicates come from a gap between what your system knows and what actually happened.

The typical sequence:

  1. Your system sends a publish request
  2. The request reaches the platform
  3. The platform creates the post
  4. The response travels back to your system
  5. The response is lost — network timeout, connection reset, server error before the response body arrives

Your system sees a failure. The platform sees a success. The post is live, but your system does not know that. So it retries. The platform receives what it considers a new request — because it is a new request — and creates a second post.

This is not a bug in the retry logic. The retry logic did exactly what it was supposed to. The problem is that social media APIs are not idempotent by default. Calling POST /tweets twice creates two tweets. There is no built-in mechanism to say “this is the same request as before, do not create a new one.”

The same pattern appears in other scenarios:

  • Webhook retries. Your CMS fires a webhook when content is published. Your handler calls the publishing API. The handler returns a 500 due to an unrelated error after the publish call succeeds. The CMS retries the webhook. The handler runs again. Second post.
  • Queue redelivery. A message queue delivers a publish job to a worker. The worker publishes successfully but crashes before acknowledging the message. The queue redelivers. The worker publishes again.
  • Cron overlap. A cron-based poller takes longer than expected. The next cron invocation starts before the first one finishes. Both process the same article.
  • User impatience. A button click triggers a publish. The response is slow. The user clicks again. Two requests, two posts.

Every automated publishing system will eventually encounter one of these. The question is whether your system handles them or pretends they cannot happen.
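The webhook case is the most mechanical to defend against, because most providers send a unique delivery or event ID with each invocation. Deduplicating on that ID makes the handler safe to call twice. A minimal sketch — the event shape and `deliveryId` field name are illustrative, not from any specific provider, and a real handler would persist seen IDs rather than keep them in process memory:

```javascript
// IDs of webhook deliveries we have already handled.
// In production this set would live in a database or cache.
const seenDeliveries = new Set();

async function handleWebhook(event, publish) {
  // A provider retry reuses the same delivery ID, so a repeat is a no-op.
  if (seenDeliveries.has(event.deliveryId)) {
    return { skipped: true };
  }
  seenDeliveries.add(event.deliveryId);
  await publish(event.payload);
  return { skipped: false };
}
```

Note the ordering tradeoff: marking the delivery as seen before publishing trades a possible duplicate for a possible missed post if the process crashes mid-handler. The strategies below tighten that window.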

What idempotency means for publishing

An operation is idempotent when performing it multiple times produces the same result as performing it once. A GET request is naturally idempotent — fetching a page ten times does not create ten pages. A POST request that creates a social media post is not naturally idempotent — posting ten times creates ten posts.

Making publishing idempotent means ensuring that if the same publish intent is sent more than once, only one post is created. The second, third, or tenth request recognizes that it is a duplicate and either returns the original result or does nothing.

This is not the same as deduplication after the fact. Deleting the second post after it appears is damage control, not idempotency. True idempotency prevents the duplicate from being created in the first place.
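Seen from the server's side, the "recognize a duplicate and return the original result" behavior is a lookup before a write. A minimal in-memory sketch of that logic (a real server would persist keys durably, typically with a TTL):

```javascript
// Maps idempotency key -> the stored result of the first request.
const processed = new Map();

function createPostIdempotently(key, createPost) {
  // Duplicate request: return the original result, create nothing new.
  if (processed.has(key)) {
    return processed.get(key);
  }
  const result = createPost();
  processed.set(key, result);
  return result;
}
```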

Strategy 1: Client-side idempotency keys

The most common approach to idempotent API calls is an idempotency key — a unique identifier that the client generates and sends with each request. The server uses this key to recognize duplicate requests.

The pattern:

  1. Before sending a publish request, generate a unique key (a UUID, a hash of the content, or a deterministic identifier)
  2. Include the key in the request
  3. The server checks whether it has already processed a request with that key
  4. If yes, return the original response without creating a new post
  5. If no, process the request normally and store the key with the result

import { randomUUID } from "crypto";

const idempotencyKey = randomUUID();

const response = await fetch("https://api.postproxy.dev/api/posts", {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${POSTPROXY_API_KEY}`,
    "Content-Type": "application/json",
    "Idempotency-Key": idempotencyKey,
  },
  body: JSON.stringify({
    post: { body: "Your post text here" },
    profiles: ["twitter", "linkedin", "threads"],
  }),
});

If your retry logic sends the same request with the same idempotency key, the server knows it is a duplicate and does not create a second post.

Key generation matters. A random UUID is safe but means you need to store and reuse the same UUID across retries. A deterministic key — derived from the content itself — is more robust because any system that produces the same content will naturally produce the same key.

import { createHash } from "crypto";

function generateIdempotencyKey(body, profiles) {
  const input = JSON.stringify({ body, profiles });
  return createHash("sha256").update(input).digest("hex");
}

This approach has a tradeoff: if you intentionally want to publish the same text again (a recurring post, for example), a content-based key will block it. Use a key that includes a timestamp or scheduling context to distinguish intentional repetition from accidental duplication.
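One way to include that context is to fold the scheduled slot into the hashed input, so retries of the same slot collide on the same key while a deliberate repeat next week produces a different one. A sketch building on the hash above (the `scheduledFor` parameter is an illustrative choice; any stable slot identifier works):

```javascript
import { createHash } from "crypto";

// scheduledFor distinguishes intentional repetition (the same text
// scheduled for next week) from accidental duplication (a retry of
// the same publish attempt).
function generateScheduledKey(body, profiles, scheduledFor) {
  const input = JSON.stringify({ body, profiles, scheduledFor });
  return createHash("sha256").update(input).digest("hex");
}
```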

Strategy 2: Track what you have already published

If the publishing API does not support idempotency keys, you can implement deduplication on your side by tracking what has been published.

Before each publish call, check your local record:

import fs from "fs";

const PUBLISHED_FILE = "./published-posts.json";

function alreadyPublished(articleId) {
  if (!fs.existsSync(PUBLISHED_FILE)) return false;
  const published = JSON.parse(fs.readFileSync(PUBLISHED_FILE, "utf-8"));
  return published.includes(articleId);
}

function markPublished(articleId) {
  const published = fs.existsSync(PUBLISHED_FILE)
    ? JSON.parse(fs.readFileSync(PUBLISHED_FILE, "utf-8"))
    : [];
  published.push(articleId);
  fs.writeFileSync(PUBLISHED_FILE, JSON.stringify(published));
}

async function publishIfNew(article) {
  const articleId = article.guid || article.link;
  if (alreadyPublished(articleId)) {
    console.log(`Skipping already-published article: ${article.title}`);
    return;
  }
  const response = await fetch("https://api.postproxy.dev/api/posts", {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${POSTPROXY_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      post: { body: `${article.title}\n\n${article.link}` },
      profiles: ["twitter", "linkedin", "threads"],
    }),
  });
  if (response.ok) {
    markPublished(articleId);
  }
}

This is simple and works for most cases. But it has the exact gap described earlier: if the publish call succeeds but the process crashes before markPublished runs, the next invocation will publish again.

To close that gap, mark the article as “in progress” before publishing and “completed” after:

import fs from "fs";

const STATE_FILE = "./publish-state.json";

function loadState() {
  if (!fs.existsSync(STATE_FILE)) return {};
  return JSON.parse(fs.readFileSync(STATE_FILE, "utf-8"));
}

function saveState(state) {
  fs.writeFileSync(STATE_FILE, JSON.stringify(state));
}

function markInProgress(articleId) {
  const state = loadState();
  state[articleId] = { status: "in_progress", startedAt: Date.now() };
  saveState(state);
}

function markCompleted(articleId, postId) {
  const state = loadState();
  state[articleId] = { status: "completed", postId };
  saveState(state);
}

async function publishIfNew(article) {
  const articleId = article.guid || article.link;
  const state = loadState();
  if (state[articleId]?.status === "completed") {
    return; // Already published
  }
  if (state[articleId]?.status === "in_progress") {
    const elapsed = Date.now() - state[articleId].startedAt;
    if (elapsed < 60000) {
      return; // Another process is likely handling this
    }
    // Stale in_progress state — safe to retry
  }
  markInProgress(articleId);
  const response = await fetch("https://api.postproxy.dev/api/posts", {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${POSTPROXY_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      post: { body: `${article.title}\n\n${article.link}` },
      profiles: ["twitter", "linkedin", "threads"],
    }),
  });
  if (response.ok) {
    const result = await response.json();
    markCompleted(articleId, result.id);
  }
}

This does not eliminate duplicates entirely — there is still a window between the publish call succeeding and the state file updating — but it narrows the window dramatically. For most blog-to-social workflows, this level of protection is sufficient.

Strategy 3: Content-based deduplication

Instead of tracking what you have sent, check what already exists on the destination.

Before publishing, query your recent posts and compare:

async function isDuplicate(postBody) {
  const response = await fetch("https://api.postproxy.dev/api/posts?limit=20", {
    headers: {
      "Authorization": `Bearer ${POSTPROXY_API_KEY}`,
    },
  });
  const recentPosts = await response.json();
  return recentPosts.some((post) => post.body === postBody);
}

This catches duplicates regardless of how they were created — retries, queue redelivery, cron overlap, manual mistakes. It works even if your state file was lost or corrupted.

The tradeoff is performance and reliability. An extra API call before every publish adds latency. If the check itself fails, you are back to guessing. And exact string matching may miss near-duplicates (same text with different whitespace, for example).
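A cheap defense against whitespace-only near-duplicates is to normalize both sides before comparing — collapse runs of whitespace and trim. A sketch of that comparison:

```javascript
// Collapse all whitespace runs to a single space and trim, so
// "hello  world\n" and "hello world" compare equal.
function normalize(text) {
  return text.replace(/\s+/g, " ").trim();
}

function isNearDuplicate(candidate, recentPosts) {
  const target = normalize(candidate);
  return recentPosts.some((post) => normalize(post.body) === target);
}
```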

Content-based deduplication works best as a safety net layered on top of one of the other strategies, not as the primary mechanism.

Strategy 4: Atomic state transitions

For systems using a database, the most robust approach is to make the state transition and the publish intent atomic.

-- Attempt to claim the article for publishing
UPDATE articles
SET publish_status = 'publishing',
    publish_started_at = NOW()
WHERE id = :article_id
  AND publish_status = 'pending';

If the UPDATE affects one row, you have claimed the article. Proceed with publishing. If it affects zero rows, another process already claimed it. Skip.

After publishing:

UPDATE articles
SET publish_status = 'published',
    post_id = :post_id,
    published_at = NOW()
WHERE id = :article_id;

The database acts as a coordination layer. Even if ten workers process the same article simultaneously, only one will successfully transition it from pending to publishing. The others see zero affected rows and back off.

This is the same pattern used in job queues, payment processing, and any system where exactly-once execution matters. It is heavier than a JSON file but eliminates the race conditions that file-based tracking cannot fully prevent.
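In application code, the claim becomes a check on the affected row count. A sketch against a generic `db.query(sql, params)` interface — the interface is an assumption for this example, though with node-postgres the result exposes `rowCount` in the same way:

```javascript
// Returns true if this worker won the claim; false if another worker
// already transitioned the article out of 'pending'.
async function claimArticle(db, articleId) {
  const result = await db.query(
    `UPDATE articles
        SET publish_status = 'publishing', publish_started_at = NOW()
      WHERE id = $1 AND publish_status = 'pending'`,
    [articleId]
  );
  return result.rowCount === 1;
}
```

Because the check and the transition happen in a single UPDATE, there is no window in which two workers can both see the article as pending.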

Designing retries that are safe

Once you have a deduplication mechanism in place, retries become safe instead of dangerous. But the retry logic itself still matters.

Retry only on ambiguous failures. A 500 error might mean the request was never processed, or it might mean it was processed but the response failed. A 400 error means the request was rejected — retrying will produce the same rejection. Only retry on status codes that indicate a transient or ambiguous failure (500, 502, 503, 504, network timeouts).

async function publishWithRetry(payload, maxRetries = 3) {
  const idempotencyKey = generateIdempotencyKey(payload.post.body, payload.profiles);
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      const response = await fetch("https://api.postproxy.dev/api/posts", {
        method: "POST",
        headers: {
          "Authorization": `Bearer ${POSTPROXY_API_KEY}`,
          "Content-Type": "application/json",
          "Idempotency-Key": idempotencyKey,
        },
        body: JSON.stringify(payload),
      });
      if (response.ok) return await response.json();
      // Do not retry client errors
      if (response.status >= 400 && response.status < 500) {
        throw new Error(`Client error: ${response.status}`);
      }
      // Retry on server errors
      console.log(`Attempt ${attempt + 1} failed with ${response.status}`);
    } catch (err) {
      if (err.message.startsWith("Client error")) throw err;
      console.log(`Attempt ${attempt + 1} failed: ${err.message}`);
    }
    // Exponential backoff
    if (attempt < maxRetries) {
      const delay = Math.pow(2, attempt) * 1000;
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  throw new Error("All retry attempts failed");
}

Use exponential backoff. If the failure is due to rate limiting or temporary overload, hammering the endpoint makes things worse. Double the delay between each attempt. A common pattern: 1 second, 2 seconds, 4 seconds.

Cap the number of retries. Three retries is usually enough. If the request fails four times, the issue is unlikely to be transient. Surface the failure to a human or a monitoring system rather than retrying indefinitely.

Preserve the idempotency key across retries. This is the critical detail. The whole point of an idempotency key is that the same key is sent on every attempt. If you generate a new key for each retry, you have defeated the purpose.

Handling partial success with retries

Multi-platform publishing adds another layer. A single publish call to Postproxy might succeed on LinkedIn but fail on X. The response tells you exactly what happened per platform.

The safe retry strategy for partial success:

  1. Parse the per-platform results
  2. Identify which platforms failed
  3. Retry only the failed platforms, not the entire publish

If you retry the entire call — including platforms that already succeeded — and the API does not deduplicate per platform, you will get duplicates on the platforms that worked the first time.

Postproxy returns explicit per-platform outcomes. Use them. A partial success is not a failure — it is information about what still needs to happen.
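The three steps above might look like this. The shape of the per-platform results (`{ profile, status }` objects) is an illustrative assumption for this sketch — check the API's actual response format:

```javascript
// Given per-platform outcomes, identify the platforms that failed.
// The results shape is an assumption for this sketch.
function platformsToRetry(results) {
  return results
    .filter((r) => r.status !== "published")
    .map((r) => r.profile);
}

async function retryFailedPlatforms(payload, results, publish) {
  const failed = platformsToRetry(results);
  if (failed.length === 0) return null; // Full success: nothing to retry.
  // Re-publish the same post body, but only to the failed profiles.
  return publish({ ...payload, profiles: failed });
}
```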

The cost of not handling this

Duplicate posts are not just aesthetically annoying. They have real consequences.

Credibility. A duplicate post looks like a mistake. On a personal account, it is mildly embarrassing. On a brand account, it looks unprofessional. On a company’s LinkedIn page, it raises questions about whether anyone is actually managing the account.

Platform penalties. Some platforms penalize duplicate content in their algorithms. Posting the same text twice in quick succession may reduce the reach of both posts. Repeated duplicates can trigger spam detection.

Broken analytics. Engagement is split across two identical posts. Your metrics show half the likes, half the comments, on each. The actual performance of the content is impossible to measure accurately.

Trust in automation. If your automated publishing system produces duplicates, the team loses trust in it. Someone starts manually checking every post. Someone adds an approval step “just in case.” The automation that was supposed to save time now creates more work than doing it manually.

A practical checklist

If you are building or maintaining a publishing workflow, run through these questions:

  1. What happens if the publish API call succeeds but the response is lost? Does your system retry? Does the retry produce a duplicate?

  2. What happens if your webhook handler is called twice for the same event? Is your handler idempotent, or does it blindly publish on every invocation?

  3. What happens if two instances of your poller run simultaneously? Is there a coordination mechanism, or will both publish the same article?

  4. Do you track what has been published? Is the tracking reliable even if the process crashes mid-execution?

  5. Do your retries use a stable idempotency key? Or does each retry look like a new request to the server?

If you cannot confidently answer these questions, your system will produce duplicates. Maybe not today. But eventually, under the right combination of timing and failure, it will.

Idempotency is not paranoia

Building idempotent publishing systems is not over-engineering. It is acknowledging the reality of distributed systems.

Networks are unreliable. Processes crash. Queues redeliver. Cron jobs overlap. Webhooks retry. Every one of these is normal behavior, not an edge case.

Retries are a policy, and idempotency is what makes that policy safe to execute. Without it, every retry is a gamble. With it, retries become a routine recovery mechanism that your system handles calmly, without producing side effects in the real world.

The goal is not to prevent failures. Failures will happen. The goal is to make your system safe to retry when they do — so that recovering from a failure never creates a new problem.

Ready to get started?

Start with our free plan and scale as your needs grow. No credit card required.