
From Sanity to browser: event-driven cache invalidation with SQS FIFO

How I built the invalidation pipeline between Sanity and the GraphQL cache: HMAC-signed webhooks, SQS FIFO queue with delivery delay, cascade revalidation via DynamoDB routing table, and selective purge on Stellate + Next.js ISR.

aws · sanity · graphql · ecommerce · architecture

The Stellate article described the invalidation cycle briefly. In practice, this part of the system is the most delicate: a mistake means either stale data in production or a Lambda running in circles on already-processed messages. It’s worth describing in detail.

The context: a GraphQL BFF on Lambda in front of Sanity and Shopify, with Stellate as CDN and Next.js as frontend. Every Sanity publish must propagate in under a minute through three layers: Stellate purge, DynamoDB routing table update, Next.js ISR revalidation.

The CMS burst problem

An editor updating a product description saves the draft four times before publishing. Without throttling, each save and the final publish triggers a webhook → a Lambda → a Stellate call: five purges for a single change.

The solution is putting a queue between the webhook and the consumer. But the choice of queue type is non-trivial.

Why SQS FIFO

A standard SQS queue gives no ordering guarantees and can deliver the same message more than once. In an invalidation system, processing the same document twice is harmless but expensive — each processing requires DynamoDB queries, Stellate calls, and a request to the frontend. With a FIFO queue:

  • Messages within the same MessageGroupId are delivered in order
  • A message is not delivered until the previous one in the same group has been processed or expired
  • contentBasedDeduplication eliminates identical messages received within 5 minutes

The MessageGroupId is set to the Sanity document’s _type: product, page, category. This means products queue behind each other without blocking pages, and vice versa.
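
As a rough illustration of that grouping, here is a minimal sketch that builds the fields a FIFO send needs. The field names follow the AWS SDK v3 SendMessageCommand input; toFifoSendInput, SanityDocLike and the "unknown" fallback group are hypothetical and not part of the project's code.

```typescript
// Subset of the AWS SDK v3 SendMessageCommand input used for a FIFO send.
interface FifoSendInput {
  QueueUrl: string;
  MessageBody: string;
  MessageGroupId: string;
}

// Hypothetical minimal shape of a Sanity webhook payload.
interface SanityDocLike {
  _id: string;
  _type?: string;
  [key: string]: unknown;
}

// Hypothetical helper: one message group per Sanity document type.
// FIFO queues require a MessageGroupId, so a missing _type falls back to a
// catch-all group (an assumption made for this sketch).
const toFifoSendInput = (queueUrl: string, doc: SanityDocLike): FifoSendInput => ({
  QueueUrl: queueUrl,
  MessageBody: JSON.stringify(doc),
  MessageGroupId: doc._type ?? "unknown",
});
```

Grouping by _type keeps ordering within a content type while letting different types drain independently; grouping by _id would allow more parallelism at the cost of ordering across documents of the same type.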

Queue configuration

const revalidateQueue = new Queue(stack, 'revalidate-queue', {
  cdk: {
    queue: {
      fifo: true,
      contentBasedDeduplication: true,
      queueName: `${app.stage}-revalidate.fifo`,
      visibilityTimeout: toCdkDuration('6 minutes'),
      deliveryDelay: toCdkDuration('45 seconds'),
      deadLetterQueue: {
        queue: revalidateDlq.cdk.queue,
        maxReceiveCount: 3,
      },
    },
  },
});

Four decisions here.

deliveryDelay: 45 seconds. Messages aren’t delivered to the consumer until 45 seconds after being sent. If the editor saves the product ten times in quick succession, the messages become available within a few seconds of each other, so the consumer tends to pick them up as a batch rather than running once per save. Instead of ten separate purges, ideally only one runs.

contentBasedDeduplication: true. SQS computes a SHA-256 hash of the message body and uses it as the deduplication ID. If a second message with the same hash is sent within the 5-minute deduplication interval, SQS accepts it but never delivers it. Result: repeated saves of the same document with identical content collapse into a single message.
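
To make the deduplication behaviour concrete, here is a toy in-memory model. It is not how SQS is implemented: DedupModel is a hypothetical class, and the assumption that a dropped duplicate does not restart the 5-minute window is mine.

```typescript
import { createHash } from "node:crypto";

// SQS FIFO deduplication interval: 5 minutes.
const DEDUP_INTERVAL_MS = 5 * 60 * 1000;

// Toy model of content-based deduplication (hypothetical, for illustration).
class DedupModel {
  private firstSeen = new Map<string, number>();

  // Returns true if the message would be delivered,
  // false if it would be dropped as a duplicate.
  accept(body: string, nowMs: number): boolean {
    // The deduplication ID is derived from the raw body string.
    const dedupId = createHash("sha256").update(body).digest("hex");
    const seenAt = this.firstSeen.get(dedupId);
    if (seenAt !== undefined && nowMs - seenAt < DEDUP_INTERVAL_MS) {
      return false;
    }
    this.firstSeen.set(dedupId, nowMs);
    return true;
  }
}
```

In this model, two sends of '{"_id":"p1"}' ten seconds apart produce one delivery, while a body that differs by a single character is delivered as a separate message.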

visibilityTimeout: 6 minutes — while a consumer Lambda is processing a message, that message becomes invisible to other consumers. The timeout must exceed the Lambda timeout (2 minutes). If the Lambda takes longer than the visibilityTimeout to finish, the message becomes visible again and gets reprocessed — potentially in parallel.
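
A small sanity check for this relationship, as a sketch. checkVisibilityTimeout is a hypothetical helper; the 6x factor comes from AWS's guidance for Lambda SQS event sources, which recommends a visibility timeout of at least six times the function timeout.

```typescript
type TimeoutCheck = "too-short" | "ok" | "meets-aws-guidance";

// Hypothetical helper: compares the queue's visibility timeout with the
// consumer Lambda's timeout (both in seconds).
const checkVisibilityTimeout = (
  lambdaTimeoutSec: number,
  visibilityTimeoutSec: number
): TimeoutCheck => {
  // At or below the Lambda timeout, a slow invocation can overlap a redelivery.
  if (visibilityTimeoutSec <= lambdaTimeoutSec) return "too-short";
  // AWS guidance: at least six times the function timeout.
  return visibilityTimeoutSec >= 6 * lambdaTimeoutSec ? "meets-aws-guidance" : "ok";
};
```

With the values from this article (120-second Lambda, 360-second visibility timeout) the helper reports "ok": safe against overlap, but below the 6x margin.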

maxReceiveCount: 3 — if the consumer fails three times on the same message, the message moves to the DLQ for manual inspection. Without this, a corrupt message could cycle indefinitely.

The contentBasedDeduplication problem

contentBasedDeduplication hashes the message body exactly as it was sent; SQS doesn’t parse JSON. Since the body here is the output of JSON.stringify, two objects with the same content but properties in a different order serialize to different strings, and therefore get different hashes and no deduplication.

Sanity adds a _rev field to every document. _rev changes on every save, even if the content didn’t change. Without stripping it, two consecutive saves of the same document produce messages with different _rev → different hashes → no deduplication.

I solved this with a toMessageBody function that:

  1. Removes _rev from the document before serializing
  2. Recursively sorts object keys before JSON.stringify

export const toMessageBody = <T extends Record<string, unknown>>(obj: T) =>
  JSON.stringify(deepSortObjectByKeys(omit(obj, ['_rev'])));

With this, two saves of the same document with identical content produce the same hash → SQS deduplicates → the consumer sees a single message.
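
For readers who want to try this in isolation, here is a self-contained sketch. The deepSortObjectByKeys and omit below are simple stand-ins I wrote for this example, not the project's actual helpers (which may come from a utility library), and they assume plain JSON values.

```typescript
// Stand-in: recursively rebuilds objects with keys in sorted order.
// Arrays keep their element order; primitives pass through unchanged.
const deepSortObjectByKeys = (value: unknown): unknown => {
  if (Array.isArray(value)) return value.map(deepSortObjectByKeys);
  if (value !== null && typeof value === "object") {
    const source = value as Record<string, unknown>;
    const sorted: Record<string, unknown> = {};
    for (const key of Object.keys(source).sort()) {
      sorted[key] = deepSortObjectByKeys(source[key]);
    }
    return sorted;
  }
  return value;
};

// Stand-in: shallow copy without the listed top-level keys.
const omit = (obj: Record<string, unknown>, keys: string[]): Record<string, unknown> => {
  const copy: Record<string, unknown> = {};
  for (const key of Object.keys(obj)) {
    if (!keys.includes(key)) copy[key] = obj[key];
  }
  return copy;
};

// Deterministic message body: volatile _rev removed, keys sorted at every level.
const toMessageBody = (obj: Record<string, unknown>): string =>
  JSON.stringify(deepSortObjectByKeys(omit(obj, ["_rev"])));
```

Two payloads that differ only in _rev and key order, such as { _id: "p1", _rev: "r1", title: "Shoe" } and { title: "Shoe", _rev: "r2", _id: "p1" }, serialize to the same string and therefore share a deduplication hash.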

The webhook Lambda

The webhook receives the POST from Sanity, validates the HMAC signature, and enqueues the document:

export const revalidateHandler = () =>
  Handler('api', async ({ requestContext }) => {
    const { error, success } = useResponse();

    if (requestContext.http.method !== 'POST') {
      return error({ message: 'Method not allowed', statusCode: 405 });
    }

    const { isValidSignature } = useSanityWebhook({
      secret: process.env.SANITY_REVALIDATE_WEBHOOK_SECRET || '',
    });

    if (!isValidSignature()) {
      return error({ message: 'Invalid signature', statusCode: 401 });
    }

    const { jsonBody, sendMessage } = useQueue();

    const { MessageId, $metadata } = await sendMessage({
      queueUrl: process.env.REVALIDATE_QUEUE_URL || '',
      groupId: jsonBody?._type,
    });

    if ($metadata.httpStatusCode !== 200) {
      return error({ message: 'Unable to send message to queue' });
    }

    return success({ message: `Document "${jsonBody?._id}" sent to queue` });
  });

Signature validation relies on @sanity/webhook:

export const useSanityWebhook = ({ secret }: { secret: string }) => ({
  isValidSignature: () => {
    const body = useBody();
    if (!body) return false;
    const signature = useHeader(SIGNATURE_HEADER_NAME) || '';
    return isValidSignature(body, signature, secret);
  },
});

SIGNATURE_HEADER_NAME is sanity-webhook-signature — Sanity signs the body with HMAC-SHA256 and puts the signature in the header. Validation fails if the body was altered or the secret doesn’t match.
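
To show the underlying idea, here is a generic HMAC-SHA256 check using Node's crypto module. It is a simplified sketch: signBody and verifyBody are hypothetical names, the hex encoding is my choice, and the real header format is defined by Sanity and handled by @sanity/webhook, which is what production code should use.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hex-encoded HMAC-SHA256 of the raw request body (illustrative format only).
const signBody = (body: string, secret: string): string =>
  createHmac("sha256", secret).update(body).digest("hex");

// Recomputes the signature and compares it in constant time.
const verifyBody = (body: string, signature: string, secret: string): boolean => {
  const expected = Buffer.from(signBody(body, secret), "hex");
  const received = Buffer.from(signature, "hex");
  // timingSafeEqual throws on length mismatch, so check lengths first.
  return expected.length === received.length && timingSafeEqual(expected, received);
};
```

The check must run against the raw body string, before any JSON parsing: re-serializing a parsed body can change whitespace or key order and invalidate an otherwise correct signature.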

The Lambda responds to Sanity immediately after enqueueing the message. It doesn’t wait for processing — that happens asynchronously in the consumer.

The consumer Lambda

The consumer receives a batch of messages (up to 10) and processes them together:

export const revalidateConsumerHandler = ({ routingQuery, routingConfig }) =>
  Handler('sqs', async ({ Records }) => {
    const documentIds: string[] = [];

    for (const record of Records) {
      const document = parseJsonToObject<SanityDocumentLike>(record.body);

      if (isSanityDocument(document) && !documentIds.includes(document._id)) {
        documentIds.push(document._id);
      }
    }

    if (!documentIds.length) return;

    const routingData = await selectiveRevalidate({
      routingQuery,
      routingConfig,
      documentIds,
    });

    const slugs = uniqueArray(routingData.map(({ slug }) => slug));

    const routingConfigGroupedByType = groupBy(routingData, ({ typename }) => typename);

    const cdnPurgePromises = [
      ...Object.entries(routingConfigGroupedByType).map(([type, docs]) =>
        cdnPurgeType({
          type,
          keyFields: docs.reduce<KeyFieldInput[]>(
            (acc, { documentId }) =>
              acc.find(({ value }) => value === documentId)
                ? acc
                : [...acc, { name: '_id', value: documentId }],
            []
          ),
        })
      ),
      cdnPurgeType({
        type: 'Routing',
        keyFields: slugs.map(value => ({ name: 'slug', value })),
      }),
    ];

    await Promise.all(cdnPurgePromises);
    await storefrontRevalidate({ slugs });
  });

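The consumer deduplicates the documentIds extracted from the batch: FIFO guarantees ordering, but a batch can still contain several messages for the same document, for example when two saves differ in content and so escape deduplication. It then hands the list to selectiveRevalidate and uses the returned routing data for the purges.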

selectiveRevalidate and cascade dependencies

This is the most interesting part. Changing a Sanity document can invalidate pages that don’t directly contain that document.

Example: a “global component” (a banner, a menu, a footer section) is referenced by 50 pages. When the banner changes, all 50 pages need to be revalidated. If the routing table stored only the direct relationship documentId → slug, this wouldn’t work.

The solution is a dependencies field in every DynamoDB routing table record. Each route has a list of Sanity _ids that contribute to its content — the primary document plus all referenced documents.

export const selectiveRevalidate = async ({ routingQuery, routingConfig, documentIds }) => {
  // Find all records where documentId matches OR
  // the document is in the route's dependencies
  const routingDataToRevalidate = await fetchAllDocumentToRevalidateFromTable(documentIds);

  const documentIdsToRevalidate = uniqueArray([
    ...documentIds,
    ...routingDataToRevalidate.map(({ documentId }) => documentId),
    ...routingDataToRevalidate.flatMap(({ dependencies }) => dependencies),
  ]);

  // Rebuild routing for all affected IDs
  const routingTable = await buildRoutingTable({
    routingConfig,
    documentIds: documentIdsToRevalidate,
    routingQuery,
  });

  await saveRoutingTable(routingTable);
  return routingTable;
};
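
The cascade can be modelled in memory to see what the lookup returns. This is a sketch under assumptions: RouteRecord mirrors the fields described above, while findAffectedRoutes and expandDocumentIds are hypothetical stand-ins for the DynamoDB lookup and the ID expansion.

```typescript
interface RouteRecord {
  slug: string;
  documentId: string; // primary document rendered at this route
  dependencies: string[]; // every Sanity _id contributing to the page
}

const uniqueArray = <T>(items: T[]): T[] => Array.from(new Set(items));

// In-memory stand-in for the table lookup: a route is affected when a changed
// ID is its primary document or appears in its dependencies.
const findAffectedRoutes = (table: RouteRecord[], changedIds: string[]): RouteRecord[] =>
  table.filter(
    ({ documentId, dependencies }) =>
      changedIds.includes(documentId) ||
      dependencies.some(dep => changedIds.includes(dep))
  );

// Expands the changed IDs with the primary documents and dependencies of
// every affected route, mirroring documentIdsToRevalidate.
const expandDocumentIds = (table: RouteRecord[], changedIds: string[]): string[] => {
  const ids = [...changedIds];
  for (const { documentId, dependencies } of findAffectedRoutes(table, changedIds)) {
    ids.push(documentId, ...dependencies);
  }
  return uniqueArray(ids);
};
```

With a banner listed in the dependencies of two out of three routes, changing only the banner selects exactly those two routes, which is the behaviour a plain documentId → slug mapping cannot provide.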

The DynamoDB lookup uses a FilterExpression that, for each document ID, matches either the route’s own documentId or an entry in its dependencies:

const filterExpression = documentIds.reduce(
  ({ filterExpression, filterExpressionAttrValues }, documentId, index) => {
    const attributeName = `:dep${index}`;
    const containClause = `documentId = ${attributeName} OR contains (dependencies, ${attributeName})`;
    return {
      filterExpression: [...filterExpression, containClause],
      filterExpressionAttrValues: {
        ...filterExpressionAttrValues,
        [attributeName]: documentId,
      },
    };
  },
  { filterExpression: [], filterExpressionAttrValues: {} }
);

contains on a DynamoDB list attribute searches for an exact element — in this case a Sanity _id in the record’s dependency list.
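
One way the per-document clauses could be assembled into the final expression, as a sketch. buildDependencyFilter and the ScanFilter type are hypothetical; the output uses the FilterExpression and ExpressionAttributeValues parameter names of the DynamoDB document client, with plain strings as values.

```typescript
interface ScanFilter {
  FilterExpression: string;
  ExpressionAttributeValues: Record<string, string>;
}

// Hypothetical helper: one parenthesized clause per document ID, joined with OR.
const buildDependencyFilter = (documentIds: string[]): ScanFilter => {
  const clauses: string[] = [];
  const values: Record<string, string> = {};

  documentIds.forEach((documentId, index) => {
    const placeholder = `:dep${index}`;
    // Match the route's primary document or any entry in its dependency list.
    clauses.push(`(documentId = ${placeholder} OR contains(dependencies, ${placeholder}))`);
    values[placeholder] = documentId;
  });

  return {
    FilterExpression: clauses.join(" OR "),
    ExpressionAttributeValues: values,
  };
};
```

An empty ID list would produce an empty, invalid expression, so the caller has to return early in that case, as the consumer above does. A filter is also applied after items are read, so it narrows the result set without reducing the read cost.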

The routing table as an idempotent hash

saveRoutingTable uses a ConditionExpression to write only if the content changed:

const itemHash = hash(docJson);

return new PutCommand({
  Item: { ...doc, itemHash },
  ConditionExpression:
    'attribute_not_exists(itemHash) OR (itemHash <> :itemHash)',
  ExpressionAttributeValues: {
    ':itemHash': itemHash,
  },
  TableName: process.env.ROUTING_TABLE_NAME,
});

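If the record already exists with the same hash, the PutCommand fails with a ConditionalCheckFailedException, which is caught and silently ignored. This makes the write idempotent: re-running saveRoutingTable on the same data leaves the stored items untouched. Note that DynamoDB still consumes write capacity for a rejected conditional write; the benefit is that unchanged items are never rewritten.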
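
The catch-and-ignore step might look like the following sketch. putIfChanged and isConditionalCheckFailed are hypothetical names; the only SDK detail relied on is that AWS SDK v3 exceptions expose the exception name in their name property.

```typescript
// True only for the error DynamoDB raises when a ConditionExpression fails.
const isConditionalCheckFailed = (error: unknown): boolean =>
  error instanceof Error && error.name === "ConditionalCheckFailedException";

// Hypothetical wrapper around a conditional write.
// Resolves to true when the item was written, false when the condition
// rejected it because the same hash is already stored.
const putIfChanged = async (write: () => Promise<unknown>): Promise<boolean> => {
  try {
    await write();
    return true;
  } catch (error) {
    if (isConditionalCheckFailed(error)) return false;
    // Anything else is a real failure: rethrow so the message gets retried.
    throw error;
  }
};
```

Here write would be something like () => client.send(putCommand). Swallowing only this one exception matters: throttling or network errors should still fail the invocation so that SQS redelivers the message.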

Stellate purge and ISR revalidation

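After selectiveRevalidate, the consumer runs three operations: the two Stellate purges in parallel, then the Next.js revalidation once both purges have completed. The order matters: if ISR ran first, the regenerated pages could be built from GraphQL responses still sitting in the CDN cache.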

Purge by typename: groups documents by GraphQL type (SpfProduct, Page, etc.) and calls _purgeType for each with _ids as keyFields. Stellate invalidates only the cache entries that match those documents.

Routing type purge: invalidates the routes in Stellate’s cache by slug. Necessary because the allRouting resolver is cached separately.

Next.js ISR: calls the frontend’s revalidation endpoint with the list of slugs:

export const storefrontRevalidate = ({ slugs, baseUrl, secret }) =>
  redaxios.post(`${baseUrl}/api/revalidate`, { slugs }, { params: { secret } });

Next.js on-demand ISR rebuilds the indicated pages in the background: users visiting during the rebuild still get the stale version, and the next visit after it completes gets the updated one.
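
On the receiving side, the endpoint could look roughly like this sketch for the Next.js Pages Router, where res.revalidate(path) triggers on-demand regeneration. The request and response interfaces are minimal stand-ins for the Next.js types so the example stays self-contained; createRevalidateHandler, toPath and the assumption that a slug maps directly to a URL path are hypothetical.

```typescript
// Minimal stand-ins for the parts of the Next.js API request/response used here.
interface RevalidateRequest {
  query: Record<string, string | string[] | undefined>;
  body?: { slugs?: unknown };
}

interface RevalidateResponse {
  status(code: number): { json(payload: unknown): void };
  revalidate(path: string): Promise<void>;
}

// Assumption: a slug is a URL path, with or without the leading slash.
const toPath = (slug: string): string => (slug.startsWith("/") ? slug : `/${slug}`);

const createRevalidateHandler = (expectedSecret: string) =>
  async (req: RevalidateRequest, res: RevalidateResponse): Promise<void> => {
    // Shared secret passed as a query parameter, as in storefrontRevalidate.
    if (req.query.secret !== expectedSecret) {
      res.status(401).json({ message: "Invalid secret" });
      return;
    }

    const raw: unknown = req.body ? req.body.slugs : undefined;
    const slugs: string[] = Array.isArray(raw)
      ? raw.filter((slug): slug is string => typeof slug === "string")
      : [];

    // Revalidate every page; collect failures instead of stopping at the first.
    const failed: string[] = [];
    await Promise.all(
      slugs.map(slug =>
        res.revalidate(toPath(slug)).catch(() => {
          failed.push(slug);
        })
      )
    );

    res
      .status(failed.length ? 500 : 200)
      .json({ revalidated: slugs.length - failed.length, failed });
  };
```

Returning a 500 when any page fails lets the consumer Lambda treat the batch as failed, so the message is retried and eventually reaches the DLQ rather than leaving a page silently stale.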

The DLQ

The Dead Letter Queue is a second FIFO queue. When the consumer fails three times on the same message, SQS moves it there automatically. Without manual inspection, that document remains stale in production indefinitely.

maxReceiveCount: 3 is a balance: too low and temporarily unprocessable messages (Stellate momentarily unreachable) end up in DLQ; too high and a document causing a systematic error blocks its MessageGroupId for too long.

A common pattern is a second Lambda that reads from the DLQ periodically and sends an alert — in our case, inspection was manual via the AWS console.

What I learned

deliveryDelay absorbs the burst without any aggregation logic of my own. The alternative is a stateful aggregator, which is more complex to manage and test.

contentBasedDeduplication requires deterministic serialization. It’s not prominently documented, but it is the critical requirement: same keys, same order, no volatile fields. Sanity’s _rev is the canonical example of a volatile field.

Cascade dependencies are necessary for any structured CMS. A Sanity document that’s a global component can touch dozens of URLs. Without the dependencies field, those pages remain stale even after the purge.

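visibilityTimeout must be coordinated with the Lambda timeout. It has to be longer than the Lambda timeout, with margin: for a 2-minute Lambda, 3-4 minutes is the practical minimum, and AWS’s guidance for SQS event sources is at least six times the function timeout (12 minutes here). If it is too short, SQS re-delivers the message while the Lambda is still working, and the document gets processed twice.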