Address validation in e-commerce: building a service that actually works

Addresses entered by customers are chaotic: abbreviations, ambiguous postal codes, city names with apostrophes, different formats per country. I built a validation service from scratch. Here are the real challenges.

Tags: aws, lambda, typescript, ecommerce, nodejs

Addresses that come in from e-commerce orders are a mess. “VIA ROMA, 1”, “Via roma1”, “v.roma 1”, “Viaroma 1” are all the same address — but no fulfillment system knows that. If an address doesn’t match exactly what the courier expects, the package comes back.

I built an address validation microservice for a fulfillment system handling orders from multiple marketplaces across several European countries. This article covers the architectural decisions and the non-obvious problems I ran into along the way.

What the service does

The service exposes three separate endpoints:

  • POST /validate-address — normalizes and validates an address. Input: addressLine1, city, zipCode, provinceCode, countryCode. Output: formatted address + geo coordinates.
  • POST /validate-customer — normalizes recipient name, phone and email, with country-specific rules.
  • POST /suggest-address — given a partial or imprecise address, returns suggestions from AWS Location Service and Google Maps.

Each endpoint is an independent Lambda with its own dependencies and throttling. Deploying them separately means each endpoint can be scaled and updated without touching the others.

The validation flow

An address going through /validate-address passes through four stages:

1. Sanitization — removes redundancies, normalizes the format, compacts street abbreviations.

2. Validation — checks that fields follow the rules: addressLine1 between 6 and 100 characters, must contain a street number, zipCode in the correct format for the country.

3. Geocoding — calls an external service with postal code and country code, gets back the list of associated localities.

4. Formatting — picks the correct city from the list, normalizes province code and geo coordinates.

Output:

{
  "addressLine1": "VIA ROMA 1",
  "city": "ROMA",
  "zipCode": "00186",
  "provinceCode": "RM",
  "countryCode": "IT",
  "geo": { "lat": 41.8919, "lng": 12.5113 }
}
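
As a rough illustration of stage 2, the field rules listed above could look like the sketch below. The names `AddressInput`, `ZIP_PATTERNS` and `validateAddress` are hypothetical and not taken from the actual service, and "contains a street number" is approximated as "contains at least one digit".

```typescript
// Hypothetical input shape; the field names follow the endpoint description.
interface AddressInput {
  addressLine1: string;
  city: string;
  zipCode: string;
  provinceCode: string;
  countryCode: string;
}

// Only Italy is listed here; other countries would register their own pattern.
const ZIP_PATTERNS: Record<string, RegExp> = {
  IT: /^\d{5}$/,
};

// Returns a list of human-readable problems; an empty list means "valid".
function validateAddress(input: AddressInput): string[] {
  const errors: string[] = [];
  const line = input.addressLine1.trim();

  if (line.length < 6 || line.length > 100) {
    errors.push('addressLine1 must be between 6 and 100 characters');
  }
  // Assumption: a street number is approximated as "at least one digit".
  if (!/\d/.test(line)) {
    errors.push('addressLine1 must contain a street number');
  }
  // Assumption: countries without a registered pattern are not checked here.
  const zipPattern = ZIP_PATTERNS[input.countryCode];
  if (zipPattern && !zipPattern.test(input.zipCode.trim())) {
    errors.push(`zipCode has the wrong format for ${input.countryCode}`);
  }
  return errors;
}
```

Collecting every problem in a list, rather than throwing on the first one, is a design choice of this sketch: it lets an operator see all the issues with an address at once.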

The street abbreviation problem

The first problem I hit: the same address arrives written in dozens of different ways. “Via”, “V.”, “VIA”, “via.” are all the same thing. So are “Piazza”, “P.zza”, “P.za”, “PIAZZA”.

I built a compactAddressLine() method with over 100 patterns to normalize these:

// PIAZZA → P.ZA
// VIA → V.  (but "VIALE" → "V.LE")
// STRADA PROVINCIALE 123 → SP123
// PIANO PRIMO → P.1
// INTERNO CINQUE → INT.5
// FRAZIONE → FRAZ.

The goal isn’t to make the address readable — it’s to make it consistent. If the courier has “V. ROMA 1” in their database and you send “Via Roma, 1”, the shipping label gets rejected.

One subtlety: decimal numbers in street numbers. “Via Roma 1.5” doesn’t exist in Italy, but it shows up in Amazon orders. It gets normalized to “VIA ROMA 1 5” (space instead of dot) to avoid breaking courier validations.
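
A pattern table of this kind can be sketched as an ordered list of regex/replacement pairs. The snippet below is a minimal illustration with five of the patterns mentioned above, not the real `compactAddressLine()` with its 100+ rules; the constant `ABBREVIATIONS` and the helper `replaceDecimalDots` are hypothetical names, and stripping commas is an assumption.

```typescript
// A tiny illustrative subset of the pattern table.
// Word boundaries (\b) keep VIA from matching inside VIALE.
const ABBREVIATIONS: Array<[RegExp, string]> = [
  [/\bSTRADA PROVINCIALE\s*(\d+)/g, 'SP$1'],
  [/\bVIALE\b/g, 'V.LE'],
  [/\bVIA\b/g, 'V.'],
  [/\bPIAZZA\b/g, 'P.ZA'],
  [/\bFRAZIONE\b/g, 'FRAZ.'],
];

// Uppercases, strips commas, collapses whitespace, then applies each pattern.
function compactAddressLine(raw: string): string {
  let line = raw
    .toUpperCase()
    .replace(/,/g, ' ')
    .replace(/\s+/g, ' ')
    .trim();
  for (const [pattern, replacement] of ABBREVIATIONS) {
    line = line.replace(pattern, replacement);
  }
  return line;
}

// Replaces a dot between two digits with a space: "1.5" becomes "1 5".
function replaceDecimalDots(line: string): string {
  return line.replace(/(?<=\d)\.(?=\d)/g, ' ');
}

console.log(compactAddressLine('Via Roma, 1'));      // V. ROMA 1
console.log(compactAddressLine('Viale Europa 10'));  // V.LE EUROPA 10
console.log(replaceDecimalDots('VIA ROMA 1.5'));     // VIA ROMA 1 5
```

Keeping the patterns as data rather than as a chain of `if` statements makes it cheap to add a rule whenever a courier rejects a new spelling.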

Ambiguous postal codes: when a zipCode maps to multiple cities

This is the most subtle problem. A postal code doesn’t uniquely identify a city — especially in metropolitan areas. The geocoding service returns a list of possible matches.

I need to pick the right one:

private formatAddress(address: Address, places: Place[]): Address {
  if (places.length === 1) {
    return { ...address, city: places[0].city, provinceCode: places[0].province_code };
  }

  // Normalize the input city name and look for an exact match
  const normalizedInput = this.normalizeCityName(address.city);
  const match = places.find(
    p => this.normalizeCityName(p.city) === normalizedInput
  );

  if (match) {
    return { ...address, city: match.city, provinceCode: match.province_code };
  }

  // No exact match — check if all municipalities share the same province code
  const provinceCodes = places.map(p => p.province_code);
  const safeProvinceCode = provinceCodes.every(p => p === provinceCodes[0])
    ? provinceCodes[0]
    : address.provinceCode; // fallback: use what the user entered

  return { ...address, city: places[0].city, provinceCode: safeProvinceCode };
}

City name normalization handles apostrophes and punctuation:

private normalizeCityName(city: string): string {
  return removeSpecialChar(city)
    .trim()
    .toLowerCase()
    .replace(/['\s.-]+/g, '-');
}
// "Sant'Agata di Militello" → "sant-agata-di-militello"
// "Reggio nell'Emilia"      → "reggio-nell-emilia"

This way “SANT AGATA DI MILITELLO”, “Sant’Agata di Militello” and “sant-agata-di-militello” all match against the same record.

Country-specific rules

Every country has its quirks. A few I didn’t expect.

Italy — zipCode must be exactly 5 digits. Province code must be 2 characters. If either is wrong, the order gets rejected at the courier.

Ireland (Eircode) — Irish postal codes have their own format: 7 alphanumeric characters, made of a 3-character routing key and a 4-character unique identifier (e.g. D02XY45). I don’t use the standard geocoding service for Ireland, because an Eircode already identifies an individual address rather than an area and requires dedicated handling.

Netherlands — Dutch zipCode format is 4 digits + 2 letters (e.g. 1234AB). The internal system only keeps the 4 numeric digits.

Portugal — format XXXX-XXX. Orders often arrive without the dash; I add it during normalization.

France — recipient phone number is mandatory. Other countries: optional. Customer validation checks this:

const phoneRequiredCountries = ['FR'];
if (phoneRequiredCountries.includes(countryCode) && !customer.phone) {
  throw new CustomerInvalidError('Phone is required for FR');
}
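
The postal code rules above can be condensed into a single normalization function. This is a minimal sketch under stated assumptions: `normalizeZipCode` is a hypothetical name, input is uppercased and stripped of whitespace before checking, and countries not covered in this article are passed through unchanged.

```typescript
// Normalizes a postal code according to the per-country rules described above.
// Throws if the code does not match the expected format for that country.
function normalizeZipCode(countryCode: string, rawZip: string): string {
  const zip = rawZip.toUpperCase().replace(/\s+/g, '');

  switch (countryCode) {
    case 'IT':
      // Exactly 5 digits.
      if (!/^\d{5}$/.test(zip)) throw new Error(`Invalid IT zipCode: ${rawZip}`);
      return zip;
    case 'IE':
      // Eircode: 7 alphanumeric characters.
      if (!/^[A-Z0-9]{7}$/.test(zip)) throw new Error(`Invalid Eircode: ${rawZip}`);
      return zip;
    case 'NL': {
      // 4 digits + 2 letters; only the 4 digits are kept.
      const match = /^(\d{4})[A-Z]{2}$/.exec(zip);
      if (!match) throw new Error(`Invalid NL zipCode: ${rawZip}`);
      return match[1];
    }
    case 'PT': {
      // XXXX-XXX; the dash is added when missing.
      const match = /^(\d{4})-?(\d{3})$/.exec(zip);
      if (!match) throw new Error(`Invalid PT zipCode: ${rawZip}`);
      return `${match[1]}-${match[2]}`;
    }
    default:
      // Assumption: other countries are returned as-is.
      return zip;
  }
}

console.log(normalizeZipCode('NL', '1234 ab')); // 1234
console.log(normalizeZipCode('PT', '1000001')); // 1000-001
```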

Suggestions from multiple providers

The /suggest-address endpoint serves a different use case: an operator has an address that fails validation and wants to see alternatives. It calls AWS Location Service and Google Maps and returns both responses.

A failing provider must not block the response. If AWS Location is down, I still want the Google Maps result:

private async searchAddress(params: SearchParams): Promise<string> {
  try {
    return await params.addressClient.searchAddress(params.addressLine);
  } catch (error) {
    return error instanceof Error ? error.message : 'Address search failed';
  }
}

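async suggestAddress(address: Address): Promise<AddressSuggestion> {
  // Assumption: the search string is the street line (this step was missing from the snippet)
  const addressLine = address.addressLine1;

  // searchAddress() never rejects, so one failing provider cannot make Promise.all fail
  const [aws, google] = await Promise.all([
    this.searchAddress({ addressClient: this.awsAddressClient, addressLine }),
    this.searchAddress({ addressClient: this.googleAddressClient, addressLine }),
  ]);
  return { aws, google };
}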

The caller always receives an {aws, google} object — even if one of them contains an error message instead of an address. This lets the UI show whatever result is available without having to handle complex error states.

The service as a shared package

An architectural choice that proved useful: the service layer is exported as an internal npm package, not just as an HTTP API.

The fulfillment system imports the services directly via the monorepo’s internal package:

import { AddressService, CustomerService, SuggestionService }
  from 'address-validation/services';

This eliminates a network hop for operations on the critical order path. When an order arrives, address validation happens inline — no HTTP call to another Lambda. Only when validation fails and an operator wants to see a suggestion does the REST endpoint come into play.

The validation logic lives in one place and can be used both via API and inline. If a rule changes (e.g. a new country is added), the change propagates everywhere automatically.

What I’d do differently

Parallel suggestions from the start — in the first version I called AWS Location and Google Maps in sequence, so latency was the sum of both calls. With Promise.all it drops to the latency of the slower provider. I fixed it, but should have done it from the start.

No circuit breaker — if the geocoding service is slow or unreachable, every request waits for the full timeout before failing. A circuit breaker that stops trying after N consecutive failures would reduce perceived latency during an outage.

No per-IP rate limiting — API Gateway throttles at 50 req/s overall, but there’s no granular protection by IP. A client with consistently malformed addresses can consume the entire geocoding quota.
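
The circuit breaker mentioned above could be sketched as follows. This is a minimal illustration, not code from the service: the `CircuitBreaker` class, its parameters and the injectable clock are all hypothetical. In a Lambda the state would live in memory per execution environment, so each warm instance would trip independently unless the counters were moved to a shared store.

```typescript
// Minimal circuit breaker: after maxFailures consecutive failures it rejects
// calls immediately until resetAfterMs has elapsed, then lets calls through again.
class CircuitBreaker {
  private failures = 0;
  private openedAt = 0;
  private readonly maxFailures: number;
  private readonly resetAfterMs: number;
  private readonly now: () => number;

  constructor(maxFailures: number, resetAfterMs: number, now: () => number = Date.now) {
    this.maxFailures = maxFailures;
    this.resetAfterMs = resetAfterMs;
    this.now = now; // injectable clock, mainly for tests
  }

  async call<T>(fn: () => Promise<T>): Promise<T> {
    const isOpen = this.failures >= this.maxFailures;
    if (isOpen && this.now() - this.openedAt < this.resetAfterMs) {
      // Fail fast instead of waiting for the downstream timeout.
      throw new Error('Circuit open: call skipped');
    }
    try {
      const result = await fn();
      this.failures = 0; // any success closes the circuit
      return result;
    } catch (error) {
      this.failures += 1;
      if (this.failures >= this.maxFailures) {
        this.openedAt = this.now(); // (re)start the cool-down window
      }
      throw error;
    }
  }
}
```

A geocoding call would then be wrapped as `breaker.call(() => geocode(zipCode, countryCode))`, where `geocode` stands in for whatever client function performs the lookup.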


Address validation seems like a solved problem: there are ready-made services like Loqate or SmartyStreets. But integrating an external paid service with your own domain specifics (local street abbreviations, courier-specific rules, internal postal code formats) still requires an adaptation layer. At that point, building it yourself gives you full control without adding a paid validation vendor on top.