
Payment Processing Engine

Custom payment orchestration layer processing $50M+ annually with a 99.99% success rate.

Client: E-commerce Platform
Duration: 8 months
Role: Lead Engineer

Volume: $50M+
Success Rate: 99.99%
Avg Latency: 180ms
Retry Recovery: 12%

Overview

Designed and built a custom payment orchestration layer that processes $50M+ annually across multiple payment providers. The system handles intelligent routing, automatic retry logic, and real-time reconciliation with a 99.99% success rate.

Challenge

The existing payment integration was a direct Stripe integration that:

  • Failed silently on network timeouts, leaving orders in limbo
  • Had no retry logic — failed charges required manual intervention
  • Couldn't route between providers for cost optimization
  • Lacked reconciliation — monthly accounting took days of manual work

Info: Analysis of 6 months of payment data revealed that 3.2% of initial charge attempts failed but were recoverable. At $50M annual volume, that is about $1.6M per year (3.2% of $50M) in potentially lost revenue.

Solution

Idempotent Payment Pipeline

Every payment operation is idempotent and observable:

// Payment operations are idempotent: each order attempt maps to one idempotency key
async function processPayment(intent: PaymentIntent): Promise<PaymentResult> {
  const idempotencyKey = `pay_${intent.orderId}_${intent.attempt}`;

  // Return the stored result if this attempt was already processed
  const existing = await db.paymentResult.findUnique({
    where: { idempotencyKey },
  });
  if (existing) return existing;

  // Call Stripe through the circuit breaker (assumed to enforce the timeout).
  // The same key is passed to Stripe so it can deduplicate a repeated request.
  const result = await circuitBreaker.fire(async () => {
    return stripe.paymentIntents.create(
      {
        amount: intent.amount,
        currency: intent.currency,
        customer: intent.customerId,
        metadata: { orderId: intent.orderId },
      },
      { idempotencyKey }
    );
  });

  // Persist the result in a single insert; the unique constraint on
  // idempotencyKey rejects a duplicate row from a concurrent call
  return db.paymentResult.create({
    data: { idempotencyKey, ...mapResult(result) },
  });
}
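
The check-then-call-then-store flow above can be exercised without Stripe or a database. The following is a minimal sketch, assuming an in-memory Map in place of the paymentResult table and a hypothetical `charge` callback in place of the provider call. `IdempotentProcessor`, `ChargeFn`, and the simplified types are illustrative names, not part of the original system. The sketch also lets concurrent calls for the same key share one in-flight request, which is one possible way to cover the window between the lookup and the insert.

```typescript
// Simplified, hypothetical versions of the PaymentIntent / PaymentResult types.
interface PaymentIntent {
  orderId: string;
  attempt: number;
  amount: number; // minor units, e.g. cents
  currency: string;
}

interface PaymentResult {
  idempotencyKey: string;
  status: "succeeded" | "failed";
  amount: number;
}

// Assumed provider call; in the real system this would be the Stripe request.
type ChargeFn = (intent: PaymentIntent, idempotencyKey: string) => Promise<PaymentResult>;

class IdempotentProcessor {
  // Completed results by key (stand-in for a table with a unique index).
  private readonly results = new Map<string, PaymentResult>();
  // In-flight calls by key, so concurrent duplicates share one provider request.
  private readonly inFlight = new Map<string, Promise<PaymentResult>>();
  private readonly charge: ChargeFn;

  constructor(charge: ChargeFn) {
    this.charge = charge;
  }

  process(intent: PaymentIntent): Promise<PaymentResult> {
    const key = `pay_${intent.orderId}_${intent.attempt}`;

    // Already processed: return the stored result without calling the provider.
    const existing = this.results.get(key);
    if (existing) return Promise.resolve(existing);

    // Currently being processed: reuse the pending call.
    const pending = this.inFlight.get(key);
    if (pending) return pending;

    const run = Promise.resolve()
      .then(() => this.charge(intent, key))
      .then((result) => {
        this.results.set(key, result);
        return result;
      });
    this.inFlight.set(key, run);

    // Clear the in-flight entry on success or failure so a later call can retry.
    const cleanup = (): void => {
      this.inFlight.delete(key);
    };
    run.then(cleanup, cleanup);
    return run;
  }
}
```

Calling `process` twice with the same `orderId` and `attempt` invokes `charge` once; a new `attempt` number produces a new key and a new provider call.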

Intelligent Retry Strategy

Not all failures are equal. The retry engine classifies failures and applies appropriate strategies:

// baseDelay values are in milliseconds
const RETRY_STRATEGIES: Record<FailureType, RetryConfig> = {
  network_timeout: {
    maxAttempts: 3,
    backoff: "exponential",
    baseDelay: 1000,
  },
  rate_limited: {
    maxAttempts: 5,
    backoff: "exponential",
    baseDelay: 5000,
  },
  card_declined: {
    maxAttempts: 1, // Don't retry hard declines
    backoff: "none",
    baseDelay: 0,
  },
  insufficient_funds: {
    maxAttempts: 3,
    backoff: "fixed",
    baseDelay: 86400000, // Retry daily
  },
};
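
To show how a table like this could drive scheduling, here is a minimal sketch of a delay calculation. The `RetryConfig` shape is inferred from the fields above, and `nextRetryDelay` is a hypothetical helper. It assumes that `maxAttempts` counts the first attempt and that "exponential" means doubling from `baseDelay`; neither is confirmed by the configuration itself. Production systems often add random jitter, which is left out here.

```typescript
type Backoff = "exponential" | "fixed" | "none";

// Shape inferred from the RETRY_STRATEGIES entries; baseDelay is in milliseconds.
interface RetryConfig {
  maxAttempts: number;
  backoff: Backoff;
  baseDelay: number;
}

// Returns the delay in ms before the next attempt, or null when no attempts remain.
// `attemptsMade` is the number of attempts already made, including the first.
function nextRetryDelay(config: RetryConfig, attemptsMade: number): number | null {
  if (attemptsMade >= config.maxAttempts) return null;
  switch (config.backoff) {
    case "none":
      return 0;
    case "fixed":
      return config.baseDelay;
    case "exponential":
      // Assumed doubling: baseDelay, 2x baseDelay, 4x baseDelay, ...
      return config.baseDelay * 2 ** Math.max(0, attemptsMade - 1);
  }
}
```

Under these assumptions, the `network_timeout` entry waits 1000 ms after the first failure and 2000 ms after the second, then gives up. The `card_declined` entry never schedules a retry because its single allowed attempt has already been used.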

Real-Time Reconciliation

Every transaction is reconciled against provider records within 5 minutes:

  1. Payment processed → event emitted
  2. Reconciliation worker picks up event
  3. Fetches provider-side record via API
  4. Compares amounts, fees, and status
  5. Flags discrepancies for review
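
Steps 4 and 5 amount to a field-by-field comparison. The following is a minimal sketch, assuming both sides have already been normalized into the same record shape with amounts in minor units. `LedgerRecord`, `Discrepancy`, and `reconcile` are hypothetical names for illustration, and fetching the provider record (step 3) is left out.

```typescript
// Hypothetical normalized record; amount and fee are in minor units (e.g. cents).
interface LedgerRecord {
  transactionId: string;
  amount: number;
  fee: number;
  status: string;
}

type ComparedField = "amount" | "fee" | "status";

interface Discrepancy {
  transactionId: string;
  field: ComparedField;
  local: number | string;
  provider: number | string;
}

// Compares the local record with the provider-side record and returns one
// entry per mismatched field; an empty array means the records agree.
function reconcile(local: LedgerRecord, provider: LedgerRecord): Discrepancy[] {
  const fields: ComparedField[] = ["amount", "fee", "status"];
  const discrepancies: Discrepancy[] = [];
  for (const field of fields) {
    if (local[field] !== provider[field]) {
      discrepancies.push({
        transactionId: local.transactionId,
        field,
        local: local[field],
        provider: provider[field],
      });
    }
  }
  return discrepancies;
}
```

A worker built along these lines would pass any non-empty result to a review queue and treat an empty result as a reconciled transaction.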

Recovery Rate: 12% (percentage of initially failed payments recovered through intelligent retries)

Technical Decisions

Why TypeScript: Payment code demands type safety. A type error that only surfaces at runtime in payment processing can mean lost revenue or double charges. TypeScript catches many of these at compile time.

Why Circuit Breaker: When Stripe has a partial outage, continuing to send requests makes the problem worse. The circuit breaker pattern prevents cascade failures and provides fast feedback to the application.
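
For readers unfamiliar with the pattern, here is a minimal sketch of a circuit breaker with the three usual states. It is not the breaker used in the project: `SimpleCircuitBreaker`, its constructor parameters, and the injectable clock are assumptions made for illustration. The `fire` method takes the action as an argument to mirror the call shape in the pipeline code.

```typescript
type BreakerState = "closed" | "open" | "half_open";

class SimpleCircuitBreaker {
  private state: BreakerState = "closed";
  private failures = 0;
  private openedAt = 0;
  private readonly failureThreshold: number;
  private readonly resetTimeoutMs: number;
  private readonly now: () => number;

  // `now` is injectable so the breaker can be tested without real waiting.
  constructor(failureThreshold: number, resetTimeoutMs: number, now: () => number = Date.now) {
    this.failureThreshold = failureThreshold;
    this.resetTimeoutMs = resetTimeoutMs;
    this.now = now;
  }

  getState(): BreakerState {
    return this.state;
  }

  async fire<T>(action: () => Promise<T>): Promise<T> {
    if (this.state === "open") {
      // Fail fast while open: the provider is not called at all.
      if (this.now() - this.openedAt < this.resetTimeoutMs) {
        throw new Error("circuit open: request rejected without calling the provider");
      }
      // Reset timeout elapsed: let a trial request through.
      this.state = "half_open";
    }
    try {
      const result = await action();
      // Any success closes the circuit and clears the failure count.
      this.failures = 0;
      this.state = "closed";
      return result;
    } catch (err) {
      this.failures += 1;
      // A failed trial, or too many consecutive failures, opens the circuit.
      if (this.state === "half_open" || this.failures >= this.failureThreshold) {
        this.state = "open";
        this.openedAt = this.now();
      }
      throw err;
    }
  }
}
```

While the circuit is open, callers get an immediate error instead of waiting on a struggling provider, which is the fast feedback described above. This sketch does not limit concurrent trial requests in the half-open state, which a production breaker typically would.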

Why PostgreSQL: ACID transactions are non-negotiable for payment state management. The idempotency key pattern relies on unique constraints and atomic inserts.

Results

  • $50M+ processed annually through the orchestration layer
  • 99.99% success rate (up from 96.8% with direct integration)
  • 12% recovery rate on initially-failed payments
  • 180ms average latency for the full payment pipeline
  • Zero reconciliation discrepancies in 12 months of operation

Tech Stack

TypeScript, Stripe, PostgreSQL, Docker, Node.js, Redis