The Hidden Complexity of the 'Default' Choice: Crate vs. AWS API Gateway

Written by Derrick Antaya

For many, AWS API Gateway is the default choice. It’s integrated into the console, it’s “standard,” and it’s AWS. But for a growing SaaS, that convenience is a mirage. To understand why we built Crate, you have to look at the four critical layers where AWS fails: Developer Experience (DX), Operational Predictability, Technical Resilience, and Economic Scalability.

I. The DX Gap: Velocity Templates vs. Modern Engineering

AWS uses the Apache Velocity Template Language (VTL) for request and response transformations. If you have ever tried to debug a nested JSON-to-XML mapping in a VTL template, you know the pain. VTL is dynamically typed, and API Gateway mapping templates run inside the service against AWS-specific variables such as $input and $context, so you cannot easily unit test them in your local Go or JS environment.

The Crate Approach: Declarative JSONata

We replaced VTL with JSONata. Because JSONata is a declarative query and transformation language for JSON, its expressions are shorter and easier to read than imperative template code. In Crate, a native Go engine evaluates each transformation as a stateless function of its input.

// Theory: The Crate Transformation Budget
// This ensures that even the most complex deep-object mapping 
// remains within our latency budget.
func transform(ctx context.Context, input []byte, expr string) ([]byte, error) {
    // We enforce a strict 5ms deadline for transformations so that a slow
    // expression cannot hold a request goroutine or degrade throughput.
    ctx, cancel := context.WithTimeout(ctx, 5*time.Millisecond)
    defer cancel()

    // Cancellation in Go is cooperative: the engine must check ctx during
    // evaluation so that it stops soon after the budget is exceeded.
    return jsonata.Execute(ctx, input, expr)
}

By wrapping each transformation in a Go context with a strict 5ms execution deadline, we ensure that a complex transformation cannot occupy a request goroutine indefinitely (Go has no event loop to block, but a runaway expression still costs CPU and latency). This provides a deterministic “transformation budget” that AWS’s shared mapping engine does not let you set.

II. The Operational Gap: Shared Jitter vs. Bare Metal Isolation

AWS API Gateway is a multi-tenant service. Your requests are handled by a shared fleet of proxies. When a “noisy neighbor” on that same fleet spikes their traffic, you feel it in your p99 latencies.

The Crate Approach: atomic.Value on Bare Metal

We host our Business Tier on Dedicated Bare Metal at Colocrossing. By removing the hypervisor layer entirely, we eliminate the “Micro-Jitter” associated with virtualized cloud environments.

To achieve zero-downtime updates, we avoid the heavy “Nginx reload” pattern. Instead, we use atomic.Value to hot-swap our routing table in memory.

// Theory: Zero-Downtime Configuration Swapping
type Gateway struct {
    // Routes stores our *RouteMap. Using atomic.Value allows 
    // for Lock-Free reads, which is critical at 100k req/s.
    Routes atomic.Value 
}

func (g *Gateway) UpdateConfig(newMap *RouteMap) {
    // Atomic pointer swap ensures that current requests continue 
    // using the old map while new requests immediately use the new one.
    // This avoids the common 'stop-the-world' latency spikes seen in 
    // mutex-based or file-based reloads.
    g.Routes.Store(newMap)
}

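This approach is friendlier to CPU caches than a mutex. In a high-concurrency gateway, every RLock and RUnlock on a sync.RWMutex writes to the lock's shared reader count, so the cache line holding the lock bounces between cores even when nobody is updating the config. An atomic load of the route pointer is read-only: the cache line can stay shared across cores until the next swap, which keeps the “Read” path as close to hardware speed as Go allows.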
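For completeness, here is a sketch of the read side that pairs with the swap. The RouteMap layout (a path-to-upstream map), NewGateway, and Lookup are assumptions made for illustration. Two details matter with atomic.Value: every Store must use the same concrete type, and Load returns nil until the first Store, so the sketch seeds an initial map in the constructor.

```go
package main

import "sync/atomic"

// RouteMap is a hypothetical routing table: request path -> upstream URL.
// It must be treated as immutable once published.
type RouteMap struct {
    routes map[string]string
}

type Gateway struct {
    Routes atomic.Value // always holds a *RouteMap
}

// NewGateway stores an initial map so that Load never returns nil.
func NewGateway(initial *RouteMap) *Gateway {
    g := &Gateway{}
    g.Routes.Store(initial)
    return g
}

// Lookup is the lock-free read path: one atomic load, then a plain map read.
func (g *Gateway) Lookup(path string) (string, bool) {
    rm := g.Routes.Load().(*RouteMap)
    upstream, ok := rm.routes[path]
    return upstream, ok
}

// UpdateConfig publishes a new table. Callers build a fresh RouteMap instead
// of mutating the one in use, so in-flight readers keep a consistent view.
func (g *Gateway) UpdateConfig(newMap *RouteMap) {
    g.Routes.Store(newMap)
}
```

On Go 1.19 and later, atomic.Pointer[RouteMap] gives the same behavior without the type assertion.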

III. The Resilience Gap: Passive vs. Active Failover

In AWS, “High Availability” often means a multi-region setup that requires complex Route 53 health checks and, in some cases, manual DNS intervention. This is passive failover: resolvers keep serving the cached record until its TTL expires, and in the meantime your users see 5xx errors.

The Crate Approach: In-Flight Retries & Waterfall Routing

This is where Crate truly deviates from the industry. We implemented In-Flight Retries. If a downstream provider returns a 5xx, Crate detects it mid-request. If you have a secondary destination configured, Crate automatically re-routes that exact payload to the healthy node before the client ever sees an error.

To handle the Idempotency Paradox, we distinguish between network-level failures and application-level failures:

// Theory: The Idempotency Safety Check
func shouldRetry(req *http.Request, err error, cfg RouteConfig) bool {
    // Connection-level failures (e.g. connection refused) are generally safe
    // to retry because the request never reached the application logic.
    if isNetworkError(err) {
        return true
    }
    
    // For 5xx status codes, the server might have partially processed the 
    // request. We only retry if the user has explicitly whitelisted 
    // the method via the Crate Dashboard.
    return cfg.RetryMethods[req.Method] 
}

IV. The Economic Audit: The Egress Tax

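The final, and perhaps most painful, part of the AWS experience is the Egress Fee. Charging ~$0.09/GB for data leaving their network is a “Success Tax”: the more traffic you serve, the more you pay. At that rate, serving 10 TB a month costs roughly $900 in bandwidth alone.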

Crate’s Waterfall Routing, our “Financial Firewall,” is designed to keep that bill as small as possible. By treating various cloud providers (Vercel, DigitalOcean, Fly.io) as a pool of resources, Crate routes traffic to exhaust free tiers and low-cost bandwidth first, and only then falls back to your expensive AWS instances.
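The selection rule can be sketched in a few lines. Everything here is hypothetical and for illustration only: the Provider fields, the pick and route functions, and any allowance or price you plug in are assumptions, not real provider quotas or Crate's actual scheduler.

```go
package main

// Provider is a hypothetical view of one backend in the pool.
type Provider struct {
    Name          string
    FreeBytesLeft int64   // remaining free-tier egress this billing period
    CostPerGB     float64 // price once the free tier is exhausted
}

// pick returns the index of the provider that should serve n bytes: the
// first one (in priority order) with enough free allowance, otherwise the
// cheapest paid option. It returns -1 for an empty pool.
func pick(providers []Provider, n int64) int {
    cheapest := -1
    for i, p := range providers {
        if p.FreeBytesLeft >= n {
            return i
        }
        if cheapest == -1 || p.CostPerGB < providers[cheapest].CostPerGB {
            cheapest = i
        }
    }
    return cheapest
}

// route picks a provider and debits its free allowance when one was used.
func route(providers []Provider, n int64) string {
    i := pick(providers, n)
    if i < 0 {
        return ""
    }
    if providers[i].FreeBytesLeft >= n {
        providers[i].FreeBytesLeft -= n
    }
    return providers[i].Name
}
```

A production version would also need health checks and concurrency-safe accounting; the sketch shows only the ordering rule of free allowance first, cheapest paid bandwidth second.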

Comparison Summary

| Metric         | AWS API Gateway           | Crate.cc                       |
|----------------|---------------------------|--------------------------------|
| Mapping Engine | VTL (Brittle, Untestable) | JSONata (Declarative, Safe)    |
| Isolation      | Shared Multi-tenant       | Dedicated Bare Metal           |
| Update Logic   | Deployment Groups (Slow)  | atomic.Value (Atomic/Instant)  |
| Failover       | DNS-Based (Passive)       | In-Flight (Active/Mid-request) |
| Pricing        | Usage-Based + Egress      | Fixed/Predictable              |