Why Your JSON Parser Is Slower Than It Should Be
Sometime in 2019 I got pulled into a war room because a webhook delivery service was missing its 99th-percentile SLA by a factor of three. The service was supposed to deliver and process payment-event webhooks under 200ms p99; we were sitting at 600+ on a good day, occasionally spiking past 1.5 seconds when a customer with a particularly aggressive event pattern showed up.
Everyone assumed it was the network. The webhook service made outbound HTTP calls to customer endpoints, and the world is full of customers whose endpoints take a while. So we spent two days on retries, connection pooling, and outbound request timeouts. None of it moved the p99.
I finally got around to running a flame graph against the service, which I should have done on day one. About 38% of the wall-clock time on a slow request was inside json.Unmarshal. Not the network. Not the database. Not the customer endpoint. The JSON parser, parsing the same 50KB blob, on the same machine, twice per request — once on receive, once before re-serializing to send out.
The fix took me an afternoon. The talk I subsequently gave to the team about it is more or less what's in this article.
JSON is the lingua franca of modern web services, but the performance characteristics of JSON parsing are a thing most developers never seriously think about. Until they do. Usually because production is on fire. So let's go through the actual reasons your JSON parsing is slower than it should be, in roughly the order I run through them when somebody DMs me a flame graph.
Table of Contents
- Parsing Is Not Free (And You Probably Forgot)
- The Real Cost: Allocations, Not CPU
- Schema-less Parsing on the Hot Path
- Double-Parsing the Same Payload
- Pretty-Printing, Logging, and the 30% Overhead Nobody Notices
- When You Should Be Streaming Instead
- String Validation and the UTF-8 Tax
- How to Actually Measure
- The Takeaway
Parsing Is Not Free (And You Probably Forgot)
The first thing to internalize: JSON parsing is genuinely expensive on a per-byte basis compared to most things you do in a typical request handler. On modern hardware with a tuned parser like simdjson you can get 1–3 GB/s of throughput. With the default standard-library parser in most languages, you're closer to 200–500 MB/s. That sounds fast until you remember that your handler is doing many other things, and that you're often parsing the same payload more than once.
Here's a back-of-the-envelope I run when I'm trying to figure out whether parsing is the issue. If your service is processing 1,000 requests per second, with an average payload size of 10 KB, and you're parsing each payload twice (once on receive, once on log/audit), at 300 MB/s parser throughput, you're spending:
1,000 req/s × 10 KB × 2 parses ÷ 300 MB/s
= (1,000 × 10,000 × 2) bytes/s ÷ 300,000,000 bytes/s
= 0.067 seconds of parsing per second, i.e. 6.7% of a CPU core
per server, just on JSON parsing.
That's at perfectly reasonable scale. Now bump payload size to 50KB (which is normal for any payload that includes nested entities and metadata), or bump the parse count to four (request, audit, log enrichment, response re-serialize), and you're suddenly burning 30–40% of a core just turning bytes into objects and back again. That's where I was sitting in 2019, and it was visible in the flame graph the moment I bothered to look.
The Real Cost: Allocations, Not CPU
This is the part that took me embarrassingly long to figure out earlier in my career. The CPU cost of JSON parsing is not actually the problem most of the time. The allocation cost is.
When you call json.Unmarshal(payload, &v) in Go, or JSON.parse(s) in Node, or json.loads(s) in Python, the parser walks through the bytes, builds a tree of objects, allocates strings for every key, allocates strings for every string value, allocates wrapper objects for every nested map and array. On a typical 10KB payload with 80–100 fields, you can easily allocate 300+ small objects. Each of those allocations is fast in isolation. The aggregate cost is GC pressure that shows up later as p99 jitter.
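If you want to see this number for your own payloads, Go's benchmark harness will report it directly. A minimal sketch, assuming a sanitized sample payload sitting in a testdata/ directory (the path is hypothetical):

package bench

import (
    "encoding/json"
    "os"
    "testing"
)

// BenchmarkGenericDecode reports allocations per parse for a schema-less decode.
func BenchmarkGenericDecode(b *testing.B) {
    // Hypothetical path: point this at a captured, sanitized production payload.
    payload, err := os.ReadFile("testdata/sample_event.json")
    if err != nil {
        b.Fatal(err)
    }
    b.ReportAllocs()
    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        var v map[string]interface{}
        if err := json.Unmarshal(payload, &v); err != nil {
            b.Fatal(err)
        }
    }
}

Run it with go test -bench GenericDecode -benchmem and look at the allocs/op column; that count, multiplied by your request rate, is the allocation pressure your GC has to absorb.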
I remember the exact moment this clicked. We were debugging a payments service whose p99 latency would spike to 800ms every 30 seconds, like clockwork. Nothing in the application logs explained it. The infra team thought it was a noisy neighbor on the underlying VM. After three days, I finally correlated the spikes with garbage collector pause logs from the runtime. The service was producing so much short-lived allocation pressure from JSON parsing that the GC was forced into stop-the-world cycles every 30 seconds.
The fix was a combination of three things: switching to a streaming parser for the hot path, reusing decoder instances instead of allocating fresh ones, and most importantly, deserializing into pre-defined structs rather than into map[string]interface{} (more on that next). p99 dropped from 800ms to about 90ms, and the periodic GC spikes disappeared. We didn't add a single CPU. The hardware didn't change. We just stopped allocating so aggressively.
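The typed-struct and streaming pieces get their own sections below. The reuse piece is less obvious: encoding/json's Decoder has no Reset, so one way to approximate "stop allocating fresh ones" is to pool the decode targets instead. A minimal sketch with hypothetical names, worth doing only if a profile says allocation is actually your problem:

package webhook

import (
    "encoding/json"
    "sync"
)

// Event is a trimmed-down, hypothetical decode target.
type Event struct {
    Type   string `json:"type"`
    Amount int64  `json:"amount"`
}

var eventPool = sync.Pool{
    New: func() interface{} { return new(Event) },
}

func handle(payload []byte) error {
    ev := eventPool.Get().(*Event)
    // Reset before decoding: Unmarshal leaves fields that are absent
    // from the payload untouched, so a recycled struct can leak stale data.
    *ev = Event{}
    defer eventPool.Put(ev)

    if err := json.Unmarshal(payload, ev); err != nil {
        return err
    }
    // Only safe if process does not hold onto ev after returning.
    return process(ev)
}

func process(ev *Event) error {
    _ = ev // hand off to business logic here
    return nil
}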
Schema-less Parsing on the Hot Path
Here's the single biggest performance mistake I see in code reviews on JSON-heavy services: using schema-less generic parsing on a hot path when you actually know the schema.
In Go, this is the difference between:
// Schema-less: parser must allocate map and infer types
var data map[string]interface{}
json.Unmarshal(payload, &data)
// Schema-known: parser fills a pre-shaped struct
var event PaymentEvent
json.Unmarshal(payload, &event)
The first form is convenient. You can write it without thinking, you don't have to define a struct, you can poke around the fields with type assertions. It's also two to four times slower than the second form, and produces vastly more allocations. The parser has to allocate a map, allocate every key as a string, allocate boxed interface{} values for every field, and infer the type of each value at runtime.
The schema-aware form is faster because the parser knows in advance that field amount is an int64, field currency is a 3-character string, and field customer is a nested struct with its own known shape. The parser can write directly into the destination memory without intermediate allocations.
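Concretely, the schema-known form means having the PaymentEvent struct from above defined up front. This particular field set is just an illustration built from the fields named in the previous paragraph:

// PaymentEvent is one plausible pre-shaped destination for the payload.
type PaymentEvent struct {
    Type     string   `json:"type"`
    Amount   int64    `json:"amount"`   // decoded straight into an int64, no boxing
    Currency string   `json:"currency"` // e.g. "USD"
    Customer Customer `json:"customer"` // nested struct with its own known shape
}

type Customer struct {
    ID    string `json:"id"`
    Email string `json:"email"`
}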
I've seen the same pattern in Node (schema-aware serialization with fast-json-stringify vs. raw JSON.stringify) and in Python (typed decoding with a library like msgspec vs. plain json.loads). The exact APIs differ but the principle is universal: tell the parser what shape to expect, and it will be much faster.
The pushback I get on this is always: "but our payload shape changes." Sometimes that's actually true — multi-tenant platforms with customer-specific schemas, generic ingestion endpoints, that kind of thing. But more often it's not really true. The schema is "fixed enough" that you could define a struct for the 95% case and fall back to schema-less parsing for the unusual paths. That's the right default.
If you genuinely have variable shapes, look at hybrid approaches: parse a small known envelope first (e.g., the routing fields you need to dispatch on), then parse the body lazily into the schema you actually need based on the type field. This is what most well-designed event systems do.
Inspect a JSON Payload's Shape
If you're trying to decide whether a payload is regular enough to model as a struct, the JSON Formatter gives you a fully-expanded view with key-by-key structure. Useful for spotting which fields are stable and which are open-ended.
Open JSON Formatter →
Double-Parsing the Same Payload
This one is so common I'm willing to bet you've done it. I've done it. Multiple times.
The pattern: a webhook handler receives a JSON payload. Step one, it parses the payload into a generic map to figure out the event type. Step two, based on the event type, it parses the payload again into a typed struct. The bytes get walked twice. Every string gets allocated twice. The handler does double the work for no reason.
// The wasteful pattern
var generic map[string]interface{}
json.Unmarshal(payload, &generic)
eventType := generic["type"].(string)
switch eventType {
case "payment.succeeded":
    var event PaymentSucceeded
    json.Unmarshal(payload, &event) // parse #2
case "payment.failed":
    var event PaymentFailed
    json.Unmarshal(payload, &event) // parse #2
}
The fix is to define an envelope struct that captures only the routing field, and use the envelope to dispatch:
// Better: parse envelope, then defer body parsing
type Envelope struct {
    Type string          `json:"type"`
    Data json.RawMessage `json:"data"`
}

var env Envelope
json.Unmarshal(payload, &env)
switch env.Type {
case "payment.succeeded":
    var data PaymentSucceededData
    json.Unmarshal(env.Data, &data)
case "payment.failed":
    var data PaymentFailedData
    json.Unmarshal(env.Data, &data)
}
The json.RawMessage type tells Go's parser "don't actually decode this nested field; just hold onto the bytes." When you go to decode it later, you only walk the bytes that are relevant to the specific event. This pattern has equivalents in basically every JSON library worth using: RawMessage in Go, Buffer tricks in Node, RawValue in serde-json, etc.
I introduced this pattern at one job and it cut parser time on the webhook ingestion path by 47% on the median request. Not because the parser got faster, but because we stopped asking it to do the same work twice.
Pretty-Printing, Logging, and the 30% Overhead Nobody Notices
Here's a thing developers do without realizing it costs anything: they log the full request and response payloads as pretty-printed JSON. Sometimes the same payload gets pretty-printed twice — once on the way in, once on the way out — for "audit purposes."
Pretty-printing is more expensive than people realize. You're not just adding whitespace; you're doing a re-serialization pass through the entire object tree, and if you logged the parsed object (rather than the original bytes), you're allocating a fresh string buffer that's about 30–40% larger than the compact form because of all the indentation and newlines.
I worked on a service in 2021 where the request logger was secretly the second-most-expensive thing in the request path. The team had set up structured logging with a "log payload as pretty-printed JSON" option for debuggability, and forgotten to gate it behind a verbosity check in production. We had every request being parsed once for handling, then re-serialized to pretty-printed JSON for logging, then logged. The "harmless" debug feature was eating 28% of CPU on the service.
The takeaways:
- If you log payloads, log the original bytes, not a re-serialized form: string(payload) in Go, payload.toString() in Node, etc.
- If you must pretty-print for human-readable logs, gate it behind a debug-level check that's off in production (a minimal sketch of both points follows this list).
- Logging itself can be a parser cost in disguise. Most structured loggers serialize a log line as JSON, which means every value you stick in your log fields goes through another round of JSON encoding. If you're logging large structured payloads as fields, you're paying for that.
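Here's one way the first two points can look in Go, using log/slog and an environment variable as stand-ins for whatever logger and configuration you actually run; the names are hypothetical.

package webhook

import (
    "bytes"
    "encoding/json"
    "log/slog"
    "os"
)

// prettyLogs would come from real config; an env var is a stand-in here.
var prettyLogs = os.Getenv("PRETTY_PAYLOAD_LOGS") == "1"

func logPayload(logger *slog.Logger, payload []byte) {
    if !prettyLogs {
        // Production path: log the original bytes, no re-serialization pass.
        logger.Info("webhook received", slog.String("payload", string(payload)))
        return
    }
    // Debug path only: pretty-printing re-walks and re-buffers the whole payload.
    var buf bytes.Buffer
    if err := json.Indent(&buf, payload, "", "  "); err != nil {
        logger.Warn("payload is not valid JSON", slog.String("err", err.Error()))
        return
    }
    logger.Debug("webhook received", slog.String("payload", buf.String()))
}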
If you're trying to view JSON in a debugger or a development environment, do it client-side, not in your hot path. Tools like the JSON formatter exist precisely so you don't have to bake pretty-printing into your services.
When You Should Be Streaming Instead
The default mental model for JSON parsing is "read all the bytes into memory, then parse them." This is fine for payloads under a few MB. It is not fine for larger payloads, and it's especially not fine when you're dealing with arrays of records that you process one at a time.
Concrete example: I once had to ingest a daily JSON file from a data partner that contained an array of about 4 million transaction records. Total file size was around 800 MB. The original implementation called json.Unmarshal on the entire file, which produced a slice of 4 million record structs in memory, peak RSS pushing 6 GB once you accounted for all the allocated string and nested struct overhead. The OS started OOM-killing the worker.
The fix was to switch to a streaming decoder that processed one record at a time:
dec := json.NewDecoder(file)
// Read opening bracket of the array
if _, err := dec.Token(); err != nil {
    return err
}
for dec.More() {
    var record Transaction
    if err := dec.Decode(&record); err != nil {
        return err
    }
    process(record) // handle this record, then forget it
}
Memory dropped from 6 GB peak to about 80 MB steady-state. Throughput improved too, because we stopped fighting the GC. Most JSON libraries in production-grade languages have a streaming/incremental decoder; if you don't know yours, look it up before you write a "load the whole file" implementation.
The general rule I use: if the payload contains a top-level array of records, and you can process them one at a time, you should be streaming. The convenience of Unmarshal isn't worth the memory hit at any nontrivial scale.
String Validation and the UTF-8 Tax
Most JSON parsers validate that string values are valid UTF-8. This is correct behavior — RFC 8259 requires JSON text to be encoded in UTF-8 — but it costs cycles. On payloads that are mostly strings (common in event payloads, logs, customer data), UTF-8 validation can be 15–25% of the total parse cost.
You usually shouldn't disable this. UTF-8 validation is the thing that prevents bizarre downstream bugs from invalid input. But there are cases where you can get a real win:
- If you're parsing JSON that came from a trusted internal source and was already validated upstream, some parsers let you skip re-validation.
- If you're parsing JSON that you're going to immediately serialize back out unchanged (e.g., webhook forwarding, proxying), use a streaming forward path that doesn't fully decode strings, just identifies their boundaries.
- If you control both producer and consumer, and you're hitting a bottleneck, a binary format (Protobuf, MessagePack, CBOR, FlatBuffers) sidesteps both UTF-8 validation and the bulk of the parsing cost. Use the right tool for the job.
I've seen teams jump to "just use Protobuf" as the answer to every JSON performance problem. It's not always the right answer — JSON is human-readable, debuggable, and universally supported — but if you're at the point where parser cost dominates your service and you control both ends, it's worth measuring. A typical fintech event payload that's 8KB as JSON is often 2–3KB as Protobuf, with parser throughput 5–10x faster.
How to Actually Measure
Don't optimize JSON performance based on vibes. Don't optimize it based on this article either. Measure first.
Here are the tools I actually use:
1. Flame graphs
The single highest-leverage thing. Run a flame graph against your service under realistic load and look for parser frames. In Go, pprof; in Node, 0x; in Python, py-spy; in Java, async-profiler. If JSON parsing isn't visible on the flame graph, it's probably not your bottleneck. Don't bother optimizing it.
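For a Go service, the lowest-friction way to get that profile is the built-in net/http/pprof handler; the side port here is an assumption about your setup.

package main

import (
    "log"
    "net/http"
    _ "net/http/pprof" // registers /debug/pprof/* handlers on the default mux
)

func main() {
    // Serve the profiling endpoints on a side port; don't expose it publicly.
    go func() {
        log.Println(http.ListenAndServe("localhost:6060", nil))
    }()

    // The rest of your service wiring goes here.
    select {}
}

Then capture 30 seconds of CPU profile under realistic load and open the flame-graph view in a browser:
go tool pprof -http=:8081 "http://localhost:6060/debug/pprof/profile?seconds=30"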
2. Allocation profilers
CPU profiling alone misses the GC story. Use the allocation profiler in your runtime (go tool pprof with -alloc_objects in Go, heap/allocation profiling in Node, tracemalloc in Python). Look for hotspots in the parser. High allocation count is often a stronger signal of "this is your problem" than high CPU time.
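Assuming the same pprof endpoint as the sketch above, the allocation view is just a different sample type read from the heap profile:

# Rank frames by number of objects allocated (cumulative since process start):
go tool pprof -alloc_objects -top http://localhost:6060/debug/pprof/heap

# Or drill into parser frames interactively in the browser UI:
go tool pprof -alloc_objects -http=:8081 http://localhost:6060/debug/pprof/heap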
3. Microbenchmarks of parser variants
Once you've identified the hot path, run microbenchmarks on representative payloads against different parser implementations. In Go: encoding/json vs. jsoniter vs. sonic. In Node: JSON.parse vs. the simdjson bindings. In Python: json vs. orjson vs. ujson. The variance is sometimes 5–10x for the same workload.
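A sketch of what that comparison can look like in Go. The payload path, the struct shape, and the choice of jsoniter as the challenger are assumptions, not a recommendation; swap in whatever candidates you're evaluating.

package bench

import (
    "encoding/json"
    "os"
    "testing"

    jsoniter "github.com/json-iterator/go"
)

type paymentEvent struct {
    Type     string `json:"type"`
    Amount   int64  `json:"amount"`
    Currency string `json:"currency"`
}

// loadSample reads a captured, sanitized production payload (path is hypothetical).
func loadSample(b *testing.B) []byte {
    b.Helper()
    data, err := os.ReadFile("testdata/payment_event.json")
    if err != nil {
        b.Fatal(err)
    }
    return data
}

func BenchmarkStdlibDecode(b *testing.B) {
    payload := loadSample(b)
    b.ReportAllocs()
    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        var ev paymentEvent
        if err := json.Unmarshal(payload, &ev); err != nil {
            b.Fatal(err)
        }
    }
}

func BenchmarkJsoniterDecode(b *testing.B) {
    payload := loadSample(b)
    api := jsoniter.ConfigCompatibleWithStandardLibrary
    b.ReportAllocs()
    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        var ev paymentEvent
        if err := api.Unmarshal(payload, &ev); err != nil {
            b.Fatal(err)
        }
    }
}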
4. Synthetic but realistic payloads
Don't benchmark against toy payloads. Capture real production payloads (sanitize them first), pick a representative distribution, and benchmark against those. The performance characteristics of a parser on a 200-byte payload are wildly different from its characteristics on a 50KB payload with deep nesting.
Here's the rough order I try things in, when I'm pretty sure parser cost is real:
1. Eliminate double parsing of the same payload (envelope + RawMessage pattern).
2. Replace schema-less parsing with typed struct parsing on the hot path.
3. Switch to a streaming decoder for any large or array-shaped payloads.
4. Disable pretty-printing in production logs (or, better, log original bytes).
5. Swap the standard library parser for a faster third-party one if all the above isn't enough.
6. Re-architect to a binary format if you control both ends and are still bottlenecked.
Steps 1–4 are usually enough. Step 5 has caveats around library maturity and dependency cost. Step 6 is a real project, not an afternoon.
The Takeaway
Three things to remember.
One: JSON parsing is more expensive than your intuition suggests. At realistic production scale, it can be 10–40% of your service's CPU time, and it's almost always a contributor to GC pressure. The cost is real even when you're using a "fast" parser.
Two: Most of the win is from doing less parsing, not from finding a faster parser. Parse each payload once. Use schema-aware parsing when you can. Stream when the data is large. Don't pretty-print on the hot path. These four rules cover the majority of real-world JSON performance problems I've seen.
Three: Measure before you optimize. JSON parsing is a great place to spend an afternoon if your flame graph clearly says it's the problem, and a complete waste of an afternoon if it doesn't. Don't reach for parser optimization as a generic first move; reach for a profiler.
If you want to get familiar with JSON shapes and what a payload actually looks like at structural level, the JSON formatter is a useful sandbox. For converting between formats during the design phase — say, evaluating whether YAML or XML would be a better fit for human-edited config — the YAML to JSON and XML formatter tools handle the conversion side. None of these will speed up your production parser, but they'll help you reason about the shape of the data you're actually dealing with, which is half the battle.
Anyway, that's the talk. The webhook service from the opening of this piece eventually shipped a v2 that did about half the parsing of the original. P99 stayed under 200ms for the rest of the time I was at that company. The flame graph stopped looking embarrassing. And I learned the lesson I should have learned earlier: the fastest code is the code that doesn't run, and the fastest parse is the parse you skip.
Inspect a JSON Payload
Pretty-print, validate, and explore the structure of any JSON payload — without involving your production hot path.
Open JSON Formatter