
JSON Interoperability Vulnerabilities: A Deep Dive

Published at 06:06 PM

Hey everyone,

JSON (JavaScript Object Notation) has become the standard for data interchange on the web, and like many of you, I use it daily. Its simplicity and human-readable format are great, but this simplicity can be deceiving. I’ve been digging into how subtle differences in how JSON is parsed can lead to serious security vulnerabilities, especially in today’s complex, polyglot microservice architectures. It’s a fascinating area, and I wanted to share some of my findings.

The Problem: JSON’s “Simplicity” is a Lie!

JSON’s apparent simplicity masks a surprising lack of strict standardization. While RFC 8259 (which obsoletes the older RFC 7159) provides a general framework, it deliberately leaves several behaviors implementation-defined, especially around edge cases. This has led to the development of numerous JSON parsers, each with its own quirks and idiosyncrasies. I’ve lost count of the number of times I’ve seen weird parsing behavior in different languages.

In a modern application, a single JSON payload might be processed by multiple services, each using a different JSON parser. For example:

- A JavaScript frontend parsing API responses with `JSON.parse`
- A Python backend using `json.loads`
- A Go microservice using `encoding/json`
- An API gateway or WAF with its own embedded parser

If these parsers interpret the same JSON payload differently, it can lead to inconsistencies and vulnerabilities. And that’s where things get interesting (and potentially dangerous).

Categories of JSON Interoperability Vulnerabilities

I’ve been categorizing JSON interoperability vulnerabilities into several key areas based on my research and bug bounty experiences:

1. Inconsistent Handling of Duplicate Keys

The JSON specification leaves the behavior of duplicate keys in a JSON object undefined (RFC 8259 only says member names “should” be unique). This is a big problem! It means different parsers can handle them in different ways. Some parsers might:

- Keep only the last occurrence (the most common behavior)
- Keep only the first occurrence
- Keep both values and expose them as a list
- Raise an error and reject the document

Example:

Consider the following JSON payload:

```json
{
  "user": "alice",
  "user": "bob"
}
```

One parser might interpret this as `{"user": "alice"}`, while another might interpret it as `{"user": "bob"}`. It’s a recipe for disaster!

Vulnerability Scenario:

Imagine a scenario where a user sends a request to create an account with the following JSON payload:

```json
{
  "username": "alice",
  "username": "admin",
  "role": "user"
}
```

If the first parser in the chain takes the first occurrence of the username key, validation sees an ordinary signup for “alice”. However, if a subsequent parser takes the last occurrence, the account is actually created as “admin” — a classic parser-differential privilege escalation! I’ve actually found this in the wild.

Here’s a Python example to illustrate the different behaviors:

```python
import json

# Example 1: last value wins (Python's default)
json_data1 = '{"user": "alice", "user": "bob"}'
data1 = json.loads(json_data1)
print(f"Python Example 1: {data1}")  # Output: {'user': 'bob'}

# Example 2: first value wins, via object_pairs_hook (no manual parsing needed)
def first_duplicate_wins(pairs):
    data = {}
    for key, value in pairs:
        if key not in data:  # keep only the first occurrence of each key
            data[key] = value
    return data

data2 = json.loads(json_data1, object_pairs_hook=first_duplicate_wins)
print(f"Python Example 2 (simulated 'first' behavior): {data2}")  # {'user': 'alice'}
```

2. Key Collision: Character Truncation and Comments

Some JSON parsers exhibit unexpected behavior when encountering certain characters or syntax within keys, such as truncation or misinterpretation of comments.

Character Truncation: Certain parsers might truncate keys after encountering specific characters, leading to unintended key collisions. Null bytes are a classic example.

Misinterpreted Comments: While JSON itself doesn’t support comments, some parsers might allow them as an extension. However, inconsistencies in how these comments are handled can lead to vulnerabilities. For instance, a parser might misinterpret a comment as part of a key, leading to unexpected behavior. This is less common, but I’ve seen it in some older or less common parsers.
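As a quick illustration of the strict side of this mismatch (the permissive side depends entirely on the specific library), Python’s standard `json` module rejects comments outright, so any payload relying on them only “works” in more lenient parsers:

```python
import json

# Standard-compliant parsers reject comments; permissive ones may strip them,
# so two parsers in a chain can disagree about whether this payload is valid
payload = '{"user": "alice" /* also "admin"? */, "role": "user"}'
try:
    json.loads(payload)
    print("Parsed (permissive behavior)")
except json.JSONDecodeError as e:
    print(f"Strict parser rejected the comment: {e}")
```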

Vulnerability Scenario:

Consider a JSON parser that truncates keys after encountering a null byte (\0). An attacker could send a payload like this:

```json
{
  "admin\0": "true",
  "user": "alice"
}
```

If the parser truncates the key, it might store the value as if the key was simply “admin”. If another part of the system checks for the “admin” key, this could lead to an unauthorized privilege escalation. This is a fun one to test for.

Here’s how you might test for null byte truncation in Python (note: Python’s json.loads doesn’t truncate, but other libraries or custom parsers might):

```python
import json

def check_null_byte_truncation(key):
    test_json = f'{{"{key}": "test"}}'
    try:
        data = json.loads(test_json)
        parsed_key = list(data.keys())[0]
        # Check if the key in the parsed object differs from the original
        if parsed_key != key:
            print(f"Potential truncation with key: {key!r}")
            print(f"Parsed as: {parsed_key!r}")
        else:
            print(f"No truncation detected with key: {key!r}")
    except json.JSONDecodeError as e:
        # Python rejects raw control characters in strings outright
        print(f"JSONDecodeError with key: {key!r} - {e}")

check_null_byte_truncation("admin\0")  # rejected: raw NUL is an invalid control character
check_null_byte_truncation("admin")    # parses cleanly, no truncation
```
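The raw null byte above gets rejected by Python’s strict parser, but the same byte can be smuggled in as the perfectly legal escape sequence `\u0000`. Here’s a sketch of what that looks like; Python keeps the full key, while a truncating parser would collapse it to `admin`:

```python
import json

# \u0000 is a legal escape inside a JSON string
data = json.loads('{"admin\\u0000": "true", "user": "alice"}')
print(list(data.keys()))  # ['admin\x00', 'user'] — Python keeps the null byte
print("admin" in data)    # False: no truncation here
# A parser that truncates at the null byte would instead see the key "admin"
```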

3. JSON Serialization Quirks

JSON serialization is the process of converting data structures (like objects or arrays) into a JSON string. Different programming languages and libraries might implement serialization differently, leading to inconsistencies. This is where you really need to pay attention to the specific libraries being used.

Example:

Consider how different languages might serialize a dictionary/map with non-string keys:

- Python’s `json.dumps` silently coerces `int`, `float`, `bool`, and `None` keys to strings, and raises a `TypeError` for anything else
- JavaScript’s `JSON.stringify` coerces numeric keys to strings automatically, since object keys are always strings
- Some serializers in other languages fall back to a generic string conversion, or simply refuse non-string keys

Vulnerability Scenario:

If a backend system serializes a JSON object with integer keys, and a frontend system deserializes it with a parser that expects string keys, the frontend might not be able to access the data correctly, potentially leading to errors or denial of service.

Here’s a Python and JavaScript example:

```python
import json

# Python coerces int/float/bool/None keys to strings...
data_python = {1: "value"}
json_string_python = json.dumps(data_python)
print(f"Python Serialized: {json_string_python}")  # {"1": "value"}

# ...but other key types, such as tuples, raise a TypeError
try:
    json.dumps({(1, 2): "value"})
except TypeError as e:
    print(f"Python TypeError: {e}")
```

```html
<script>
  // JavaScript object keys are always strings, so numeric keys are coerced too
  const data_js = { 1: "value" };
  const json_string_js = JSON.stringify(data_js);
  console.log(`JavaScript Serialized: ${json_string_js}`); // {"1":"value"}

  const parsed_js = JSON.parse(json_string_js);
  console.log(`JavaScript Parsed:`, parsed_js);
  console.log(`Type of key '1':`, typeof Object.keys(parsed_js)[0]); // string
</script>
```

4. Float and Integer Representation

While JSON supports numbers, there can be subtle differences in how floating-point numbers and integers are represented and parsed across different systems. This can lead to rounding errors or unexpected behavior when converting between different representations. This is a classic source of bugs, especially when dealing with financial data.

Example:

Different languages may use different levels of precision when representing floating-point numbers. This can lead to slight variations in the serialized JSON, which, while seemingly innocuous, can cause issues in systems that rely on exact matching.

Vulnerability Scenario:

Consider a financial application that uses JSON to exchange transaction data. If different systems use different levels of precision for representing currency amounts, it could lead to discrepancies in the amounts processed, potentially resulting in financial losses or fraud.

Here’s a Python example:

```python
import json

# Example: floating-point precision
# These two literals differ only in the 20th decimal place, beyond what an
# IEEE-754 double can represent, so they collapse to the same value
number1 = 1.00000000000000000001
number2 = 1.00000000000000000002
json_string1 = json.dumps({"num": number1})
json_string2 = json.dumps({"num": number2})
print(f"JSON String 1: {json_string1}")  # {"num": 1.0}
print(f"JSON String 2: {json_string2}")  # {"num": 1.0}

data1 = json.loads(json_string1)
data2 = json.loads(json_string2)

if data1 == data2:
    print("Numbers are equal after serialization/deserialization")
else:
    print("Numbers are different after serialization/deserialization")
```
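Precision issues aren’t limited to floats. JSON has no integer type, and JavaScript’s `JSON.parse` turns every number into an IEEE-754 double, which is exact only up to 2^53. Here’s a Python sketch of where that boundary bites:

```python
import json

big = 2**53 + 1  # 9007199254740993
# Python round-trips the integer exactly...
assert json.loads(json.dumps(big)) == big
# ...but as a double (what JavaScript's JSON.parse produces) it collides
# with its neighbor, silently changing the value
print(float(big) == float(2**53))  # True
```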

5. Permissive Parsing and Other Bugs

Some JSON parsers are more “permissive” than others, meaning they might accept JSON that doesn’t strictly adhere to the standard. This can include:

- Trailing commas
- Single-quoted or unquoted keys and strings
- Comments (`//` or `/* */`)
- Unescaped control characters inside strings
- `NaN` and `Infinity` literals

While permissive parsing can be convenient in some cases, it can also create security vulnerabilities. It’s a trade-off, and in security, strictness is generally better.

Example:

A permissive parser might happily accept the following JSON, which has a trailing comma:

```json
{
  "user": "alice",
  "role": "user",
}
```

A stricter parser in the chain will reject this payload outright, while an especially lax one might even treat the trailing comma as an extra empty entry. Either way, two services now disagree about what the same request contains.

Vulnerability Scenario:

If a parser allows unquoted keys or unescaped control characters, an attacker might be able to inject unexpected characters into keys, potentially bypassing input validation or exploiting other vulnerabilities. For example, if a system expects a key to be `id`, an attacker might send `{"id\n": 123}`, and if the newline character is not properly handled, it could lead to log injection.

Here’s a JavaScript example:

```html
<script>
  // Example of unquoted keys (not standard JSON)
  try {
    const jsonString = "{id: 123}"; // unquoted key 'id'
    const parsedData = JSON.parse(jsonString);
    console.log("Parsed data (permissive parser behavior):", parsedData);
  } catch (error) {
    console.error("Error parsing JSON:", error); // standard JSON.parse throws here
  }

  // Example of a trailing comma
  try {
    const jsonStringWithComma = '{"key1": "value1", "key2": "value2",}';
    const parsedDataComma = JSON.parse(jsonStringWithComma);
    console.log("Parsed data with trailing comma:", parsedDataComma);
  } catch (error) {
    console.error("Error parsing JSON with trailing comma:", error); // also throws
  }
</script>
```
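Note that even strict parsers accept *escaped* control characters inside strings, so the log-injection idea above doesn’t need any permissive extensions at all. A minimal Python sketch (the audit-log format here is hypothetical):

```python
import json

# \n is a legal escape inside a JSON string, so keys can carry newlines
payload = '{"id\\n[AUDIT] role=admin granted": 123}'
data = json.loads(payload)
for key in data:
    # A naive logger that interpolates the key now emits two log lines,
    # the second one forged by the attacker
    print(f"received key: {key}")
```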

Real-World Examples and Case Studies

While it’s tricky to point to specific, publicly disclosed exploits solely caused by these JSON quirks (they’re often part of a bigger attack), the underlying issues are real. I’ve seen variations of these in bug bounties and internal security audits. Input validation issues, which are super common, often involve JSON.

How to Find These Vulnerabilities in Bug Bounties

Okay, this is the part you’re probably most interested in: how to find these vulnerabilities in bug bounty programs. Here’s my strategy:

  1. Identify JSON Endpoints: The first step is to find endpoints that accept JSON input. Look for API endpoints, forms that submit data as JSON, and any other place where the application processes JSON. Burp Suite is your friend here. Pay close attention to the Content-Type header.

  2. Fuzz with Modified JSON: This is where the fun begins. Start sending modified JSON payloads to these endpoints. Here are some techniques I use:

     - Duplicate keys with conflicting values
     - Keys containing null bytes, newlines, or Unicode escapes
     - Trailing commas, comments, and unquoted keys
     - Very large integers and high-precision floats

  3. Observe the Behavior: Carefully observe how the application responds to your modified JSON payloads. Look for:

     - Differences between the value you sent and the value reflected back
     - Error messages that leak parser or framework details
     - Inconsistent responses between endpoints processing the same payload

  4. Use Automation: For large applications, it’s helpful to automate the process of fuzzing JSON input. Tools like Burp Suite Intruder or custom scripts can be used to send a large number of modified JSON payloads automatically. I often write quick Python scripts to generate payloads.

  5. Check Different Parsers: If you can, try to identify the specific JSON parsers being used by the application. This can help you tailor your attacks to the specific quirks of those parsers. Sometimes, error messages will leak this information. For example, a Python traceback might reveal that json.loads is being used.

  6. Look for Chained Vulnerabilities: JSON vulnerabilities are often part of a chain. For example, a JSON parsing vulnerability might allow you to inject data into a database, which could then be used to exploit a SQL injection vulnerability.
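To make the fuzzing step concrete, here’s a small payload generator along the lines of the quick scripts I mentioned. The key and value names are just placeholders; adapt them to the target:

```python
import json

def differential_payloads(key, benign, malicious):
    """Yield JSON variants that different parsers may resolve differently."""
    yield json.dumps({key: benign})  # baseline
    yield f'{{"{key}": "{benign}", "{key}": "{malicious}"}}'         # duplicate key
    yield f'{{"{key}\\u0000": "{malicious}", "{key}": "{benign}"}}'  # null-byte key
    yield f'{{"{key}": "{benign}",}}'                                # trailing comma
    yield f'{{"{key}": "{benign}" /* {malicious} */}}'               # comment

for payload in differential_payloads("role", "user", "admin"):
    print(payload)
```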

Mitigation Strategies

So, what can we do to defend against these attacks? Here are some best practices that I recommend:

- Standardize on a single, well-maintained JSON parser across services wherever possible
- Reject payloads containing duplicate keys instead of silently picking one
- Prefer strict parsing modes; disable comments, trailing commas, and other extensions
- Canonicalize JSON once at the edge and pass the normalized form downstream
- Validate the parsed result against a schema (e.g., JSON Schema), not the raw string
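For the duplicate-key problem specifically, many parsers let you hook the key/value pairs before the object is built. In Python that’s `object_pairs_hook`, which makes rejection a short sketch:

```python
import json

def reject_duplicates(pairs):
    """object_pairs_hook that refuses payloads containing duplicate keys."""
    obj = {}
    for key, value in pairs:
        if key in obj:
            raise ValueError(f"duplicate key in JSON payload: {key!r}")
        obj[key] = value
    return obj

print(json.loads('{"user": "alice"}', object_pairs_hook=reject_duplicates))
try:
    json.loads('{"user": "alice", "user": "bob"}', object_pairs_hook=reject_duplicates)
except ValueError as e:
    print(f"Rejected: {e}")
```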

Important Notes for Bug Bounties:

- Stay in scope and avoid destructive payloads; parser differentials are easy to demonstrate safely
- A parsing quirk alone is rarely rewarded — show a concrete impact, such as a validation bypass or privilege escalation

Conclusion

JSON’s ubiquity and apparent simplicity make it easy to overlook the potential security risks associated with its interoperability. As applications become more complex and distributed, it’s crucial to be aware of the subtle differences in how JSON is parsed and handled by different systems.

By following the mitigation strategies outlined in this blog post, developers can significantly reduce the risk of JSON interoperability vulnerabilities and build more secure and robust applications. The key takeaway? Treat JSON parsing as a security-sensitive operation.

Validate everything. And happy bug hunting!


