One of the core challenges of an API is the balance between connecting data and protecting that data. APIs want to talk – they want to connect users to other users, offering data along the way. Unfortunately, not everyone can be trusted, and the ever-increasing count of API breaches has raised alarms about data privacy and organizational integrity in the minds of users and developers alike.
How, then, should organizations properly prevent these breaches – and how should they respond when they occur?
The Anatomy of API Breaches
Before we dive in at length, we should first clarify what a “breach” actually is in the context of APIs. In the most basic sense, an API breach occurs when unauthorized users exploit some sort of vulnerability in an API or the systems that support it. These breaches can take a variety of forms, including those that are direct (the system itself is broken into and data is exfiltrated) and indirect (such as when a service with exposed pagination patterns is scraped without the knowledge of the users or the permission of the underlying API and service).
Breaches can have a range of far-reaching consequences, including the reduction of user trust due to exposed data, reputational damage (especially in the security space), operational damage to ongoing services, and even legal and regulatory challenges arising from poor security practices, up to and including multimillion-dollar fines.
These threats should make breach prevention a high priority, but there are many reasons it often goes unimplemented in production. Setting aside inexperience and lack of awareness, the reality is that many API breaches look just like normal API requests – in some cases, normal requests can themselves be escalated into new attacks that providers can’t even begin to anticipate. This is especially true in the case of multiple interconnected APIs, as attacks on these systems can be extremely complex.
The Limitations of Traditional API Security Approaches
In order to try and reduce security issues in complex API and microservice environments, many providers have adopted API Gateways and Web Application Firewalls, or WAFs, as a critical part of their security path. Unfortunately, there are some major limitations to this particular approach.
Firstly, it should be acknowledged that WAFs do provide some level of protection: traffic can be filtered and known attacks can be blocked. This, of course, requires the WAF provider to know of the attack in the first place, and when such a solution uses something like a signature-based model, it ultimately means preventing attacks reactively rather than proactively.
Secondly, the use of API Gateways may not deliver all the desired security benefits. Gateways allow providers to unify multiple request paths into a single codebase – a gateway, as the name suggests, controls the flow of data into and out of the system.
Unfortunately, as we’ve discussed before, this is principally a management feature, not a security one. Gateways are an inappropriate security solution for a wide range of reasons, but the biggest is the simple fact that they are not designed for that function – they are management systems first, and though they may offer security features in some applications, they are not designed to be more than that.
Fundamentally, the issue with the WAF + Gateway equation is that it is heavily steeped in behavioral analysis – the idea that a behavior can be monitored and catalogued so it can be tracked in the future as a potential threat. While this works to mitigate long-tail threats and common vectors, it does nothing for novel attacks. Such an approach also tends to miss attacks that mimic natural or allowed functions, as in the case noted above of scraping a service with an exposed pagination pattern.
All of this becomes even more difficult when investigating a breach after the fact. WAFs and Gateways simply don’t provide one major element of a security posture – context. Both can only really respond to new attacks if those attacks have already been defined, or if there is some sort of baseline upon which to trigger a warning; as such, the context they provide for investigation is very limited.
Imagine for a moment that you discover your service has had a data breach. Your user profile numbers, emails, and organization names have appeared in the wild. How did they get there? You head towards your Gateway and your WAF to do some digging. In the process, all you find are a slew of requests that look like normal traffic from expected IPs. When you dig into the data that has been exposed, you realize that the data has been exfiltrated by abusing a pagination flaw where customers could be enumerated by their account number via an endpoint that is used to connect related accounts.
In this case, the API was used as designed, the source was correct, and there was no signature that would have been tripped. Without grabbing the external data, your investigation turned up nothing whatsoever. Without additional context, you were left entirely in the dark, and there was very little way you could have seen this problem ahead of time in your current setup.
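To make this concrete, here is a minimal sketch of the kind of context-aware check that could have surfaced the pagination abuse described above. It assumes access logs have already been parsed into (client, account ID) pairs – the names, thresholds, and log shape are illustrative, not a specific product’s behavior:

```python
from collections import defaultdict

def flag_enumeration(requests, threshold=0.9, min_requests=20):
    """Flag clients whose account-ID access pattern looks like sequential
    enumeration rather than organic use.

    `requests` is a list of (client_id, account_id) tuples parsed from
    access logs. A client is flagged once it has requested at least
    `min_requests` accounts and the fraction of consecutive-ID steps
    (a difference of exactly 1) exceeds `threshold`.
    """
    seen = defaultdict(list)
    for client, account in requests:
        seen[client].append(account)

    flagged = set()
    for client, ids in seen.items():
        if len(ids) < min_requests:
            continue
        steps = [b - a for a, b in zip(ids, ids[1:])]
        sequential = sum(1 for s in steps if s == 1)
        if sequential / len(steps) >= threshold:
            flagged.add(client)
    return flagged
```

Every individual request here is well-formed and authorized – only the pattern across requests betrays the scrape, which is exactly the signal a signature-based WAF never sees.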
The Role of Application Layer Visibility
Application layer visibility is critical to maintaining a proper security posture. It refers to the thorough monitoring and analysis of interactions with an API above and beyond simple network activity, providing a comprehensive understanding of how data is being interacted with on a call-by-call basis. If network visibility is seeing the numbers that were dialed and the length of the call, application visibility is tapping into the phone line and listening to the conversation.
This allows for precise examination of data content and behavior, aiding in the proactive detection of threats, effective policy enforcement, anomaly identification, and post-event investigations. Notably, this approach goes beyond the traditional network monitoring that is common in offerings such as WAFs, offering deeper insights into traffic and more actionable context.
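As a rough sketch of the difference, consider a wrapper that records application-layer context around each call – which fields were requested and returned, not just that a request happened. The handler, sink, and field names here are all illustrative:

```python
import time

def record_call(log, method, path, request_body, handler):
    """Invoke the real endpoint `handler` while capturing
    application-layer context for the call. `log` is any list-like
    sink (in practice, a structured logging pipeline)."""
    started = time.time()
    response = handler(request_body)
    log.append({
        "method": method,
        "path": path,
        "duration_ms": round((time.time() - started) * 1000, 2),
        # Recording field *names* gives investigators context about what
        # data was touched without storing the raw values themselves.
        "request_fields": sorted(request_body.keys()),
        "response_fields": sorted(response.keys()),
    })
    return response
```

Network-level monitoring would only record that a call to `path` happened; the record above additionally shows that, say, `email` and `org` were returned – precisely the detail an investigator needs after a leak.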
Inline Monitoring and API Data Breaches
Inline monitoring is a proactive approach wherein providers actively inspect requests as they flow through the infrastructure and the API. Unlike passive monitoring methods, inline monitoring allows for real-time analysis and intervention based on predefined security policies. Providers can utilize signatures or other heuristic-based detection methods, but can also utilize the increased contextual framing provided by inline monitoring to detect actions that may be nested in normal traffic patterns.
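In its simplest form, inline enforcement is just a set of policy checks evaluated before a request reaches the backend. The sketch below is a bare-bones illustration – the policy names and thresholds are invented for the example, and real products expose far richer rule engines:

```python
def inline_monitor(request, policies):
    """Evaluate a request against predefined policies before it is
    forwarded. Returns (allowed, violated_policy_names)."""
    violations = [name for name, check in policies.items()
                  if not check(request)]
    return (not violations, violations)

# Example policies – illustrative thresholds, not product defaults.
POLICIES = {
    "sane_page_size": lambda r: r.get("page_size", 1) <= 100,
    "known_client": lambda r: r.get("client_id") is not None,
}
```

Because the check runs in the request path, a violating call can be blocked or flagged in real time rather than discovered in a log review weeks later.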
When an API breach occurs, context is as good as gold. Inline monitoring provides this context, offering essential insights into the vector, exploited vulnerabilities, and potentially compromised data involved. This contextual information is indispensable for incident response teams, enabling them to conduct thorough investigations, remediate vulnerabilities, and implement preventive measures against similar attacks in the future.
Proper inline monitoring can also drastically reduce response times. With context and potential targets for remediation in hand, DFIR teams can not only respond to threats more quickly, they can also plug small gaps before they escalate into much larger issues.
DFIR & API Breaches
Logging API interactions is essential to establishing a comprehensive security posture. These logs provide a complete record of the interactions across the API and can grant invaluable context to Digital Forensics and Incident Response (DFIR) teams. Logging API activity and related details enriches forensic investigations by providing a detailed trail of events that can help teams detect potential vulnerabilities.
Stopping the Breach
The first thing any DFIR team must do is stop the breach in its tracks. Logs play a crucial role here, as context can provide ample evidence as to where the breach has occurred, the scope of the breach, and whether immediate response and remediation is having the intended effect. Without ample logging systems, all of this must be done “in the dark” – conversely, every log with even a small piece of information and context helps this process immensely.
Forensic experts can analyze these logs to reconstruct the sequence of events, identify the source of the breach, and determine the extent of data exposure, as well as identify potential ongoing attacks that might otherwise not be noticed in the furor of the initial threat. This level of visibility is essential for understanding the attack's scope and impact, enabling organizations to take appropriate remedial measures and strengthen their security posture.
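Given well-structured logs, that reconstruction can be largely mechanical. The sketch below assumes each log event carries a source, timestamp, action, and resource – a hypothetical schema for illustration:

```python
def breach_summary(events, source):
    """Reconstruct what one source did, in order, and which resources
    it touched – the starting point for scoping a breach."""
    trail = sorted((e for e in events if e["source"] == source),
                   key=lambda e: e["ts"])
    return {
        "first_seen": trail[0]["ts"],
        "last_seen": trail[-1]["ts"],
        "actions": [e["action"] for e in trail],
        "resources_touched": sorted({e["resource"] for e in trail}),
    }
```

The `resources_touched` set is what feeds the extent-of-exposure question, while the ordered trail is what lets investigators replay the sequence of events.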
Establish Regular Operations
Once a breach is handled, organizations need to be able to point towards a litmus test to show that the situation has been resolved. DFIR efforts play a big role here – by knowing what the breach has surfaced or exposed, DFIR teams can ensure that the problem is limited in scope and actually resolved. More to the point, they can remediate issues that surfaced along the way to ensure the problem doesn’t escalate or linger just under the surface.
This, of course, requires that the logs are actually doing what they were meant to do. This is where auditing comes into play – before an event ever happens, logs should be reviewed and audited to ensure that they can:
- Establish a baseline of what is regular functionality;
- Log any system which may deviate from regular functionality;
- Provide insights as to what a return to regular functionality might look like.
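The first and third points above amount to a statistical baseline and a deviation check. One simple form – hourly request counts per endpoint, flagged when they drift several standard deviations above normal – could be sketched like this (the three-sigma threshold and log shape are assumptions for the example):

```python
from statistics import mean, pstdev

def baseline(counts):
    """Per-endpoint (mean, stddev) of historical hourly request counts.
    `counts` maps endpoint -> list of hourly totals."""
    return {ep: (mean(vals), pstdev(vals)) for ep, vals in counts.items()}

def deviations(baseline_stats, current, sigmas=3.0):
    """Endpoints whose current hourly count sits more than `sigmas`
    standard deviations above their baseline mean."""
    out = []
    for ep, count in current.items():
        mu, sd = baseline_stats.get(ep, (0.0, 0.0))
        if count > mu + sigmas * max(sd, 1.0):  # floor sd to avoid
            out.append(ep)                      # zero-variance traps
    return sorted(out)
```

A return to regular functionality, in these terms, is simply `deviations()` coming back empty over a sustained window.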
Understand Scale and Scope
This is the most public effort undertaken by DFIR teams, and it focuses specifically on the scale and scope of the breach in question. What data was stolen? What functions were abused, if any? Were any business logic flaws or public systems used to trigger the attack?
When evaluating breaches, it often helps to contextualize the scope and scale on an axis of “deep” to “shallow”, and “narrow” to “broad”. It is possible for an attack to be “deep and narrow” – a single endpoint was attacked, with all functions and data sources exposed. It’s also possible for an attack to be “shallow and narrow” – a single endpoint was attacked, but only the function itself was broken, with the majority of the underlying data completely isolated from the breach.
Scope and scale have major implications for legal and compliance concerns. Many industries are subject to strict regulatory requirements governing data handling and security – by maintaining comprehensive logs, organizations can demonstrate compliance with these regulations both in ongoing operations and in their security solutions, which can be crucial to avoiding fines and legal issues.
Organizations must balance legal and privacy concerns when logging sensitive data, such as personally identifiable information (PII) or other confidential information. Compliance with data protection laws, such as GDPR or HIPAA, requires careful handling and storage of such data, with proper anonymization or encryption measures in place to safeguard privacy. If this is not done correctly, the penalty for such a mishandling of data can, in some cases, be even worse than the punishment for a breach, and could incur additional penalties if particularly egregious. Providing a solid understanding of scale and scope is not just critical for your security posture, it may be critical for existential purposes.
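One common way to square this circle is pseudonymization: replace PII values with a keyed hash before they ever reach the logs, so investigators can still correlate events for one user without the logs containing the raw data. This is a minimal sketch – the key handling and field list are illustrative, and in practice the key belongs in a secrets manager with rotation:

```python
import hashlib
import hmac

SECRET = b"rotate-me"  # illustrative only; never hard-code a real key

def pseudonymize(value, key=SECRET):
    """Map a PII value to a stable token: identical inputs yield
    identical tokens (preserving correlation), but the raw value
    never lands in the log."""
    return hmac.new(key, value.encode(), hashlib.sha256).hexdigest()[:16]

def scrub(record, pii_fields=("email", "name", "account_number")):
    """Return a copy of a log record with PII fields pseudonymized."""
    return {k: (pseudonymize(str(v)) if k in pii_fields else v)
            for k, v in record.items()}
```

Using a keyed HMAC rather than a plain hash matters here: without the key, an attacker who obtains the logs cannot simply hash a list of known emails and match them against the tokens.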
Efficiency and Cost
There’s one major overarching theme throughout all of these categories of effort – efficiency, and the sheer cost introduced by inefficient approaches.
DFIR is labor-intensive – everything you do has to be checked, rechecked, verified, then checked again. Any amount of logging, automated reporting, and the like that can speed up this process or make it more efficient is worth its weight in gold. Regulations often require reporting a breach within days, and even if they didn’t demand such stringent tracking, allowing a breach to persist for any significant amount of time is an existential crisis for any service.
DFIR also requires high accuracy, and ample logging and documentation is the name of the game here. Ensure you are logging everything and enriching with context wherever possible. It’s not enough to know that something occurred – you need ample context and data informing the record.
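“Enriching with context” in practice means attaching the who/when/where/what to every event at write time, when the information is cheap to capture, rather than reconstructing it later under deadline pressure. A minimal sketch, with assumed field names:

```python
import datetime

def enrich(event, request):
    """Wrap a bare event name with the context a DFIR team will need
    later. The request fields here are illustrative – real pipelines
    would also capture auth scopes, trace IDs, geo data, and so on."""
    return {
        "ts": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "event": event,
        "client_id": request.get("client_id"),
        "source_ip": request.get("ip"),
        "endpoint": request.get("endpoint"),
        "auth_method": request.get("auth_method"),
    }
```

A bare `"accounts.read"` line tells an investigator almost nothing; the enriched record above answers who did it, from where, against which endpoint, and under what credentials.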
Best Practices for API Forensics
Recommendations for Implementing Application Layer Visibility and Logging
Implementing application layer visibility and robust logging practices is essential for effective cybersecurity. Below are some best practices for ensuring this is done securely and effectively:
- Select the Right Tools – make sure you are choosing suitable solutions for your security posture. Simply plugging in a WAF and configuring a Gateway is not enough, but neither is choosing a security solution and signing up for an account. Do your due diligence to ensure that these tools capture the correct data, including traffic, request/response details, payloads, etc.
- Define policies – establish clear policies as to what is logged and monitored. Ensure that you are compliant with legal and regulatory frameworks including GDPR, CCPA, etc. Make sure that you are retaining traffic in a way that is appropriate and legally compliant.
- Review and update – regularly audit your systems and ensure that they match the current needs of both the API environment and your instance. Capture relevant information, and update what is relevant as new threats related to third party vendors or technologies become commonplace.
- Prioritize security assets – integrate a Digital Forensics and Incident Response (DFIR) team into your structure, and ensure that you are adequately resourcing security across the board. Saving some pennies here and there with subpar funding or solutions might make short-term fiscal sense, but when some of the fines for poor regulatory compliance mount into the millions and billions, the long-term investment case becomes much clearer.
- Identify critical systems – determine which APIs are critical to your organization's operations and prioritize their protection. Establish monitoring and alerting mechanisms specifically for these APIs, and develop incident playbooks to ensure that API incidents are detected, contained, and resolved.
- Practice, Practice, Practice – Regularly conduct tabletop exercises to test your response plans and ensure that you are ready for potential attacks.
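The critical-systems point above can be as simple as a routing rule that escalates incidents touching high-value endpoints to a dedicated playbook. A sketch, with invented endpoint names and severity labels:

```python
# Illustrative set of high-value endpoints; in practice this comes from
# an asset inventory, not a hard-coded list.
CRITICAL_APIS = {"/v1/payments", "/v1/accounts"}

def route_alert(event):
    """Route an incident by whether it touches a critical API, so the
    high-value playbook fires first."""
    if event["endpoint"] in CRITICAL_APIS:
        return {"severity": "page", "playbook": "critical-api-incident"}
    return {"severity": "ticket", "playbook": "standard-triage"}
```

The value of encoding this explicitly is that it becomes testable – exactly the kind of thing a tabletop exercise should exercise.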
APIs are only getting more complicated and powerful, and the threats are only growing more damaging. Addressing the growth of APIs and the threats targeting them requires a shift towards much more powerful and complex contextually informed security strategies. As we move forward, the ability to contextualize API traffic will be paramount in safeguarding the digital ecosystems of the future.
By integrating application layer visibility and deep introspection alongside inline monitoring and comprehensive logging, providers can enable a digital forensics and incident response flow that protects against API breaches and supports a more effective response when they do occur.