FireTail at API World 2022

FireTail had the wonderful opportunity to speak at API World 2022. Our co-founder and CEO Jeremy gave a talk on API security as an application problem, and explained how solutions like FireTail can help.

API security is a key concern right now, as API use has come to account for over 83% of all internet traffic, and API attacks are rising rapidly as a result. Jeremy goes over case studies of breaches and what we can learn from them.

A strong API security posture starts with visibility and includes several other elements like authentication and authorization for a holistic approach to secure APIs.

Key topics covered include:

  • What kinds of breaches have happened to date: case studies of API breaches from a variety of sources show us the range and complexity of recent API attacks.
  • What the causes of those breaches have been: learn from the mistakes of others in the industry so you can secure your APIs against a wide range of threats.
  • The API attack vectors: find out the most common attack vectors targeted by malicious actors in order to gain insight into vulnerabilities and take action now.
  • Data visibility: visibility is the single most important aspect of API security. But how can we gain the visibility we need, and what level of data visibility is really necessary to protect APIs?
  • How to think about API security as a program: there are many different aspects of API security that work in tandem (visibility, centralized audit trail, code library, authentication, authorization, detection and response, alerting and monitoring, and more). API security is not a monolith and needs to be looked at from a variety of angles.

Webinar Transcript

Jeremy

Please note: if you're watching this presentation after API World 2022, which happened in October, this is a re-recording of the presentation given there.

Alright, let's kick things off. Hey, my name is Jeremy Snyder. I'm very happy to be here today. I'm the founder and CEO of a company called FireTail. We work in the API security space. And here today, what I want to share with you are my thoughts on why API security is an application security problem and not a network security problem.

So just before we kind of dive into the content today, a few words about myself. I started my career as an IT and cybersecurity practitioner. I worked my way up through IT organizations at a couple of SaaS companies, and then a video game company for four years before going to join a little startup called AWS. 

That was in the early days of cloud computing, when a lot of customers were still kind of skeptical about what this, you know, cloud thing was all about. Is this really just some spare server sitting somewhere, and so on?

In any event, I've spent the remainder of my career working in customer-facing roles within the cloud and cloud security space, most notably the last 5 to 6 years with a company first called DivvyCloud and then acquired by Rapid7. There I also spent some time working on M&A and got to be a part of a team doing three acquisitions, which was a fascinating process, but ultimately led me to start FireTail towards the end of 2021 and really start to engage deeper with customers in 2022. And that's a big part of what I'm here to talk to you about today: some of the lessons learned and the things we observed along that journey.

So let's start with some context. Why are APIs so important all of a sudden? Well, the truth is, it's not all of a sudden. And for those who were in attendance at the physical conference of API World, or even the virtual one, and took the time to dedicate to an event centered around APIs, this is not going to be a surprise. APIs are critical in the modern web. Every mobile app, every IoT device is actually just a client speaking to a backend over an API, where the transactions are processed through that API interface.

And we know that more than 80% of web traffic nowadays is API calls. And if you think about that for a second, that's staggering. That means that, you know, four out of every five kind of “packets” that traverse the routing infrastructure of the internet are not sent on behalf of humans. They're sent on behalf of systems calling other systems or software calling other software. And when we think about the architecture of an API in the application stack, you realize that the API is the thing that is sitting exposed on the network edge by definition and kind of by requirement. And so it represents 90% of the attack surface for these web applications.

Gartner predicts that APIs will be the number one attack surface for most enterprises as early as this year, 2022. I tend to think that is probably wrong. I tend to think this year will still be, you know, primarily focused on business email compromise and phishing. But it is clear that the volume of attacks against APIs is up dramatically over the last 18 to 24 months. 

If we take a step back and start to think about, you know, how did we get to where we are today in the 2020s? There's been an evolution in the model of systems talking to systems and how they exchange data with each other. Back in the 90s, you know, we had EDI, which was this horribly bloated format, but was effective for what it did. It was just very, very expensive to run and process. This was really kind of the web 1.0 phase, right, when all of these things were being done for the very first time. And primarily organizations were running infrastructure in data centers, in facilities, or, frankly, on the server underneath somebody's desk.

We moved into the 2000s, and things started to get a little bit more evolved. And ironically, and I say ironically for a reason we'll see in a second, it was Microsoft that was really pioneering this model of web services and systems talking to systems over SOAP and XML, with these defined contracts around document type definitions and so on. And this often became known as the client-server model, or a flavor of the client-server model of computing. And we started to get a little bit more efficient with how we ran server infrastructure, moving to things like virtual machines.

Fast forward to today, and we've kind of learned a few lessons. And here's where the irony kicks in. You know, it was Microsoft pioneering that shift into the 2000s. But nowadays if you talk to most organizations about how they're running and building their APIs, you'll find that a lot of it is centered on open source technologies, not necessarily on .NET. We've moved away from the SOAP and XML model that had very tightly constrained data contracts and data structures, and we've moved into more REST and GraphQL, you know, where we use JSON as the format to exchange documents or exchange queries and result sets.

We've moved from a narrow client-server model, you know, one server serving a set of clients, into a very distributed, API-centric model where complex programs are modular, there are components dedicated to each function, and each component is represented by a set of APIs at its edge. And we've also started to see a shift away from virtual machines into more serverless functions and containers. So we're getting, you know, increasingly efficient with our infrastructure, but we're also getting increasingly simple and relying on third parties to provide a lot of the underpinning plumbing, if you will, with that cloud infrastructure layer. That's going to come in later in the conversation when we dig deep into contextualization of APIs.

But let's examine what's happened, or what we know has happened, relative to data breaches against APIs. And I put an asterisk there because, of course, we can only analyze what we know about. And so primarily our analysis at FireTail has been focused on publicly disclosed API breaches. We actually keep a tracker on our website. Feel free to check that out. You can find the link in the footer of our website. And what we've done is root cause analysis, again based on what has been publicly disclosed about these breaches. And what we've discovered is a few common things and common API attack vectors.

Number one is broken authorization logic. There's often this kind of fallacy that developers will embrace, which is that authentication is the same as authorization, meaning that if I can establish that your identity is Jeremy, I will allow you to issue any kind of follow-up queries. You can request any function, you can request any data set, etc. And so this kind of faulty application logic around authorization is the single largest category of API breaches to date. And there are a couple of different items in the OWASP API Security Top 10 that are really focused on broken logic: one around data access, one around functional access. You'll see those as BOLA and BFLA, broken object level authorization and broken function level authorization.
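To make the authentication-versus-authorization distinction concrete, here is a minimal Python sketch. Everything in it (the `User` and `Invoice` types, the `admin` role, the in-memory `INVOICES` store, and both functions) is hypothetical and invented for illustration; it is not taken from the talk or from any FireTail product. It assumes the caller has already been authenticated and shows the two separate checks that still have to happen afterwards.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    user_id: str
    roles: frozenset


@dataclass(frozen=True)
class Invoice:
    invoice_id: str
    owner_id: str


class Forbidden(Exception):
    """Raised when an authenticated caller lacks permission."""


# Hypothetical in-memory data store, for illustration only.
INVOICES = {
    "inv-1": Invoice("inv-1", owner_id="alice"),
    "inv-2": Invoice("inv-2", owner_id="bob"),
}


def get_invoice(caller: User, invoice_id: str) -> Invoice:
    """Object-level check (guards against BOLA): caller must own the record."""
    invoice = INVOICES[invoice_id]
    if invoice.owner_id != caller.user_id and "admin" not in caller.roles:
        raise Forbidden(f"{caller.user_id} may not read {invoice_id}")
    return invoice


def delete_invoice(caller: User, invoice_id: str) -> None:
    """Function-level check (guards against BFLA): only admins may call this."""
    if "admin" not in caller.roles:
        raise Forbidden(f"{caller.user_id} may not delete invoices")
    del INVOICES[invoice_id]
```

Knowing that the caller is `alice` is enough to let her read `inv-1`, but it should not be enough to let her read `inv-2` or call `delete_invoice` at all.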

The second category is actually the first step in that identity chain, which is the authentication. And this can be a couple of different things. This can be authentication checks that aren't actually tightly enforcing whether a token is valid. Or it can be things like having sequential tokens, so it's very easy to forge an identity. Or it can be things like internal APIs that accidentally get exposed and don't really require authentication. And so we've seen 92 million records breached as a result of flawed authentication controls. And then we've seen 10 million records breached as a result of security misconfiguration. This is a very broad category. And arguably this is the category that might actually break away from the application layer slightly, in the sense that this can also be a result of something misconfigured at the cloud infrastructure layer. But one of the common application constructs around misconfiguration is: I release a v5 of my API, and I don't deprecate v4 and v3, which I may know to be susceptible to certain types of application logic attacks or certain other flaws.
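The sequential-token problem mentioned above can be illustrated with a short Python sketch. It assumes a simple server-side session store; the `_SESSIONS` dictionary, the one-hour lifetime, and the function names are assumptions made for this example. The only library call doing real work is `secrets.token_urlsafe` from the Python standard library, which returns an unpredictable token rather than a guessable counter.

```python
import secrets
import time
from typing import Optional

# Hypothetical in-memory session store: token -> (user_id, expiry timestamp).
_SESSIONS = {}
TOKEN_TTL_SECONDS = 3600  # assumed lifetime, chosen for illustration


def issue_token(user_id: str, now: Optional[float] = None) -> str:
    """Issue an unpredictable token instead of a sequential identifier."""
    now = time.time() if now is None else now
    token = secrets.token_urlsafe(32)  # 32 random bytes, URL-safe encoded
    _SESSIONS[token] = (user_id, now + TOKEN_TTL_SECONDS)
    return token


def authenticate(token: str, now: Optional[float] = None) -> Optional[str]:
    """Return the user id for a known, unexpired token, otherwise None."""
    now = time.time() if now is None else now
    entry = _SESSIONS.get(token)
    if entry is None:
        return None  # unknown or forged token
    user_id, expires_at = entry
    if now >= expires_at:
        _SESSIONS.pop(token, None)  # drop expired tokens
        return None
    return user_id
```

With sequential tokens, an attacker holding token `1041` can simply try `1042`. Here a guessed value is overwhelmingly unlikely to match a stored key, and every request is checked for both existence and expiry.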

And then last but definitely not least, and I think a category that's probably underreported and increasing, is injection. And injection really refers to what you can think of as the evolved state of SQL injection attacks. It's basically, can I construct a query that will either make the API misbehave or make it return too many records or things like that? This is most common against GraphQL APIs, so this is something to watch out for if you're thinking about how you're shifting your own API architecture inside your organization.
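As a small illustration of the injection pattern, the following Python sketch uses the standard library `sqlite3` module. It uses classic SQL rather than GraphQL as a stand-in, because the underlying mistake is the same: caller-supplied input is allowed to change the shape of the query. The table, the data, and the function names are made up for this example.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Build a throwaway in-memory database with three example rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO users (name) VALUES (?)",
        [("alice",), ("bob",), ("carol",)],
    )
    return conn


def find_users_unsafe(conn: sqlite3.Connection, name: str) -> list:
    # Vulnerable: caller-supplied text becomes part of the SQL statement.
    query = f"SELECT name FROM users WHERE name = '{name}'"
    return [row[0] for row in conn.execute(query)]


def find_users_safe(conn: sqlite3.Connection, name: str, limit: int = 10) -> list:
    # Parameter binding keeps the input as data; LIMIT caps the result size.
    query = "SELECT name FROM users WHERE name = ? LIMIT ?"
    return [row[0] for row in conn.execute(query, (name, limit))]
```

Passing the payload `' OR '1'='1` to `find_users_unsafe` turns the filter into a condition that is always true, so every row comes back. The same payload passed to `find_users_safe` is compared as a literal name and matches nothing.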

Two other items to note here. While these are not direct causes of breaches, one of the challenges that all cybersecurity initiatives have is that if you don't know about it, you can't protect it. And so, you know, 6.7 million of these records were acknowledged as involving shadow APIs by the organizations that were breached. Effectively, a shadow API means an API created by a developer that information security is not tracking. I tend again to think that this may be an underreported number, but regardless, this is something to watch out for. You know, does your infosec org know about the APIs that are being created? Do they have visibility into that? And along with that visibility, one of the other requirements for really any successful security initiative is the ability to do investigative work, digital forensics, checking out logs, etc. And so again, probably underreported, but we know that 4.8 million records were not even flagged or logged in terms of, you know, what transactions were actually occurring at the API layer. So that's what we know. Now let's think about what that means at a high level.
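One rough way to surface shadow APIs is to compare the endpoints your security team has documented against the endpoints actually observed in traffic or logs. The Python sketch below assumes both lists are already available as `(method, path)` pairs; how you collect them, and the example paths in the usage note, are assumptions for illustration.

```python
def normalize(endpoint: tuple) -> tuple:
    """Canonicalize a (method, path) pair so trivial differences don't matter."""
    method, path = endpoint
    return (method.upper(), path.rstrip("/") or "/")


def find_shadow_endpoints(documented: list, observed: list) -> list:
    """Return observed endpoints that are missing from the documented inventory."""
    known = {normalize(e) for e in documented}
    seen = {normalize(e) for e in observed}
    return sorted(seen - known)
```

For example, if the inventory lists only `GET /v5/users` and `POST /v5/users`, but the logs also show `GET /v3/users` and `DELETE /internal/cache`, those last two are returned as candidates for review. The old `/v3` route is exactly the kind of undeprecated version discussed earlier.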

We talked to CISOs about their concerns regarding APIs. The first is lack of inventory, so that visibility question. Then there's enforcing perimeter security. And interestingly, when you dig into that, it's actually less about the firewall security perimeter. It's not really those network perimeter controls. It's really around: do we have gateways? Do we have logical controls around our APIs? Then there's end-to-end tracing of code. You know, from the development cycle to the API live in production, being able to go back to the application developer, to go back to the application code and figure out what went wrong is a super critical requirement. Then the security configurations we talked about just a minute ago. But, you know, just bear in mind it's an unknown here: how many required security configurations are there for an API? We would argue from our side at FireTail that, you know, those four attack vectors that we just went through are really some of the top ones to think about. And then, you know, anything you do on top of that can be very beneficial. And then API change management, like all IT change management.

I think what's interesting here is that, you know, this shows up on this list because all of a sudden, API-centric development is becoming very important for organizations. And, you know, we went through that exact same cycle with cloud infrastructure about 5 or 6 years ago. You know, with the CSPM company I was with at the time, we heard this consistently as one of the major concerns: we don't even know about the changes happening to our cloud environment. And then last but not least, and this is again very reminiscent of the early days of cloud, it's this gap between developers and security teams. You know, the typical pattern is, hey, let's empower the developers. Let's give them what they need in order to, you know, go innovate, go develop at the pace our users require. Maybe there are competitive pressures. Who knows? But then developers tend to get ahead of security teams, and security teams may not have the visibility. They may not have the understanding of what the security implications really are, and so this gap gets created very naturally.

So let's talk for a second about API security and some of the ways that we need to address it. So the first thing that, you know, we always think about is visibility. So we know what the attack vectors are. We know that they're primarily around, you know, where the traffic is going and who is issuing the traffic. So we've got those questions of authentication and authorization, and of the arguments: you know, are we supposed to allow this query to happen based on the user's identity and the permissions inherited from that identity? And then we know that we need visibility into the request parameters, the request payload and the response payload. And we need the ability to log them. And so when we think about this and we think about the common network structures that exist, and this is admittedly done specifically with AWS in mind, we find that it's really only at the application layer that you can get visibility into all the things that you need to see in order to have a strong assertion about whether this API call was good or bad. And so that is one of the main drivers behind our view here at FireTail, that API security is an application-level construct.
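The visibility requirements listed above (who made the call, the request parameters, the request payload, and the response payload) can be sketched as a single structured log record built at the application layer. The field names, the `SENSITIVE_KEYS` list, and the `build_audit_record` function below are assumptions chosen for this Python example, not a FireTail log format.

```python
import json
from datetime import datetime, timezone

# Assumed list of field names whose values should never reach the logs.
SENSITIVE_KEYS = {"password", "token", "authorization", "ssn"}


def redact(value):
    """Recursively mask sensitive fields in dicts and lists."""
    if isinstance(value, dict):
        return {
            key: "[REDACTED]" if key.lower() in SENSITIVE_KEYS else redact(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item) for item in value]
    return value


def build_audit_record(caller_id, method, path, params, request_body,
                       status, response_body, timestamp=None):
    """Serialize one API call as a JSON line for a central log store."""
    when = timestamp or datetime.now(timezone.utc)
    record = {
        "timestamp": when.isoformat(),
        "caller_id": caller_id,
        "method": method.upper(),
        "path": path,
        "params": redact(params),
        "request_body": redact(request_body),
        "status": status,
        "response_body": redact(response_body),
    }
    return json.dumps(record, sort_keys=True)
```

A network device sitting in front of the application typically sees encrypted traffic and has no notion of the resolved caller identity, which is why this record is assembled inside the application, after authentication and after the response has been produced.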

Additionally, we know that the logic for processing things like authorization checks, query payload and parameter inspection is stuff that happens at the application layer. So how do we think about implementing an effective API security program? Well, here's one of the ways I like to think about security in these modern environments that are very software defined and very fast moving. Because it's software defined, what you'll tend to see from organizations is that they also define the way they interact with the infrastructure through software. So you hear about infrastructure as code, you hear about automated deployment pipelines, etc.

So I like to think about the timeline of changes to these environments and think about a pre-production environment and a production environment. So what do we think about doing in that pre-production environment for our APIs? Well, we're in the coding phase at this point. So of course we want to make sure that our coding practices are secure. Can we do software composition analysis? Can we do secure code analysis? Can we eliminate vulnerabilities both from our code and from third-party components that we're utilizing? Maybe those are open source libraries. Maybe those are third-party containers that we're pulling down to support our application. So those are a couple of the key steps there. When we get ready to launch this thing, is there a set of tests that we can do to make sure that we're catching things before they go live?

So you'll often hear about fuzzing, which is, you know, a set of tests against an API. But I think it's also really important to run a set of logical tests specifically around identity. So again, those authentication and authorization questions: can we check authenticated users and the permissions that the API is allowing them to execute? And then when we get into runtime, when we get into production, do we know that we've got coverage in place against those top four attack vectors? And do we know that we've got detection and response in place with some centralized logging? And that's one of those crucial things to allow the information security team to have good coverage from their perspective.
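The logical identity tests described above can be expressed as a table of expectations: for each role and action, should the API allow it or not? The Python sketch below is one way to do that. The roles, the actions, the `POLICY` mapping, and the `is_allowed` function are all hypothetical stand-ins for whatever authorization check your own API performs.

```python
# Hypothetical policy under test: role -> set of permitted actions.
POLICY = {
    "anonymous": set(),
    "customer": {"read_own_order"},
    "support": {"read_own_order", "read_any_order"},
    "admin": {"read_own_order", "read_any_order", "delete_order"},
}


def is_allowed(role: str, action: str) -> bool:
    """Stand-in for the real authorization check in the API."""
    return action in POLICY.get(role, set())


# Expected outcomes, including the denials that must hold before go-live.
EXPECTED = [
    ("anonymous", "read_own_order", False),
    ("customer", "read_own_order", True),
    ("customer", "read_any_order", False),
    ("customer", "delete_order", False),
    ("support", "read_any_order", True),
    ("support", "delete_order", False),
    ("admin", "delete_order", True),
]


def run_identity_checks(check=is_allowed, cases=EXPECTED) -> list:
    """Return every case where the check disagrees with the expectation."""
    failures = []
    for role, action, expected in cases:
        actual = check(role, action)
        if actual != expected:
            failures.append((role, action, expected, actual))
    return failures
```

An empty result means the checker matched every expectation. The denial rows matter most: a checker that treats "authenticated" as "authorized" passes all of the allow cases and only fails on the rows that expect `False`.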

And then last but not least, I want to think about, you know, contextual awareness. One of the most interesting evolutions in the broader cloud security picture over the last few years is the development of complex, think of them as multi-vector, attacks against organizations. So, you know, you might enter an organization through one breach vector and then, you know, execute some lateral movement, some discovery of infrastructure, find some credentials and leverage those, and so on.

And so for a really strong, effective program and, you know, some bonus points on the security side, are there things that you can do where you can correlate your APIs to the applications running and then correlate, you know, those APIs to the infrastructure that supports those applications? 

And then finally, can you integrate all of this with your security operations tools? That might be ticketing, that might be the standard alerting mechanism that you use, and so on. And so those are some of the main thoughts that we have around this.

So just to bring it all together, we know that the main causes of data breaches on APIs are application-layer logical issues, primarily around those attack vectors that we discussed. And so if we think about how we defend against them, we really again need to function at that application layer. We need to design and verify application logic controls that correspond to those attack vectors. And we should do what we can do from the network defense side. API gateways can be very helpful. They can also provide things like rate limiting, TLS termination, and so on. Some very, very helpful constructs come out of those. A WAF may also have some limited value depending on your organization. And then detection and response is also better than nothing. But you need the right data in the logs. And of course you need the logs in the same location.

So hopefully that's been helpful to stimulate some thinking around implementing API security within your organizations. If you have any questions, please feel free to reach out to me. I'm just Jeremy@firetail.io. If you're still interested in leveraging the early access offer that we presented at API World 2022, that URL is still valid. And thank you so much for watching.

To learn more about API breaches, security, and how to secure your APIs, schedule a free 30-minute demo with FireTail.