April 17, 2026

Incident Response in the Neocloud - Lambda Cloud (Part 2)

tl;dr

We're closing out the Neocloud IR series with Lambda Cloud. If Nebius felt like AWS-lite, Lambda feels more like a GPU-first SaaS: flatter, leaner, and with fewer knobs to turn. That's great for researchers spinning up H100s. It's less great when you're the one responding to an incident. In this post we cover what you actually need to know about Lambda's structure, identity, and logging so you're not reading the docs for the first time during a breach. Keep reading if you want to know why Lambda Cloud has the most interesting audit logs...

Series Roadmap

Part 1: Nebius ✅
Part 2̶: CoreWeave ❌
Part 2: Lambda Labs ← you are here

Want to elevate your cloud incident response skills? Check out CloudLabs, a safe environment to upskill and practice with real cloud incident response scenarios.

Why not CoreWeave

Our new friends at CoreWeave decided that we d̶o̶n̶'̶t̶ ̶h̶a̶v̶e̶ ̶e̶n̶o̶u̶g̶h̶ ̶m̶o̶n̶e̶y̶, aren't running GPU workloads and therefore cannot use the CoreWeave platform.

Why Lambda

Lambda Cloud (lambda.ai) is one of the most widely used GPU-as-a-service providers for ML teams. Their Public Cloud offering gives you on-demand and reserved GPU instances, managed filesystems, and a simple firewall layer. The appeal is speed: you can go from signup to a running H100 in minutes. That same simplicity is what makes Lambda a challenging platform for IR: there are (very) few native security features, which means your IR playbooks need to rely more heavily on Kubernetes forensics.

Lambda aggressively expanding its on-demand cloud offering via 1-Click Clusters | PM Insights — Lambda Cloud not to be confused with AWS Lambda

Hierarchy: Accounts and Teams

Lambda's model is very flat compared to Nebius's Tenant/Project structure. Your organizational unit is a Team, which sits under a single billing account. There are no nested projects, no per-region workspaces, no org-level containers.

That has two major IR implications:

No resource separation. Every resource (instances, filesystems, SSH keys, API keys) lives in one shared pool. If an identity is compromised, the entire Team's resources are in scope. You can't "contain the project"; it's an open house.
Scoping is straightforward. The full hierarchy is easy to understand; the downside is that you also can't isolate part of the workload.

If your organization runs multiple workloads on Lambda, the only real separation boundary is a separate Team (and therefore a separate billing account).

IAM

Lambda's identity model has three components:

Team members: humans authenticated via email/password. Two roles exist: Admin and Member. Both roles have full access to Team resources; the only difference is that Admins can edit billing.
API keys: programmatic tokens generated in the Cloud Console. Used for the Cloud API (instances, firewalls, audit events, etc.).
SSH keys: injected into instances at launch for shell access. Manageable via the Console or the API.

Keep in mind that both standard roles have full (Kubernetes) cluster access; this is by design and noted in the documentation.

A few things jump out from an IR perspective:

No least-privilege for humans. Any Member can list, launch, or terminate any instance, read any filesystem, and see any SSH or API key metadata. There is no read-only or forensics role. If you want a "break-glass IR identity," you'll need to create one, document it, and rotate credentials regularly.
No SSO or IdP-enforced MFA. At the time of writing, there is no way to set up federated access via your IdP (e.g. Entra ID). That means a separate identity object lives in Lambda Cloud, not subject to the security measures in your IdP.
API keys are the primary abuse target. Anyone with an API key has full programmatic control. Treat them like root credentials: short-lived where possible, stored in a secret manager, rotated on a schedule, and revoked the instant a user changes role or leaves.
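Rotating on a schedule only works if you keep your own record of when each key was created. A minimal sketch, assuming a self-maintained inventory: the record layout (`name`, `owner`, `created_at`) and the 30-day period are illustrative assumptions, not something Lambda provides or recommends.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable, List, Mapping


def keys_due_for_rotation(
    inventory: Iterable[Mapping[str, object]],
    max_age: timedelta,
    now: datetime,
) -> List[str]:
    """Return the names of keys whose age is at least max_age."""
    due = []
    for record in inventory:
        # 'created_at' is a field of OUR inventory (assumed layout),
        # recorded by us at the moment the key was generated.
        if now - record["created_at"] >= max_age:
            due.append(str(record["name"]))
    return due


# Hypothetical inventory entries, for illustration only.
inventory = [
    {"name": "ci-runner", "owner": "alice", "created_at": datetime(2026, 1, 5, tzinfo=timezone.utc)},
    {"name": "notebook", "owner": "bob", "created_at": datetime(2026, 4, 10, tzinfo=timezone.utc)},
]
now = datetime(2026, 4, 17, tzinfo=timezone.utc)
print(keys_due_for_rotation(inventory, timedelta(days=30), now))
```

Keeping the owner next to each key in that inventory also gives you a key-to-person mapping that you control yourself.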

Tip: Make sure MFA is set to required for the organization; it's not enforced by default.

Logging & Forensics

Lambda exposes audit logs through a single Audit Events endpoint in the Cloud API:

Retention is 6 months by default, better than Nebius's 90 days.
Logs capture user- and API-level events (logins, key creation, instance lifecycle, firewall changes).
There is no native GUI export. Pulling logs at scale is an API job.

The interesting thing is that this is very well documented: every possible event log entry is listed in the documentation (eat that, Microsoft).

Logs can only be acquired or inspected via the API, not in the console. For example, to retrieve logs between April 16 and April 17, we can make the following request:

_{curl --request GET --url 'https://cloud.lambda.ai/api/v1/audit-events?start=2026-04-16T00%3A00%3A00.000000Z&end=2026-04-17T00%3A00%3A00.000000Z' \
--header 'accept: application/json' \
--user '<YOUR-API-KEY>:'}
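The same request can be built in Python when you want to script log collection. This is a minimal sketch using only the standard library. It mirrors the curl call above (same URL, same `start`/`end` parameters, HTTP Basic auth with the API key as username and an empty password); the function names are our own, and nothing is assumed about the layout of the response.

```python
import base64
from datetime import datetime, timezone
from urllib.parse import urlencode
from urllib.request import Request

AUDIT_EVENTS_URL = "https://cloud.lambda.ai/api/v1/audit-events"


def to_api_timestamp(moment: datetime) -> str:
    """Format an aware datetime as UTC with microseconds and a trailing Z."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%fZ")


def build_audit_events_request(api_key: str, start: datetime, end: datetime) -> Request:
    """Build (but do not send) the GET request for one time window."""
    query = urlencode({"start": to_api_timestamp(start), "end": to_api_timestamp(end)})
    # Equivalent of curl's --user '<key>:' (key as username, empty password).
    token = base64.b64encode(f"{api_key}:".encode()).decode()
    return Request(
        f"{AUDIT_EVENTS_URL}?{query}",
        headers={"accept": "application/json", "Authorization": f"Basic {token}"},
        method="GET",
    )


request = build_audit_events_request(
    "<YOUR-API-KEY>",
    datetime(2026, 4, 16, tzinfo=timezone.utc),
    datetime(2026, 4, 17, tzinfo=timezone.utc),
)
print(request.full_url)
```

To actually send it, pass the request to `urllib.request.urlopen` and parse the body with `json.load`. Check the API reference for how results are paged before relying on a single call for a large time range.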

An example event is shown below:

This is where things get interesting: compare the event with the table above. You have to combine the first three fields to get the full event name _{cloud.identity.created}.
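Joining those fields is easy to automate once you know what they are called. A minimal sketch follows; the field names `part_1`, `part_2` and `part_3` and the sample event are hypothetical placeholders, so substitute the first three fields of a real event from your own account.

```python
from typing import Mapping, Sequence


def full_event_name(event: Mapping[str, object], name_fields: Sequence[str]) -> str:
    """Join the values of the name-carrying fields with dots, in the order given."""
    missing = [field for field in name_fields if field not in event]
    if missing:
        raise KeyError(f"event is missing name fields: {missing}")
    return ".".join(str(event[field]) for field in name_fields)


# HYPOTHETICAL field names and event, for illustration only.
NAME_FIELDS = ("part_1", "part_2", "part_3")
sample_event = {"part_1": "cloud", "part_2": "identity", "part_3": "created"}
print(full_event_name(sample_event, NAME_FIELDS))  # cloud.identity.created
```

Normalizing to a single name at ingestion time lets you match events directly against the documented event list.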

The problem with the audit logs

When you look at an event you'll immediately notice some issues:

No username, only a UID
No IP address
No user-agent
No session ID or other correlation fields
Usage of anonymous values

All of the above fields are critical when doing incident response. At the time of testing and writing, we haven't been able to get meaningful data out of the audit events. The audit log only records what happened and when. This is a huge challenge when you are trying to do root cause analysis or forensics on a given user or API key. It is possible to find all activity tied to an API key; however, it's not possible to look up which user owns which API key.
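Finding all activity for one identifier comes down to grouping events by whatever opaque ID the log does give you. A minimal sketch, assuming the events are already parsed into dictionaries: the field names `actor_id` and `timestamp` are hypothetical and are passed in as parameters so you can swap in the real ones. It also assumes timestamps share one ISO 8601 format, so that sorting them as strings is chronological.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Mapping


def activity_by_actor(
    events: Iterable[Mapping[str, object]],
    actor_field: str,
    time_field: str,
) -> Dict[str, List[Mapping[str, object]]]:
    """Group events per actor identifier and order each group by timestamp."""
    grouped: Dict[str, List[Mapping[str, object]]] = defaultdict(list)
    for event in events:
        # Events without the actor field end up in an explicit 'unknown' bucket.
        grouped[str(event.get(actor_field, "unknown"))].append(event)
    for actor_events in grouped.values():
        actor_events.sort(key=lambda e: str(e.get(time_field, "")))
    return dict(grouped)


# HYPOTHETICAL events and field names, for illustration only.
events = [
    {"actor_id": "uid-2", "timestamp": "2026-04-16T10:00:00.000000Z", "name": "b"},
    {"actor_id": "uid-1", "timestamp": "2026-04-16T09:00:00.000000Z", "name": "a"},
    {"actor_id": "uid-2", "timestamp": "2026-04-16T08:00:00.000000Z", "name": "c"},
]
timeline = activity_by_actor(events, "actor_id", "timestamp")
print([e["name"] for e in timeline["uid-2"]])  # ['c', 'b']
```

This gives you a per-identifier timeline, but it does not solve attribution: mapping an ID back to a person still has to come from records you keep outside Lambda.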

At this stage, our verdict is that if you want to do incident response in Lambda Cloud, you will have a very hard time. Without meaningful event details, it is simply not possible to attribute activity to a given identity, which is a cornerstone of any investigation.

The Incident Readiness Shortlist

So what can you do if you're using Lambda Cloud? Start by talking to your account manager; maybe they can fix some of the audit logging issues mentioned earlier. Furthermore, if you're running production workloads on Lambda, do these before you need us:

Isolation. If you have prod and non-prod on one Team, split them. It's the only real isolation Lambda gives you.
Treat API keys like crown jewels. Inventory them when they get created, rotate on a schedule, and revoke on departure.
Deploy container monitoring. Lambda's audit log won't tell you what an attacker did once they had a shell. Visibility on the endpoint needs to be configured separately.
Pull audit events into your SIEM. If you want to proactively search for events, forward them to your SIEM (and hopefully additional details will be added in the future).
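For the SIEM item, a scheduled job can walk the time range in fixed windows and hand each batch to your pipeline as newline-delimited JSON. The sketch below covers only the windowing and serialization; fetching each window is left to the Audit Events endpoint, and both the six-hour step and the NDJSON format are assumptions about what your SIEM ingests, not Lambda requirements.

```python
import json
from datetime import datetime, timedelta, timezone
from typing import Iterable, Iterator, Mapping, Tuple


def iter_windows(
    start: datetime, end: datetime, step: timedelta
) -> Iterator[Tuple[datetime, datetime]]:
    """Yield consecutive (window_start, window_end) pairs covering [start, end)."""
    if step <= timedelta(0):
        raise ValueError("step must be positive")
    cursor = start
    while cursor < end:
        window_end = min(cursor + step, end)
        yield cursor, window_end
        cursor = window_end


def to_ndjson(events: Iterable[Mapping[str, object]]) -> str:
    """Serialize events as one JSON object per line, with stable key order."""
    return "".join(json.dumps(event, sort_keys=True) + "\n" for event in events)


windows = list(
    iter_windows(
        datetime(2026, 4, 16, tzinfo=timezone.utc),
        datetime(2026, 4, 17, tzinfo=timezone.utc),
        timedelta(hours=6),
    )
)
print(len(windows))  # 4
```

Whether the API treats the `end` bound as inclusive is not something we have verified, so deduplicate on the SIEM side in case adjacent windows return the same event twice.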

We'd love to help you with incident readiness or incident response in the (Neo) Cloud. Contact us at info@invictus-ir.com or schedule a meeting directly.

About Invictus Incident Response

We are an incident response company and we ❤️ the cloud. We help our clients stay undefeated.

🆘 Incident Response support: reach out to cert@invictus-ir.com or go to https://www.invictus-ir.com/24-7
