Incident Response in the Neocloud

In this three-part series, we define the Neocloud landscape, identify the threat landscape, and break down the IR essentials for Neoclouds. If you have an AWS background, you’ll find Nebius's structure familiar. To help you move faster, we built the Invictus-Nebius script to automate your data acquisition. Whether it’s locking down IAM or centralizing your logs in Neoclouds, the goal is the same: the best time to prepare was yesterday; the second best time is now. The absolute worst time is during an incident.

Introduction

What if I told you there's a whole new world out there... Yes, it’s time to talk about Neoclouds from an incident response perspective. If you're reading this and wondering what Neoclouds are, don't worry, we've got you covered. And don’t panic: it all distills down to the same fundamentals you know from regular cloud incident response cases. Ready to see how deep the rabbit hole goes?

Neocloud

What are Neoclouds?

Neoclouds are AI-first cloud providers offering specialized GPU infrastructure. By focusing solely on accelerated compute, they provide these resources at a lower cost and with higher performance than traditional hyperscalers, making massive AI workloads more accessible.

Neoclouds vs. The Big Three: What's the difference?

Neoclouds specialize in one specific domain: access to high-performance GPUs. Unlike traditional clouds that rely on virtualization, most Neocloud services are built on Kubernetes and offer bare-metal nodes. This architecture allows AI models to run with maximum hardware efficiency.

In contrast, hyperscalers can’t match the specialized power density and agility of Neoclouds. Public reporting indicates that even the largest cloud providers now leverage Neocloud infrastructure to supplement their own capacity and run intensive AI workloads.

What does this mean for Incident Response?

When we talk about incident response in Neoclouds, it’s easy to get overwhelmed by new terminology. But if we strip away the "AI" label, we are left with the same fundamentals of cloud incident response:

Hierarchy (Tenants, Projects)
Identity and Access Management (IAM & Federation)
Logging & Forensics (Audit Trails)

Who are the major players?

While the Neocloud market is expanding rapidly, this series will focus on three major players that incident responders are most likely to encounter today:

1. Nebius
2. CoreWeave
3. Lambda Labs

The AI Threat landscape

Before we jump into the technicals, we should briefly discuss the threat landscape around AI and Neoclouds. After all, the threat is what causes the 5pm Friday call for us incident responders.

In the Neocloud space, the threat landscape is evolving rapidly. While intelligence maturity is still catching up to the technology, both public reporting and first-hand experience at Invictus highlight five high-relevancy threat vectors.

1. Cryptojacking / Resource Hijacking (T1496)

Detect and mitigate unauthorized use of H100 clusters.

One of the most prevalent threats in the traditional cloud has relevancy here too. Criminal threat actors are not looking for your credit card data; they are looking for your H100s. A compromised cluster can be repurposed to mine cryptocurrency or train the attacker's own models, potentially costing you tens of thousands of dollars in a few hours.

2. Model Exfiltration (AML.T0025)

Prevent theft of proprietary model weights and IP.

While not as common as resource hijacking, theft of proprietary AI models is relevant to some threat profiles. Unlike typical database theft, model weights are massive files that are frequently loaded into high-performance memory. State-sponsored actors and cyber mercenaries could attempt to steal proprietary model weights to serve state interests or to further a competitor’s interests.

3. Data Poisoning (AML.T0020)

Detect integrity attacks on high-speed storage.

One threat that keeps a CISO up at night is an attack on data integrity. Even the slightest modification to training data stored on high-speed parallel file systems (like Lustre or Weka) could ruin a model's integrity without triggering any alerts. Sabotage-focused threat actors are the most likely to execute such an attack, and the damage could be massive.
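
One way to make silent modifications visible is a hash baseline of the training data. The sketch below is a minimal illustration, not a Nebius feature: it assumes the dataset is reachable as ordinary files, and the shard file names are hypothetical. On very large datasets, hashing everything is expensive, so in practice you would likely sample or hash at ingestion time.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large shards do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict:
    """Map each file path (relative to root) to its SHA-256 digest."""
    return {
        str(path.relative_to(root)): sha256_file(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def changed_files(baseline: dict, current: dict) -> list:
    """Paths that were added, removed, or whose content hash differs."""
    return sorted(
        path
        for path in baseline.keys() | current.keys()
        if baseline.get(path) != current.get(path)
    )


# Demo on a throwaway directory with two hypothetical training shards.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "shard-0.jsonl").write_text('{"label": 1}\n')
    (root / "shard-1.jsonl").write_text('{"label": 0}\n')
    baseline = build_manifest(root)

    # Simulated tampering: a single label is flipped.
    (root / "shard-1.jsonl").write_text('{"label": 1}\n')
    tampered = changed_files(baseline, build_manifest(root))

print(tampered)
```

The baseline manifest only helps if it is stored somewhere the attacker cannot modify, for example outside the tenant that holds the data.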

4. Supply Chain Compromise (T1195)

Neutralize malicious models from public repos.

Neocloud environments rely heavily on containers and models pulled from public repositories (Hugging Face, Docker Hub). Over the last couple of years, the industry has seen numerous high-impact supply chain compromises in cloud environments, and Neoclouds offer the same opportunity to both criminal and espionage-focused threat actors. For example, injecting malicious pre-trained models or files that execute code upon loading gives an actor a potential entry point to achieve their objectives.
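
To make "files that execute code upon loading" concrete: many model files are serialized with Python's pickle format, which can call arbitrary functions while it is being loaded. The sketch below is a minimal triage aid, assuming the artifact is a plain pickle stream. It uses the standard library's `pickletools` to list opcodes that import or call objects, without ever loading the file. The `Payload` class is a harmless, hypothetical stand-in for a booby-trapped model.

```python
import pickle
import pickletools

# Opcodes that import a callable or invoke one during unpickling.
SUSPICIOUS_OPCODES = {
    "GLOBAL", "STACK_GLOBAL", "INST", "OBJ", "NEWOBJ", "NEWOBJ_EX", "REDUCE",
}


def suspicious_opcodes(data: bytes) -> list:
    """List code-execution-capable opcodes in a pickle stream without loading it."""
    return [
        opcode.name
        for opcode, _arg, _pos in pickletools.genops(data)
        if opcode.name in SUSPICIOUS_OPCODES
    ]


class Payload:
    """Hypothetical malicious object: loading it would call print()."""

    def __reduce__(self):
        return (print, ("code ran during load",))


# Serializing is safe; the call only happens if someone runs pickle.loads().
benign = pickle.dumps({"weights": [0.1, 0.2]})
malicious = pickle.dumps(Payload())

print(suspicious_opcodes(benign))
print(suspicious_opcodes(malicious))
```

Treat a hit as a reason to look closer rather than as proof of compromise: legitimate model files also use these opcodes to rebuild their objects, so what matters is which modules and functions they reference.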

5. Agentic Abuse (AML.T0051)

Prevent indirect prompt injection attacks.

One of the more common attacks carried out by criminals and insiders in the AI world is indirect prompt injection: the attacker hides malicious instructions in a document or website that your autonomous AI agent is likely to read. The AI agent, which has been granted high-level permissions to get work done, follows the hidden instructions, often to exfiltrate data or delete clusters.

Nebius

We are kicking things off with Nebius. Why Nebius? Well, they are based in the Netherlands, so we want to start this journey in our own backyard.

Hierarchy

Within Nebius, your resources are organized into a parent-child structure of Tenants and Projects.

In the picture below, the tenant ID in this example is `tenant-e00mzw6m41jbhs75m6` and the name is `invictus-ir-rca`.

For an incident responder, this hierarchy is the key to scoping. Knowing the Tenant ID and the Name allows you to immediately define the blast radius. A compromise in one project doesn't necessarily mean the entire tenant is affected. By understanding this structure, you can contain a threat surgically within a project without shutting down an entire global operation.

IAM

In Neoclouds, IAM is one of the most important topics because, just like in normal clouds, an identity is required to perform any activity. For each investigation, you'll want to make sure you understand the following:

Nebius comes with four built-in groups, but you can also create custom groups. In the picture below, we've added an Incident Response group for the IR team.

Within a group you can add members and Access Permits, which simply means permissions. The nice thing is that you can set up permissions at the Tenant or Project level.

For an IR group, we suggest global viewer permissions so you can see all resources, including security settings.

Nebius manages identities via either Direct Invitation or Federation (e.g., Entra ID). For incident response, this distinction dictates where detection and response should take place. With invited users, containment controls reside entirely within Nebius. With federated users, however, Nebius only sees the successful login token; it does not see the authentication attempts. This means critical attack indicators such as brute force attacks, impossible travel, or MFA fatigue will exist only in your external IdP logs (e.g., Entra sign-in logs), so responders must pivot out of Nebius immediately to assess the full scope of the compromise.

Service accounts deserve close attention. They are used for programmatic access and tasks (e.g., scripts, images and container registries), which makes them a likely target for abuse and a favorite choice for threat actors.

Each service account has a (random) name and ID, and you can grant or limit its access using Groups. Service accounts authenticate via two methods:

With one of the keys, the service account requests a token from Nebius IAM; with that token, it can perform actions. Nebius only allows access keys for service accounts to be created and used for Logging and the Container Registry, so a leaked access key should not lead directly to a compromise of IAM. Regardless, it's crucial to protect these types of identities.

Logging & Forensics

Nebius has a lot in common with AWS, and you can see that in the logging component too. Nebius has two audit log types:

The Control Plane logging is in JSON format and follows the CloudEvents specification. To access the logs, you can either use the GUI via Manage --> Audit Logs or the Nebius CLI with `nebius audit v2 audit-event`.

Each event contains a lot of different fields. In this example event, a Public Key was created; the other highlighted fields are required to get the full picture of the event.

To help you with analysis we have created a table that contains an overview of the most important fields and their relevance for an IR scenario:

Field Name	Description	Relevance to IR
.time	Event timestamp	Timeline: Establishes the "when" of the incident.
.type	Event recorded (e.g., accesskey.create)	What: Most important field for building detections.
.action	Operation type (CREATE, DELETE)	Severity: Differentiates impact (e.g. READ vs WRITE).
.subject	User email or Service Account ID	Attribution: Identifies the specific actor responsible.
.resource	Resource Name and Type	Target: Pinpoints exactly which asset was touched.
.request	Source IP and Client Software	IOCs: Reveals threat infrastructure (e.g., `curl`).
.status	Success status (OK, ERROR)	Outcome: Did the malicious attempt succeed?
.region	Region (e.g., eu-north1)	Location: Where is it hosted physically.
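
As a rough illustration of how these fields support triage, the Python sketch below reduces raw events to the columns from the table and builds a timeline, optionally for a single actor. It assumes each event is a JSON object with the table's field names as top-level keys; real Nebius events may nest these values, so adjust the lookups to the events you actually export. The sample events and subject values are hypothetical.

```python
from typing import Optional

# Field names taken from the table above (without the leading dot).
FIELDS = ("time", "type", "action", "subject", "resource", "request", "status", "region")


def to_row(event: dict) -> dict:
    """Keep only the IR-relevant fields; missing fields become None."""
    return {name: event.get(name) for name in FIELDS}


def timeline(events: list, subject: Optional[str] = None) -> list:
    """Return IR rows in chronological order, optionally limited to one actor."""
    rows = [to_row(event) for event in events]
    if subject is not None:
        rows = [row for row in rows if row["subject"] == subject]
    # Assumes ISO 8601 timestamps in one timezone, which sort correctly as
    # strings. Rows without a timestamp are kept and placed last.
    return sorted(rows, key=lambda row: (row["time"] is None, row["time"] or ""))


# Hypothetical events for illustration only.
sample_events = [
    {"time": "2025-01-02T10:05:00Z", "type": "accesskey.create",
     "action": "CREATE", "subject": "serviceaccount-example", "status": "OK"},
    {"time": "2025-01-02T09:58:00Z", "type": "accesskey.create",
     "action": "CREATE", "subject": "analyst@example.com", "status": "ERROR"},
]

for row in timeline(sample_events):
    print(row["time"], row["type"], row["subject"], row["status"])
```

Filtering on a single subject, as in `timeline(sample_events, subject="analyst@example.com")`, gives a quick view of everything one identity did.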

Based on our IR experience, this means that most organizations will not log these events, as logging is not on by default. Keep in mind that these events often contain the most relevant information in an IR, such as who accessed a file and from where. If you are responsible for security, this is an important consideration.

Audit logging observations

Here are some things to keep in mind while performing audit log analysis in Nebius:

Exporting the logs

If you're in an IR scenario and need to export all the logs, there are two options.

We decided to build a script for the first option. It leverages your CLI profile to identify all accessible tenants and quickly pulls the last 90 days of data into a JSON file. Once complete, the script prints a detailed acquisition summary for your records.

You can then load the data into your favourite log analysis tool for further investigation.
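
Before loading an export into a tool, it can help to record what was acquired. The Python sketch below is not the Invictus-Nebius script. It is a minimal illustration that assumes the export is a single JSON array of events with top-level `time` and `type` keys, and it adds a SHA-256 of the file as an evidence record. The file name and sample events are hypothetical.

```python
import hashlib
import json
import tempfile
from collections import Counter
from pathlib import Path


def summarize_export(path: Path) -> dict:
    """Summarize an exported audit log file (assumed: one JSON array of events)."""
    raw = path.read_bytes()
    events = json.loads(raw)
    times = sorted(event["time"] for event in events if event.get("time"))
    return {
        "file": path.name,
        "sha256": hashlib.sha256(raw).hexdigest(),  # integrity record for the evidence file
        "events": len(events),
        "earliest": times[0] if times else None,
        "latest": times[-1] if times else None,
        "by_type": dict(Counter(event.get("type", "unknown") for event in events)),
    }


# Demo with a throwaway export containing two hypothetical events.
with tempfile.TemporaryDirectory() as tmp:
    export = Path(tmp) / "audit-export.json"
    export.write_text(
        json.dumps([
            {"time": "2025-01-02T09:58:00Z", "type": "accesskey.create"},
            {"time": "2025-01-02T10:05:00Z", "type": "accesskey.create"},
        ]),
        encoding="utf-8",
    )
    summary = summarize_export(export)

print(json.dumps(summary, indent=2))
```

Comparing the earliest and latest timestamps against the window you intended to collect is a quick check that the export is complete.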

The Incident Readiness Shortlist

If you or your customers are running workloads on Nebius, don't wait for the Friday 5 p.m. incident call. Here are the essential actions you should take right now to ensure you're ready to respond to an incident.

About Invictus Incident Response

We are an incident response company that ❤️ the cloud. We specialize in supporting organizations in preparing for and responding to cyber attacks. We help our clients stay undefeated!

Incident Response in the Neocloud - Nebius (Part I)
