Mitigating the risks of using open source in KasperskyOS

Dmitry Romanov, Software development group manager for KasperskyOS at Kaspersky.

Dmitry Romanov, Software development group manager for KasperskyOS at Kaspersky, discusses the mechanisms in KasperskyOS that help minimise security risks traditionally associated with operating systems in this exclusive opinion piece.

According to a report by Linux Foundation Research, OpenSSF, and researchers from Harvard Business School, 96% of modern applications include open-source components. While the use of open-source system software and Linux kernel drivers is widespread and often efficient, it can introduce significant risks, particularly those associated with software supply chain attacks.

When developing the KasperskyOS operating system, we at Kaspersky asked ourselves: how can the widespread use of untrusted code be made safe, especially in products designed for industries with stringent cybersecurity requirements? In this article, we explore the mechanisms built into KasperskyOS that help mitigate risks commonly associated with traditional operating systems.

Cybersecurity incidents have continued to rise year over year, with supply chain attacks now accounting for a significant proportion of them. The year 2024 has been no exception, as demonstrated by several high-profile incidents:

  • A critical backdoor discovered in XZ/liblzma, which enabled attackers to bypass OpenSSH authentication, execute remote commands, and suppress event log entries;
  • The compromise of the polyfill.io script;
  • The upload of malicious packages to the PyPI repository with names closely resembling legitimate ones;
  • The breach of the Discord community bot, top.gg.

While such attacks are not entirely new, their frequency and sophistication have grown. For example, in 2023, Microsoft’s infrastructure was compromised via a supply chain attack on JFrog Artifactory, a widely used artifact manager. One of the most well-known examples dates back to 2020, when SolarWinds, a provider of IT infrastructure monitoring solutions, was breached. The resulting compromise affected thousands of public and private sector organisations globally. Similarly, the 2021 Codecov breach involved the compromise of a code coverage utility, with far-reaching consequences.

Threat Models in Modern Operating Systems and the Use of Sandboxes for Applications and Drivers

In developing KasperskyOS, we have drawn extensively on global best practices and established approaches to secure operating system design. Our process included analysing international standards for operating system security and secure software development, as well as studying the methodologies adopted by creators of today’s most respected and commercially successful secure platforms. We have been particularly inspired by systems such as Android, ChromeOS, GrapheneOS, Whonix, Ubuntu Core, Qubes OS, the Genode OS Framework, Legato, HarmonyOS.NEXT, OpenBSD, seL4, and others.

A common thread among these systems is a re-evaluation of the traditional threat models for both personal and professional devices. These devices are now permanently connected to the global internet, frequently processing complex, untrusted content. They also support third-party applications, which introduce further unpredictability into the system. Moreover, the growing use of wireless technologies such as NFC, Bluetooth, and Wi-Fi expands the potential attack surface. Physical security is also a concern, as many modern devices—including wearables—are vulnerable to theft or loss, requiring the threat model to account for scenarios involving direct physical access.

One of the most effective strategies for mitigating these risks is the use of sandboxing: isolating application code and modules that handle untrusted data within tightly controlled execution environments with minimal privileges. This architectural approach limits the potential impact of any successful attack by containing malicious behaviour within a single process or virtual machine. Access to system resources is governed by detailed security policies that consider the roles and requirements of users, platform developers, and administrators.
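The policy-governed access described above can be pictured as a default-deny check applied to every call between sandboxes. The Python sketch below is only an illustration: the component names, the `ALLOWED` rule set, and the `deliver` helper are hypothetical and do not reflect the policy language or API of any of the systems mentioned.

```python
# Hypothetical rule set: (sender, receiver, method) triples that are permitted.
ALLOWED = {
    ("browser", "net_stack", "send"),
    ("net_stack", "nic_driver", "transmit"),
}


def is_allowed(sender, receiver, method, rules=ALLOWED):
    """Default deny: a call passes only if an explicit rule permits it."""
    return (sender, receiver, method) in rules


def deliver(sender, receiver, method, audit_log):
    """Check one call between sandboxes and record the verdict."""
    verdict = is_allowed(sender, receiver, method)
    audit_log.append((sender, receiver, method, "allow" if verdict else "deny"))
    return verdict


audit_log = []
deliver("browser", "net_stack", "send", audit_log)        # permitted by a rule
deliver("browser", "nic_driver", "transmit", audit_log)   # no rule, so denied
```

In this toy model, a compromised browser sandbox can still talk to the network stack but cannot reach the driver directly, because no rule names that path.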

At Kaspersky, we have extended this model further. While sandboxing is often confined to the application layer in many systems, we advocate for its application to traditionally less-isolated components as well—specifically, device drivers and the network stack. This is particularly relevant as more systems, including those in critical infrastructure, depend on components based on the Linux kernel. Given this dependency, it is crucial to mitigate the risks posed by potential vulnerabilities or deliberately planted backdoors in Linux drivers and subsystems.

One of the promising avenues we are currently exploring is the secure reuse of Linux drivers to accelerate the porting of KasperskyOS to new hardware platforms. Our team is actively addressing both the technical and legal challenges associated with incorporating GPL-licensed code, and is investigating mechanisms that would allow system code to be reused securely within the KasperskyOS architecture.

Threats and Mitigation Strategies in KasperskyOS

KasperskyOS leverages its microkernel architecture to implement effective compensating controls that mitigate many of the risks typically associated with monolithic operating systems. In this section, we focus on a specific subset of the threat model under development for KasperskyOS: the threats related to device drivers.

Threat modelling commonly involves the use of data flow diagrams that represent the flow of information within a system. These diagrams typically identify both active entities (subjects, such as processes) and passive ones (objects, e.g., data stores, files, hardware registers, memory structures), as well as information flows and trust boundaries.

In widely used monolithic kernel-based operating systems, drivers generally operate within just two trust boundaries: between user space and the kernel, and between the kernel and the target hardware. This architectural limitation makes it difficult to isolate driver-specific threats from those affecting the kernel as a whole. As a result, mitigating risks such as supply chain attacks—where driver source code may be maliciously modified, or latent vulnerabilities inadvertently introduced—becomes highly challenging. While Windows systems employ mechanisms such as driver signing and centralised distribution to address this, the Linux ecosystem largely relies on software hardening techniques to make exploitation more difficult. However, practical experience shows that even the most advanced hardening cannot fully prevent a determined and well-resourced attacker from exploiting a vulnerability.

By contrast, microkernel-based systems such as KasperskyOS are not subject to these constraints. Their architectural separation of system components facilitates the definition of additional trust boundaries, allowing for a more granular and effective approach to threat mitigation.
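The difference in the number of trust boundaries can be made concrete with a toy data flow model. The Python sketch below is illustrative only: the element names, the zone assignments, and the `boundary_crossings` helper are assumptions made for this example, not part of KasperskyOS or of any threat-modelling tool.

```python
def boundary_crossings(zones, flows):
    """Return the flows whose two endpoints lie in different trust zones."""
    return [(src, dst) for src, dst in flows if zones[src] != zones[dst]]


# Information flows around an abstract NIC driver (assumed for illustration).
FLOWS = [
    ("app", "net_stack"),
    ("net_stack", "driver"),
    ("driver", "kernel"),
    ("driver", "nic"),
]

# Monolithic kernel: the stack and the driver share the kernel's trust zone.
MONOLITHIC = {
    "app": "user", "net_stack": "kernel", "driver": "kernel",
    "kernel": "kernel", "nic": "hardware",
}

# Microkernel: every component runs in its own zone.
MICROKERNEL = {
    "app": "app", "net_stack": "net_stack", "driver": "driver",
    "kernel": "kernel", "nic": "hardware",
}

print(boundary_crossings(MONOLITHIC, FLOWS))
print(boundary_crossings(MICROKERNEL, FLOWS))
```

Under these assumed zone assignments, the monolithic layout exposes two crossings (user space to kernel, kernel to hardware), while the microkernel layout turns every flow into a boundary at which a check can be placed.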

To illustrate, consider a simplified threat model for an abstract device driver. Sources of potential threats may include the devices interacting with the driver, user processes for which the driver abstracts the hardware and allocates resources, and the various stacks or subsystems that provide further abstraction layers over the driver, such as network stacks.

The following table presents several types of threats relevant to this model, along with examples of compensating measures specific to the KasperskyOS architecture.

 

| Threat | Subthreat | Description | Desired behaviour | Mitigation |
| --- | --- | --- | --- | --- |
| DRV 1*. Denial of service | DRV 1.1. Denial of service to the entire operating system due to a driver malfunction | A driver error is exploited to create an exceptional situation, such as dereferencing a null pointer. The operating system kernel panics and enters an erroneous state. In most popular operating systems, this threat cannot be mitigated. Hardening will not help, because such an attack does not involve deliberate code injection or execution path modification. | A driver malfunction does not result in a denial of service to the entire system. | Executing the driver in a sandbox with reduced privileges. The driver should run in a separate user space process. The kernel API related to physical memory allocation should be available only to a trusted driver manager; the drivers themselves should get the necessary resources (ports, memory regions, interrupt numbers) through passing Object Capability to the corresponding objects. Different reactions to an erroneous driver stop (process termination) are possible: restarting, writing to the audit log. |
| DRV 1*. Denial of service | DRV 1.2. Failure to maintain dependent services due to extremely long activity during driver operation | A bug in the driver logic is exploited to reduce its responsiveness, for example by executing an infinite loop. Processes and subsystems that use the driver may stop responding due to blocking while waiting for events. | A driver disruption does not cause a denial of service for dependent services. | Starting the driver in a separate process. Driver health monitoring: ‘pulse check’, watchdog timer. In some cases, such measures will allow restarting a hung driver and restoring its context. |
| DRV 2. Information leakage | DRV 2.1. Leakage of data placed in a heap | Due to access beyond the allowed object boundaries, uninitialised values or other errors during the processing of user data, data may leak from the driver and the OS kernel. Such data may contain secrets, such as encryption keys used in the Crypto API. | No data leakage, or leakage is limited to non-sensitive data. | (1) Separating the driver from the kernel, running in user space. This ensures that the driver does not share common secrets with the OS kernel. (2) Application of safe programming practices, SDL. (3) Validation of call parameters by the generated parser. |
| DRV 2. Information leakage | DRV 2.2. Leakage of data placed in the stack | Due to access beyond the allowed object boundaries, uninitialised values or other errors during the processing of user data, data may leak from the driver and the OS kernel. Such data may contain secrets, such as encryption keys used in the Crypto API. | No data leakage, or leakage is limited to non-sensitive data. | Same as in DRV 2.1. |

* – Threat codes are symbolic identifiers used exclusively for internal reference within this table and throughout the article.
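The mitigation for DRV 1.1 relies on a trusted driver manager handing each driver capabilities to exactly the resources it needs. The Python sketch below models that idea in miniature. All class and resource names (`Capability`, `DriverManager`, `NicDriver`, `nic_mmio`, `kernel_heap`) are hypothetical and are not the KasperskyOS API. In a real system the kernel enforces these checks, whereas Python can only imitate them by convention.

```python
from dataclasses import dataclass


@dataclass(frozen=True, eq=False)  # eq=False: tokens match by identity, so copies fail
class Capability:
    resource: str
    rights: frozenset


class DriverManager:
    """Trusted component: the only place where capabilities are minted."""

    def __init__(self, resources):
        self._resources = dict(resources)  # resource name -> backing buffer
        self._issued = set()

    def grant(self, resource, rights):
        cap = Capability(resource, frozenset(rights))
        self._issued.add(cap)
        return cap

    def access(self, cap, operation):
        """Return the resource only if this exact token carries the right."""
        if cap not in self._issued or operation not in cap.rights:
            raise PermissionError(f"{operation} on {cap.resource} denied")
        return self._resources[cap.resource]


class NicDriver:
    """Untrusted driver: it holds only the capabilities handed to it."""

    def __init__(self, manager, mmio_cap):
        self._manager = manager
        self._mmio = mmio_cap

    def read_status(self):
        return self._manager.access(self._mmio, "read")[0]


manager = DriverManager({
    "nic_mmio": bytearray([0x01]),
    "kernel_heap": bytearray(b"secret"),
})
driver = NicDriver(manager, manager.grant("nic_mmio", {"read"}))
```

Here the driver can read its own register window, but a self-made `Capability("kernel_heap", ...)` is rejected because the manager never issued that token. This also reflects the reasoning behind DRV 2.1: the driver has no path to secrets it was never granted.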

Demonstration Scenario

Let us examine a basic example drawn from the threat model discussed earlier: a network interface card (NIC) driver is deliberately embedded with functionality that causes a denial of service. The system includes a critical user-space process responsible for monitoring hardware, with both the NIC driver and network stack used to transmit event logs related to hardware status. In this context, a complete system shutdown could result in damage to high-value equipment, while a failure to transmit logs is classified as a low-severity incident.

To illustrate this scenario, we constructed an experimental testbed using a Radxa ROCK 3A single-board computer. The setup comprises two system images performing an identical operational task—one based on a minimal GNU/Linux build, the other on KasperskyOS Community Edition (CE). Both systems utilise an identical copy of a network card driver containing the same deliberately embedded vulnerability. When a specially crafted network packet is received, the driver immediately enters an error state.

The experiment demonstrates that the system built on KasperskyOS preserves its critical functionality, while the GNU/Linux-based system experiences a complete kernel failure. It is important to note that this example illustrates protection against a specific vulnerability under defined conditions. To address a broader range of threats, solution architecture must be designed accordingly, incorporating appropriate security patterns and design practices.
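The principle behind this result, a faulty driver confined to its own process while the critical task keeps running, can be imitated on any desktop system. The Python sketch below is not the testbed code. The `DRIVER_SOURCE` stand-in, the `CRAFTED` marker, and the `run_driver` and `monitor` helpers are assumptions for illustration, and frames are assumed to be single-line strings.

```python
import subprocess
import sys

# Hypothetical stand-in for the NIC driver: it fails hard on one crafted frame.
DRIVER_SOURCE = """
import sys
for line in sys.stdin:
    frame = line.strip()
    if frame == "CRAFTED":
        raise RuntimeError("simulated driver fault")
    print("sent:" + frame, flush=True)
"""


def run_driver(frames):
    """Run the driver in its own process; return (delivered frames, crashed)."""
    result = subprocess.run(
        [sys.executable, "-c", DRIVER_SOURCE],
        input="\n".join(frames) + "\n",
        capture_output=True,
        text=True,
    )
    delivered = [
        line[len("sent:"):]
        for line in result.stdout.splitlines()
        if line.startswith("sent:")
    ]
    return delivered, result.returncode != 0


def monitor(frames):
    """Critical task: survives driver crashes and restarts the driver."""
    delivered, restarts = [], 0
    pending = list(frames)
    while pending:
        sent, crashed = run_driver(pending)
        delivered.extend(sent)
        # Skip what was delivered, plus the frame that triggered the fault.
        pending = pending[len(sent) + (1 if crashed else 0):]
        restarts += int(crashed)
    return delivered, restarts
```

Calling `monitor(["log-1", "CRAFTED", "log-2"])` loses only the crafted frame: the monitoring process itself never terminates, and the driver is restarted once. Had the same fault occurred inside the monitor's own address space, there would be nothing left to perform the restart.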

Supply chain attacks that pose serious risks to system performance and data security are now more relevant than ever. Yet such threats are often overlooked by operating system developers, primarily because, without the ability to isolate drivers in a sandbox, they lack effective means to address them. As a result, these developers are left to rely on the goodwill of open-source software contributors and the quality of publicly available components.

You can find the testbed description and source code in our GitFlic repository. To explore our operating system and its security principles in more detail, feel free to reproduce the example using KasperskyOS Community Edition.

Image Credit: Kaspersky
