Design Ideas for a Future Computer Virus... and for a Future Security Architecture

-- by François-René Rideau

This article was initially written from 2002-01-14 to 2002-01-19. Notes on possible counter-measures were added from 2003-07-07 to 2003-07-08. This article is available under the bugroff license. Its length is about 7000 words.

Introduction

In a preceding article, Les virus informatiques comme sous-produits des logiciels exclusifs (in French; see also its -- still incomplete -- adaptation into English, Computer Viruses Are Caused By Proprietary Software), I explained the reasons why proprietary OSes are so intrinsically unsafe and prone to viruses: software is distributed by as many monopolist black-box vendors as there are pieces of software, without any possibility of trust or coordination, and end-users (who cannot afford to surrender control of their computer to a single monopolist) are expected to be able to install software packages received by email, CD-ROM, etc. Even if operating system makers had an interest in making things reliable (they don't), the situation is intrinsically desperate as far as security goes. The fact that mainstream systems are so ridiculously easy to infect means that the viruses that currently survive and prosper are extremely crude and primitive ones: there is no need to spend time being elaborate; instead, they just ride the wave of one of the huge security issues that are common in proprietary systems, and spread from end-user to end-user.

As a counterpoint, I wanted to explore how elaborate viruses would have to be to survive and propagate in free software systems, considering the much more secure practices by which software is distributed. And since free software systems evolve in reaction to such threats, I also explored the dynamics of this evolution, which would only raise the stakes of the challenge for such viruses.

As explained in the previous article, in the world of free software, there is a free market in organizing software into coherent, reliable "distributions", which end-users install directly, each from one trusted server or CD-ROM set. There is never a need for the end-user to install any small utility, pirated software, teaser demonstration, or anything of the kind; silly animations can be viewed using applet viewers, and real programs are packaged by your distribution maker (Debian, FreeBSD, OpenBSD, RedHat, etc.). Thus it is a rare event for an end-user to install software, and mail user-agents and file browsers are configured not to install or run any untrusted software by default. A virus must survive all this time without being detected, and the day the rare opportunity comes, it must be able to run on the software configurations of that day, which might have changed since the day the virus was initially launched. Also, the people who would have to be infected for a virus to spread to lots of end-users are the developers and maintainers of software packages; but they are also the people most proficient at avoiding easy tricks, the people with the greatest interest in not being caught, and the people who would most quickly find out and eradicate the virus before it could do much damage, and who would be able to fix that damage afterwards.

All these phenomena combine to make it a very difficult task for a dedicated virus developer to take the free software community by storm. To measure the difficulty of such a task, I decided to sketch a few ideas for the design of a very robust virus that could defeat the vigilance of free software developers. I then also explore the counter-measures that may have to be taken to efficiently defuse the threat of such viruses.

Malicious effects

To start with, let's recall the kind of things that a virus or other malware can do, once it has actively infected a system. Indeed, although a virus has to replicate itself so as to spread, this might not be its sole activity, and even this activity can have undesirable side-effects.

Using up resources

The most common malicious effect is simply to use the limited resources of infected systems for purposes different from their owner's.

Counter-measures: Quotas can help stop the unchecked waste of resources and identify the culprit, while leaving enough resources for the administrator to inspect the live system (particularly if the quota drops to zero when the end-user isn't interacting with the application or otherwise explicitly endowing it). As for saving the human resources spent administering machines, that will take better, more automated system administration and application development tools.
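To make the idea concrete, here is a minimal sketch of per-principal quota accounting, in Python. All the names (Quota, charge, QuotaExceeded) are hypothetical; a real system would enforce such limits in the kernel (e.g. via rlimits), but the accounting logic is the same.

```python
# Sketch of per-principal resource quotas (illustration only; real systems
# enforce this in the kernel). All names here are hypothetical.

class QuotaExceeded(Exception):
    pass

class Quota:
    """Tracks resource usage for one principal (user, session, or process)."""
    def __init__(self, limit):
        self.limit = limit
        self.used = 0

    def charge(self, amount):
        # Refuse the allocation rather than let usage run away silently;
        # the administrator can then inspect the still-live system.
        if self.used + amount > self.limit:
            raise QuotaExceeded(f"would use {self.used + amount} of {self.limit}")
        self.used += amount

    def release(self, amount):
        self.used = max(0, self.used - amount)

# A runaway task that keeps allocating is stopped at the limit:
q = Quota(limit=100)
q.charge(60)
q.charge(30)
try:
    q.charge(30)          # 120 > 100: refused, culprit identified
except QuotaExceeded:
    pass
```

The point of refusing at the limit, rather than killing the process outright, is that the frozen culprit remains available for inspection.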

Defeating privacy

When you monitor a system, you can try to access private data.

However, before you can do all those things, you also need quite a good model of how these things work: where they are, how they can be used, when keys are used, what they open, whether it's safe to use them, and so on -- all while avoiding needless risks, traps, etc.

Counter-measures: Have fine-grained access-control lists or capabilities, so that a program may not access data it isn't meant to access. Access control on meta-data can also help put an early stop to adaptive attempts to propagate.
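The capability model can be sketched in a few lines of Python (the names ReadCapability and run_untrusted are hypothetical): a program holds unforgeable references to exactly the data it was granted, and there is no ambient authority for it to abuse.

```python
# Sketch of capability-style access control (illustration; names hypothetical).
# A program can only read data it was explicitly handed a capability for.

class ReadCapability:
    """An unforgeable token granting read access to one piece of data."""
    def __init__(self, data):
        self._data = data

    def read(self):
        return self._data

def run_untrusted(caps):
    # The untrusted code sees only the capabilities passed in; private
    # data it was not given simply does not exist from its point of view.
    return [c.read() for c in caps]

public = ReadCapability("public report")
secret = ReadCapability("private key material")

# The untrusted code is granted only the public capability;
# it has no way to name, probe for, or reach the secret one.
result = run_untrusted([public])
```

Contrast this with ambient-authority designs (e.g. a global filesystem namespace), where any code can at least attempt to open anything and adapt based on what it learns.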

Tampering with data

When you have write access to some data, you can tamper with it and breach the integrity of the system.

These are by far the most malicious things one can do with a virus. Definitely evil. Definitely difficult to do, too, since detecting what applies and preparing for action without getting caught -- in a fully automated program that doesn't know much in advance about its target systems, and with little to no interaction -- is quite a feat.

Counter-measures: the same access-control counter-measures as for breaches in privacy can be extended to deal with breaches in integrity. Additionally, application-level write access to data that matters to users should as much as possible take place in reversible ways: modifications should be logged together with the previous unmodified state, so as to be able to override mischievous modifications and undo any bad effect. In other words, user-level system data should be stored in a "monotonic" knowledge base, where the available information increases but is never deleted (or at least, not until it has been safely backed up, and not without explicit user consent). Considering the price of storage hardware and the input rate of a user with keyboard and mouse, this could be done systematically for any document entered at a typical console, and possibly also for typical uses of audio documents, pictures, etc.
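A minimal sketch of such a monotonic store, in Python (all names hypothetical): every write appends a new version rather than overwriting, so a malicious modification can always be rolled back, and the tampering itself stays on record.

```python
# Sketch of a "monotonic" user-data store: information only accumulates,
# so bad modifications are reversible. Names are hypothetical.

class MonotonicStore:
    def __init__(self):
        self._log = {}          # key -> list of successive values

    def write(self, key, value):
        self._log.setdefault(key, []).append(value)

    def read(self, key):
        return self._log[key][-1]        # latest version

    def history(self, key):
        return list(self._log[key])      # nothing is ever deleted

    def revert(self, key, steps=1):
        # Undoing is itself an append, so even the undo leaves a trace.
        self.write(key, self._log[key][-1 - steps])

store = MonotonicStore()
store.write("report.txt", "original text")
store.write("report.txt", "tampered text")   # the virus strikes
store.revert("report.txt")                   # the user undoes the damage
```

Note that `revert` does not erase the tampered version: the full history remains available as evidence for whoever investigates the incident.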

Summary of malicious effects

Note that the difficulty of achieving these malicious effects does not change very much between a proprietary software system and a free software system. However, the diversity of free software systems competing on security, as well as the thousands of details to adapt to, certainly makes it harder for virus developers to write a virus that can adapt to all the varying conditions met in the diverse systems they will have to target, as opposed to the uniform conditions imposed by the monocultures that proprietary software systems are. See Understanding the World below.

Counter-measures: To control access to resources in general, fine-grained quotas can help refine detection, at the expense of additional management: per-user, per-session, per-program and per-process quotas and access control, resource-level and application-level filters; some of them might be temporarily passed on as capabilities, but reclaimed by the quota-owner and terminated with it. Additionally, all user-produced modifications should hopefully be stored on permanent storage media. Dynamic resource allocation and supervision tools will ultimately have to be developed that will automatically manage normal growth in resource usage and detect suspicious patterns. Actually, the same techniques that must be used to prevent virus attacks are only a natural elaboration of routine techniques that allow for safe and efficient computing in general, and ought to become available in standard system administration tools and development environments.

Stealth

In a secure system, viruses don't often get the opportunity to propagate or to execute malicious code. And when they do, they can be busted quickly, which stops further propagation. Until they can reach the climax of their life and die, they must survive, and thus avoid detection by the immune systems of target computer systems (intrusion-detection software, watchful system administrators, attentive users). The main stealth techniques for going undetected can be divided into Polymorphism, Discretion and Track-Covering.

Polymorphism

Polymorphism avoids recognition of the virus in its inactive forms by simple pattern-matching techniques, thanks to the introduction of randomness and complexity in the virus's appearance. Lack of polymorphism makes a virus easy to recognize once first identified.

Counter-measures: There can be an arms race of pattern recognition and code obfuscation between viruses and virus-detection programs; but viruses have the initiative in this regard, and current viruses will forever remain undetectable by the current generation of virus detectors. Moreover, if virus designers get really serious, their viruses will become completely undetectable by any affordable pattern-recognition technique.

Thus, while reactive anti-virus software can be a complement to security, and a quick-and-dirty audit tool for the time being, it cannot constitute an effective barrier against security breaches. People responsible for computer security should not rely on it, and should instead take proactive measures, starting with sensible access control.

Discretion

Discretion is about avoiding recognition of the virus in its active forms by people or programs monitoring running processes. It works by altering infected programs' behaviour only in subtle ways that are difficult to observe, and sometimes even impossible to observe with the coarse-grained tools most commonly used for monitoring (e.g. ltrace or strace).

Counter-measures: The arms race here will be in the development of high-level, application-specific filters and logging. That is, system administrators must also develop a better model of what normal interaction is, so as to detect unusual patterns. Well-understood high-level patterns of interaction, of course, should be directly encoded in programming languages, which then makes it possible to enforce the disciplined adherence of system behaviour to such high-level patterns. It then becomes possible to control system access, and to log and monitor authorized and unauthorized accesses at a more synthetic level, which makes it both easier for the administrator to assess and more difficult for the attacker to mimic. Of course, such evolution toward higher-level programming languages with system-enforced invariant testing and enforcement ought to be the general direction of software development at large.
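An application-level behaviour filter of this kind can be sketched very simply (the names NORMAL_BEHAVIOUR and monitored are hypothetical): each program declares the high-level operations it normally performs, and anything outside that declared pattern is denied and logged for the administrator to review.

```python
# Sketch of an application-specific behaviour filter: deviations from a
# program's declared high-level pattern are flagged. Names are hypothetical.

NORMAL_BEHAVIOUR = {
    "mail-agent": {"read-mailbox", "send-mail"},
    "text-editor": {"read-document", "write-document"},
}

audit_log = []

def monitored(program, operation):
    """Allow an operation only if it fits the program's declared pattern."""
    allowed = NORMAL_BEHAVIOUR.get(program, set())
    if operation not in allowed:
        audit_log.append((program, operation))   # unusual: deny and flag
        return False
    return True

monitored("mail-agent", "send-mail")       # normal, passes silently
monitored("mail-agent", "write-document")  # unusual for a mail agent: logged
```

Because the log records operations at this synthetic level ("a mail agent tried to write a document") rather than as raw system calls, it is far easier for a human administrator to spot the anomaly.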

Track-Covering

Track-Covering alters the very means of detection of the immune system, so as to confer invisibility on behaviours that would otherwise be quickly noticed. It consists in using the powers of the infected binaries, libraries and kernel to make their alterations invisible to someone inspecting the system with tools that are themselves infected.

Counter-measures: Once a virus has been activated without triggering immediate detection, it is often too late to save the infected system. However, once again, the system-wide enforcement of a high-level discipline of abstract system behaviour can help limit the damage (particularly if data that matters to users is stored in monotonic databases), and also help track culprit processes. This means that systems in the future ought never to execute potentially unsafe code except in high-level virtual machines (possibly compiled to machine code and cached, but subject to strict high-level abstract semantics).

All in all, and not so surprisingly, we find that high-level abstract machines are a useful tool for white-hat hackers as well as for black-hat hackers; and here, the good guys have the initiative, as systems will hopefully emerge that adopt safe virtual machines as compulsory protection against malicious (or merely buggy) software.

Architecture

The previous section on stealth was about negative design constraints on a virus: what a virus cannot do, how it cannot do things, and within what limits it can live and survive. It remains to see what a virus can do and how it could be organized, so as to be able to replicate and do other useful things during its lifespan. That's what architecture is about.

Multilayered Security

It is a standard security procedure to provide failover modes of execution for systems that must resist attacks. Somehow, the same applies to viruses, which have to resist understanding by potential anti-virus developers. The idea is that the virus, like any kind of robust software, must be designed around several levels of behaviour, each with its own security requirements, so that if one security measure is defeated, there remain other fallback measures for the attacker to defeat before getting to the heart of the system.

Counter-measures: To resist the attack of viruses designed in multiple layers of security, system designers will themselves have to design system security architecture in multiple layers. So as to prevent malicious programs from selectively triggering their behaviour, systems will screen any information on which a mischievous selection of behaviour could be based. Unless specifically meant to access such information, software modules should not be able to distinguish a "normal" failure from an access violation in their own operation or the operation of a peer component. Introspection of access rights should be forbidden by default; indirect introspection through the gathering of system statistics and similar data should be discouraged, too. Honey pots and other kinds of fool's gold should be systematically set up so as to catch suspicious behaviour -- and the absence of introspection should prevent the virus from avoiding discovery in such a setting. The virus developer, too, can be made to feel unsafe about being discovered, fast. Access rights designed for fault containment should also prevent complete system takeover once malicious software is executed in one place. Special cryptographic keys unlocked by a password prompted on a secure system menu should be required to modify the kernel and other security-sensitive configuration, for instance. Finally, the layered use of virtual machines that ensure execution according to a proper semantic model at each level of the system minimizes the edge for dangerous behaviour from malicious or spurious programs.
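The "screen any information" principle can be illustrated with a tiny sketch (the names Unavailable and fetch are hypothetical): a caller is never told whether an access failed because the object is absent or because it is forbidden, so a probing virus cannot map out the access-control policy -- and cannot tell a honey pot from the real thing.

```python
# Sketch of uniform failure reporting: access violations and plain absence
# look identical to the caller. Names are hypothetical.

class Unavailable(Exception):
    """The only failure a caller ever sees, whatever the real cause."""

FILES = {"readme": "hello", "shadow": "root-password-hash"}
PROTECTED = {"shadow"}

def fetch(name):
    # Both branches raise the same exception with the same shape, so the
    # caller learns nothing about why the access failed.
    if name in PROTECTED or name not in FILES:
        raise Unavailable(name)
    return FILES[name]
```

A virus that tries `fetch("shadow")` and `fetch("no-such-file")` gets exactly the same answer for both, and so cannot adapt its behaviour to the protection it is facing.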

Targets

The whole structure of the virus will be articulated around its various modes of execution, as determined by what targets it has already infected, and what targets it will be trying to infect next.

Counter-measures: Basic infrastructure such as the kernel, compiler, communication layers, user interface, etc., should be properly isolated from tampering. Use of unauthorized customized versions should typically happen within virtual machines that isolate the rest of the system from the risky activity. Using infected virtual machines as honey pots, it is easier to gather enough information on the attacker to successfully crack down on him. Automatic audits by secure parts of the system, based on checksums of sensitive code and data stored on read-only media, can help assess access-rights violations. Modification of pervasive configuration scripts, as well as of binaries, should be undertaken with extreme care, and the effects of unaudited modifications should be sandboxed by default until properly validated. Programs being distributed, the perfect target for malicious hackers, can be subject to code review; interfaces and access-rights policies can be cryptographically signed by many reviewers, independently from the code itself, each release of which needs only be signed by its author. Ultimately, this means that software will be developed using high-level programming languages with a static and/or dynamic type system that can express constraints on side-effects.
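The checksum-based audit is easy to make concrete. Below is a minimal Python sketch (the baseline dictionary and audit function are illustrative, not a real tool): checksums of sensitive files are recorded while the system is known-good, ideally on read-only media, and a later audit recomputes them and reports any divergence.

```python
# Sketch of an automatic integrity audit against a known-good baseline.
# The file names and helper names are illustrative only.
import hashlib

def checksum(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

# Baseline recorded when the system was known-good (stored read-only):
baseline = {
    "/bin/login": checksum(b"original login binary"),
    "/bin/ls": checksum(b"original ls binary"),
}

def audit(current_contents):
    """Return the paths whose contents no longer match the baseline."""
    return [path for path, data in current_contents.items()
            if checksum(data) != baseline.get(path)]

# Suppose /bin/ls has been tampered with since the baseline was taken:
tampered = audit({
    "/bin/login": b"original login binary",
    "/bin/ls": b"infected ls binary",
})
```

The crucial point is that the baseline lives on media the virus cannot write to, and the auditing code itself runs from a trusted part of the system; otherwise track-covering (see above) would defeat the audit.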

Understanding the World

As we saw above over and over, a crucial part of a successful virus is that it have a good enough model of the world: the various protection mechanisms it will have to defeat, the current environment in which it is running, etc. Actually, whereas all the features discussed above are "mere" engineering, which any dark-side hacker or team of hackers could easily perform given enough dedication, the crux of a virus -- what will make it successful or not -- will be its ability to adapt to the hostile world in which it is let loose. And when having to fight the security defenses of the ultimate target sought, that is, the light-side hackers who make and distribute free software, having lots of well-oiled mechanisms won't be enough without the sense to use the right one at the right moment.

All in all, we saw that a really adaptive virus would have to be based on an expert system with a dynamic knowledge base that describes both all the relevant security practices and all the relevant software development and distribution practices, so that the virus can devise sensible strategies. Interestingly, if such an expert system existed, it could be used by "white-hat hackers" to build more robust systems impervious to virus attacks.

Counter-measures: Just as malicious software is ultimately based on an understanding of what bad behaviour can be done without being recognized as bad by system defenses, system defenses are based on an understanding of what is legitimate software behaviour and what is malicious behaviour. Ultimately, virus development is an arms race between malicious and legitimate programmers; but as long as legitimate developers have initial control of their own machines, they have the initiative in taking measures to defend their computer systems. One notable pitfall to avoid is the paranoia by which system administrators fail to understand that some behaviour is legitimate and prohibit it: either they will succeed in the prohibition, and destroy part of the utility of the computer systems, or they will fail, and open new vulnerabilities in their systems as users develop work-arounds for ill-adapted administrative policies. In the end, computers are tools to satisfy users, and administrators are there to serve users, not to impede their work. When system administrators fall into that pitfall, malicious developers win. (Just as when governments destroy the liberties of law-abiding citizens in response to bombings by terrorist outlaws.) Instead, legitimate users, developers, etc., must cooperate in defining and refining more efficient, higher-level, more automated uses of their computer systems -- and thus keep the lead in the race toward better computerized understanding of the world.

There is no way in which computer security experts can prevent virus developers from picking a good architecture. What we can do is develop computer environments that provide no nutrients for viruses to grow on: make it so that a virus developer must handle extreme complexity to achieve very little gain in terms of virus survival. We can starve viruses out, if we are proactive in administering what programs can do in general, so as to trap viruses as a (very) particular case. Note then how we're back to the initial problem discussed in the previous article: free software developers can afford a policy on access rights for installing new software that is both tighter and easier to adapt dynamically than what proprietary software developers can afford, which is why free software is intrinsically more secure.

Conclusion

For a virus to be successful in surviving and doing malicious things in current and future free software operating systems, it would require quite an amount of proficiency, work, and tenacity. Making a virus that can robustly scale to a large range of situations is a particularly hard instance of the problem of making a robust and adaptive piece of software, with the additional constraint that it is hardly possible, if at all, to upgrade, fix or patch the software after it has been let loose; whereas those who attack the virus, though they start without much information, will be able to disassemble it, test it in laboratory conditions, exchange information, grow new intrusion-detection techniques, learn better habits, etc.

With a fraction of the work invested in building a really robust virus, the person or group of developers able to build it could get rich and famous at developing actually useful software: writing compilers or decompilers, doing security consultancy, developing copy-protection schemes, growing expert systems, engineering large projects, teaching software design, building systems that automate tasks currently done by human administrators. In contrast, as far as infecting proprietary systems goes, viruses need be neither robust nor scalable, neither stealthy nor architected around multiple layers: a dedicated teenager can write one that will spread all over the world. Meanwhile, the overhead cost of entry before one can do useful, worthwhile work is very low in free software communities, whereas it is very high in proprietary software communities. These economic considerations again explain why viruses are such a constant nuisance with proprietary systems, whereas it is unlikely that they will ever be much of a danger with free software systems.

Additional Pointers

Faré -- François-René Rideau -- Ðặng-Vũ Bân