Notes for Meta Production Engineer Intern Systems Round

I received a Production Engineering intern offer from FB, and while preparing for it I could not find many resources for the systems interview round. Hence, I decided to create my own preparation material. If you find any errors, feel free to send a PR to improve the notes. I found these questions from this blog and I referred to the OS book by Tanenbaum.

There are two rounds for the PE internship: the first is the coding round and the second is the systems round. In the coding round, the questions are at an easy-to-medium LeetCode level. In the systems round, the main focus is OS knowledge, with particular emphasis on Linux-based concepts. The following are the questions I prepared; try to get an in-depth understanding of how and why each mechanism is designed the way it is.

What happens during the boot process from the moment you turn on the machine until you get a login prompt?

  • The first step is powering up the motherboard, the BIOS (EPROM) chip, RAM, and other devices.
  • The BIOS performs POST and identifies the various I/O devices (disks, keyboard, screen).
  • Handoff from BIOS: the boot disk is picked (it could be a bootable external disk or USB drive).
  • The MBR is read from the first sector of the boot disk and its boot code is executed.
  • The metadata of all 4 partitions is read and the “active” partition is selected.
  • The MBR loads the first 512 bytes of the active partition into memory and instructs the CPU to execute them.
  • The very first (three) bytes of the partition boot sector contain a single JMP instruction, telling the CPU to skip xx bytes ahead and execute the next stage of the bootloader from there.
  • The CPU follows the JMP instruction, seeks to the beginning of the bootstrap code contained within the partition boot sector, and starts to execute it.
  • That code looks up a file stored on the partition itself (as a regular file) and tells the CPU to execute its contents to begin the final part of the boot process.

The GRUB configuration file is /boot/grub/grub.conf (/etc/grub.conf is a link to it). The following is a sample grub.conf from CentOS.

#boot=/dev/sda
default=0
timeout=5
splashimage=(hd0,0)/boot/grub/splash.xpm.gz
hiddenmenu
title CentOS (2.6.18-194.el5PAE)
          root (hd0,0)
          kernel /boot/vmlinuz-2.6.18-194.el5PAE ro root=LABEL=/
          initrd /boot/initrd-2.6.18-194.el5PAE.img

As you can notice from the above, the configuration contains the kernel and the initrd image.
  • So, in simple terms, GRUB just loads and executes the kernel and initrd images. To set up the user environment, the kernel executes the /sbin/init program.
  • When the init command starts, it becomes the parent or grandparent of all of the processes that start up automatically on the system. First, it runs the /etc/rc.d/rc.sysinit script, which sets the environment path, starts swap, checks the file systems, and executes all other steps required for system initialization. For example, most systems use a clock, so on them rc.sysinit reads the /etc/sysconfig/clock configuration file to initialize the hardware clock. Another example is if there are special serial port processes which must be initialized, rc.sysinit will execute the /etc/rc.serial file.
  • After the runlevel is selected, init starts the services configured for that runlevel.
  • The getty program is started, which presents the login prompt; /sbin/login checks the password.

References:

  • Chapter 10, section 1 from the OS book
  • MBR blog

What happens in Linux, on a kernel level, when you type in ls -l?

  • Bash first searches its alias list for ls. Assuming there is no alias, the shell searches for the binary in the $PATH directories.
  • ls will be found in /bin/.
  • The shell calls fork to create a child process, then execve to run the executable it found, and then waitpid to wait for the child to finish (the parent shell process is in a blocked state). A minimal sketch of this sequence follows after this list.
  • The child (ls) runs the stat/lstat system calls to list the file details.
  • Once the child finishes, the parent is signalled and waitpid returns, so the shell resumes.
  • Note: the signal sent to the parent when a child terminates is SIGCHLD.
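A minimal sketch of the fork/execve/waitpid sequence the shell goes through (assuming a Linux-like host; /bin/ls -l is just the example command):

import os

pid = os.fork()
if pid == 0:
    # Child: replace this process image with /bin/ls -l
    os.execv("/bin/ls", ["ls", "-l"])
else:
    # Parent (the "shell"): block until the child terminates
    _, status = os.waitpid(pid, 0)
    print("child exited with status", os.WEXITSTATUS(status))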

How would you troubleshoot a system in which you are not able to start an application on the server?

  • Check whether you have permission to execute it.
  • Check the logs, if any.
  • Check whether the application is compiled for your architecture: use objdump or the file utility and compare with your processor information from uname -a.
  • If it is dynamically linked, check that the interpreter and shared libraries it needs are present on the system (ldd lists them). A small helper sketch follows below.
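A small Python sketch of these checks (assuming the file, uname, and ldd utilities are available on the host; /bin/ls is just an example path):

import os
import subprocess

def check_binary(path):
    # Do we have execute permission?
    print("executable:", os.access(path, os.X_OK))
    # Binary format and target architecture vs. the machine we are running on
    print(subprocess.run(["file", path], capture_output=True, text=True).stdout.strip())
    print("machine:", subprocess.run(["uname", "-m"], capture_output=True, text=True).stdout.strip())
    # Dynamic dependencies; anything missing shows up as "not found"
    print(subprocess.run(["ldd", path], capture_output=True, text=True).stdout.strip())

check_binary("/bin/ls")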

If you had a program that needed 1TB of RAM and you only have 16GB, how would the Linux system allocate memory?

  • This is handled by paging and virtual memory.
  • Not all of the memory is loaded at once; initially only the text and data sections are loaded.
  • Further pages are allocated on demand, as they are requested (see the demand-paging sketch below).
  • The TLB is used to translate virtual addresses (VA) to physical addresses (PA); if the translation is not cached, a TLB miss occurs and the page tables are walked.
  • A page fault happens if the page is not present in physical memory; the kernel then brings it in, evicting (and possibly swapping out) other pages when RAM is full. If memory plus swap is exhausted, the OOM killer may terminate the process.
  • Extra Info: understand why virtual memory, paging, and multi-level paging are required. They each solve a different problem.
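A small demonstration of demand paging (a sketch assuming a Linux host): the virtual reservation is large, but resident memory (RSS) only grows as pages are actually touched.

import mmap
import resource

# Reserve 1 GiB of anonymous virtual memory; no physical pages back it yet
buf = mmap.mmap(-1, 1 << 30)

def rss_mib():
    # ru_maxrss is reported in KiB on Linux
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024

print("after mmap : RSS ~ %.0f MiB" % rss_mib())

# Touch one byte per 4 KiB page in the first 100 MiB; each touch page-faults a page in
for off in range(0, 100 << 20, 4096):
    buf[off] = 1

print("after touch: RSS ~ %.0f MiB" % rss_mib())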

What is the default signal that is generated when sending a kill command to a process in Linux?

SIGTERM (signal termination): it allows a process to shut down cleanly. Extra Info: kill -9 sends SIGKILL (which cannot be caught) and stops the process immediately. The signals SIGKILL and SIGSTOP cannot be caught, blocked, or ignored. Ctrl+C sends SIGINT (signal interrupt).

Example of SIGINT handler doing cleanup:

import signal, socket, sys

server_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# Signal handler for SIGINT
def sigint_handler(signum, frame):
    print("\nReceived SIGINT (Ctrl+C). Exiting gracefully.")
    # Perform cleanup tasks here: close the server socket
    server_socket.close()
    print("Server socket closed. Exiting.")
    sys.exit(0)

# Register the handler so Ctrl+C triggers the cleanup
signal.signal(signal.SIGINT, sigint_handler)

How do you determine if a drive is full?

Check disk usage and inode usage: df -h and df -i

What does "$?" mean in bash?

test.sh

#! /bin/sh
echo '$#' $#
echo '$@' $@
echo '$?' $?

Sample Output

> ./test.sh 1 2 3
$#  3
$@  1 2 3
$?  0

You passed 3 parameters to your script.

  • $# = the number of arguments. Answer is 3
  • $@ = the parameters that were passed. Answer is 1 2 3
  • $? = the exit status of the last command. Answer is 0, which means it succeeded

What does the command uname -r do?

It prints the kernel release. From the man page:

-s, --kernel-name
        print the kernel name
-n, --nodename
        print the network node hostname
-r, --kernel-release
        print the kernel release
-v, --kernel-version
        print the kernel version
-m, --machine
        print the machine hardware name
-p, --processor
        print the processor type (non-portable)
-i, --hardware-platform
        print the hardware platform (non-portable)
-o, --operating-system
        print the operating system

What part of the tcp header does traceroute modify?

  • Traceroute increments the TTL field from 1 up to a maximum (typically 30) until the destination is reached. Strictly speaking, the TTL lives in the IP header, not the TCP header. A sketch of the TTL manipulation follows below.
  • Extra Info: classic UDP-based traceroute sends its probes to a very high, unlikely-to-be-open destination port, so the destination replies with an ICMP "port unreachable" error, which marks the end of the trace.
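The following is a minimal sketch of the TTL manipulation only (the hostname and starting port are just examples; reading the ICMP time-exceeded replies that identify each hop requires a raw socket and root privileges, which is omitted here):

import socket

dest = "example.com"   # example destination
port = 33434           # traceroute's traditional starting UDP port

for ttl in range(1, 31):
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Each probe carries a larger TTL, so it expires one hop further away
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TTL, ttl)
    sock.sendto(b"", (dest, port))
    sock.close()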

What's a zombie process?

A zombie is a child process that has completed but has not yet been reaped by its parent (i.e. the parent has not called wait/waitpid on it); it still exists as an entry in the process table. Zombies are dead processes that consume no memory or CPU, only a process-table slot.
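A quick way to produce one for inspection (a sketch; the 30-second sleep just gives you time to look at the process table):

import os, time

pid = os.fork()
if pid == 0:
    os._exit(0)          # child exits immediately
else:
    # Parent deliberately delays reaping, so the child shows up as <defunct>
    print("child pid:", pid, "- run 'ps aux | grep defunct' in another terminal")
    time.sleep(30)
    os.waitpid(pid, 0)   # reaping removes the zombie entry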

How do you load a Linux kernel / How do you load a module into the Linux kernel?

Loading/unloading a kernel module: to load a kernel module, use the insmod utility, which takes as a parameter the path to the *.ko file into which the module was compiled and linked. Unloading the module from the kernel is done with the rmmod command, which takes the module name as a parameter. (modprobe can also load a module by name and resolves its dependencies.)

$ insmod module.ko
$ rmmod module


What's a signal and how is it handled by the kernel?

The kernel delivers the signal (identified by its number) to the process: it checks whether the signal can be caught or ignored at all (SIGSTOP and SIGKILL cannot), then checks whether the process has registered a handler for it; if not, the default action is performed.

What happens when a signal arrives?

When a signal is about to be delivered, one of the following default actions takes place, depending on the signal:

  • The signal is ignored, i.e., it is discarded by the kernel and has no effect on the process. (The process remains unaware that the event had even occurred.)
  • The process is terminated, a.k.a. abnormal process termination, as opposed to normal process termination that occurs when the program terminates using exit().
  • A core dump file is generated and the process is terminated.
  • The execution of the process is suspended or resumed.


Extra Info: interrupts vs. signals: an interrupt is hardware asking for the CPU's attention, whereas a signal notifies a process of some event.

What is the swap area, regarding memory?

  • A portion of hard disk space reserved to page out inactive pages.
  • Paging files can be added and removed dynamically and each one has a priority. Paging to a separate partition, accessed as a raw device, is more efficient than paging to a file for several reasons. First, the mapping between file blocks and disk blocks is not needed (saves disk I/O reading indirect blocks). Second, the physical writes can be of any size, not just the file block size. Third, a page is always written contiguously to disk; with a paging file, it may or may not be.

In IPv6 what is the A record equivalent?

AAAA record

Write a script that connects to 100 hosts, looks for a particular process and sends an email with a report

# Using a thread pool in Python
import concurrent.futures
import smtplib
import ssl
import subprocess
import time

def getHosts(filename):
    # One host per line in the file
    with open(filename, 'r') as f:
        return [line.strip() for line in f]

map_pinfo = dict()

def getPinfo(host):
    # Look for the target process (java here) on the remote host over ssh
    result = subprocess.run(["ssh", host, "ps aux | grep java"],
                            shell=False,
                            stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE,
                            check=False)
    return result.stdout.decode()

hosts = getHosts("hosts.txt")

with concurrent.futures.ThreadPoolExecutor(max_workers=2) as executor:
    future_to_host = {executor.submit(getPinfo, host): host for host in hosts}
    for future in concurrent.futures.as_completed(future_to_host):
        host = future_to_host[future]
        try:
            data = future.result()
        except Exception as exc:
            print('%r generated an exception: %s' % (host, exc))
        else:
            print('Completed host: %s' % host)
            map_pinfo[host] = data

# Sending mail now
now = time.ctime()
message = """Subject: Process report

Date: {now}
Data: {data}"""

from_address = "my@gmail.com"
to_address = "ops-report@example.com"  # placeholder recipient
password = input("Type your password and press enter: ")  # or read from a config file

host_data = ''
for key in map_pinfo:
    host_data += "Host: %s data: %s\n" % (key, map_pinfo[key])

context = ssl.create_default_context()
with smtplib.SMTP_SSL("smtp.gmail.com", 465, context=context) as server:
    server.login(from_address, password)
    server.sendmail(
        from_address,
        to_address,
        message.format(now=now, data=host_data),
    )

How does strace work?

  • Strace works by using the ptrace system call which causes the kernel to halt the program being traced each time it enters or exits the kernel via a system call. The tracing program (in this case strace) can then inspect the state of the program by using ptrace. Strace gets the arguments to each system call depending on how that system works. On x86-64 systems, arguments to system calls are passed in CPU registers. In this case, strace can call ptrace with the PTRACE_GETREGS argument to get a copy of the register values and print them.
  • Ministrace

How do you see which disks are currently mounted?

  • df (shows mounted filesystems and their usage)
  • lsblk (shows block devices and their mount points)
  • mount with no arguments, or cat /proc/mounts
  • findmnt

List the ways to catch a signal for a program that you don't have source code for

  • strace -e 'trace=!all' ./program — filters out all system calls so only the signals received are printed
  • strace -p <pid> — attach to an already-running process
  • gdb can also attach to a running process and stop on signals

Explain containerization

  • A good blog
  • Containerization is an isolation technology which isolates a process at multiple levels through namespaces (see the sketch after this list):
    • mnt namespace provides a root filesystem (this one can be compared to chroot I guess)
    • pid namespace so the process only sees itself and its children
    • network namespace which allows the container to have its dedicated network stack
    • user namespace (quite new) which allows a non root user on a host to be mapped with the root user within the container
    • uts provides dedicated hostname
    • ipc provides dedicated shared memory
  • All of this adds more isolation than chroot provides
  • Linux Namespaces helps to provide an isolated view of the system to each container; this includes networking, mount points, process IDs, user IDs, inter-process communication, and hostname settings.
  • Cgroups (control groups) is a Linux kernel feature that provides resource isolation: it allows you to allocate system resources, such as CPU, memory, disk I/O, and network bandwidth, to different processes or groups of processes. This ensures that containers, or processes within them, receive their allocated resources without impacting other containers or system processes. Docker leverages cgroups to set resource limits, monitor usage, and enforce resource-allocation policies.
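A tiny namespace demonstration (a sketch, assuming Python 3.12+ on Linux and root privileges; the hostname is arbitrary): the process moves into a new UTS namespace and changes its hostname without affecting the host.

import os
import socket

os.unshare(os.CLONE_NEWUTS)            # detach into a new UTS namespace
socket.sethostname("container-demo")   # only visible inside this namespace
print("hostname inside namespace:", socket.gethostname())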

Daemonize a Linux process

os.setsid()

  • When a process is running in a terminal, it is typically associated with a session and a controlling terminal. A session is a group of processes that share certain attributes, such as process group ID and controlling terminal. The controlling terminal is responsible for input/output operations and signals sent to the terminal.
  • By calling os.setsid(), the process becomes the leader of a new session, breaking away from the existing session and its controlling terminal. The setsid() function performs the following actions:
    • The process becomes the session leader: The calling process becomes the leader of a new session. This means it receives a new process group ID.
    • The process becomes detached from the controlling terminal: The process detaches from its current controlling terminal, effectively dissociating itself from any terminal session.

Redirecting standard file descriptors (stdin, stdout, stderr)

It is required when daemonizing a process to ensure that the process is detached from the controlling terminal and does not interact with the user's session.

  • Detaching from the Terminal: When a process is daemonized, it needs to detach from the controlling terminal to run independently in the background. By redirecting the standard file descriptors, the daemon process ensures that it does not read input from or write output to the terminal.
  • Preventing hang-ups (SIGHUP, "signal hang-up", is the signal typically sent to processes in a terminal's session to inform them of the termination of the controlling terminal or session leader): because the daemon has already detached from its controlling terminal via setsid(), it does not receive SIGHUP when the terminal session is closed, and redirecting the standard file descriptors ensures that its reads and writes do not fail once the terminal is gone, so it keeps running uninterrupted. A combined daemonization sketch follows below.
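Putting the pieces together, a classic double-fork daemonization sketch (assuming Linux, and that /dev/null is where the daemon's standard I/O should go):

import os
import sys

def daemonize():
    # First fork: the parent exits, so the child is not a process-group leader
    if os.fork() > 0:
        sys.exit(0)
    os.setsid()                      # new session, no controlling terminal
    # Second fork: the session leader exits, so the daemon can never reacquire a tty
    if os.fork() > 0:
        sys.exit(0)
    os.chdir("/")
    os.umask(0)
    # Redirect stdin, stdout, and stderr to /dev/null
    devnull = os.open(os.devnull, os.O_RDWR)
    for fd in (0, 1, 2):
        os.dup2(devnull, fd)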

How do you make a process a service?

By creating a service file for it: an init.d script or, on modern distributions, a systemd unit file.
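A minimal systemd unit as an illustration (the description and binary path are placeholders):

[Unit]
Description=My example daemon

[Service]
ExecStart=/usr/local/bin/mydaemon
Restart=on-failure

[Install]
WantedBy=multi-user.target

Place it at /etc/systemd/system/mydaemon.service, then enable and start it with systemctl enable --now mydaemon.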

System Call and context switch

How can you find whether a process is I/O bound or CPU bound?

  • iostat: high %iowait and high device utilization suggest the process is I/O bound
  • htop / top: consistently high CPU usage suggests CPU bound, while a process sitting in the D (uninterruptible sleep) state suggests I/O bound
  • vmstat: check the wa (I/O wait) column, described in the vmstat section below

What is a filesystem and how does it work?

  • Read: https://tldp.org/LDP/sag/html/filesystems.html. The central concepts are:
  • Superblock: The superblock contains information about the filesystem as a whole, such as its size
  • Inode: An inode contains all information about a file, except its name. The name is stored in the directory, together with the number of the inode.
  • Data block: The inode contains the numbers of several data blocks, which are used to store the data in the file. There is space only for a few data block numbers in the inode, however, and if more are needed, more space for pointers to the data blocks is allocated dynamically. These dynamically allocated blocks are indirect blocks; the name indicates that in order to find the data block, one has to find its number in the indirect block first.
  • Directory block: a directory entry consists of a filename and the number of the inode which represents the file. A small inode-inspection sketch follows below.
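The inode metadata is visible from user space via stat; for example (a sketch; /etc/hostname is just an example path):

import os

st = os.stat("/etc/hostname")
print("inode :", st.st_ino)      # inode number (the name lives in the directory entry)
print("links :", st.st_nlink)    # number of hard links pointing at this inode
print("size  :", st.st_size)     # file size recorded in the inode
print("blocks:", st.st_blocks)   # 512-byte blocks allocated for the data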

How does a system call go from user to kernel space?

Steps:

  • The C compiler uses a predefined library of functions that have the names of the system calls, thus resolving the system call references in the user program to what would otherwise be undefined names.
  • The library stub looks in the system call table to find the system call entry number for the invoked call.
  • It determines the number of parameters being passed to the system call.
  • All the parameters are copied from user space to the u-area.
  • The current context of the process is saved for the return.
  • The operating system trap is then invoked (the int 0x80 instruction on Intel) to switch from user mode to kernel mode. With this instruction the system call number and parameters are passed to the kernel in specific registers or on the stack.
  • From these, the kernel determines which system call user space is invoking.
  • The kernel calculates the (user) address of the first parameter to the system call by adding (or subtracting, depending on the direction of stack growth) an offset to the user stack pointer, corresponding to the number of parameters to the system call.
  • The kernel then calls the appropriate system call routine, i.e. the trap/interrupt handler.
  • After executing the system call routine, the kernel determines whether there was an error:
    • if yes: it sets the “carry” bit in the PS register and copies the error number into the register 0 location
    • if no: it clears the carry bit in the PS register and copies the return value of the system call into registers 0 and 1
  • Finally, the value is returned to the user program. A small demonstration of invoking a system call directly follows below.
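To make the library-stub step concrete, the call below traps into the kernel through libc's generic syscall() wrapper instead of the usual getpid() stub (a sketch; syscall number 39 is an x86-64 Linux assumption):

import ctypes
import os

libc = ctypes.CDLL(None, use_errno=True)
SYS_getpid = 39   # x86-64 Linux syscall number for getpid (architecture-specific)

# Trap into the kernel directly via the generic syscall() wrapper
print("syscall(39):", libc.syscall(SYS_getpid))
# Same result through the normal library stub
print("os.getpid():", os.getpid())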

VMstat

Sample output:

[user@fedora9 ~]$ vmstat 1 5
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 3  0      0  44712 110052 623096    0    0    30    28  217  888 13  3 83  1  0
 0  0      0  44408 110052 623096    0    0     0     0   88 1446 31  4 65  0  0
 0  0      0  44524 110052 623096    0    0     0     0   84  872 11  2 87  0  0
 0  0      0  44516 110052 623096    0    0     0     0  149 1429 18  5 77  0  0
 0  0      0  44524 110052 623096    0    0     0     0   60  431 14  1 85  0  0             

Fields (from the man page):

Procs
    r: The number of processes waiting for run time.
    b: The number of processes in uninterruptible sleep.
Memory
    swpd: the amount of virtual memory used.
    free: the amount of idle memory.
    buff: the amount of memory used as buffers.
    cache: the amount of memory used as cache.
    inact: the amount of inactive memory. (-a option)
    active: the amount of active memory. (-a option)
Swap
    si: Amount of memory swapped in from disk (/s).
    so: Amount of memory swapped to disk (/s).
IO
    bi: Blocks received from a block device (blocks/s).
    bo: Blocks sent to a block device (blocks/s).
System
    in: The number of interrupts per second, including the clock.
    cs: The number of context switches per second.
CPU
    These are percentages of total CPU time.
    us: Time spent running non-kernel code. (user time, including nice time)
    sy: Time spent running kernel code. (system time)
    id: Time spent idle. Prior to Linux 2.5.41, this includes IO-wait time.
    wa: Time spent waiting for IO. Prior to Linux 2.5.41, included in idle.
    st: Time stolen from a virtual machine. Prior to Linux 2.6.11, unknown.

Sample scenarios of load, I/O, etc.

Soft link vs Hard link

Soft Link (Symbolic Link):

  • Inodes: A soft link is represented by a separate inode, distinct from the inode of the target file. The soft link inode stores the path or name of the target file.
  • File Size: The soft link file has its own size, which depends on the length of the target file's path or name.
  • Permissions: Soft links have their own permissions, which may differ from the target file's permissions. The soft link itself needs to be accessible to access the target file.
  • Modification: If the target file is moved or renamed, the soft link becomes invalid and points to a non-existent file. Symbolic links can cross file system boundaries and point to files on different file systems.

Hard Link:

  • Inodes: A hard link is created by associating a new directory entry with the same inode as the target file. In other words, hard links share the same inode as the target file, meaning they refer to the same underlying file data on the file system.
  • File Size: Hard links do not have their own size; they share the size of the target file. Any changes to the file content or metadata are reflected in all hard links.
  • Permissions: All hard links to a file have the same permissions, as they refer to the same underlying inode.
  • File Deletion: Deleting a hard link does not affect the target file or other hard links. The file's data remains accessible as long as at least one link remains.
  • File Renaming: Since hard links share the same inode, renaming the target file will not break the link. All hard links will still point to the renamed file. A short demonstration follows below.
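A short demonstration of the inode behaviour (a sketch; it works in a temporary directory so nothing on the system is touched):

import os
import tempfile

os.chdir(tempfile.mkdtemp())
with open("target.txt", "w") as f:
    f.write("hello\n")

os.link("target.txt", "hard.txt")     # hard link: new directory entry, same inode
os.symlink("target.txt", "soft.txt")  # soft link: separate inode that stores the path

print(os.stat("target.txt").st_ino == os.stat("hard.txt").st_ino)    # True: same inode
print(os.lstat("soft.txt").st_ino == os.stat("target.txt").st_ino)   # False: different inode
print(os.stat("target.txt").st_nlink)                                # 2: two hard links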

B-Tree vs B+ Tree

B-trees and B+ trees are both types of self-balancing search trees commonly used in database systems and file systems. While they share similarities, they also have key differences in terms of structure and usage. Here are the main differences between B-trees and B+ trees:

  • Node Structure:

    • B-tree: In a B-tree, each node contains both keys and pointers to child nodes or data. Keys are stored in non-decreasing order within a node.
    • B+ tree: In a B+ tree, each node contains only keys, and all keys are stored in non-decreasing order. Actual data or pointers to data are stored only in leaf nodes.
  • Leaf Node Structure:

    • B-tree: Leaf nodes in a B-tree can contain both keys and associated data.
    • B+ tree: Leaf nodes in a B+ tree contain keys and pointers to the actual data, while keys themselves serve as the primary means of indexing and searching.
  • Fanout:

    • B-tree: B-trees typically have a lower fanout (number of children per node) compared to B+ trees. This is because B-trees store data in internal nodes as well, reducing the number of keys per node.
    • B+ tree: B+ trees usually have a higher fanout since data is stored only in the leaf nodes, allowing more keys per node.
  • Data Duplication:

    • B-tree: B-trees can store duplicate keys both within internal nodes and leaf nodes, along with their associated data.
    • B+ tree: B+ trees allow duplicate keys only within leaf nodes, while the internal nodes contain only unique keys.
  • Range Queries:

    • B-tree: B-trees are well-suited for both point queries and range queries, as the data can be found in internal nodes as well as leaf nodes.
    • B+ tree: B+ trees excel at range queries since they store all the data in leaf nodes, allowing sequential access to data by traversing the leaf nodes in order.
  • Index Usage:

    • B-tree: B-trees are commonly used as index structures in databases to efficiently search for individual records.
    • B+ tree: B+ trees are widely used for indexing and are particularly efficient for range queries due to their leaf node structure.

Both B-trees and B+ trees offer efficient searching and insertion times, with the choice between them depending on the specific requirements of the application or system. B-trees are more suitable when both point and range queries are important, and duplicate keys need to be stored within internal nodes. B+ trees are preferable when range queries and efficient sequential data access are critical, and duplicate keys are allowed only within leaf nodes.
