Skip to content

Instantly share code, notes, and snippets.

@sgoyal1012
Last active August 24, 2023 09:10
Show Gist options
  • Star 4 You must be signed in to star a gist
  • Fork 2 You must be signed in to fork a gist
  • Save sgoyal1012/c650abc7a13f104066e29d8444e9432b to your computer and use it in GitHub Desktop.
Save sgoyal1012/c650abc7a13f104066e29d8444e9432b to your computer and use it in GitHub Desktop.
CheatSheet

CHEATSHEET

General -- Before every interview!!

  • YOU ARE FORGETTING TO INCREMENT INDEXES in while, DRUNK OR SOMETHING JODER!!??

  • ALWAYS THINK ABOUT STACK SPACE!!

  • C'est PAS Scala -- MUST RETURN!!

    • return statement
    • NO CASE CLASSES! --> Use a constructor!! -- new object!
  • ALARM BELLS-->

    • Every problem has some hint on the data structure, if it has something to do with efficient max/min --> heaps, divide and conquer --> Trees, in place --> quickSort
    • Use LONGS to calcualte sum of Integers!
  • Think about DP -- storing solutions!!

  • Use inbuilt functions, UNLESS explicity told to not do so..

  • Put corner case as //TODO at the top- Assert test cases yourself, write assertions

    • Use CONSTANTS.
  • Walk through code, not your algo! (Once coded)

    • ZINK of corner cases -- Like for Binary search square root, when x * x == x
  • Mark braces, such as // main beg, //main fin, //func beg etc.

  • Write util methods!!

  • ALWAYS put space over time.

  • Use if-else wisely, rather than lazy if's only.

  • Use Lombok's @getter @setter for POJO

  • Use objects if you have a pointer type requirement, eg. storing the max in a recursive call.

Recurrence algorithm derivation

https://www.youtube.com/watch?v=U7Aqf7tMOhY

Gotchas

  • NULL Checks!! --> head.next != null, should check if head != null!!!

  • For String, ALWAYS use .equals, AND NOT ==!!!!

  • Always do low + (high-low)/2; instead of high+low/2.

  • ALWAYS check the boolean order.. the expression to the left is evaluated first

  • Remember triplebyte?! Binary tree is NOT necessarily balanced!

Datatypes and Typecasting

For long, initialize like Long l = 0L;

instanceof

To check if an object is an instance of another class, example"

if (value instanceof String)

Typecasting

You can initialize a generic object map, be careful with typecasting!

HashMap<String, Object> map = new HashMap<String, Object>();

//TypeCasting here:
HashMap<String, Object> secondLevelMap = (HashMap<String, Object>) map.get(key);
String strValue = (String) map.get(key)

ACHTUNG! Be sure to add the warning for typecasting!

@SuppressWarnings("unchecked")
  • Always override the hashcode implementation of a class if you override the equals method.

Random Number between two numbers

Random r = new Random();
int Low = 10;
int High = 100;
int Result = r.nextInt(High-Low) + Low;
  • Math.min(a,b) to find the minimum of a and b.

Theory

Control

Switch - case

Can combine cases like this:

case '*':
case '/':
return 2;

Bit Manipulations

Base conversions

https://www.geeksforgeeks.org/convert-base-decimal-vice-versa/

Primer on everything

https://stackoverflow.com/questions/47981/how-do-you-set-clear-and-toggle-a-single-bit

https://www.youtube.com/watch?v=NLKQEOgBAnw

XOR-daar

Xor of a with a is zero. So if you have an array like [a,a,b,c,c], XOR-ing all elements will give you be. XOR is associative and commutative.

Two's complement representation

Negative numbers are represented using a 2's complement form. To obtain the 2's complement of a number:

  • Complement the bits
  • Add one to the result

Operator Precedence

== has higher priority than &. You might want to wrap your operations in () to specify your own priority.

ACHTUNG!!

For getting is a bit is set or not, check !=0 instead of >0 (because of negative numbers!!)

public boolean isOne(int n, int index) {
        int mask = 1 << index;
        return ((n & mask) != 0);
    }

Collections

Generics

interface Merger<T extends Comparable<T>> {

static void main(String[] args) {
  Merger<Integer> merger = new MergerImpl<Integer>();
  List<Integer> merged = merger.mergeLists(
      Arrays.asList(1, 3, 5),
      Arrays.asList(2, 4, 6)
  );
  System.out.println(merged);
}

List<T> mergeLists(List<T> l1, List<T> l2);

}
  
class MergerImpl<T extends Comparable<T>> implements Merger<T> {
  @Override
  public List<T> mergeLists(List<T> l1, List<T> l2) {
    List<T> list = new ArrayList<T>();
    return list;
  }
}

Initialization

ACHTUNG! You can ONLY use object types with Collections, no primitive types

So, new HashMap<int, char> is wrong, it must be new HashMap<Integer, Character>

HashMap<Integer, Integer> hashMap = new HashMap<Integer, Integer>();

Checking size of list properly

https://stackoverflow.com/questions/11512034/does-java-util-list-isempty-check-if-the-list-itself-is-null

if (test != null && !test.isEmpty()) {
   // ...
}

Deleting stuff, while avoiding ConcurrentModificationException

https://stackoverflow.com/questions/223918/iterating-through-a-collection-avoiding-concurrentmodificationexception-when-re

Common methods

Collections.removeAll(), Collections.retainAll(), Collections.sort(), Collections.reverse()

Size

list.size(), map.size()

HashMap

Implementation-Remember!! How do you get the index of the bucket?

-- Remember the W+R>P rule, for consistent hashing.

  • Always override the hashcode implementation of a class if you override the equals method.
https://www.youtube.com/watch?v=IwUwIrz9Ge8
  • Contains method
(hashMap.containsKey(target-nums[i]))
  • Add a key
map.put(key, map.get(key) + 1);
  • Get the value for a key
map.get(key)
  • Iterate over a keyset
for (String key : map.keySet()) {
    // ...
   
}
  • Remove a key
map.remove(key)

Comparator and Comparable

Comparable example

class Dog implements Comparable<Dog> {
  Integer age;
  String name;
  
  Dog(Integer age, String name) {
    this.age = age;
    this.name = name;
  }
  @Override
  public int compareTo(Dog d1) {
    return this.age.compareTo(d1.age);
  }
  
}

Comparator example

    Comparator<Dog> dogComparator = new Comparator<Dog> () {
      public int compare(Dog d1, Dog d2) {
        return (d1.age - d2.age);
      }
    };
  • For primitive types

compareTo method

System.out.println("abcd".compareTo("abc"));

HashSet

  • HashSet DOES NOT MAINTAIN ORDER!!
  • Remember you cannot get a random element, first convert to list and get(0)
List<String> list = new ArrayList<String>(listOfTopicAuthors);
  • Convert an array to a set like this:
Set<T> mySet = new HashSet<T>(Arrays.asList(someArray));
  • IMPORTANT!

TreeSet uses comparator to order stuff, HashSet uses equals method. For a specific object, you must define your own hashcode and equals method (can't do one without the other!, remember the contract). Easy way to define a hashcode (remember the wealthfront interview!)

// Just use the inbuilt hashcode()!!!!

@Override
    public int hashCode() {
      return left.hashCode() ^ right.hashCode();
    }

Equals method signature and implementation

@Override
    public boolean equals(Object pair1) {
      Pair<A,B> myObj = (Pair<A,B>) pair1;
      return (this.left.equals(myObj.left) && this.right.equals(myObj.right));
    }

Defining a treeSet with comparator (See the comparator class syntax!)

class MySalaryComp implements Comparator<Empl>{
 
    @Override
    public int compare(Empl e1, Empl e2) {
        if(e1.getSalary() > e2.getSalary()){
            return 1;
        } else {
            return -1;
        }
    }
}
TreeSet<Empl> salComp = new TreeSet<Empl>(new MySalaryComp());

BackTracking

A new result for every path!! -- When you are returning a List<List<Integer>> remember to create a new list!!

https://www.youtube.com/watch?v=wGbuCyNpxIg has great backtracking explanations.

Sudoku

  • Remember why you must UNSET in case of failure.

String

THIS STOPS NOW! (Remember Atlassian Interview, Duolingo round one)- DO NOT INITIALIZE char with double quotes! STOP IT!

Note that strings are immutable in java, to change, for example to swap characters; conver to charArray[], AB DUBARA MAT BHOOLNA!

ALWAYS USE StringBuilder to append, or put a TODO to come back to it.

Remember how to sort --

  • Arrays.sort(charArray) is IN PLACE JODER!!
  • Remember the use of the keyword intern
  • Remember why strings NEED to be immutable
  • How do you compare two strings for equal values? IMPORTANT!
  • Java 7 substring method memory leak!
  • StringBuilder vs '+'; and optimization in Java7

Questions

Levenshtein- think how you would convert abc to def:

https://www.youtube.com/watch?v=MiqoA-yF-0M

  • REPLACE: Convert ab to de, then replace c with f. (abc --> dec --> def)
  • INSERTION: Convert abc to de, then insert f at the end. (abc --> de --> +f --> def)
  • DELETION: Convert abc to defc, then delete c at the end. (abc --> defc --> -c --> def)

Now come up with the code/algo HDP! Always draw the matrix, and put the empty string as reference at (0,0)

Important- the number of changes for s1[0..i] and s2[0..j] is stored at E[i+1][j+1]. (More on the empty string point above

Slidiing window questions

https://leetcode.com/problems/minimum-window-substring/

Rabin Karp

Careful of the negative numbers' mod!

Methods

  • Get character at index
s.charAt(index)
  • Get length Note that this is a function length() as opposed to a property (as in case of arrays)
str.length()
  • StringBuilder
StringBuilder sb = new StringBuilder();
sb.append("a).append("b");
return (sb.toString());
  • Clearing a string!
stringBuilderObj.setLength(0)
  • Prepending using StringBuilder sb.insert(0, "prependStuff");
  • GOTCHA!
public String substring(int beginIndex,
               int endIndex)
Returns a new string that is a substring of this string. The substring begins at the specified beginIndex and extends to the character at index endIndex - 1. Thus the length of the substring is endIndex-beginIndex.

Time complexity for appending

http://www.pellegrino.link/2015/08/22/string-concatenation-with-java-8.html

  • Split string by delimiter and convert to the numbers

The delimiter string is a regex expression, and you need to escape in case of special chars. (Think Atlassian interview)

Eg.: plusDelimiter

String expression = "1+2+10+1";
String[] tokens = expression.split(" \\+");

Similary multiplication will be split(" \*"); ( Double backward slash)

Eg.:

for (String s: line.split(",")) {
          numbersArray[i][j] = Integer.parseInt(s);
              i++;
          }
public static char forDigit(int digit, int radix)
  • Convert an int to char
 public char intToChar(int i) {
        return (char) (i + '0');
    }

List

Initialization

List<Integer> myQueue = new ArrayList<Integer>();

Add at a particular position, remove from a particular position (useful when you use it to implement Stacks, Queues)

list.add(int index, E element)
remove(int index)
  • Create a list from an array
List<String> abc = new ArrayList<String>(Arrays.asList(arrString));
  • Convert List to Array -- ACHTUNG, DONT work for primitive types!!
String[] stockArr = new String[stockList.size()];
stockArr = stockList.toArray(stockArr);
String [] countries = list.toArray(new String[list.size()]);
  • Iterator
Iterator itr = al.iterator();
      
      while(itr.hasNext()) {
         Object element = itr.next();
         System.out.print(element + " ");
      }

Arrays

Iteration

Can ALSO iterate like this HDP!

for(String value : array)
{
    System.out.println(value);
}

Initialzation

  • The size of an array cannot be larger than Integer.MAX_VALUE (exact value depends on the compiler)
int[] 1DArray = new int[SIZE];
int[] 2DArray = new int[SIZE1][SIZE2];

// With specific data
int[] data;
data = new int[] {10,20,30,40,50,60,71,80,90,91};

// 2D array
  int screen[][] = {{1, 1, 1, 1, 1, 1, 1, 1},
                      {1, 1, 1, 1, 1, 1, 0, 0},
                      {1, 0, 0, 1, 1, 0, 1, 1},
                      {1, 2, 2, 2, 2, 0, 1, 0},
                      {1, 1, 1, 2, 2, 0, 1, 0},
                      {1, 1, 1, 2, 2, 2, 2, 0},
                      {1, 1, 1, 1, 1, 2, 1, 1},
                      {1, 1, 1, 1, 1, 2, 2, 1},
                      };

Pro Tip: Always Use row and col with 2d Arrays for indexing, instead of x and y, YOU WILL THANK ME LATER

Operations

  • Get length
array.length
  • Fill Arrays Arrays.fill()

  • Clone/Copy an array

 int[] b = a.clone(); - Gets a totally different instance

Linked List

Reverse a linked list

Three questions to ask: best approach here: https://stackoverflow.com/questions/354875/reversing-a-linked-list-in-java-recursively

  • What is the reverse of null (the empty list)? null.

  • What is the reverse of a one element list? the element.

  • What is the reverse of an n element list? the reverse of the second element on followed by the first element.

Stacks

Always check for EmptyStackException- returned by pop and peek

Initialization

  • Remember how stacks is used to evaluate postfix and also the valid string with brackets problem.
  • Operator precdence: https://youtu.be/jos1Flt21is?t=4m28s (Remember for parentheses and exponents too!) (exponenets - right to left, division/multiplication- left to right)
  • Prefix evaluation- from left to right; for Postfix: right to left
 Stack<T> stackNewestOnTop = new Stack<T>(); 

Methods

public E peek() - Get the topmost element WITHOUT removing it.
public E pop() - Remove the topmost element and return that element.
size()- Size of stack

BST

Number of nodes in a tree of height h

2^(h+1) -1

Trees

  • ALWAYS check if root is not null

  • Implement Deletion

  • Remember preOrder, inOrder and postOrder HDP!! JODER

https://www.youtube.com/watch?v=oSWTXtMglKE

Recursion - TAREEKH PE TAREEKH, TAREEKH PE TAREEKH, TAREEKH PE TAREEKH!

Vital Tip: Lists, Collections maintain their reference in recursion, so you dont have to return explicity. Just keep passing along, building along.

https://www.youtube.com/watch?v=KEEKn7Me-ms

Enums

class FreshJuice {
   enum FreshJuiceSize{ SMALL, MEDIUM, LARGE }
   FreshJuiceSize size;
}

Add Logger to Code

import java.util.logging.Level;
import java.util.logging.Logger;
private final static Logger logger = Logger.getLogger(TemperatureServlet.class.getName());
logger.log(Level.INFO, "New temperature is " + temperature);
logger.log(Level.SEVERE, "Failed with " + e.getMessage());

Mathematics

  • A general addition rule

In general, for any base (2,5,10 etc..); say base is b1; the sum of digits results in (d1 + d2 + d3) % base The carry results in (d1 + d2 + d3)/base

CHECA lo!

  • Sum of an AP with n terms, first term being a and last term being l.
Sn =1/2 * n(a + ℓ)
  • Sum of GP formula - Do you remember?

  • Combinatorics- Permutation and combination

Quick refresher- https://www.youtube.com/watch?v=bCxMhncR7PU https://www.youtube.com/watch?v=W7DmsJKLoxc

Lemma 1: r-permuations of n without repeats

P(n,r) = n!/(r-n)!
Lemma 3: Nonrepeating r-combinations of a set n

C(n,r) = n!/(n-r)!r!

Remember JODER

nC1 + nC2 + ..... + nCn = 2n

Convert int to char

    public char digitToChar(int i) {
        return (char) (i + 97);
    }

Sorting

https://www.youtube.com/watch?v=SLauY6PpjW4

Merge Sort- Cuidate con este codigo!

if (low < high) {
      int mid = low + (high - low)/2;
      mergeSort( low, mid );
      mergeSort( mid + 1, high);
      merge( low, mid, high);
    }
  • Finding mid like this, avoids overflow:
int m = l+(r-l)/2;

Graphs

Never ever forget the visited array!! - end up in an endless recursion..

  • For n Vertices, there are n(n-1) edges maximum possible (duh..)
  • Breadth first - visit all the nodes in the current depth - Level Order Traversal -- We use a Queue- QUEUEnki apne bacche ka khyal rakho bas, woh apne baccho ka khayal khud rakhenge
  • Depth first - visit all the nodes till the end- Islands problem - PreOrder traversal etc.
  • Topological Sort - STACK
  • See different representations and compare pros/cons in terms of space and time complexity https://www.youtube.com/watch?v=k1wraWzqtvQ

Heaps

  • GREAT API!!

-- Priority Queue --

PriorityQueue<Integer> queue = new PriorityQueue<>(10, Collections.reverseOrder());

See MIT Video -->

  • In the worst case, it is log n to heaapify the shit out of it!
    • Acthung!! --> Heapifying an array starts from Node n/2!
      • Why? As a parent is at (i/2) -- anything AFTER this MUST be a LEAF -- leafs by definition are heapified
    • Time complexity --> max_heapify takes O(L) --> and number of nodes decrease as you go up
  • Time complexity

Insertion - O(log n), Deletion - O(log n), Extract min - O(1)

  • Use java priority queues to implement heaps. Eg.: this uses a string comparator Learn how to define own comparators

See example here: https://stackoverflow.com/questions/683041/how-do-i-use-a-priorityqueue

 Comparator<String> comparator = new StringLengthComparator();
 PriorityQueue<String> queue = new PriorityQueue<String>(10, comparator);
  • Remember the array representation
The root element will be at Arr[0].
Below table shows indexes of other nodes for the ith node, i.e., Arr[i]:

Arr[(i-1)/2]	Returns the parent node
Arr[(2*i)+1]	Returns the left child node
Arr[(2*i)+2]	Returns the right child node
  • Write basic programs such as delete, extractMin, insert Always use hasLeftChildIndex(), hasRightChildIndex(), hasPArentIndex(), to simplify things..

Concurrency

Best Video!- https://www.youtube.com/watch?v=WOSbx8TRrjw (Remember the husband wife example for race condition)

If F is the percentage of the program which can not run in parallel and N is the number of processes, then the maximum performance gain is 1 / (F+ ((1-F)/n)).
Executors.newSingleThreadExecutor().execute(new Runnable() {
    @Override 
    public void run() {
        // code in here
    }
});

*The package java.concurrent.atomic contains many useful classes to perform atomic operations. An operation is atomic when you can safely perform the operation in parallel on multiple threads without using the synchronized keyword or locks as shown

Trie

https://www.youtube.com/watch?v=AXjmTQ8LEoI

  • Practice insertion/deletion/searching on white board- and draw diagrams.
  • Careful with where you set isWord == true

AVL

Think of all rotations, and do the insertion

Prim Spanning Trees and Djikstra

Difference: https://stackoverflow.com/questions/10448397/differences-between-minimum-spanning-tree-and-shortest-path-tree

MST != Shortest Path

Use priority queue's remove method to implement Tushar Roy's min-heap data structure..

https://www.youtube.com/watch?v=oP2-8ysT3QQ

https://www.youtube.com/watch?v=lAXZGERcDf4

ML

Andale esa diagram hijo de puta

Screen Shot 2019-10-21 at 2 29 41 PM

  • Skim over logistic regression/linear regression etc BASICS
Linear Regression could help us predict the student’s test score on a scale of 0 - 100. Linear regression predictions are continuous (numbers in a range).
Logistic Regression could help use predict whether the student passed or failed. Logistic regression predictions are discrete (only specific values or categories are allowed). We can also view probability scores underlying the model’s classifications.

Scala/Spark

Sources

BEST LINK!!! - Q and A

https://www.tutorialspoint.com/java/java_interview_questions.htm

Basics to know

  • Interface can only have abstract methods JODER! Abstract has both.
( Object reference variable ) instanceof  (class/interface type)
  • When a class is defined within a scope of another class, then it becomes inner class. If the access modifier of the inner class is static, then it becomes nested class.

Strongly vs Dynamically types

https://stackoverflow.com/questions/11328920/is-python-strongly-typed

Java Quick Refresher

https://www.tutorialspoint.com/java/java_quick_guide.htm

Concepts/Keywords to know

  • Static initializer

Call to super ALWAYS goes first!

constructor call must be the first statement in a constructor

Types of classes

JVM basics

  • Permgen space -- https://dzone.com/articles/jvm-permgen-%E2%80%93-where-art-thou

    • All threads SHARE the heap space
    • Every thread GOT ITS OWN stack space
  • https://www.youtube.com/watch?v=ZBJ0u9MaKtM&t=8s

    • Has a loading component, RunTime Data areas and an execution component

      • Loading component must load both your class and JAVA API .class files
    • ClassLoader Subsystem https://www.geeksforgeeks.org/classloader-in-java/

      • Load, Link and Initialize
      • Load has three different -->
        • BootStrap (internal java classes) --> rt.jar REMEMBER THIS!!! (Java core runtime classes)
        • Extension loads extension jars
        • Application class loader from the classpath specified (ClassNotFoundException here!)
      • Link has verify, prepare and resolve
        • verify is valid bytecode
        • prepare allocates values to class variables (like static variables)
        • Resolve --> all symbolic references are resolved (ClassDefNotFound)!
      • Initialize initializes, for example the static blocks; the static variables
    • RunTime Data Areas

      • Method Area and Heap
        • MethodArea for the JVM (perm gen space) (-XX:MaxPermSize) --> This is the out of memory error! In java 8, it is called a metaspace
        • All Objects are created in the heap (-Xms, -Xmx)
      • PC Register -- Program Counter to the next instruction to be executed, per thread
      • Java Stacks
        • Contains the stacks of the methods (stack frame)-- > StackOverFlowError (per thread) (-Xss param)
      • Native Method Stacks for the native methods being called
    • Execution Engine

  • https://www.geeksforgeeks.org/jvm-works-jvm-architecture/

  • Java applications are called WORA (Write Once Run Anywhere).

Concurrency

Refs

Concepts

https://www.mkyong.com/java/java-thread-mutex-and-semaphore-example/

Eg.:

  public static void main(String args[]) {
    Thread.currentThread().setPriority(Thread.MAX_PRIORITY);
    
    System.out.println(Thread.currentThread().getPriority());
  }

https://www.geeksforgeeks.org/implement-runnable-vs-extend-thread-in-java/

  • implements Runnable leaves room for other classes too, extends Thread does not
  • In implements Runnable, object is shared across threads, extends Thread needs a new object cada vez.

Russian Chaccha

  get() – gets the value from the memory, so that changes made by other threads are visible; equivalent to reading a volatile variable
  set() – writes the value to memory, so that the change is visible to other threads; equivalent to writing a volatile variable
  lazySet() – eventually writes the value to memory, may be reordered with subsequent relevant memory operations. One use case is nullifying references, for the sake of garbage collection, which is never going to be accessed again. In this case, better performance is achieved by delaying the null volatile write
  compareAndSet() – same as described in section 3, returns true when it succeeds, else false
  weakCompareAndSet() – same as described in section 3, but weaker in the sense, that it does not create happens-before orderings. This means that it may not necessarily see updates made to other variables
  • Race conditions -- two threads racing to get the same resources

    • Husband Wife getting the cash from ATMs
    • Lock the bank account!!
      • Old way is to use the keyword synchronized (can be put in front of a code block`
      • New way is to use Reentrant Lock --> keep small locks
      • Lock should be granular, minimize the block of code that you synchronize
      • Reentrant locks!! -- A newer slicker way https://www.geeksforgeeks.org/reentrant-lock-java/
      // Before the statements to lock
      lock.lock();
      //Run code
      lock.unlock();
      
  • Killing a thread

    • stop() is bad! -- Use something such as stopMe()
  • wait() --> different from sleep as it waits up to 1000 milliseconds (as opposed to exact 1000)

    • notify --> notify other thread, notifyAll --> notify all waiting threads
  • join() threads

    • Main() is always a thread!!
  • Executors --> new FixedThreadPool()

    • implements Callable<Integer> must provide Integer call --> returns in the Future -- eg. Future<Integer>
    • future.get() to get the results
    • Use ExecutorService (init and shutdown)

Notes about coding

Get the current thread name --> prints main() when run from main()
System.out.println(Thread.currentThread().getName()); 

Garbage collection

Typically, when tuning a Java application, the focus is on one of two main goals: responsiveness or throughput. We will refer back to these concepts as the tutorial progresses.

Responsiveness
Responsiveness refers to how quickly an application or system responds with a requested piece of data. Examples include:

How quickly a desktop UI responds to an event
How fast a website returns a page
How fast a database query is returned
For applications that focus on responsiveness, large pause times are not acceptable. The focus is on responding in short periods of time.

Throughput
Throughput focuses on maximizing the amount of work by an application in a specific period of time. Examples of how throughput might be measured include:

The number of transactions completed in a given time.
The number of jobs that a batch program can complete in an hour.
The number of database queries that can be completed in an hour.
High pause times are acceptable for applications that focus on throughput. Since high throughput applications focus on benchmarks over longer periods of time, quick response time is not a consideration.
  • In C, had to do everything yourself (malloc and free)

  • Java has garbage collector

Screen Shot 2019-10-23 at 2 16 31 AM

  • Hypothesis

    • Most objects soon become unreachable
    • Refs from old to new is less
  • Unreachable (Objects that are not reference by any other object)

  • Concepts

    • Objects are allocated in the heap; static members and class defs are in Permgen (Remember class loading)
    • Daemon thread for garbage collection
    • System.gc() may/may not call GC.
    • When cant allocate any more, you get the java.lang.OutOfMemoryError
  • Three steps --> Mark, Delete/Sweep and Compacting

    • Mark walks the graph and marks reachable as live
    • Sweep deletes unreachable objects
    • Compacting compacts the memory by moving around objects to make them contiguous
  • Generation collectors --> depends on whther an object is old or young

    • Heap divided into two areas
    • Young generation is for young/new objects
      • Divide into eden space and two survivor spaces
      • Eden is for adding new objects
    • Old generation for objects that live for a long time.
      • Unreachable objects here are collected by the Major GC.
    • Both minor and major are STOP THE WORLD ; Major GC can pause the application
  • Minor GC runs (Video 17:00 mark)

    • when no more memory in Eden Space (cant create new objects)
    • All objects that survived have a counter as to how many have survived
    • Objects are moved across survivor spaces
      • Why two survivor spaces -- to prevent an extra compacting step.
      • Make diagram and increment objects.
  • Whatever survives a threshold number of minor GCs --> promoted to Old Generation (-XX:MaxTenuringThreshold!)

  • Old generation

    • Major GC when old generation is full.
    • Causes a pause
    • Causes MARK - SWEEP - COMPACT of the whole heap
  • Performance metrics

    • Responsiveness/Latency vs throughput
      • If latency is important, pause unacceptable (in throughput is OK)
  • Types of Garbage Collector

    • Serial Collector (Single Thread)
    • Concurrent Collector (Keeps on running in parallel, does not wait for OLD to be full)
      • Use when more memory, pauses are not OK
    • Parallel collector -- Multiple CPUs to run GC, multiple threads doing it -- only does it when OLD generation is full
      • Use when application demands high throughput (can withstand pauses)
  • G1 Garbage collector -- Garbage-First

    • Divides into regions and collects the regions wih the most garbage first
    • More predicate pauses
    • Parallelism and concurrency together

    This Image!! -->

Screen Shot 2019-10-20 at 10 20 45 PM

  • Can specify what type of GC to use (-XX:+UseParallelGC for example)
  • finalizer method --> calls when GC is called

    • Do not put statements here, as you do not know WHEN they will be collected!!
    • finalize only called once.
    • The garbage collector invokes an object's finalize()method when it detects that the object has become unreachable.
  • Multiple flags to control GC/Heap

  • -XX:PermSize for example
  • Print gc logs
    • -XX:+PrintGCDetails
    • jvisualvm GUI

HashMaps

  • How do they work (loadFactor etc)

  • All Map types -- https://www.youtube.com/watch?v=APL28XpFP0c

    • HashMap is NOT thread safe

    • LinkedHashMap preserves order of key-value pairs in the insertion order

    • TreeMap maintains natural order of keys (like string lexicographic order)

    • WeakHashMap does GC on weak references

    • ConcurrentHashMap -- reads are lock free, writes require locks (but NOT the enitre map!)

      • Result can be BAD -- Not Atromic!!!
      • USe AtomicLong!!
    • Remember why a concurrent modification exception might occur!

  • Why do we need a Concurrent HashMap? https://www.geeksforgeeks.org/difference-hashmap-concurrenthashmap/

String special

https://www.youtube.com/watch?v=6pLEwJP1Afk

  • Backed by charArray
  • Strings are immutable!
    • Changes create a new string
  • Strings are cached
String s1 = "Hello";
String s2 = "Hello";
String s3 = new String("Hello");

Here s2 and s1 are same! (interned) --> and s3 is not!

String s3 = new String("Hello").intern() //interns it!
  • Why intern?! -- Because mutliple strings can have the same value!!

  • Why immutable?! -- Because multiple string objects point to the same address (chaning one might change another) (For eg. changing s1 might impact s2)

  • Useful APIs -->

    • indexOf -- where does a substring start
    • lastIndexOf -- Last index where does a substring start
    • matches -- Regex
    • substring
      • <1.6 had a memory leak! Full char array was copied
  • StringBuilder vs StringBuffer vs String https://www.geeksforgeeks.org/string-vs-stringbuilder-vs-stringbuffer-in-java/

    • StringBuilder and StringBuffer are mutable
    • StringBuffer is threadsafe, StringBuilder is NOT
  • + vs concat

    • + created a lot of intermediate strings! (In previous versions) -- now no longer intermediate (Uses StringBuilder)
    • concat does not work with integers etc.
  • Always check for NULLs! (or inverse the string you comparin to)

  • Always store passwords in a char[] instead of String

Singleton

https://www.youtube.com/watch?v=QsBQnFUx388

Java Generics*

https://docs.oracle.com/javase/tutorial/collections/interfaces/order.html

public class Name implements Comparable<Name> {
    private final String firstName, lastName;

    public Name(String firstName, String lastName) {
        if (firstName == null || lastName == null)
            throw new NullPointerException();
        this.firstName = firstName;
        this.lastName = lastName;
    }

Videos --> https://www.youtube.com/watch?v=Zc54gFhdpLA&list=PLUl4u3cNGP61Oq3tWYp6V_F-5jb5L2iHb&index=2

Lecture 2

  • Document distance problem
    • Break the document into a vector of words.
    • Inner Product -- must scale --> arccos

Lecture 3

  • Why sorting?
  • Median finding/ Binary Search
  • Draw a tree to get the work done (recurrence trees!!)

Lecture 4 -- Priority Queue

  • Heap in an array visualized as a nearly complete BST
  • Max heap property
    • Exract Max --> modifies the heap!
    • max_heapify and build_max_heap
    • In the worst case, it is log n to heaapify the shit out of it!
    • Acthung!! --> Heapifying an array starts from Node n/2!
      • Time complexity --> max_heapify takes O(L) --> and number of nodes decrease as you go up

Lecture 5 -- BST

  • Motivation for BST (the airplane runway problem)
    • Sorted Array helps in finding and checking, but insertion is O(n)
      • Insertion is the bottleneck --> ENTER BINARY TREES!!
    • He said O(h) NOT O(log n) -- think joder!!

Lecture 6- AVL

  • Height of leaves is 0, height of any other node is max(height(rightChild), height(leftChild)) + 1
  • Must store the heights at every note -|hL - HR! <=1
  • How many nodes I can pick in a tree of Node H?
    • N(h) = N(h-1) + N(h-2) + 1
  • Rotations!!
    • Draw the Diagram -- INORDER TRAVERSAL MUST BE SAME BEFORE/AFTER ROTATION!!
      • One rotation and two rotation
      • Violations MIGHT bubble up!!
      • Divides into two cases

Lecture 7 -- Counting Sort, Radix Sort, lower bounds for sorting

  • Why is search lower bound log(n)?? -- because MUST have all the answers, and at each decision, the answer is yes or no only
  • Assume that inputs are all in some range
    • Fit in a word
  • Counting Sort
  • Radix Sort
    • Break Integer into bunch of columns (Excel example!!)
      • Sort by LSB, and then towards MSB (Sort using counting Sort)

Lecture 8 -- Hashing

  • Python dict
  • Hashing use cases -- every fuckin where!!
    • Prehash convert any key to an integer.
  • Simple uniform hashing -- load factor
    • Simple division (n mod m) hashing does not always work! ( can have clashes) -- for eg. if m is a power of 2 or something
  • Universal Hashing -- Random numbers are used

Lecture 9 -- Table Doubling, Karp-Rabin

  • Growing a table, need to REHASH!

    • Amortization, table doubling!!
    • Work to double the table is ONLY done rarely
  • Deleting is interesting! -- DO NOT DO Half,

    • As if you add a ninth element, it resizes to 16; then you remove; it resizes to 8; and so and so forth!!- leading to a O(n) time! -- so instead do n/4 (any power less than 2)
  • Karb Rabin

    • Naive complexity is O(|s| * |t|)
    • Just hash match is NOT enough!! Should check for the string equality as well

Lecture 10 -- Open Addressing

  • Try to find a slot repetitively till you find one
    • Keep probing till you find something empty, and the same thing goes for search
    • Delete is messed up -- because if you delete a key, it creates None and then you stop at it, return a None
      • Uses a flag called DeleteMe
    • Probing strategies (https://www.geeksforgeeks.org/hashing-set-3-open-addressing/)
      • Different strategies to probe
        • Linear probing -- problems of clustering, keys cluster together
        • Use double hashing -- two hashes to compute the position!

Lecture 11 -- Integer Arithmetic, Karatsuba Multiplication

  • Catalan numbers
  • Newton's method -- derivative MAA BEHEN!!
  • Quadratic convergence
    • Karatsuba's algorithm

Lecture 12 -- Newton's method

  • Breaks square root and division in terms of multiplication

Lecture 13 -- BFS (graphs!)

  • Graphs can be directed/undirected (Vertex, Edges)
  • Adjaceny lists joder!
  • Applications
    • Web Crawling, find friends, network broadcast, garbage collection
  • Representation -- Adjancency lists, vertex to list of direct vertices

Lecture 14 -- DFS

  • O(V + E) running time

  • Types of edges in a tree (backward, forward, cross etc)

    • MIND BLOWN !! --Think in terms of parenthesis!! - first internal brackets (stack are done) before outer
  • Topological sort -- think Job Scheduling -- putting clothes on!! example (must put socks on before shoes on, pantes before shoes on) -- only holds for directed acyclic graph!! -- GREAT PROOF!!

    • If there is an edge that goes from u to v, v MUST appear after u in the sort result

Lecture 15-- Single Source Shortest Paths

  • Worst case, number of edges is V^2
  • Bellman ford -- handles negative too, Djikstra -- only positive
  • Weights are NEVER a part of the complexity
    • In beginning, everything is INFINITY -- and then you try to reduce this!
    • Store the distance, as well as the predecessor (as that leads back to the path!)
  • Negative weights (why!) == think of social networks (likes/dislikes)
    • Negative CYCLES hijo de put!!
  • MUST reach all reachable paths!!
  • Relax edge constraints (if I find better, I use that and discard previous)
  • Observations
    • Subpaths of a shortest path are ALSO shortest path (Optimal substructure!!!)

Lecture 16 -- Djikstra

  • Remember the relaxation of edges
    • Start with infinity and then go TOP-DOWN
    • MAST PROOF!!
    • Its the cycles that cause the path!!
    • Algorithm that uses topological sort and then relaxes constraints
    • Greedy algorithm djikstra
    • Mix os depth-breadth first algo
    • THINK of time complexity!!
    • Fibonacci data heap!

Lecture 17 -- Bellman Ford

  • Djikstra MAY NOT EVEN TERMINATE!! (for cycles)
  • Exponential time complexity Linear time is great, exponential is bad, infinite gets you fired!!
  • Bellman Ford -- at most we can have nVertices times in the path -- else cycle!! -- https://www.youtube.com/watch?v=9PHkk0UavIM

Lecture 18 -- Improve on average complexity

  • Bidirectional search (Alternate Search forwards and backwards)
    • Have a forward destination, and a backward destination -- maintain distances and queues fot both forward and backward
      • SOME vertex has been processed in both forward & backward search -- TERMINATE
      • But NOT correct (does not get shortest)
      • Minimize AFTER finding such node

Lecture 19 -- DP

  • Careful brute force
  • Memoized fibonacci
    • fib(k) ONLY recurses the first time it is called.
  • Bottom up --> you build from the bottom (instead of the recursion tree where the problem to solve is at the top
  • Think as TOPOLOGICAL SORT of the problem DAG!! --> mind blown
  • Time --> Number of subproblem + Time per subproblem
  • Memoization wont work for graph with cycles!

Lecture 20 -- DP II

  • Think of DP as DAG! (A DAG of subproblems)
  • Steps for a DP problem
    • Define subproblems (count # of subproblems)
    • Guess (part of sollution)
    • relate subproblems..
    • recurse & memoize/build
    • Solve original problem --> check subproblem is ACYCLIC!!
    • Text Justification
      • Every word can start a line, quantify badness
      • Subproblems are suffixes of the array
      • ALWAYS CHECK TOPOLOGICAL ORDER
    • Parent pointers -- REMEMBER which GUESS was best! (and then reconstruct it along)
    • Perfect information blackjack

Lecture 21 -- DP III

  • How to define subproblems?!!
    • Either suffixes, prefixes or substrings
    • Paranthesisation of matrix multiplication
      • Guess the LAST multiplication
      • Use substrings IN case you are having to use both suffixes and prefixes
      • THINK about topological sorting
    • Edit distance and subsequence problem

Important links

MOST IMPORTANT -- Criteo Design Doc Questions

https://confluence.criteois.com/display/THC/%5BDesign%5D+Amazon+shopping+basket

Basic Terminologies

https://www.interviewbit.com/tutorial/storage-scalability-introduction/

MOST IMPORTANT II -- Harvard Video (https://www.youtube.com/watch?v=-W9F__D3oY4)

IF NOTHING -- See this https://youtu.be/-W9F__D3oY4?t=5213

  • Vertical Scaling has a limit! (Buy bigger machine) (_Single point of failure)
  • Horizontal Scaling multiple servers -- Scalability
    • Horizontal Scaling has network calls (slower) and data consistency issues are possible.
  • What if sessions are not on every server? You will be asked to log in everytime!!
    • Store session somewhere -- STORE THE STATE
    • Do RAID 1 (As sharing state on one machine can be a weak point)
  • Sticky Sessions -- store IDs in cookies -- can store server ID as well
    • What if I changes?
  • Craigslist example -- Much more reads than writes (always think about this!)
  • Memcache -- SQL Query Cache
  • Facebook is READ HEAVY -- people read your profile many times more than you actually update it (So CACHING!!)
  • Databse Master Slave Architecture
    • Have dedicated DBs for write/read
    • Replicate Masters!!
  • Load Balancer replication!! -- via heartbeats!
  • Database partitioning -- FB did it on unis! -- mit.facebook.com for eg. -- But now partition by Users
  • Sticky Sessions (imp for Amazon example) -- go to the shopping cart on the same server
  • Load Balancer for DBs!!
  • benefits of load balancer --> just send the IP of the load balancer, and can also make servers private.
  • Use secure ports 80 (think about security!!)

Concepts/Important buzz words

 When a network partition failure happens should we decide to

 Cancel the operation and thus decrease the availability but ensure consistency.
 Proceed with the operation and thus provide availability but risk inconsistency.
  • How cookies work
  • https://www.youtube.com/watch?v=QWw7Wd2gUJk
    • Save info on person's laptop, rather than server
    • Save settings, user preferences
    • HTTP is stateless!!
    • Cookie MUST be small!
  • A person stores a like button to FB, and thats FB knows who you are! -- MIND BLOWN!!
  • Think about Harvard video -- sessions must be shared somewhere (maybe on load balancer)

GROKKING

MUST-MUST-MUST

Screen Shot 2019-10-31 at 11 59 40 PM

Problems

Things learnt from tiny url
  • For caching follow 80-20 --> 20% of the rule reflect 80% of the queries

  • Come up with numbers (bandwitdth, memory, space etc)

  • API DEV KEY --- awesome idea!! -- throttle users, allows auth -- also have deletion!!

  • Think about DELETEcollissions!!

Using base64 encoding, a 6 letters long key would result in 64^6 = ~68.7 billion possible strings
Using base64 encoding, an 8 letters long key would result in 64^8 = ~281 trillion possible strings
  • Key generation services -- Random UUID generator

  • Think about cleanup services! -- A separate Cleanup service can run periodically to remove expired links from our storage and cache. This service should be very lightweight and can be scheduled to run only when the user traffic is expected to be low.

Things learnt from pastebin
  • ALWAYS write APIs in required and optional args (python style, for eg. user name is optional)
Things learnt from Facebook Messenger
  • After TB, it is PB
  • MUST be consistent ACROSS devices
  • ALWAYS create the table with numbers (ON THE SIDE)
  • Always think about ACKs -- message sent, message received etc
  • ALWAYS -- Let’s try to build a simple solution first where everything runs on one server
  • Push vs pull model -- websockets, long polling etc
  • Pagination on form factor
Things learnt from Instagram
  • Use Image Metadata (image lat long, date etc)
  • Key Value store again! -- Amazon S3
  • Seperate serivces for reads and writes!
  • Merging timestamp with photoID
  • The 80-20 RULE AGAIN!!
Things learnt from Rate Limiter
  • WTF is rate limiter?
Rate Limiting is a process that is used to define the rate and speed at which consumers can access APIs. Throttling is the process of controlling the usage of the APIs by customers during a given period. Throttling can be defined at the application level and/or API level. When a throttle limit is crossed, the server returns HTTP status “429 - Too many requests".
  • Types of throttling

    • Hard throttling, soft throttling and elastic throttling
  • Algorithms

    • Fixed window and rolling window
    • Why is fixed window bad
Imagine if Kristie sends three requests at the last second of a minute, then she can immediately send three more requests at the very first second of the next minute, resulting in 6 requests in the span of two seconds. The solution to this problem would be a sliding window algorithm which we’ll discuss later.
  • The locking solutin
    • Can use redis lock! -- will make it slow
  • Tradeoffs between fixed window, sliding window and hybrid

    • Sliding window takes a lot of memory -- Uses Redis SortedSet
  • Rate limiting by IP or user?

    • By IP, shared IP can be an issue (internet cafe's)
    • By user, some hacker can try multiple logins
Things learnt from Dropbox
  • ACID

  • Small chunks

Internally, files can be stored in small parts or chunks (say 4MB); this can provide a lot of benefits i.e.
all failed operations shall only be retried for smaller parts of a file. If a user fails to 
upload a file, then only the failing chunk will be retried
  • Keep metadata on the client

  • Has a block server, metadata server and synchronization server

  • Long polling again!

  • Client is split into

    • Chunker, indexer, watcher and internal metadata database
  • Clients handle slow servers with exponential backoff; mobile clients can update with a delay on demand basis to save resources

  • MySQL supports ACID -- so DB is MySQL

Things learnt from Typeahead
Things learnt from Twitter

Concepts

  • Consistent Hashing

    • DHT (Distributed Hash Table)
    • Why needed> think!
  • CAP Theorem

  • Redundancy vs Replication

  • Database Indexing

    • Decreases write performance! Also need space to store
    • Think of the book index analogy (need space to store glossary!!)
  • Basics of distributed systems

    • Scalability is how the system scales (Horizontal (Mongo etc.) vs Vertical (mySQL))
    • Reliability is correctness (amazon cart for example), availability is being available
    • If you reliable, you available, BUT NOT vice vers
    • Efficiency (throughout etc)
    • Serviceability vs Managerability
  • Partitioning

    • Methods of partitioning -- horizontal, vertical and directory
    • Types -- key based(hash), list based, round robin, composite
    • Problems with Data Partitioning
      • Joins are harder across DBs on different servers
      • Referential integrity
      • Rebalancing -- changing a scheme can be a P in the A
  • Kafka intro https://www.youtube.com/watch?v=PzPXRmVHMxI

Load balancers
  • Can be at THREE layers (between clients and web server, between web server and platform layr, between platform layer and DB)
  • Different techniques (least response time, round robin, least connection etc)
Cache types
  • Write around, write through, write back..
Proxy Server

Screen Shot 2019-11-03 at 12 54 55 AM

SQL vs NoSQL
Relational databases are structured and have predefined schemas like phone books that store phone numbers and addresses. Non-relational databases are unstructured, distributed, and have a dynamic schema like file folders that hold everything from a person’s address and phone number to their Facebook ‘likes’ and online shopping preferences.
SQL NoSQL
ACID Sacrifice ACID
Vertically Scalable Horizontally Scalable
When Structured, use this Use for data with dynamic schema/no schema such as facebook likes, amazon cart
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment