Skip to content

Instantly share code, notes, and snippets.

Show Gist options
  • Save php-cpm/b74d3b26681a7cf759d9958f8a6426bd to your computer and use it in GitHub Desktop.
Save php-cpm/b74d3b26681a7cf759d9958f8a6426bd to your computer and use it in GitHub Desktop.
这是原文的中文版本

面向技术面试学习很蛋疼,所以我们写了个小纸条来打辅助

这个列表是一个速成指南,也可以作为具体专题深入研究的引用。本文其实就是你从来没去听过的或者已经健忘的计算机科学课程的总结,因此不可能写的很深入。原文已经在github gist 上公开,任何人都可以参与修改。

数据结构基础

数组 Array

定义:

  • 基于有序的、通常以0开始的索引,存储数据元素
  • 基于集合理论中的多元组 tuples
  • 最古老最常用的数据结构之一  

须知:

  • 最好用于索引,不善于搜索、插入、删除
  • 线性数组 Linear arrays, 也叫一维数组,是最常用的.  - 长度固定,也即声明为一个指定大小。
  • 动态数组 Dynamic arrays 像一维数组一样,但有额外空间来存储更多元素  - 如果动态数组满了,将复制内容到一个更大的数组
  • 二维数组 Two dimensional arrays 像表格一样有x、y轴   

执行效率:

  • 访问 Indexing: 线性数组: O(1), 动态数组: O(1)
  • 查询 Search: 线性数组: O(n), 动态数组: O(n)
  • 优化查询 Optimized Search: 线性数组: O(log n), 动态数组: O(log n)
  • 插入 Insertion: 线性数组: n/a 动态数组: O(n)

链表 Linked List

定义:

  • 用与其他节点通过指针互相链接的节点来存储数据.
    • 节点,至少有一条信息和一个其他节点的引用.
    • 一个链表通过一个节点指向另一个节点的引用来 链接 各个节点.

须知:

  • 为优化插入和删除操作而设计,在访问和搜索方面较慢.
  • 双向链表 有指向上一个节点的引用.
  • 循环链表 是一种简单链表,他的 尾节点, 也就是最后一个节点链接到 头结点, 也就是第一个节点.
  • , 一般通过链表实现,也可以用数组实现.
    • 栈是 后进先出 (LIFO) 数据结构.
    • 由链表生成,限制只有头部可以做插入和删除.
  • 队列, 也可以通过数组或链表实现.
    • 队列是先进先出 (FIFO) 数据结构数据结构.
    • 由双向链表生成,限制只有头部可以做删除,尾部可以插入.

执行效率:

  • 访问 Indexing: Linked Lists: O(n)
  • 查询 Search: Linked Lists: O(n)
  • 优化查询 Optimized Search: Linked Lists: O(n)
  • 插入 Insertion: Linked Lists: O(1)

哈希表 Hash Table or Hash Map

定义:

  • 使用键值对存储数据.
  • 哈希函数 接收一个键返回一个特定键唯一的输出.
    • 这个过程称之为 哈希加密 hashing, which is the concept that an input and an output have a one-to-one correspondence to map information.
    • 哈希函数为这条数据返回一个内存中的唯一地址.

须知:

  • 为优化查找、插入、删除而设计.
  • 哈希冲突 是哈希函数为两个不同输入返回了相同输出.
    • 所有哈希函数都存在这个问题.
    • 常出现在哈希表非常大的时候.
  • 哈希对于关联数组和数据库索引非常重要.

执行效率:

  • 访问 Indexing: Hash Tables: O(1)
  • 查询 Search: Hash Tables: O(1)
  • 插入 Insertion: Hash Tables: O(1)

二叉树

定义:

  • 树形的数据结构,每个节点最多有两个子节点.
    • 一个左子节点,一个右子节点.

须知:

  • 为优化查询、排序而设计.
  • 退化树 是不平衡树,这时候只有一边,实际本质上就是一个链表.
  • 明显比其他数据结构更好实现.
  • 用于生成 二叉查找树.
    • 一个二叉树,通过比较键的大小决定子节点的位置.
    • 左子节点的键比他父节点小.
    • 右子节点的键比他父节点大.
    • 不能有重复节点.
    • 因为以上特性,比用二叉树更好用.

执行效率:

  • 访问 Indexing: 二叉查找树: O(log n)
  • 查询 Search: 二叉查找树: O(log n)
  • 插入 Insertion: 二叉查找树: O(log n)

查找基础

广度优先查找 Breadth First Search

定义:

  • 查找树或图的一种算法,从根节点开始,先查找树的级别.
    • 从同一级开始找每一个节点,一般是从左到右.
    • 上一步同时检查当前级别的节点的子节点.
    • 当检查完一个级别,移动到下一级别的最左节点.
    • 最右下方的节点是最后一个(所有节点中最深最右的节点).

须知:

  • 适用于搜索宽度比深度大的树.
  • 在遍历树时用队列存储信息.
    • 因为用到队列,他比深度优先查找更耗内存.
    • 这里队列使用更多内存因为需要存储指针

执行效率:

  • 查找 Search: 广度优先查找: O(|E| + |V|)
  • E 边的数量
  • V 顶点的数量

深度优先查找 Depth First Search

定义:

  • 查找树或图的一种算法,从根节点开始,先查找树的深度.
    • 从左边向下遍历树直到无法向下.
    • 一旦到达一个分支的结尾,回到上一级尝试遍历该分支的右子节点,如果他也有子节点则先左后右.
    • 当检测完一个分支移动到根节点的右侧节点,尝试从左一直遍历到底部.
    • 最右侧的节点是最后的节点,这个节点是他所有的上级的最右边).

须知:

  • 适用于搜索深度比宽度大的树.
  • 把节点推入栈.
    • 因为栈遵循 LIFO ,因此无需追踪节点的指针,就比广度优先查找省内存.
    • 一旦他在左侧无法继续遍历就开始执行出栈.

执行效率:

  • 查找 Search: 深度优先查找: O(|E| + |V|)
  • E 边的数量
  • V 顶点的数量

广度优先查找 Vs 深度优先查找

  • 简而言之,依据树的大小和形状.
    • 对于又宽又浅的树用广度优先查找
    • 对于又深又摘的树有深度优先算法

区别:

  • 因为 BFS 使用队列存储节点和子节点的信息,可能导致使用超过你电脑可用的内存. (不过一般不用担心这个问题.)
  • 如果在一个非常深的树上使用 DFS 可能没有必要找到那么深. 可以看看 xkcd.
  • BFS 建议用循环算法.
  • DFS 建议用递归算法.

高效排序基础

归并排序 Merge Sort

定义:

  • A comparison based sorting algorithm
    • Divides entire dataset into groups of at most two.
    • Compares each number one at a time, moving the smallest number to left of the pair.
    • Once all pairs sorted it then compares left most elements of the two leftmost pairs creating a sorted group of four with the smallest numbers on the left and the largest ones on the right.
    • This process is repeated until there is only one set.

须知:

  • This is one of the most basic sorting algorithms.
  • Know that it divides all the data into as small possible sets then compares them.

执行效率:

  • Best Case Sort: Merge Sort: O(n)
  • Average Case Sort: Merge Sort: O(n log n)
  • Worst Case Sort: Merge Sort: O(n log n)

快速排序 Quicksort

定义:

  • A comparison based sorting algorithm
    • Divides entire dataset in half by selecting the average element and putting all smaller elements to the left of the average.
    • It repeats this process on the left side until it is comparing only two elements at which point the left side is sorted.
    • When the left side is finished sorting it performs the same operation on the right side.
  • Computer architecture favors the quicksort process.

须知:

  • While it has the same Big O as (or worse in some cases) many other sorting algorithms it is often faster in practice than many other sorting algorithms, such as merge sort.
  • Know that it halves the data set by the average continuously until all the information is sorted.

执行效率:

  • Best Case Sort: Merge Sort: O(n)
  • Average Case Sort: Merge Sort: O(n log n)
  • Worst Case Sort: Merge Sort: O(n^2)

冒泡排序 Bubble Sort

定义:

  • A comparison based sorting algorithm
    • It iterates left to right comparing every couplet, moving the smaller element to the left.
    • It repeats this process until it no longer moves and element to the left.

须知:

  • While it is very simple to implement, it is the least efficient of these three sorting methods.
  • Know that it moves one space to the right comparing two elements at a time and moving the smaller on to left.

执行效率:

  • Best Case Sort: Merge Sort: O(n)
  • Average Case Sort: Merge Sort: O(n^2)
  • Worst Case Sort: Merge Sort: O(n^2)

归并排序 Vs 快速排序

  • Quicksort is likely faster in practice.
  • Merge Sort divides the set into the smallest possible groups immediately then reconstructs the incrementally as it sorts the groupings.
  • Quicksort continually divides the set by the average, until the set is recursively sorted.

算法基本类型

递归算法 Recursive Algorithms

定义:

  • 在定义中调用自己的算法.
    • Recursive case a conditional statement that is used to trigger the recursion.
    • Base case a conditional statement that is used to break the recursion.

须知:

  • 栈的层级太深 and 栈溢出.
    • If you've seen either of these from a recursive algorithm, you messed up.
    • It means that your base case was never triggered because it was faulty or the problem was so massive you ran out of RAM before reaching it.
    • Knowing whether or not you will reach a base case is integral to correctly using recursion.
    • Often used in Depth First Search

迭代算法 Iterative Algorithms

定义:

  • 重复有限次的调用,每次调用视为一次迭代.
    • 经常用于在数据集合中增量移动.

须知:

  • Generally you will see iteration as loops, for, while, and until statements.
  • Think of iteration as moving one at a time through a set.
  • Often used to move through an array.

递归 Vs 迭代

  • The differences between recursion and iteration can be confusing to distinguish since both can be used to implement the other. But know that,
    • Recursion is, usually, more expressive and easier to implement.
    • Iteration uses less memory.
  • 函数式语言 Functional languages 倾向于用递归. (例如 Haskell)
  • 指令式语言 Imperative languages 倾向于用迭代. (例如 Ruby)
  • 可以阅读这篇 Stack Overflow 帖子 了解更多.

Pseudo Code of Moving Through an Array (this is why iteration is used for this)

递归                               | 迭代
----------------------------------|----------------------------------
recursive method (array, n)       | iterative method (array)
  if array[n] is not nil          |   for n from 0 to size of array
    print array[n]                |     print(array[n])
    recursive method(array, n+1)  |
  else                            |
    exit loop                     |

贪心算法 Greedy Algorithm

定义:

  • An algorithm that, while executing, selects only the information that meets a certain criteria.
  • The general five components, taken from Wikipedia:
    • A candidate set, from which a solution is created.
    • A selection function, which chooses the best candidate to be added to the solution.
    • A feasibility function, that is used to determine if a candidate can be used to contribute to a solution.
    • An objective function, which assigns a value to a solution, or a partial solution.
    • A solution function, which will indicate when we have discovered a complete solution.

须知:

  • Used to find the optimal solution for a given problem.
  • Generally used on sets of data where only a small proportion of the information evaluated meets the desired result.
  • Often a greedy algorithm can help reduce the Big O of an algorithm.

Pseudo Code of a Greedy Algorithm to Find Largest Difference of any Two Numbers in an Array.

greedy algorithm (array)
  var largest difference = 0
  var new difference = find next difference (array[n], array[n+1])
  largest difference = new difference if new difference is > largest difference
  repeat above two steps until all differences have been found
  return largest difference

This algorithm never needed to compare all the differences to one another, saving it an entire iteration.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment