@reata
reata / HashJoin.java
Created August 26, 2022 13:42
Explaining JOIN strategies with single-machine code: HashJoin
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
public class HashJoin implements JoinStrategy {
@Override
public List<Integer> join(List<Integer> R, List<Integer> S) {
List<Integer> output = new ArrayList<>();
HashMap<Integer, Integer> RHash = new HashMap<>();
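The HashJoin preview is cut off right after the hash map is declared. As a rough illustration of the build-then-probe idea, here is a minimal Python sketch. It assumes an inner equi-join over lists of integer keys that emits the key once per matching pair; the function name `hash_join` and that output convention are assumptions, not taken from the gist.

```python
from collections import Counter


def hash_join(r, s):
    """Inner equi-join of two lists of integer keys (assumed semantics)."""
    # Build phase: hash the keys of r, counting duplicates.
    build = Counter(r)
    output = []
    # Probe phase: look up each key of s in the hash table.
    for key in s:
        # Emit the key once for every matching row in r.
        output.extend([key] * build.get(key, 0))
    return output


if __name__ == "__main__":
    print(hash_join([1, 2, 2, 3], [2, 3, 4]))  # [2, 2, 3]
```

Building the table takes one pass over R and probing takes one pass over S, so the expected cost is O(m + n) plus the size of the output.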
@reata
reata / SortMergeJoin.java
Created August 25, 2022 12:27
Explaining JOIN strategies with single-machine code: SortMergeJoin
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
public class SortMergeJoin implements JoinStrategy {
/*
time complexity: O(mlog(m)) + O(nlog(n)) + O(m+n), m is the size of R and n is the size of S
*/
@Override
public List<Integer> join(List<Integer> R, List<Integer> S) {
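The method body is not shown in the preview. The complexity comment (two sorts followed by a linear merge) can be sketched in Python as follows, again assuming an inner equi-join over integer keys that emits the key once per matching pair. The name `sort_merge_join` is hypothetical.

```python
def sort_merge_join(r, s):
    """Inner equi-join via sort + merge (assumed semantics)."""
    # Sort phase: O(m log m) + O(n log n).
    r_sorted = sorted(r)
    s_sorted = sorted(s)
    output = []
    i = j = 0
    # Merge phase: advance whichever side holds the smaller key.
    while i < len(r_sorted) and j < len(s_sorted):
        if r_sorted[i] < s_sorted[j]:
            i += 1
        elif r_sorted[i] > s_sorted[j]:
            j += 1
        else:
            key = r_sorted[i]
            # Find the run of equal keys on each side.
            i_end = i
            while i_end < len(r_sorted) and r_sorted[i_end] == key:
                i_end += 1
            j_end = j
            while j_end < len(s_sorted) and s_sorted[j_end] == key:
                j_end += 1
            # Every row in the r run matches every row in the s run.
            output.extend([key] * ((i_end - i) * (j_end - j)))
            i, j = i_end, j_end
    return output
```

The two pointers only move forward, so the merge does O(m + n) comparisons; with many duplicate keys the output itself can be larger than that.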
@reata
reata / NestedLoopJoin.java
Last active August 25, 2022 12:38
Explaining JOIN strategies with single-machine code: NestedLoopJoin
import java.util.ArrayList;
import java.util.List;
public class NestedLoopJoin implements JoinStrategy {
/*
time complexity: O(m*n), m is the size of R and n is the size of S
*/
@Override
public List<Integer> join(List<Integer> R, List<Integer> S) {
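The O(m*n) bound in the comment comes from comparing every key in R with every key in S. A minimal Python sketch under the same assumed semantics (inner equi-join over integer keys, one output key per matching pair; `nested_loop_join` is a hypothetical name):

```python
def nested_loop_join(r, s):
    """Inner equi-join by brute-force comparison (assumed semantics)."""
    output = []
    # Outer loop over R, inner loop over S: m * n comparisons in total.
    for r_key in r:
        for s_key in s:
            if r_key == s_key:
                output.append(r_key)
    return output
```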
@reata
reata / count_distinct_vs_group_by.sql
Last active August 22, 2022 05:25
Do SELECT DISTINCT and GROUP BY deduplicate in the same way?
/*
SELECT DISTINCT
*/
WITH foo (pk1, pk2) AS
(SELECT 1, 'a'
UNION ALL
SELECT 1, 'b'
UNION ALL
SELECT 2, 'a'
UNION ALL
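The preview stops before the queries themselves. One way to try the basic comparison locally is with Python's built-in `sqlite3` module, as in the sketch below. The fourth row, a duplicate `(2, 'a')`, is an assumption added so there is something to deduplicate, and SQLite stands in for whatever database the gist targets. It only checks the simplest case, where both queries name the same columns.

```python
import sqlite3

# The first three rows come from the preview; the last row is a
# hypothetical duplicate added for illustration.
FOO_CTE = """
WITH foo (pk1, pk2) AS
(SELECT 1, 'a'
UNION ALL
SELECT 1, 'b'
UNION ALL
SELECT 2, 'a'
UNION ALL
SELECT 2, 'a')
"""


def run(query_tail):
    """Run a query against the foo CTE in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    try:
        return conn.execute(FOO_CTE + query_tail).fetchall()
    finally:
        conn.close()


distinct_rows = run("SELECT DISTINCT pk1, pk2 FROM foo ORDER BY pk1, pk2")
group_by_rows = run(
    "SELECT pk1, pk2 FROM foo GROUP BY pk1, pk2 ORDER BY pk1, pk2"
)

if __name__ == "__main__":
    print(distinct_rows)
    print(group_by_rows)
```

When both queries list the same columns, each returns one row per distinct `(pk1, pk2)` combination.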
@reata
reata / filter_location_matters_in_left_join.sql
Created March 11, 2022 13:49
What is the difference between putting a filter condition in ON and putting it in WHERE?
/*
LEFT JOIN without filter, all tuples in foo can find a match in bar
*/
WITH foo (id, bar_id) AS
(
SELECT 1, 1
UNION ALL
SELECT 2, 1
UNION ALL
SELECT 3, 2
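The preview ends partway through the `foo` CTE, so the `bar` table and the filtered queries are not visible. The sketch below reproduces the question with Python's built-in `sqlite3` module. The contents of `bar` (ids 1 and 2) and the filter `bar.id = 1` are assumptions chosen for illustration, and `foo` is limited to the three rows shown.

```python
import sqlite3

CTES = """
WITH foo (id, bar_id) AS
(
SELECT 1, 1
UNION ALL
SELECT 2, 1
UNION ALL
SELECT 3, 2
),
bar (id) AS  -- hypothetical table, not shown in the preview
(
SELECT 1
UNION ALL
SELECT 2
)
"""

# Filter in ON: unmatched foo rows are kept and padded with NULL.
FILTER_IN_ON = CTES + """
SELECT foo.id, bar.id
FROM foo
LEFT JOIN bar ON foo.bar_id = bar.id AND bar.id = 1
ORDER BY foo.id
"""

# Filter in WHERE: applied after the join, so NULL-padded rows are dropped.
FILTER_IN_WHERE = CTES + """
SELECT foo.id, bar.id
FROM foo
LEFT JOIN bar ON foo.bar_id = bar.id
WHERE bar.id = 1
ORDER BY foo.id
"""

conn = sqlite3.connect(":memory:")
on_rows = conn.execute(FILTER_IN_ON).fetchall()
where_rows = conn.execute(FILTER_IN_WHERE).fetchall()
conn.close()

if __name__ == "__main__":
    print(on_rows)     # [(1, 1), (2, 1), (3, None)]
    print(where_rows)  # [(1, 1), (2, 1)]
```

With the filter in ON, the row `foo.id = 3` survives with a NULL on the `bar` side; with the filter in WHERE, that row is removed, so the LEFT JOIN behaves like an inner join.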
@reata
reata / mix_augment_assignment_with_conditional_expression.py
Created October 3, 2021 06:02
Is a Python conditional expression equivalent to an if/else block?
#!/usr/bin/python3.6
import ast
from unittest import TestCase
def func1(value):
data = set()
if isinstance(value, set):
data |= value
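The preview shows only the first branch of `func1`. A sketch of the pitfall the file name hints at follows. The `else` branch (`data.add(value)`) and the one-line variant are assumptions, not the author's code. The point is that a conditional expression on the right of `|=` is parsed as the operand of the augmented assignment, not as two alternative statements.

```python
import ast


def with_block(value):
    data = set()
    if isinstance(value, set):
        data |= value
    else:
        data.add(value)  # assumed else branch, not visible in the preview
    return data


def with_expression(value):
    data = set()
    # Parsed as: data |= (value if isinstance(value, set) else data.add(value))
    data |= value if isinstance(value, set) else data.add(value)
    return data


# The AST confirms the grouping: an AugAssign whose value is an IfExp.
stmt = ast.parse("data |= a if cond else b").body[0]
is_aug_assign = isinstance(stmt, ast.AugAssign)
rhs_is_if_exp = isinstance(stmt.value, ast.IfExp)
```

For a set argument both functions return the same result. For a non-set argument, `with_expression` evaluates `data.add(value)`, which returns `None`, and then `data |= None` raises `TypeError`.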
@reata
reata / is_mysql_query_io_intensive_or_cpu_intensive.py
Last active November 11, 2019 14:06
Is querying MySQL from Python a CPU-bound or IO-bound task, and can multithreading speed it up?
import socket
import time
from multiprocessing import Pool as ProcessPool
from multiprocessing.dummy import Pool as ThreadPool
import pymysql
import psutil
class Timer:
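The preview ends at the `Timer` class, so the actual benchmark is not visible. The sketch below shows only the general shape of such a comparison and needs no MySQL server: `fake_query` is a hypothetical stand-in that uses `time.sleep` to simulate waiting on the network. It says nothing about how a real `pymysql` query splits its time between waiting and CPU work.

```python
import time
from multiprocessing.dummy import Pool as ThreadPool


def fake_query(query_id, wait_seconds=0.1):
    """Hypothetical stand-in for a query: blocks like a network wait."""
    time.sleep(wait_seconds)  # sleeping releases the GIL, as blocking IO does
    return query_id


def run_serial(query_ids):
    """Run the simulated queries one after another; return results and time."""
    start = time.perf_counter()
    results = [fake_query(q) for q in query_ids]
    return results, time.perf_counter() - start


def run_threaded(query_ids, workers=4):
    """Run the simulated queries on a thread pool; return results and time."""
    start = time.perf_counter()
    with ThreadPool(workers) as pool:
        results = pool.map(fake_query, query_ids)
    return results, time.perf_counter() - start


if __name__ == "__main__":
    print(run_serial([1, 2, 3, 4]))
    print(run_threaded([1, 2, 3, 4]))
```

If a workload spends most of its time blocked like this, threads can overlap the waits. If it is dominated by Python-level CPU work, the GIL limits what threads can gain, which is where a process pool comes in.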