Skip to content

Instantly share code, notes, and snippets.

@mengxinayan
Last active March 27, 2022 12:35
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save mengxinayan/74ed2f5908e95a5b42fff8b1883b71bf to your computer and use it in GitHub Desktop.
Save mengxinayan/74ed2f5908e95a5b42fff8b1883b71bf to your computer and use it in GitHub Desktop.
Medium-temp

Algorithm

This week's LeetCode problem is 380. Insert Delete GetRandom O(1)

Implement the RandomizedSet class:

  • RandomizedSet() Initializes the RandomizedSet object.
  • bool insert(int val) Inserts an item val into the set if not present. Returns true if the item was not present, false otherwise.
  • bool remove(int val) Removes an item val from the set if present. Returns true if the item was present, false otherwise.
  • int getRandom() Returns a random element from the current set of elements (it's guaranteed that at least one element exists when this method is called). Each element must have the same probability of being returned.

You must implement the functions of the class such that each function works in average O(1) time complexity.

Input
["RandomizedSet", "insert", "remove", "insert", "getRandom", "remove", "insert", "getRandom"]
[[], [1], [2], [2], [], [1], [2], []]
Output
[null, true, false, true, 2, true, false, 2]

Explanation
RandomizedSet randomizedSet = new RandomizedSet();
randomizedSet.insert(1); // Inserts 1 to the set. Returns true as 1 was inserted successfully.
randomizedSet.remove(2); // Returns false as 2 does not exist in the set.
randomizedSet.insert(2); // Inserts 2 to the set, returns true. Set now contains [1,2].
randomizedSet.getRandom(); // getRandom() should return either 1 or 2 randomly.
randomizedSet.remove(1); // Removes 1 from the set, returns true. Set now contains [2].
randomizedSet.insert(2); // 2 was already in the set, so return false.
randomizedSet.getRandom(); // Since 2 is the only number in the set, getRandom() will always return 2.

In order to keep the average time complexity at O(1), a hash table is needed to store the mapping between values ​​and indexes. At the same time, an array is used as the data structure of the actual saved data, and there is a random number class to facilitate the generation of random numbers. The implementation methods of each method are described below:

  • Initialization method RandomizedSet(): Initialize the above-mentioned array, hash table and random number classes respectively
  • Insert method bool insert(int val): first determine whether val already exists, if so, return false, if not, insert val into the end of the array, and update the hash table map at the same time .put(val, list.size()-1);, finally return true
  • Deletion method bool remove(int val): first determine whether val already exists, if not, return false, if it exists, first obtain val through the hash table and index the index in the array, because it requires time The complexity is O(1), then the deletion method of the array cannot be used, but the value at the index position of the array is changed to the value of the last element of the array list.set(index, lastVal);, and updated at the same time. The mapping relationship in the hash table map.put(lastVal, index);, and then remove the last element of the array (this operation is O(1) complexity) and the mapping of val in the hash table, Finally returns true
  • Randomly returns an item int getRandom(): use the random class to randomly generate a number within the index range of the array, and return the number corresponding to the subscript.
class RandomizedSet {

    private List<Integer> list;
    private Map<Integer, Integer> map;
    Random rand;

    public RandomizedSet() {
        list = new ArrayList<>();
        map = new HashMap<>();
        rand = new Random();
    }
    
    public boolean insert(int val) {
        if (map.containsKey(val) == true) {
            return false;
        } else {
            list.add(val);
            map.put(val, list.size()-1);
            return true;
        }
    }
    
    public boolean remove(int val) {
        if (map.containsKey(val) == false) {
            return false;
        } else {
            int index = map.get(val);
            int lastVal = list.get(list.size()-1);
            list.set(index, lastVal);
            map.put(lastVal, index);
            list.remove(list.size()-1);
            map.remove(val);
            return true;
        }
    }
    
    public int getRandom() {
        return list.get(rand.nextInt(list.size()));
    }
}

/**
 * Your RandomizedSet object will be instantiated and called as such:
 * RandomizedSet obj = new RandomizedSet();
 * boolean param_1 = obj.insert(val);
 * boolean param_2 = obj.remove(val);
 * int param_3 = obj.getRandom();
 */

Review

This week's Review is for the following article: How slow is SELECT * ?

The most well-known query optimization rule is to avoid SELECT * as much as possible, and even if all columns are required, they should be listed by name. On the surface it looks like SELECT * will read unnecessary columns, but the deeper reasons are:

  • These columns are read from memory or, in the worst case, from disk. This means higher resource usage and higher latency.
  • If they’re read form disk they may be cached, if they’re cached they’ll stay in the cache for a longer time. This may be a waste of memory.
  • Columns will be sent over a network. This causes more work both on the database server and the application server.

In some cases SELECT * will seriously affect overall application performance:

  1. Columnar databases. many databases are columnar databases, which store different columns in different data structures, so that SUM(column) is faster, while SELECT * will be slower.
  2. Covering indexes. Comparing SELECT id, surname FROM user WHERE surname < 'c'; and SELECT * FROM user WHERE surname < 'c';, if there is an index on id, the former can be quickly read through the index The corresponding content in the line is compared, but the latter does not:
  3. Strength is in numbers. when there are too many columns in a table, the data in all columns cannot be read only by id
  4. Generate column/Virtual column. When the SQL statement is executed, it will calculate and generate virtual columns. In some cases, it will be written to disk for storage. In some cases, it will be calculated immediately. When it is calculated, it will bring greater overhead.
  5. Views. Views are built on JOIN operations, selecting only required columns may ignore some unnecessary tables, making queries faster
  6. InnoDB TEXTs and BLOBs. In InnoDB (the default storage engine for MariaDB and MySQL), large variable-length text and binary columns are stored in separate memory pages. When long data is stored in separate pages, reading it requires at least one physical read. This affects performance.

Finally, SELECT * is not a good practice, but its impact on query performance and server resource usage is often overstated. Usually other aspects are more important, namely using indexes and avoiding the use of internal temporary tables for two-step sorts.

Tip

I have just started working with shell scripts recently, and the multi-line comments in shell scripts are written:

: '
This is a
comment
in shell script
'

Share

This week is the second week of quarantine, and it is expected to continue to be quarantined in the next few weeks. At present, I have basically adapted to such a life. I plan to take a break and stop writing for a while, to organize and think about everything again, and hope you to know.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment