Intorduction to Quick Select Algorithm
07/27/2022 Tags: Algorithms Python“In computer science, quickselect is a selection algorithm to find the kth smallest element in an unordered list. It is related to the quicksort sorting algorithm. Like quicksort, it was developed by Tony Hoare, and thus is also known as Hoare’s selection algorithm. Like quicksort, it is efficient in practice and has good average-case performance, but has poor worst-case performance. Quickselect and its variants are the selection algorithms most often used in efficient real-world implementations.” from Wiki page.
Brief
The quick selection is a algorithm that can efficiently find the k-th smallest/largest element in an unordered list. It selects a pivot, partitions the array into two parts, and repeats this process recursively. It’s alternatively called divide-and-conquer algorithm. The algorithm has good average performance, but is sensitive to the pivot that is chosen. If good pivots are chosen, meaning ones that consistently decrease the search set by a given fraction, so the time complexity is almost certain O(n). However, if bad pivots are consistently chosen, such as decreasing by only a single element each time, so the worst-case time complexity will be O(n^2). It’s an extermely powerful algorithm.
In this post, I would like to briefly discuss about several practices for the applications of quick select algorithm, such as Find Kth Largest Element in an Array and Top K Frequent Elements.
The General Process
Quickselect uses the same overall approach as quicksort, choosing one element as a pivot and partitioning the data in two based on the pivot, accordingly as less than or greater the pivot. However, instead of recursing into both sides, as in quicksort, quickselect only recurses into one side - the side with the element it is searching for. Here is the general process:
- Pick a pivot point
- Move all elements that are larger than or equal to the pivot to the left and vice versa.
- If the position of pivot is equal to k, then we can find the kth largest value at pivot point
- If not, perform quick select on the left partition or right partition depending on the kth largest element in the position after/before k
Exercises
Find Kth Largest Element in an Array
Given an integer array nums and an integer k, return the kth largest element in the array. Note that it is the kth largest element in the sorted order, not the kth distinct element. You must solve it in O(n) time complexity.
Example: Input: nums = [3,2,1,5,6,4], k = 2. Output: 5
findKthLargest.py
def findKthLargest(self, nums: List[int], k: int) -> int:
# time:O(N) space:O(1)
start, end = 0, len(nums) - 1
def swap(i, j):
nums[i], nums[j] = nums[j], nums[i]
def find_pivot_idx(start, end):
pivot_idx = start + (end - start) // 2
swap(pivot_idx, end)
pivot_val = nums[end]
# assign end of larger index
end_of_larger_idx = start
# Move all elements that are larger than or equal to the pivot to the left and vice versa
for i in range(start, end):
if nums[i] >= pivot_val:
swap(i, end_of_larger_idx)
end_of_larger_idx += 1
swap(end, end_of_larger_idx)
return end_of_larger_idx
def kth_largest(start, end, k):
# Pick a pivot point
pivot_idx = find_pivot_idx(start, end)
if pivot_idx - start + 1 == k:
return nums[pivot_idx]
elif pivot_idx - start + 1 < k:
# quick selection on the right partition
return kth_largest(pivot_idx + 1, end, k - (pivot_idx - start + 1))
else:
# quick selection on the left partition
return kth_largest(start, pivot_idx - 1, k)
return kth_largest(start, end, k)
Description: The solution was followed through the quick select algorithm. The basic functions for caculating the Kth largest element in an array are similar as the general process.
Top K Frequent Elements
Given an integer array nums and an integer k, return the k most frequent elements. You may return the answer in any order
Example: Input: nums = [1,1,1,2,2,3], k = 2, Output: [1,2]
topKFrequent.py
def topKFrequent(self, nums: List[int], k: int) -> List[int]:
# time:O(N) space:O(N)
ct = Counter(nums)
keys = list(ct.keys())
start, end = 0, len(keys) - 1
def swap(i, j):
keys[i], keys[j] = keys[j], keys[i]
def find_pivot_idx(start, end):
pivot_index = start + (end - start) // 2
swap(pivot_index, end)
pivot_val = ct[keys[end]]
end_of_larger_idx = start
# Move all elements that are larger than or equal to the pivot to the left and vice versa
for i in range(start, end):
if ct[keys[i]] >= pivot_val:
swap(i, end_of_larger_idx)
end_of_larger_idx += 1
swap(end, end_of_larger_idx)
return end_of_larger_idx
def topk_select(start, end, k):
# Pick a pivot point
pivot_index = find_pivot_idx(start, end)
if pivot_index - start + 1 == k:
return keys[:pivot_index+1]
elif pivot_index - start + 1 < k:
# quick selection on the right partition
return topk_select(pivot_index + 1, end, k - (pivot_index - start + 1))
else:
# quick selection on the left partition
return topk_select(start, pivot_index - 1, k)
return topk_select(start, end, k)
Description: The solution was followed through the quick select algorithm. The basic functions for caculating the top k frequent elements in an array are similar as the general process.
Reference
Thanks for reading! Feel free to leave the comments below or email to me. Any pieces of advice or discussions are always welcome. :)