Has Not Appeared
 0/6

Wavelet Tree

Authors: Benjamin Qi, Omri Jerbi

Wavelet trees are data structures that support efficient queries for the k-th minimum element in a range by maintaining a segment tree over values instead of indices"

Wavelet Tree

Wavelet trees are data structures that support efficient queries for the k-th minimum element in a range by maintaining a segment tree over values instead of indices.

Focus Problem – try your best to solve this problem before continuing!

Resources
IOI

Introduces Wavelet Tree

CF

Link in blog post is broken, check my comment.

Suppose you want to support the following queries:

  • Given a range ll, rr count the number of occurrences of value xx.
  • Given a range ll, rr find the kk smallest element

With wavelet tree, you can easily support those queries in log(M)log(M) time. when M is the maximum value in the array.

Wavelet tree structure

A wavelet tree is a binary tree where each node represents a range of values. The root node covers the entire range, and each subsequent level splits the range into two halves.

We are going to maintain a segment tree over values instead of indices. Each segment will contain indices whose values lie within the segment's range. We'll save those indices in a vector. Notice that an index can be in at most log(M)log(M) segments

Wavelet tree Visualization

Solving the first type of query

Given a range l, r count the number of occurrences of value x.

To calculate the number of occurrences from 𝑙𝑙 to 𝑟𝑟, we can use the following formula:

occurrences(l,r)=occurrences(r)occurrences(l)\begin{aligned} \texttt{occurrences}(l, r) = \texttt{occurrences}(r) - \texttt{occurrences}(l) \end{aligned}

This reduces the problem to counting the number of occurrences in a prefix.

One way to solve the problem is to go to the leaf node and perform a binary search for the number of indices less than 𝑟𝑟 However, let's explore a different approach that can also be extended to the second type of query.

A different way to find the Index of 𝑟𝑟 in the list of vertices

Instead of binary searching on the leaf, we update 𝑟𝑟 as we recurse down the tree. If we can determine the position (index) of 𝑟𝑟 in the left and right children of a node. We can recurse down the tree and determine its position in the leaf node.

To find the position of 𝑟𝑟 in a node's left and right children, we need to determine how many indices are smaller than the middle value (mid) and precede 𝑟𝑟. This can be done using a prefix sum.

Let's define:

  • c[i]c[i] = as 1 if index[i]index[i] is smaller than mid otherwise 0
  • prefixB[i]prefixB[i] as prefix sum of c[i]c[i]

Formally

c[i]=index[i]<mid?1:0;prefixB[i]=prefixB[i1]+c[i]c[i] = index[i] < mid ? 1 : 0; prefixB[i] = prefixB[i - 1] + c[i]

To update rr as we recurse down, we do the following:

  • To know the value of 𝑟 if we recurse left, we use prefixB[r]
  • If we recurse right, we use 𝑟 - prefixB[r]

Solving the second type of query

Given a range l, r find the k smallest element

We will determine whether the answer for a given node is in the left or the right segment. We can calculate how many times the elements within the segments' ranges appear in our range (l,r)(l, r) using our first type of query. Note that this also works for non-leaf nodes using the following formula:

occurrences(l,r)=rl\texttt{occurrences}(l, r) = r - l

Simular

This is similar to counting how many times a value appears up to index 𝑅 in our previous query. We did this by using the new 𝑅 value at the leaf node. But now, we consider the difference between the updated 𝑅 and 𝐿

Therefore, the occurrences of the left node is

left_occurrences=prefixB[r]prefixB[l]\texttt{left\_occurrences} = prefixB[r] - prefixB[l]

Left occurrences

Note that left_occurrences\texttt{left\_occurrences} is the number of indices between l and r whose value is less than mid

  • If left_occurrences\texttt{left\_occurrences} is greater or equal to kk, it means the kk-th smallest element is in the left subtree. Therefore, we update our range and recurse into the left child
  • If left_occurrences\texttt{left\_occurrences} is less than kk, it means the kk-th smallest element is in the right subtree. We adjust k by subtracting left_occurrences\texttt{left\_occurrences} from kk, update our range, and recurse into the right child

Notice

Notice we still update l,rl, r accordingly when we go left or right

the answer then will be the value of the node we end up on (leaf)

Implemention

Time Complexity: O(Qlog(M))\mathcal{O}(Q * log(M))

C++

#include <bits/stdc++.h>
using namespace std;
struct Segment {
Segment *left = nullptr, *right = nullptr;
int l, r, mid;
bool children = false;
vector<pair<int, int>> indices; // Index, Value
vector<int> prefixB;

Supporting updates

Let's support updates of type:

  • change value at index ii to yy

We can traverse down to the leaf to remove the old element and also traverse down to add the new element.

So what do the updates change?
Our indices vector
Our prefix vector

To change the indices vector, what we can do is, instead of storing a vector, use a set. Then erasing and adding values becomes easy.

On the other hand, To change the prefix vector, since each update could change our prefix vector a lot, we can't maintain just the normal vector. What we could do is use a sparse segment tree.

  • erasing and inserting can be done by just setting the value to 0 or 1 at the specific index
  • querying for a prefix can be done by querying the segment tree from 0 to ii

This approach is not memory efficient and requires a segment tree's implementation. A more friendly approach would be using an order statistics tree. Such that querying for a prefix would be equivalent to order_of_key(ii)

Problems

StatusSourceProblem NameDifficultyTags
CFNormal
Show TagsWavelet
SPOJNormal
Show TagsWavelet
COCINormal
Show TagsWavelet, Persistent Segtree
KattisVery Hard
Show TagsWavelet
GlobeX CupVery Hard
Show TagsWavelet

Module Progress:

Join the USACO Forum!

Stuck on a problem, or don't understand a module? Join the USACO Forum and get help from other competitive programmers!