7. Find Median from Data Stream
Median is the middle value in an ordered integer list. If the size of the list is even, there is no middle value. So the median is the mean of the two middle value.For example,
[2,3,4]
, the median is 3
[2,3]
, the median is (2 + 3) / 2 = 2.5
Design a data structure that supports the following two operations:
void addNum(int num) - Add a integer number from the data stream to the data structure.
double findMedian() - Return the median of all elements so far.
Example:
addNum(1)
addNum(2)
findMedian() -> 1.5
addNum(3)
findMedian() -> 2
Solution: (Bruteforce)
Inserting into a vector. Sorting the array and finding the median Time Complexity: (n log n) for 1 query
Solution: (Insertion Sort)
Median can be accessed in O(1) time Inserting the num always in the sorted position (Finding the position and inserting) O(n)
class MedianFinder
{
public:
vector<int> v;
int findPos(vector<int> v,int num){
for(int i=0;i<v.size();i++){
if(num<v[i]){
return i;
}
}
return -1;
}
/** initialize your data structure here. */
MedianFinder()
{
}
void addNum(int num)
{
int x = findPos(v,num);
if(v.size() == 0 || x == -1){
v.push_back(num);
}
else{
v.insert(v.begin()+x,num);
}
}
double findMedian()
{
double med;
if (v.size() % 2 == 0)
{
int mid = v.size() / 2;
med = ((double)v[mid] + (double)v[mid - 1]) / 2;
}
else
{
int mid = v.size() / 2;
med = v[mid];
}
return med;
}
};
Time Complexity: O(n)
Solution: (Using Heap)
We can maintain two heap :
Two priority queues:
A max-heap
lo
to store the smaller half of the numbersA min-heap
hi
to store the larger half of the numbers
Insertion will take O(log n) .
The max-heap lo
is allowed to store, at worst, one more element more than the min-heap hi
. When the heaps are perfectly balanced, the median can be derived from the tops of both heaps. Median can be assessed in O(1) time.
Adding a number num
:
Add
num
to max-heaplo
. Sincelo
received a new element, we must do a balancing step forhi
. So remove the largest element fromlo
and offer it tohi
.The min-heap
hi
might end holding more elements than the max-heaplo
, after the previous operation. We fix that by removing the smallest element fromhi
and offering it tolo
.
Approach: [41, 35, 62, 5, 97, 108]
Adding number 41
MaxHeap lo: [41] // MaxHeap stores the largest value at the top (index 0)
MinHeap hi: [] // MinHeap stores the smallest value at the top (index 0)
Median is 41
=======================
Adding number 35
MaxHeap lo: [35]
MinHeap hi: [41]
Median is 38
=======================
Adding number 62
MaxHeap lo: [41, 35]
MinHeap hi: [62]
Median is 41
=======================
Adding number 4
MaxHeap lo: [35, 4]
MinHeap hi: [41, 62]
Median is 38
=======================
Adding number 97
MaxHeap lo: [41, 35, 4]
MinHeap hi: [62, 97]
Median is 41
=======================
Adding number 108
MaxHeap lo: [41, 35, 4]
MinHeap hi: [62, 97, 108]
Median is 51.5
class MedianFinder
{
public:
/** initialize your data structure here. */
priority_queue<int> maxh;
priority_queue<int, vector<int>, greater<int>> minh;
MedianFinder()
{
}
void addNum(int num)
{
maxh.push(num);
minh.push(maxh.top());
maxh.pop();
if (minh.size() > maxh.size())
{
maxh.push(minh.top());
minh.pop();
}
}
double findMedian()
{
if (maxh.size() > minh.size())
{
return maxh.top();
}
return ((double)(maxh.top()) + (double)(minh.top())) * 0.5;
}
};
Time Complexity: O(log n)
Last updated
Was this helpful?