There are two sorted arrays nums1 and nums2 of size m and n respectively.
Find the median of the two sorted arrays. The overall run time complexity should be O(log (m+n)).
Example 1:
nums1 = [1, 3] nums2 = [2] The median is 2.0
Example 2:
nums1 = [1, 2] nums2 = [3, 4] The median is (2 + 3)/2 = 2.5
To solve this problem, we need to understand "What is the use of median". In statistics, the median is used for:
Dividing a set into two equal length subsets, that one subset is always greater than the other.
If we understand the use of median for dividing, we are very close to the answer.
First let's cut into two parts at a random position :
left_A | right_A A[0], A[1], ..., A[i-1] | A[i], A[i+1], ..., A[m-1]
Since has elements, so there are kinds of cutting ().
And we know:
.
Note: when , is empty, and when , is empty.
With the same way, cut into two parts at a random position :
left_B | right_B B[0], B[1], ..., B[j-1] | B[j], B[j+1], ..., B[n-1]
Put and into one set, and put and into another set. Let's name them and :
left_part | right_part A[0], A[1], ..., A[i-1] | A[i], A[i+1], ..., A[m-1] B[0], B[1], ..., B[j-1] | B[j], B[j+1], ..., B[n-1]
If we can ensure:
then we divide all elements in into two parts with equal length, and one part is always greater than the other. Then
To ensure these two conditions, we just need to ensure:
(or: )
if , we just need to set:and
ps.1 For simplicity, I presume are always valid even if , , , or . I will talk about how to deal with these edge values at last.
ps.2 Why ? Because I have to make sure is non-negative since and . If , then may be negative, that will lead to wrong result.
So, all we need to do is:
Searching in , to find an object such that:
and where
And we can do a binary search following steps described below:
Now we have . And there are only 3 situations that we may encounter:
and
Means we have found the object , so stop searching.
Means is too small. We must adjust to get .
Can we increase ?
Yes. Because when is increased, will be decreased.
So is decreased and is increased, and may
be satisfied.
Can we decrease ?
No! Because when is decreased, will be increased.
So is increased and is decreased, and will
be never satisfied.
So we must increase . That is, we must adjust the searching range to .
So, set , and goto 2.
:
Means is too big. And we must decrease to get .
That is, we must adjust the searching range to .
So, set , and goto 2.
When the object is found, the median is:
when is odd
when is even
Now let's consider the edges values where may not exist. Actually this situation is easier than you think.
What we need to do is ensuring that . So, if and are not edges values (means all exist), then we must check both and . But if some of don't exist, then we don't need to check one (or both) of these two conditions. For example, if , then doesn't exist, then we don't need to check . So, what we need to do is:
Searching in , to find an object such that:
or or and
or or where
And in a searching loop, we will encounter only three situations:
- or or and
or or
Means is perfect, we can stop searching.- and and
Means is too small, we must increase it.- and and
Means is too big, we must decrease it.
Thanks to @Quentin.chen for pointing out that: and . Because:
So in situation 2. and 3. , we don't need to check whether and whether .
Java
class Solution { public double findMedianSortedArrays(int[] A, int[] B) { int m = A.length; int n = B.length; if (m > n) { // to ensure m<=n int[] temp = A; A = B; B = temp; int tmp = m; m = n; n = tmp; } int iMin = 0, iMax = m, halfLen = (m + n + 1) / 2; while (iMin <= iMax) { int i = (iMin + iMax) / 2; int j = halfLen - i; if (i < iMax && B[j-1] > A[i]){ iMin = iMin + 1; // i is too small } else if (i > iMin && A[i-1] > B[j]) { iMax = iMax - 1; // i is too big } else { // i is perfect int maxLeft = 0; if (i == 0) { maxLeft = B[j-1]; } else if (j == 0) { maxLeft = A[i-1]; } else { maxLeft = Math.max(A[i-1], B[j-1]); } if ( (m + n) % 2 == 1 ) { return maxLeft; } int minRight = 0; if (i == m) { minRight = B[j]; } else if (j == n) { minRight = A[i]; } else { minRight = Math.min(B[j], A[i]); } return (maxLeft + minRight) / 2.0; } } return 0.0; } }
Python
def median(A, B): m, n = len(A), len(B) if m > n: A, B, m, n = B, A, n, m if n == 0: raise ValueError imin, imax, half_len = 0, m, (m + n + 1) / 2 while imin <= imax: i = (imin + imax) / 2 j = half_len - i if i < m and B[j-1] > A[i]: # i is too small, must increase it imin = i + 1 elif i > 0 and A[i-1] > B[j]: # i is too big, must decrease it imax = i - 1 else: # i is perfect if i == 0: max_of_left = B[j-1] elif j == 0: max_of_left = A[i-1] else: max_of_left = max(A[i-1], B[j-1]) if (m + n) % 2 == 1: return max_of_left if i == m: min_of_right = B[j] elif j == n: min_of_right = A[i] else: min_of_right = min(A[i], B[j]) return (max_of_left + min_of_right) / 2.0
Complexity Analysis
Time complexity: .
At first, the searching range is .
And the length of this searching range will be reduced by half after each loop.
So, we only need loops. Since we do constant operations in each loop, so the time complexity is .
Since , so the time complexity is .
Space complexity: .
We only need constant memory to store local variables, so the space complexity is .
Analysis written by: @MissMary