Two Pointers Technique & Binary Search For Beginners
Sample Problem: Electronics Shop ( HackerRank )
(Find me on HackerRank @ Handle: BishalG )
Problem Summary:
Solution Idea:
(Find me on HackerRank @ Handle: BishalG )
Problem Summary:
In this problem, you are given two arrays of positive integer numbers of lengths n and m respectively and another positive integer b. Your task is to find out the maximum integer formed by the summation of two integer values taken from different arrays and is less than or equal to b. If there does not exist such summation print -1.
Solution Idea:
Since the constraints for n and m are small ( i.e. 1 <= n, m <= 1000 ) , so this problem is very much suitable to test different variants of our solution approaches.
Here, we will discuss about 3 different approaches to solve this problem progressively starting from least efficient solution to most efficient one.
Approach - 1: Brute-force
If we think in brute-force manner, the naive approach to solve this problem will be to try every possible pairing among these two arrays. For this, we can just iterate over first array and from each indices of first array ( say , i )access over the indices in another array (say , j ) and perform summation of values in these indices. If summation of values in these indices is found to be less than or equal to b, this could be a potential solution. So, overall answer of the problem is maximum among such summation.
The sample code for above approach can be something like this:
int getMoneySpent(vector<int> keyboards, vector<int> drives, int b) { int n = keyboards.size(); int m = drives.size(); int ans = -1; for (int i = 0; i < n; ++i) { for (int j = 0; j < m; ++j) { if ((keyboards[i] + drives[j]) <= b) { ans = max(ans, keyboards[i] + drives[j]); } } } return ans; }
In above solution, for total n indices in first array, we are iterating over m indices in second array and performing summation and comparisons, this will lead to O ( n * m ) time complexity.
Lets think about some better approaches.
Approach - 2: Binary Search
If we look at carefully in the code of approach - 1, from each indices - i in first for loop, we are using second for loop to find out a maximum possible value which is less than or equal to 'b - keyboards[i]'. Basically, from each indices in first array we are searching for lower-bound of some value in second array.
So, if second array is sorted in ascending order, can we search for such value in less than O ( m ) time complexity ?
Yes, if second array is sorted in ascending order For every ith index in first array, summation of values at ith index of first array and the values at the beginning of the second array are likely to be less than equal to b. This property will be valid up-to some index in second array then summation are likely to be greater than b after some index. So, we can observe monotonic property over second array. That means, from each index in second array, we can easily decide whether we have to choose left range or right range to find the solution.
So, we can apply binary search algorithm over second array to find appropriate summation. While performing binary search:
taking log base 2 on both sides,
log2 m = log2 (2^k)
log2 m = k log2(2)
log2 m = k * 1 [ since, log2(2) is 1 ]
Therefore, we need k = log2 (m) steps to find out potential solution using binary search.
Hence, we reduces our solution's time complexity to max( O(m log m), O(n * log m ) ).
Since, we sorted second array at the beginning, we require O(m log m) complexity to perform sorting and O ( n * log m ) complexity for finding out solution using binary search.
If we have provided the constraint that both arrays are already sorted, then complexity of our solution will be reduced to: O( n * log m ).
If we have provided constraint that input arrays are already sorted then we can take huge benefit to solve this problem. In addition, lets assume that both arrays are sorted as well as one of the array is very huge in size in comparison to another say, n << m. Say for example, n = 10 and m = 10 ^ 6.
Then, if we have to solve this problem, we ended up performing O( n * m) ᐀ 10^7 operations in our 1st approach whereas, we require only O( n * log m) ᐀ 10 * 20 = 200 operations in our 2nd approach which is far more efficient.
The sample code for above approach can be something like this:
Approach - 3: Two Pointers
We will only be benefited from approach - 3 as comparison to approach -2 assuming that the size of both arrays are almost same i.e n = m.
In approach -2, we are performing logm steps for searching from each of n indices in first array. Do we really need to perform binary search in each step or can we do search potential solution from each index in better time ?
Lets assume that, we are looking for a potential solution for index - i and after performing binary search we find potential index - k in second array. Then, we can observe that:
It is because as we move to the right side of index -i, value will also increases and hence, pair summation will also be definitely greater than that of before so there is no possibility of finding solution to the right side of index - k. But, we can find out potential solution on any index which is less than or equal to index - k.
So, we can keep on moving indices of first array from left to right and indices of second array from right to left based upon pair summation. For every indices in first array, we will move index from second array towards the left side until we find a potential summation which is smaller than or equal to b. If we find such index , we know that this is the maximum possible potential summation so moving index of further towards left will not be beneficial, instead we will move index from first array towards right side and search for potential solution towards the remaining range in second array. Eventually, we will find out our answer which is the maximum among all such potential solution.
Since, we access total n + m indices in overall operations , our time complexity is reduced to O ( n + m ) for searching part. Since, we have overhead of sorting given array. So, our complexity of solution for sample problem is max( O(m log m + n log n), O(n + m)).
If we have provided the constraint that both arrays are already sorted, then complexity of our solution will be reduced to: O ( n + m ).
The sample code for above approach can be something like this:
Conclusion:
We saw three different approaches to solve given problem. Also, we discussed about time complexity by changing given constraint of the problem and assuming input array are sorted.
If we want to solve similar problem in some real life application, its better to choose among approach 2 and approach 3. We can choose one of these approaches based upon size of input data and whether they are sorted or not. If input data are sorted and size of one input array is very less as compared to second then binary search approach will be a good choice and in other cases two pointer based approach will be a good choice. And, if we have very small length of arrays then we can still choose approach -1.
Lets think about some better approaches.
Approach - 2: Binary Search
If we look at carefully in the code of approach - 1, from each indices - i in first for loop, we are using second for loop to find out a maximum possible value which is less than or equal to 'b - keyboards[i]'. Basically, from each indices in first array we are searching for lower-bound of some value in second array.
So, if second array is sorted in ascending order, can we search for such value in less than O ( m ) time complexity ?
Yes, if second array is sorted in ascending order For every ith index in first array, summation of values at ith index of first array and the values at the beginning of the second array are likely to be less than equal to b. This property will be valid up-to some index in second array then summation are likely to be greater than b after some index. So, we can observe monotonic property over second array. That means, from each index in second array, we can easily decide whether we have to choose left range or right range to find the solution.
So, we can apply binary search algorithm over second array to find appropriate summation. While performing binary search:
- we first define our search range say lower endpoint of range as lo and higher endpoint of range as hi.
- If find out the middle point of current range [lo, hi] as md = (lo + hi) / 2.
- If summation of values at ith index in first array and md index in second array is greater than b, then we will never find solution in upper half i.e [md, hi]. So, we change our search range as [lo, md - 1] by changing hi as md - 1.
- In other case, we find out our one potential solution but as we have to find the maximum possible value which is less than or equal to b, we will change our search range as [md + 1, hi].
- In 1st step, we reduce search range by m / 2.
- In 2nd step, we reduce search range by (m / 2) / 2 = m / 4 = m / 2^2.
- Lets assume, in Kth step, we reduce search range by : m / 2^k = 1.
taking log base 2 on both sides,
log2 m = log2 (2^k)
log2 m = k log2(2)
log2 m = k * 1 [ since, log2(2) is 1 ]
Therefore, we need k = log2 (m) steps to find out potential solution using binary search.
Hence, we reduces our solution's time complexity to max( O(m log m), O(n * log m ) ).
Since, we sorted second array at the beginning, we require O(m log m) complexity to perform sorting and O ( n * log m ) complexity for finding out solution using binary search.
If we have provided the constraint that both arrays are already sorted, then complexity of our solution will be reduced to: O( n * log m ).
If we have provided constraint that input arrays are already sorted then we can take huge benefit to solve this problem. In addition, lets assume that both arrays are sorted as well as one of the array is very huge in size in comparison to another say, n << m. Say for example, n = 10 and m = 10 ^ 6.
Then, if we have to solve this problem, we ended up performing O( n * m) ᐀ 10^7 operations in our 1st approach whereas, we require only O( n * log m) ᐀ 10 * 20 = 200 operations in our 2nd approach which is far more efficient.
The sample code for above approach can be something like this:
int getMoneySpent(vector<int> keyboards, vector<int> drives, int b) { sort(drives.begin(), drives.end()); int n = keyboards.size(); int m = drives.size(); int ans = -1; for (int i = 0; i < n; ++i) { int lo = 0, hi = m - 1, ret = -1; while (lo <= hi) { int md = (lo + hi) / 2; if ((keyboards[i] + drives[md]) > b) { hi = md - 1; } else { ret = md; lo = md + 1; } } if (ret != -1) { ans = max(ans, keyboards[i] + drives[ret]); } } return ans; }
Approach - 3: Two Pointers
We will only be benefited from approach - 3 as comparison to approach -2 assuming that the size of both arrays are almost same i.e n = m.
In approach -2, we are performing logm steps for searching from each of n indices in first array. Do we really need to perform binary search in each step or can we do search potential solution from each index in better time ?
Lets assume that, we are looking for a potential solution for index - i and after performing binary search we find potential index - k in second array. Then, we can observe that:
- Summation of value at index - i in first array and summation of any value at index which is less than or equal to index - k in second array will always be less than or equal to b.
- And, summation of value at index - i in first array and summation of any value at index which is greater than index - k in second array will always be greater than b.
It is because as we move to the right side of index -i, value will also increases and hence, pair summation will also be definitely greater than that of before so there is no possibility of finding solution to the right side of index - k. But, we can find out potential solution on any index which is less than or equal to index - k.
So, we can keep on moving indices of first array from left to right and indices of second array from right to left based upon pair summation. For every indices in first array, we will move index from second array towards the left side until we find a potential summation which is smaller than or equal to b. If we find such index , we know that this is the maximum possible potential summation so moving index of further towards left will not be beneficial, instead we will move index from first array towards right side and search for potential solution towards the remaining range in second array. Eventually, we will find out our answer which is the maximum among all such potential solution.
Since, we access total n + m indices in overall operations , our time complexity is reduced to O ( n + m ) for searching part. Since, we have overhead of sorting given array. So, our complexity of solution for sample problem is max( O(m log m + n log n), O(n + m)).
If we have provided the constraint that both arrays are already sorted, then complexity of our solution will be reduced to: O ( n + m ).
The sample code for above approach can be something like this:
int getMoneySpent(vector<int> keyboards, vector<int> drives, int b) { sort(keyboards.begin(), keyboards.end()); sort(drives.begin(), drives.end()); int n = keyboards.size(); int m = drives.size(); if ((keyboards[0] + drives[0]) > b) return -1; int ans = -1; int driveIndex = m - 1; int keyboardIndex = 0; while (true) { if (keyboardIndex == n || driveIndex == -1)break; if ((keyboards[keyboardIndex] + drives[driveIndex]) > b) { driveIndex--; } else { ans = max(ans, keyboards[keyboardIndex] + drives[driveIndex]); keyboardIndex++; } } return ans; }
Conclusion:
We saw three different approaches to solve given problem. Also, we discussed about time complexity by changing given constraint of the problem and assuming input array are sorted.
If we want to solve similar problem in some real life application, its better to choose among approach 2 and approach 3. We can choose one of these approaches based upon size of input data and whether they are sorted or not. If input data are sorted and size of one input array is very less as compared to second then binary search approach will be a good choice and in other cases two pointer based approach will be a good choice. And, if we have very small length of arrays then we can still choose approach -1.
Similar Problems:
We can apply the idea of either binary search or two pointers to solve following problems:
1.) SPOJ: Hotels
It's very nice and well written.Thank you very much vhaiya.
ReplyDeleteThis comment has been removed by the author.
ReplyDeleteThanks for the compliment @wetret
ReplyDelete