DSA: Searching

ยท

7 min read

What is Searching?

If we need groceries for our homes, we go to the supermarket to get all the things in one place easily. The supermarket is organized into several sections, for example, cereals, pulses, ready-to-eat food, etc. So, we go to that particular place and check if the thing we are searching for is present or not. If it is not present there, we go to some other store because if the item we are searching for is not in its section then it is unlikely to be present anywhere in the store. We either get the item or it is likely to be unavailable then.

Searching in programming

In programming, we generally search for an item in a collection of data. We will be given some set of data that may or may not have been organized or sorted.

  • It is a searching technique in which we are given a list of data and we need to find if the item(X) is present in the list or not. We check for individual items if they match X. We linearly for the item till we find the item or till we reach the end of the list and could not find X in the list.

  • X can be present anywhere on the list. It can be the first element, the middle element, the last element or anywhere in the between. Until and unless we see every element we will not be able to check if X is present in the list or not.

  • Since we have to linearly check each element, it is called Linear Search.

  • Time complexity: O(N)

  • Space complexity: O(1)

      int find(vector<int> arr,int x){
          for(int i=0;i<arr.size();i++){
              if(x==arr[i]) return i; //Returning the index of x in array
          }
          return -1; //The array does not contain x
      }
    
  • Imagine searching for a particular page number of a book containing 500 pages. Let us say you are searching for page 180. What we generally do is open a page randomly in the middle and see if the page number matches our search value or not, so let's say we open to page 250 on the book. Now we know that 180 comes before 250 so we move to the left side of the book. Now, we eliminated the search space 250-500 and the search space is 0-250. We now open randomly between 0-250 and land on page 140, as we know 180 comes after 140 and before 250, now our search space is between 140-250. We randomly open a page in between them and keep eliminating search spaces and finally after a few steps, we land on page 180.

  • This is the concept behind Binary Search. Trying and dividing the list into two halves and eliminating a search space and considering the other. It is called the Divide and Conquer technique.

  • In binary search, we initialize low as 0 and high as arraySize-1 calculate the middle which is (low+high)/2 and check if the middle element is our searched element or not. If yes then we return it else we check whether the element is on the left side or the right side. If it is on the left side we move our high to mid-1(Search space: 0,mid-1) and if it is on the right side we move our low to mid+1(Search space: mid+1,arraySize-1).

  • The above algorithm works perfectly for a sorted array.

  • Time complexity: O(log N), we can see every time we divide the array into exactly two halves. This pattern denoted log N base 2 ~ log N.

  • Space complexity: O(1).

//We are given an array(arr[]) and its size(n) and search element(k) and we need to return the index at which k is present in arr[]. If we do not find it, we return -1.
int binarysearch(int arr[], int n, int k) {
        // code here
        int low=0;
        int high=n-1;
        while(low<=high){
            int mid=low+(high-low)/2;
            if(arr[mid]==k) return mid;
            else if(arr[mid]>k) high=mid-1;
            else low=mid+1;
        }
        return -1;
    } //Here k=search element, n=size of the array.
  • Binary search is not restricted to sorted arrays only, there are a lot of applications of binary search in programming. For example, finding the square root of a number to 6/7 decimal points. We can do it using binary search.


    Let us see how we can find the square root of a number using binary search. The code is after the explanation.

    • We are given N and we need to find the square of it. Example,

      • For N=25, ans=5

      • For N=27, ans=5

      • For N=35, ans=5

      • For N=36, ans=6

    • Let us say the given input is 25. What is the possible range of the square root of it? It will be from (1 to N) so the square root will lie in the range of 1 to 25.

    • We set our low to 1 and high to N. Now, the mid turn out to be (1+25)/2=13, we check if the mid is the square of the number or if is it a potential answer. We see that 13*13 > 25, we know that any number in the range 13 to 25 including 13 cannot be the square root of 25 so we excluded that search space and now we search in the 1-12 range.

    • The mid is now 6(12+1/2), we see that 6*6>25 and the answer cannot be in the range 6 to 12 and it has to be in the 1-5 range.

    • Now our search space is 1-5, and our mid is now 3((1+5)/2) and 3*3<=25 which is in our range and can be a potential answer(Let's say we do not know if the square root of 25 is 5, we take into account every possible answer till we get the right answer). Since we have a potential answer we store the mid in our ans variable and move to the right to find any better answers.

    • Now we search in the 4-5 range. The mid is 4((4+5)/2) and 4*4<=25 which is a potential answer so we store it and move to the right by doing low=mid+1.

    • Now we search in the 5-5 range, our mid is 5 and 5*5<=25 so it is a potential answer and we move on the right hoping to get a better answer, so our low becomes 6 and high is 5. So, it violates the condition of the while loop(low<=high), so we come out of the loop having the answer to be 5 and it is the square root of 25.

  • Since every time we are dividing the search space in half and eliminating one of them so we are doing log2N work. So the time complexity is O(log N).

  • The space complexity is O(1).

      //We are given N(1<=N<=10^9) and we need to find the square root of N.
      int squareRoot(int N){
          int low=1;
          int high=N;
          int ans;
          while(low<=high){
              int mid=(high+low)/2;
              if(mid*mid <=N){
                  ans=mid;
                  low=mid+1;
              }
              else{
                  high=mid-1;
              }
          }
          return ans;
      }
    

Extra Points:

  • Binary search is widely used to solve a lot of questions where the input is not always sorted. The input might have some patterns around which we decide which side of the array to move to and which side of the array to eliminate.

  • In some cases, we apply binary search on the answer itself. We find the search space in which the answer might be present. This is a very useful implementation using which we can find the output in O(N log N) time. If we do not use binary search, we might exceed the time limit or get quadratic time complexity.

  • So, binary search is a very useful and powerful algorithm that helps to find the output to a certain category of questions in logarithmic time complexity.

A small note from my side:

This is my first few attempts at blogging, so if there are any errors please excuse me for that. I will be blogging about a lot more topics regarding DSA and computer science in general. So do consider following me on Hashnode, Twitter and LinkedIn.

I hope you all had a good read and if you gained any benefit reading this article or there is any doubt regarding the above topic or there is an error that I might have missed, leave a comment below or reach out on LinkedIn, I will be more than happy to help you out! :)

HAVE A GOOD DAY! ๐Ÿ˜Š

ย