Table of Contents

Week 11 - Searching Algorithms

10_search_algorithms.pdf

Overview

The Search Problem

Reference: Pollice G., Selkow S. and Heineman G. (2016). Algorithms in a Nutshell, 2nd Edition. O' Reilly.

Performance of Search Algorithms

Pre-sorted data

Types of Search Algorithms

As we saw with Sorting Algorithms, there is no universal best algorithm for every possible use case. The most appropriate search algorithm often depends on the data structure being searched, but also on any a priori knowledge about the data.

Note: Only linear and binary covered in this module.

Linear Search examples

Search the array a for the target value 7:

  1. a[0] = 2; this is not the target value, so continue
  2. a[1] = 1; this is not the target value, so continue
  3. a[2] = 5; this is not the target value, so continue
  4. a[3] = 4; this is not the target value, so continue
  5. a[4] = 8; this is not the target value, so continue
  6. a[5] = 7; this is the target value, so return index 5

Search the array a for the target value 3:

  1. a[0] = 2; this is not the target value, so continue
  2. a[1] = 1; this is not the target value, so continue
  3. a[2] = 5; this is not the target value, so continue
  4. a[3] = 4; this is not the target value, so continue
  5. a[4] = 8; this is not the target value, so continue
  6. a[5] = 7; this is not the target value, so continue
  7. a[6] = 9; this is not the target value, so continue
  8. Iteration is complete. Target value was not found so return -1

Linear Search in code

linear_search.py
  1. # Searching an element in a list/array in python
  2. # can be simply done using \'in\' operator
  3. # Example:
  4. # if x in arr:
  5. # print arr.index(x)
  6.  
  7. # If you want to implement Linear Search in python
  8.  
  9. # Linearly search x in arr[]
  10. # If x is present then return its location
  11. # else return -1
  12.  
  13. def search(arr, x):
  14.  
  15. for i in range(len(arr)):
  16.  
  17. if arr[i] == x:
  18. return i
  19.  
  20. return -1

Binary Search examples

Search the array a for the target value 7:

  1. Middle index =(0+6)/2 = 3. a[3] = 6 < 7, so search in the right subarray (indices left=4 to right=6)
  2. Middle index =(4+6)/2 = 5. a[5] = 8 > 7, so search in the left subarray (indices left=4 to right=4)
  3. Middle index =(4+4)/2 = 4. a[4] = 7 = 7, so return index 4

Search the array a for the target value 3:

  1. Middle index =(0+6)/2 = 3. a[3] = 6 > 3, so search next in the left subarray (indices left=0 to right=2)
  2. Middle index =(0+2)/2 = 1. a[1] = 4 > 3, so search next in the left subarray (indices left=0 to right=1)
  3. Middle index =(0+1)/2 = 0. a[0] = 2 < 3, so search next in the right subarray (indices left=0 to right=0)
  4. Middle index =(0+0)/2 = 0. a[0] = 2 < 3. Now there is just one element left in the partition (index 0) – this corresponds to one of two possible base cases. Return -1 as this element is not the target value.

Iterative Binary Search in Code

iterative_binary_search.py
  1. # Iterative Binary Search Function
  2. # It returns location of x in given array arr if present,
  3. # else returns -1
  4. def binarySearch(arr, l, r, x):
  5.  
  6. while l <= r:
  7.  
  8. mid = l + (r - l)/2;
  9.  
  10. # Check if x is present at mid
  11. if arr[mid] == x:
  12. return mid
  13.  
  14. # If x is greater, ignore left half
  15. elif arr[mid] < x:
  16. l = mid + 1
  17.  
  18. # If x is smaller, ignore right half
  19. else:
  20. r = mid - 1
  21.  
  22. # If we reach here, then the element was not present
  23. return -1
  24.  
  25.  
  26. # Test array
  27. arr = [ 2, 3, 4, 10, 40 ]
  28. x = 10
  29.  
  30. # Function call
  31. result = binarySearch(arr, 0, len(arr)-1, x)
  32.  
  33. if result != -1:
  34. print "Element is present at index %d" % result
  35. else:
  36. print "Element is not present in array"

Recursive Binary Search in Code

recursive_binary_search.py
  1. # Python Program for recursive binary search.
  2.  
  3. # Returns index of x in arr if present, else -1
  4. def binarySearch (arr, l, r, x):
  5.  
  6. # Check base case
  7. if r >= l:
  8.  
  9. mid = l + (r - l)/2
  10.  
  11. # If element is present at the middle itself
  12. if arr[mid] == x:
  13. return mid
  14.  
  15. # If element is smaller than mid, then it can only
  16. # be present in left subarray
  17. elif arr[mid] > x:
  18. return binarySearch(arr, l, mid-1, x)
  19.  
  20. # Else the element can only be present in right subarray
  21. else:
  22. return binarySearch(arr, mid+1, r, x)
  23.  
  24. else:
  25. # Element is not present in the array
  26. return -1
  27.  
  28. # Test array
  29. arr = [ 2, 3, 4, 10, 40 ]
  30. x = 10
  31.  
  32. # Function call
  33. result = binarySearch(arr, 0, len(arr)-1, x)
  34.  
  35. if result != -1:
  36. print "Element is present at index %d" % result
  37. else:
  38. print "Element is not present in array"

Conclusion

Week 12: Benchmarking

Benchmarking Algorithms in Python

Overview

Motivation for Benchmarking

Time in Python

Benchmarking a single run

import time
 
#log the start time in seconds
start_time = time.time()
 
#call the function you want to be benchmarked
 
#log the start time in seconds
end_time = time.time()
 
#calculate the elapsed time
time_elapsed = end_time - start_time

Benchmarking multiple statistical runs

import time
 
num_runs = 10 # number of times to test the function
results = [] # array to store the results for each test
 
# benchmark the function
for r in range(num_runs):
   # log the start time in seconds
   start_time = time.time()
   # call the function that you want to benchmark
 
   # log the end time in seconds
   end_time = time.time()
 
   # calculate the elapsed time
   time_elapsed = end_time - start_time
 
   results.append(time_elasped)
benchmark.py
  1. from numpy.random import randint
  2. from time import time
  3.  
  4. def Bubble_Sort(array):
  5. # your sort code
  6. sorted=array
  7. return sorted
  8.  
  9. def Merge_Sort(array):
  10. # your sort code
  11. sorted=array
  12. return sorted
  13.  
  14. def Counting_Sort(array):
  15. # your sort code
  16. sorted=array
  17. return sorted
  18.  
  19. def Quick_Sort(array):
  20. # your sort code
  21. sorted=array
  22. return sorted
  23.  
  24. def Timsort(array):
  25. # your sort code
  26. sorted=array
  27. return sorted
  28.  
  29. # The actual names of your sort functions in your code somewhere, notice no quotes
  30. sort_functions_list = [Bubble_Sort, Merge_Sort, Counting_Sort, Quick_Sort, Timsort]
  31.  
  32. # The prescribed array sizes to use in the benchmarking testst
  33. for ArraySize in (100, 250, 500, 750, 1000, 1250, 2500, 3750, 5000, 6250, 7500, 8750, 10000):
  34. # create a randowm array of values to use in the tests below
  35. testArray = randint(1,ArraySize*2,ArraySize)
  36. # iterate throug the list of sort functions to call in the test
  37. for sort_type in sort_functions_list:
  38. # and test every function ten times
  39. for i in range(10):
  40. start=time()
  41. sort_type(testArray)
  42. stop=time()
  43. print('{} {} {} {}'.format(sort_type, ArraySize, i, (stop-start)*1000))
  44.  

Useful References

Python 3 time documentation: https://docs.python.org/3/library/time.html

Python Date and Time Overview https://www.tutorialspoint.com/python/python_date_time.htm

Discussions of benchmarking issues in Python (advanced material) https://www.peterbe.com/plog/how-to-do-performance-micro-benchmarks-in-python