DATA ANALYTICS REFERENCE DOCUMENT 


Document Title:  Document Title 
Document No.:  1552735583 
Author(s):  Gerhard van der Linde, Rita Raher 
Contributor(s): 
REVISION HISTORY
Revision  Details of Modification(s)  Reason for modification  Date  By 

0  Draft release  Document description here  2019/03/16 11:26  Gerhard van der Linde, Rita Raher 
Reference: ^{1)}
Algorithm  Best case  Worst case  Average case  Space Complexity  Stable?  

Bubble Sort  <m>n</m>  <m>n  2</m>  <m>n  2</m>  1  Yes  
Selection Sort  <m>n  2</m>  <m>n  2</m>  <m>n  2</m>  1  No 
Insertion Sort  <m>n</m>  <m>n  2</m>  <m>n  2</m>  1  Yes  
Merge Sort  <m>n log n</m>  <m>n log n</m>  <m>n log n</m>  <m>O(n)</m>  Yes  
Quicksort  <m>n log n</m>  <m>n  2</m>  <m>n log n</m>  <m>n</m> (worst case)  No*  
Heapsort  <m>n log n</m>  <m>n log n</m>  <m>n log n</m>  1  No  
Counting Sort  <m>n + k</m>  <m>n + k</m>  <m>n + k</m>  <m>n + k</m>  Yes  
Bucket Sort  <m>n + k</m>  <m>n  2</m>  <m>n + k</m>  <m>n * k</m>  Yes  
Timsort  <m>n</m>  <m>n log n</m>  <m>n log n</m>  <m>n</m>  Yes  
Introsort  <m>n log n</m>  <m>n log n</m>  <m>n log n</m>  <m>log n</m>  No 
*the standard Quicksort algorithm is unstable, although stable variations do exist
Criteria  Sorting algorithm 

Small number of items to be sorted  Insertion Sort 
Items are mostly sorted already  Insertion Sort 
Concerned about worstcase scenarios  Heap Sort 
Interested in a good averagecase behaviour  Quicksort 
Items are drawn from a uniform dense universe  Bucket Sort 
Desire to write as little code as possible  Insertion Sort 
Stable sorting required  Merge Sort 
Algorithm  Best case  Worst case  Average case  Space Complexity  Stable?  

Bubble Sort  <m>n</m>  <m>n  2</m>  <m>n  2</m>  1  Yes  
Selection Sort  <m>n  2</m>  <m>n  2</m>  <m>n  2</m>  1  No 
Insertion Sort  <m>n</m>  <m>n  2</m>  <m>n  2</m>  1  Yes  
Merge Sort  <m>n log n</m>  <m>n log n</m>  <m>n log n</m>  <m>O(n)</m>  Yes  
Quicksort  <m>n log n</m>  <m>n  2</m>  <m>n log n</m>  <m>n</m> (worst case)  No*  
Heapsort  <m>n log n</m>  <m>n log n</m>  <m>n log n</m>  1  No  
Counting Sort  <m>n + k</m>  <m>n + k</m>  <m>n + k</m>  <m>n + k</m>  Yes  
Bucket Sort  <m>n + k</m>  <m>n  2</m>  <m>n + k</m>  <m>n * k</m>  Yes  
Timsort  <m>n</m>  <m>n log n</m>  <m>n log n</m>  <m>n</m>  Yes  
Introsort  <m>n log n</m>  <m>n log n</m>  <m>n log n</m>  <m>log n</m>  No 
*the standard Quicksort algorithm is unstable, although stable variations do exist
public static void bubblesort(int[]a){ int outer, inner; for(outer = a.length  1; outer > 0;outer){ //counting down for(inner=0;inner < outer;inner++){ //bubbling up if(a[inner] > a[inner+1]; { //if out of order.... int temp = a[inner]; //...then swap a[inner] =a[inner+1]; a[inner +1] = temp; } } } }
# Bubble Sort in python def printArray(arr): print (' '.join(str(i) for i in arr)) def bubblesort(arr): for i in range(len(arr)): for j in range(len(arr)  i  1): if arr[j] > arr[j + 1]: temp = arr[j] arr[j] = arr[j + 1] arr[j + 1] = temp # Print array after every pass print ("After pass " + str(i) + " :", printArray(arr)) if __name__ == '__main__': arr = [10, 7, 3, 1, 9, 7, 4, 3] print ("Initial Array :", printArray(arr)) bubblesort(arr)
for(outer =a.length1; outer >0; outer){ for(inner=0; inner < outer; inner++){ if(a[inner]>a[inner+1]){ //swap code omitted } } }
The element at index 4 is the smallest, so swap with index 0
The element at index 2 is the smallest, so swap with index 1
The element at index 3 is the smallest, so swap with index 2
The element at index 3 is the smallest, so swap with index 3
Selection sort might swap an array element with itself; this is harmless, and not worth checking for
public static void selectionsort(int[]a){ int outer=0, inner=0, min=0; for(outer = 0; outer <a.length1;outer++){ //outer counts up min = outer; for(inner = outer +1; inner <a.length; inner++){ if(a[inner]<a[min]){ //find index of smallest value min = inner; } } //swap a [min] with a [outer] int temp = a[outer]; a[outer] = a[min]; a[min] = temp; } }
# selection sort in python def printArray(arr): return (' '.join(str(i) for i in arr)) def selectionsort(arr): N = len(arr) for i in range(0, N): small = arr[i] pos = i for j in range(i + 1, N): if arr[j] < small: small = arr[j] pos = j temp = arr[pos] arr[pos] = arr[i] arr[i] = temp print ("After pass " + str(i) + " :", printArray(arr)) if __name__ == '__main__': arr = [10, 7, 3, 1, 9, 7, 4, 3] print ("Initial Array :", printArray(arr)) selectionsort(arr)
a[1]=5 is the key; 7>5 so move 7 right by one position, and insert 5 at index 0
a[2]=2 is the key; 7>2 so move both 7 and 5 right by one position, and insert 2 at index 0
a[3]=3 is the key; 7>3 and 5>3 so move both 7 and 5 right by one position, and insert 3 at index 1
a[4]=1 is the key; 7>1, 5>1, 3>1 and 2>1 so move both 7, 5, 3 and 2 right by one position, and insert 1 at index 1
(done)
public static void insertionsort(int a[]){ for(int i=1; i<a.length; i++){ int key =a[i]; //value to be inseted int j = i1; //move all elements > key right one position while(j>=0 && a[j]>key){ a[j+1]= a[j]; j=j1; } a[j+1]=key; //insert key in its new position } }
# insertion sort def printArray(arr): return(' '.join(str(i) for i in arr)) def insertionsort(arr): N = len(arr) for i in range(1, N): j = i  1 temp = arr[i] while j >= 0 and temp < arr[j]: arr[j + 1] = arr[j] j = 1 arr[j + 1] = temp print ("After pass " + str(i) + " :", printArray(arr)) if __name__ == '__main__': arr = [10, 7, 3, 1, 9, 7, 4, 3] print ("Initial Array :", printArray(arr)) insertionsort(arr)
Criteria  Sorting algorithm 

Small number of items to be sorted  Insertion Sort 
Items are mostly sorted already  Insertion Sort 
Concerned about worstcase scenarios  Heap Sort 
Interested in a good averagecase behaviour  Quicksort 
Items are drawn from a uniform dense universe  Bucket Sort 
Desire to write as little code as possible  Insertion Sort 
Stable sorting required  Merge Sort 
Algorithm  Best case  Worst case  Average case  Space Complexity  Stable?  

Bubble Sort  <m>n</m>  <m>n  2</m>  <m>n  2</m>  1  Yes  
Selection Sort  <m>n  2</m>  <m>n  2</m>  <m>n  2</m>  1  No 
Insertion Sort  <m>n</m>  <m>n  2</m>  <m>n  2</m>  1  Yes  
Merge Sort  <m>n log n</m>  <m>n log n</m>  <m>n log n</m>  <m>O(n)</m>  Yes  
Quicksort  <m>n log n</m>  <m>n  2</m>  <m>n log n</m>  <m>n</m> (worst case)  No*  
Heapsort  <m>n log n</m>  <m>n log n</m>  <m>n log n</m>  1  No  
Counting Sort  <m>n + k</m>  <m>n + k</m>  <m>n + k</m>  <m>n + k</m>  Yes  
Bucket Sort  <m>n + k</m>  <m>n  2</m>  <m>n + k</m>  <m>n * k</m>  Yes  
Timsort  <m>n</m>  <m>n log n</m>  <m>n log n</m>  <m>n</m>  Yes  
Introsort  <m>n log n</m>  <m>n log n</m>  <m>n log n</m>  <m>log n</m>  No 
*the standard Quicksort algorithm is unstable, although stable variations do exist
# Merge sort def mergesort(arr, i, j): mid = 0 if i < j: mid = int((i + j) / 2) mergesort(arr, i, mid) mergesort(arr, mid + 1, j) merge(arr, i, mid, j) def merge(arr, i, mid, j): print ("Left: " + str(arr[i:mid + 1])) print ("Right: " + str(arr[mid + 1:j + 1])) N = len(arr) temp = [0] * N l = i r = j m = mid + 1 k = l while l <= mid and m <= r: if arr[l] <= arr[m]: temp[k] = arr[l] l += 1 else: temp[k] = arr[m] m += 1 k += 1 while l <= mid: temp[k] = arr[l] k += 1 l += 1 while m <= r: temp[k] = arr[m] k += 1 m += 1 for i1 in range(i, j + 1): arr[i1] = temp[i1] print ("After Merge: " + str(arr[i:j + 1])) if __name__ == '__main__': arr = [9, 4, 8, 3, 1, 2, 5] print ("Initial Array: " + str(arr)) mergesort(arr, 0, len(arr)  1)
Terminal Output:
Initial Array: [9, 4, 8, 3, 1, 2, 5] Left: [9] Right: [4] After Merge: [4, 9] Left: [8] Right: [3] After Merge: [3, 8] Left: [4, 9] Right: [3, 8] After Merge: [3, 4, 8, 9] Left: [1] Right: [2] After Merge: [1, 2] Left: [1, 2] Right: [5] After Merge: [1, 2, 5] Left: [3, 4, 8, 9] Right: [1, 2, 5] After Merge: [1, 2, 3, 4, 5, 8, 9]
The main steps in Quick sort are:
The base case for the recursion is a subarray of length 1 or 0; by definition these cases do not need to be sorted.
# Quick Sort def printArray(arr): return (' '.join(str(i) for i in arr)) def quicksort(arr, i, j): if i < j: pos = partition(arr, i, j) quicksort(arr, i, pos  1) quicksort(arr, pos + 1, j) def partition(arr, i, j): #pivot = arr[j] # pivot on the last item pivot = arr[int(j/2)] # pivot on the median small = i  1 for k in range(i, j): if arr[k] <= pivot: small += 1 swap(arr, k, small) swap(arr, j, small + 1) print ("Pivot = " + str(arr[small + 1]), " Arr = " + printArray(arr)) return small + 1 def swap(arr, i, j): arr[i], arr[j] = arr[j], arr[i] if __name__ == '__main__': arr = [9, 4, 8, 3, 1, 2, 5] print (" Initial Array :",printArray(arr)) quicksort(arr, 0, len(arr)  1)
Initial Array : 9 4 8 3 1 2 5 Pivot = 5 Arr = 4 3 1 2 5 9 8 Pivot = 2 Arr = 1 2 4 3 5 9 8 Pivot = 3 Arr = 1 2 3 4 5 9 8 Pivot = 8 Arr = 1 2 3 4 5 8 9
# Counting sort def printArray(arr): return(' '.join(str(i) for i in arr)) def countingsort(arr): count = [0] * 11 # can store the count of positive numbers <= 10 N = len(arr) for i in range(0, N): count[arr[i]] += 1 for i in range(1, len(count)): count[i] += count[i  1] print ("Counting Array :", printArray(count)) output = [0] * N for i in range(len(arr)): output[count[arr[i]]  1] = arr[i] count[arr[i]] = 1 print ("After Sorting :", printArray(output)) if __name__ == '__main__': arr = [10, 7, 3, 1, 9, 7, 4, 3] print ("Initial Array :", printArray(arr)) countingsort(arr)
Initial Array : 10 7 3 1 9 7 4 3 Counting Array : 0 1 1 3 4 4 4 6 6 7 8 After Sorting : 1 3 3 4 7 7 9 10
Criteria  Sorting algorithm 

Small number of items to be sorted  Insertion Sort 
Items are mostly sorted already  Insertion Sort 
Concerned about worstcase scenarios  Heap Sort 
Interested in a good averagecase behaviour  Quicksort 
Items are drawn from a uniform dense universe  Bucket Sort 
Desire to write as little code as possible  Insertion Sort 
Stable sorting required  Merge Sort 