2.1 - Algorithms (Part 1)

I decided to follow savemyexams here with doing searching/sorting algorithms (which really should be in part 2), earlier

Principles of Computational Thinking

- Computational thinking is the process of breaking down a problem into smaller parts and solving them one at a time.
- It contains 3 main components:

Abstraction
Decomposition
Algorithmic thinking

Abstraction

- Abstraction is the process of simplifying a problem by removing unnecessary details.
- An example of abstraction could be a high-level programming language like Python or C, which hide the differences between different processor architectures so your program will run the same everywhere without having to worry about the underlying hardware.
- There are plenty of other examples as well, such as the concept of 'processes' in OSes, allowing multiple programs to run concurrently without having to manage the CPU and memory at a low level.

Decomposition

- Decomposition is the process of breaking down a problem into smaller parts, which can then be solved individually.
- An example of decomposition could be a mathematical equation, which can be broken down into smaller parts such as addition, subtraction, multiplication, and division.
- This process is also used in programming, where a program can be broken down into smaller parts such as functions, loops, conditionals, and variables.

Algorithmic thinking

- Algorithmic thinking is turning a problem into a series of basic step-by-step instructions.
- It requires the combined use of both abstraction and decomposition.
- Algorithms help problems become automated and let computers understand what to do.
- A good algorithm should be as bulletproof but simple as possible, to reduce the chance of problems.

Searching Algorithms

- Searching algorithms are algorithms used to efficiently find a specific element in a list.
- The two sorting algorithms you need are Linear Search and Binary Search.
- For searching (and sorting) algorithms, you need to understand time complexity. This is written in big O notation. Something like O(n) means that if it took 1 second for each object in a list, it would take n seconds to search the entire list.

Linear Search

Definition:

Linear search is the simplest search algorithm. It checks each element in a list, one by one, until it finds the target value or reaches the end of the list. This method does not require the list to be sorted.

How It Works:

Start at the beginning of the list.
Compare the current element with the target value you are searching for.
If the current element matches the target, the search is successful, and the index of the element is returned.
If the current element does not match the target, move to the next element in the list.
Repeat steps 2-4 until you either find the target value or reach the end of the list.
If the end of the list is reached without finding the target value, the search is unsuccessful, and a suitable message or value (like -1) is returned.

Example:

Imagine you have a list of numbers: [3, 8, 12, 5, 6], and you want to find the number 12 using linear search.

Start with the first number (3) – it's not 12.
Move to the next number (8) – it's not 12.
Move to the next number (12) – this time, it matches the target. The search is successful.

Pros:

Simple to implement and understand.
Works on both sorted and unsorted lists.

Cons:

Inefficient for large lists because each element must be checked, one at a time.
Worst-case scenario: it checks all elements, making it slow for large datasets (with a time complexity of O(n), where n is the number of elements in the list).

Implementation (Python):

def linear_search(arr, target):
    # Traverse through the list
    for i in range(len(arr)):
        # If the target is found, return the index
        if arr[i] == target:
            return i
        # Repeat
        
    # If the target is not found, return -1
    return -1

Binary Search

Definition:

Binary search is a more efficient search algorithm that works only on sorted (in order) lists. It repeatedly divides the search interval in half to find the target value.

How It Works:

Start with two pointers – one at the beginning of the list (low) and one at the end (high).
Calculate the middle point of the list using the formula: middle = (low + high) // 2.
(// is 'floor division'. Just division but you discard the remainder.)
Compare the middle element with the target value:
- If the middle element matches the target value, the search is successful, and the index is returned.
- If the target value is less than the middle element, narrow the search to the left half of the list by setting high = middle - 1.
- If the target value is greater than the middle element, narrow the search to the right half of the list by setting low = middle + 1.
Repeat steps 2-3 until the target value is found or the low pointer exceeds the high pointer, indicating the target is not in the list.

Example:

Imagine you have a sorted list of numbers: [2, 4, 7, 10, 15, 20], and you want to find the number 10 using binary search.

Start with low = 0 and high = 5 (index positions).
Calculate the middle index: middle = (0 + 5) // 2 = 2.
The middle element is 7, which is less than 10. So, search in the right half (low = 3).
Recalculate the middle index: middle = (3 + 5) // 2 = 4.
The middle element is 15, which is greater than 10. So, search in the left half (high = 3).
Now, middle = (3 + 3) // 2 = 3.
The middle element is 10, which matches the target value. The search is successful.

Pros:

Much faster than linear search for large, sorted lists.
Reduces the problem size by half with each step, leading to a time complexity of O(log n), where n is the number of elements in the list.

Cons:

Requires the list to be sorted before searching.
Slightly more complex to implement compared to linear search.

Implementation (Python):

def binary_search(arr, target):
    # Initialise the low and high pointers
    low = 0
    high = len(arr) - 1 
    # Continue searching as long as low <= high
    while low <= high:
        # Calculate the middle index
        mid = (low + high) // 2
                
        # If the target is at the middle, return its index
        if arr[mid] == target:
            return mid
        # If the target is smaller than the middle element, search the left half
        elif arr[mid] > target:
            high = mid - 1
        # If the target is larger than the middle element, search the right half
        else:
            low = mid + 1
        # Repeat
            
    # If the target is not found, return -1
    return -1

Comparison of Linear Search and Binary Search:

Feature	Linear Search	Binary Search
Data Requirement	Works on both sorted and unsorted lists	Requires a sorted list
Complexity	O(n) – checks each element	O(log n) – divides the list
Ease of Use	Simple to implement	More complex, needs sorting
Efficiency	Slower for large datasets	Faster, especially for large lists

Sorting Algorithms

- A sorting algorithm is an algorithm that sorts a list of elements.
- There are many different sorting algorithms, each with its own advantages and disadvantages.
- The ones you need to know are Bubble sort, Merge sort and Insertion sort.
- Sorting is usually done because it makes searching faster.

Bubble Sort

Definition:

- Bubble sort is a simple sorting algorithm that repeatedly steps through the list (starting from the first element), compares adjacent elements and swaps them if they are in the wrong order.
- Once it has passed through the whole list once, the tallest element will be at the end (because it will have been repeatedly bought forward).
- It continues to do this until no swaps are needed, which indicates the list is sorted.
- Its time complexity is O(n^2), which means it takes a lot more time with each element.

Example:

Suppose you have the list [5, 3, 8, 4, 2].

First Pass:

1. Compare 5 and 3. Swap because 5 > 3. List: [3, 5, 8, 4, 2]
2. Compare 5 and 8. No swap. List: [3, 5, 8, 4, 2]
3. Compare 8 and 4. Swap because 8 > 4. List: [3, 5, 4, 8, 2]
4. Compare 8 and 2. Swap because 8 > 2. List: [3, 5, 4, 2, 8]

Second Pass:

1. Compare 3 and 5. No swap. List: [3, 5, 4, 2, 8]
2. Compare 5 and 4. Swap because 5 > 4. List: [3, 4, 5, 2, 8]
3. Compare 5 and 2. Swap because 5 > 2. List: [3, 4, 2, 5, 8]

Third Pass:

1. Compare 3 and 4. No swap. List: [3, 4, 2, 5, 8]
2. Compare 4 and 2. Swap because 4 > 2. List: [3, 2, 4, 5, 8]

Fourth Pass:

1. Compare 3 and 2. Swap because 3 > 2. List: [2, 3, 4, 5, 8]

Final Sorted List:

[2, 3, 4, 5, 8]

Python:

def bubble_sort(arr):
    n = len(arr)
    for i in range(n):
        swapped = False
        for j in range(0, n - i - 1):
            if arr[j] > arr[j + 1]:
                # Swap if the element found is greater than the next element
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
                swapped = True
        if not swapped:
            break  # If no two elements were swapped, the array is sorted
    return arr

Here's a visualisation with sound (using ArrayV):

Merge Sort

Definition:

- Merge sort is a divide-and-conquer algorithm that splits the list into two halves, recursively sorts each half, and then merges the sorted halves back together.
- It is known for its efficiency and stability. Even in the worst case, merge sort operates with a time complexity of O(n log n).
- The merging process ensures that the resultant list is sorted.

Example:

Suppose you have the list [5, 3, 8, 4, 2].

First Split:

Split into two halves: [5, 3] and [8, 4, 2]

Second Split:

Split [5, 3] into [5] and [3]. Split [8, 4, 2] into [8] and [4, 2].
Split [4, 2] into [4] and [2].

Merge Step 1:

Merge [5] and [3] into [3, 5].
Merge [4] and [2] into [2, 4].

Merge Step 2:

Merge [8] and [2, 4] into [2, 4, 8].

Final Merge:

Merge [3, 5] and [2, 4, 8] into [2, 3, 4, 5, 8].

Final Sorted List:

[2, 3, 4, 5, 8]

Python:

def merge_sort(arr):
    if len(arr) > 1:
        mid = len(arr) // 2
        left_half = arr[:mid]
        right_half = arr[mid:]
    
        # Recursively split and sort both halves
        merge_sort(left_half)
        merge_sort(right_half)
    
        i = j = k = 0
    
        # Merge the two halves
        while i < len(left_half) and j < len(right_half):
            if left_half[i] < right_half[j]:
                arr[k] = left_half[i]
                i += 1
            else:
                arr[k] = right_half[j]
                j += 1
            k += 1
    
        # Check for any leftover elements
        while i < len(left_half):
            arr[k] = left_half[i]
            i += 1
            k += 1
    
        while j < len(right_half):
            arr[k] = right_half[j]
            j += 1
            k += 1
    
    return arr

Insertion Sort

Definition:

- Insertion sort is a simple sorting algorithm that builds the final sorted list one item at a time.
- It is much like sorting playing cards in your hands. The list is split into a sorted part and an unsorted part.
- Elements from the unsorted part are picked one by one and placed at the correct position in the sorted part.

Example:

Suppose you have the list [5, 3, 8, 4, 2].

Initial List:

[5, 3, 8, 4, 2]

Step 1:

The first element 5 is already sorted. Move to the next element 3.
Insert 3 before 5. List: [3, 5, 8, 4, 2]

Step 2:

Next element is 8. It is already greater than 5, so no changes. List: [3, 5, 8, 4, 2]

Step 3:

Next element is 4. Insert it before 5 and after 3. List: [3, 4, 5, 8, 2]

Step 4:

Next element is 2. Insert it at the beginning. List: [2, 3, 4, 5, 8]

Final Sorted List:

[2, 3, 4, 5, 8]

Python:

def insertion_sort(arr):
    for i in range(1, len(arr)):
        key = arr[i]
        j = i - 1
    
        # Move elements of arr[0..i-1], that are greater than key, to one position ahead
        while j >= 0 and key < arr[j]:
            arr[j + 1] = arr[j]
            j -= 1
        arr[j + 1] = key
    
    return arr

Summary

Sorting Algorithm	Pros	Cons
Bubble Sort	Simple to understand and implement Doesn't require extra memory (in-place) Stable sort (maintains order of equal elements)	Very inefficient for large datasets (O(n^2) time complexity) Rarely used in practical applications due to poor performance Not suitable for large datasets
Insertion Sort	Simple and intuitive Efficient for small or nearly sorted datasets (O(n) time complexity for nearly sorted) In-place sorting (requires constant space) Stable sort	Still inefficient for large datasets (O(n^2) time complexity in worst-case) Less efficient than more advanced algorithms like Merge Sort for large unsorted lists
Merge Sort	Efficient for large datasets (O(n log n) time complexity) Consistent performance regardless of input (no worst-case O(n^2) scenario) Stable sort Can handle large datasets more effectively	Requires additional memory space (O(n) space complexity) More complex to implement than Bubble Sort or Insertion Sort May not be the most efficient for small datasets due to overhead