by Suvro Banerjee on Oct 1, 2018

Let’s explore the basic structure of designing an algorithm through some interesting problems and their subtleties

Motivation

It took me years to realize that Computer Science is not about “Computers” or “Programming”. We fall into this trap as we get so much hooked with the word “computer” and “programming” which are used so loosely with this discipline. Computer science is the study of problems. Given a problem, a computer scientist’s goal is to develop an algorithm, a step-by-step list of instructions for solving any instance of the problem that might arise.Algorithms are finite processes that if followed will solve the problem. Algorithms are solutions, of-course with few exceptions like everything else in this world. Computers and Programming are just the tools to achieve that.

After I was convinced that “Computer Science is the Study of Algorithms”, I could quickly relate it to Mathematics. In fact, they are more than twin siblings. Now that’s the beauty when you can gracefully navigate from one discipline to another. In short, I wouldn’t be wrong if I say, Mathematics is the foundation on which Computer Science is built and furthermore Algorithms are another way of expressing Mathematics. It is the music of reason.

It is problem solving, intellectually absorbing and beautiful — which is enough of a motivation for me to get started with the study of Algorithms. I am sure you have and you will. Let me end this section with two quotes which gives a good foundation to our study and certainly a food for thought.

An algorithm must be seen to be believed. — Donald Knuth (Mathematician & Computer Scientist)

If people do not believe that mathematics is simple, it is only because they do not realize how complicated life is. — John von Neumann (Mathematics, Physics, Statistics, Economics and Computer Science)

Introduction

Let’s start with a problem. The algorithmic problem known as sorting is defined as follows-

The distinction between a problem and an instance of a problem is fundamental. The instance in this case might be an array of names like {Suvro, Bhavna, Jane, Mike, Ram, Richa}, or a list of numbers like {7, 12, 11, 101, 45}. But, “Sorting” algorithm is a general problem and hence needs a general solution which can take any possible input instances and gets you the desired output. There are various algorithms to solve the sorting problem, of which one is called as “Insertion Sort”.

Insertion sort is a method for sorting that starts with a single element (thus forming a trivially sorted list) and then incrementally inserts the remaining elements so that the list stays sorted. Take a look at the representation on the left.

The Python implementation of Insertion Sort is as follows-

alist = ['h', 'e', 'a', 'd']
def insertionSort(alist):
    for i in range(1, len(alist)): # it starts from position 1 , i.e. "e" and goes till end od the string "d"
        j = i
        current_value = alist[i]
        print("Iteration: ", i)
        while (j > 0 and current_value < alist[j-1]):
            alist[j] = alist[j-1]
            j = j - 1
            print(alist)
            
        alist[j] = current_value
        print(alist)
        print('\n')
        
insertionSort(alist)

We always seek algorithms that are correctefficient and easy to implement. In this introductory chapter we will only focus on the correctness issue of algorithms.

Correct algorithms usually come with a proof of correctness, i.e. it must take every instance of the problem to the desired result, similar to what we saw in the above example. But, before we go further let’s demonstrate why “it’s obvious” never suffices a proof of correctness and is usually flat-out wrong.

Is my Algorithm Obvious or Correct ?

Let’s consider a problem that arises often in manufacturing, transportation, and testing applications. Say, you have a robot which is capable of soldering. Now, you want that to solder a circuit board which comprises of chips and other components which need to be fastened to the board. So, the obvious question is how is the robot going to tour across the board and do the task. It has to do efficiently as there are lot of boards to be manufactured in a given time and each board has many such soldering points.

The formal problem definition will look something like this and we need to program the robot arm.

Problem : Robot Tour Optimization

Input : A set S of ’n’ points in the plane

Output: What is the shortest cycle tour that visits each point in the set S ?

Perhaps the most important idea would be the nearest-neighbor heuristic. Starting from some random point p0, we walk to its first nearest neighbor p1. From p1, we walk to its nearest unvisited neighbor (thus excluded p0 as a candidate). We keep repeating it until all the points are visited once and then we return to our initial point p0 to complete the tour.

So, the core idea over here is to visit nearby points before we visit faraway points to reduce the total travel time. It works perfectly on the depiction-1 given above and also not much but reasonably efficient at it looks at each pair of points (pi, pj) twice, once when adding pi and other when adding pj to the tour. But, sorry to say this algorithm is completely WRONG.

The algorithm of course finds a tour but it doesn’t necessarily find the shortest possible tour. Consider the case when all the points lie along a line, see depiction-2. Then if we start at 0 then we keep hopscotching left-right-left… across the board whereas an optimal tour might have been start from the leftmost point and visit each point as we walk right and finally return back to the first point.

May be what we need is a different approach. Always walking to the closest point is too restrictive. So, you can’t tell just by looking at them if the algorithms are right or wrong.

At this point, you might wonder what should be the correct algorithm to solve this problem. Well, one answer can be for sure correct which is to enumerate all possible orderings of all the points and then select that ordering which minimizes the total length. So, we are guaranteed to end up with the shortest possible tour.

Now, let’s evaluate this possibility further. Say, there are total 20 points. So, to enumerate all the possible paths with 20 points we will have 20! (20 factorial) which is of the magnitude 10 to the power 18 different orderings. Now that’s a big number for a computer to evaluate and say if n = 1000 which is quite possible, then may be we have to wait for another big bang to happen before the computer could evaluate all the possibilities and figure out which path is the shortest.

This problem is more commonly known as the Traveling Salesman Problem (TSP) and we will definitely make an effort to find an efficient algorithm to solve this problem but for now the main take-away from this discussion is there is a fundamental difference between algorithms and heuristics. Algorithms always produce a correct result, but heuristic might do a great job but never guarantees a complete solution.

More subtleties about Algorithm Correctness

Let’s look at another problem to expand our thought on the Algorithm Correctness point. This is a Scheduling Problem. Say, you are one of the sought after actor/actress and you got a list of movie projects to sign in. The offers come with the start date and the end date of the project. Now to take up the job you can’t simultaneously accept two projects whose intervals overlap.

For an artist such as yourself, the objective is very clear i.e. to make as much money as possible from these projects. Each project pay the same fee, which implies you seek the largest possible set of projects whose time intervals do not overlap. Let’s look at the pictorial depiction of this scheduling problem.

Now, like before let’s formally define the problem statement-

Problem : Movie scheduling problem

Input : A set of of intervals on the line.

Output : What is the largest subset of mutually non-overlapping intervals which can be selected from ?

I am sure there will be lots of ideas to design this algorithm. One could be start working whenever the first work is available i.e. you should start with the job with the earliest start date after all you don’t want to sit idle.

Now why this is not a great idea to go ahead with. Think of the first job which lasts for a very long period of time and in turn kills all the job prospects which have shorter durations. This scenario is depicted below-

So, the natural thought progression would be to start selecting the shortest job and keep seeking the shortest available job at every turn. Maximizing this number would maximize the profit. Let’s define this heuristic formally-

Now, this approach looks good, but taking shortest jobs might limit us to half the payoff. Let’s look at below depiction on this case.

So, let’s look for an algorithm which is both correct and efficient –

So, in our case, refer Depiction -3 to see how the projects are lined up. The below depiction shows how the artist accept the movies-

Take Aways

  1. There is a fundamental difference between algorithms, which always produce a correct result, and heuristics which may usually do a good job but without providing any guarantee.
  2. Algorithm correctness is a property that must be carefully demonstrated.

Author’s Note

This is the first article of the series, “Algorithms & Data Structures”. In the next article we will build a Mathematical framework which would demonstrate the correctness of the algorithms. Stay tuned …

Happy Learning 🙂

Sources

  1. The Algorithm Design Manual — Steven S. Skiena, Department of Computer Science, SUNY Stony Brook, USA.
  2. Introduction to Algorithms — Cormen, Leiserson, Rivest & Stein
  3. Algorithms & Data Structures using Python — Brad Miller and David Ranum, Luther College
Collected at:  https://medium.com/explore-artificial-intelligence/difference-between-algorithms-and-heuristic-ca8a27cc9a7d 

2 thoughts on “Difference between Algorithms and Heuristic”

  1. Nice post. I was checking continuously this blog and I am impressed! Extremely useful information particularly the last part 🙂 I care for such info much. I was looking for this certain info for a very long time. Thank you and good luck.

Leave a Reply

Your email address will not be published. Required fields are marked *