Monday 9 January 2017

Design and Analysis of Algorithms

Unit-I 
Asymptotic Notations and Introduction to Algorithms
An algorithm, named after the ninth-century scholar Abu Ja'far Muhammad ibn Musa al-Khwarizmi, can be defined, roughly speaking, in any of the following ways:
  • An algorithm is a set of rules for carrying out calculation either by hand or on a machine.
  • An algorithm is a finite step-by-step procedure to achieve a required result.
  • An algorithm is a sequence of computational steps that transform the input into the output.
  • An algorithm is a sequence of operations performed on data that have to be organized in data structures.
  • An algorithm is an abstraction of a program to be executed on a physical machine (model of computation).
[http://www.personal.kent.edu/~rmuhamma/Algorithms/MyAlgorithms/intro.htm]

An algorithm is a recipe or a systematic method containing a sequence of instructions to solve a computational problem. It takes some inputs, performs a well defined sequence of steps, and produces some output. Once we design an algorithm, we need to know how well it performs on any input. In particular we would like to know whether there are better algorithms for the problem. An answer to this first demands a way to analyze an algorithm in a machine-independent way. Algorithm design and analysis form a central theme in computer science.
[http://www.imsc.res.in/~vraman/pub/intro_notes.pdf]

Algorithmics is a branch of computer science that consists of designing and analyzing computer algorithms.
1. The “design” pertains to:
i. The description of the algorithm at an abstract level by means of a pseudo-language, and
ii. A proof of correctness, that is, a proof that the algorithm solves the given problem in all cases.
2. The “analysis” deals with performance evaluation (complexity analysis).

We start by defining the model of computation, which is usually the Random Access Machine (RAM) model, but other models of computation such as the PRAM can be used. Once the model of computation has been defined, an algorithm can be described using a simple language (or pseudo-language) whose syntax is close to programming languages such as C or Java.

Algorithm is a step-by-step procedure, which defines a set of instructions to be executed in a certain order to get the desired output. Algorithms are generally created independent of underlying languages, i.e. an algorithm can be implemented in more than one programming language.
From the data structure point of view, following are some important categories of algorithms −
  • Search − Algorithm to search an item in a data structure.
  • Sort − Algorithm to sort items in a certain order.
  • Insert − Algorithm to insert item in a data structure.
  • Update − Algorithm to update an existing item in a data structure.
  • Delete − Algorithm to delete an existing item from a data structure.

Characteristics of an Algorithm

Not all procedures can be called an algorithm. An algorithm should have the following characteristics −
  • Unambiguous − Algorithm should be clear and unambiguous. Each of its steps (or phases), and their inputs/outputs should be clear and must lead to only one meaning.
  • Input − An algorithm should have 0 or more well-defined inputs.
  • Output − An algorithm should have 1 or more well-defined outputs, and should match the desired output.
  • Finiteness − Algorithms must terminate after a finite number of steps.
  • Feasibility − Should be feasible with the available resources.
  • Independent − An algorithm should have step-by-step directions, which should be independent of any programming code.

How to Write an Algorithm?

There are no well-defined standards for writing algorithms. Rather, it is problem and resource dependent. Algorithms are never written to support a particular programming code.
As we know that all programming languages share basic code constructs like loops (do, for, while), flow-control (if-else), etc. These common constructs can be used to write an algorithm.
We write algorithms in a step-by-step manner, but that is not always necessary. Algorithm writing is a process that is carried out after the problem domain is well-defined; that is, we should know the problem domain for which we are designing a solution.

Example

Let's try to learn algorithm-writing by using an example.
Problem − Design an algorithm to add two numbers and display the result.
step 1 − START
step 2 − declare three integers a, b & c
step 3 − define values of a & b
step 4 − add values of a & b
step 5 − store output of step 4 to c
step 6 − print c
step 7 − STOP
Algorithms tell the programmers how to code the program. Alternatively, the algorithm can be written as −
step 1 − START ADD
step 2 − get values of a & b
step 3 − c ← a + b
step 4 − display c
step 5 − STOP
In the design and analysis of algorithms, usually the second method is used to describe an algorithm. It makes it easy for the analyst to analyze the algorithm while ignoring all unwanted definitions, and to observe which operations are being used and how the process flows.
Writing step numbers is optional.
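As a quick illustration of how the second form maps to real code, here is a minimal C version (the concrete values of a and b are our own choice) −
#include <stdio.h>

int main() {
   int a = 10, b = 20, c;   /* sample values; any integers work */

   c = a + b;               /* step 3 − c ← a + b */
   printf("%d\n", c);       /* step 4 − display c */

   return 0;
}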
We design an algorithm to get a solution to a given problem, and a problem can be solved in more than one way. Hence, many solution algorithms can be derived for a given problem. The next step is to analyze those proposed solution algorithms and implement the most suitable one.

Algorithm Analysis

Efficiency of an algorithm can be analyzed at two different stages, before implementation and after implementation. They are the following −
  • A Priori Analysis − This is a theoretical analysis of an algorithm. Efficiency of an algorithm is measured by assuming that all other factors, for example, processor speed, are constant and have no effect on the implementation.
  • A Posteriori Analysis − This is an empirical analysis of an algorithm. The selected algorithm is implemented using a programming language and executed on a target computer. In this analysis, actual statistics like running time and space required are collected.
We shall learn about a priori algorithm analysis. Algorithm analysis deals with the execution or running time of various operations involved. The running time of an operation can be defined as the number of computer instructions executed per operation.

Algorithm Complexity

Suppose X is an algorithm and n is the size of input data, the time and space used by the algorithm X are the two main factors, which decide the efficiency of X.
  • Time Factor − Time is measured by counting the number of key operations such as comparisons in the sorting algorithm.
  • Space Factor − Space is measured by counting the maximum memory space required by the algorithm.
The complexity of an algorithm f(n) gives the running time and/or the storage space required by the algorithm in terms of n as the size of input data.

Space Complexity

Space complexity of an algorithm represents the amount of memory space required by the algorithm in its life cycle. The space required by an algorithm is equal to the sum of the following two components −
  • A fixed part that is a space required to store certain data and variables, that are independent of the size of the problem. For example, simple variables and constants used, program size, etc.
  • A variable part is a space required by variables, whose size depends on the size of the problem. For example, dynamic memory allocation, recursion stack space, etc.
Space complexity S(P) of any algorithm P is S(P) = C + SP(I), where C is the fixed part and SP(I) is the variable part of the algorithm, which depends on instance characteristic I. Following is a simple example that tries to explain the concept −
Algorithm: SUM(A, B)
Step 1 -  START
Step 2 -  C ← A + B + 10
Step 3 -  Stop
Here we have three variables A, B, and C, and one constant. Hence S(P) = 1 + 3. Now, the actual space depends on the data types of the given variables and constants, and the total is multiplied accordingly.
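As a further illustration (our own example, following the same convention): consider an algorithm SUM_LIST(A, n) that adds the n elements of an array A into a variable s using an index i. The array contributes n units of space, while s, i, and n contribute a fixed part, so S(P) = 3 + n, and the space grows linearly with the instance size.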

Time Complexity

Time complexity of an algorithm represents the amount of time required by the algorithm to run to completion. Time requirements can be defined as a numerical function T(n), where T(n) can be measured as the number of steps, provided each step consumes constant time.
For example, addition of two n-bit integers takes n steps. Consequently, the total computational time is T(n) = c ∗ n, where c is the time taken for the addition of two bits. Here, we observe that T(n) grows linearly as the input size increases.
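To see where T(n) = c ∗ n comes from, here is a minimal C sketch (our own illustration) that adds two n-bit integers stored as bit arrays, least significant bit first, and counts one constant-time step per bit position −
#include <stdio.h>

#define N 8   /* number of bits; the step count below always equals N */

int main() {
   int a[N] = {1,0,1,1,0,0,0,0};   /* 13 in binary, LSB first */
   int b[N] = {1,1,1,0,0,0,0,0};   /*  7 in binary, LSB first */
   int sum[N + 1] = {0};
   int carry = 0, steps = 0, i;

   for (i = 0; i < N; i++) {       /* one bit addition per iteration */
      int s = a[i] + b[i] + carry;
      sum[i] = s % 2;
      carry = s / 2;
      steps++;
   }
   sum[N] = carry;                 /* final carry-out */

   printf("steps = %d, n = %d\n", steps, N);
   return 0;
}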

Asymptotic Notation

Asymptotic analysis of an algorithm refers to defining the mathematical bounds of its run-time performance. Using asymptotic analysis, we can draw conclusions about the best case, average case, and worst case scenarios of an algorithm.
Asymptotic analysis is input bound, i.e., if there's no input to the algorithm, it is concluded to work in constant time. Other than the "input", all other factors are considered constant.
Asymptotic analysis refers to computing the running time of any operation in mathematical units of computation. For example, the running time of one operation may be computed as f(n), while that of another operation is computed as g(n²). This means the running time of the first operation will increase linearly with the increase in n, and the running time of the second operation will increase quadratically when n increases. Similarly, the running times of both operations will be nearly the same if n is small.
Usually, the time required by an algorithm falls under three types −
  • Best Case − Minimum time required for program execution.
  • Average Case − Average time required for program execution.
  • Worst Case − Maximum time required for program execution.

Asymptotic Notations

Following are the commonly used asymptotic notations to calculate the running time complexity of an algorithm.
  • Ο Notation
  • Ω Notation
  • θ Notation

Big Oh Notation, Ο

The notation Ο (Big Oh) is the formal way to express the upper bound of an algorithm's running time. It measures the worst-case time complexity, or the longest amount of time an algorithm can possibly take to complete.
For a function g(n) −
Ο(g(n)) = { f(n) : there exist constants c > 0 and n0 such that f(n) ≤ c.g(n) for all n > n0 }

Omega Notation, Ω

The notation Ω (Omega) is the formal way to express the lower bound of an algorithm's running time. It measures the best-case time complexity, or the minimum amount of time an algorithm can possibly take to complete.
For a function g(n) −
Ω(g(n)) = { f(n) : there exist constants c > 0 and n0 such that f(n) ≥ c.g(n) for all n > n0 }

Theta Notation, θ

The notation θ (Theta) is the formal way to express both the lower bound and the upper bound of an algorithm's running time. It is defined as follows −
θ(g(n)) = { f(n) : f(n) = Ο(g(n)) and f(n) = Ω(g(n)) }
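As a worked example using the definitions above (our own, not from the source): take f(n) = 3n² + 2n. Then −
f(n) = Ο(n²), since 3n² + 2n ≤ 5n² for all n > 1 (take c = 5, n0 = 1)
f(n) = Ω(n²), since 3n² + 2n ≥ 3n² for all n > 1 (take c = 3, n0 = 1)
Since f(n) is both Ο(n²) and Ω(n²), we conclude f(n) = θ(n²).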

Common Asymptotic Notations

Following is a list of some common asymptotic notations −
constant − Ο(1)
logarithmic − Ο(log n)
linear − Ο(n)
n log n − Ο(n log n)
quadratic − Ο(n²)
cubic − Ο(n³)
polynomial − n^Ο(1)
exponential − 2^Ο(n)
[https://www.tutorialspoint.com/data_structures_algorithms/asymptotic_analysis.htm]


Analysis of algorithms 

The theoretical study of computer-program performance and resource usage.
What’s more important than performance?
• modularity
• correctness
• maintainability
• functionality
• robustness
• user-friendliness
• programmer time
• simplicity
• extensibility
• reliability

Why study algorithms and performance? 

• Algorithms help us to understand scalability.
• Performance often draws the line between what is feasible and what is impossible.
• Algorithmic mathematics provides a language for talking about program behavior.
• The lessons of program performance generalize to other computing resources.

Running time 

• The running time depends on the input: an already sorted sequence is easier to sort.
• Parameterize the running time by the size of the input, since short sequences are easier to sort than long ones.
• Generally, we seek upper bounds on the running time, because everybody likes a guarantee.
Kinds of analyses
Worst-case: (usually)
• T(n) = maximum time of algorithm on any input of size n.
Average-case: (sometimes)
• T(n) = expected time of algorithm on any input of size n.
Best-case: (bogus)
• Cheat with a slow algorithm that works fast on some input.
[http://www.facweb.iitkgp.ernet.in/~sourav/Lecture-01.pdf]

Classification of Algorithms

On Scriptol.org, algorithms are classified by purpose, but there are other ways to classify them: by implementation, by design paradigm, and by complexity.

Classification by purpose

Each algorithm has a goal; for example, the purpose of the Quick Sort algorithm is to sort data in ascending or descending order. But the number of goals is infinite, and we have to group algorithms by kind of purpose.

Classification by implementation

An algorithm may be implemented according to different basic principles:

  • Recursive or iterative
  • A recursive algorithm is one that calls itself repeatedly until a certain condition is met. It is a method common to functional programming.
  • Iterative algorithms use repetitive constructs like loops.
  • Some problems are better suited to one implementation or the other. For example, the Towers of Hanoi problem is well understood in its recursive implementation. Every recursive version has an equivalent iterative version, and vice versa.
  • Logical or procedural
  • An algorithm may be viewed as controlled logical deduction. 
  • A logic component expresses the axioms which may be used in the computation and a control component determines the way in which deduction is applied to the axioms. 
  • This is the basis of logic programming. In pure logic programming languages, the control component is fixed and algorithms are specified by supplying only the logic component.
  • Serial or parallel
  • Algorithms are usually discussed with the assumption that computers execute one instruction of an algorithm at a time. This is a serial algorithm, as opposed to parallel algorithms, which take advantage of computer architectures to process several instructions at once. They divide the problem into sub-problems and pass them to several processors. Iterative algorithms are generally parallelizable. Sorting algorithms can be parallelized efficiently.
  • Deterministic or non-deterministic
  • Deterministic algorithms solve the problem with a predefined process, whereas non-deterministic algorithms must guess the best solution at each step through the use of heuristics.

Classification by design paradigm

A design paradigm is a domain in research or a class of problems that requires a dedicated kind of algorithm:
  • Divide and conquer
  • A divide and conquer algorithm repeatedly reduces an instance of a problem to one or more smaller instances of the same problem (usually recursively), until the instances are small enough to solve easily. One such example of divide and conquer is merge sort: the data is divided into segments, each segment is sorted, and the sorted segments are merged in the conquer phase.
  • The binary search algorithm is an example of a variant of divide and conquer called decrease and conquer, which solves a single smaller subproblem and uses that solution to solve the bigger problem.

  • Dynamic programming
  • The shortest path in a weighted graph can be found by using the shortest paths to the goal from all adjacent vertices.
  • When the optimal solution to a problem can be constructed from optimal solutions to subproblems, using dynamic programming avoids recomputing solutions that have already been computed.
  • The main difference from the "divide and conquer" approach is that subproblems are independent in divide and conquer, whereas subproblems overlap in dynamic programming.
  • Dynamic programming and memoization go together. The difference from straightforward recursion is the caching or memoization of recursive calls. Where subproblems are independent, caching is useless. By using memoization, or maintaining a table of subproblems already solved, dynamic programming reduces the exponential nature of many problems to polynomial complexity (see the C sketch after this list).

  • The greedy method
  • A greedy algorithm is similar to a dynamic programming algorithm, but the difference is that solutions to the subproblems do not have to be known at each stage; instead, a "greedy" choice can be made of whatever looks best for the moment.
  • The most popular greedy algorithm is Kruskal's algorithm for finding the minimal spanning tree.

  • Linear programming
  • The problem is expressed as a set of linear inequalities, and then an attempt is made to maximize or minimize a linear objective function. Many problems can be solved this way, such as the maximum flow problem for directed graphs, notably by using the simplex algorithm.
  • A complex variant of linear programming is called integer programming, where the solution space is restricted to all integers.

  • Reduction, also called transform and conquer
  • Solve a problem by transforming it into another problem. A simple example: to find the median of an unsorted list, we first transform the problem into a sorting problem, sort the list, and take the middle element. The main goal of reduction is finding the simplest transformation possible.

  • Using graphs
  • Many problems, such as playing chess, can be modeled as problems on graphs, and graph exploration algorithms are used to solve them.
  • This category also includes search algorithms and backtracking.

  • The probabilistic and heuristic paradigm
  • Probabilistic − algorithms that make some choices randomly.
  • Genetic − algorithms that attempt to find solutions to problems by mimicking biological evolutionary processes, with a cycle of random mutations yielding successive generations of "solutions"; thus, they emulate reproduction and "survival of the fittest".
  • Heuristic − algorithms whose general purpose is not to find an optimal solution, but an approximate solution where the time or resources to find a perfect solution are not practical.
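To make the memoization idea above concrete, here is a minimal C sketch (our own illustration, not taken from the sources cited here) that caches the results of recursive calls; it reduces the exponential recursion tree of the naive Fibonacci computation to a linear number of distinct calls −
#include <stdio.h>

#define MAX 90                 /* fib(n) fits in a 64-bit integer up to about n = 92 */

long long memo[MAX + 1];       /* memo[i] == 0 means "not computed yet" */

long long fib(int n) {
   if (n <= 1) return n;
   if (memo[n] != 0) return memo[n];     /* reuse a cached subproblem */
   memo[n] = fib(n - 1) + fib(n - 2);    /* solve once and cache */
   return memo[n];
}

int main() {
   printf("fib(50) = %lld\n", fib(50));  /* prints 12586269025 */
   return 0;
}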

Classification by complexity

Some algorithms complete in linear time, some complete in an exponential amount of time, and some never complete.
[http://www.scriptol.com/programming/algorithms-classification.php]

BASIC DIFFERENCES BETWEEN SPACE COMPLEXITY AND TIME COMPLEXITY


SPACE COMPLEXITY:
The space complexity of an algorithm is the amount of memory it requires to run to completion.
The space needed by a program contains the following components:
1) Instruction space:
- Stores the executable version of the program; this is generally fixed.
2) Data space:
It contains:
a) Space required by constants and simple variables. This space is fixed.
b) Space needed by fixed-size structured variables, such as arrays and structures.
c) Dynamically allocated space. This space is usually variable.
3) Environment stack:
- Needed to store the information required to resume suspended functions. The following data is saved on the stack:
- the return address,
- the values of all local variables,
- the values of all formal parameters in the function.
TIME COMPLEXITY:
The time complexity of an algorithm is the amount of time it needs to run to completion.
To measure the time complexity, we can count all operations performed in an algorithm, and if we know the time taken for each operation, we can easily compute the total time taken by the algorithm. This time varies from system to system.
Our intention is to estimate the execution time of an algorithm irrespective of the computer on which it will be used. Hence, we identify the key operation and count how many times that operation is performed until the program completes its execution. The time complexity can then be expressed as a function of the number of key operations performed.
The space and time complexity is usually expressed in the form of a function f(n), where n is the input size for a given instance of the problem being solved.
f(n) helps us to predict the rate of growth of the complexity as the size of the input to the problem increases. It also helps us to compare two or more algorithms in order to find which one is more efficient.
[https://www.boddunan.com/articles/education/22-computer-science/18117-differences-between-space-complexity-and-time-complexity-space-complexity.html]

Recurrence Relations
Read the PDF file at the following link:
[https://www.cs.duke.edu/~reif/courses/alglectures/skiena.lectures/lecture3.pdf]
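As a taste of what those notes cover, here is a standard example worked out for convenience: merge sort satisfies the recurrence T(n) = 2T(n/2) + cn with T(1) = c, since it sorts two halves and merges them in linear time. Expanding the recurrence −
T(n) = 2T(n/2) + cn
     = 4T(n/4) + 2cn
     = 8T(n/8) + 3cn
     = ...
     = 2^k T(n/2^k) + k.cn
and at k = log2 n this becomes T(n) = cn + cn log2 n, which is Ο(n log n).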

Sorting Techniques

Sorting refers to arranging data in a particular order. A sorting algorithm specifies the way to arrange data in that order. The most common orders are numerical or lexicographical.
The importance of sorting lies in the fact that data searching can be optimized to a very high level, if data is stored in a sorted manner. Sorting is also used to represent data in more readable formats. Following are some of the examples of sorting in real-life scenarios −
  • Telephone Directory − The telephone directory stores the telephone numbers of people sorted by their names, so that the names can be searched easily.
  • Dictionary − The dictionary stores words in an alphabetical order so that searching of any word becomes easy.

In-place Sorting and Not-in-place Sorting

Sorting algorithms may require some extra space for comparison and temporary storage of a few data elements. Algorithms that do not require any extra space and sort, for example, within the array itself are called in-place sorting algorithms. Bubble sort is an example of in-place sorting.
However, in some sorting algorithms, the program requires space which is more than or equal to the elements being sorted. Sorting which uses equal or more space is called not-in-place sorting. Merge-sort is an example of not-in-place sorting.

Stable and Not Stable Sorting

If a sorting algorithm, after sorting the contents, does not change the relative order of equal elements, it is called stable sorting. If a sorting algorithm changes the relative order of equal elements after sorting, it is called unstable sorting. For example, when the records (3, a), (1, b), (3, c) are sorted by their numeric keys, a stable sort always keeps (3, a) before (3, c).
Stability of an algorithm matters when we wish to maintain the sequence of the original elements, like in a tuple for example.

Adaptive and Non-Adaptive Sorting Algorithm

A sorting algorithm is said to be adaptive, if it takes advantage of already 'sorted' elements in the list that is to be sorted. That is, while sorting if the source list has some element already sorted, adaptive algorithms will take this into account and will try not to re-order them.
A non-adaptive algorithm is one which does not take into account the elements which are already sorted; it re-orders every single element regardless of the existing order.

Important Terms

Some terms are generally used when discussing sorting techniques; here is a brief introduction to them −

Increasing Order

A sequence of values is said to be in increasing order, if the successive element is greater than the previous one. For example, 1, 3, 4, 6, 8, 9 are in increasing order, as every next element is greater than the previous element.

Decreasing Order

A sequence of values is said to be in decreasing order, if the successive element is less than the previous one. For example, 9, 8, 6, 4, 3, 1 are in decreasing order, as every next element is less than the previous element.

Non-Increasing Order

A sequence of values is said to be in non-increasing order, if the successive element is less than or equal to its previous element in the sequence. This order occurs when the sequence contains duplicate values. For example, 9, 8, 6, 3, 3, 1 are in non-increasing order, as every next element is less than or equal to (in the case of 3) the previous element, and none is greater.

Non-Decreasing Order

A sequence of values is said to be in non-decreasing order, if the successive element is greater than or equal to its previous element in the sequence. This order occurs when the sequence contains duplicate values. For example, 1, 3, 3, 6, 8, 9 are in non-decreasing order, as every next element is greater than or equal to (in the case of 3) the previous element, and none is smaller.

Bubble Sort Algorithm

Bubble sort is a simple sorting algorithm. It is a comparison-based algorithm in which each pair of adjacent elements is compared and the elements are swapped if they are not in order. This algorithm is not suitable for large data sets, as its average and worst case complexity are of Ο(n²), where n is the number of items.

How Bubble Sort Works?

We take an unsorted array for our example: [14, 33, 27, 35, 10]. Bubble sort takes Ο(n²) time, so we're keeping it short and precise.
Bubble sort starts with the very first two elements, comparing them to check which one is greater. In this case, 33 is greater than 14, so they are already in sorted order. Next, we compare 33 with 27 and find that 27 is smaller than 33, so these two values must be swapped. The new array looks like [14, 27, 33, 35, 10]. Next we compare 33 and 35 and find that both are already in sorted positions. Then we move to the next two values, 35 and 10. We know that 10 is smaller than 35, so they are not sorted, and we swap these values. We have now reached the end of the array; after one iteration, it looks like [14, 27, 33, 10, 35].
To be precise, the array looks as follows after each iteration: after the second iteration, [14, 27, 10, 33, 35]; after the third, [14, 10, 27, 33, 35]; and after the fourth, [10, 14, 27, 33, 35]. Notice that after each iteration, at least one value moves to the end. And when no swap is required during a pass, bubble sort learns that the array is completely sorted.
Now we should look into some practical aspects of bubble sort.

Algorithm

We assume list is an array of n elements. We further assume that the swap function swaps the values of the given array elements.
begin BubbleSort(list)

   for all elements of list
      if list[i] > list[i+1]
         swap(list[i], list[i+1])
      end if
   end for
   
   return list
   
end BubbleSort

Pseudocode

We observe in the algorithm that bubble sort compares each pair of adjacent elements and continues until the whole array is completely sorted in ascending order. This raises an efficiency issue: what if the array needs no more swapping because all the elements are already ascending?
To ease out this issue, we use one flag variable, swapped, which helps us see whether any swap has happened. If no swap has occurred, i.e. the array requires no more processing to be sorted, the procedure comes out of the loop.
Pseudocode of BubbleSort algorithm can be written as follows −
procedure bubbleSort( list : array of items )

   n = list.count;
   
   for i = 0 to n-2 do:
      swapped = false
  
      for j = 0 to n-2-i do:
      
         /* compare the adjacent elements */   
         if list[j] > list[j+1] then
            /* swap them */
            swap( list[j], list[j+1] )   
            swapped = true
         end if
         
      end for
      
      /* if no number was swapped, the 
      array is sorted now; break the loop. */
      
      if(not swapped) then
         break
      end if
      
   end for
   
   return list
   
end procedure

Insertion Sort


This is an in-place comparison-based sorting algorithm. Here, a sub-list is maintained which is always sorted. For example, the lower part of the array is maintained as sorted. An element which is to be 'insert'ed into this sorted sub-list has to find its appropriate place and then be inserted there. Hence the name, insertion sort.
The array is searched sequentially and unsorted items are moved and inserted into the sorted sub-list (in the same array). This algorithm is not suitable for large data sets, as its average and worst case complexity are of Ο(n²), where n is the number of items.

How Insertion Sort Works?

We take an unsorted array for our example: [14, 33, 27, 10, 35, 19, 42, 44].
Insertion sort compares the first two elements, 14 and 33, and finds they are already in ascending order. For now, 14 is in the sorted sub-list.
Insertion sort moves ahead and compares 33 with 27, and finds that 33 is not in the correct position. It swaps 33 with 27, and also checks 27 against all the elements of the sorted sub-list. Here we see that the sorted sub-list has only one element, 14, and 27 is greater than 14; hence, the sorted sub-list remains sorted after the swap: [14, 27, 33, 10, 35, 19, 42, 44].
By now we have 14 and 27 in the sorted sub-list. Next, insertion sort compares 33 with 10. These values are not in sorted order, so we swap them. However, the swap makes 27 and 10 unsorted, so we swap them too. Again we find 14 and 10 in unsorted order, and we swap them again. By the end of the third iteration, we have a sorted sub-list of 4 items: [10, 14, 27, 33, 35, 19, 42, 44].
This process goes on until all the unsorted values are covered by the sorted sub-list. Now we shall see some programming aspects of insertion sort.

Algorithm

Now we have a bigger picture of how this sorting technique works, so we can derive simple steps by which we can achieve insertion sort.
Step 1 − If it is the first element, it is already sorted. return 1;
Step 2 − Pick next element
Step 3 − Compare with all elements in the sorted sub-list
Step 4 − Shift all the elements in the sorted sub-list that are greater than the 
         value to be inserted
Step 5 − Insert the value
Step 6 − Repeat until list is sorted

Pseudocode

procedure insertionSort( A : array of items )
   int holePosition
   int valueToInsert
 
   for i = 1 to length(A)-1 inclusive do:
 
      /* select value to be inserted */
      valueToInsert = A[i]
      holePosition = i
      
      /*locate hole position for the element to be inserted */
  
      while holePosition > 0 and A[holePosition-1] > valueToInsert do:
         A[holePosition] = A[holePosition-1]
         holePosition = holePosition -1
      end while
  
      /* insert the number at hole position */
      A[holePosition] = valueToInsert
      
   end for
 
end procedure

Selection Sort

Selection sort is a simple sorting algorithm. This sorting algorithm is an in-place comparison-based algorithm in which the list is divided into two parts, the sorted part at the left end and the unsorted part at the right end. Initially, the sorted part is empty and the unsorted part is the entire list.
The smallest element is selected from the unsorted array and swapped with the leftmost element, and that element becomes a part of the sorted array. This process continues moving unsorted array boundary by one element to the right.
This algorithm is not suitable for large data sets as its average and worst case complexities are of Ο(n²), where n is the number of items.

How Selection Sort Works?

Consider the following array as an example: [14, 33, 27, 10, 35, 19, 42, 44].
For the first position in the sorted list, the whole list is scanned sequentially. Looking at the first position, where 14 is stored presently, we search the whole list and find that 10 is the lowest value. So we swap 14 with 10. After one iteration 10, which happens to be the minimum value in the list, appears in the first position of the sorted list: [10, 33, 27, 14, 35, 19, 42, 44].
For the second position, where 33 is residing, we start scanning the rest of the list in a linear manner. We find that 14 is the second lowest value in the list and that it should appear at the second place, so we swap these values. After two iterations, the two least values are positioned at the beginning in a sorted manner: [10, 14, 27, 33, 35, 19, 42, 44].
The same process is applied to the rest of the items in the array.
Now, let us learn some programming aspects of selection sort.

Algorithm

Step 1 − Set MIN to location 0
Step 2 − Search the minimum element in the list
Step 3 − Swap with value at location MIN
Step 4 − Increment MIN to point to next element
Step 5 − Repeat until list is sorted

Pseudocode

procedure selection sort 
   list  : array of items
   n     : size of list

   for i = 1 to n - 1
   /* set current element as minimum*/
      min = i    
  
      /* check the element to be minimum */

      for j = i+1 to n 
         if list[j] < list[min] then
            min = j;
         end if
      end for

      /* swap the minimum element with the current element*/
      if min != i  then
         swap list[min] and list[i]
      end if

   end for
 
end procedure

Merge Sort

Merge sort is a sorting technique based on the divide and conquer technique. With worst-case time complexity being Ο(n log n), it is one of the most respected algorithms.
Merge sort first divides the array into equal halves and then combines them in a sorted manner.

How Merge Sort Works?

To understand merge sort, we take an unsorted array as the following: [14, 33, 27, 10, 35, 19, 42, 44].
We know that merge sort first divides the whole array recursively into equal halves until atomic values are achieved. We see here that an array of 8 items is divided into two arrays of size 4: [14, 33, 27, 10] and [35, 19, 42, 44]. This does not change the sequence of appearance of items in the original. Now we divide these two arrays into halves: [14, 33], [27, 10], [35, 19], [42, 44]. We further divide these arrays and achieve atomic values which can no longer be divided: [14], [33], [27], [10], [35], [19], [42], [44].
Now, we combine them in exactly the same manner as they were broken down. We first compare the elements of each pair of lists and then combine them into another list in sorted order. We see that 14 and 33 are in sorted positions. We compare 27 and 10, and in the target list of 2 values we put 10 first, followed by 27. We change the order of 19 and 35, whereas 42 and 44 are placed sequentially: [14, 33], [10, 27], [19, 35], [42, 44].
In the next iteration of the combining phase, we compare lists of two data values and merge them into lists of four data values, placing all in sorted order: [10, 14, 27, 33], [19, 35, 42, 44].
After the final merging, the list becomes [10, 14, 19, 27, 33, 35, 42, 44].
Now we should learn some programming aspects of merge sorting.

Algorithm

Merge sort keeps on dividing the list into equal halves until it can no longer be divided. By definition, if there is only one element in the list, it is sorted. Then, merge sort combines the smaller sorted lists, keeping the resulting list sorted too.
Step 1 − if there is only one element in the list, it is already sorted; return.
Step 2 − divide the list recursively into two halves until it can no longer be divided.
Step 3 − merge the smaller lists into a new list in sorted order.

Pseudocode

We shall now see the pseudocode for merge sort. As our algorithm points out, it has two main functions − divide and merge.
Merge sort works with recursion, and we shall see our implementation in the same way.
procedure mergesort( var a as array )
   var n as int = length(a)
   if ( n == 1 ) return a

   var l1 as array = a[0] ... a[n/2-1]
   var l2 as array = a[n/2] ... a[n-1]

   l1 = mergesort( l1 )
   l2 = mergesort( l2 )

   return merge( l1, l2 )
end procedure

procedure merge( var a as array, var b as array )

   var c as array

   while ( a and b have elements )
      if ( a[0] > b[0] )
         add b[0] to the end of c
         remove b[0] from b
      else
         add a[0] to the end of c
         remove a[0] from a
      end if
   end while
   
   while ( a has elements )
      add a[0] to the end of c
      remove a[0] from a
   end while
   
   while ( b has elements )
      add b[0] to the end of c
      remove b[0] from b
   end while
   
   return c
 
end procedure

Shell Sort

Shell sort is a highly efficient sorting algorithm and is based on the insertion sort algorithm. This algorithm avoids large shifts, as happen in insertion sort when a smaller value is to the far right and has to be moved to the far left.
This algorithm uses insertion sort on widely spread elements first to sort them, and then sorts the less widely spaced elements. This spacing is termed the interval. The interval is calculated based on Knuth's formula as −

Knuth's Formula

h = h * 3 + 1

where −
   h is interval with initial value 1
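Starting from h = 1, the formula generates the interval sequence 1, 4, 13, 40, 121, …; the sort begins with the largest interval suitable for the array size and works back down to an interval of 1.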
This algorithm is quite efficient for medium-sized data sets. Its running time depends on the gap sequence; with Knuth's intervals, the worst case complexity is Ο(n^(3/2)), where n is the number of items.

How Shell Sort Works?

Let us consider the following example to have an idea of how shell sort works. We take the same array we have used in our previous examples. For our example and ease of understanding, we take the interval of 4. Make a virtual sub-list of all values located at the interval of 4 positions. Here these values are {35, 14}, {33, 19}, {42, 27} and {10, 14}
Shell Sort
We compare values in each sub-list and swap them (if necessary) in the original array. After this step, the new array should look like this −
Shell Sort
Then, we take interval of 2 and this gap generates two sub-lists - {14, 27, 35, 42}, {19, 10, 33, 44}
Shell Sort
We compare and swap the values, if required, in the original array. After this step, the array should look like this −
Shell Sort
Finally, we sort the rest of the array using interval of value 1. Shell sort uses insertion sort to sort the array.
Following is the step-by-step depiction −
Shell Sort
We see that it required only four swaps to sort the rest of the array.

Algorithm

Following is the algorithm for shell sort.
Step 1 − Initialize the value of h
Step 2 − Divide the list into smaller sub-lists of equal interval h
Step 3 − Sort these sub-lists using insertion sort
Step 4 − Repeat until the complete list is sorted

Pseudocode

Following is the pseudocode for shell sort.
procedure shellSort()
   A : array of items 
   interval = 1
 
   /* calculate interval*/
   while interval < A.length /3 do:
      interval = interval * 3 + 1     
   end while
   
   while interval > 0 do:

      for outer = interval; outer < A.length; outer ++ do:

      /* select value to be inserted */
      valueToInsert = A[outer]
      inner = outer;

         /*shift element towards right*/
         while inner > interval -1 && A[inner - interval] >= valueToInsert do:
            A[inner] = A[inner - interval]
            inner = inner - interval
         end while

      /* insert the number at hole position */
      A[inner] = valueToInsert

      end for

   /* calculate interval*/
   interval = (interval -1) /3;   

   end while
   
end procedure
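Following the style of the other C programs in these notes, here is a runnable C sketch of the pseudocode above (our own implementation; the sample array matches the walkthrough) −
#include <stdio.h>

#define MAX 8

int intArray[MAX] = {35, 33, 42, 10, 14, 19, 27, 44};

void display() {
   int i;
   printf("[");
   for (i = 0; i < MAX; i++)
      printf("%d ", intArray[i]);
   printf("]\n");
}

void shellSort() {
   int inner, outer, valueToInsert;
   int interval = 1;

   // compute the largest Knuth interval for this array size
   while (interval < MAX / 3)
      interval = interval * 3 + 1;

   while (interval > 0) {
      // gapped insertion sort for the current interval
      for (outer = interval; outer < MAX; outer++) {
         valueToInsert = intArray[outer];
         inner = outer;

         // shift larger sub-list elements towards the right
         while (inner > interval - 1 && intArray[inner - interval] >= valueToInsert) {
            intArray[inner] = intArray[inner - interval];
            inner -= interval;
         }

         // insert the value at the hole position
         intArray[inner] = valueToInsert;
      }

      interval = (interval - 1) / 3;
   }
}

int main() {
   printf("Input Array: ");
   display();
   shellSort();
   printf("Output Array: ");
   display();
   return 0;
}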

Quick Sort

Quick sort is a highly efficient sorting algorithm and is based on partitioning an array of data into smaller arrays. A large array is partitioned into two arrays, one of which holds values smaller than a specified value, say pivot, based on which the partition is made, and another array holds values greater than the pivot value.
Quick sort partitions an array and then calls itself recursively twice to sort the two resulting subarrays. This algorithm is quite efficient for large-sized data sets, as its average-case complexity is Ο(n log n), where n is the number of items; its worst case, however, is Ο(n²).

Partition in Quick Sort

The pivot value divides the list into two parts, and recursively, we find the pivot for each sub-list until all lists contain only one element.

Quick Sort Pivot Algorithm

Based on our understanding of partitioning in quick sort, we will now try to write an algorithm for it, which is as follows.
Step 1 − Choose the highest index value as pivot
Step 2 − Take two variables to point left and right of the list excluding pivot
Step 3 − left points to the low index
Step 4 − right points to the high index
Step 5 − while value at left is less than pivot move left towards right
Step 6 − while value at right is greater than pivot move right towards left
Step 7 − if both step 5 and step 6 stop and left is less than right, swap the values at left and right
Step 8 − if left ≥ right, the point where they met is the pivot's final position

Quick Sort Pivot Pseudocode

The pseudocode for the above algorithm can be derived as −
function partitionFunc(left, right, pivot)
   leftPointer = left -1
   rightPointer = right

   while True do
      while A[++leftPointer] < pivot do
         //do-nothing            
      end while
  
      while rightPointer > 0 && A[--rightPointer] > pivot do
         //do-nothing         
      end while
  
      if leftPointer >= rightPointer
         break
      else                
         swap A[leftPointer], A[rightPointer]
      end if
  
   end while 
 
   swap A[leftPointer], A[right]
   return leftPointer
 
end function

Quick Sort Algorithm

Using pivot algorithm recursively, we end up with smaller possible partitions. Each partition is then processed for quick sort. We define recursive algorithm for quicksort as follows −
Step 1 − Make the right-most index value pivot
Step 2 − partition the array using pivot value
Step 3 − quicksort left partition recursively
Step 4 − quicksort right partition recursively

Quick Sort Pseudocode

To get more into it, let see the pseudocode for quick sort algorithm −
procedure quickSort(left, right)

   if right-left <= 0
      return
   else     
      pivot = A[right]
      partition = partitionFunc(left, right, pivot)
      quickSort(left,partition-1)
      quickSort(partition+1,right)    
   end if  
   
end procedure
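Following the style of the other C programs in these notes, here is a runnable C sketch combining the two pseudocode procedures above (our own implementation; the sample array is reused from the earlier programs) −
#include <stdio.h>

#define MAX 7

int intArray[MAX] = {4, 6, 3, 2, 1, 9, 7};

void display() {
   int i;
   printf("[");
   for (i = 0; i < MAX; i++)
      printf("%d ", intArray[i]);
   printf("]\n");
}

void swap(int i, int j) {
   int temp = intArray[i];
   intArray[i] = intArray[j];
   intArray[j] = temp;
}

int partitionFunc(int left, int right, int pivot) {
   int leftPointer = left - 1;
   int rightPointer = right;

   while (1) {
      // move left pointer right while values are below the pivot
      while (intArray[++leftPointer] < pivot)
         ;
      // move right pointer left while values are above the pivot
      while (rightPointer > 0 && intArray[--rightPointer] > pivot)
         ;
      if (leftPointer >= rightPointer)
         break;
      swap(leftPointer, rightPointer);
   }

   // place the pivot at its final position
   swap(leftPointer, right);
   return leftPointer;
}

void quickSort(int left, int right) {
   if (right - left <= 0)
      return;
   int pivot = intArray[right];   // right-most element as pivot
   int partitionPoint = partitionFunc(left, right, pivot);
   quickSort(left, partitionPoint - 1);
   quickSort(partitionPoint + 1, right);
}

int main() {
   printf("Input Array: ");
   display();
   quickSort(0, MAX - 1);
   printf("Output Array: ");
   display();
   return 0;
}
Note that, as in the pseudocode, the right-most element is always chosen as pivot; this makes already-sorted input trigger the Ο(n²) worst case mentioned above.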

Insertion Sort Program in C

Implementation in C

#include <stdio.h>
#include <stdbool.h>
#define MAX 7

int intArray[MAX] = {4,6,3,2,1,9,7};

void printline(int count) {
   int i;
 
   for(i = 0;i <count-1;i++) {
      printf("=");
   }
 
   printf("=\n");
}

void display() {
   int i;
   printf("[");
 
   // navigate through all items 
   for(i = 0;i<MAX;i++) {
      printf("%d ",intArray[i]);
   }
 
   printf("]\n");
}

void insertionSort() {

   int valueToInsert;
   int holePosition;
   int i;
  
   // loop through all numbers 
   for(i = 1; i < MAX; i++) { 
 
      // select a value to be inserted. 
      valueToInsert = intArray[i];
  
      // select the hole position where number is to be inserted 
      holePosition = i;
  
      // check if previous no. is larger than value to be inserted 
      while (holePosition > 0 && intArray[holePosition-1] > valueToInsert) {
         intArray[holePosition] = intArray[holePosition-1];
         holePosition--;
         printf(" item moved : %d\n" , intArray[holePosition]);
      }

      if(holePosition != i) {
         printf(" item inserted : %d, at position : %d\n" , valueToInsert,holePosition);
         // insert the number at hole position 
         intArray[holePosition] = valueToInsert;
      }

      printf("Iteration %d#:",i);
      display();
  
   }  
}

int main() {
   printf("Input Array: ");
   display();
   printline(50);
   insertionSort();
   printf("Output Array: ");
   display();
   printline(50);
}
If we compile and run the above program, it will produce the following result −

Output

Input Array: [4 6 3 2 1 9 7 ]
==================================================
Iteration 1#:[4 6 3 2 1 9 7 ]
 item moved : 6
 item moved : 4
 item inserted : 3, at position : 0
Iteration 2#:[3 4 6 2 1 9 7 ]
 item moved : 6
 item moved : 4
 item moved : 3
 item inserted : 2, at position : 0
Iteration 3#:[2 3 4 6 1 9 7 ]
 item moved : 6
 item moved : 4
 item moved : 3
 item moved : 2
 item inserted : 1, at position : 0
Iteration 4#:[1 2 3 4 6 9 7 ]
Iteration 5#:[1 2 3 4 6 9 7 ]
 item moved : 9
 item inserted : 7, at position : 5
Iteration 6#:[1 2 3 4 6 7 9 ]
Output Array: [1 2 3 4 6 7 9 ]
==================================================

Selection Sort Program in C

Implementation in C

#include <stdio.h>
#include <stdbool.h>
#define MAX 7

int intArray[MAX] = {4,6,3,2,1,9,7};

void printline(int count) {
   int i;
 
   for(i = 0;i <count-1;i++) {
      printf("=");
   }
 
   printf("=\n");
}

void display() {
   int i;
   printf("[");
 
   // navigate through all items 
   for(i = 0;i<MAX;i++) {
      printf("%d ", intArray[i]);
   }
 
   printf("]\n");
}

void selectionSort() {

   int indexMin,i,j;
 
   // loop through all numbers 
   for(i = 0; i < MAX-1; i++) { 
 
      // set current element as minimum 
      indexMin = i;
  
      // check the element to be minimum 
      for(j = i+1;j<MAX;j++) {
         if(intArray[j] < intArray[indexMin]) {
            indexMin = j;
         }
      }

      if(indexMin != i) {
         printf("Items swapped: [ %d, %d ]\n" , intArray[i], intArray[indexMin]); 
   
         // swap the numbers 
         int temp = intArray[indexMin];
         intArray[indexMin] = intArray[i];
         intArray[i] = temp;
      }          

      printf("Iteration %d#:",(i+1));
      display();
   }
}  

int main() {
   printf("Input Array: ");
   display();
   printline(50);
   selectionSort();
   printf("Output Array: ");
   display();
   printline(50);
}
If we compile and run the above program, it will produce the following result −

Output

Input Array: [4 6 3 2 1 9 7 ]
==================================================
Items swapped: [ 4, 1 ]
Iteration 1#:[1 6 3 2 4 9 7 ]
Items swapped: [ 6, 2 ]
Iteration 2#:[1 2 3 6 4 9 7 ]
Iteration 3#:[1 2 3 6 4 9 7 ]
Items swapped: [ 6, 4 ]
Iteration 4#:[1 2 3 4 6 9 7 ]
Iteration 5#:[1 2 3 4 6 9 7 ]
Items swapped: [ 9, 7 ]
Iteration 6#:[1 2 3 4 6 7 9 ]
Output Array: [1 2 3 4 6 7 9 ]
==================================================

Merge Sort Program in C

Merge sort is a sorting technique based on divide and conquer technique. With the worst-case time complexity being Ο(n log n), it is one of the most respected algorithms.

Implementation in C

We shall see the implementation of merge sort in C programming language here −
#include <stdio.h>
#define max 10

int a[10] = { 10, 14, 19, 26, 27, 31, 33, 35, 42, 44 };
int b[10];

void merging(int low, int mid, int high) {
   int l1, l2, i;

   for(l1 = low, l2 = mid + 1, i = low; l1 <= mid && l2 <= high; i++) {
      if(a[l1] <= a[l2])
         b[i] = a[l1++];
      else
         b[i] = a[l2++];
   }
   
   while(l1 <= mid)    
      b[i++] = a[l1++];

   while(l2 <= high)   
      b[i++] = a[l2++];

   for(i = low; i <= high; i++)
      a[i] = b[i];
}

void sort(int low, int high) {
   int mid;
   
   if(low < high) {
      mid = (low + high) / 2;
      sort(low, mid);
      sort(mid+1, high);
      merging(low, mid, high);
   } else { 
      return;
   }   
}

int main() { 
   int i;

   printf("List before sorting\n");
   
   for(i = 0; i < max; i++)
      printf("%d ", a[i]);

   sort(0, max - 1);

   printf("\nList after sorting\n");
   
   for(i = 0; i < max; i++)
      printf("%d ", a[i]);
}
If we compile and run the above program, it will produce the following result −

Output

List before sorting
10 14 19 26 27 31 33 35 42 44
List after sorting
10 14 19 26 27 31 33 35 42 44


Bubble Sort Program in C

Implementation in C

Bubble sort implementation in C programming language −

#include <stdio.h>
#include <stdbool.h>

#define MAX 10

int list[MAX] = {1,8,4,6,0,3,5,2,7,9};

void display() {
   int i;
   printf("[");
 
   // navigate through all items 
   for(i = 0; i < MAX; i++) {
      printf("%d ",list[i]);
   }
 
   printf("]\n");
}

void bubbleSort() {
   int temp;
   int i,j;
 
   bool swapped = false;
   
   // loop through all numbers 
   for(i = 0; i < MAX-1; i++) { 
      swapped = false;
  
      // loop through numbers falling ahead 
      for(j = 0; j < MAX-1-i; j++) {
         printf("     Items compared: [ %d, %d ] ", list[j],list[j+1]);

         // check if next number is lesser than current no
         //   swap the numbers. 
         //  (Bubble up the highest number)
   
         if(list[j] > list[j+1]) {
            temp = list[j];
            list[j] = list[j+1];
            list[j+1] = temp;

            swapped = true;
            printf(" => swapped [%d, %d]\n",list[j],list[j+1]);
         }else {
            printf(" => not swapped\n");
         }
   
      }

      // if no number was swapped that means 
      //   array is sorted now, break the loop. 
      if(!swapped) {
         break;
      }
      
      printf("Iteration %d#: ",(i+1)); 
      display();
   }
 
}

int main() {
   printf("Input Array: ");
   display();
   printf("\n");
 
   bubbleSort();
   printf("\nOutput Array: ");
   display();
}
If we compile and run the above program, it will produce the following result −

Output

Input Array: [1 8 4 6 0 3 5 2 7 9 ]

     Items compared: [ 1, 8 ]  => not swapped
     Items compared: [ 8, 4 ]  => swapped [4, 8]
     Items compared: [ 8, 6 ]  => swapped [6, 8]
     Items compared: [ 8, 0 ]  => swapped [0, 8]
     Items compared: [ 8, 3 ]  => swapped [3, 8]
     Items compared: [ 8, 5 ]  => swapped [5, 8]
     Items compared: [ 8, 2 ]  => swapped [2, 8]
     Items compared: [ 8, 7 ]  => swapped [7, 8]
     Items compared: [ 8, 9 ]  => not swapped
Iteration 1#: [1 4 6 0 3 5 2 7 8 9 ]
     Items compared: [ 1, 4 ]  => not swapped
     Items compared: [ 4, 6 ]  => not swapped
     Items compared: [ 6, 0 ]  => swapped [0, 6]
     Items compared: [ 6, 3 ]  => swapped [3, 6]
     Items compared: [ 6, 5 ]  => swapped [5, 6]
     Items compared: [ 6, 2 ]  => swapped [2, 6]
     Items compared: [ 6, 7 ]  => not swapped
     Items compared: [ 7, 8 ]  => not swapped
Iteration 2#: [1 4 0 3 5 2 6 7 8 9 ]
     Items compared: [ 1, 4 ]  => not swapped
     Items compared: [ 4, 0 ]  => swapped [0, 4]
     Items compared: [ 4, 3 ]  => swapped [3, 4]
     Items compared: [ 4, 5 ]  => not swapped
     Items compared: [ 5, 2 ]  => swapped [2, 5]
     Items compared: [ 5, 6 ]  => not swapped
     Items compared: [ 6, 7 ]  => not swapped
Iteration 3#: [1 0 3 4 2 5 6 7 8 9 ]
     Items compared: [ 1, 0 ]  => swapped [0, 1]
     Items compared: [ 1, 3 ]  => not swapped
     Items compared: [ 3, 4 ]  => not swapped
     Items compared: [ 4, 2 ]  => swapped [2, 4]
     Items compared: [ 4, 5 ]  => not swapped
     Items compared: [ 5, 6 ]  => not swapped
Iteration 4#: [0 1 3 2 4 5 6 7 8 9 ]
     Items compared: [ 0, 1 ]  => not swapped
     Items compared: [ 1, 3 ]  => not swapped
     Items compared: [ 3, 2 ]  => swapped [2, 3]
     Items compared: [ 3, 4 ]  => not swapped
     Items compared: [ 4, 5 ]  => not swapped
Iteration 5#: [0 1 2 3 4 5 6 7 8 9 ]
     Items compared: [ 0, 1 ]  => not swapped
     Items compared: [ 1, 2 ]  => not swapped
     Items compared: [ 2, 3 ]  => not swapped
     Items compared: [ 3, 4 ]  => not swapped

Output Array: [0 1 2 3 4 5 6 7 8 9 ]
