01: course overview

course format

  • Topic: Survey of practical data structures and algorithms used in industrial software design.

  • Textbook: Adam Drozdek, "Data Structures and Algorithms in C++", Fourth Edition.

  • Supplemental Textbook: Robert Sedgewick, "Algorithms in C++, Parts 1-4."

  • Prerequisites: Undergraduate degree in Computer Science, Computer Engineering or Electrical Engineering.

grading

  • Basis For Grades:

    Item Count Percentage of Grade
    Homeworks 5 40%
    Exams 2 60%

    The total exam grade is the sum of:
    • Midterm Exam
    • Final Exam
  • Grade Server:

    Each programming homework consists of testcases. The score on an assignment is based on the number of testcases that pass.

    Homeworks are submitted to a web-based grading server that automatically grades the testcases. The wall-clock speed of an assignment is measured. The top ten fastest submissions per assignment get bonus points.

  • Independent Work:

    Each final submission per homework is manually reviewed for cheating with other students. On top of this, data analytics is done to flag code. Flagged code is reviewed carefully by the instructor.

    Last semester two students were caught cheating on the first four assignments. They both recieved a failing grade in the course and cannot graduate from Syracuse University until they improve their grade.

    Assignments are different from last year. All submissions from last year have been saved and automatic algorithms will analyze for similarities between semesters.

    The only way to be safe from getting caught cheating is to work absolutely independently. You should never see a single line of another persons code. You should not discuss anything regarding homeworks with another student. Working in groups is not allowed.


complexity analysis

  • Definition:

    Complexity analysis categorizes an algorithm based on resource usage in relation to the input size (n).

  • Time Complexity:

    A categorization related to time.

  • Space Complexity:

    A categorization related to space.

  • Example:

    To sort all of the names of every person in the world, we can choose either an n^2 or nlgn algorithm

    There are 6.975 billion people in the world (n). The time to execute depends on the time complexity.

    Assume we have to do 1 nanosecond of computation per sort operation.

    • O(n^2) bubblesort:
      • n^2 = 6.975 billion ^ 2 = 4.8509e19 operations
      • seconds = 4.8509e19 operations * 1 (nanosecond / operation)
      • seconds = 4.86e10
      • years = 4.86e10 seconds / 31,536,000 (seconds / year)
      • years = 1,541
    • O(nlgn) heapsort:
      • nlgn = 6.975 billion * lg(6.975 billion) = 2.3001e11 operations
      • seconds = 2.3e11 operations * 1 (nanosecond / operation)
      • seconds = 230
      • minutes = 230 seconds / 60 (seconds / minute)
      • minutes = 3.8

    You can program a bubblesort in 5 minutes or you can program a heapsort in 1 hour. The purpose of complexity analysis is to do back of the envelope calculations to gauge how much programming time should be used optimizing performance. Or to see if an operation on a data input size is feasible within our lifetime with our most advanced algorithms.

    Usually we want to choose an algorithm with a better time or space complexity

    • But this is not always true!
      • O(n^2) is really an^2 + bn + c
      • O(nlgn) is really anlgn + bn + c
      • Sometimes the constant factors of an O(n^2) algorithm are much smaller than, say, an O(nlgn) algorithm for small n
    • Sometimes we have to choose either a good time complexity or a good space complexity

growth of functions

  • f(n) = n^2 + 50nlgn + 100n + lgn + 1000
n f(n) n^2 50nlgn 100n lgn 1000
    Value % Value % Value % Value % Value %
2 1305 4 0.31 100 7.66 200 15.33 1 0.08 1000 76.63
16 6060 256 4.22 3200 52.81 1600 26.40 4 0.07 1000 16.50
256 194544 65536 33.69 102400 52.64 25600 13.16 8 0.00 1000 0.51
4096 19645428 16777216 85.40 2457600 12.51 409600 2.08 12 0.00 1000 0.01
65536 4353950712 4294967296 98.65 52428800 1.20 6553600 0.15 16 0.00 1000 0.00


common cyclomatic complexities

  • O(n^3)
  • O(n^2)
  • O(nlgn)
  • O(n)
  • O(lgn)
  • O(1)
Draw graph of this function including the constant term on the board


lgn

  • We use log base 2 in computer science often
  • lgn is the number of times you divide n by 2 until you get to one
  • Often lgn comes up as part of a complexity because we divide a problem into two until we cannot anymore
An example log base 2 calculator:
1:  
// logBase2.cpp - download here

2:  

3:  
#include
 
<
iostream>

4:  

5:  
int
 
logBase2(
int
 
n)
{

6:  
 
 
int
 
ret 
=
 
0;

7:  
 
 
while
(
>
 
1)
{

8:  
 
 
 
 
/=
 
2;

9:  
 
 
 
 
ret++;

10: 
 
 
}

11: 
 
 
return
 
ret;

12: 
}

13: 

14: 
int
 
main(
int
 
argc, 
char
 
argv[
]
)
{

15: 

16: 
 
 
for
(
int
 
=
 
0;
 
<
=
 
32;
 
++i)
{

17: 
 
 
 
 
int
 
log2 
=
 
logBase2(
i)
;

18: 
 
 
 
 
std:
:
cout 
<
<
 
"the log base 2 of ("
 
<
<
 
<
<
 
") = "
 
<
<
 
log2 
<
<
 
std:
:
endl;

19: 
 
 
}

20: 

21: 
 
 
return
 
0;

22: 
}


The results:
  • lg(0) = 0
  • lg(1) = 0
  • lg(2) = 1
  • lg(4) = 2
  • lg(8) = 3
  • lg(16) = 4
  • lg(32) = 5
  • lg(64) = 6
  • lg(128) = 7
  • lg(256) = 8
  • lg(512) = 9
  • lg(1024) = 10
  • lg(4*1024*1024) = 32

binary for 256: 0b100000000.
a single high bit and 8 zero in the binary number.
lg(256) = 8.
each leading one creates a new lgn increment because binary is base2.

lgn grows extremely slowly compared to n. so nlgn is much better than n^2.


big-o notation

  • f(n) = n^2 + 50nlgn + 100n + lgn + 1000
  • O(f(n)) = O(n^2)
  • As n becomes large, the terms that grow smaller than n^2 do not matter
  • Big O defines the upper bounds of functions. In the worst possible performing case of an algorithm, it will never be greater than O(f(n)) by more than a constant factor.
  • Big O is a best possible worst case.
  • f(n) = 3n^2
    • O(f(n)) = O(n^2)
    • O(f(n)) != O(n^3)
    • O(f(n)) != O(n)

complexity class examples

  • O(1) - Simple Calculation: Conversion from Fahrenheit to Celsius
    1: 
    // temperature.cpp - download here

    2: 

    3: 
    float
     
    toCelsius(
    float
     
    fahrenheit_temp)
    {

    4: 
     
     
    float
     
    ret 
    =
     
    (
    fahrenheit_temp 
    32)
     
    5.0 
    9.0;

    5: 
     
     
    return
     
    ret;

    6: 
    }

  • O(lgn) - Binary Search: Given a sorted sequence, find if an element exists in the sequence
    1:  
    // binarySearch.cpp - download here

    2:  

    3:  
    int
     
    binarySearch(
    int
     
    array, 
    int
     
    n, 
    int
     
    key)
    {

    4:  
     
     
    int
     
    low 
    =
     
    0;

    5:  
     
     
    int
     
    mid;

    6:  
     
     
    int
     
    high 
    =
     
    n-1;

    7:  

    8:  
     
     
    while
    (
    low 
    <
    =
     
    high)
    {

    9:  
     
     
     
     
    mid 
    =
     
    (
    low 
    high)
     
    2;

    10: 
     
     
     
     
    if
    (
    key 
    <
     
    array[
    mid]
    )
    {

    11: 
     
     
     
     
     
     
    high 
    =
     
    mid 
    1;

    12: 
     
     
     
     
    }
     
    else
     
    if
    (
    array[
    mid]
     
    <
     
    key)
    {

    13: 
     
     
     
     
     
     
    low 
    =
     
    mid 
    1;

    14: 
     
     
     
     
    }
     
    else
     
    {

    15: 
     
     
     
     
     
     
    return
     
    mid;

    16: 
     
     
     
     
    }

    17: 
     
     
    }

    18: 
     
     
    return
     
    -1;

    19: 
    }

  • O(n) - Compute the Sum of an Array
    1:  
    // computeSum.cpp - download here

    2:  

    3:  
    int
     
    computeSum(
    int
     
    array, 
    int
     
    n)
    {

    4:  
     
     
    int
     
    ret 
    =
     
    0;

    5:  
     
     
    for
    (
    int
     
    =
     
    0;
     
    <
     
    n;
     
    ++i)
    {

    6:  
     
     
     
     
    ret 
    +=
     
    array[
    i]
    ;

    7:  
     
     
    }

    8:  
     
     
    return
     
    ret;

    9:  
    }

  • O(n) - Moving Average: Just because there are two for loops doesn't mean it is O(n^2)
    1:  
    // movingAverage.cpp - download here

    2:  

    3:  
    int
     
    movingAverage(
    int
     
    data, 
    int
     
    n, 
    int
     
    window_size)
    {

    4:  
     
     
    int
     
    ret 
    =
     
    new
     
    int
    [
    n]
    ;

    5:  
     
     
    int
     
    half_window 
    =
     
    window_size 
    2;

    6:  
     
     
    for
    (
    int
     
    =
     
    0;
     
    <
     
    n;
     
    ++i)
    {

    7:  
     
     
     
     
    int
     
    sum 
    =
     
    0;

    8:  
     
     
     
     
    int
     
    num_counted 
    =
     
    0;

    9:  
     
     
     
     
    for
    (
    int
     
    =
     
    half_window;
     
    <
     
    half_window;
     
    ++j)
    {

    10: 
     
     
     
     
     
     
    if
    (
    >
    =
     
    &
    &
     
    <
     
    n)
    {

    11: 
     
     
     
     
     
     
     
     
    num_counted++;

    12: 
     
     
     
     
     
     
     
     
    sum 
    +=
     
    data[
    j]
    ;

    13: 
     
     
     
     
     
     
    }

    14: 
     
     
     
     
     
     
    ret[
    i]
     
    =
     
    sum 
    num_counted;

    15: 
     
     
     
     
    }

    16: 
     
     
    }

    17: 
     
     
    return
     
    ret;

    18: 
    }

  • O(nlgn) - Efficient Sort (Merge Sort), We'll study this later, don't worry about it for now.
    1:  
    // mergeSort.cpp - download here

    2:  

    3:  
    #include
     
    <
    iostream>

    4:  

    5:  
    int
     
    merge(
    int
     
    left, 
    int
     
    right, 
    int
     
    len)
    {

    6:  
     
     
    int
     
    ret 
    =
     
    new
     
    int
    [
    len 
    len]
    ;

    7:  
     
     
    int
     
    left_position 
    =
     
    0;

    8:  
     
     
    int
     
    right_position 
    =
     
    0;

    9:  
     
     
    int
     
    ret_position 
    =
     
    0;

    10: 
     
     
    while
    (
    left_position 
    <
     
    len 
    &
    &
     
    right_position 
    <
     
    len)
    {

    11: 
     
     
     
     
    int
     
    left_value 
    =
     
    left[
    left_position]
    ;

    12: 
     
     
     
     
    int
     
    right_value 
    =
     
    right[
    right_position]
    ;

    13: 
     
     
     
     
    if
    (
    left_value 
    <
     
    right_value)
    {

    14: 
     
     
     
     
     
     
    ret[
    ret_position]
     
    =
     
    left_value;

    15: 
     
     
     
     
     
     
    ret_position++;

    16: 
     
     
     
     
     
     
    left_position++;

    17: 
     
     
     
     
    }
     
    else
     
    {

    18: 
     
     
     
     
     
     
    ret[
    ret_position]
     
    =
     
    right_value;

    19: 
     
     
     
     
     
     
    ret_position++;

    20: 
     
     
     
     
     
     
    right_position++;

    21: 
     
     
     
     
    }

    22: 
     
     
    }

    23: 
     
     
    while
    (
    left_position 
    <
     
    len)
    {

    24: 
     
     
     
     
    ret[
    ret_position]
     
    =
     
    left[
    left_position]
    ;

    25: 
     
     
     
     
    ret_position++;

    26: 
     
     
     
     
    left_position++;

    27: 
     
     
    }

    28: 
     
     
    while
    (
    right_position 
    <
     
    len)
    {

    29: 
     
     
     
     
    ret[
    ret_position]
     
    =
     
    right[
    right_position]
    ;

    30: 
     
     
     
     
    ret_position++;

    31: 
     
     
     
     
    right_position++;

    32: 
     
     
    }

    33: 
     
     
    return
     
    ret;

    34: 
    }

    35: 

    36: 
    int
     
    mergeSort(
    int
     
    input, 
    int
     
    len)
    {

    37: 
     
     
    if
    (
    len 
    =
    =
     
    1)
    {

    38: 
     
     
     
     
    return
     
    input;

    39: 
     
     
    }

    40: 
     
     
    int
     
    middle 
    =
     
    len 
    2;

    41: 
     
     
    int
     
    left 
    =
     
    new
     
    int
    [
    middle]
    ;

    42: 
     
     
    int
     
    right 
    =
     
    new
     
    int
    [
    middle]
    ;

    43: 
     
     
    for
    (
    int
     
    =
     
    0;
     
    <
     
    middle;
     
    ++i)
    {

    44: 
     
     
     
     
    left[
    i]
     
    =
     
    input[
    i]
    ;

    45: 
     
     
    }

    46: 
     
     
    for
    (
    int
     
    =
     
    0;
     
    <
     
    middle;
     
    ++i)
    {

    47: 
     
     
     
     
    right[
    i]
     
    =
     
    input[
    i+middle]
    ;

    48: 
     
     
    }

    49: 
     
     
    left 
    =
     
    mergeSort(
    left, 
    middle)
    ;

    50: 
     
     
    right 
    =
     
    mergeSort(
    right, 
    middle)
    ;

    51: 
     
     
    int
     
    ret 
    =
     
    merge(
    left, 
    right, 
    middle)
    ;

    52: 
     
     
    delete
     
    [
    ]
     
    left;

    53: 
     
     
    delete
     
    [
    ]
     
    right;

    54: 
     
     
    return
     
    ret;

    55: 
    }

    56: 

    57: 
    int
     
    main(
    int
     
    argc, 
    char
     
    argv[
    ]
    )
    {

    58: 

    59: 
     
     
    int
     
    array 
    =
     
    new
     
    int
    [
    8]
    ;

    60: 
     
     
    array[
    0]
     
    =
     
    8;

    61: 
     
     
    array[
    1]
     
    =
     
    2;

    62: 
     
     
    array[
    2]
     
    =
     
    4;

    63: 
     
     
    array[
    3]
     
    =
     
    9;

    64: 
     
     
    array[
    4]
     
    =
     
    3;

    65: 
     
     
    array[
    5]
     
    =
     
    6;

    66: 
     
     
    array[
    6]
     
    =
     
    10;

    67: 
     
     
    array[
    7]
     
    =
     
    5;

    68: 

    69: 
     
     
    int
     
    sorted 
    =
     
    mergeSort(
    array, 
    8)
    ;

    70: 
     
     
    for
    (
    int
     
    =
     
    0;
     
    <
     
    8;
     
    ++i)
    {

    71: 
     
     
     
     
    std:
    :
    cout 
    <
    <
     
    sorted[
    i]
     
    <
    <
     
    " "
    ;

    72: 
     
     
    }

    73: 
     
     
    std:
    :
    cout 
    <
    <
     
    std:
    :
    endl;

    74: 

    75: 
     
     
    return
     
    0;

    76: 
    }

  • O(n^2) - Bubble Sort: given any sequence, ensure all the elements are in ascending (or descending) order
    1:  
    // bubbleSort.cpp - download here

    2:  

    3:  
    void
     
    bubbleSort(
    int
     
    array, 
    int
     
    n)
    {

    4:  
     
     
    for
    (
    int
     
    =
     
    0;
     
    <
     
    n;
     
    ++i)
    {

    5:  
     
     
     
     
    for
    (
    int
     
    =
     
    1;
     
    <
     
    n;
     
    ++j)
    {

    6:  
     
     
     
     
     
     
    int
     
    lhs 
    =
     
    array[
    i]
    ;

    7:  
     
     
     
     
     
     
    int
     
    rhs 
    =
     
    array[
    j]
    ;

    8:  
     
     
     
     
     
     
    if
    (
    rhs 
    <
     
    lhs)
    {

    9:  
     
     
     
     
     
     
     
     
    array[
    i]
     
    =
     
    rhs;

    10: 
     
     
     
     
     
     
     
     
    array[
    j]
     
    =
     
    lhs;

    11: 
     
     
     
     
     
     
    }

    12: 
     
     
     
     
    }

    13: 
     
     
    }

    14: 
    }