Arrays and Trees (Data Structure Types)

It is important to understand these, because they’re the backbone of modern databases. It is also important to understand the notion of database index.

In this post, we will go through two types of data structures, arrays and trees. There is also another important data structure type known as Hash Table, but we will cover that in another post!

Array:

The two-dimensional array is the simplest data structure, and a table can be seen as an array.

Each row represents a subject, and each column a feature that describes the subjects. Each column stores a certain type of data, such as an integer, a string, or a date.

Let’s say there’s a table array holding worker information. If you wanted to find all the people who work in the UK, you’d have to look at each row and check whether it belongs to the UK.

*This will cost you N operations* (with N being the number of rows).

It’s not a bad solution – but could there be a faster way? This is where trees come into play. Most modern databases do provide advanced structures to store tables efficiently, like heap-organized tables or index-organized tables, but that does not solve the problem of quickly searching for rows that match a condition on a group of columns.
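To make that cost concrete, here is a minimal Swift sketch of the full scan; the Worker type and the sample rows are made up for illustration:

    struct Worker {
        let name: String
        let country: String
    }

    // A tiny made-up "table" of workers.
    let workers = [
        Worker(name: "Amira", country: "UK"),
        Worker(name: "Ben", country: "France"),
        Worker(name: "Chloe", country: "UK")
    ]

    // Full scan: every row is inspected, so the cost grows linearly with N.
    var ukWorkers: [Worker] = []
    for worker in workers {
        if worker.country == "UK" {
            ukWorkers.append(worker)
        }
    }
    print(ukWorkers.map { $0.name })   // ["Amira", "Chloe"]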

Trees:

A binary search tree is a binary tree with a special property. The key in each node must be:

  • Greater than all keys stored in the left sub-tree
  • Smaller than all keys stored in the right sub-tree

Let’s look at a tree with N = 15 elements. Let’s say that within this tree, I’m looking for a specific key, 208:

I start with the root and compare it to 208 to see which is greater or smaller, then continue down the appropriate path of the tree.

Let’s say 208 > root.

This means we move to the right side of the tree.

If the next node’s value is greater than 208, then we look at its left subtree.

We continue this comparison until the value is either found, or does not exist.

In the end, the search costs the number of levels in the tree, and there are log(N) levels. Look again at the tree:

The top level starts with one node, the root itself. As we move down the tree, the number of nodes at each level follows a pattern:

Level 1: one node (the root)

Level 2: two nodes

Level 3: four nodes

Level 4: eight nodes

Each level doubles the number of nodes, so a tree with N nodes has about log(N) levels. This means the cost of a search is log(N) – a definite upgrade over the N operations it would take to search the array directly.
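Here is a minimal Swift sketch of that walk, assuming a hypothetical Node class whose keys already satisfy the binary search tree property (the 208 comes from the example above; the other keys are made up):

    final class Node {
        let key: Int
        var left: Node?
        var right: Node?
        init(key: Int) { self.key = key }
    }

    // At each node we discard half of the remaining keys, so the search visits
    // at most one node per level: roughly log(N) nodes in total.
    func contains(_ key: Int, in node: Node?) -> Bool {
        guard let node = node else { return false }   // fell off the tree: key not present
        if key == node.key { return true }
        return key < node.key
            ? contains(key, in: node.left)            // key is smaller: search the left subtree
            : contains(key, in: node.right)           // key is greater: search the right subtree
    }

    let root = Node(key: 136)
    root.right = Node(key: 398)
    root.right?.left = Node(key: 208)
    print(contains(208, in: root))                    // true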

Do you see why this is efficient? The gap between N and log(N) operations grows enormously as N gets bigger.

It is similar to searching a phonebook: you don’t read every entry, you keep jumping toward the section that must contain the exact name you’re searching for.

Unfortunately, this tree does have problems. A big one appears when you have to get multiple elements between two values: it will cost O(N), because you’ll have to look at each node in the tree and check whether it falls between the two values. This operation is also not disk I/O friendly.

I love computer science concepts, even though I never took a class on them. Though iOS developers don’t necessarily need this material, it is good to have it in the back of our heads. I will continue exploring and learning more about various databases and sorting algorithms.


Swimming around in Relational Databases

Relational Databases

Let’s focus on the way a database handles a SQL query:

First of all, there’s an important concept called time complexity. Time complexity is used to see how long an algorithm will take for a given amount of data. What matters is not the amount of data itself, but the way the number of operations grows as the amount of data increases.

Let’s look at how many operations a few different complexity classes take to process a specific amount of data:

To process about 2000 elements:

O(1)  = 1 operation

O(log(n)) = 7 operations

O(n) = 2 000 operations

O(n*log(n)) = 14 000 operations

O(n^2) = 4 000 000 operations

Even though the difference between O(1) and O(n^2) seems like a lot, you would lose at most around 2 ms – barely the time it takes to blink. Current processors can handle hundreds of millions of operations per second, which is why performance and optimization are not a priority in many IT projects.

It’s important to understand this concept when dealing with a huge number of elements. To process about 1 000 000 elements:

Now the difference between O(1) and O(n^2) is 1 to 1 000 000 000 000. You’d probably have time to get a coffee at this rate. This is where performance optimizations matter.
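If you want to play with the arithmetic yourself, here is a tiny Swift sketch that prints rough operation counts for both data sizes (the logarithm is only meant as an order of magnitude):

    import Foundation

    // Rough operation counts for the complexity classes above.
    func operationCounts(for n: Double) -> [(String, Double)] {
        return [
            ("O(1)", 1),
            ("O(log n)", log(n)),
            ("O(n)", n),
            ("O(n log n)", n * log(n)),
            ("O(n^2)", n * n)
        ]
    }

    for n in [2_000.0, 1_000_000.0] {
        print("n = \(Int(n))")
        for (name, count) in operationCounts(for: n) {
            print("  \(name): roughly \(Int(count)) operations")
        }
    }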

Let’s break down what these different time complexities mean:

O(1) – A search in a good hash table

O(log(n)) – A search in a well-balanced tree

O(n) – A search in an array

O(n*log(n)) – The very best sorting algorithms

O(n^2) – A bad sorting algorithm
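As a quick Swift illustration of the first and third lines, compare looking up a key in a Dictionary (Swift’s hash table) with scanning an Array; the city data is made up:

    // O(1) on average: the key's hash jumps straight to the right bucket.
    let populationByCity = ["London": 8_900_000, "Paris": 2_100_000]
    let londonPopulation = populationByCity["London"]

    // O(n): in the worst case every element has to be checked.
    let cities = Array(populationByCity.keys)
    let hasParis = cities.contains("Paris")

    print(londonPopulation ?? 0, hasParis)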

Now, a time complexity is usually given for one of a few scenarios:

- Average case scenario

- Best case scenario

- Worst case scenario (the one usually considered)

There are also other types of complexities such as memory consumption of an algorithm and disk I/O consumption of an algorithm.

There are, of course, worse complexities than n^2 such as:

n^4 = That sucks

3^n = Even worse

factorial n = You’ll never get the results, even with a low number of elements

n^n = Consider another field of profession

When you need to sort a collection, you call the sort() function. There are many good sorting algorithms; one of them is merge sort.

Let’s say we need to merge two sorted arrays of size N/2 into an N-element sorted array. This operation is called a merge.

array mergeSort(array a)
   if (length(a) == 1)
      return a;
   end if

   // division phase: split the array into two halves
   [left_array, right_array] := split_into_2_equally_sized_arrays(a);

   // recursive calls: sort each half
   array new_left_array := mergeSort(left_array);
   array new_right_array := mergeSort(right_array);

   // merging the 2 small ordered arrays into a big one
   array result := merge(new_left_array, new_right_array);
   return result;
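For a runnable version, here is a small Swift sketch of the same idea, including the merge step that the pseudocode relies on; the function names are my own, not from any library:

    // Merge two already sorted arrays into one sorted array.
    // Cost: roughly N operations for N total elements.
    func merge(_ left: [Int], _ right: [Int]) -> [Int] {
        var result: [Int] = []
        result.reserveCapacity(left.count + right.count)
        var i = 0
        var j = 0
        while i < left.count && j < right.count {
            if left[i] <= right[j] {
                result.append(left[i])
                i += 1
            } else {
                result.append(right[j])
                j += 1
            }
        }
        // One side is exhausted; append whatever remains of the other.
        result.append(contentsOf: left[i...])
        result.append(contentsOf: right[j...])
        return result
    }

    func mergeSort(_ a: [Int]) -> [Int] {
        if a.count <= 1 { return a }                    // a unitary array is already sorted
        let mid = a.count / 2
        let sortedLeft = mergeSort(Array(a[..<mid]))    // division phase
        let sortedRight = mergeSort(Array(a[mid...]))
        return merge(sortedLeft, sortedRight)           // merge phase
    }

    print(mergeSort([8, 3, 5, 1, 9, 2, 7, 4]))          // [1, 2, 3, 4, 5, 7, 8, 9]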

This kind of algorithm is called divide and conquer. The division phase is where the array is divided into smaller arrays. The sorting phase is where the small arrays are put together using the merge to form a bigger array. Let’s go more in depth into this merge sort:

In the division phase, the array is divided into unitary arrays in three steps. The number of steps is log(N) (since N = 8, log(N) = 3). Each step divides the array in two, so the number of steps equals the number of times you can divide the array by two – which is the exact definition of the logarithm (base 2). Notice how everything was broken down into smaller and smaller pieces.

Now, in the sorting phase, you start with the unitary arrays. During each step you apply multiple merges, and the overall cost per step is N = 8 operations: the first step has 4 merges costing 2 operations each, the second step has 2 merges costing 4 operations each, and the third and final step has 1 merge costing 8 operations.

Since there are log(N) steps, the overall cost is N * log(N) operations.

So why is this algorithm so powerful?

  • You can modify it to reduce its memory footprint, so that instead of creating new arrays you modify the input array directly.
  • You can also modify it to use disk space and a small amount of memory at the same time without a huge disk I/O penalty. The idea is to load into memory only the parts that are currently being processed. This matters when you need to sort a multi-gigabyte table with only a 100-megabyte memory buffer. This type of algorithm is called external sorting.
  • You can modify it to run on multiple processes, threads, and servers.
  • This sorting algorithm is used in virtually all databases.

More database related topics coming soon!

Adventures in [UITouch]

UITouch is one of the most basic objects a dev will come across when dealing with touches in iOS.

Phase property on UITouch:

UITouchPhaseBegan

A finger has touched the screen

UITouchPhaseMoved

A finger moved on the screen. This can be helpful to know, because if a user is moving their finger across the screen longer than a simple touch, then we know that it is not just a tap but a gesture.

UITouchPhaseStationary

A finger is touching the screen but hasn’t moved since the previous event. Think of a long press, where the user holds their finger down expecting something to happen – typically in a messaging app where we want to highlight some text or copy and paste something.

UITouchPhaseEnded

A finger was lifted off of the screen

UITouchPhaseCancelled

The touch was interrupted by the system, for example when an alert view appears or the user receives a phone call.

Although these are pretty self-explanatory, they are important because they tell you the state of the touch, and they will consistently come up when using gesture recognizers. You need to know which state the touch is in so you can return things to where they need to be, or move them to where they need to go, once the touch completes.
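In Swift these constants surface as the UITouch.Phase enum. Here is a minimal sketch of a helper that inspects it; the wording of the messages is my own:

    import UIKit

    // Maps each touch phase to a human-readable description.
    func describe(_ touch: UITouch) -> String {
        switch touch.phase {
        case .began:      return "a finger touched the screen"
        case .moved:      return "a finger moved on the screen"
        case .stationary: return "a finger is down but not moving"
        case .ended:      return "a finger was lifted off the screen"
        case .cancelled:  return "the system interrupted the touch"
        default:          return "another phase"
        }
    }

You could call describe(_:) from touchesBegan(_:with:) and its sibling callbacks to watch a touch move through these phases.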

How touch events normally work:

Touch Event:

  1. Hit testing finds the hit-test view.

Let’s say the user touches view B on the phone. Hit testing determines that view B is the hit-test view, so the touch events are delivered to it.

  2. Touches began with event.

touchesBegan(_:with:) is called on that view when the touch starts.

  3. Touches moved with event.

touchesMoved(_:with:) is called as the user moves their finger, and it keeps being called for as long as the finger is moving.

  4. Touches ended with event.

touchesEnded(_:with:) is called when the user lifts their finger.
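As a sketch, that sequence maps onto the UIResponder overrides in a hypothetical view subclass; the class name and log messages are made up:

    import UIKit

    final class ViewB: UIView {
        override func touchesBegan(_ touches: Set<UITouch>, with event: UIEvent?) {
            print("touches began")        // step 2: a finger went down on this view
        }

        override func touchesMoved(_ touches: Set<UITouch>, with event: UIEvent?) {
            print("touches moved")        // step 3: called repeatedly while the finger moves
        }

        override func touchesEnded(_ touches: Set<UITouch>, with event: UIEvent?) {
            print("touches ended")        // step 4: the finger was lifted
        }

        override func touchesCancelled(_ touches: Set<UITouch>, with event: UIEvent?) {
            print("touches cancelled")    // the system (or a gesture recognizer) took over
        }
    }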

Let’s see what happens when the gesture recognizer is attached to that same view.

Gesture Recognizer

  1. Hit testing finds the hit-test view. Gesture state “Possible”

When the user touches the same view B, hit testing still runs, and our gesture recognizer is sitting in the Possible state.

  2. Touches began with event. Gesture state “Possible”

The gesture recognizer sees the event but is still sitting in the Possible state.

  3. Gesture recognized – state “Began”

Now the user moves their finger very slightly. This is where a gesture recognizer differs from a plain touch event: the recognizer enters the Began state, which means it has recognized that the user is performing a particular gesture, in this case a pan gesture.

  4. Touches cancelled with event.

The original touch event sequence is cancelled, and the gesture recognizer takes over the rest of the touch event cycle. The user continues to move their finger...

  5. Gesture recognized – state “Changed”

As the user continues to move their finger, the gesture recognizer stays in the Changed state instead of touchesMoved(_:with:) being called, which is what would normally happen.

  6. Gesture recognized – state “Ended”

When the user completes the action and lifts their finger, the gesture recognizer enters the Ended state.
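Here is a small Swift sketch of reacting to those states with a pan gesture recognizer; the view controller and the log messages are made up for illustration:

    import UIKit

    final class PanViewController: UIViewController {
        override func viewDidLoad() {
            super.viewDidLoad()
            let pan = UIPanGestureRecognizer(target: self, action: #selector(handlePan(_:)))
            view.addGestureRecognizer(pan)
        }

        @objc private func handlePan(_ recognizer: UIPanGestureRecognizer) {
            switch recognizer.state {
            case .possible:
                break                              // still waiting to see if this is really a pan
            case .began:
                print("pan began")                 // the recognizer has taken over from the touch events
            case .changed:
                let translation = recognizer.translation(in: view)
                print("pan moved by \(translation)")
            case .ended:
                print("pan ended")                 // the finger was lifted
            case .cancelled, .failed:
                print("pan cancelled or failed")
            default:
                break
            }
        }
    }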