The Big O

There was a time when developers needed to know the number of operations that executed when their code was being run. Computers were much slower back then, and it was important to optimize software so that no precious memory and processing power were wasted. Nowadays we possess extremely powerful computers capable of processing hundreds of millions of operations per second. As a result, many developers don’t care too much about the number of operations. Up to a certain point. When dealing with massive, massive amounts of data however, optimization becomes critical. Especially if you are trying to squeeze out a couple more milliseconds of performance.

Thus, the Big O notation was born! This mathematical symbol O( function ) was created to help computer scientists understand the amount of operations being executed when a particular function is being run, given a variable input data. Simply put, how long does a function take to process when fed some amount of information. What is important to understand here is that the Big O does not focus on the actual amount of data being operated upon. That’s not the important part. When analyzing time complexity, it is vital to understand the way the number of operations increase when the number of data increases. Utilizing Big O does not give us the exact amount of operations, but it can provide developers with a general idea of performance.

It may not sound so important for smaller amounts of data – and that’s completely true! If you only had a couple thousand pieces of data lying around, optimizations would bring very little difference to the table. Our modern processors are more than capable of handling such tasks, even when badly optimized. However, what if you were to run a badly optimized function on a dataset in the tens of millions, hundreds of millions? At a certain point, the different numbers of operations per data will begin to show their true colors. A constant time operation, O(n), will stay relatively linear in terms of operations while a time complexity of O(n^2) will quickly explode. Conversely, a function that produces a complexity of O(log(n)) will scale exponential better and better as the amount of data grows.

Let’s imagine that we had a dataset containing 2000 elements.

An O(1) time complexity will cost a total of 1 operations.

An O(n) time complexity will cost a total of 2000 operations.

An O(log(n)) time complexity will cost a total of 7 operations.

An O(n * log(n)) time complexity will cost a total of 14,000 operations.

An O(n^2) time complexity will cost a total of 4,000,000 operations.

See the differences in cost? 2000 elements is a negligible amount of data to be concerned about, but the time complexity difference is still apparent. What if we were to use a significantly larger dataset? Let’s say, 100 million? Now we’re starting to get somewhere. Now, optimization becomes extremely important.

So what does this mean when we deal with hundreds of millions of data? In certain industries, such as finance, where high frequency trades are being executed every millisecond, optimizations within time complexity could mean the difference between winning or losing millions of currency. In other industries, such as e-commerce, providing a good user experience means toning down load times between different pages as tons and tons of data is being processed. A long load time will result in a dramatic decrease of users.

This is why it’s still important to understand this concept. When we analyze the Big O for a given function, we are looking for its worst possible scenario. There are a multitude of important questions to be asked. What is the case where processes are being wasted by the millions and billions, and what can we do to lower that amount and make our code faster and more competitive in certain ecosystems.

 

Advertisements

Shell Scripts: Save previous development time!

What is a script?

You can think of it in the context of a play, television show, or movie. The actor takes the script, and executes each and every line. The script tells them what they can and cannot say. In the context of shell scripts, our computer is the one reading and following the directions.

A shell script, in essence, is a plain text file which contains a series of instructions that are then carried out by the operating system. It is also known as a program. These instructions are usually commands we type ourselves, like find, ls, or touch. There are times however, when we need to automate a series of large, tedious commands. Never fear, for there is a large device sitting in front of us that is incredibly good at such mundane tasks. This is where the power of shell scripting comes into play.

Shell scripts are generally created for useful tasks that may need to be used in repetition, but require the same sets of commands each time. Not only that, but we also create shell scripts for commands that would normally not be inputted manually on to the command line. These include if/else/switch, conditionals, variables, and functions.

When you create a plain text file and enter a list of commands, you have created a shell script. The computer will proceed to read the script and execute it sequentially. This can be done via the command line. Let’s create a simple shell script together.

During this tutorial, any parenthesis shown are meant to represent html tags.

Create a Basic Shell Script

In the command line, create a new file in the present working directory and name it anything you’d like.

touch is-pizza-good

Now that we have a new file, let’s populate it with instructions! The first line we need to put tells the current active shell to execute the script inside the Bash environment. Without it, bash would have no clue what to do with our script. Here is the line below.

#!/usr/bin/env bash

Now that we’ve established a working relationship with the Bash environment, we can begin doing whatever we want with the script.

#!/usr/bin/env bash
echo "YES" > the-answer.txt
echo "The answer awaits you inside this directory!"

Remember, anything that you would normally put on the command line can be put into a shell script.

Executing our Shell Script

Before we are able to run our script, we need to give it adequate permission, otherwise you will get an error saying that permission has been denied. To circumvent this, we simply use a single command that tells the operating system our script is safe to run. The chmod command allows you to change the permissions of files and folders, and the +x flag grants our script the permission required to execute successfully.

chmod +x is-pizza-good

Now that we’ve cleared up the permission conflicts, we can proceed to run our script! To run it, simply type:

./is-pizza-good

If the script has run successfully, you will see the commands being executed. This may be harder to detect visually depending on the nature of the script.

"The answer awaits you inside this directory!"

Parameters

As you execute a shell script, you can add additionals arguments to the end of the line. These arguments can be used within your script to deliver dynamic outputs. The example below shows a quick example of how paramters might be used in a basic script.

Here, we have a script called webify-name that takes in an argument and outputs it to a html document in h1 headers. I use parenthesis on the “h1” instead of tags because WordPress freaks out with the formatting.

#!/usr/bin/env bash
#!/usr/bin/env bash
echo "Preparing to webify the name $1"
echo "(h1)$1(/h1)" > name.html
echo "The name $1 has been webified!"

The program will look for the $1 parameter, and use it to execute the script. Let’s try running it!

chmod +x webify-name
./webify-name benjamin

The program sees that the string benjamin is passed in as the $1 parameter position. Now, every single $1 in our script is essentially replaced by the string benjamin. We have made a dynamic program!

As we’ve learned, $1 represents the first paramter position. Below is the format for adding in additional parameters.

./default-shell-script $1 $2 $3...

Using these parameters, we can create cool little commands that carry out useful actions and save lots of time!

#!/usr/bin/env bash
echo extracting $1 into $2/backup2…
mkdir $2/backup2
tar -xzf $1 -C $2/backup2
echo done!

Alright! Let’s take what we’ve learned so far to create a bash script that sets up a basic website folder template.

#!/usr/bin/env bash
#!/usr/bin/env bash
echo Creating a new file structure of a website at $1/src

mkdir $1/src
mkdir $1/src/js
mkdir $1/src/css
mkdir $1/src/images

touch $1/src/js/main.js
touch $1/src/css/style.css
touch $1/src/index.html

echo "(h2)$2(/h2)" >> $1/src/index.html
echo "h2 { color: gray; }" >> $1/src/css/style.css

echo Finished!

Though there is a maximum number of shell script parameters acceptable, it is a very large number, so you will never have to worry about reaching a limit.

The River of Data: A Brief look at Reactive Programming

Re-reactive What!?

You probably just read the words “reactive programming” and immediately checked out. Even though it is a rather complicated topic, you need not worry about it. By the end of this tutorial, you will gain some solid knowledge regarding this topic.

Remember, a healthy dose of struggling is essential for the learning process. So sit back, relax, and enjoy the ride. Let’s begin.

Back to the Beginning

Take a look at the expression below.

a = b + c

If the variables “b” and “c” were both five, then what would “a” equate to? Hopefully, you were able conclude that “a” would equal 10. Now, what would happen if we changed the value of “c” to 6? Mathematically speaking, the correct answer would be 11. We changed one of the variables, and therefore the outcome changes immediately. Makes sense, right?

Funnily enough, there are a few quarks that occur when we bring the example above into the world of programming. Let’s take a look at our little algebraic expression using imperative programming.

var b = 5, c = 5;

var a = b + c

var c = 6

What do you think “a” will be this time? If you answered “10”, you are correct! In imperative programming, the evaluation of “a” would never change after it is run, even if we attempted to change the variables to affect “a”. This is imperative programming at its very essence. In order to change “a”, we would need to run another statement explicitly telling the program to change the variable “a” to another value.

Conversely, in reactive programming, we are creating variable evaluations by changing their dependencies. In the case of “a = b + c”, “a” reacts to the changes happening with “b”, “c”, and the particular type of operation we are doing.

Give Me a Real Life Example!

Fair enough. Open an Excel spreadsheet on your computer. Now, enter a simple formula into one of the cells.

B1 * C1

If you changed the values of B1 and/or C1, all the cells affected by this formula would be changed instantly. They reacted to a shift coming from the numerous potential values that can be inserted into the original formula. Pretty cool right? As you can probably imagine, reactive programming is extremely powerful. For example, it can be very useful when creating responsive user interface and animations.

Are you still confused? Don’t worry. Here’s another example showing off the power of reactive programming.

What if the excel formula was replaced with a function which maps an air pollution graph that changes depending on a user’s selected location? Thanks to reactive programming, we are able to see the graphical interface change in response to user interaction even as the app is running! Something is changing in our graph function which immediately invokes a response.

You’ve made it this far! Give yourself a pat on the back! You should now understand the basic differences between imperative and reactive programming. Go make some tea, and do some pushups, because we’re about to dive into some of the core concepts surrounding reactive programming.

Degrees of Explicitness

Up to this point, we had been using very basic examples showing reactive programming in action. Our “a = b + c” expression is a very explicit form of reactive programming, because we are literally pushing the flow of changes in one direction.

b -> + -> c -> = a

Our excel example had a bit more complexity, since our cell formula dictates the output of every single cell specific to that formula. Here’s an easier way to imagine the evaluation of the excel cells. Imagine each of the cells as little delivery boxes, and our entire spreadsheet as a factory. For each of the delivery boxes in our factory, we are going to call on our formula to do something with that box. We can change the contents of the box, label the box, or even put the box into a bigger box. The possibilities are endless. Here is some pseudo code for what I just described.

factoryStream = [box, box, box, box, box, box]

factoryStream.forEach( formula( box ) ).subscribe() => do something

Our formula is acting upon all the boxes inside the factory, and doing something with each of them. We end up with a single formula pointing to a number of boxes, all with varied outcomes. Pretty cool, right?

But wait! There’s more! You can even become more implicit with the direction of transformations using reactive programming. What if we wanted to find a certain group of boxes, or even a specific box? In reality, we would need to go over each of the boxes, find the desired ones, and put them in a separate container. Then, we would need to take that container and find the specific box. We can see that this particular example is much more implicit than the previous examples.

Static and Dynamic

This concept is quite simple. Static reactive programming conveys data flows that are set up statically without any form of dynamic input during runtime. Our first example “a = b + c” is static, because we are literally assigning values to “b” and “c”.

A dynamic reactive program can change its data flows during run time. If you’ve ever worked in Photoshop, and created custom colors, you will notice that the color preview changes as you drag the cursor around the color wheel. This is a great example of dynamic reactive programming, because the data flows are being updated as the application is running.

Higher-order Reactive Programming

Higher-order reactive programming sounds like some kind of crazy, advanced terminology. Fear not! This concept is simply stating that data flows from one reactive evaluation can be used to determine the data flow for another evaluation. We actually went over a basic example of higher-order reactive programming in our example with the factory and the boxes.

Let’s get a little more specific. For this example, we will work with a big box full of movies. What if we wanted to find the most expensive movie that was made before the year 2000? Think about the actual steps you would take to physically accomplish that task.

We would need to filter out all the movies that were made before 2000 and put them aside. Now that we have a pile of movies made before 2000, we can then compare prices until we find the highest priced movie. Below is the scenario in pseudo code.

movieStream = [movie, movie, movie, movie, movie, movie]

movieStream.filterOut ( filterOptions ( movie ) ).comparePrices ( findHighestPrice() ).subscribe()

This is seriously cool stuff. You can see that the data stream we get from filtering the movies is then used to find the highest priced movie. We can chain various actions together, using a multitude of data streams to achieve the desired results. Does your head hurt yet?

Differentiated Reactive Programming

Computers are fast and furious. They process large amounts of data with ease, almost in an instant. However, there are times when we need to specify the order of evaluation for data streams. Remember higher-order reactive programming? What if a certain data stream was not processed on time, but needed for another reaction to happen? Bug alert.

Here is where data flow differentiation can come in use. We can allow for the evaluation of the certain data streams to occur first, before setting off the evaluation of other data streams.

If this is still confusing, think about that time when you had to follow a recipe. Did you take all the ingredients you got from the supermarket and toss them all into the pan at once? Obviously not. There is a very specific set of instructions that you must follow, one after the other. You might have to chop up the onions before tossing them in the pan, and you might have to do that only after the oil is sizzling in the pan. It’s a data stream full of onions! Perhaps at some point, you are allowed to toss in multiple items, but these actions are still carried out under guidance from a very explicit set of instructions.

Evaluation Models

When data is changed in reactive programming, that change will pulse outwards towards all data obtained partially or completely from that original data. Unfortunately, a naive implementation of reactive programming may be problematic for certain data structures. There are data structures where the time it takes to process each data flow compound exponentially. This is not something we want.

A solution to this problem is the proper utilization of differential reactive programming. By controlling the timing of data flows, we can simply tell the program to evaluate when a certain variable is needed.

Usage

We use this programming paradigm with a host of other paradigms, including imperative, object-orientated, and functional programming. We’ll save functional reactive programming for another time, but we can talk about imperative and object-orientated reactivity.

Imperative programming is used to act upon reactive data structures, creating one-way data flow constraints. Think of a data graphic that changes depending on other streams of data. Once the data is changed, we can use then utilize imperative programming to recreate a new data graphic.

Creating reactivity within object-orientated programming is quite different. Instead of methods, objects possess reactions that re-evaluate when other reactions have changed. It’s quite powerful. Imagine each and every status update from your Facebook used as a data stream to invoke a series of reactions. Perhaps the status that has the most “likes” from your friends will have higher priority on your newsfeed.

Confused? Let’s explore the Facebook example in depth using a popular Javascript reactive programming library called RxJs.

The Facebook Feed

Above is a non-reactive model of the Facebook feed. How do we get data from this feed? We need to loop through the array of comments until we reach the end, and then utilize the information that we get from the traversal. Is there a better way to keep track of new comments in the feed?

Now check out our model as a stream using an Observable. As it receives streams of information, it can perform individual transformations on the streams.

In a reactive model, the data source contains all the concepts and behaviors that it needs to determine when it has new data, when an error happens, or when it completes.

An Observable is simply RxJs’s way of representing a reactive data source. A reactive data source produces data over a period of time, and at some point will either error out, complete, or never complete until the process is terminated. Whenever an Observable gets data, it produces a result.

You can also attach many pipelines of transformation, called subscriptions, to a particular Observable. For example, you could pipe the new data into a particular format and then display it on a web page.

Once subscribed to an Observable, all subscribers will receive an update when the Observable receives new data.

Let’s tie in this new knowledge to better understand the two models of the Facebook feed above.

In the first example, we are pulling data from the model by iterating through it, getting the information, and then doing something with that information. In the second example, the data is being pushed to us without any further instructions. We then subscribe to that data, receive new streams of updates as they come, and then do something with that update.

The second reactive model represents a uni-directional data flow, because our data source become a kind of entry point for our application and our business logic depends on the data source as well. Notice how we don’t need to do anything until the data changes. Hence – reactive programming!

RxJs in Action

We’ve been talking very theoretical about this library, so how it actually works when writing the code.

observeTweets()
  .filter(s => s.message.contains("Greetings"))
  .flatMap(s => getUserAvatar(s.user))
  .map(a => $('').attr('src', a.url))
  .subscribe($avatar => { $avatars.append($avatar) }

In the example above, we are getting an Observable that produces new values every time a tweet comes in. We filter out the tweets that do not contain the word “Greetings” in them. The next step is particularly interesting. The flatMap method takes a function that returns a promise or an Observable, and merges it into the stream. After we get the information from our getUserAvatar function, we map our return value to a new image element. Finally, we subscribe to the entire pipeline, and the result of that pipeline is an $avatar object.

When using reactive programming, you need to think in three important steps. What data do we want, what transformations we want to apply, and what we want to do with the final result at the end of all the transformations.

Congratulations

You’ve made it to the end! Hopefully, you now have a grasp on some of the basic concepts of reactive programming! They say that the journey of a thousand miles begins with a single step. Congratulations on taking the plunge into this new programming paradigm. There are many more articles, tutorials, and videos for you to scour and devour. Maybe you’ll even go on to build a reactive app!