RECOMMENDATION SYSTEMS AND THE COLLABORATIVE FILTERING ALGORITHM

Reading Time: 8 minutes

WHY DO WE NEED RECOMMENDATION SYSTEMS?

Keeping pace with technology, which is growing rapidly nowadays, is a huge challenge. Software systems are creating a dynamic world that undoubtedly makes everyday life easier and pushes it ever further into the digital realm.

Many mobile and web systems make it easy to use and search the internet. They have become an essential part of education, health, employment, trade and, of course, entertainment. In such a fast and dynamic life, we increasingly need systems that can quickly recommend relevant information when we need it, all in order to save us time. Recommendations are usually generated with collaborative filtering algorithms or content-based methods.

RECOMMENDATION SYSTEMS IN REAL LIFE

In real life, people are overwhelmed by the number of decisions they have to make, whether minor or major. Understanding human choices is a field studied by cognitive psychology.

One of the most important factors influencing the decisions an individual makes is past experience: the decisions a person made in the past affect the ones they will make in the future.

Human actions also depend on the experiences gained in interactions with other people. The opinions of others affect our lives without us even being aware of it. Relationships with friends influence which neighbourhood we live in, which places we visit on vacation, in which bar we have a drink, and so on.

A real life recommendation system

If one person has had positive experiences with another, that other person gains trust and authority in their eyes, and they become more likely to follow their advice and to make the same decisions that person made when they were in a similar situation.

RECOMMENDATION SYSTEMS IN THE DIGITAL WORLD

All large companies and complex systems use collaborative filtering; one example is the social network Facebook with its “People you may know” feature. Facebook is a hugely complex system with a massive database, which is why it needs to optimize its user data set in order to provide precise recommendations. It also uses collaborative systems for the news feed, as well as for the games, fun pages, groups, and events sections.

Another well-known technology and media service provider that uses such collaborative systems is Netflix, with its “Because you watched” feature. Netflix uses algorithms and machine learning, most likely based on genres, watch history, ratings, and the number of ratings from users whose taste in content is similar to ours.

Amazon, the multinational technology company, also uses these algorithms to recommend products to its customers. It relies on an item-to-item collaborative filtering approach.

Last but not least is LinkedIn, the most successful business social network, which uses phrases such as “People in the Information Technology & Services industry you may know”, “People you may know from Faculty XXX”, “Trending pages in your network”, “Online events for you”, and a number of others.

I did some research on the collaborative filtering algorithm, so I will explain in detail how this algorithm works; please read the analysis in the sections below.

RECOMMENDATION SYSTEM AND COLLABORATIVE FILTERING

Based on the selected data-processing algorithm, recommendation systems use different techniques:

  • Content-based systems – recommend items whose characteristics are similar to the items the user has already liked
  • Collaborative filtering – “people who liked this also liked that”, based on analyzing a huge amount of information about users and their behaviour
  • Hybrid recommendation systems – a combination of both approaches

COLLABORATIVE FILTERING – DETAILED ANALYSIS

On a coordinate system, we can show the popularity of products, as well as the number of orders.

The X-axis represents the product curve, which shows the popularity of a variety of products: the most popular products are on the left, at the head of the tail, and the less popular ones are on the right. By popularity, I mean how many times a product has been ordered and viewed by others.

The Y-axis represents the number of orders and product views over a certain time interval.

By analyzing the curve, it is noticeable that the frequently ordered products are usually considered the most popular, while those that have not been ordered recently drop off. This is the behaviour that the collaborative filtering algorithm takes advantage of.

A similarity measure expresses how similar two data objects are to each other. In a dataset, similarity is usually described as a distance over dimensions that represent the characteristics of the objects being compared. If the distance is small, the degree of similarity is large, and vice versa. Similarity is quite subjective and highly dependent on the domain of the system.

Similarity values lie in the range [0, 1].

The two boundary cases are:

  • Similarity = 1 if X = Y
  • Similarity = 0 if X != Y

Collaborative filtering computes the similarity of the data we have with the help of several measures, such as cosine similarity, Euclidean distance, Manhattan distance, etc.
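
For illustration, here is a minimal JavaScript sketch of two of these measures. The function names and the sample rating vectors are my own assumptions for the example, not part of the original analysis:

// Two rating vectors of equal length, e.g. the ratings two users gave to the same products.
function cosineSimilarity(a, b) {
    var dot = 0, normA = 0, normB = 0;
    for (var i = 0; i < a.length; i++) {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    // 1 means the vectors point in the same direction (very similar users),
    // values close to 0 mean they have little in common.
    return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

function euclideanDistance(a, b) {
    var sum = 0;
    for (var i = 0; i < a.length; i++) {
        sum += (a[i] - b[i]) * (a[i] - b[i]);
    }
    // A small distance means a large degree of similarity.
    return Math.sqrt(sum);
}

console.log(cosineSimilarity([8, 7, 2], [9, 6, 3])); // ≈ 0.99
console.log(euclideanDistance([8, 7, 2], [9, 6, 3])); // ≈ 1.73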

COLLABORATIVE FILTERING – COSINE SIMILARITY

To begin with, we need a database of items and their characteristics.

For the cosine similarity implementation, we need a user–product matrix built from the user database. In this matrix, vector A represents the users and vector B the products, so the matrix has the format A×B. Each field of the matrix holds the rating that user Ai gave to product Bj.

Therefore, we can imagine that we have users {1, …, n} and ratings of the products in the range {1, …, 10}. Every row represents a different user and every column represents one product, and every field of the matrix contains the rating that the user has given to that product. With this matrix generated, we can use the formula for finding the similarity between users:

STEP 1:

Similarity(UserN, User1) = cos(UserN, User1) = (UserN · User1) / (‖UserN‖ × ‖User1‖)

that is, the dot product of the two users’ rating vectors divided by the product of their vector lengths. We compute this value between User N and every other user in the matrix.

STEP 2:

From step 1, we can see that User N is most similar to User 2. However, some product ratings are missing from the data, so we need to compute a priority for the products that User N has not rated yet. For that, we take the users most similar to User N, which are User 2 and User 4, and use the following formula:

Priority (product) = User2 (value*similarity) + User4 (value*similarity).

Example:

Priority(product3) = 8 * 0.66 = 5.28

Priority(product4) = 8 * 0.71 = 5.68

Priority(product5) = 7 * 0.71 + 8 * 0.66 = 10.25

STEP 3:

If we want to recommend two products to User N, these will be product5 and product4.
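
To tie the three steps together, here is a small end-to-end sketch in JavaScript. The rating matrix, the variable names, and the resulting numbers are made up for illustration and are not the exact figures from the example above, but the procedure follows the same steps: compute the similarity of User N to every other user, weight the ratings of the two most similar users for the products User N has not rated, and recommend the products with the highest priority.

// Illustrative data only: rows are users, columns are products, 0 = not rated yet.
var ratings = [
    [7, 6, 7, 4, 5],  // User 1
    [6, 7, 8, 0, 8],  // User 2
    [5, 6, 0, 0, 4],  // User 3
    [7, 4, 0, 8, 7]   // User 4
];
var userN = [8, 7, 0, 0, 0];  // the user we want recommendations for

function cosineSimilarity(a, b) {
    var dot = 0, normA = 0, normB = 0;
    for (var i = 0; i < a.length; i++) {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Step 1: similarity of User N to every other user.
var similarities = ratings.map(function (row) {
    return cosineSimilarity(userN, row);
});

// Step 2: keep the two users most similar to User N and use their ratings
// to compute a priority for every product User N has not rated yet.
var topUsers = similarities
    .map(function (sim, idx) { return { idx: idx, sim: sim }; })
    .sort(function (a, b) { return b.sim - a.sim; })
    .slice(0, 2);

var priorities = {};
for (var p = 0; p < userN.length; p++) {
    if (userN[p] !== 0) { continue; }  // User N already rated this product
    var priority = 0;
    topUsers.forEach(function (user) {
        var rating = ratings[user.idx][p];
        if (rating !== 0) {
            priority += rating * user.sim;  // Priority(product) = sum of rating × similarity
        }
    });
    priorities['product' + (p + 1)] = priority;
}

// Step 3: recommend the two products with the highest priority.
var recommended = Object.keys(priorities)
    .sort(function (a, b) { return priorities[b] - priorities[a]; })
    .slice(0, 2);

console.log(recommended);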

CONCLUSION:

Similarity measures have their advantages and disadvantages depending on the data set they are applied to. From the above analysis, we can conclude that if the data is sparsely distributed and contains zero values, we use cosine similarity, which works with the non-zero values. Otherwise, if the data is densely distributed, consists of non-zero values, and we care about the diversity of users/products rather than only their similarity, we use the Euclidean distance. Such systems are under constant pressure from the large volumes of data in their databases and will face even greater challenges as that volume grows daily. Therefore, there is a growing need for new technologies that will dramatically improve the scalability of recommendation systems.

QUESTION: WHAT WILL HAPPEN IN THE FUTURE?
ANSWER: ONLY TIME WILL TELL.

JavaScript loop and object iteration (optimization)

Reading Time: 3 minutes

Loops have become part of our daily life as developers; we use them at least once a day. Because of that, one day I decided to dig deeper into JavaScript loops, where I found some very interesting things, and I would feel guilty if I did not share them with you.

Before you continue reading, I strongly recommend reading my previous blog post, which I believe will help you build a full picture of loops. So, go on and read it.

Object properties iteration

Let’s first analyze object iteration and suppose that we have an object, something like:

var obj = {
    property1: 1,
    property2: 2,
    …
}

The first thing that comes to mind is to iterate over its properties with the standard for…in loop:

for (var prop in obj) {
    console.log(prop);
}

In this case, we are going to iterate through the object’s properties, but is this the correct way? The answer is yes and no, depending on your needs. A variation is to exclude all inherited properties, which in some cases we do not need; we can exclude them by using the JavaScript method hasOwnProperty(). You can find an explanation of the in operator and hasOwnProperty() in my previous blog post.
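
For example, a minimal sketch of that check, reusing the hypothetical obj from above:

for (var prop in obj) {
    // Keep only the object's own properties; inherited ones are skipped.
    if (obj.hasOwnProperty(prop)) {
        console.log(prop);
    }
}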

Now that we have learned a bit about object iteration and its possible improvements, the question is: can we really do an optimization?
The answer is yes. Before I show you how, let’s spend some time on the loops themselves.

Loop iteration

To continue the previous example, I will keep explaining loops using object iteration (of course, you can test it with an array of integers, like in the speed test examples below, or anything you want).

To accomplish that, we will need the JavaScript method Object.keys().

  • The Object.keys() method returns an array of the given object’s own enumerable property names; inherited properties are not included

Let’s write the standard for loop:

var keys = Object.keys(obj);
for (var i = 0, length = keys.length; i < length; i++) {
    console.log(obj[keys[i]]);
}

Now we have a solution where we cache `keys.length` in a variable, so it is evaluated only once instead of on every iteration (O(1) evaluations in total instead of O(N)), which is a nice time saving when iterating over big arrays.
During development, if you are not otherwise limited (by agreed best practices, …), you can add another optimization by using a while loop.

var i = keys.length;
while (i--) {
    console.log(obj[keys[i]]);
}

In this case, we do not need a separate comparison against the length or a counter increment: the condition itself decrements `i`, and the loop automatically stops once `i` reaches 0 (after the final check, `i` ends up at -1).

Speed testing:

Since modern browsers like Chrome are very fast and well optimized, in order to see the difference clearly I would suggest executing the loops in IE, where you will be able to see a real speed difference between them.

var arr = new Array(10000);

Example speed test 1:

console.time();
for (var i = 0; i < arr.length; i++) {
    // operations...
    var sum = i * i;
}
console.timeEnd();

Execution 1: 4.4ms
Execution 2: 5.5ms
Execution 3: 5ms
Execution 4: 4.6ms
Execution 5: 5ms

Example speed test 2:

console.time();
var i = arr.length;
while (i--) {
    // operations...
    var sum = i * i;
}
console.timeEnd();

Execution 1: 3.7ms
Execution 2: 4.8ms
Execution 3: 3.9ms
Execution 4: 3.8ms
Execution 5: 4.2ms

Thank you for reading and I would appreciate any feedback.

Making Swift networking code more readable

Reading Time: 3 minutes

With Swift 5 a new type got introduced:

@frozen public enum Result<Success, Failure> where Failure : Error {

    /// A success, storing a `Success` value.
    case success(Success)

    /// A failure, storing a `Failure` value.
    case failure(Failure)
}

The Result type is an enum consisting of two cases: success and failure. Each of them can hold a generic value. The failure case, however, is limited to types conforming to the Error protocol.

Not a big deal? Sure, but it’s the little things which add up and make a difference in the long run.

Lately, I was migrating from SwiftyJSON to native JSON parsing. Each network call was implemented in the following way:

func fetchSomething(completion: @escaping (SomeReturnValue?, SomeError?) -> Void) {
    NetworkingTool.request { (response) in
        guard response.isValid
            else { completion(nil, .somethingBad); return }
        do {
            let returnValue = try SomeReturnValue(response: response)
            completion(returnValue, nil)
        } catch {
            completion(nil, .scarry)
        }
    }
}

Looks okayish. Good. So let’s use it:

fetchSomething { (result, error) in
    guard error == nil
        else { handleError(error: error); return }
    doSomething(result: result)
}

Ok. But how to implement the doSomething? With an optional? This can’t be right, right? Force unwrap the result? And what about the error case? Force unwrap it? Oh and wait, what about the case where neither a result nor an error is returned? Is this even a thing? Ok, let me look up the implementation…

So a tiny bit of ambiguity paired with different people working on different parts of the network stack for different features can cause a real heterogeneous system. (Which does not imply that this is a bad system!)

If the company you’re working for is in favour of code ownership, you may not encounter this one. But so far, none of the companies I have worked for practised code ownership. It’s usually your code is my code is our code, comrade. Period. There are simply too many trucks outside.

As long as code ownership isn’t a thing, and you do not want to spend time on endless syntax and architecture discussions with little benefit, or to enforce a (new) best practice on all of your colleagues (again), it comes in really handy to have a built-in Result type which is reasonably unambiguous.

And since we all know that we’re spending more time reading code than writing, this saves us all valuable time.

New to programming? 5 things you should pay more attention to

Reading Time: 5 minutes

You decided to start learning programming. You have started to learn programming concepts, you have decided which language you want to learn, and everything looks great.

Except it isn’t.

It’s frustrating; it’s boring; it’s painful. I am not here to make your life easy, but I hope that I will make it a little easier. Here are the 5 things that I believe will help you to become a better programmer.

Find the right source to learn from

I had a professor who said:

“It’s better to spend more time researching where to learn from, than actually learning from one source.”

And this is gold.

Let’s say that you have found a great book or a great video course that everyone loves. You think that you will love it too, that you will understand every word you read or hear, and that after you finish it you will have mastered everything in it (at least, that is what I thought).

And maybe you will, but you probably won’t. There will be things, many or at least some, that you won’t understand, and that’s natural. You will read or watch them again and again, but they won’t get any clearer.

My advice is to find a great book or course and start learning from it, but use it more as a reference than as your only learning source.

I am not suggesting that you only skim through the content. Try to understand each concept, but also research it (on Google). Look for more resources, more explanations, more examples. Once you understand the concept, save the source that helped you the most (bookmark the page) and search for examples that you can solve.

This way it is easier to learn, because you combine the explanations from different sources and stick with the simplest one that works for you. Also, researching is more interesting than reading or listening to the same thing over and over again.

Understand the base (minimum) necessary logic rather than implementation

This is important for a few reasons:

  • First, if you understand the logic, it will be easier to learn the implementation
  • Second, the implementation may change, but the base necessary logic won’t

At the very beginning, it will be difficult to differentiate between logic and implementation, and maybe you should try to learn and remember everything, but later on, try to understand and study just the minimum necessary things.

I still google some basic things. But because I know what I have to do, I know exactly what to search for (only the implementation/syntax).

With this approach, you will spend your time wisely, and you will be able to learn more important things.

Learning your first programming language is very hard, but that’s because you are also learning programming concepts (the logic). After that, you can pick up any language (the implementation) you want in a matter of weeks.

Code, code, code…

Learning programming is like learning how to drive, except it’s safer (at least, physically). You can read, you can learn, but when you sit down and start to drive, you’ll realize that you haven’t learned anything.

That’s why you should focus on coding. When you study something, try to learn the minimum so you can start to code, and then code as you learn. There is a great answer on Quora that mentions three rules you should follow when coding.

  • Write at least one line of code per day
  • First, write code, then refactor
  • No distractions when coding

Here you can check the answer, which gives the reasoning behind these rules. Maybe you can skip the second rule, but the other two are very important.

Attitude

I have to mention attitude. It is a hard path, especially at the beginning, so the right attitude is required: hard work, believing in yourself, and learning to say YES to everything. More precisely, you say NO only when you are 100% sure that something isn’t possible. In every other case you say YES, and you investigate, try different approaches, ask for help if necessary, and do everything you can. A time will come when you will need to learn to say NO, but first you have to learn to say YES.

Rest

Of course, don’t forget to rest; you need to recover from the hard work you have done. Most of the stupid things I have done happened when I was too tired. When you are tired, you don’t think rationally; you just want to finish your task, no matter what, and that’s when the biggest mistakes happen. You won’t learn anything, you won’t do anything well, you will just waste your time and nerves.