Why can’t traditional content recommendation engines just get it right?

Viktor Olson

The average OTT or SVOD viewer spends almost as much time on content discovery as on actually watching the content they choose. This holds even when the service uses a recommendation engine. That’s significant time wasted for the user. So, what ratio between searching for and watching movies and series should you aim for? Netflix, which uses one of the most sophisticated recommendation algorithms in the world, has a ratio of 30/70.

So, you want to improve by making your recommendations more sophisticated? First, we need to understand an underlying problem with traditional engines.

To start, we are really good at understanding the users.

We don’t rely on a bunch of demographic metadata to make a good user analysis. When Netflix reached 130 countries, they removed parameters like region and gender. These had very little impact on what people actually chose.

We are emotional beings, and the choices we make are based on the mood we are currently in. Focusing on keywords without listening to how we feel will give a shallow response and, frankly, insult us by not taking our emotions seriously.



At the same time, we are beings of routine. Our constant mood switching follows a strict recipe. It's basically coded in our DNA or, to make a Westworld reference, it's our inner drive. This is something recommendation engines are quite good at today. They focus on user behavior and listen to things that have an impact on our mood, for example the time of day.


The problem is how we understand the content.

Movies and series are not very different from users. Movies, TV shows, music, books, and art are carefully composed to make us feel something specific. Directors use a recipe of many different emotional triggers like audio, tempos, dialogs and colors - all to convey a specific mood. This is what truly defines content.

But for some reason we are happy to rely on keywords when we try to analyze it. When a recommendation engine matches titles on keyword metadata (strictly speaking content-based filtering, although it often gets lumped in with “collaborative filtering”), you will see bundles like the movie 2001: A Space Odyssey together with Aliens, because they share identical keywords like “Space”, “Darkness” and “Sci-Fi”.
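To see why keyword matching produces that bundle, here is a minimal sketch. It assumes a toy catalog with hand-written tags and uses Jaccard overlap as the similarity measure; the function names, the tags and the third title are illustrative assumptions, not taken from any real engine.

```python
def jaccard(a: set, b: set) -> float:
    """Share of keywords two titles have in common (0.0 to 1.0)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Illustrative tags only, not real catalog metadata.
catalog = {
    "2001: A Space Odyssey": {"Space", "Darkness", "Sci-Fi"},
    "Aliens": {"Space", "Darkness", "Sci-Fi"},
    "Notting Hill": {"Romance", "Comedy", "London"},
}


def most_similar(title: str, catalog: dict) -> tuple:
    """Return (other_title, score) for the closest keyword match."""
    scores = [
        (jaccard(catalog[title], tags), other)
        for other, tags in catalog.items()
        if other != title
    ]
    score, other = max(scores)
    return other, score


print(most_similar("Aliens", catalog))
```

With identical tag sets the two space films score a perfect match, even though one is a slow, meditative film and the other an action thriller. The keywords carry no information about how the film feels.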

Why are we dropping text-based metadata when analyzing users, but not when analyzing content? No wonder users spend so much time browsing instead of streaming.


Finding the content's DNA - beyond keywords

Understanding the inner drive of the content is to understand the choices made by the artist who created it. A good example of this is Spotify’s Discover Weekly feature. Instead of just blindly accepting the metadata that comes at ingest, engineers at The Echo Nest let machine learning applications listen to the audio file to find characteristics like time signature, key, instruments, mode, tempo and loudness. All of these are emotional triggers put there by the artist.

In the past 6 months, Viaplay has been using technology similar to Spotify’s, but for video. The application comes from Vionlabs and uses a method that can be described as reverse-engineering movie-making. The computer sits through a digital screening of your video catalog, typically at ingest. It’s instructed to pick up on emotional triggers like tempo, colours, cuts, camera movement and so on. When it's done processing the visual and audio input, the movie holds a unique fingerprint.



Bundles of similar fingerprints hold a recipe for specific moods. Build a database of fingerprints and you will get a machine-learning, movie-loving computer that will be happy to share the perfect recommendation with your users.
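As a rough sketch of how a fingerprint database could drive recommendations, assume each fingerprint is a short vector of numbers scaled from 0 to 1 and that "similar" means a high cosine similarity. The feature order, the values and the title names below are hypothetical, and this is not Vionlabs' actual method.

```python
import math


def cosine(a: list, b: list) -> float:
    """Cosine similarity between two equally long feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Hypothetical fingerprints. Assumed feature order:
# tempo, cut rate, brightness, camera movement.
fingerprints = {
    "title_a": [0.9, 0.8, 0.3, 0.7],
    "title_b": [0.8, 0.9, 0.4, 0.6],
    "title_c": [0.2, 0.1, 0.8, 0.2],
}


def recommend(title: str, fingerprints: dict, k: int = 1) -> list:
    """Return the k titles whose fingerprints are closest to `title`."""
    target = fingerprints[title]
    scored = [
        (cosine(target, fp), other)
        for other, fp in fingerprints.items()
        if other != title
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [other for _, other in scored[:k]]


print(recommend("title_a", fingerprints))
```

Here title_a and title_b share a fast, dark, kinetic profile, so they end up in the same bundle regardless of what their keyword tags say.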

This is way beyond genre tags like “Drama”, “Horror” or “Comedy”.


James Bond is always James Bond… or not. When we look at the development of the James Bond franchise, it becomes painfully obvious that only using keywords is not good enough when doing recommendations.


From Russia With Love (1963) has a long build-up with the big action climax at the end. Slower scenes take place in the middle of the movie, with plenty of dialog and character development. The emotional journey the viewer embarks on is very different in Spectre, the latest in the franchise. It’s modern in its tempo, with an equal number of action scenes at the start, middle and end. It will tickle other senses in the viewer, which are hard to describe in text-based metadata.
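The difference in pacing can be made concrete with a small sketch. It assumes each film has been reduced to a scene-by-scene list where 1 marks an action scene and 0 a slower one. The two timelines below are invented to mirror the shapes described above; they are not measured from the actual films.

```python
def pacing_profile(scenes: list, parts: int = 3) -> list:
    """Fraction of action scenes in each consecutive part of the film."""
    n = len(scenes)
    profile = []
    for i in range(parts):
        chunk = scenes[i * n // parts:(i + 1) * n // parts]
        profile.append(sum(chunk) / len(chunk) if chunk else 0.0)
    return profile


# Hypothetical timelines: 1 = action scene, 0 = dialog or slower scene.
slow_build = [0, 1, 0, 0, 0, 0, 1, 1, 1]   # quiet middle, big finish
even_action = [1, 0, 0, 1, 0, 0, 1, 0, 0]  # action spread evenly

print(pacing_profile(slow_build))
print(pacing_profile(even_action))
```

Both timelines would carry the same franchise keywords, yet their start, middle and end profiles look nothing alike. That is the kind of signal a keyword tag cannot express.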


[Image: video fingerprinting visualization. Photo courtesy of Vionlabs.]

Technology scales creativity; it doesn't take over.

Don't forget that recommendation algorithms or content discovery tools will only bring you data, intelligence, and efficiency. Your editors are still the key creative force - the choice still belongs to the viewer and it’s all about the content.

It’s like a beautiful circle. If your editors have more data and time, it will be easier for them to innovate. When they innovate, your customers will make richer discoveries and have more time to watch. The more they watch, the more we learn about which content to bring into the service.


Meet the people behind Vionlabs.

Last week we met with Arash Pendari, CEO, and Patrick Danckwardt, Business Development, during their visit to Bergen. If you want to learn more about their technology or improve your own business around content discovery, be sure to follow Vimond CloseUp, the industry podcast.

