Insights into Collaborative Filtering

Collaborative filtering is one of the most widely used recommender system techniques; it generates recommendations from the preferences of similar neighbors. This article discusses different collaborative filtering methods and recent developments in the field.

Image credit: batjaket/Shutterstock

Introduction to Collaborative Filtering

Collaborative filtering compares users' purchases, activities, preferences, and ratings and uses this data to make recommendations. Customers typically prefer products that people with similar tastes have liked or rated highly.

The technique predicts a target user's interest by evaluating the interests of the top-n most similar users, on the assumption that when two people's choices match for certain products, their choices are likely to match for other products as well.

Specifically, collaborative filtering assumes that similar users are likely to give similar ratings and that similar items are likely to receive similar ratings. Collaborative algorithms exploit this assumption by using similarity values to predict user preferences.
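
To make the notion of a similarity value concrete, the following minimal sketch computes the Pearson correlation between two users over the items both have rated. The toy ratings, the user names, and the choice of Pearson correlation (one of several common similarity measures) are assumptions made for illustration, not details taken from a specific system.

```python
from math import sqrt

# Hypothetical toy data: user -> {item: rating on a 1-5 scale}
ratings = {
    "alice": {"A": 5, "B": 3, "C": 4},
    "bob":   {"A": 4, "B": 2, "C": 3, "D": 5},
    "carol": {"A": 1, "B": 5, "D": 2},
}

def pearson(u, v, ratings):
    """Pearson correlation of two users over the items both have rated."""
    common = sorted(set(ratings[u]) & set(ratings[v]))
    if len(common) < 2:
        return 0.0  # too little overlap to say anything
    ru = [ratings[u][i] for i in common]
    rv = [ratings[v][i] for i in common]
    mean_u = sum(ru) / len(ru)
    mean_v = sum(rv) / len(rv)
    num = sum((a - mean_u) * (b - mean_v) for a, b in zip(ru, rv))
    den = sqrt(sum((a - mean_u) ** 2 for a in ru)) * sqrt(
        sum((b - mean_v) ** 2 for b in rv)
    )
    return num / den if den else 0.0

print(pearson("alice", "bob", ratings))    # close to +1: very similar tastes
print(pearson("alice", "carol", ratings))  # close to -1: opposite tastes
```

In this sketch the means are taken over the co-rated items only; other variants use each user's overall mean instead.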

Similarity enables recommender engines to identify user purchase patterns and to measure how closely one user's rating pattern matches those of other users. In memory-based collaborative filtering algorithms, all rating information is held in memory while the top-n recommendation list is built. Similar items or users contribute most to the prediction phase of collaborative filtering-based recommender systems, so an inaccurate similarity value directly degrades the top-n list of recommended items.

Collaborative filtering utilizes two approaches for considering similarity: the item similarity-based approach (ISBA) and the user similarity-based approach (USBA). USBA predicts the rating based on rating information obtained from similar users. ISBA leverages the same concept as USBA except for using item similarity in place of user similarity.

Collaborative filtering methods fall into three key categories: memory-based, model-based, and hybrid. Hybrid methods combine memory-based and model-based methods, aiming to exploit the advantages of both while limiting their disadvantages.

Collaborative filtering is used in many fields, including social networking sites, e-learning, e-commerce, travel and tourism, customer relationship management, and the marketing of television programs, music, movies, and books.

Baseline Predictors

Baseline prediction methods are used to normalize and pre-process data for use with more sophisticated algorithms and to establish non-personalized baselines against which the personalized algorithms are compared. Baseline algorithms that do not rely on the user’s ratings can also provide predictions for new users.

The simplest baseline predicts the average of all ratings in the system. It can be improved by combining the user's mean rating with the average deviation from user mean ratings observed for a specific item. Adding damping terms regularizes the baseline further and gives a more reasonable estimate of user and item preferences under sparse sampling: when a user or item has few ratings, the baseline prediction stays close to the global mean.
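
The damped baseline can be sketched as follows. This is a minimal illustration that assumes the common form "global mean + user bias + item bias"; the function names, the toy data, and the default damping value of 5 are hypothetical choices rather than values from the article's sources.

```python
from collections import defaultdict

# Hypothetical toy ratings as (user, item, rating) triples.
triples = [
    ("u1", "i1", 5), ("u1", "i2", 3),
    ("u2", "i1", 4), ("u2", "i2", 2), ("u2", "i3", 1),
    ("u3", "i1", 3),
]

def fit_baseline(triples, damping=5.0):
    """Return (mu, user_bias, item_bias) for the predictor mu + b_u + b_i."""
    mu = sum(r for _, _, r in triples) / len(triples)

    # User bias: damped average deviation of the user's ratings from mu.
    by_user = defaultdict(list)
    for u, _, r in triples:
        by_user[u].append(r - mu)
    b_user = {u: sum(d) / (len(d) + damping) for u, d in by_user.items()}

    # Item bias: damped average of what is left after mu and the user bias.
    by_item = defaultdict(list)
    for u, i, r in triples:
        by_item[i].append(r - mu - b_user[u])
    b_item = {i: sum(d) / (len(d) + damping) for i, d in by_item.items()}
    return mu, b_user, b_item

def predict_baseline(model, user, item):
    """Unknown users or items contribute a bias of zero."""
    mu, b_user, b_item = model
    return mu + b_user.get(user, 0.0) + b_item.get(item, 0.0)

model = fit_baseline(triples)
print(predict_baseline(model, "u1", "i3"))
print(predict_baseline(model, "new_user", "new_item"))  # falls back to mu
```

Because the damping term sits in the denominator, a user or item with only a handful of ratings gets a bias near zero, which is what pulls the prediction toward the global mean.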

Further baseline terms can be added to account for other effects. In addition to simple ANOVA-style functions, baseline predictors can also be learned as more general parameters with gradient descent or other parameter estimation techniques as part of learning a larger model.

Baseline predictors effectively capture the effects of item popularity and user bias and can also be applied to increasingly important factors such as time. When the baseline is subtracted from the rating matrix to obtain a normalized rating matrix, collaborative filtering can focus on capturing the interaction effects between items and users.
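
A minimal sketch of this normalization step is shown below. For brevity it assumes a deliberately simple baseline (each user's own mean rating); any other baseline function with the same signature could be substituted. All names and data are hypothetical.

```python
# Hypothetical toy data: user -> {item: rating}
ratings = {
    "u1": {"i1": 5, "i2": 3},
    "u2": {"i1": 4, "i2": 2, "i3": 3},
}

def user_mean_baseline(ratings):
    """A deliberately simple baseline: each user's own mean rating."""
    means = {u: sum(r.values()) / len(r) for u, r in ratings.items()}
    return lambda user, item: means[user]

def normalize(ratings, baseline):
    """Subtract the baseline from every observed rating, leaving residuals."""
    return {
        u: {i: r - baseline(u, i) for i, r in items.items()}
        for u, items in ratings.items()
    }

def denormalize(user, item, residual, baseline):
    """Add the baseline back to turn a predicted residual into a rating."""
    return baseline(user, item) + residual

baseline = user_mean_baseline(ratings)
residuals = normalize(ratings, baseline)
print(residuals)
```

A collaborative filtering algorithm would then be trained on `residuals`, and its predictions passed through `denormalize` to return to the original rating scale.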

User–User Collaborative Filtering

This technique is also called k-NN collaborative filtering and was the earliest automated collaborative filtering method. User-user collaborative filtering is a simple algorithmic interpretation of the core collaborative filtering principle: find other users whose past rating behavior is similar to that of the current user and use their ratings on other items to predict the current user's preferences.

For instance, to predict a person's preference for an item they have not rated, user-user collaborative filtering finds other users who show high agreement with that person on the items both have rated.

These users’ ratings for the item are then weighted by their level of agreement with the person’s ratings to predict the person’s preference. Weighted averaging is the most common mechanism because it is effective, simple, and consistent with social choice theory, but other mechanisms are also used to compute predictions.
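
The weighted-average prediction can be sketched as follows. This minimal example assumes Pearson correlation as the agreement measure, mean-centered ratings, and a neighborhood restricted to the k most similar users with positive similarity; these are common choices but still assumptions, and the data and names are hypothetical.

```python
from math import sqrt

# Hypothetical toy data: user -> {item: rating}
ratings = {
    "ann":  {"A": 5, "B": 3, "C": 4},
    "ben":  {"A": 4, "B": 2, "C": 3, "D": 4},
    "cara": {"A": 2, "B": 5, "C": 1, "D": 1},
}

def mean(user):
    values = ratings[user].values()
    return sum(values) / len(values)

def similarity(u, v):
    """Pearson correlation over the items both users have rated."""
    common = sorted(set(ratings[u]) & set(ratings[v]))
    if len(common) < 2:
        return 0.0
    ru = [ratings[u][i] for i in common]
    rv = [ratings[v][i] for i in common]
    mu, mv = sum(ru) / len(ru), sum(rv) / len(rv)
    num = sum((a - mu) * (b - mv) for a, b in zip(ru, rv))
    den = sqrt(sum((a - mu) ** 2 for a in ru) * sum((b - mv) ** 2 for b in rv))
    return num / den if den else 0.0

def predict(user, item, k=2):
    """Similarity-weighted average of the neighbors' mean-centered ratings."""
    candidates = [
        (similarity(user, v), v)
        for v in ratings
        if v != user and item in ratings[v]
    ]
    # Keep the k most similar neighbors that agree with the user at all.
    neighbors = sorted((c for c in candidates if c[0] > 0), reverse=True)[:k]
    if not neighbors:
        return mean(user)  # no usable neighbor: fall back to the user's mean
    num = sum(s * (ratings[v][item] - mean(v)) for s, v in neighbors)
    den = sum(abs(s) for s, _ in neighbors)
    return mean(user) + num / den

print(predict("ann", "D"))
```

Here only "ben" agrees with "ann", so the prediction is ann's mean rating shifted by how far ben rated item D above his own mean.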

For instance, the Ringo music recommender used no weighting and instead took an unweighted average over the ratings of neighboring users. Similarly, the BellCore video recommender used a multivariate regression over the users in the neighborhood to generate predictions.

Item–Item Collaborative Filtering

Although user-user collaborative filtering is effective, the algorithm runs into scalability problems as the user base grows. More scalable algorithms therefore became necessary to extend collaborative filtering to large user bases and enable its deployment on e-commerce sites.

Item–item collaborative filtering, also called item-based collaborative filtering, addresses this issue effectively, which has made it one of the most commonly deployed collaborative filtering techniques. Instead of using similarities between users’ rating behavior to predict preferences, it uses similarities between the rating patterns of items.

For instance, when two items tend to be liked and disliked by the same users, they are considered similar, and users are expected to show similar preferences for similar items.

Thus, the overall structure of this method is similar to content-based approaches to personalization and recommendation, but the item similarity is inferred from the user preference patterns in place of being extracted from item data. Specifically, item-item collaborative filtering generates predictions using the user’s ratings for other items combined with the similarities of those items to the target item.
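
The item-item prediction step can be sketched as follows. This minimal example assumes cosine similarity between item rating vectors (unrated entries treated as zero) and a similarity-weighted average over the k most similar items the user has rated; both are common but assumed choices, and the data and names are hypothetical.

```python
from math import sqrt

# Hypothetical toy data: user -> {item: rating}
ratings = {
    "ann":  {"A": 5, "B": 4, "C": 1},
    "ben":  {"A": 4, "B": 5, "C": 2, "D": 5},
    "cara": {"A": 1, "B": 2, "C": 5, "D": 1},
    "dev":  {"A": 5, "C": 1, "D": 4},
}

def item_vector(item):
    """Ratings the item received, keyed by user."""
    return {u: r[item] for u, r in ratings.items() if item in r}

def cosine(i, j):
    """Cosine similarity between the rating patterns of two items."""
    vi, vj = item_vector(i), item_vector(j)
    dot = sum(vi[u] * vj[u] for u in vi.keys() & vj.keys())
    norm = sqrt(sum(x * x for x in vi.values())) * sqrt(
        sum(x * x for x in vj.values())
    )
    return dot / norm if norm else 0.0

def predict(user, target, k=2):
    """Weight the user's own ratings by each item's similarity to the target."""
    scored = sorted(
        ((cosine(target, j), j) for j in ratings[user] if j != target),
        reverse=True,
    )[:k]
    den = sum(abs(s) for s, _ in scored)
    if den == 0:
        return None  # no similar item rated by this user
    return sum(s * ratings[user][j] for s, j in scored) / den

print(predict("ann", "D"))
```

Because item similarities depend only on the rating matrix and not on the active user, they can be computed ahead of time, which is one reason this approach scales better than searching for user neighbors at request time.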

Probabilistic Methods

Many fully probabilistic formulations of collaborative filtering have gained significant attention. These methods aim to build probabilistic models of user behavior and use those models to predict future behavior.

Personality diagnosis is a probabilistic user model where it is assumed that a user’s ratings represent a combination of the user’s preference and Gaussian noise. Using the resulting distribution, recommendation and prediction are performed by computing the expected value of the user’s rating.
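
The idea behind personality diagnosis can be loosely sketched as follows. The example assumes a fixed noise standard deviation, a uniform prior over existing users as candidate "personalities", and a discrete 1 to 5 rating scale; these parameters, the data, and the function names are illustrative assumptions, not a faithful reproduction of the published model.

```python
from math import exp

# Hypothetical toy data: existing users and their ratings.
ratings = {
    "u1": {"A": 5, "B": 1, "C": 5},
    "u2": {"A": 1, "B": 5, "C": 1},
}
active = {"A": 5, "B": 2}  # known ratings of the target user
SCALE = [1, 2, 3, 4, 5]    # assumed discrete rating scale
SIGMA = 1.0                # assumed standard deviation of the Gaussian noise

def gauss(x, y, sigma=SIGMA):
    """Unnormalized Gaussian likelihood of observing x if the true rating is y."""
    return exp(-((x - y) ** 2) / (2 * sigma ** 2))

def rating_distribution(active, item):
    """Probability of each possible rating by the active user for the item."""
    scores = {x: 0.0 for x in SCALE}
    for other in ratings.values():
        if item not in other:
            continue
        # Likelihood that the active user shares this user's preferences.
        like = 1.0
        for j, r in active.items():
            if j in other:
                like *= gauss(r, other[j])
        for x in SCALE:
            scores[x] += like * gauss(x, other[item])
    total = sum(scores.values())
    if total == 0:
        return {x: 1 / len(SCALE) for x in SCALE}  # nothing known: uniform
    return {x: s / total for x, s in scores.items()}

def expected_rating(active, item):
    """Expected value of the rating under the resulting distribution."""
    return sum(x * p for x, p in rating_distribution(active, item).items())

print(expected_rating(active, "C"))
```

In this toy data the active user's ratings match "u1" far better than "u2", so the distribution for item C concentrates near u1's rating and the expected value lands in the upper part of the scale.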

Recent Developments

Recently, a study presented a general framework called neural network-based collaborative filtering (NCF), which can express and generalize matrix factorization. A multi-layer perceptron (MLP) was leveraged to learn the user-item interaction function and add non-linearities to NCF modeling. The proposed method was evaluated against established baselines on two real-world datasets.

The proposed NCF framework demonstrated a better performance compared to the existing state-of-the-art (SOTA) methods in the experiments. Empirical evidence also showed that using deeper layers of neural networks leads to improved recommendation performance.
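
To illustrate what learning the user-item interaction function with an MLP means structurally, the sketch below runs a single forward pass: user and item embeddings are concatenated, passed through one ReLU hidden layer, and squashed to a score between 0 and 1. The weights are random and untrained, and the layer sizes, names, and single hidden layer are assumptions for illustration only; this is not the architecture or training procedure evaluated in the study.

```python
import random
from math import exp

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical sizes: 4 users, 5 items, 3-dim embeddings, 4 hidden units.
N_USERS, N_ITEMS, DIM, HIDDEN = 4, 5, 3, 4

def rand_matrix(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

user_emb = rand_matrix(N_USERS, DIM)   # one embedding vector per user
item_emb = rand_matrix(N_ITEMS, DIM)   # one embedding vector per item
W1 = rand_matrix(HIDDEN, 2 * DIM)      # hidden-layer weights
b1 = [0.0] * HIDDEN                    # hidden-layer biases
w_out = [random.uniform(-0.5, 0.5) for _ in range(HIDDEN)]  # output weights

def score(user, item):
    """Forward pass: concatenate embeddings, ReLU hidden layer, sigmoid output."""
    x = user_emb[user] + item_emb[item]  # list concatenation, length 2 * DIM
    hidden = [
        max(0.0, sum(w * v for w, v in zip(row, x)) + b)
        for row, b in zip(W1, b1)
    ]
    logit = sum(w * h for w, h in zip(w_out, hidden))
    return 1.0 / (1.0 + exp(-logit))

print(score(0, 1))  # an untrained score in (0, 1)
```

In a trained model the embeddings and weights would be fitted to observed interactions; replacing the hidden layer with an element-wise product of the two embeddings recovers a matrix-factorization-style score, which is the sense in which such a framework can express matrix factorization.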

In another study, a novel framework leveraging dynamic graphs called dynamic graph collaborative filtering (DGCF) was presented to capture sequential and collaborative relations of both users and items at the same time. Three update mechanisms were also proposed, including second-order 'aggregation', first-order 'propagation', and zero-order 'inheritance', to represent the effect on an item or user when a new interaction occurs.

Based on these mechanisms, the related item and user embeddings were updated simultaneously when interactions occurred in turn, and then the latest embeddings were used to make recommendations. Extensive experiments were performed using three public datasets to rigorously assess the method.

The proposed DGCF framework significantly outperformed the SOTA dynamic recommendation methods. Specifically, the approach attained higher performance when the dataset contained less action repetition, which indicated the effectiveness of integrating the dynamic collaborative information.

A recent study introduced causal collaborative filtering (CCF), a general framework to model causality in collaborative filtering and recommendation. A unified causal view of collaborative filtering was provided, and it was mathematically demonstrated that several conventional collaborative filtering algorithms are special CCF cases under simplified causal graphs.

Subsequently, a conditional intervention approach for do-operations was proposed to estimate user-item causal preferences from observational data. A general counterfactual constrained learning framework was also introduced to estimate the user-item preferences.

The approach was evaluated through experiments on two representative real-world datasets. The CCF framework notably improved recommendation performance and mitigated the Simpson’s paradox problem in multiple collaborative filtering algorithms.

To summarize, collaborative filtering recommends items based on similarities between users or items. It uses various methods like user-user, item-item, probabilistic models, and recent advancements like NCF, DGCF, and CCF to predict user preferences and generate tailored recommendations. While collaborative filtering shines in tailoring recommendations, its effectiveness rests on tackling critical challenges like data sparsity, scalability, explainability and fairness, and popularity bias.

References and Further Reading

Ekstrand, M. D., Riedl, J. T., Konstan, J. A. (2011). Collaborative filtering recommender systems. Foundations and Trends® in Human-Computer Interaction, 4(2), 81-173. http://dx.doi.org/10.1561/1100000009

He, X., Liao, L., Zhang, H., Nie, L., Hu, X., Chua, T. S. (2017). Neural collaborative filtering. Proceedings of the 26th international conference on world wide web, 173-182. https://doi.org/10.1145/3038912.3052569

Li, X., Zhang, M., Wu, S., Liu, Z., Wang, L., Yu, P. S. (2020). Dynamic graph collaborative filtering. 2020 IEEE international conference on data mining (ICDM), 322-331. https://doi.org/10.1109/ICDM50108.2020.00041

Xu, S., Ge, Y., Li, Y., Fu, Z., Chen, X., Zhang, Y. (2023). Causal collaborative filtering. Proceedings of the 2023 ACM SIGIR International Conference on Theory of Information Retrieval, 235-245. https://doi.org/10.1145/3578337.3605122

Mustafa, N., Ibrahim, A. O., Ahmed, A., Abdullah, A. (2017). Collaborative filtering: Techniques and applications. 2017 International Conference on Communication, Control, Computing and Electronics Engineering (ICCCCEE), 1-6. https://doi.org/10.1109/ICCCCEE.2017.7867668

Singh, P. K., Pramanik, P. K. D., Choudhury, P. (2020). Collaborative filtering in recommender systems: Technicalities, challenges, applications, and research trends. New Age Analytics, 183-215. https://www.researchgate.net/publication/340828169_Collaborative_Filtering_in_Recommender_Systems_Technicalities_Challenges_Applications_and_Research_Trends

Last Updated: Jan 22, 2024

Written by Samudrapom Dam

Samudrapom Dam is a freelance scientific and business writer based in Kolkata, India. He has been writing articles related to business and scientific topics for more than one and a half years. He has extensive experience in writing about advanced technologies, information technology, machinery, metals and metal products, clean technologies, finance and banking, automotive, household products, and the aerospace industry. He is passionate about the latest developments in advanced technologies, the ways these developments can be implemented in a real-world situation, and how these developments can positively impact common people.

Citations

Dam, Samudrapom. (2024, January 22). Insights into Collaborative Filtering. AZoAi. Retrieved on September 23, 2024 from https://www.azoai.com/article/Insights-into-Collaborative-Filtering.aspx.
