USST_Arts_112480748 A Study of the Influence of Negative Correlation Information and Sigmoid Weight Similarity on Collaborative Filtering Algorithms
Chinese Abstract
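The rapid development of the Internet ushered in the Web 2.0 era, in which users changed from
the passive recipients of information they had been under Web 1.0 into active publishers. The
spread of the Internet drove rapid growth in the number of users, which in turn caused an
explosion of online information. Yet as the volume of information grows, users make less
effective use of it: information overload and disorientation set in, and the time cost of
finding content of interest in this mass of material keeps rising. As one of the tools for
addressing information overload, a recommender system mines each user's interests from
records of their access behavior and recommends information of interest to them, achieving
precise, personalized recommendation. Collaborative filtering is the most widely used and
most successful personalized recommendation algorithm in industry, but in practice it suffers
from the sparsity problem and the cold-start problem, which greatly reduce recommendation
accuracy and hinder the development of recommender systems. Building on the classic
collaborative filtering algorithm, this thesis proposes a collaborative filtering algorithm
that considers negative correlation information and a collaborative filtering algorithm based
on Sigmoid weight similarity; experiments demonstrate the effectiveness of both. The specific
research work is as follows: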
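1. A collaborative filtering algorithm that considers negative correlation information is
proposed. Neighbor selection connects the steps before and after it in collaborative
filtering. Classic collaborative filtering usually computes similarity with the Pearson
correlation coefficient and then selects the users with the highest similarity as the current
user's nearest neighbors. This approach uses only the positive correlations in user ratings
and ignores the negative ones. Comparative experiments on the MovieLens dataset show that
negative correlation information not only improves the accuracy of the recommendations but
also increases the diversity of the recommendation list. We also find that negative
correlation information helps improve recommendation accuracy for small-degree users (users
who have rated few items). In summary, negative correlation information helps with both the
problem of ensuring accuracy and diversity of the recommendation list at the same time and
the cold-start problem.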
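2. A collaborative filtering algorithm based on Sigmoid weight similarity is proposed.
Similarity computation is the foundation of collaborative filtering: both neighbor selection
and rating prediction require an accurate similarity measure. Classic collaborative filtering
computes similarity over the co-rated items but does not take the size of the co-rated item
set into account. The weight similarity introduced later does account for it, but it only
lowers similarity computed on small co-rated sets, does not raise similarity computed on
large co-rated sets, and introduces a weight parameter that must be tuned by hand.
Experiments on the MovieLens dataset show that the algorithm based on Sigmoid weight
similarity not only achieves better prediction accuracy and recommendation coverage than
traditional collaborative filtering but also removes weight similarity's need for a manually
tuned parameter. We also find that the algorithm substantially improves prediction accuracy
for small-degree users. In summary, Sigmoid weight similarity effectively alleviates the
sparsity and cold-start problems in collaborative filtering.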
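The neighbor-selection idea in item 1 can be illustrated with a small sketch. It is not the thesis's algorithm, whose exact selection rule and parameters are not given here. It assumes, for illustration only, that negatively correlated users are admitted by ranking candidates on the absolute value of the Pearson coefficient, and that predictions use the common mean-centered weighted sum. The toy `ratings` data and the names `pearson`, `neighbors`, and `predict` are hypothetical.

```python
from math import sqrt

# Toy ratings (hypothetical data): user -> {item: rating on a 1-5 scale}.
ratings = {
    "a": {"i1": 5, "i2": 1, "i3": 4},
    "b": {"i1": 1, "i2": 5, "i3": 2, "i4": 1},
    "c": {"i1": 4, "i2": 2, "i3": 5, "i4": 5},
}

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def pearson(ru, rv):
    """Pearson correlation over co-rated items; 0.0 when it is undefined."""
    common = sorted(set(ru) & set(rv))
    if len(common) < 2:
        return 0.0
    mu, mv = mean(ru[i] for i in common), mean(rv[i] for i in common)
    num = sum((ru[i] - mu) * (rv[i] - mv) for i in common)
    den = sqrt(sum((ru[i] - mu) ** 2 for i in common)) * sqrt(
        sum((rv[i] - mv) ** 2 for i in common))
    return num / den if den else 0.0

def neighbors(ratings, user, item, k, use_negative):
    """Top-k users who rated `item`, ranked by sim (by |sim| if use_negative)."""
    sims = {v: pearson(ratings[user], r) for v, r in ratings.items()
            if v != user and item in r}
    if not use_negative:
        # Classic behavior: keep positively correlated users only.
        sims = {v: s for v, s in sims.items() if s > 0}
    ranked = sorted(sims, key=lambda v: abs(sims[v]), reverse=True)
    return [(v, sims[v]) for v in ranked[:k]]

def predict(ratings, user, item, k=2, use_negative=True):
    """Mean-centered weighted prediction; a negative sim flips the deviation."""
    base = mean(ratings[user].values())
    nbrs = neighbors(ratings, user, item, k, use_negative)
    den = sum(abs(s) for _, s in nbrs)
    if den == 0:
        return base
    num = sum(s * (ratings[v][item] - mean(ratings[v].values()))
              for v, s in nbrs)
    return base + num / den
```

In this toy data, user b's ratings mirror user a's on the co-rated items, so their Pearson coefficient is close to -1. With `use_negative=True`, b's below-average rating of `i4` is read as evidence that a will rate it above average, and b contributes to the prediction alongside the positively correlated user c.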
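The work above addresses, to some extent, the sparsity problem, the cold-start problem, and
the problem of ensuring both diversity and accuracy of the recommendation list, and it helps
advance both theoretical research on and practical application of collaborative filtering.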
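Keywords: personalized recommender system, recommendation algorithm, collaborative
filtering, rating prediction, negative correlation information, weight similarity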
ABSTRACT
The rapid development of the Internet has ushered in the Web 2.0 era. Web users have changed
from passive recipients of information in the Web 1.0 era to active publishers of information
in Web 2.0. The popularity of the Internet has prompted rapid growth in the number of
Internet users and has thus led to an explosion of online information. However, as
information accumulates, its utilization declines, and information overload and
disorientation appear. Users spend more and more time finding information of interest in this
mass of material. As one of the tools for overcoming the information overload problem, a
recommender system mines users' interests from their online behavior and then recommends
information of interest to each user, making precise personalized recommendations.
Collaborative filtering is the personalized recommendation algorithm most widely and
successfully used in industry to date, but the sparsity and cold-start problems continue to
trouble it. Building on the classic collaborative filtering algorithm, this thesis proposes a
collaborative filtering algorithm that considers negative correlation information and one
based on Sigmoid weight similarity to address the data sparsity and cold-start problems.
Their effectiveness is verified experimentally. The specific research work of the thesis is
as follows:
1. First, a collaborative filtering algorithm that considers negative correlation
information is presented. Neighbor selection plays a connecting role in collaborative
filtering algorithms. Classic collaborative filtering algorithms usually calculate
similarity using the Pearson correlation coefficient and select the few users with the
highest similarity as the current user's nearest neighbors, considering only the positive
correlation information in users' item ratings and ignoring the negative correlation
information. Experiments on the MovieLens dataset show that negative correlation
information not only improves the accuracy of the recommendation results but also increases
the diversity of the recommendation list. Further analysis reveals that negative
correlation information helps improve recommendation accuracy for users with small degree.
To sum up, negative correlation information helps resolve the accuracy-diversity dilemma of
the recommendation list and the cold-start problem in recommender systems.
2. Second, a collaborative filtering algorithm based on Sigmoid weight similarity is
proposed. Similarity calculation is the basis of collaborative filtering algorithms: neighbor
selection and rating prediction both require an accurate similarity measure. Classic
collaborative filtering calculates similarity on co-rated items but does not consider the
size of the co-rated item set. Although the later weight similarity takes this into account,
it only reduces similarity computed on a small co-rated set and does not increase similarity
computed on a larger one. Experiments on the MovieLens dataset show that the proposed
algorithm outperforms the traditional collaborative filtering algorithm in prediction
accuracy and recommendation coverage and removes weight similarity's need for manually
adjusted parameters. Further analysis shows that the algorithm improves prediction accuracy
for users with small degree. In conclusion, Sigmoid weight similarity effectively alleviates
the data sparsity and cold-start problems in recommender systems.
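The role of the co-rated set size in item 2 can be sketched as a weight that multiplies the raw similarity. Both weights below are assumptions made for illustration. `min_weight` follows the common significance-weighting form min(n, γ)/γ, with the threshold `gamma=50` chosen arbitrarily here. `sigmoid_weight` is a plain logistic function of the co-rated count and is only a stand-in, since the thesis's actual Sigmoid weight formula is not given in this abstract. All function names are hypothetical.

```python
from math import exp

def min_weight(n_common, gamma=50):
    """Significance-style weight: shrinks similarity when fewer than `gamma`
    items are co-rated; `gamma` has to be chosen by hand (assumed value)."""
    return min(n_common, gamma) / gamma

def sigmoid_weight(n_common):
    """Parameter-free logistic weight of the co-rated count.
    Illustrative stand-in only, not the formula used in the thesis."""
    return 1.0 / (1.0 + exp(-n_common))

def weighted_similarity(sim, n_common, weight):
    """Scale a raw similarity (e.g. a Pearson coefficient) by a size weight."""
    return sim * weight(n_common)
```

For example, `weighted_similarity(0.8, 25, min_weight)` halves the raw similarity because only 25 of the assumed 50 required items are co-rated, whereas `sigmoid_weight` needs no threshold and simply grows toward 1 as the co-rated set grows.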
The research above addresses, to a certain extent, the data sparsity problem, the cold-start
problem, and the accuracy-diversity dilemma of collaborative filtering, and thus helps
advance both theoretical research on and practical application of collaborative filtering.
Keywords: personalized recommender system, recommendation
algorithm, collaborative filtering, rating prediction, negative
correlation information, weight similarity
Contents
Chinese Abstract
ABSTRACT
Chapter 1 Introduction
1.1 Research Background
1.2 Research Significance
1.2.1 Theoretical Significance
1.2.2 Practical Significance
1.3 Applications of and Research on Personalized Recommender Systems
1.3.1 Applications of Personalized Recommender Systems
1.3.2 Research on Personalized Recommender Systems
1.4 Main Content and Contributions of This Thesis
1.5 Organization of the Thesis
Chapter 2 Theory and Methods of Personalized Recommendation
2.1 Association-Rule-Based Recommendation
2.2 Content-Based Recommendation
2.3 Collaborative Filtering Recommendation
2.4 Hybrid Recommendation
2.5 Network-Structure-Based Recommendation
2.6 Chapter Summary
Chapter 3 Theory of Collaborative Filtering Algorithms
3.1 Concept and Principles of Collaborative Filtering
3.2 Classic Collaborative Filtering Techniques
3.2.1 User-Based Collaborative Filtering
3.2.2 Item-Based Collaborative Filtering
3.2.3 Model-Based Collaborative Filtering
3.3 Survey of Improvements to Collaborative Filtering
3.3.1 Similarity Improvements
3.3.2 Neighbor Selection Improvements
3.3.3 Rating Prediction Improvements
3.4 Chapter Summary
Chapter 4 A Collaborative Filtering Algorithm Considering Negative Correlation Information
4.1 Problem Description
4.2 Related Work
4.3 Shortcomings of Traditional Collaborative Filtering
4.4 A Collaborative Filtering Algorithm Considering Negative Correlation Information
4.4.1 Neighbor Selection
4.4.2 Rating Prediction
4.5 Experimental Procedure and Result Analysis
4.5.1 Dataset
4.5.2 Evaluation Metrics
4.6 Experimental Results and Analysis
4.6.1 Distribution of Pearson Similarity Values
4.6.2 Estimation of the Parameter α
4.6.3 Accuracy Comparison
4.6.4 Diversity Comparison
4.6.5 Effect of Negative Correlation on Large-Degree and Small-Degree Users
4.7 Chapter Summary
Chapter 5 A Collaborative Filtering Algorithm Based on Sigmoid Weight Similarity
5.1 Problem Description
5.2 Related Work
5.3 A Collaborative Filtering Algorithm Based on Sigmoid Weight Similarity
5.3.1 Shortcomings of Traditional Similarity and Weight Similarity
5.3.2 Sigmoid Weight Similarity
5.4 Experimental Procedure and Result Analysis
5.4.1 Performance Comparison and Analysis of SWCF and CF
5.4.2 Comparison and Analysis of the Sigmoid Weight and the Min Weight
5.4.3 Study of the User Cold-Start Problem
5.5 Chapter Summary
Chapter 6 Conclusions and Future Work
6.1 Conclusions
6.2 Future Work
References
Acknowledgements
Author: 赵德峰 (Zhao Defeng)
Category: Higher-education materials
Length: 66 pages
Size: 2.66 MB
Format: PDF
Date: 2024-11-11