Research on Personalized Recommendation Algorithms Based on the Statistical Properties of User-Object Bipartite Networks

Author: 刘畅 · Date: 2024-11-07 · 80 pages · 2.82 MB
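With the rapid development of the Internet and Web 2.0 technologies, a new generation
of network applications such as e-commerce, microblogs, and social networks has quickly
entered people's daily lives. These applications have made life far more convenient, but
they have also brought a flood of heterogeneous information, and selecting the
information one actually needs has become a major challenge. Personalized recommender
systems were proposed as an effective tool for addressing this information overload: by
analyzing the history of objects a user has selected or collected, they mine and predict
the user's latent interests and preferences, and then recommend information or products
the user is likely to be interested in.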
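This paper first analyzes the background against which personalized recommender systems
emerged and the theoretical and social significance of their development, and reviews
their practical applications in China and abroad, current research hotspots, and open
problems. It then introduces the relevant concepts of complex network theory and
summarizes the development of personalized recommender systems, with emphasis on typical
recommendation algorithms based on network structure. Next, it describes the data sets
and the evaluation metrics used. Finally, working on the user-object bipartite network
and taking its statistical properties as the starting point, it proposes three distinct
yet related personalized recommendation algorithms built on the user clustering
coefficient, the user correlation network, and local heat conduction.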
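Collaborative filtering algorithm based on the user clustering coefficient. In a
user-object bipartite network, the user clustering coefficient is a statistic that
measures how similar the characteristics or interests of a target user's neighboring
users are. This paper introduces the user clustering coefficient of the bipartite graph
into the similarity calculation of collaborative filtering and examines how the
clustering coefficient of the user-object bipartite network affects the algorithm.
Numerical analysis on two data sets shows that introducing the user clustering
coefficient substantially improves both the accuracy of the recommendation results and
the diversity of the recommendation lists. Finally, the properties of the new algorithm
are explored as the sparsity of the data set varies.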
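The abstract does not state the formula used for the user clustering coefficient, so the following is only a minimal sketch under an assumed definition: the overlap-based bipartite clustering coefficient, in which a user's coefficient is the average Jaccard overlap between that user's object set and the object sets of every user sharing at least one object. The names `pair_overlap`, `user_clustering`, and `cc_toy` are hypothetical, and the way the coefficient enters the similarity calculation is not reproduced here.

```python
def pair_overlap(a, b):
    """Jaccard overlap between two object sets (assumed pairwise measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def user_clustering(user_objects, user):
    """Average overlap between `user` and every user sharing an object with it.

    This is an assumed, overlap-based definition; the source may use another.
    """
    own = user_objects[user]
    # Neighbors in the bipartite sense: other users with at least one common object.
    neighbors = [v for v, objs in user_objects.items() if v != user and own & objs]
    if not neighbors:
        return 0.0
    return sum(pair_overlap(own, user_objects[v]) for v in neighbors) / len(neighbors)


# Toy data (hypothetical): each user maps to the set of objects they collected.
cc_toy = {
    "u1": {"a", "b"},
    "u2": {"a", "b"},
    "u3": {"b", "c"},
    "u4": {"d"},
}

for name in sorted(cc_toy):
    print(name, round(user_clustering(cc_toy, name), 3))
```

In this toy data `u1` has two neighbors, one with identical tastes and one with partial overlap, so its coefficient falls between the two; `u4` shares nothing with anyone and gets zero.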
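Collaborative filtering algorithm based on the user correlation network. Classical
collaborative filtering computes the user correlation network by projecting the
user-object bipartite network. Empirical statistics show that most user correlation
networks are close to fully connected and have very large clustering coefficients;
further analysis shows that this is caused by a small number of high-degree objects in
the bipartite graph. This paper proposes an improved collaborative filtering algorithm
based on second-order information of the user correlation network, and studies users'
preferences for mainstream objects, special objects, and noise objects.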
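The projection effect described above can be illustrated with a small sketch. It links two users whenever they share at least one object and compares the density of the resulting user network with and without a single very popular object. The data and the names `project_users`, `density`, and `proj_toy` are hypothetical; the second-order method proposed in the paper is not reproduced here.

```python
from itertools import combinations


def project_users(user_objects, ignore=frozenset()):
    """One-mode projection: link two users if they share an object not in `ignore`."""
    edges = set()
    for u, v in combinations(sorted(user_objects), 2):
        shared = (user_objects[u] & user_objects[v]) - set(ignore)
        if shared:
            edges.add((u, v))
    return edges


def density(user_objects, ignore=frozenset()):
    """Fraction of all possible user pairs that are linked in the projection."""
    n = len(user_objects)
    possible = n * (n - 1) // 2
    return len(project_users(user_objects, ignore)) / possible if possible else 0.0


# Toy data (hypothetical): every user has collected the popular object "hit".
proj_toy = {
    "u1": {"hit", "a"},
    "u2": {"hit", "a"},
    "u3": {"hit", "b"},
    "u4": {"hit", "c"},
    "u5": {"hit", "c"},
}

print("density with the popular object:   ", density(proj_toy))
print("density without the popular object:", density(proj_toy, {"hit"}))
```

With the popular object included, every pair of users is linked; once it is ignored, only the pairs with a genuinely shared niche object remain connected.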
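Heat conduction recommendation algorithm based on local information of the bipartite
graph. This part studies the classical heat conduction recommendation algorithm under
local node similarity. It first introduces nine similarity measures and identifies,
through comparative analysis, the similarity definition that gives the best
recommendation performance. On this basis it proposes an improved heat conduction
algorithm that raises the recommendation ability of small-degree nodes by removing
redundant information.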
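For orientation, the classical heat conduction step on a bipartite network can be sketched as two rounds of averaging. This follows the standard formulation from the literature, not the improved algorithm proposed here, and it does not include the nine similarity measures. The names `heat_conduction_scores`, `recommend`, and `heat_toy` are hypothetical.

```python
from collections import defaultdict


def heat_conduction_scores(user_objects, target):
    """Classical two-step heat conduction for one target user.

    Step 1: objects collected by the target start at temperature 1, all others
            at 0; each user takes the average temperature of their objects.
    Step 2: each object takes the average temperature of the users who
            collected it.
    """
    object_users = defaultdict(set)
    for user, objs in user_objects.items():
        for obj in objs:
            object_users[obj].add(user)

    collected = user_objects[target]
    user_temp = {
        user: len(objs & collected) / len(objs)
        for user, objs in user_objects.items()
        if objs
    }
    return {
        obj: sum(user_temp[u] for u in users) / len(users)
        for obj, users in object_users.items()
    }


def recommend(user_objects, target, n):
    """Top-n objects the target has not collected, ranked by final temperature."""
    scores = heat_conduction_scores(user_objects, target)
    candidates = [obj for obj in scores if obj not in user_objects[target]]
    # Ties are broken alphabetically so the output is deterministic.
    return sorted(candidates, key=lambda obj: (-scores[obj], obj))[:n]


# Toy data (hypothetical).
heat_toy = {
    "u1": {"a", "b"},
    "u2": {"a", "b", "c"},
    "u3": {"a", "d"},
}

print(recommend(heat_toy, "u1", 2))
```

Because the second step averages over an object's users rather than summing, an object collected by few users is not penalized for its low degree, which is why heat conduction tends to surface less popular objects.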
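Because of its theoretical significance and the large economic benefits it generates,
personalized recommendation has become a research hotspot in mathematics, computer
science, management science, systems science, and physics, and especially in the field
of complex networks. Looking ahead, research on personalized recommender systems is set
to enter a period of even more vigorous development, and our next steps will continue to
focus on accurately measuring user interest and on the properties of bipartite networks.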
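Keywords: user-object bipartite graph; collaborative filtering algorithm; mass diffusion;
heat conduction; clustering coefficient; user correlation network; local information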
ABSTRACT
With the rapid development of the Internet and Web 2.0 technologies, a new generation of
network applications such as e-commerce, microblogging, and social networks has quickly
entered people's daily lives. These applications bring great convenience, but they also
create a dilemma of information overload, so that choosing the information one needs has
become a huge challenge for ordinary people. A personalized recommendation system is an
effective tool for solving the information overload problem. By analyzing historical
information, personalized recommendation systems mine and predict users' interests and
preferences, and then recommend information or products that users are likely to be
interested in.
This paper first analyzes the background and emergence of personalized recommender
systems, along with their theoretical and social significance, and reviews practical
applications of personalized recommendation in China and abroad. We then introduce
concepts from complex network theory and summarize the development of personalized
recommendation systems and typical recommendation algorithms based on network structure.
Next, the data sets and the evaluation metrics used for the algorithms are introduced.
Finally, based on the user-object bipartite network and its statistical properties,
namely the user clustering coefficient, the user correlation network, and local
information, we present three different recommendation algorithms.
Collaborative filtering algorithm based on the user clustering coefficient. In a
user-object bipartite network, the user clustering coefficient measures how similar the
characteristics or interests of a target user's neighbors are. In this part, the user
clustering coefficient is introduced into collaborative filtering in order to investigate
its effect on the algorithm. Numerical analysis on two data sets shows that introducing
the user clustering coefficient can significantly improve the accuracy of the
recommendation results and the diversity of the recommendation lists. Finally, we explore
the properties of the new algorithm as the sparsity of the data set changes.
Collaborative filtering algorithm based on the user correlation network. The classical
collaborative filtering algorithm computes the user correlation network by projecting the
user-object bipartite network. Empirical statistics show that most user correlation
networks are close to fully connected and have very large clustering coefficients;
further analysis shows that this is caused by a small number of high-degree objects in
the bipartite network. In this part, second-order information of the user correlation
network is embedded into the collaborative filtering algorithm, and users' preferences
for mainstream, special, and noise products are identified.
Heat conduction recommendation algorithm based on local information of the bipartite
graph. This part studies the classical heat conduction recommendation algorithm under
local node similarity. Nine similarity measures are first introduced, and the similarity
definition with the best recommendation performance is identified through comparative
analysis. On this basis, an improved heat conduction algorithm is proposed that raises
the recommendation ability of small-degree nodes by removing redundant information.
Because personalized recommendation has theoretical significance and produces huge
economic benefits, it has become a research focus in mathematics, computer science,
management science, systems science, and physics, especially in the field of complex
networks. Looking ahead, research on personalized recommendation systems will enter a
period of more vigorous development, and in the next step of our work we will continue to
focus on accurately measuring user interest and on the properties of bipartite networks.
Key words: User-object bipartite network; Collaborative filtering algorithm; Mass
diffusion; Heat conduction; Clustering coefficient; User correlation network; Local
information
Chinese Abstract
ABSTRACT
Chapter 1
1.1 Research Background
1.1.1 Rapid Development of the Internet
1.1.2 The Dilemma of Information Overload
1.2 Research Significance
1.2.1 Theoretical Significance
1.2.2 Practical Significance
1.3 Current State of Personalized Recommender Systems
1.3.1 Wide Application and Research of Personalized Recommendation
1.3.2 Problems Facing the Development of Personalized Recommendation
1.4 Main Work of This Paper
1.4.1 Contributions of This Paper
1.4.2 Structure of This Paper
Chapter 2 Background Theory of Personalized Recommender Systems
2.1 Introduction to Complex Network Theory
2.2 User-Object Bipartite Graphs
2.2.1 Introduction to Bipartite Graphs
2.2.2 Projection of Bipartite Graphs
2.2 Statistical Properties of Bipartite Graphs
2.2.1 Degree and Related Properties
2.2.2 Node Strength Distribution
2.2.3 Disassortativity
2.2.4 Clustering Coefficient
2.2.5 Average Node Distance
2.3 Personalized Recommendation Algorithms
2.3.1 Collaborative Filtering Systems
2.3.2 Content-Based Recommender Systems
2.3.3 Recommendation Algorithms Based on Network Structure
2.3.4 Other Recommendation Algorithms
2.4 Typical Recommendation Algorithms Based on Network Structure
2.4.1 Recommendation Based on Resource Allocation in Bipartite Graphs
2.4.2 User Similarity Measures Based on Diffusion
2.4.3 Recommendation Based on Degree Correlation in Bipartite Graphs
2.4.4 Recommendation Based on High-Order Similarity
2.4.5 Recommendation Based on Directed Similarity
Chapter 3 Data and Evaluation Metrics for Personalized Recommendation
3.1 Data Sets
3.1.1 MovieLens
3.1.2 Netflix
3.1.3 Delicious
3.1.4 Amazon
3.2 Evaluation Metrics
3.2.1 Accuracy
3.2.2 Popularity
3.2.3 Diversity
3.2.4 Precision and Recall
3.2.5 F-Measure
Chapter 4 Effect of the Clustering Coefficient on Collaborative Filtering
4.1 Clustering Coefficient
4.2 Effect of the Object Clustering Coefficient on Collaborative Filtering
4.3 Effect of the User Clustering Coefficient on Collaborative Filtering
4.4 Analysis of Numerical Results
4.4.1 Results at a Sparsity of 90
4.4.2 Results Under Varying Sparsity
4.5 Summary
Chapter 5 Effect of the User Correlation Network on Collaborative Filtering
5.1 Study of the User Correlation Network
5.1.1 Introduction to the User Correlation Network
5.1.2 Statistical Properties of the User Correlation Network
5.2 Collaborative Filtering Based on the User Correlation Network
5.3 Analysis of Simulation Results
5.4 Summary