Research on the Sentiment Tendency of Internet Product Reviews
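Abstract
With the rapid development of e-commerce, many shopping websites and product forums have appeared online, giving consumers a platform for posting reviews. More and more people consult user reviews and media reports about a product on the internet before making a purchasing decision. Through online product reviews, consumers can learn how other users feel about a product and make better purchasing decisions, while sellers and manufacturers can obtain timely feedback on their products and services, learn how users rate them and their competitors, and thereby improve their products and services and gain a competitive advantage. As an emerging area of unstructured information mining, research on the sentiment tendency of product reviews has therefore attracted great interest. The basic approach is to first collect the relevant product reviews and judge their subjective content, then identify the product attributes that users care about from the reviews that contain subjective evaluations, and finally judge the polarity of these attributes to determine the attitude that a review conveys toward a given attribute of the product.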
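The steps described above (subjectivity judgment, attribute identification, polarity judgment) can be sketched as a toy pipeline. This is only an illustration under loud assumptions: the lexicons `SENTIMENT` and `ATTRIBUTES` and the functions `is_subjective` and `analyze` are hypothetical, the example uses English tokens, and plain dictionary lookup stands in for the learned models the thesis uses.

```python
# Toy illustration of the pipeline: subjectivity check -> attribute
# identification -> polarity judgment. Both lexicons are hypothetical.
SENTIMENT = {"delicious": 1, "friendly": 1, "slow": -1, "bland": -1}
ATTRIBUTES = {"food", "service", "price"}


def is_subjective(tokens):
    """Treat a sentence as subjective if it contains any sentiment word."""
    return any(t in SENTIMENT for t in tokens)


def analyze(review):
    """Return {attribute: polarity} collected from subjective sentences."""
    result = {}
    for sentence in review.lower().split("."):
        tokens = sentence.split()
        if not is_subjective(tokens):
            continue  # objective sentence: no opinion to extract
        attrs = [t for t in tokens if t in ATTRIBUTES]
        score = sum(SENTIMENT.get(t, 0) for t in tokens)
        if score > 0:
            polarity = "positive"
        elif score < 0:
            polarity = "negative"
        else:
            polarity = "neutral"
        for attr in attrs:
            result[attr] = polarity
    return result


print(analyze("The food was delicious. The service was slow. We came at noon."))
```

The third sentence contains no sentiment word, so it is filtered out as objective before any attribute or polarity is assigned.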
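Sentiment tendency research on product reviews is part of text sentiment tendency research, so this thesis first gives a detailed introduction to the background of that field. Research abroad started earlier and has produced a comparatively large body of literature and results, whereas Chinese sentiment tendency research is still at an exploratory stage with few references available; the thesis gives an overview of both.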
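Next, the thesis focuses on online reviews in the dining domain and uses natural language processing methods to study the sentiment tendency of product reviews. In the product attribute identification stage, a support vector machine (SVM) is used to identify, automatically and without manual intervention, the product attributes that users frequently care about in customer reviews. Experiments show that, compared with other machine learning methods, this approach removes redundancy well and is practical in real use. In the sentiment tendency judgment stage, the thesis treats sentiment tendency analysis of reviews as a sequence labeling problem and proposes a text tendency analysis method based on conditional random fields (CRFs). Using a stepwise coarse-to-fine strategy within the CRFs framework, the method takes full account of contextual information, including the effect of a sentiment word's part of speech, negation words, and degree adverbs on its sentiment tendency. It separates the judgment of a sentence's positive or negative polarity from the judgment of polarity strength and handles them with multiple CRFs models, which reduces the influence of objective labels on the strength results. The experimental results are good and demonstrate the effectiveness of the method.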
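The contextual features and the coarse-to-fine split described above can be sketched as follows. This is a minimal illustration, not the thesis's feature templates: the word lists `NEGATIONS` and `DEGREE_ADVERBS`, the label format (`POS-strong`, `NEG-weak`, `OBJ`), and the functions `token_features` and `split_labels` are all assumptions made for the example. It only builds per-token feature dictionaries of the kind a CRF toolkit could consume; no model is trained.

```python
# Hypothetical word lists for illustration; not the thesis's lexicons.
NEGATIONS = {"not", "never", "no"}
DEGREE_ADVERBS = {"very", "extremely", "slightly"}


def token_features(sentence, i, window=2):
    """Build a feature dict for token i of a list of (word, pos) pairs."""
    word, pos = sentence[i]
    # Words in a small window to the left may negate or intensify token i.
    left_words = [w for w, _ in sentence[max(0, i - window):i]]
    feats = {
        "word": word,
        "pos": pos,
        "negated": any(w in NEGATIONS for w in left_words),
        "intensified": any(w in DEGREE_ADVERBS for w in left_words),
    }
    # Neighbouring words and tags supply local context.
    if i > 0:
        feats["prev_word"], feats["prev_pos"] = sentence[i - 1]
    else:
        feats["BOS"] = True  # beginning of sentence
    if i < len(sentence) - 1:
        feats["next_word"], feats["next_pos"] = sentence[i + 1]
    else:
        feats["EOS"] = True  # end of sentence
    return feats


def split_labels(fine_labels):
    """Split assumed labels like 'NEG-weak' into two label sequences.

    Stage 1 (coarse) keeps POS / NEG / OBJ for every token; stage 2 keeps
    strength only for non-objective tokens, so objective labels do not
    enter the strength model.
    """
    coarse = [label.split("-")[0] for label in fine_labels]
    strength = [label.split("-")[1] for label in fine_labels if label != "OBJ"]
    return coarse, strength


sentence = [("soup", "NN"), ("not", "RB"), ("very", "RB"), ("tasty", "JJ")]
print(token_features(sentence, 3))
print(split_labels(["OBJ", "OBJ", "OBJ", "NEG-weak"]))
```

Keeping polarity and strength in separate label sequences mirrors the idea of training separate models, so that the many objective tokens do not dominate the strength decision.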
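Finally, the thesis summarizes its results and discusses the outlook for the field.
This thesis is one of the outcomes of the National Natural Science Foundation of China project "Research on Opinion Mining Based on Ontology Learning in the Web 2.0 Environment" (No. 70903047).
Keywords: sentiment words; attribute identification; sentiment tendency; support vector machine; conditional random field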
ABSTRACT
With the rapid expansion of electronic commerce, many shopping websites and product forums have emerged online, providing consumers with a platform for publishing reviews. More and more people browse user reviews and media reports about a product before making a decision. Through online product reviews, consumers can learn about other users' sentiment toward a product and make better purchasing decisions. Sellers and producers can promptly obtain user feedback on their products and services and learn how users rate them and their competitors, and can thereby improve their products and services and gain a competitive advantage. As an emerging area of unstructured information mining, sentiment tendency analysis of product reviews has therefore aroused great interest. The basic approach is to first obtain the relevant product reviews and judge their subjective content, then identify the product attributes that users pay attention to in the subjective reviews, and finally judge the polarity of these attributes to determine the attitude expressed toward each attribute of the product.
Sentiment analysis of internet product reviews belongs to text sentiment tendency analysis, so this thesis first gives a detailed introduction to that field. Research abroad started early and has produced a large body of literature and results, whereas Chinese sentiment tendency research is still at an exploratory stage and has few references; the thesis gives a concise overview of both.
Next, this thesis focuses on online reviews in the dining domain and studies the sentiment tendency of product reviews with natural language processing methods. We use the support vector machine (SVM) method to identify product attributes; it identifies the attributes that users frequently pay attention to in reviews automatically, without manual intervention. Experiments show that, compared with other machine learning methods, this method eliminates redundancy well and is usable in practice. In the sentiment tendency analysis stage, we regard sentiment tendency analysis as a sequence labeling problem and propose a CRF-based text sentiment tendency analysis approach. The approach uses a stepwise coarse-to-fine strategy and fully considers the influence of context on sentiment tendency, including the sentiment word's part of speech, negation words, and degree adverbs. It distinguishes between judging a sentence's polarity and judging its polarity strength and processes them separately with multiple CRFs models, which reduces the influence of objective labels on the strength results. The experimental results are good and demonstrate the validity of the method.
Finally, we summarize the thesis and discuss the prospects of this field.
This study is supported by National Natural Science Foundation of China under
grant No. 70903047.
Key words: Sentiment words, Attribute identification, Sentiment tendency, Support vector machine, Conditional Random Field
Contents
Chinese Abstract
ABSTRACT
Chapter 1 Introduction
§1.1 Background and Significance of the Research
§1.2 Research Status in China and Abroad
§1.2.1 Word-Level Sentiment Tendency Research
§1.2.2 Sentence-Level Sentiment Tendency Research
§1.2.3 Document-Level Sentiment Tendency Research
§1.2.4 Applications of Text Sentiment Tendency Analysis
§1.3 Difficulties of the Research
§1.4 Research Content of This Thesis
§1.5 Organization of This Thesis
Chapter 2 Theoretical Foundations and Methods
§2.1 Methods for Attribute Identification
§2.2 Theoretical Foundations of Support Vector Machines
§2.2.1 Basic Principles
§2.2.2 Linearly Separable Support Vector Machines
§2.2.3 Linearly Non-Separable Support Vector Machines
§2.2.4 Nonlinearly Separable Support Vector Machines
§2.2.5 Advantages of the Support Vector Machine Model
§2.3 Methods for Sentiment Tendency Judgment
§2.4 Theoretical Foundations of Conditional Random Fields
§2.4.1 The Sequence Labeling Problem
§2.4.2 Undirected Graph Structure of Conditional Random Fields
§2.4.3 Potential Function Representation of Conditional Random Fields
§2.4.4 Parameter Estimation
§2.4.5 Probability Computation
§2.4.6 Advantages of the Conditional Random Field Model
§2.5 Chapter Summary
Chapter 3 Review Acquisition and Construction of the Sentiment Lexicon
§3.1 Overview
§3.2 Extraction of Product Reviews
§3.3 Subjectivity and Objectivity Analysis
§3.4 Construction of the Sentiment Lexicon
§3.4.1 General Patterns of Sentiment Words
§3.4.2 Basic Sentiment Lexicon
§3.4.3 Domain Sentiment Lexicon
§3.5 Tendency Analysis of Sentiment Words
§3.5.1 Polarity Annotation of Sentiment Words
§3.5.2 Handling Negative Sentences
§3.5.3 Influence of Degree Adverbs
§3.5.4 Computing Modified Polarity
§3.6 Chapter Summary
Chapter 4 Product Attribute Identification Based on Support Vector Machines
§4.1 The Multi-Class Classification Problem
§4.1.1 One-vs-Rest Method
§4.1.2 One-vs-One Method
§4.2 Feature Selection and Scaling
§4.3 Experiments
§4.3.1 Experimental Environment
§4.3.2 Word Segmentation and Part-of-Speech Tagging
§4.3.3 Experimental Procedure
§4.3.4 Results and Analysis
§4.4 Chapter Summary
Chapter 5 Product Sentiment Tendency Analysis Based on Conditional Random Fields
§5.1 Labeling Scheme
§5.2 Feature Selection
§5.3 Feature Templates
§5.4 Experiments
§5.4.1 Choice of CRFs Tool
§5.4.2 Evaluation Criteria
§5.4.3 Results and Analysis
§5.5 Chapter Summary
Chapter 6 Summary and Outlook
§6.1 Summary
§6.2 Outlook for Future Work
References
Papers Published, Research Projects Undertaken, and Results Achieved During the Period of Study
Acknowledgments
Author: 牛悦
Length: 61 pages
Uploaded: 2024-11-19