Research and Application of Text Sentiment Analysis Methods
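Chinese Abstract
In recent years, with the rapid development of information technology, web applications such as BBS forums, blogs, and e-commerce sites have become widespread, and subjective, sentiment-bearing text created by users now appears in every corner of the web; product reviews on e-commerce sites are one example. After buying and using a product, users post their experiences and impressions online as reviews to share with others. Summarizing and analyzing such reviews shows how many users hold a positive attitude toward a product and how many hold a negative one, which gives the distribution of user sentiment toward a class of products and supports the decisions of prospective buyers and merchants. Mining valuable information from hundreds of millions of product reviews has therefore become very important. Reading through massive volumes of text manually to learn how people feel about a topic (or product) is time-consuming, tedious, and inefficient, so using computers to determine user sentiment orientation automatically through text sentiment analysis is a research topic with good application and deployment value.
Our study of text sentiment analysis finds that sentiment clustering is an important step in sentiment orientation analysis, and that the clustering result affects the judgment of sentiment orientation. The traditional clustering algorithms in current use, constrained by their own design, suffer from problems such as low accuracy. This thesis therefore proposes a sentiment analysis method based on spectral clustering. Compared with classical clustering algorithms such as k-means, spectral clustering is simple to apply and achieves better clustering performance.
The research covers the following aspects:
First, the thesis introduces and analyzes traditional text sentiment analysis techniques in detail. The performance analysis shows that traditional methods are often limited to using opinion words, phrases, and syntactic information to identify the polarity of opinion words and phrases, while ignoring the influence of the context in which the words appear and the user's intent. This makes the text clustering results inaccurate and, in turn, biases the judgment of the text's sentiment orientation.
Second, an improved sentiment analysis method based on spectral clustering is proposed. The method first obtains eigenvectors by decomposing the Laplacian matrix; these eigenvectors correspond to feature information about different aspects of the data. The user then reviews part of this feature information and decides on the desired clustering direction. Finally, the system automatically re-clusters according to the user's choice, yielding the sentiment-aspect grouping the user needs and improving clustering accuracy to some degree.
Finally, the new algorithm is applied to opinion mining of e-commerce product reviews. The system is designed on the basis of the improved algorithm proposed in this thesis; the design of the key parts of the model framework is explained in detail, and the text preprocessing module, the sentiment analysis module, and the application of the system to real reviews are described.
Keywords: sentiment analysis; text classification; spectral clustering; user feedback; sentiment orientation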
ABSTRACT
In recent years, with the rapid development of information technology, network
applications such as BBS, blogs, and e-commerce websites have become widely used,
producing a large amount of subjective, user-generated text, such as product reviews
on e-commerce websites. These reviews record customers' subjective impressions after
buying and using a product. Summarizing and analyzing user reviews shows how many
users hold a positive attitude toward a product and how many hold a negative one.
This helps us understand the distribution of user sentiment toward a class of
products, so that potential users and businesses can make decisions. Mining valuable
information from the mass of product reviews is therefore very important. Reading
massive amounts of text manually to learn people's sentiment toward a topic (or
product) is time-consuming, cumbersome, and inefficient. How to use computers to
determine users' sentiment orientation automatically, based on text sentiment
analysis, is thus a research topic with good application and deployment value.
Our study of text sentiment analysis shows that sentiment clustering is an important
step in analyzing sentiment orientation and that the clustering results affect the
judgment of sentiment orientation. However, the traditional clustering algorithms
currently in use produce clustering results of low accuracy. This thesis therefore
proposes a sentiment analysis method based on spectral clustering. Compared with
traditional clustering algorithms, it is simple to apply and performs better.
The content of this thesis is divided into the following parts:
First, this thesis introduces and analyzes traditional text sentiment analysis
techniques in detail. Analyzing the performance of traditional algorithms shows that
traditional methods are often confined to using opinion phrases and syntactic
information to identify the polarity of opinion words and phrases, ignoring the
influence of the surrounding context and the user's intent. This makes the text
clustering results inaccurate and biases the judgment of sentiment orientation.
Second, this thesis proposes an improved sentiment analysis method based on spectral
clustering. The method first obtains eigenvectors by decomposing the Laplacian
matrix; these eigenvectors correspond to different feature information in the data.
The user then reviews part of the feature information to determine the direction of
clustering. Finally, the system automatically re-clusters according to the user's
selection, producing the sentiment classification the user needs and improving the
accuracy of the clustering results to a certain degree.
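The three steps in the preceding paragraph (eigen-decomposition of the Laplacian, user review of candidate eigenvectors, re-clustering along the chosen one) can be illustrated with a minimal NumPy sketch. Everything concrete here is an assumption made for illustration and is not taken from the thesis: the Gaussian similarity with `sigma=1.0`, the symmetric normalized Laplacian, the toy two-column feature matrix, the helper names `laplacian_eigenvectors`, `preview`, and `recluster`, and the sign-based split, which stands in for the final clustering step rather than reproducing the KHM-based procedure of Chapter 4.

```python
import numpy as np


def laplacian_eigenvectors(X, sigma=1.0):
    """Eigen-decompose the symmetric normalized Laplacian of a Gaussian similarity graph."""
    sq_dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    W = np.exp(-sq_dist / (2.0 * sigma ** 2))  # similarity matrix (assumed Gaussian kernel)
    np.fill_diagonal(W, 0.0)                   # no self-loops
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))  # diagonal of D^{-1/2}
    L = np.eye(len(X)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    return np.linalg.eigh(L)                   # eigenvalues in ascending order


def preview(vecs, index, k=2):
    """Indices of the k most negative and k most positive entries, for the user to inspect."""
    order = np.argsort(vecs[:, index])
    return order[:k].tolist(), order[-k:].tolist()


def recluster(vecs, index):
    """Two-way split on the sign of the user-selected eigenvector (a simplification)."""
    return (vecs[:, index] > 0).astype(int)


# Hypothetical toy features: column 0 varies with topic, column 1 with sentiment.
xs = [0.0, 0.2, 4.8, 5.0]
ys = [0.0, 0.2, 2.8, 3.0]
X = np.array([[x, y] for x in xs for y in ys])

vals, vecs = laplacian_eigenvectors(X)
for i in (1, 2):                 # index 0 is the trivial eigenvector
    print(i, preview(vecs, i))   # samples a user would review for each direction

chosen = 2                       # stands in for the user's choice
labels = recluster(vecs, chosen)
print(labels)
```

On this toy data the eigenvector at index 1 should separate the points by the first column and the one at index 2 by the second, so picking index 2 plays the role of a user asking for a grouping by sentiment rather than by topic.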
Finally, the improved algorithm is applied to opinion mining of online product
reviews. The system is designed mainly on the basis of the improved algorithm
proposed in this thesis. The design of the key parts of the model framework is
explained in detail, and the text preprocessing module, the sentiment analysis
module, and the application of the system to real reviews are described.
Keywords: sentiment analysis, text classification, spectral clustering,
user feedback, emotional tendency
Contents
Chinese Abstract
ABSTRACT
Chapter 1 Introduction
1.1 Research Background and Significance
1.2 Current State of Research
1.3 Main Research Content and Thesis Organization
Chapter 2 Sentiment Analysis and Related Techniques
2.1 Sentiment Analysis
2.1.1 The Concept of Sentiment Analysis
2.1.2 Sentiment Analysis Tasks
2.2 Techniques Related to Sentiment Analysis
2.2.1 Text Preprocessing
2.2.2 Feature Selection and Extraction
2.2.3 Sentiment Classification
2.2.4 Sentiment Summarization
2.3 Difficulties in Sentiment Analysis
2.4 Solutions to the Difficulties in Sentiment Analysis
2.5 Chapter Summary
Chapter 3 Classification Methods in Sentiment Analysis
3.1 Text Classification Algorithms
3.1.1 Decision-Tree-Based Classification
3.1.2 Support Vector Machines
3.1.3 Bayesian Classification
3.2 Text Clustering Algorithms
3.2.1 K-means Clustering
3.2.2 Hierarchical Clustering
3.2.3 Spectral Clustering
3.2.4 Other Clustering Methods
3.3 Performance Evaluation of Text Sentiment Analysis
3.3.1 Performance Metrics
3.3.2 Performance Analysis of Traditional Algorithms
3.4 Chapter Summary
Chapter 4 A Text Analysis Method Based on Spectral Clustering
4.1 Basic Principles of Spectral Clustering
4.1.1 Spectral Graph Partitioning
4.1.2 Similarity Matrix, Degree Matrix, and Laplacian Matrix
4.1.3 The Spectral Clustering Algorithm
4.2 Problems with the Spectral Clustering Algorithm
4.3 Sentiment Analysis Based on the Improved Spectral Clustering Algorithm
4.3.1 Constructing the Laplacian Matrix
4.3.2 Feature Identification with User Feedback
4.3.3 Improving Spectral Clustering with the KHM Algorithm
4.4 Experimental Results and Analysis
4.4.1 Experimental Environment and Data
4.4.2 Analysis of Results
4.5 Chapter Summary
Chapter 5 An Opinion Mining System for Online Product Reviews Based on Sentiment Analysis
5.1 Overall System Architecture
5.2 Data Collection Module
5.3 Text Sentiment Analysis Module
5.3.1 Review Text Preprocessing Module
5.3.2 Text Classification Module Based on Spectral Clustering
5.4 Chapter Summary
Chapter 6 Conclusions and Future Work
References
Author: Hou Bin
Date: 2024-11-19