Research and Application of Text Sentiment Analysis Methods

Author: 侯斌 (Hou Bin). Uploaded 2024-11-19. 43 pages, 704.39 KB.
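In recent years, with the rapid development of information technology, network applications
such as BBS forums, blogs, and e-commerce websites have become widespread, and subjective,
emotionally colored text created by users is now found in every corner of the Internet;
product reviews on e-commerce websites are one example. After buying and using a product,
users post their experiences as reviews to share with other users. Summarizing and analyzing
these reviews shows how many users hold a positive attitude toward a product and how many
hold a negative one, which yields the distribution of user sentiment toward a class of
products and supports the decisions of potential buyers and merchants. Mining valuable
information from hundreds of millions of product reviews has therefore become very
important. Reading through this mass of information manually to determine people's sentiment
toward a topic (or product) is time-consuming, tedious, and inefficient, so using computers
to judge users' sentiment orientation automatically through text sentiment analysis is a
research topic with good application and dissemination value.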
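The study of text sentiment analysis in this thesis finds that sentiment clustering is an
important step in sentiment orientation analysis and that the clustering result affects the
judgment of sentiment orientation. The traditional clustering algorithms currently in use
suffer from problems such as low accuracy because of their inherent limitations. This thesis
therefore proposes a sentiment analysis method based on spectral clustering. Compared with
classical clustering algorithms such as k-means, spectral clustering is simple to apply and
achieves better clustering performance.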
The research content of this thesis falls into the following parts:
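First, the thesis describes and analyzes traditional text sentiment analysis techniques in
detail. An analysis of the performance of traditional algorithms shows that, when applied to
text sentiment analysis, they are often limited to using opinion words, phrases, and
syntactic information to identify the polarity of opinion words and phrases, while ignoring
the influence of the context in which a word appears and the user's intent. This makes the
text clustering results inaccurate and in turn biases the judgment of the text's sentiment
orientation.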
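Second, an improved sentiment analysis method based on spectral clustering is proposed. The
method first obtains eigenvectors by decomposing the Laplacian matrix; these eigenvectors
correspond to feature information about different aspects of the data. The user then
evaluates part of this feature information and determines the desired clustering direction.
Finally, the system clusters automatically according to the user's selection, producing the
sentiment-aspect classification the user wants and improving the accuracy of the clustering
results to a certain extent.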
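Finally, the improved algorithm is applied to opinion mining of e-commerce product reviews.
The system is designed mainly on the basis of the improved algorithm proposed in this
thesis. The design of the key parts of the model framework is explained in detail, and the
text preprocessing module, the sentiment analysis module, and the application of the system
to real review opinions are described.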
Keywords: sentiment analysis; text classification; spectral clustering; user feedback;
sentiment orientation
ABSTRACT
In recent years, with the rapid development of information technology, network
applications such as BBS, blogs, and e-commerce websites have been widely adopted, and
users have produced a large amount of subjective text, such as product reviews on
e-commerce websites. These reviews record customers' subjective impressions after buying
and using a product. Summarizing and analyzing user reviews tells us how many users hold
a positive attitude toward a product and how many hold a negative one. This helps us
understand the distribution of user sentiment toward a class of products, so that
potential users and businesses can make decisions. Mining valuable information from the
mass of product reviews is therefore very important. Obtaining people's sentiment
tendency toward a topic (or product) from this mass of information by manual reading is
time-consuming, cumbersome, and inefficient. How to use computers to determine user
sentiment tendency automatically, based on text sentiment analysis, is thus a research
topic with good application and dissemination value.
Through research on text sentiment analysis, this paper finds that sentiment clustering
is an important step in analyzing sentiment tendency and that the clustering results
affect the judgment of sentiment tendency. However, the traditional clustering algorithms
currently in use give clustering results of low accuracy. This paper therefore proposes a
sentiment analysis method based on spectral clustering. Compared with traditional
clustering algorithms, spectral clustering is simple to apply and performs better.
The content of this paper is divided into the following parts:
First, this paper introduces and analyzes traditional text sentiment analysis techniques
in detail. Analysis of the performance of traditional algorithms shows that, when applied
to text sentiment analysis, they are often confined to using opinion words, phrases, and
syntactic information to identify the polarity of opinion words and phrases, and they
ignore the influence of the surrounding context and the intention of the user. This makes
the text clustering results inaccurate, which in turn biases the judgment of the text's
sentiment orientation.
Second, this paper puts forward an improved sentiment analysis method based on spectral
clustering. The method first obtains eigenvectors by decomposing the Laplacian matrix.
These eigenvectors correspond to feature information about different aspects of the data.
The user then reviews part of the feature information to determine the direction of
clustering. Finally, the system clusters automatically according to the user's selection,
yielding the sentiment classification the user needs and improving the accuracy of the
clustering results to a certain degree.
Finally, the improved algorithm is applied to opinion mining of online product reviews.
The system is designed mainly on the basis of the improved algorithm proposed in this
paper. The design of the key parts of the model framework is described in detail, along
with the text preprocessing module, the sentiment analysis module, and the application of
the system to real review opinions.
Keywords: sentiment analysis, text classification, spectral clustering,
user feedback, emotional tendency
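
The three steps of the improved method named in the abstract (decompose the Laplacian
matrix into eigenvectors, let the user pick the eigenvectors of interest, then cluster
automatically on that selection) can be illustrated with a minimal Python sketch. It is
not the thesis's implementation: the Gaussian similarity, the symmetric normalized
Laplacian, the farthest-first k-means initialization, and every function and parameter
name (`rbf_affinity`, `cluster_with_feedback`, `selected`, `sigma`) are assumptions
introduced here, and user feedback is reduced to a hypothetical list of eigenvector
indices. Plain k-means stands in for the KHM refinement listed in the table of contents.

```python
import numpy as np


def rbf_affinity(X, sigma=1.0):
    """Gaussian similarity between all pairs of rows; diagonal set to zero (assumed choice)."""
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    return W


def laplacian_eigenvectors(W, n_vectors):
    """Eigenvectors of L_sym = I - D^{-1/2} W D^{-1/2}, smallest eigenvalues first."""
    degree = W.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(np.maximum(degree, 1e-12))
    L_sym = np.eye(len(W)) - inv_sqrt[:, None] * W * inv_sqrt[None, :]
    _, vectors = np.linalg.eigh(L_sym)  # eigh returns eigenvalues in ascending order
    return vectors[:, :n_vectors]


def kmeans(Y, k, n_iter=100):
    """Plain k-means with a deterministic farthest-first initialization."""
    centers = [Y[0]]
    while len(centers) < k:
        # Distance from every point to its nearest already-chosen center.
        gaps = np.min([np.linalg.norm(Y - c, axis=1) for c in centers], axis=0)
        centers.append(Y[int(np.argmax(gaps))])
    centers = np.array(centers)
    labels = np.zeros(len(Y), dtype=int)
    for _ in range(n_iter):
        dists = np.linalg.norm(Y[:, None, :] - centers[None, :, :], axis=-1)
        labels = dists.argmin(axis=1)
        updated = np.array([
            Y[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(updated, centers):
            break
        centers = updated
    return labels


def cluster_with_feedback(X, k, selected, n_vectors=4, sigma=1.0):
    """Cluster rows of X using only the eigenvectors whose indices are in `selected`."""
    vectors = laplacian_eigenvectors(rbf_affinity(X, sigma), n_vectors)
    Y = vectors[:, list(selected)]  # hypothetical stand-in for the user's choice
    Y = Y / np.maximum(np.linalg.norm(Y, axis=1, keepdims=True), 1e-12)  # row-normalize
    return kmeans(Y, k)


if __name__ == "__main__":
    # Toy feature vectors: two tight groups, standing in for two kinds of reviews.
    base = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1], [0.05, 0.05]])
    X = np.vstack([base, base + 10.0])
    print(cluster_with_feedback(X, k=2, selected=[0, 1]))
```

In a real system the rows of `X` would be feature vectors extracted from review text, and
`selected` would come from the user's inspection of what each eigenvector separates.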
Abstract (Chinese)
Abstract (English)
Chapter 1 Introduction
1.1 Research background and significance
1.2 State of research
1.3 Main research content and thesis organization
Chapter 2 Sentiment Analysis and Related Techniques
2.1 Sentiment analysis
2.1.1 The concept of sentiment analysis
2.1.2 Sentiment analysis […]
2.2 Techniques related to sentiment analysis
2.2.1 Text preprocessing
2.2.2 Feature selection and extraction
2.2.3 Sentiment classification
2.2.4 Sentiment […]
2.3 Sentiment analysis […]
2.4 Solutions to the difficulties of sentiment analysis
2.5 Chapter summary
Chapter 3 Classification Methods in Sentiment Analysis
3.1 Text classification algorithms
3.1.1 Decision-tree-based classification
3.1.2 Support vector machines
3.1.3 Bayesian classification
3.2 Text clustering algorithms
3.2.1 K-means clustering
3.2.2 Hierarchical clustering
3.2.3 Spectral clustering
3.2.4 […] clustering methods
3.3 Performance evaluation of text sentiment analysis
3.3.1 Performance metrics
3.3.2 Performance analysis of traditional algorithms
3.4 Chapter summary
Chapter 4 A Text Analysis Method Based on Spectral Clustering
4.1 Basic principles of spectral clustering
4.1.1 Graph partitioning
4.1.2 […] matrix, degree matrix, and Laplacian matrix
4.1.3 The spectral clustering algorithm
4.2 Problems with spectral clustering
4.3 Sentiment analysis based on an improved spectral clustering algorithm
4.3.1 Constructing the Laplacian matrix
4.3.2 Feature […] with user feedback
4.3.3 Improving spectral clustering with the KHM algorithm
4.4 Experimental results and analysis
4.4.1 Experimental environment and data
4.4.2 Analysis of results
4.5 Chapter summary
Chapter 5 Opinion Mining of Online Product Reviews Based on Sentiment Analysis
5.1 System architecture
5.2 Data collection module
5.3 Text sentiment analysis module
5.3.1 Review text preprocessing module
5.3.2 Spectral-clustering-based text classification module
5.4 Chapter summary
Chapter 6 Conclusions and Outlook
References