Applied Research on Data Mining of Raw-Cane Sugar Content Data

Author: 李琳琳; date: 2024-10-14; length: 73 pages
Master's Thesis
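Abstract (摘要)
Producing sugar from sugar cane is unusual in two respects. First, unlike rice, wheat, and other crops, what is harvested from sugar cane is the sugar inside the plant. The sugar cannot be observed directly, so maturity has to be judged by testing. Cane has no seed dormancy, so sugar is consumed rapidly after cutting and the cane cannot be stored for long. Once the sugar content peaks, the plant begins to consume it (sugar reversion), so mature cane must be harvested as soon as possible. Second, sugar production combines industry and agriculture. Economic factors limit a mill's daily crushing capacity, which in turn forces the harvest to extend over a long period; together with differences between varieties and other factors, this creates a need to optimize the harvest. At the same time, the combination lets industry play a more active role: beyond the funding and technical support that mills give growers, the sugar-content data that mills record when cane arrives at the factory contain a great deal of information about agricultural production, and data mining can feed useful knowledge back to agriculture, especially for optimizing the harvest.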
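Data mining is the extraction of potentially interesting knowledge and information from large volumes of data. The mills' quality-based pricing system generates large amounts of sugar-content data at factory intake, which makes data mining possible. Working from the actual characteristics of sugar cane sugar-content data, this thesis describes in detail three basic data mining tasks: exploratory data analysis, predictive modeling, and cluster analysis.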
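Visual exploration of the sugar-content data revealed a management fatigue phenomenon; the thesis describes it and proposes countermeasures. Visual exploration also revealed a way to optimize harvesting across varieties: in contrast to the current practice of harvesting the cane with the highest sugar content first, harvesting the more mature varieties with relatively little remaining potential for gain can reach the global optimum and raise the total sugar yield.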
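The harvest-order argument can be made concrete with a small sketch. Everything in it is hypothetical and chosen only for illustration: the two varieties, their sugar percentages, and the assumption that the mill cuts exactly one equally sized variety per period. None of the figures come from the thesis data.

```python
from itertools import permutations

# Hypothetical sugar content (%) per variety. Index 0 is the value if the
# variety is cut in period 1, index 1 the value if it is cut in period 2.
sugar_by_period = {
    "A": [15.0, 16.5],  # highest sugar now, but still gaining
    "B": [14.5, 14.6],  # nearly mature, little left to gain
}


def total_sugar(order):
    """Sum of sugar content when variety order[i] is cut in period i."""
    return sum(sugar_by_period[v][i] for i, v in enumerate(order))


def highest_first(varieties):
    """Current practice: cut the variety with the highest sugar now first."""
    return tuple(sorted(varieties, key=lambda v: -sugar_by_period[v][0]))


def smallest_gain_first(varieties):
    """Alternative: cut the variety with the least remaining gain first."""
    def gain(v):
        return sugar_by_period[v][-1] - sugar_by_period[v][0]
    return tuple(sorted(varieties, key=gain))


def best_order(varieties):
    """Brute-force optimum over all orders (fine for a handful of varieties)."""
    return max(permutations(varieties), key=total_sugar)


names = list(sugar_by_period)
for rule in (highest_first, smallest_gain_first, best_order):
    order = rule(names)
    print(rule.__name__, order, total_sugar(order))
```

With these made-up numbers the smallest-gain-first order matches the brute-force optimum, while highest-first gives up the late gain of variety A. The sketch does not show that the rule is optimal in general; the brute-force search is the check to rely on when more varieties or periods are involved.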
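A sugar accumulation model for sugar cane is an important reference for estimating yield and setting the crushing season. Two mathematical models of sugar accumulation appear in the existing literature, the quadratic curve and the piecewise Logistic curve; which of them is better suited to yield estimation is the question addressed by the predictive modeling in this thesis. Regression on the sugar-content data shows that the Logistic curve better describes the fluctuation of sugar content under large-scale production conditions.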
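A minimal sketch of how such a comparison could be set up follows. It is not the thesis's procedure: the abstract does not give the model equations, so the forms y = c0 + c1*t + c2*t^2 and y = cap / (1 + exp(a - b*t)) are assumptions, a single-segment logistic stands in for the piecewise one, the upper limit `cap` is treated as known, and the weekly values are synthetic. Because the synthetic data are generated from a logistic curve, the printed errors illustrate the mechanics only and say nothing about real sugar data.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = c0 + c1 * x; returns (c0, c1)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    c1 = sxy / sxx
    return my - c1 * mx, c1


def fit_quadratic(ts, ys):
    """Least squares for y = c0 + c1*t + c2*t^2 via the 3x3 normal equations."""
    s = [sum(t ** k for t in ts) for k in range(5)]  # power sums of t
    r = [sum(y * t ** k for t, y in zip(ts, ys)) for k in range(3)]
    m = [[s[i + j] for j in range(3)] + [r[i]] for i in range(3)]
    for col in range(3):  # Gauss-Jordan elimination with partial pivoting
        piv = max(range(col, 3), key=lambda i: abs(m[i][col]))
        m[col], m[piv] = m[piv], m[col]
        for row in range(3):
            if row != col:
                f = m[row][col] / m[col][col]
                m[row] = [x - f * y for x, y in zip(m[row], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]


def fit_logistic(ts, ys, cap):
    """Fit y = cap / (1 + exp(a - b*t)) by regressing ln(cap/y - 1) on t."""
    zs = [math.log(cap / y - 1.0) for y in ys]  # requires 0 < y < cap
    a, slope = fit_line(ts, zs)
    return a, -slope


def sse(predict, ts, ys):
    """Sum of squared errors of a fitted model on the data."""
    return sum((predict(t) - y) ** 2 for t, y in zip(ts, ys))


# Synthetic weekly sugar content (%), not thesis data: a logistic curve
# plus a small fixed perturbation so that neither model fits exactly.
CAP = 16.0  # assumed upper limit of sugar content, treated as known
weeks = list(range(1, 13))
bumps = [0.05, -0.04, 0.03, -0.02, 0.04, -0.05,
         0.02, -0.03, 0.05, -0.04, 0.03, -0.02]
observed = [CAP / (1.0 + math.exp(0.5 - 0.3 * t)) + e
            for t, e in zip(weeks, bumps)]

c0, c1, c2 = fit_quadratic(weeks, observed)
a, b = fit_logistic(weeks, observed, CAP)

print("quadratic SSE:", sse(lambda t: c0 + c1 * t + c2 * t * t, weeks, observed))
print("logistic SSE: ", sse(lambda t: CAP / (1.0 + math.exp(a - b * t)), weeks, observed))
```

The linearized logistic fit is used here only to keep the example free of external libraries; it weights points near the upper limit heavily, so a nonlinear least-squares routine would be the more usual choice when `cap` also has to be estimated.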
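The relationship between a sugar mill and its cane-growing units can be viewed as a special kind of customer relationship. Although there is no customer acquisition or customer retention, good customer management helps optimize the harvest. Cluster analysis is among the most commonly used data mining techniques in customer relationship management. This thesis uses K-means clustering to group cane varieties and growing units by their sugar content in different weeks. The characteristics of the clusters provide a basis for deciding which cane to prioritize in each harvest period, a reference for managers studying the cultivation-related causes of high or low sugar content, and decision support for the mill's funding and technical assistance to growing units.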
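As an illustration of clustering by weekly sugar content, the sketch below runs a plain K-means (Lloyd's algorithm) on four hypothetical growing units, each described by its sugar content in three weeks. The unit names, the values, the choice of k = 2, and the use of the first k rows as starting centroids are all assumptions made for the example, not details taken from the thesis.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(p, q))


def kmeans(points, k, max_iter=100):
    """Plain K-means seeded with the first k points; returns (labels, centroids)."""
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # assignments stable, so centroids are too
            break
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical weekly sugar content (%) for four growing units.
units = {
    "unit_1": [12.0, 13.5, 14.8],  # low early, high late
    "unit_2": [14.5, 14.8, 14.6],  # high from the start
    "unit_3": [12.2, 13.4, 15.0],
    "unit_4": [14.4, 14.9, 14.5],
}

labels, centroids = kmeans(list(units.values()), k=2)
for name, lab in zip(units, labels):
    print(name, "-> cluster", lab)
```

In this toy data one cluster collects the units whose sugar is already high in the first week and the other the units that peak late, which is the kind of grouping that could suggest which units to harvest first in a given period. Seeding with the first k points keeps the example deterministic; with real data the result can depend on the seeds, so several starts are normally compared.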
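Finally, the thesis introduces the concept of precision agriculture and analyzes its relationship to data mining. On that basis it argues that precision fertilization and precision harvesting of sugar cane will open up better application prospects for data mining of sugar-content data.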
Keywords: sugar cane; data mining; sugar accumulation model; cluster analysis
ABSTRACT
Producing sugar from sugar cane is unusual in two respects. First, sugar cane
differs from rice, wheat, and other crops in that what is harvested is the sugar
inside the plant, which cannot be observed directly, so maturity must be judged
by testing. Because there is no seed dormancy, sugar is consumed quickly after
harvest and the cane cannot be stored for long. After the sugar content reaches
its maximum, the plant starts to consume it, so mature cane must be harvested as
soon as possible. Second, sugar production combines industry and agriculture.
Economic factors limit the daily crushing capacity of the sugar enterprise, so
the harvest has to extend over a long period. Given the differences between
varieties, the harvest needs to be optimized. At the same time, this combination
lets industry play a more active role: the enterprise supports planters with
funds and technology, and the sugar data it records when cane arrives at the
mill contain a great deal of information about agricultural production. Through
data mining, useful knowledge can be fed back to agricultural production,
particularly for optimizing the harvest.
Data mining extracts latent, interesting knowledge and information from large
amounts of data. The sugar enterprise's quality-based pricing system has
produced a large volume of sugar data recorded at mill intake, which makes data
mining possible. Based on the actual characteristics of the sugar data, this
thesis introduces three basic data mining tasks: exploratory data analysis,
predictive modeling, and cluster analysis.
Visual exploration of the sugar data revealed a management fatigue phenomenon;
the thesis describes the phenomenon and gives corresponding countermeasures.
Visual data exploration also revealed a way to optimize the harvest across
varieties: instead of the present practice of harvesting the cane with the
highest sugar content first, harvesting the more mature varieties with
relatively small potential for further increase can achieve the global optimum
and raise total sugar.
A sugar accumulation model for sugar cane is an important reference for
estimating output and determining the milling period. The quadratic curve and
the piecewise Logistic curve are the two mathematical models of sugar
accumulation that appear in the literature; which model is more suitable for
estimating output is the subject of the predictive modeling in this thesis.
Regression analysis of the sugar data shows that the Logistic curve is more
suitable for describing sugar fluctuation under large-scale production
conditions.
The relation between the sugar enterprise and the planting units can be
regarded as a special kind of customer relationship. Although there is no
customer acquisition or customer retention, good customer management helps
optimize the harvest. Cluster analysis is among the most commonly used data
mining techniques in customer relationship management. This thesis uses K-means
cluster analysis to cluster cane varieties and planting units by their sugar
content in different weeks. The characteristics of the clusters provide a basis
for deciding what to harvest in each period, a reference for administrators
studying the cultivation-related causes of sugar differences, and a
decision-making reference for the sugar enterprise's funding and technical
support of planting units.
Finally, the thesis introduces the concept of precision agriculture and
analyzes the relation between precision agriculture and data mining. On this
basis, it proposes that precision fertilization and precision harvesting of
sugar cane will bring better application prospects to data mining of sugar
data.
KEYWORDS: Sugar cane; Data mining; Sugar accumulation model; Cluster analysis
Contents
Abstract (摘要)
ABSTRACT
Chapter 1 Introduction
1.1 Foreword
1.2 Sugar cane
1.3 State of research in China and abroad
1.4 Purpose of this study
1.5 Main work of this thesis
Chapter 2 Data Mining
2.1 Introduction
2.2 Concepts of data mining
2.2.1 Data mining
2.2.2 Visual data mining
2.2.3 Steps of data mining
2.2.4 Data mining tools
2.3 Other related concepts
2.3.1 Mining time-series data
2.3.2 Regression prediction
2.3.3 Cluster analysis
2.3.4 The CRISP-DM process model
2.4 Summary
Chapter 3 Business Analysis
3.1 Introduction
3.2 Understanding sugar cane production
3.2.1 Sugar mills and cane-growing units
3.2.2 Main stages of harvesting
3.2.3 Some related concepts
3.3 Data understanding
3.4 Goals of data mining
3.4.1 Predictive analysis
3.4.2 Cluster analysis
3.5 Chapter summary
Chapter 4 Visual Data Exploration
4.1 Introduction
4.2 Exploring the data set
4.3 Visual data exploration
4.3.1 Trends in the sugar-content curve of sugar cane
4.3.2 Sugar-content differences between varieties
4.3.3 Differences between cane-growing units
4.4 Visual analysis of the management fatigue phenomenon
4.5 Summary
Chapter 5 Data Mining Based on Sugar Cane Sugar Content
5.1 Introduction
5.2 Prediction models for sugar cane sugar content
5.2.1 Assumptions about harvest management
5.2.2 Selection of data for analysis
5.2.3 Data preprocessing
5.2.4 Two curve models
5.2.5 Model analysis
5.3 Cluster analysis of cane-growing units
5.3.1 Data preprocessing
5.3.2 Cluster analysis of cane-growing units
5.3.3 Analysis of results
5.3.4 Further discussion
5.4 Chapter summary
Chapter 6 Conclusion