A new sampling algorithm for association rules
Abstract:
The classical Apriori algorithm for association rule mining must scan the database many times, which is time-consuming and computationally expensive, while mining from a sample instead reduces accuracy. To address both problems, this paper controls the frequent itemsets of the sample, using the frequent 1-itemsets to guide the sampling, and studies the sampling operation and precision control in association rule mining. On this basis it proposes HAC, a sampling-based association rule mining algorithm. Theoretical analysis and performance experiments show that HAC effectively reduces the size of the database to be mined, needs at least one fewer database scan, and improves mining efficiency without affecting precision.
Key words:  association rule, sampling, guide coefficient, Apriori algorithm, HAC algorithm
DOI:10.11841/j.issn.1007-4333.2007.03.069
Revised: 2006-10-25
Funding: National Natural Science Foundation of China
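
The abstract does not give the details of HAC, so the following is only a minimal sketch of the general idea it describes: use the frequent 1-itemsets obtained from one full scan to decide whether a random sample is acceptable, then run Apriori on the accepted sample. It is not the authors' algorithm. The function names, the accept/reject rule (the sample must reproduce exactly the same frequent 1-itemsets as the full database), and the retry limit are all assumptions made for illustration.

```python
import random
from itertools import combinations


def frequent_1_itemsets(transactions, min_support):
    """Return the items whose relative support is at least min_support."""
    counts = {}
    for t in transactions:
        for item in set(t):
            counts[item] = counts.get(item, 0) + 1
    n = len(transactions)
    return {item for item, c in counts.items() if c / n >= min_support}


def draw_controlled_sample(transactions, min_support, sample_size,
                           max_tries=100, seed=0):
    """Hypothetical control rule: accept a random sample only if its
    frequent 1-itemsets equal those of the full database."""
    rng = random.Random(seed)
    target = frequent_1_itemsets(transactions, min_support)  # one full scan
    for _ in range(max_tries):
        sample = rng.sample(transactions, sample_size)
        if frequent_1_itemsets(sample, min_support) == target:
            return sample
    return None  # no acceptable sample found; caller may mine the full data


def apriori(transactions, min_support):
    """Plain level-wise Apriori; returns {frozenset(itemset): support}."""
    n = len(transactions)
    tsets = [frozenset(t) for t in transactions]
    current = {frozenset([i])
               for i in frequent_1_itemsets(transactions, min_support)}
    result = {}
    k = 1
    while current:
        # Count the support of each candidate k-itemset.
        level = {}
        for cand in current:
            support = sum(1 for t in tsets if cand <= t) / n
            if support >= min_support:
                level[cand] = support
        result.update(level)
        k += 1
        # Join frequent itemsets and prune candidates that have an
        # infrequent (k-1)-subset.
        current = set()
        for a, b in combinations(list(level), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in level for sub in combinations(union, k - 1)
            ):
                current.add(union)
    return result


# Toy data, invented for this example.
db = [["a", "b"]] * 6 + [["a", "c"]] * 2 + [["b"]] * 2
sample = draw_controlled_sample(db, min_support=0.5, sample_size=5)
itemsets = apriori(sample if sample is not None else db, min_support=0.5)
```

In this sketch the only full pass over the database is the one that computes the frequent 1-itemsets; all later counting runs on the smaller sample. Whether that matches how HAC saves its scan, and how HAC bounds the loss of precision, cannot be determined from the abstract alone.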