An improved random forest algorithm based on unbalanced data
Abstract:
As an ensemble of classifiers, the random forest algorithm offers good classification performance and is suitable for a wide range of classification settings, but it also has flaws. For example, it cannot distinguish the positive and negative classes when dealing with unbalanced data. We improve the Bootstrap sampling method by imposing conditions on the sampling results, which reduces the class imbalance introduced by sampling while preserving the randomness of the algorithm. We then weight each decision tree according to the imbalance coefficient of its generated training data, increasing the voting weight of the trees that are sensitive to unbalanced data and improving the classification performance of the whole algorithm on such data. With these two improvements, the new algorithm significantly improves classification performance when the number of decision trees is insufficient.
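A minimal sketch of the two ideas described above, not the authors' exact procedure: bootstrap draws are rejected until the sample meets a minimum minority-class share, and each tree's vote is weighted by the imbalance coefficient of its own bootstrap sample. All names (constrained_bootstrap, imbalance_coefficient, min_minority_ratio) and the specific weighting rule are illustrative assumptions.

```python
# Sketch of a constrained-bootstrap, weighted-vote random forest for
# binary labels in {0, 1}, with class 1 assumed to be the minority class.
# Parameter names and thresholds are illustrative, not from the paper.
import numpy as np
from sklearn.tree import DecisionTreeClassifier


def constrained_bootstrap(X, y, rng, min_minority_ratio=0.2, max_tries=50):
    """Resample with replacement, rejecting draws whose minority share is too
    low, so each tree trains on a less imbalanced sample."""
    n = len(y)
    for _ in range(max_tries):
        idx = rng.integers(0, n, size=n)
        if np.mean(y[idx] == 1) >= min_minority_ratio:
            return idx
    return idx  # fall back to the last draw if the threshold is never met


def imbalance_coefficient(y_sample):
    """Minority-to-majority count ratio; closer to 1 means more balanced."""
    pos = int(np.sum(y_sample == 1))
    neg = int(np.sum(y_sample == 0))
    return min(pos, neg) / max(pos, neg) if max(pos, neg) > 0 else 0.0


def fit_weighted_forest(X, y, n_trees=25, seed=0):
    rng = np.random.default_rng(seed)
    trees, weights = [], []
    for _ in range(n_trees):
        idx = constrained_bootstrap(X, y, rng)
        tree = DecisionTreeClassifier(max_features="sqrt",
                                      random_state=int(rng.integers(1 << 31)))
        tree.fit(X[idx], y[idx])
        trees.append(tree)
        # Assumed weighting rule: trees grown on more balanced samples
        # receive a larger say in the final vote.
        weights.append(imbalance_coefficient(y[idx]))
    weights = np.asarray(weights) / np.sum(weights)
    return trees, weights


def predict_weighted(trees, weights, X):
    """Weighted majority vote over the individual trees."""
    votes = np.array([t.predict(X) for t in trees], dtype=float)
    return (weights @ votes >= 0.5).astype(int)
```

The rejection threshold and the choice of the imbalance coefficient as the tree weight are both assumptions made for the sketch; the paper's own conditions and weighting formula may differ.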