SCHOOL OF COMPUTER SCIENCE AND SOFTWARE ENGINEERING
MONASH UNIVERSITY
TECHNICAL REPORT 2004/165
Data Mining Cardiovascular Bayesian Networks
C R Twardy, A E Nicholson, K B Korb and J McNeil
ABSTRACT
Bayesian networks (BNs) are rapidly becoming a tool of choice for applied Artificial Intelligence. Although BNs have been used successfully for many medical diagnosis problems, there have been few applications to epidemiological data in which data mining methods play a significant role. In this paper, we apply BNs to epidemiological data, specifically to the assessment of risk for coronary heart disease (CHD). We built the BNs in two ways: (1) by knowledge engineering them from two epidemiological models of CHD in the literature, and (2) by applying a causal BN learner. We evaluated these BNs using cross-validation, comparing their performance in predicting CHD events over 10 years as measured by area under the ROC curve and Bayesian information reward. The knowledge-engineered BNs performed as well as logistic regression, while being easier to interpret. These BNs will serve as the baseline in future efforts to extend BN technology to better handle epidemiological data, specifically to predict and prevent CHD.
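
As an illustration of the first evaluation measure named above, the following is a minimal sketch of how area under the ROC curve can be computed from predicted risk scores and observed outcomes. It uses the standard rank interpretation: the probability that a randomly chosen case with a CHD event receives a higher score than a randomly chosen case without one, with ties counted as one half. The function name `auc` and the toy scores are hypothetical and are not drawn from the data or the evaluation code used in this report.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: predicted probabilities (or any risk scores) of an event.
    labels: observed outcomes, 1 for an event and 0 for no event.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy example: four cases, two with an event.
toy_scores = [0.8, 0.4, 0.6, 0.2]
toy_labels = [1, 1, 0, 0]
toy_auc = auc(toy_scores, toy_labels)
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of events from non-events. The pairwise loop is quadratic in the number of cases, which is adequate for illustration; a sort-based rank computation would be preferable for large cohorts.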