SCHOOL OF COMPUTER SCIENCE AND SOFTWARE ENGINEERING
MONASH UNIVERSITY


TECHNICAL REPORT 2004/165


Data Mining Cardiovascular Bayesian Networks

C R Twardy, A E Nicholson, K B Korb and J McNeil

ABSTRACT

Bayesian networks (BNs) are rapidly becoming a tool of choice for applied Artificial Intelligence. Although BNs have been used successfully for many medical diagnosis problems, there have been few applications to epidemiological data where data mining methods play a significant role. In this paper, we apply BNs to epidemiological data, specifically to the assessment of risk for coronary heart disease (CHD). We build the BNs in two ways: (1) by knowledge engineering BNs from two epidemiological models of CHD in the literature; and (2) by applying a causal BN learner. We evaluate these BNs using cross-validation, comparing their performance in predicting CHD events over 10 years as measured by area under the ROC curve and Bayesian information reward. The knowledge-engineered BNs performed as well as logistic regression, while being easier to interpret. These BNs will serve as the baseline in future efforts to extend BN technology to better handle epidemiological data, specifically to predict and prevent CHD.
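To make the first evaluation measure concrete, the following is a minimal sketch of how area under the ROC curve (AUC) can be computed from predicted risks and observed outcomes. It uses the standard pairwise (Mann-Whitney) formulation: AUC is the fraction of (event, non-event) pairs in which the event case received the higher predicted risk, with ties counted as one half. The function name `auc` and the toy data are illustrative assumptions and are not taken from the report or its experiments.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: predicted probabilities of the event (higher means more risk)
    labels: 1 if the event occurred, 0 otherwise
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one event and one non-event")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # event case correctly ranked above non-event
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (not from the report): four subjects, two events.
predicted_risk = [0.9, 0.8, 0.4, 0.1]
chd_event = [1, 0, 1, 0]
print(auc(predicted_risk, chd_event))  # 0.75
```

An AUC of 0.5 corresponds to ranking no better than chance and 1.0 to perfect ranking. Under cross-validation, a measure like this would typically be computed on each held-out fold and then summarised across folds.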