CSE459 Paper Topics

CSE459 Causal Discovery Paper Topics


Some possible paper topics. Possible programs are described where appropriate. Each identifies a lecture in CSE459 which is relevant. Past lecture notes are available for follow-up; see my Bayesian AI web page.

  1. (Lecture 1): Is machine learning just computational statistics? Program: Implement some simple symbolic learning algorithms. Wipe them out with Excel.

  2. (Lecture 2): Is the causal interpretation of Bayesian nets primary? Argue that the true variable order necessarily produces the sparsest Bnet when using the Pearl net construction algorithm. Program: Write a program which, given a true net (true causal structure) and a total ordering, generate the Bnet using Pearl's algorithm. Run it N! times on various cases, illustrating the thesis.

  3. (Lecture 3): Apply the concepts of Bayesian confirmation theory to the value of randomizing selection into experimental and control groups. Program: A simulation program to generate the relevant likelihoods (a la Monte Hall, 2 envelopes).

  4. (Lecture 4): Define causal power for linear and/or discrete causal models. For linear models this might be in terms of variance accounted for; in discrete models this might be in terms of relative entropy between variables. (On last topic: see Cover and Thomas Elements of Info Theory.) Program: Write a program to report the causal power of one variable on another, using Wright's rules to factor out the different lines of influence.

  5. (Lecture 6): Generalize the concept of Markov (statistical) equivalence to experimental equivalence, using the concept of augmented causal models.

  6. (Lectures 11 & 13, 2001): Compare conditional independence learning with metric learning, in the form of the PC algorithm and K2. Program: Implement PC with an oracle (see Assignment 1, 2001, for details). Implement K2 (see Lect 13). Run them both against simple problems, both using noiseless data exactly reflecting the causal model. Compare the results. K2 requires a total ordering as input, to get started. PC provides a partial order as output. To what extent could the two be fused, with PC providing ordering data to K2? Would this make any sense?

  7. (Lecture 12, 2001): Use classification tree learning (DTREE) and CaMML to model missing data both when the missing values are and are not correlated to the observed values. Compre the results empirically with techniques that simply assume the missing values are unrelated to known data. Describe a procedure/criterion for identifying when the modeling of missing data is useful and when not. Program: Using CaMML, DTREE and possibly other programs.

  8. (Lecture 12, 2001): Implement a simple Expectation Maximization (EM) program to estimate parameters given missing data. Run it on cases where missing data are uncorrelated with measured data and also where such data are related to available data. Describe the relative effectiveness of EM under the different conditions. Can you come up with any criteria for identifying when EM filling-in is useful and when not? Program: EM algorithm.

  9. (Lecture 13, 2001): Prove that causal models in a statistical equivalence class are likelihood equivalent. Argue against Heckerman that hypothesis equivalence is in most cases an inappropriate assumption: argue that the causal interpretation is fundamental to making sense of Bayesian nets in general and the automated learning of Bayesian nets in particular. Program: Proofs are better than programs!

  10. (Lecture 13, 2001): Improve upon the MDL metric described there (some defects of which are pointed out in the lecture). Implement the new scoring metric with a simple search (e.g., greedy search starting with known constraints, or a simple stochastic search). Compare it empirically with the PC algorithm in Tetrad II (I have a copy). Program: MDL causal search program.