Bayesian Inference, Step 2: Running Bayesian Inference with Your Data

If you've had the opportunity to install RStudio and JAGS, and to download the associated programs needed to run Bayesian inference, it is a small step to actually getting your parameter estimates. Two of the programs from the BEST folder - BESTExample.R and BEST1G.R - allow you to run Bayesian versions of independent-samples and one-sample t-tests, respectively. All that's required is replacing the values in the "y1" and "y2" vectors with your data, separating each observation with a comma, and then running the program. Other options can be changed, such as which value you are comparing against in the one-sample t-test, but everything else can essentially remain the same.

I realize it's been a while between posts, but I'm currently in the process of applying for jobs; this should start to pick up again in mid-October once the deadlines pass, but in the meantime, wish me luck!

Bayesian Inference, Step 1: Installing JAGS On Your Machine

Common complaint: "Bayesian analysis is too hard! Also, I have kidney stones."
Solution: Make Bayesian analysis accessible and efficient through freeware that anyone can use!

These days, advances in technology, computers, and lithotripsy have made Bayesian analysis easy to implement on any personal computer. All it requires is a couple of programs and a library of scripts to run the actual process of Bayesian inference; all that needs to be supplied by you, the user, is the data you have collected. Conceptually, this is no more difficult than entering data into SAS or SPSS, and, I would argue, is easier in practice.

This can be done in R, statistical software that can interface with a variety of user-created packages and external programs. You can download one such program, JAGS, to do the MCMC sampling for building up distributions of parameter estimates, and then use those parameter estimates to brag to your friends about how you've "Gone Bayes."

All of the software and steps you need to install R, JAGS, and rjags (an R package that lets R talk to JAGS) can be found on John Kruschke's website here. Once you have that, it's simply a matter of entering your own data and letting the program do the nitty-gritty for you.

Bayesian Inference Web App That Explains (Nearly) Everything

For those still struggling to understand the concepts of Bayesian inference - in other words, all of you - there is a web app developed by Rasmus Baath which allows one to see the process unfolding in real time. As in an independent-samples t-test, we are trying to estimate the mean difference between the populations that the samples were drawn from; however, the Bayesian approach offers much richer information in the form of a distribution of parameter values, and, more importantly, shows which ones are more credible than others, given the data that have been collected.

The web app is simple to use: You input data values for your two groups, specify the number of samples and burn-in samples (although the defaults for these are fine), and then hit the Start button. The MCMC chain begins sampling the posterior distribution, which builds up a distribution of credible parameter values. The 95% of the distribution containing the most credible parameter values is labeled the highest density interval, or HDI. This same procedure is applied to all of the other parameters informed by the data, including normality, effect size, and individual group means and standard deviations, which provides a much richer set of information than null hypothesis significance testing (NHST).
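To make the sampling-then-HDI idea concrete, here is a minimal Python sketch. It is not the code behind the web app or the BEST scripts: it assumes a normal likelihood with a fixed, shared standard deviation and flat priors on the two group means, and it uses a plain random-walk Metropolis sampler. The function names, the example data, and the tuning values (step size, number of samples, burn-in) are all made up for illustration.

```python
import math
import random


def log_posterior(mu1, mu2, y1, y2, sigma):
    """Log posterior (up to a constant) for two group means, assuming flat
    priors and a normal likelihood with a fixed, shared sigma."""
    ss1 = sum((y - mu1) ** 2 for y in y1)
    ss2 = sum((y - mu2) ** 2 for y in y2)
    return -(ss1 + ss2) / (2.0 * sigma ** 2)


def sample_mean_difference(y1, y2, sigma=0.5, n_samples=20000,
                           burn_in=1000, step=0.3, seed=1):
    """Random-walk Metropolis chain; returns posterior samples of mu1 - mu2."""
    rng = random.Random(seed)
    mu1 = sum(y1) / len(y1)  # start the chain at the sample means
    mu2 = sum(y2) / len(y2)
    current = log_posterior(mu1, mu2, y1, y2, sigma)
    diffs = []
    for i in range(n_samples + burn_in):
        cand1 = mu1 + rng.gauss(0.0, step)
        cand2 = mu2 + rng.gauss(0.0, step)
        proposed = log_posterior(cand1, cand2, y1, y2, sigma)
        # Accept the candidate with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, proposed - current)):
            mu1, mu2, current = cand1, cand2, proposed
        if i >= burn_in:  # discard the burn-in portion of the chain
            diffs.append(mu1 - mu2)
    return diffs


def hdi(samples, mass=0.95):
    """Narrowest interval that contains `mass` of the samples."""
    ordered = sorted(samples)
    n_in = int(math.ceil(mass * len(ordered)))
    start = min(range(len(ordered) - n_in + 1),
                key=lambda i: ordered[i + n_in - 1] - ordered[i])
    return ordered[start], ordered[start + n_in - 1]


# Hypothetical data for two groups.
group1 = [5.1, 4.9, 5.6, 5.8, 6.0, 5.3]
group2 = [4.2, 4.8, 4.4, 4.1, 4.9, 4.5]
low, high = hdi(sample_mean_difference(group1, group2))
print(f"95% HDI for the mean difference: [{low:.2f}, {high:.2f}]")
```

If the resulting HDI excludes zero, a difference of zero is not among the most credible values under this simplified model. The real BEST model also estimates the standard deviations and a normality parameter rather than fixing them.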

Because of this I plan to start incorporating more Bayesian statistics into my posts, and also because I believe it will overtake, replace, and destroy NHST as the dominant statistical method in the next ten years, burning its crops, sowing salt in its fields, looting its stores, stampeding its women, and ravishing its cattle. All of this, coming from someone slated to teach the traditional NHST approach next semester, which, understandably, has me conflicted. On the one hand, I feel pressured to do whatever is most popular at the moment; but on the other, I am an opportunistic coward.

In any case, Bayesian inference is becoming a more tractable technique, thanks to programs that interface with the statistics package R, such as JAGS. Using this to estimate parameters for region of interest data, I think, will be a good first step for introducing Bayesian methods to neuroimagers.

Bayesian Approaches to fMRI: Thoughts

Pictured: Reverend Thomas Bayes, Creator of Bayes' Theorem and Nerd Baller

This summer I have been diligently writing up my qualification exam questions, which will effect my entry into dissertation writing. As part of the qualification exam process I opted to perform a literature review on Bayesian approaches to fMRI, with a focus on spatial priors and parameter estimation at the voxel level. This necessarily included a thorough review of the background of Bayesian inference, over the course of which I gradually became converted to the view that Bayesian inference was, indeed, more useful and more sophisticated than traditional null hypothesis significance testing (NHST) techniques, and that therefore every serious scientist should adopt it as his statistical standard.

At first, I tended to regard practitioners of Bayesian inference as oddities, harmless lunatics so convinced of the superiority of their technique as to come across as almost condescending. Like all good proselytizers, Bayesian practitioners appeared to be appalled by the putrid sea of misguided statistical inference in which their entire field had foundered, regarding their benighted colleagues as doomed unless they were injected with the appropriate Bayesian vaccine. And in no place was this zeal more evident than in their continual attempts to sap the underlying foundations of NHST and the assumptions on which it rested. At the time I considered these differences between the two approaches to be trivial, mostly because I convinced myself that any overwhelmingly large effect size acquired in NHST would be essentially equivalent to a parameter estimate calculated by the Bayesian approach.

Bayesian Superciliousness Expressed through Ironic T-shirt

However, the more I wrote, the more I thought to myself that proponents of Bayesian methods may be on to something. It finally began to dawn on me that rejecting the null hypothesis in favor of an alternative hypothesis, and actually being able to say something substantive about the alternative hypothesis itself and compare it with a range of other models, are two very different things. Consider the researcher attempting to make the case that shining light in someone's eyes produces activation in the visual cortex. (Also consider the fact that doing such a study in the good old days would get you a paper into Science, and despair.) The null hypothesis is that shining light into someone's eyes should produce no activation. The experiment is carried out, and a significant 1.0% signal change is observed in the visual cortex, with a confidence interval of [0.95, 1.05]. The null hypothesis is rejected and you accept the alternative hypothesis that shining light in someone's eyes elicits greater neural activity in this area than do periods of utter and complete darkness. So far, so good.

Then, suddenly, one of these harmless Bayesian lunatics pops out of the bushes and points out that, although a parameter value has been estimated and a confidence interval calculated stating what range of values would not be rejected by a two-tailed significance test, little has been said about the credibility of your parameter estimate. Furthermore, nothing has been said at all about the credibility of the alternative hypothesis, and how much more believable it should be as compared to the null hypothesis. These words shock you so deeply that you accidentally knock over a nearby jar of Nutella, creating a delicious mess all over your desk and reminding you that you really should screw the cap back on when you are done eating.

Bayesian inference allows the researcher to do everything mentioned in the previous paragraph, and more. First, it has the advantage of being uninfluenced by the intentions of the experimenter, the knowledge of which is inherently murky and unclear, but on which NHST "critical" values are based. (More on this aspect of Bayesian inference, as compared to NHST, can be found in a much more detailed post here.) Moreover, Bayesian analysis sheds light on concepts common to both Bayesian and NHST approaches while pointing out the disadvantages of the latter and outlining how these deficiencies are addressed and mitigated in the former, whereas the converse is not true; this stems from the fact that Bayesian inference is more mathematically and conceptually coherent, providing a single posterior distribution for each parameter and model estimate without falling back on faulty, overly conservative multiple comparison correction mechanisms which punish scientific curiosity. Lastly, Bayesian inference is more intuitive. We should intuitively expect our prior beliefs to influence our interpretation of posterior estimates, as more extraordinary claims should require correspondingly extraordinary evidence.
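The point about prior beliefs can be illustrated with Bayes' rule in odds form: posterior odds equal prior odds times the likelihood ratio. The Python sketch below uses made-up numbers (a likelihood ratio of 20 and two hypothetical priors) purely to show how the same evidence moves a plausible claim and an extraordinary one by very different amounts.

```python
def posterior_probability(prior, likelihood_ratio):
    """Posterior P(H | data) from a prior P(H) strictly between 0 and 1 and
    the likelihood ratio P(data | H) / P(data | not H), using Bayes' rule
    in odds form."""
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical numbers: identical evidence, two very different priors.
plausible = posterior_probability(prior=0.5, likelihood_ratio=20.0)
extraordinary = posterior_probability(prior=0.001, likelihood_ratio=20.0)
print(f"plausible claim:     {plausible:.3f}")
print(f"extraordinary claim: {extraordinary:.3f}")
```

With these illustrative values the plausible claim ends up at about 0.95, while the extraordinary claim stays near 0.02; it would need far stronger evidence to become believable.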

Having listened to this rhapsody on the virtues and advantages of going Bayesian, the reader may wonder how many Bayesian tests I have ever performed on my own neuroimaging data. The answer is: None.

Why is this? First of all, considering that a typical fMRI dataset comprises hundreds of thousands of voxels, and given current computational capacity, Bayesian inference for a single neuroimaging session can take a prohibitively long time. Furthermore, the only fMRI analysis package I know of that allows for Markov chain Monte Carlo (MCMC) sampling at each voxel is FSL's FLAME 1+2, although this procedure can take on the order of days for a single subject, and the results usually tend to be more or less equal to what would be produced through traditional methods. Add on top of this models which combine several levels of priors and hyperparameters which mutually constrain each other, and the computational cost increases even further. One neuroimaging technique which uses Bayesian inference in the form of spatial priors in order to anatomically constrain the strength and direction of connectivity - an approach known as dynamic causal modeling (DCM; Friston et al., 2003) - is relatively unused in the neuroimaging community (at least, outside of Friston's group), given the complexity of the approach. For these reasons, Bayesian inference has not gained much traction in the neuroimaging literature.

However, some statistical packages do allow for the implementation of Bayesian-esque concepts, such as mutually constraining parameter estimates through a process known as shrinkage. While some Bayesian adherents may balk at such weak-willed, namby-pamby compromises, in my experience these compromises can satisfy some of the intuitive concepts of Bayesian methods while allowing for more efficient computation time. One example is AFNI's 3dMEMA, which estimates the precision of each subject's parameter estimate (i.e., the inverse of the variance of that estimate), and weights the estimate in proportion to its precision. For example, a subject with less variance would be weighted more when taken to a group-level analysis, while a subject with a noisy parameter estimate would be weighted less.
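The weighting idea can be sketched in a few lines of Python. This is only a simplified inverse-variance (fixed-effects) average, not 3dMEMA's actual algorithm, which to my knowledge also models variability across subjects; the function name and the per-subject numbers below are hypothetical.

```python
def precision_weighted_mean(estimates, variances):
    """Combine per-subject estimates, weighting each by its precision
    (1 / variance). Returns the group estimate and its variance."""
    if len(estimates) != len(variances):
        raise ValueError("estimates and variances must have the same length")
    precisions = [1.0 / v for v in variances]
    total_precision = sum(precisions)
    weighted_sum = sum(p * b for p, b in zip(precisions, estimates))
    return weighted_sum / total_precision, 1.0 / total_precision


# Hypothetical percent signal change per subject, with its variance.
betas = [1.0, 1.2, 0.2]
variances = [0.01, 0.02, 0.50]  # the third subject is very noisy

weighted, group_variance = precision_weighted_mean(betas, variances)
unweighted = sum(betas) / len(betas)
print(f"unweighted mean:         {unweighted:.3f}")
print(f"precision-weighted mean: {weighted:.3f}")
```

In this toy example the noisy third subject drags the ordinary mean down, while the precision-weighted mean stays close to the two subjects whose estimates are reliable.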

Overall, while comprehensive Bayesian inference at the voxel level would be ideal, for right now it appears impractical. Some may take issue with this, but until there are further advances in computer speed, or clever methods which allow for more efficient Bayesian inference, current approaches will likely continue.