Running Bayes in AWS
12 Jul 2017
Okay, so you’ve got R up and running on AWS and you’ve figured out how to get your code to run and walk away. Great. Now for an example where you might actually need to do this: running a Bayesian analysis.
This use case differs a little from the one I’ve previously suggested for AWS. Here you don’t need a huge increase in power, because 128 cores in a parallel environment isn’t going to save you any time.^1 What you’re looking for instead is a machine you can start running and then ignore.
Here are the steps to follow to use JAGS on your EC2 instance. These steps come from this very helpful post, but it’s a little out of date and there are a few additional steps you’ll need to take.
Step 0: Get your EC2 instance running and install R
I’m going to assume that you’ve already read my previous post on this and that you’re set here.
Step 1: Install JAGS
To run JAGS in R, you’ll need to have JAGS installed. I’m going to assume here that you have superuser powers (i.e. you ran `sudo su` and didn’t exit from that) to run these commands.
- Install the tools you’ll need to compile everything:
yum install gcc
yum install gcc-c++
yum install gcc-gfortran
yum install readline-devel
yum install make
Make should already be installed, but if for some reason it’s not, you definitely need it.
- Download JAGS
- Check https://sourceforge.net/projects/mcmc-jags/files/JAGS/4.x/Source/ to make sure that you’re getting the most recent version and, if not, change the `wget` command as appropriate
- Install JAGS
tar xf JAGS-4.2.0.tar.gz
- Link the installation so R can find it
- You can double check that this is, in fact, where the jags.pc file is located by running `ls /usr/local/lib64/pkgconfig`. If it’s not in there, you’ll have to figure out where it is and adjust accordingly.
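Pulling Step 1 together, here is a minimal sketch of the whole sequence as shell functions. Treat it as an outline under assumptions rather than the exact commands: the download URL is simply the SourceForge folder above plus the tarball name, the `--libdir=/usr/local/lib64` flag and the `/etc/ld.so.conf.d/jags.conf` file name are my guesses at a layout consistent with the pkgconfig path above, and the function names are made up for illustration.

```shell
#!/bin/sh
# Sketch of Step 1. Run as root. Anything marked "assumption" is illustrative.

JAGS_VERSION="4.2.0"                        # bump this if a newer release exists
JAGS_TARBALL="JAGS-${JAGS_VERSION}.tar.gz"
JAGS_SRC_DIR="${JAGS_TARBALL%.tar.gz}"      # directory the tarball unpacks into
# Assumption: folder URL from this post plus the tarball name.
JAGS_URL="https://sourceforge.net/projects/mcmc-jags/files/JAGS/4.x/Source/${JAGS_TARBALL}"
# Assumption: library directory matching /usr/local/lib64/pkgconfig.
JAGS_LIBDIR="/usr/local/lib64"

# Compilers and headers needed to build JAGS from source.
install_build_tools() {
    yum install -y gcc gcc-c++ gcc-gfortran readline-devel make
}

# Fetch the source tarball; -O fixes the local file name.
download_jags() {
    wget -O "$JAGS_TARBALL" "$JAGS_URL"
}

# Unpack, configure, compile and install (standard autotools sequence).
build_jags() {
    tar xf "$JAGS_TARBALL" &&
    (
        cd "$JAGS_SRC_DIR" &&
        ./configure --libdir="$JAGS_LIBDIR" &&
        make &&
        make install
    )
}

# Tell the dynamic linker and pkg-config where JAGS ended up.
link_jags() {
    echo "$JAGS_LIBDIR" > /etc/ld.so.conf.d/jags.conf &&   # assumption: file name
    ldconfig &&
    export PKG_CONFIG_PATH="${JAGS_LIBDIR}/pkgconfig${PKG_CONFIG_PATH:+:$PKG_CONFIG_PATH}"
}
```

Nothing runs until you call the functions, so you can source the file and go one step at a time: `install_build_tools`, `download_jags`, `build_jags`, `link_jags`. Because `PKG_CONFIG_PATH` is only exported in the current shell, install rjags from that same session.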
Step 2: Install rjags
Just launch R and run
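For reference, here is a minimal sketch of that step run from the shell rather than from an R prompt. It assumes the install is the usual CRAN one via `install.packages()`; the mirror URL and the function names are my own choices for illustration.

```shell
# Assumption: rjags comes from CRAN; the mirror below is an arbitrary pick.
CRAN_MIRROR="https://cloud.r-project.org"

# Install rjags non-interactively (compiles against the JAGS library).
install_rjags() {
    Rscript -e "install.packages('rjags', repos = '${CRAN_MIRROR}')"
}

# Loading the package is a quick check that R can find JAGS.
check_rjags() {
    Rscript -e "library(rjags)"
}
```

If `check_rjags` fails with an error about a missing JAGS library, the linking step in Step 1 is the first place to look.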
Assuming that everything has gone well, you start your code and walk away. But how will you know when it’s finished? Well, there are two ways. The first is to take a look at the EC2 console. On the page that displays your instances, you can select “monitoring” toward the bottom of the page and take a look at how much computing power you’re using. It’s delayed by a few minutes, but it will probably show a spike when you start the code, then a flat line. When that line falls again, you’re all done.
The better option is to use `screen` when you start your process instead of `nohup`. `nohup` will work just fine, but you won’t be able to see any progress. If you opt for `screen` and then start R and run your code that way (instead of starting `screen` and then using `R CMD BATCH`, because obviously that won’t solve the monitoring issue), you can always come back later, reattach to the screen, and take a look at the nice progress bar that JAGS provides.
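The screen workflow can be sketched as a few small helpers. The session name and function names are hypothetical; the `screen` options themselves (`-S`, `-ls`, `-r`) are the standard ones.

```shell
SESSION_NAME="jags-run"   # hypothetical name; pick anything memorable

# Open a named screen session. Inside it, start R and run your code
# interactively so the JAGS progress bar is printed to the terminal.
start_session() {
    screen -S "$SESSION_NAME"
}

# Detach with Ctrl-a d; the session (and R) keeps running after logout.

# List sessions, handy if you forget the name.
list_sessions() {
    screen -ls
}

# Reattach later to see how far the sampler has gotten.
reattach_session() {
    screen -r "$SESSION_NAME"
}
```

A typical run would be `start_session`, launch R and start the model, detach with Ctrl-a d, log out, and then `reattach_session` after your next login.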