Running Bayes in AWS
12 Jul 2017
This is the third post in a series on using AWS for your social stats needs. Here’s the original post on getting started and a follow-up post with some additional lessons learned.
Okay, so you’ve got R up and running on AWS and you’ve figured out how to get your code to run and walk away. Great. Now for an example where you might actually need to do this: running a Bayesian analysis.
This use case differs a little from what I’ve previously suggested AWS is for. Here, you don’t need a huge increase in power: 128 cores in a parallel environment isn’t going to save you any time. What you might be looking for instead is a machine you can start running and ignore.^1
Here are the steps to follow to use JAGS on your EC2 instance. These steps come from this very helpful post, but it’s a little out of date and there are a few additional steps you’ll need to take.
Step 0: Get your EC2 instance running and install R
I’m going to assume that you’ve already read my previous post on this and that you’re set here.
Step 1: Install JAGS
To run JAGS in R, you’ll need to have JAGS installed. I’m going to assume here that you have superuser powers (i.e., you ran sudo su and didn’t exit from that) to run these commands.
- Install the tools you’ll need to compile everything:
yum install gcc
yum install gcc-c++
yum install gcc-gfortran
yum install readline-devel
yum install make
Make should already be installed, but if for some reason it’s not, you definitely need to have that.
- Download JAGS
wget https://sourceforge.net/projects/mcmc-jags/files/JAGS/4.x/Source/JAGS-4.2.0.tar.gz
- Check https://sourceforge.net/projects/mcmc-jags/files/JAGS/4.x/Source/ to make sure that you’re getting the most recent version and, if not, change the wget command (and the file names in the tar and cd commands that follow) as appropriate.
- Install JAGS
tar xf JAGS-4.2.0.tar.gz
cd JAGS-4.2.0
./configure --libdir=/usr/local/lib64
make
make install
cd ..
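- Link the installation so R can find it. The export only lasts for the current shell session, so launch R from this same session when you install rjags.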
export PKG_CONFIG_PATH=/usr/local/lib64/pkgconfig
- You can double-check that this is, in fact, where the jags.pc file is located by running ls /usr/local/lib64/pkgconfig. If it’s not in there, you’ll have to figure out where it is and adjust accordingly.
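If you’d rather not retype the version-dependent bits of Step 1 by hand, here’s a minimal sketch of two helpers. The function names are my own invention (they’re not part of JAGS), the URL pattern is just the one from the wget command above and so only makes sense for 4.x releases, and the second pkgconfig directory in the example is a guess at where jags.pc might land if you configure without --libdir.

```shell
#!/bin/sh
# Hypothetical helpers: these function names are illustrative only.

# Print the source tarball URL for a given JAGS 4.x version,
# following the pattern of the wget command in Step 1.
jags_source_url() {
    printf 'https://sourceforge.net/projects/mcmc-jags/files/JAGS/4.x/Source/JAGS-%s.tar.gz\n' "$1"
}

# Print the first of the given directories that contains jags.pc.
# Returns non-zero if none of them does.
find_jags_pc_dir() {
    for dir in "$@"; do
        if [ -f "$dir/jags.pc" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1
}

# Example use (commented out so sourcing this file has no side effects):
#   wget "$(jags_source_url 4.2.0)"
#   if dir="$(find_jags_pc_dir /usr/local/lib64/pkgconfig /usr/local/lib/pkgconfig)"; then
#       export PKG_CONFIG_PATH="$dir"
#   fi
```

Once PKG_CONFIG_PATH is exported, pkg-config --modversion jags should print the installed version, which is a quick way to confirm that R’s build tools will be able to find it too.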
Step 2: Install rjags
Just launch R and run install.packages("rjags").
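If you’d prefer to do this from the shell without opening an interactive R session, something like the following should work. Treat it as a sketch: the function name and the RSCRIPT_BIN override are hypothetical, and the CRAN mirror URL is just one choice, so swap in whichever mirror you like. The second call is there because install.packages() only warns when a build fails, whereas library(rjags) will exit with an error if the package isn’t actually usable.

```shell
#!/bin/sh
# Sketch: install rjags non-interactively, then check that it loads.
# RSCRIPT_BIN is a hypothetical override; it defaults to plain Rscript.
# Run this from the shell where PKG_CONFIG_PATH was exported.

install_rjags() {
    # The mirror URL is an assumption; any CRAN mirror should do.
    "${RSCRIPT_BIN:-Rscript}" -e 'install.packages("rjags", repos = "https://cloud.r-project.org")' &&
    # Fails with a non-zero exit status if rjags did not install properly.
    "${RSCRIPT_BIN:-Rscript}" -e 'library(rjags)'
}

# Example use (commented out so sourcing this file has no side effects):
#   install_rjags && echo "rjags is ready"
```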
Monitoring Progress
Assuming that everything has gone well, you start your code and walk away. But how will you know when it’s finished? Well, there are two ways. The first is to take a look at the EC2 console. On the page that displays your instances, you can select “monitoring” toward the bottom of the page and take a look at how much computing power you’re using. It’s delayed by a few minutes, but it will probably show a spike when you start the code, then a flat line. When that line falls again, you’re all done.
The better option is to use screen when you start your process instead of nohup. Sure, nohup will work just fine, but you won’t be able to see any progress. If you opt for screen and then start R and run your code that way (instead of starting screen and then using R CMD BATCH, because obviously that won’t solve the monitoring issue), you can always come back later, reattach to the screen, and take a look at the nice progress bar that JAGS provides.
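For reference, here’s roughly what that screen workflow looks like as shell commands. This is a sketch under a few assumptions: the session name "jags", the function names, and the SCREEN_BIN override are all made up for illustration. The -S, -ls, and -r flags are standard screen options.

```shell
#!/bin/sh
# Sketch of the screen workflow. The default session name and the
# SCREEN_BIN override are hypothetical conveniences.
JAGS_SESSION="${JAGS_SESSION:-jags}"

# Start a named session. Inside it, launch R and run your code.
# Press Ctrl-a then d to detach; R keeps running in the background.
jags_screen_start() {
    "${SCREEN_BIN:-screen}" -S "$JAGS_SESSION"
}

# List existing sessions, to confirm yours is still alive.
jags_screen_list() {
    "${SCREEN_BIN:-screen}" -ls
}

# Reattach later to check on the progress bar.
jags_screen_attach() {
    "${SCREEN_BIN:-screen}" -r "$JAGS_SESSION"
}
```

Naming the session isn’t strictly necessary, but it saves you from hunting for the right one if you end up with more than one screen running on the instance.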