From the Canyon Edge -- :-Dustin

Tuesday, November 16, 2010

Yet another Ubuntu Archive Proxy Solution (approx)

Many developers of Ubuntu find it useful to cache all (or at least some) of the Ubuntu Archive locally.

I certainly do.

I have maintained a full copy of the Ubuntu archive for the last ~3 years. Originally, I just used rsync and slapped logic around it to make sure it did the right thing. It did most of the time.

Eventually, Jonathan Davies' ubumirror project/package simplified my mirror situation, and really made it easy to filter out some of the architectures I didn't need.

Still, this required about 400GB of disk space, and quite a bit of overnight bandwidth to keep it perfectly in sync.

Earlier this year, I learned about the approx package, and it has become my new favorite proxy solution. I did look at apt-cacher-ng, but the configuration was more complicated than I could figure out in 5 minutes, so if you can show me how to do exactly what I've done with approx, I'm all ears ;-) I also looked at squid-deb-proxy, but I didn't want to have to install additional packages on my clients, and I really wanted this to work well for network installations of Ubuntu servers.

Here's my solution...

To install, simply:
sudo apt-get install approx
Then set the URLs you want to proxy, in /etc/approx/approx.conf:
ubuntu http://archive.ubuntu.com/ubuntu
ubuntu-security http://security.ubuntu.com/ubuntu
I configured my proxy machine to listen on port 80:
sudo dpkg-reconfigure approx
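As a rough illustration of what those two config lines mean once the proxy is listening: each name in approx.conf becomes a top-level path on the proxy, and requests under that path are fetched from the matching upstream URL. The sketch below just prints that mapping. The function name approx_client_urls is made up for this example, and it assumes every repository line has the form "name URL", with comment lines and "$parameter" lines skipped.

```shell
#!/bin/sh
# Print "proxy URL -> upstream URL" for each repository in an
# approx.conf-style file.
# Assumptions (not from the post): the function name, and that repository
# lines look like "name URL" while '#' comments and '$' parameters are ignored.
approx_client_urls() {
    conf=$1   # path to the config file, e.g. /etc/approx/approx.conf
    host=$2   # proxy host name or IP, e.g. 10.1.1.11
    awk -v host="$host" '
        { c = substr($1, 1, 1) }                  # first character of first field
        c == "#" || c == "$" || NF < 2 { next }   # skip comments, parameters, blanks
        { printf "http://%s/%s -> %s\n", host, $1, $2 }
    ' "$conf"
}

# Example call on the proxy itself:
#   approx_client_urls /etc/approx/approx.conf 10.1.1.11
```

With the two entries above, this would list http://10.1.1.11/ubuntu and http://10.1.1.11/ubuntu-security alongside their upstream URLs.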
Next, I took a little shortcut in my dd-wrt router's DNSMasq options, so that I don't have to configure each and every one of my guests to point to my local mirror. I want that to happen automatically and transparently to my guests. So I set my router to authoritatively serve my local proxy's IP address as the resolution for archive.ubuntu.com and security.ubuntu.com. The additional DNSMasq options for me are:
address=/archive.ubuntu.com/security.ubuntu.com/10.1.1.11
where "10.1.1.11" is my proxy's static IP address.

This ensures that all of my guests transparently use my local proxy, without having to perform custom configuration on each.
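The DNSMasq option above takes any number of domains followed by a single IP address, so the same line can be generated for a different set of hostnames. Here is a minimal sketch of that; the function name dnsmasq_address_line is hypothetical, and 10.1.1.11 is simply the proxy address used in this post.

```shell
#!/bin/sh
# Build one dnsmasq "address=" line that sends several hostnames to one IP.
# The function name is hypothetical; it only formats text and does not
# touch the router or any dnsmasq configuration.
dnsmasq_address_line() {
    ip=$1      # address every listed name should resolve to
    shift      # remaining arguments are the hostnames
    line="address="
    for name in "$@"; do
        line="$line/$name"
    done
    printf '%s/%s\n' "$line" "$ip"
}

# Example call:
#   dnsmasq_address_line 10.1.1.11 archive.ubuntu.com security.ubuntu.com
```

Note that dnsmasq applies an address= entry to the named domain and its subdomains, so country mirrors such as us.archive.ubuntu.com should be redirected by the same line.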

Now on the proxy itself, I don't want archive.ubuntu.com to point to the localhost, as that won't work very well at all! So for that one machine, I changed its DNS to point to Google's Public DNS at 8.8.8.8.
echo "nameserver 8.8.8.8" | sudo tee /etc/resolv.conf
Alternatively, I could manually set the IP address of archive.ubuntu.com and security.ubuntu.com in that machine's /etc/hosts.

Moreover, if I ever need to disable the use of the caching proxy on a single guest, I can simply and temporarily change that machine's DNS to 8.8.8.8 as above.
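Since that change is meant to be temporary, it can help to keep a copy of the original resolver file so it is easy to put back. The following is only a sketch: the function names dns_override and dns_restore and the ".approx-bak" suffix are invented for this example, and on a real machine it would need to run as root against /etc/resolv.conf. On many systems a DHCP client or resolver daemon may also rewrite that file on its own.

```shell
#!/bin/sh
# Temporarily point a resolv.conf-style file at another nameserver, keeping
# a backup so the original can be restored later.
# Function names and the ".approx-bak" suffix are hypothetical.
dns_override() {
    file=$1     # e.g. /etc/resolv.conf
    server=$2   # e.g. 8.8.8.8
    # Back up only once, so repeated overrides keep the true original.
    [ -e "$file.approx-bak" ] || cp "$file" "$file.approx-bak" || return 1
    printf 'nameserver %s\n' "$server" > "$file"
}

dns_restore() {
    file=$1
    [ -e "$file.approx-bak" ] || return 1
    # Write the contents back in place (rather than mv) so a symlinked
    # resolv.conf stays a symlink.
    cat "$file.approx-bak" > "$file" && rm -f "$file.approx-bak"
}

# Example calls (as root):
#   dns_override /etc/resolv.conf 8.8.8.8
#   dns_restore /etc/resolv.conf
```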

I'm really finding this to be a handy way of speeding up my network installs and package upgrades on my set of Ubuntu machines at home. I'm not wasting nearly as much disk space or network bandwidth, and I don't have to configure anything on each and every client or installation.

And now that I no longer need a 500GB local disk, I will probably move my proxy into a virtual machine very soon.

I also added a custom byobu status script to track the size of the approx cache, as well as the number of files in the cache, ~/.byobu/bin/61_approx:
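#!/bin/sh
# Show the approx cache size and the number of cached .deb files.
dir=/var/cache/approx
# Total size of the cache, human-readable (first column of du's output)
du=$(du -sh "$dir" | awk '{print $1}')
# Number of .deb packages currently in the cache
count=$(find "$dir" -type f -name "*.deb" | wc -l)
printf "Prox:%s,%s" "$du" "$count"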

Cheers,
:-Dustin