From the Canyon Edge: Introducing: run-one and run-this-one

Sunday, February 6, 2011

Introducing: run-one and run-this-one

I love cronjobs! They wake me up in the morning, fetch my mail, back up my data, sync my mirrors, update my systems, check the health of my hardware and RAIDs, transcode my MythTV recordings, and so many other things...

The robotic precision of cron ensures that each subsequent job runs, on time, every time.

But cron doesn't check that the previous execution of that same job completed first -- and that can cause big trouble.

This often happens to me when I'm traveling and my backup cronjob fires while I'm on a slow uplink. It's bad news when an hourly rsync takes longer than an hour to run, and my system heads down a nasty spiral, soon seeing 2 or 3 or 10 rsyncs all running simultaneously. Dang.

For this reason, I found myself putting almost all of my cronjobs in a wrapper script that managed and respected a pid file lock, in the style of a typical UNIX sysvinit daemon. Unfortunately, this left heavily duplicated lock-handling code spread across my workstations and servers.
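For readers who have not seen the pattern, here is a minimal sketch of that kind of pid file wrapper. The original wrapper scripts are not shown in this post, so the function name with_pidfile and the way the pid file path is passed in are assumptions made for illustration only.

```shell
#!/bin/sh
# Sketch of a pid-file lock wrapper (hypothetical helper, not the
# author's actual script).
# Usage: with_pidfile /path/to/job.pid command [args...]
with_pidfile() {
    pidfile=$1
    shift
    # If the pid file names a process that is still alive, refuse to run.
    if [ -f "$pidfile" ] && kill -0 "$(cat "$pidfile")" 2>/dev/null; then
        return 1
    fi
    # Record our own pid, run the job, then clean up the pid file.
    echo "$$" > "$pidfile"
    "$@"
    status=$?
    rm -f "$pidfile"
    return "$status"
}
```

Note that the check and the write are two separate steps, so two jobs starting at the same instant can both pass the check. That race, plus the copy-and-paste burden, is a good part of why a shared tool built on flock is attractive.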

I'm proud to say, however, that I have now solved this problem on all of my servers, at least for myself, and perhaps for you too!

In Ubuntu 11.04 (Natty), you can now find a pair of utilities in the run-one package: run-one and run-this-one.

run-one

You can simply prepend the run-one utility to any command (just like time or sudo). The tool calculates the md5sum ($HASH) of the rest of the command line ($0 and $@, the command and its arguments), and then tries to obtain a lock on the file $HOME/.cache/$HASH using flock. If it obtains the lock, your command is simply executed, and the lock is released when it finishes. If not, another copy of your command is already running, and run-one quietly exits non-zero.

I can now rest assured that there will only ever be one copy of this cronjob running on my local system as $USER at a time:

  0 * * * *   run-one rsync -azP $HOME example.com:/srv/backup

If a copy is already running, subsequent calls of the same invocation will quietly exit non-zero.
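The locking behavior described above can be approximated in a few lines of shell. This is only a sketch under stated assumptions, not the actual run-one source: the function name run_one_sketch is made up, and the lock file location simply follows the $HOME/.cache/$HASH description given earlier.

```shell
#!/bin/sh
# Sketch of run-one-style locking (hypothetical function, not the real tool).
# Usage: run_one_sketch command [args...]
run_one_sketch() {
    # Hash the command and its arguments to derive the lock file name.
    lock_hash=$(printf '%s ' "$@" | md5sum | cut -d' ' -f1)
    mkdir -p "$HOME/.cache" || return 1
    # flock -n fails immediately (non-zero) if another process holds the
    # lock; otherwise it runs the command and releases the lock on exit.
    flock -n "$HOME/.cache/$lock_hash" "$@"
}
```

Because the lock name is derived from the full command line, two different rsync invocations do not block each other; only an identical invocation is refused.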

run-this-one

run-this-one is a slightly more forceful take on the same idea. Using pgrep, it finds any matching invocations owned by the user in the process table and kills those first. It then continues, behaving just like run-one (establishing the lock and executing your command).

I rely on a handful of ssh tunnels and proxies, but I suspend and resume my laptop many times a day, which can leave those ssh connections stale and hanging around for a while before they time out. For these, I want to kill any old instances of the invocation, and then start a fresh one.

I now use this code snippet in a wrapper script to establish my ssh socks proxy and a pair of local port forwarding tunnels (for squid and bip proxies):

  run-this-one ssh -N -C -D 1080 -L 3128:localhost:3128 \
    -L 7778:localhost:7778 example.com
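The kill-then-lock sequence can be sketched roughly as follows. The names kill_matching and run_this_one_sketch are hypothetical, the 5-second lock wait is an arbitrary choice, and the real run-this-one may differ in its details. The sketch assumes the procps pgrep and the util-linux flock.

```shell
#!/bin/sh
# Sketch of run-this-one-style behavior (hypothetical helpers, not the
# real tool's source).

# Terminate this user's processes whose full command line exactly
# matches the given command and arguments.
kill_matching() {
    # -f matches against the full command line, -x requires the whole
    # line to match, -u restricts the search to the current user.
    # Caveat: pgrep treats the pattern as a regular expression, so a real
    # tool would need to escape metacharacters in the command.
    for pid in $(pgrep -u "$(id -u)" -f -x "$*"); do
        kill "$pid" 2>/dev/null
    done
    return 0
}

run_this_one_sketch() {
    kill_matching "$@"
    lock_hash=$(printf '%s ' "$@" | md5sum | cut -d' ' -f1)
    mkdir -p "$HOME/.cache" || return 1
    # Wait up to 5 seconds for a just-killed holder to release the lock.
    flock -w 5 "$HOME/.cache/$lock_hash" "$@"
}
```

Waiting briefly for the lock, rather than failing immediately, covers the short window between sending the kill signal and the old process actually exiting.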

Have you struggled with this before? Do you have a more elegant solution? Would you use run-one and/or run-this-one to solve a similar problem?

You can find the code in Launchpad/bzr here, and packages for Lucid, Maverick, and Natty in a PPA here.

  bzr branch lp:run-one

  sudo apt-add-repository ppa:run-one/ppa
  sudo apt-get update
  sudo apt-get install run-one

Cheers,
:-Dustin

10 comments:

Bill Sullivan, February 6, 2011 at 11:16 PM
I've had some similar problems with backups and rsync. run-one, in particular, looks like it should make my backups go much more smoothly. I look forward to taking a look at the code. Thanks!

Anonymous, February 7, 2011 at 2:48 AM
Hi Dustin,

Just what I was looking for. I did notice it handles the locking on files in $HOME. Would it be difficult to make it user-independent by, e.g., optionally moving the lock files to /tmp or /var/lock or something?

Fred

Anonymous, February 7, 2011 at 2:59 AM
I have been saying I should write something similar myself for quite some time now :) Will give it a try!

Amit, February 7, 2011 at 7:18 AM
Very cool! My (sometimes broken) offlineimap cron job thanks you.

Dustin Kirkland, February 7, 2011 at 8:38 AM
Fred,

This is to prevent one user from DoS'ing another. I.e., one user could prevent another from running their backup job.

I am thinking about adding some special handling for the root user in the run-this-one tool, though.

Anonymous, February 7, 2011 at 2:22 PM
I've always used "lckdo" for this kind of behaviour before. Look forward to seeing how run-one compares.

Dustin Kirkland, February 7, 2011 at 3:49 PM
Thanks for the pointer, Adam, I was not aware of that tool.

From the manpage, I see:
"Now that util-linux contains a similar command named flock, lckdo is deprecated, and will be removed from some future version of moreutils."

Anonymous, February 7, 2011 at 4:37 PM
Eeek, good to know it's on the way out I guess. Will have to add that to the mental "stuff that needs changing when we dist-upgrade" list!

kvz, February 8, 2011 at 8:38 AM
Hey Dustin,

I used to solve this with Tim Kay's solo: http://timkay.com/solo/

which locks by opening a local port instead of writing a lockfile. I'd never thought of that :)

Anonymous, February 9, 2011 at 8:26 AM
Perhaps something like a RUN_ONE_LOCK_DIR environment variable could be used to tell run-one where the lockfiles are kept.

Please do not use blog comments for support requests! Blog comments do not scale well to this effect.

Instead, please use Launchpad for Bugs and StackExchange for Questions.
* bugs.launchpad.net
* stackexchange.com

Thanks,
:-Dustin

About the Author

Dustin Kirkland (Twitter, LinkedIn) is an engineer at heart, with a penchant for reducing complexity and solving problems at the cross-sections of technology, business, and people.

With a degree in computer engineering from Texas A&M University (2001), his full-time career began as a software engineer at IBM in the Linux Technology Center working on the Linux kernel and security certifications, including a one-year stint as a dedicated engineer-in-residence at Red Hat in Boston (2005). Dustin was awarded the title Master Inventor at IBM, in recognition of his prolific patent work as an inventor and reviewer with IBM's intellectual property attorneys.

Dustin first joined Canonical (2008) as an engineer (eventually, engineering manager), helping create the Ubuntu Server distribution and establishing Ubuntu as the overwhelming favorite Linux distribution in Amazon, Google, and Microsoft's cloud platforms, as well as authoring and maintaining dozens of new open source packages.

Dustin joined Gazzang (2011), a venture-backed start-up built around an open source project that he co-authored (eCryptFS), as Chief Technology Officer, and helped dozens of enterprise customers encrypt their data at rest and securely manage their keys. Gazzang was acquired by Cloudera (2014).

Having effectively monetized eCryptFS as an open source project at Gazzang, Dustin returned to Canonical (2013) as the VP of Product for Ubuntu and spent the next several years launching a portfolio of products and services (Ubuntu Advantage, Extended Security Maintenance, Canonical Livepatch, MAAS, OpenStack, Kubernetes) that continues to deliver considerable annual recurring revenue. With Canonical based in London, an 800+ work-from-home employee roster and customers spread across 40+ countries, Dustin traveled the world over, connecting with clients and colleagues steeped in rich cultural experiences.

Google Cloud (2018) recruited Dustin from Canonical to product manage Google's entrance into on-premises data centers with its GKE On-Prem (now, Anthos) offering, with a specific focus on the underlying operating system, hypervisor, and container security. This work afforded Dustin a view deep into the back end data center of many financial services companies, where he still sees tremendous opportunities for improvements in security, efficiencies, cost-reduction, and disruptive new technology adoption.

Seeking a growth-mode opportunity in the fintech sector, Dustin joined Apex Clearing (now, Apex Fintech Solutions) as the Chief Product Officer (2019), where he led several organizations including product management, field engineering, data science, and business partnerships. He drastically revamped Apex's product portfolio and product management processes, retooling away from a legacy "clearing house and custodian" and into a "software-as-a-service fintech" offering instant brokerage account opening, real-time fractional stock trading, and a secure closed-network crypto solution. He also led the acquisition and integration of Silver's tax and cost basis solution.

Drawn back into a large cap, Dustin joined Goldman Sachs (2021) as a Managing Director and Head of Platform Product Management, within the Consumer banking division, which included Marcus, and the Apple and GM credit cards. He built a cross-functional product management community and established numerous documented product management best practices, processes, and anti-patterns.

Dustin lives in Austin, Texas, with his wife Kim and their two wonderful daughters.
