Saturday, January 22. 2011

Amanda's Transfer Mechanisms

There's been a bit of confusion on the mailing list and IRC about how Amanda assembles transfers out of transfer elements, and how transfer mechanisms influence that.

In the final form of a transfer, any two adjacent elements must use the same mechanism. For example, an upstream element speaking XFER_MECH_PUSH_BUFFER cannot talk to a downstream element using XFER_MECH_READFD (nor, more confusingly, XFER_MECH_PULL_BUFFER). So each mechanism is an isolated definition of "here's how upstream and downstream should talk". They come in pairs because generally anything upstream can do to downstream (e.g., upstream can write to downstream's fd) can occur in reverse (e.g., downstream can read from upstream's fd).

What makes this confusing is that if you specify a set of elements which can't talk directly to one another, then xfer.c will add "glue" elements between the specified elements. To make that concrete, imagine you specify a transfer as

source-holding --> filter-xor --> dest-fd
(if you like practical examples, then imagine filter-xor is a buffer-based decompression filter, and you're pulling data from holding disk, decompressing, and sending to a pipe -- something amfetchdump would do). Here are the mechanisms supported by each element:
source-holding:
 XFER_MECH_PULL_BUFFER (output)
filter-xor:
 XFER_MECH_PULL_BUFFER (input) and
 XFER_MECH_PULL_BUFFER (output)
or
 XFER_MECH_PUSH_BUFFER (input) and
 XFER_MECH_PUSH_BUFFER (output)
dest-fd:
 XFER_MECH_WRITEFD (input)

In putting these together, source-holding and filter-xor can use the same mechanism (PULL_BUFFER). This leaves filter-xor using PULL_BUFFER for output, but dest-fd does not support this. So xfer.c adds a glue element that can speak PULL_BUFFER on input and WRITEFD on output. This element basically loops in a thread, calling upstream->pull_buffer and write(downstream->input_fd, buffer). So the final xfer looks like

 source-holding --(PULL_BUFFER)--> filter-xor --(PULL_BUFFER)--> glue --(WRITEFD)--> dest-fd
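To make the linking step concrete, here is a minimal Python sketch of the idea. It is not the algorithm in xfer.c: it simply tries every combination of supported mechanism pairs and keeps the one needing the fewest glue elements. The `ELEMENTS` table, the `link` function, and the assumption that a glue element exists for every mismatched pair of mechanisms are all hypothetical, made up for illustration.

```python
from itertools import product

# Hypothetical capability table: each element lists the (input, output)
# mechanism pairs it supports. None means "no input" or "no output".
ELEMENTS = {
    "source-holding": [(None, "PULL_BUFFER")],
    "filter-xor": [
        ("PULL_BUFFER", "PULL_BUFFER"),
        ("PUSH_BUFFER", "PUSH_BUFFER"),
    ],
    "dest-fd": [("WRITEFD", None)],
}


def link(names, elements=ELEMENTS):
    """Choose one mechanism pair per element, then insert glue where needed.

    Brute force: try every combination and keep the first one with the
    fewest boundaries where upstream's output differs from downstream's
    input. Assumes glue is available for any such mismatch.
    """
    best = None
    for choice in product(*(elements[n] for n in names)):
        mismatches = sum(
            1 for up, down in zip(choice, choice[1:]) if up[1] != down[0]
        )
        if best is None or mismatches < best[0]:
            best = (mismatches, choice)

    _, choice = best
    parts = [names[0]]
    for i in range(1, len(names)):
        out_mech = choice[i - 1][1]
        in_mech = choice[i][0]
        parts.append(f"--({out_mech})-->")
        if out_mech != in_mech:
            # Adjacent elements disagree, so a glue element goes between them.
            parts.append("glue")
            parts.append(f"--({in_mech})-->")
        parts.append(names[i])
    return " ".join(parts)


print(link(["source-holding", "filter-xor", "dest-fd"]))
```

For the three elements above, the sketch picks PULL_BUFFER for filter-xor (one glue element) over PUSH_BUFFER (which would need two), and prints the same chain as the diagram.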

Hopefully that helps to explain how the glue works.

Note that one of the cool things about this arrangement is that in most cases the complexity is in the glue, not the elements. In fact, in this case the glue provides the only thread that's required to run this transfer, so the other three element implementations don't need to manage threads at all.
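The "only thread is in the glue" point can be illustrated with a small Python sketch. None of this is Amanda code: `ListSource`, `XorFilter`, `glue_pull_to_writefd`, and `run_transfer` are hypothetical stand-ins, and a pipe read by the caller plays the role of whatever sits behind dest-fd. The source and the filter are plain objects that do work only when asked; the glue function is the single loop that drives them.

```python
import os
import threading


class ListSource:
    """Hypothetical PULL_BUFFER source: hands out buffers on request."""

    def __init__(self, buffers):
        self._buffers = list(buffers)

    def pull_buffer(self):
        # Return the next buffer, or None at end of stream.
        return self._buffers.pop(0) if self._buffers else None


class XorFilter:
    """Hypothetical PULL_BUFFER filter: pulls from upstream, XORs each byte."""

    def __init__(self, upstream, key):
        self.upstream = upstream
        self.key = key

    def pull_buffer(self):
        buf = self.upstream.pull_buffer()
        if buf is None:
            return None
        return bytes(b ^ self.key for b in buf)


def glue_pull_to_writefd(upstream, fd):
    """Glue: loop pulling buffers from upstream and writing them to fd."""
    while True:
        buf = upstream.pull_buffer()
        if buf is None:
            break
        view = memoryview(buf)
        while view:
            # os.write may write fewer bytes than requested.
            written = os.write(fd, view)
            view = view[written:]
    os.close(fd)  # signal EOF to the reader


def run_transfer(buffers, key):
    """Run source -> filter -> glue -> fd and return what arrived at the fd."""
    read_fd, write_fd = os.pipe()
    chain = XorFilter(ListSource(buffers), key)
    # The glue supplies the only thread; source and filter have none.
    glue = threading.Thread(target=glue_pull_to_writefd, args=(chain, write_fd))
    glue.start()
    chunks = []
    while True:
        chunk = os.read(read_fd, 4096)
        if not chunk:
            break
        chunks.append(chunk)
    glue.join()
    os.close(read_fd)
    return b"".join(chunks)
```

Calling `run_transfer([b"ab", b"c"], 0)` pushes three bytes through the chain unchanged, since XOR with zero is a no-op. The element classes stay simple because all of the scheduling lives in the glue loop.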

Friday, July 16. 2010

IPv6 and Amanda

Amanda joined the IPv6 revolution in November 2006 - all of the BSD-style authentication mechanisms can support IPv6 endpoints. However, it's generally agreed that this was a mistake, and in this post I will talk about why that's the case.


Continue reading "IPv6 and Amanda"

Thursday, July 8. 2010

What's New in Amanda: The End of Fragmentation

Most of my posts in this series have been about features that are available in a released version of Amanda. This time, I want to share a project I'm working on right now - one that will be available in Amanda-3.2. I'm reworking the way Amanda writes its data to tape (or any other kind of storage) to make it more efficient, more reliable, and simpler to configure.

Historically, Amanda's conservative approach to finicky tape hardware has meant that it wasted some space at the end of each tape. With the changes I'm working on, Amanda will no longer waste this space, and can also avoid some needless copying of data in most cases, with a minimum of additional risk.


Continue reading "What's New in Amanda: The End of Fragmentation"

Thursday, July 1. 2010

What's New in Amanda: Hackability

It's been a while since I've posted about recent development in Amanda, but it's not for lack of interesting topics!

Today I want to talk a little bit about Amanda's development. Historically, Amanda has always had a small, core group of developers who do the lion's share of the development work. There are probably lots of reasons for this, not least of which is that a backup application isn't the sexiest project on which to spend your spare time. But I think there's a deeper reason, and it has to do with hackability.


Continue reading "What's New in Amanda: Hackability"

Monday, April 12. 2010

Want to work on Amanda?

I've not made any secret of the fact that I want more people hacking on Amanda. This is both for selfish reasons -- many hands make light work -- and for altruistic reasons -- a broader community of developers can provide better governance for the project and long-term continuity. With a few notable exceptions, I haven't had a lot of satisfaction.

I think part of the reason is that Amanda has a steep learning curve, even within the new Perl code. The time to climb that curve is a big investment, and folks with only a small itch to scratch can't afford it.

In an effort to sweeten the pot, we (Zmanda) are offering to pay for flexible work on Amanda. Part-time or full-time, on your own schedule. Your choice of projects. Support and gratitude from the other hackers. And the option to become a full Zmanda employee if that's your bent.

Here are some possible projects, to pique your interest:

  • MySQL application (to round out the set with ampgsql)
  • Cyrus Imapd application (gnutar doesn't deal well with the application's tiny files and hard links)
  • OpenSSL for network transport, using certificates and keys for authentication
  • Database-backed backup catalog
  • Amvault upgrade
  • Handle Logical EOM (LEOM) on all devices that support it, drastically reducing the number of parts Amanda writes
  • Support for more cloud backends than just S3
  • Parallel writes to multiple devices

If you're interested, contact me (dustin@zmanda.com) and we'll work something out!

Thursday, March 25. 2010

What's New in Amanda: Postgres Backups

This is the second installment in a series of posts about recent work on Amanda.

The Application API allows Amanda to back up "structured" data -- data that cannot be handled well by 'dump' or 'tar'. Most databases fall into this category, and with the 3.1 release, Amanda ships with ampgsql, which supports backing up Postgres databases using the software's point-in-time recovery mechanism.

The how-to for this application is on the Amanda wiki.


Continue reading "What's New in Amanda: Postgres Backups"

Friday, March 12. 2010

What's New in Amanda: Transfer Architecture

Amanda's primary mission in life is to move large quantities of data around. Historically, this has been done through a patchwork of methods, each written separately and with its own quirks. POSIX pipes, TCP sockets, shared memory, on-disk cache files -- Amanda's done it all. But these multiple implementations were error-prone, difficult to maintain, and often not the most efficient approach.

In an effort to remedy this, we introduced the transfer architecture, abbreviated XFA. This was technically included in Amanda-2.6.1, but was only used by amvault. In the upcoming Amanda-3.1 release, however, the XFA is central to all recovery operations, and is used internally by the taper (the portion of the backup system that writes to devices).

This post highlights some of the features of the transfer architecture, and some of the improvements we'd like to make.


Continue reading "What's New in Amanda: Transfer Architecture"

What's New in Amanda: Automated Tests

This is the first in what will be a series of posts about recent work on Amanda. Amanda has a reputation as old and crusty -- not so! Hopefully this series will help to illustrate some of the new features we've completed, and what's coming up. I'll be cross-posting these on the Zmanda Team Blog too.

Among open-source applications, Amanda is known for being stable and highly reliable. To ensure that Amanda lives up to this reputation, we've constructed an automated testing framework (using Buildbot) that runs on every commit. I'll give some of the technical details after the jump, but I think the numbers speak for themselves. The latest release of Amanda (which will soon be 3.1.0) has 2936 tests!

These tests range from highly-focused unit tests, for example to ensure that all of Amanda's spellings of "true" are parsed correctly, all the way up to full integration: runs of amdump and the recovery applications.


Continue reading "What's New in Amanda: Automated Tests"

Wednesday, February 17. 2010

Testing Legacy Code

I just read Roy Osherove's The Art of Unit Testing with Examples in .NET, on the advice of a slashdot review. I was not terribly impressed with the book, but reading it did help me to solidify my thinking about testing and test-driven development, and put words to concepts I had come to on my own.

Rather late in the book, Osherove describes three properties of good tests.

  • Trustworthiness - Do developers believe that passing tests mean things are working? Do developers believe that failing tests indicate a real bug?
  • Maintainability - Do developers think that tests are easy to add and maintain, or are they likely to avoid writing tests when rushed?
  • Readability - Do developers often consult the unit tests to see how the system under test is supposed to work?

What most struck me was that these properties were related to developers' perceptions of the tests, not the tests themselves. Tests are as much a social artifact of a project as a technical tool.

Buildbot's Tests

Around the time I was reading this, one of the more prolific Buildbot contributors commented, "I try not to change the tests - they scare me." Buildbot's tests were badly isolated, slow, and failed intermittently. As maintainer, I had grown accustomed to saying "oh, that test fails sometimes, don't worry about it" - a trustworthiness failure. Because of the terrible isolation, changing just about anything in Buildbot would cause dozens of tests to fail, requiring repetitive editing to fix - not maintainable. And the tests consisted of long sequences of operations and assertions, written in the Twisted style, which is already not readable. As a result, even I don't know what most of the tests are actually testing. This was a bad situation for any application, but particularly embarrassing for a popular testing tool!

So I blew the tests away. Well, not really - I moved them to buildbot/broken_test/ in hopes they can be useful in writing new tests, and so that the braver souls among us can still run them. Now our metabuildbot is green, and I can legitimately ask for unit tests for new code.

There are costs associated with this move, too. A lot of people have worked very hard to write tests that have now been categorically labeled "broken," to whom all I can say is "I'm sorry". With far fewer tests and thus far worse coverage, it's also difficult to have confidence that Buildbot really works. The short-term workaround is to make a few beta releases and rely on real-world testing to suss out any problem.

So this is only the first step. We - I - still need to write real tests for the vast majority of the Buildbot code. That's particularly complicated because Buildbot's units are badly isolated, and interfaces are ill-defined. I will need to do a good bit of refactoring to bring it into compliance.

Friday, January 22. 2010

Object identity in Perl

I ran across a surprising weakness in Perl, regarding object identity. I was writing a function to handle XMsgs, which are messages used by the Amanda transfer architecture to indicate the progress of a transfer. Messages sent from any transfer element are delivered to the same handler, and in this case I needed to know which element had sent the message. Fortunately, XMsg objects have an elt attribute pointing to the sending element.


Continue reading "Object identity in Perl"

Notice

The postings on this site are my own and don't necessarily represent the opinions of Zmanda, Inc.