<?xml version="1.0" encoding="utf-8" ?>

<rss version="2.0" 
   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
   xmlns:admin="http://webns.net/mvcb/"
   xmlns:dc="http://purl.org/dc/elements/1.1/"
   xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
   xmlns:wfw="http://wellformedweb.org/CommentAPI/"
   xmlns:content="http://purl.org/rss/1.0/modules/content/"
   >
<channel>
    
    <title>Code V.igoro.us - Code</title>
    <link>http://code.v.igoro.us/</link>
    <description>Dustin J. Mitchell</description>
    <dc:language>en</dc:language>
    <generator>Serendipity 1.6 - http://www.s9y.org/</generator>
    <pubDate>Fri, 20 May 2011 23:07:54 GMT</pubDate>

    <image>
        <url>http://code.v.igoro.us/templates/default/img/s9y_banner_small.png</url>
        <title>RSS: Code V.igoro.us - Code - Dustin J. Mitchell</title>
        <link>http://code.v.igoro.us/</link>
        <width>100</width>
        <height>21</height>
    </image>

<item>
    <title>Nagios NSCA from Python</title>
    <link>http://code.v.igoro.us/archives/69-Nagios-NSCA-from-Python.html</link>
            <category>Code</category>
            <category>mozilla</category>
            <category>Sysadmin</category>
    
    <comments>http://code.v.igoro.us/archives/69-Nagios-NSCA-from-Python.html#comments</comments>
    <wfw:comment>http://code.v.igoro.us/wfwcomment.php?cid=69</wfw:comment>

    <slash:comments>2</slash:comments>
    <wfw:commentRss>http://code.v.igoro.us/rss.php?version=2.0&amp;type=comments&amp;cid=69</wfw:commentRss>
    

    <author>nospam@example.com (Dustin J. Mitchell)</author>
    <content:encoded>
    &lt;p&gt;I&#039;ve been working on improving the monitoring of the build slaves at Mozilla.  As part of this project, I needed to be able to submit passive check results to the Nagios servers via &lt;a onclick=&quot;_gaq.push([&#039;_trackPageview&#039;, &#039;/extlink/community.nagios.org/2009/06/11/nagios-setting-up-the-nsca-addon-for-passive-checks/&#039;]);&quot;  href=&quot;http://community.nagios.org/2009/06/11/nagios-setting-up-the-nsca-addon-for-passive-checks/&quot;&gt;NSCA&lt;/a&gt; during system startup.  I&#039;m doing this from a Python script that needs to run on a wide array of systems using whatever random Python is available.  We run some oddball stuff, so the common denominator is Python 2.4.&lt;/p&gt;

&lt;p&gt;It turns out that there&#039;s no Python NSCA library, although there is &lt;a onclick=&quot;_gaq.push([&#039;_trackPageview&#039;, &#039;/extlink/search.cpan.org/dist/Net-Nsca/lib/Net/Nsca.pm&#039;]);&quot;  href=&quot;http://search.cpan.org/dist/Net-Nsca/lib/Net/Nsca.pm&quot;&gt;Net::Nsca&lt;/a&gt; in Perl.  So, I wrote one, and put it on github: &lt;a onclick=&quot;_gaq.push([&#039;_trackPageview&#039;, &#039;/extlink/github.com/djmitche/pynsca&#039;]);&quot;  href=&quot;https://github.com/djmitche/pynsca&quot;&gt;https://github.com/djmitche/pynsca&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;At the moment, this only knows XOR, and only does service checks.  That&#039;s all I need, but hopefully it can be easily expanded to cover other purposes.  The one thing I want to avoid is adding mandatory requirements -- this should work, at least in plain-text and XOR modes, on a plain-vanilla Python installation.&lt;/p&gt;
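
&lt;p&gt;For readers who have not met NSCA&#039;s XOR mode, here is a minimal sketch of what it amounts to, assuming the usual NSCA scheme: each byte of the packet is XORed with the initialization vector sent by the server, cycling over it, and then with the shared password.  The name &lt;tt&gt;xor_obfuscate&lt;/tt&gt; and the sample values are made up for illustration; this is not the code in &lt;i&gt;pynsca&lt;/i&gt;, and it is written for Python 3 rather than 2.4.&lt;/p&gt;

```python
from itertools import cycle


def xor_obfuscate(packet, iv, password):
    """XOR packet bytes with the IV, then with the password (both cycled)."""
    out = bytes(b ^ k for b, k in zip(packet, cycle(iv)))
    if password:
        out = bytes(b ^ k for b, k in zip(out, cycle(password)))
    return out


# Illustrative values only; a real IV arrives in the server's first packet.
iv = bytes(range(16))
password = b"secret"
packet = b"host\tservice\t0\tOK"
scrambled = xor_obfuscate(packet, iv, password)
# XOR is its own inverse, so applying it twice restores the packet.
restored = xor_obfuscate(scrambled, iv, password)
```

&lt;p&gt;Because the same function both scrambles and unscrambles, this mode needs nothing beyond the standard library, which fits the goal of avoiding mandatory requirements.&lt;/p&gt;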

&lt;p&gt;By the way, the startup script I&#039;m working on is &lt;a onclick=&quot;_gaq.push([&#039;_trackPageview&#039;, &#039;/extlink/hg.mozilla.org/build/puppet-manifests/file/tip/modules/buildslave/files/runslave.py&#039;]);&quot;  href=&quot;http://hg.mozilla.org/build/puppet-manifests/file/tip/modules/buildslave/files/runslave.py&quot;&gt;runslave.py&lt;/a&gt;, which includes a modified copy of &lt;i&gt;pynsca&lt;/i&gt; and does a number of other housekeeping jobs as well.  More on that in a subsequent post. &lt;/p&gt;
 
    </content:encoded>

    <pubDate>Fri, 20 May 2011 16:55:11 -0500</pubDate>
    <guid isPermaLink="false">http://code.v.igoro.us/archives/69-guid.html</guid>
    
</item>
<item>
    <title>Amanda's Transfer Mechanisms</title>
    <link>http://code.v.igoro.us/archives/68-Amandas-Transfer-Mechanisms.html</link>
            <category>amanda</category>
    
    <comments>http://code.v.igoro.us/archives/68-Amandas-Transfer-Mechanisms.html#comments</comments>
    <wfw:comment>http://code.v.igoro.us/wfwcomment.php?cid=68</wfw:comment>

    <slash:comments>0</slash:comments>
    <wfw:commentRss>http://code.v.igoro.us/rss.php?version=2.0&amp;type=comments&amp;cid=68</wfw:commentRss>
    

    <author>nospam@example.com (Dustin J. Mitchell)</author>
    <content:encoded>
    &lt;p&gt;There&#039;s been a bit of confusion on the mailing list and IRC about how Amanda assembles transfers out of transfer elements, and how transfer mechanisms influence that.  &lt;/p&gt;

&lt;p&gt;In the final form of a transfer, any two adjacent elements must have
the &lt;em&gt;same&lt;/em&gt; mechanism.  For example, an upstream element speaking
XFER_MECH_PUSH_BUFFER cannot talk to a downstream element using
XFER_MECH_READFD (nor, more confusingly, XFER_MECH_PULL_BUFFER).  So each mechanism is an
isolated definition of &quot;here&#039;s how upstream and downstream should
talk&quot;.  They come in pairs because generally anything upstream can do
to downstream (e.g., upstream can write to downstream&#039;s fd) can occur
in reverse (e.g., downstream can read from upstream&#039;s fd).&lt;/p&gt;

&lt;p&gt;What makes this confusing is that if you specify a set of elements which can&#039;t talk directly to one another, then xfer.c will add &quot;glue&quot; elements &lt;em&gt;between&lt;/em&gt; the specified elements.  To make that concrete, imagine you specify a transfer as&lt;/p&gt;

&lt;pre&gt;source-holding --&gt; filter-xor --&gt; dest-fd&lt;/pre&gt;

(if you like practical examples, then imagine filter-xor is a buffer-based decompression filter, and you&#039;re pulling data from holding disk, decompressing, and sending to a pipe -- something amfetchdump would do).  Here are the mechanisms supported by each element:

&lt;pre&gt;
source-holding:
 XFER_MECH_PULL_BUFFER
&lt;/pre&gt;

&lt;pre&gt;
filter-xor:
 XFER_MECH_PULL_BUFFER (input) and
 XFER_MECH_PULL_BUFFER (output)
or
 XFER_MECH_PUSH_BUFFER (input) and
 XFER_MECH_PUSH_BUFFER (output)
&lt;/pre&gt;

&lt;pre&gt;
dest-fd:
 XFER_MECH_WRITEFD (input)
&lt;/pre&gt;

&lt;p&gt;In putting these together, source-holding and filter-xor can use the same mechanism (PULL_BUFFER).  This leaves filter-xor using PULL_BUFFER for output, but dest-fd does not support this.  So xfer.c adds a glue element that can speak PULL_BUFFER on input and WRITEFD on output.  This element basically loops in a thread, calling upstream-&gt;pull_buffer and write(downstream-&gt;input_fd, buffer).  So the final xfer looks like&lt;/p&gt;

&lt;pre&gt;
 source-holding --(PULL_BUFFER)--&gt; filter-xor --(PULL_BUFFER)--&gt; glue --(WRITEFD)--&gt; dest-fd
&lt;/pre&gt;
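
&lt;p&gt;To make the glue element concrete, here is a rough Python sketch of a PULL_BUFFER-to-WRITEFD loop.  It is not Amanda&#039;s C implementation: &lt;tt&gt;ListSource&lt;/tt&gt;, &lt;tt&gt;glue_pull_to_writefd&lt;/tt&gt; and &lt;tt&gt;run_demo&lt;/tt&gt; are hypothetical names, and an ordinary pipe stands in for dest-fd.&lt;/p&gt;

```python
import os
import threading


class ListSource:
    """Hypothetical upstream element that hands out buffers on request."""

    def __init__(self, chunks):
        self._chunks = list(chunks)

    def pull_buffer(self):
        # Return the next buffer, or None once the data is exhausted.
        return self._chunks.pop(0) if self._chunks else None


def glue_pull_to_writefd(upstream, fd):
    """Pull buffers from upstream and write them to fd until EOF."""
    while True:
        buf = upstream.pull_buffer()
        if buf is None:
            break
        view = memoryview(buf)
        while len(view) != 0:
            # os.write may accept only part of the buffer, so keep going.
            written = os.write(fd, view)
            view = view[written:]
    os.close(fd)  # closing the fd signals EOF to the downstream reader


def run_demo():
    read_fd, write_fd = os.pipe()
    source = ListSource([b"hello, ", b"glue"])
    # The glue supplies the only thread; the source does no threading itself.
    glue = threading.Thread(target=glue_pull_to_writefd, args=(source, write_fd))
    glue.start()
    received = b""
    while True:
        chunk = os.read(read_fd, 4096)
        if not chunk:
            break
        received += chunk
    glue.join()
    os.close(read_fd)
    return received
```

&lt;p&gt;Calling &lt;tt&gt;run_demo()&lt;/tt&gt; plays the part of the downstream element, reading from the pipe whatever the glue thread has written.&lt;/p&gt;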

&lt;p&gt;Hopefully that helps to explain how the glue works.&lt;/p&gt;

&lt;p&gt;Note that one of the cool things about this arrangement is that in most cases the complexity is in the glue, not the elements.  In fact, in this case the glue provides the only thread that&#039;s required to run this transfer, so the other three element implementations don&#039;t need to manage threads at all. &lt;/p&gt;
 
    </content:encoded>

    <pubDate>Sat, 22 Jan 2011 14:47:13 -0600</pubDate>
    <guid isPermaLink="false">http://code.v.igoro.us/archives/68-guid.html</guid>
    
</item>
<item>
    <title>virtualenv for Perl</title>
    <link>http://code.v.igoro.us/archives/66-virtualenv-for-Perl.html</link>
            <category>Code</category>
    
    <comments>http://code.v.igoro.us/archives/66-virtualenv-for-Perl.html#comments</comments>
    <wfw:comment>http://code.v.igoro.us/wfwcomment.php?cid=66</wfw:comment>

    <slash:comments>2</slash:comments>
    <wfw:commentRss>http://code.v.igoro.us/rss.php?version=2.0&amp;type=comments&amp;cid=66</wfw:commentRss>
    

    <author>nospam@example.com (Dustin J. Mitchell)</author>
    <content:encoded>
    &lt;p&gt;I absolutely love &lt;a onclick=&quot;_gaq.push([&#039;_trackPageview&#039;, &#039;/extlink/pypi.python.org/pypi/virtualenv&#039;]);&quot;  href=&quot;http://pypi.python.org/pypi/virtualenv&quot;&gt;virtualenv&lt;/a&gt; for Python development.  It allows me to develop &lt;a onclick=&quot;_gaq.push([&#039;_trackPageview&#039;, &#039;/extlink/buildbot.net&#039;]);&quot;  href=&quot;http://buildbot.net&quot;&gt;Buildbot&lt;/a&gt; against several versions of Python and several versions of its dependencies, without modifying my system&#039;s Python installation at all!&lt;/p&gt;

&lt;p&gt;Now, I need to do the same thing in Perl.  So I thought I&#039;d compare the two side-by-side.&lt;/p&gt;

&lt;h2&gt;virtualenv&lt;/h2&gt;

&lt;pre&gt;
# install virtualenv locally
wget http://bitbucket.org/ianb/virtualenv/raw/tip/virtualenv.py
# set up a sandbox
python virtualenv.py sandbox
# activate it
source sandbox/bin/activate
# start installing stuff
easy_install buildbot
&lt;/pre&gt;

&lt;h2&gt;local::lib&lt;/h2&gt;

&lt;p&gt;There is no local::lib gentoo ebuild!  I&#039;m sure there&#039;s a good reason, but that&#039;s odd all the same!&lt;/p&gt;

&lt;pre&gt;
# install local::lib locally
wget http://search.cpan.org/CPAN/authors/id/G/GE/GETTY/local-lib-1.006007.tar.gz
tar -zxf local-lib-1.006007.tar.gz
cd local-lib-1.006007
perl Makefile.PL --bootstrap # (accept lots of defaults)
make test &amp;&amp; make install
# activate it (permanently)
echo &#039;eval $(perl -I$HOME/perl5/lib/perl5 -Mlocal::lib)&#039; &gt;&gt;~/.bashrc
eval $(perl -I$HOME/perl5/lib/perl5 -Mlocal::lib)
# start installing stuff
perl -MCPAN -e install Config::General # (I have to accept the same defaults again??)
&lt;/pre&gt;

&lt;p&gt;I think virtualenv is the clear winner here!&lt;/p&gt;
 
    </content:encoded>

    <pubDate>Thu, 11 Nov 2010 10:48:43 -0600</pubDate>
    <guid isPermaLink="false">http://code.v.igoro.us/archives/66-guid.html</guid>
    
</item>
<item>
    <title>IPv6 and Amanda</title>
    <link>http://code.v.igoro.us/archives/61-IPv6-and-Amanda.html</link>
            <category>amanda</category>
            <category>Sysadmin</category>
    
    <comments>http://code.v.igoro.us/archives/61-IPv6-and-Amanda.html#comments</comments>
    <wfw:comment>http://code.v.igoro.us/wfwcomment.php?cid=61</wfw:comment>

    <slash:comments>0</slash:comments>
    <wfw:commentRss>http://code.v.igoro.us/rss.php?version=2.0&amp;type=comments&amp;cid=61</wfw:commentRss>
    

    <author>nospam@example.com (Dustin J. Mitchell)</author>
    <content:encoded>
    &lt;p&gt;Amanda joined the IPv6 revolution in November 2006 - all of the BSD-style authentication mechanisms can support IPv6 endpoints.  However, it&#039;s generally agreed that this was a mistake, and in this post I will talk about why that&#039;s the case. First, a bit of background on how Amanda&#039;s networking code works, and what had to change to support IPv6.  Amanda supports security mechanisms called BSD (the oldest), BSDUDP, and BSDTCP.  These all authenticate (if you can call it that) using the same sorts of checks that rsh uses.  The incoming connection is accepted if:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;it is from a &quot;reserved&quot; port (less than 1024);&lt;/li&gt;
&lt;li&gt;the address of the initiator has complementary forward and reverse DNS records in place; and&lt;/li&gt;
&lt;li&gt;the initiator&#039;s hostname is in &lt;tt&gt;.amandahosts&lt;/tt&gt;&lt;/li&gt;
&lt;/ul&gt;
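
&lt;p&gt;The three checks can be sketched in a few lines of Python.  This is only an illustration under simplifying assumptions, not Amanda&#039;s code: the DNS lookups are passed in as plain functions backed by toy tables, &lt;tt&gt;is_accepted&lt;/tt&gt; is a hypothetical name, and &lt;tt&gt;.amandahosts&lt;/tt&gt; is reduced to a set of hostnames.&lt;/p&gt;

```python
def is_accepted(port, address, reverse_lookup, forward_lookup, allowed_hosts):
    """Apply the three rsh-style checks to an incoming connection."""
    # 1. The connection must come from a reserved port (1 through 1023).
    if port not in range(1, 1024):
        return False
    # 2. Reverse DNS must name a host whose forward records include the address.
    hostname = reverse_lookup(address)
    if hostname is None or address not in forward_lookup(hostname):
        return False
    # 3. That hostname must be listed in .amandahosts.
    return hostname in allowed_hosts


# Toy DNS tables standing in for real lookups (documentation addresses).
REVERSE = {"192.0.2.10": "backup.example.com"}
FORWARD = {"backup.example.com": ["192.0.2.10"]}


def reverse_lookup(address):
    return REVERSE.get(address)


def forward_lookup(hostname):
    return FORWARD.get(hostname, [])


allowed = {"backup.example.com"}
ok = is_accepted(900, "192.0.2.10", reverse_lookup, forward_lookup, allowed)
high_port = is_accepted(40000, "192.0.2.10", reverse_lookup, forward_lookup, allowed)
unknown = is_accepted(900, "192.0.2.99", reverse_lookup, forward_lookup, allowed)
```

&lt;p&gt;Only the first call passes; the second fails the port check and the third has no matching DNS records.&lt;/p&gt;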

&lt;p&gt;During a backup operation, the Amanda server contacts each client host.  When using the BSD authentications, this triggers &lt;tt&gt;amandad&lt;/tt&gt;, which checks the above restrictions before beginning communication with the server.  This initial connection is packet-based, and can be carried out over UDP (for BSD and BSDUDP) or TCP (BSDTCP).  When a dump begins, several &quot;streams&quot; are opened to transmit the data, index, and metadata.  For BSD and BSDUDP, each stream is implemented as a distinct TCP connection, where the client sends a port number to the server and the server connects to that port.  BSDTCP multiplexes all streams over a single TCP connection using a basic type/length packet encapsulation.&lt;/p&gt;
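
&lt;p&gt;As a rough illustration of multiplexing streams over one connection, the sketch below frames each chunk with a stream number and a length.  The header layout (two 32-bit integers in network byte order) and the function names are assumptions made for the example, not Amanda&#039;s actual wire format.&lt;/p&gt;

```python
import struct

HEADER = struct.Struct("!II")  # hypothetical layout: stream id, payload length


def encode_frame(stream_id, payload):
    """Prefix a payload with its stream id and length."""
    return HEADER.pack(stream_id, len(payload)) + payload


def decode_frames(data):
    """Split concatenated frames into (stream_id, payload) pairs."""
    frames = []
    offset = 0
    while offset != len(data):
        stream_id, length = HEADER.unpack_from(data, offset)
        offset += HEADER.size
        frames.append((stream_id, data[offset:offset + length]))
        offset += length
    return frames


# Interleave two logical streams on one "connection".
wire = (
    encode_frame(1, b"data block")
    + encode_frame(2, b"index entry")
    + encode_frame(1, b"more data")
)
streams = {}
for stream_id, payload in decode_frames(wire):
    streams.setdefault(stream_id, []).append(payload)
```

&lt;p&gt;After decoding, &lt;tt&gt;streams&lt;/tt&gt; holds the chunks of each logical stream in order, even though they travelled interleaved.&lt;/p&gt;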

&lt;p&gt;The first challenge in adding IPv6 support was to deal properly with IPv6 addresses when querying the DNS.  That meant switching to getaddrinfo and getnameinfo, as suggested by &lt;a onclick=&quot;_gaq.push([&#039;_trackPageview&#039;, &#039;/extlink/www.kame.net/newsletter/19980604/&#039;]);&quot;  href=&quot;http://www.kame.net/newsletter/19980604/&quot;&gt;Jun-ichiro itojun Itoh&lt;/a&gt;. These functions bring their own compatibility problems, but Amanda uses &lt;a onclick=&quot;_gaq.push([&#039;_trackPageview&#039;, &#039;/extlink/www.gnu.org/software/gnulib/&#039;]);&quot;  href=&quot;http://www.gnu.org/software/gnulib/&quot;&gt;gnulib&lt;/a&gt;, which provides compatible implementations on systems where they are not available, minimizing the difficulty.&lt;/p&gt;

&lt;p&gt;We had a lot of trouble from systems such as RHEL3 possessing IPv6 support in the compiler environment but not in the kernel.  On such systems, code using constants like AF_INET6 or AI_V4MAPPED would compile without problems, but fail at runtime.  We added a WORKING_IPV6 preprocessor conditional, without which all references to IPv6-related symbols were removed.  At configure time, Amanda tries to create an IPv6 socket, and sets this conditional to true if it succeeds.  The &lt;tt&gt;--without-ipv6&lt;/tt&gt; configure option forcibly disables IPv6 support.&lt;/p&gt;
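
&lt;p&gt;The same idea, probing for a working IPv6 socket rather than trusting the headers, looks roughly like this in Python.  It is a runtime analogue for illustration only, not Amanda&#039;s configure-time test; &lt;tt&gt;ipv6_socket_works&lt;/tt&gt; is a hypothetical name, and the variable merely borrows the name of the preprocessor conditional.&lt;/p&gt;

```python
import socket


def ipv6_socket_works():
    """Return True if this host can actually create an IPv6 socket."""
    if not socket.has_ipv6:
        return False  # this Python build lacks IPv6 support altogether
    try:
        probe = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
    except OSError:
        return False  # the constants exist, but the kernel refuses
    probe.close()
    return True


WORKING_IPV6 = ipv6_socket_works()
```

&lt;p&gt;The result depends on the machine it runs on, which is exactly the point: the constants being defined says nothing about whether the kernel will cooperate.&lt;/p&gt;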

&lt;p&gt;The sockaddr structures and API for IPv6 are fairly difficult to use, particularly if it&#039;s not known in advance what sort of address they will contain.  We added a set of macros and utility functions in &lt;a onclick=&quot;_gaq.push([&#039;_trackPageview&#039;, &#039;/extlink/github.com/zmanda/amanda/blob/master/common-src/sockaddr-util.c&#039;]);&quot;  href=&quot;http://github.com/zmanda/amanda/blob/master/common-src/sockaddr-util.c&quot;&gt;sockaddr-util.c&lt;/a&gt; and &lt;a onclick=&quot;_gaq.push([&#039;_trackPageview&#039;, &#039;/extlink/github.com/zmanda/amanda/blob/master/common-src/sockaddr-util.h&#039;]);&quot;  href=&quot;http://github.com/zmanda/amanda/blob/master/common-src/sockaddr-util.h&quot;&gt;sockaddr-util.h&lt;/a&gt;.  Using these macros throughout Amanda removed a significant amount of code that was conditionalized on both compile-time support and runtime address family, and centralized that logic in one easily-maintained place.&lt;/p&gt;

&lt;p&gt;On our build systems, we had to deal with different levels of support in the compile environment and the kernel.  This is fine: most Amanda users install binary packages that are produced on roughly the same OS distribution and version as was used for the build, so the kernel support is generally the same.  However, a third variable has tripped up lots of Amanda users: system configuration.  In particular, several newer Linux distributions have shipped with &lt;tt&gt;localhost&lt;/tt&gt; resolving to ::1 via &lt;tt&gt;/etc/hosts&lt;/tt&gt;, but without enough interface configuration to actually utilize a socket bound to that address.  Amanda uses localhost sockets for inter-process communication, so this misconfiguration causes backup operations to fail.  The solution is to finish configuring IPv6 on the host, remove the reference to ::1 in &lt;tt&gt;/etc/hosts&lt;/tt&gt;, or build Amanda with &lt;tt&gt;--without-ipv6&lt;/tt&gt;.&lt;/p&gt;

&lt;p&gt;I have not yet heard of an Amanda installation where IPv6 communication is in use.  But I have heard from countless IPv4 users whose Amanda installations have failed due to bad IPv6 support.  At the moment, then, I feel that adding IPv6 support to Amanda has been a net negative for the project.  Although there is doubtless room for improvement, I will not entertain patches for better IPv6 support, for fear they will introduce new bugs for our exclusively IPv4 userbase.&lt;/p&gt;

&lt;p&gt;Of course, all of this may change as dual-stack networks grow more prevalent and are replaced by pure IPv6 networks!&lt;/p&gt;
 
    </content:encoded>

    <pubDate>Fri, 16 Jul 2010 22:04:00 -0500</pubDate>
    <guid isPermaLink="false">http://code.v.igoro.us/archives/61-guid.html</guid>
    
</item>
<item>
    <title>What's New in Amanda: The End of Fragmentation</title>
    <link>http://code.v.igoro.us/archives/59-Whats-New-in-Amanda-The-End-of-Fragmentation.html</link>
            <category>amanda</category>
    
    <comments>http://code.v.igoro.us/archives/59-Whats-New-in-Amanda-The-End-of-Fragmentation.html#comments</comments>
    <wfw:comment>http://code.v.igoro.us/wfwcomment.php?cid=59</wfw:comment>

    <slash:comments>0</slash:comments>
    <wfw:commentRss>http://code.v.igoro.us/rss.php?version=2.0&amp;type=comments&amp;cid=59</wfw:commentRss>
    

    <author>nospam@example.com (Dustin J. Mitchell)</author>
    <content:encoded>
    &lt;p&gt;Most of my posts in this series have been about features that are available in a released version of Amanda.  This time, I want to share a project I&#039;m working on right now - one that will be available in Amanda-3.2.  I&#039;m reworking the way Amanda writes its data to tape (or any other kind of storage) to make it more efficient, more reliable, and simpler to configure.&lt;/p&gt;

&lt;p&gt;Historically, Amanda&#039;s conservative approach to finicky tape hardware has meant that it wasted some space at the end of each tape.  With the changes I&#039;m working on, Amanda will no longer waste this space, and can also avoid some needless copying of data in most cases, with a minimum of additional risk.&lt;/p&gt;

&lt;h1&gt;Amanda&#039;s Storage Format&lt;/h1&gt;

&lt;p&gt;Before examining the new functionality, let&#039;s look at Amanda&#039;s storage format.  Amanda treats all storage devices like tapes&lt;sup&gt;1&lt;/sup&gt; - a set of sequentially numbered data files of arbitrary size, each composed of a sequence of fixed-size blocks.  Each file begins with a one-block header that identifies the dump and gives information about its contents.  The header is followed by blocks of raw data.&lt;/p&gt;

&lt;p&gt;Amanda supports writing a dump across multiple tapes - spanning.&lt;sup&gt;2&lt;/sup&gt;  The technique is this: a dump is split into a sequence of parts, and each part is written as a single file on a volume.  During recovery, Amanda reads the parts in sequence, and concatenates their data to reproduce the original dumpfile.  Usually all parts are the same size, and this size is generally 1-5% of the tape capacity.&lt;/p&gt;

&lt;h1&gt;Better Safe than Sorry - At a Cost&lt;/h1&gt;

&lt;p&gt;Amanda was originally designed around tape drives - in fact, if you look at the history of Linux kernel support for tape drives, it is closely intertwined with Amanda development.  Tape drives are finicky beasts, and in many cases cannot distinguish the end of the tape (called EOM) from any other fatal error.  Worse, tape drives employ large caches to ensure they can write continuously, and when an error occurs all of the data in that cache is lost, and there is no way for Amanda to determine how much actually made it onto the tape.  Beginning a new on-tape file (writing a filemark) flushes the cache and signals any errors immediately.&lt;/p&gt;

&lt;p&gt;Since time immemorial, then, Amanda has treated any error from the tape drive as EOM, and assumed that all data written since the last filemark is potentially corrupt.  That means that the part in progress when the error occurred is logged as PARTIAL, and Amanda will start at the beginning of that part on the next tape.&lt;/p&gt;

&lt;p&gt;The PARTIAL part is recorded in the catalog, but will not be used for recovery, so it is effectively wasted space.  A little arithmetic will tell you that, on average, each tape will waste half of the part size.  This is at least excusable with real tape drives; with vtapes (disk), this wasted space is completely unnecessary.  Worse, vtapes are most flexible when they are kept small and dumps are spanned over many vtapes per night; but the wasted space increases linearly with the number of vtapes used.&lt;/p&gt;
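
&lt;p&gt;The arithmetic can be checked with a small model.  This is a simplified sketch with made-up numbers: it ignores headers and filemarks, assumes every part has the same size, and treats whatever does not hold a whole part at the end of a tape as the discarded PARTIAL part.  The function names are hypothetical.&lt;/p&gt;

```python
def wasted_on_tape(capacity, part_size):
    """Space taken by the PARTIAL part that is cut off at end of tape."""
    # Whole parts fit; whatever remains holds a part that must be rewritten.
    return capacity % part_size


def average_waste(part_size, capacities):
    """Mean wasted space per tape over a collection of tape capacities."""
    capacities = list(capacities)
    return sum(wasted_on_tape(c, part_size) for c in capacities) / len(capacities)


# Hypothetical units: part size 100, capacities spread evenly over 1000..1099.
per_tape = average_waste(100, range(1000, 1100))
# With no alignment between capacity and part size, the average comes out at
# about half a part per tape, so the total grows linearly with the tape count.
total_for_20_tapes = 20 * per_tape
```

&lt;p&gt;Shrinking the part size shrinks the waste in proportion, but it also means more parts and more filemarks per tape.&lt;/p&gt;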

&lt;p&gt;In order to rewind a part and write it again on a new tape, Amanda also needs to keep its data somewhere, called the part cache.  When the dump is on the holding disk, the holding disk acts as a part cache.  Otherwise, Amanda can cache parts in memory or on disk.  Caching in memory is faster, but requires a lot of RAM for a reasonable part size.  Caching on disk allows larger parts, but is considerably slower.&lt;/p&gt;

&lt;h1&gt;Logical EOM&lt;/h1&gt;

&lt;p&gt;More recent tape drives (those made in the last decade or so) have a feature called &quot;early warning&quot;.  With this feature enabled, the drive alerts the host system when it is &quot;near&quot; EOM, and flushes the cache to tape.  Exactly what &quot;near&quot; means is not specified in the SCSI standard, but in general there&#039;s room to flush the cache and write a filemark, at least.  This is sometimes called a logical EOM - LEOM.&lt;/p&gt;

&lt;p&gt;Amanda can take advantage of this functionality to cleanly finish a part before running headlong into a physical EOM.  This eliminates the wasted space for a PARTIAL part, and also eliminates the need to cache parts, since rewinding is not required.  In one small change (well, OK, it&#039;s &lt;a onclick=&quot;_gaq.push([&#039;_trackPageview&#039;, &#039;/extlink/github.com/zmanda/amanda/compare/1bc521c5ce1be2d144fefc8ce37917c55ab690e8...6a087798ea0c9945093226150da37d5af49d1810&#039;]);&quot;  href=&quot;http://github.com/zmanda/amanda/compare/1bc521c5ce1be2d144fefc8ce37917c55ab690e8...6a087798ea0c9945093226150da37d5af49d1810&quot;&gt;about 4,300 lines&lt;/a&gt;), Amanda gets faster and uses storage space more efficiently.  What&#039;s not to love?&lt;/p&gt;

&lt;p&gt;Better yet, all of the non-tape devices (vtapes, S3 devices, DVD-ROMs, etc.) can easily emulate LEOM, so backups to these devices will automatically benefit from this improvement.&lt;/p&gt;

&lt;h1&gt;In the Code&lt;/h1&gt;

&lt;p&gt;&lt;a onclick=&quot;_gaq.push([&#039;_trackPageview&#039;, &#039;/extlink/github.com/zmanda/amanda/compare/1bc521c5ce1be2d144fefc8ce37917c55ab690e8...6a087798ea0c9945093226150da37d5af49d1810&#039;]);&quot;  href=&quot;http://github.com/zmanda/amanda/compare/1bc521c5ce1be2d144fefc8ce37917c55ab690e8...6a087798ea0c9945093226150da37d5af49d1810&quot;&gt;Three important patches&lt;/a&gt; toward this functionality were just committed.  What remains is to set up real LEOM support for the VFS device (vtapes) and for the tape device.&lt;/p&gt;

&lt;p&gt;The VFS device can, of course, trivially emulate LEOM when it is enforcing the MAX_VOLUME_USAGE property - the vtape length.  However, predicting when a filesystem will run out of space is much more difficult.  We are still discussing options, and I would love to hear suggestions here or on the mailing list.&lt;/p&gt;

&lt;p&gt;As for the tape device, it will assume that LEOM is not supported unless the user configures it explicitly (with the &quot;LEOM&quot; device property) or we can determine support for LEOM from the operating system at runtime.  Unfortunately, this is one of those areas so technical that only a half-dozen people know how it works, so it may take me some time to track down this information for non-Linux operating systems.  Again, advice and assistance are welcome!&lt;/p&gt;

&lt;hr /&gt;

&lt;ul&gt;
&lt;li&gt;&lt;sup&gt;1&lt;/sup&gt;This is an ages-old design decision, but one that artificially constrains Amanda&#039;s flexibility, especially with vtapes.
&lt;li&gt;&lt;sup&gt;2&lt;/sup&gt;In fact, Amanda has supported spanning for something like 7 years now.  Yet I occasionally see users in #amanda complaining about this serious limitation and wondering when we&#039;re going to do anything about it.  Will 2003 be soon enough?
&lt;/ul&gt;
 
    </content:encoded>

    <pubDate>Thu, 08 Jul 2010 18:07:00 -0500</pubDate>
    <guid isPermaLink="false">http://code.v.igoro.us/archives/59-guid.html</guid>
    
</item>
<item>
    <title>What's New in Amanda: Hackability</title>
    <link>http://code.v.igoro.us/archives/45-Whats-New-in-Amanda-Hackability.html</link>
            <category>amanda</category>
    
    <comments>http://code.v.igoro.us/archives/45-Whats-New-in-Amanda-Hackability.html#comments</comments>
    <wfw:comment>http://code.v.igoro.us/wfwcomment.php?cid=45</wfw:comment>

    <slash:comments>3</slash:comments>
    <wfw:commentRss>http://code.v.igoro.us/rss.php?version=2.0&amp;type=comments&amp;cid=45</wfw:commentRss>
    

    <author>nospam@example.com (Dustin J. Mitchell)</author>
    <content:encoded>
    &lt;p&gt;It&#039;s been a while since I&#039;ve posted about recent development in &lt;a onclick=&quot;_gaq.push([&#039;_trackPageview&#039;, &#039;/extlink/amanda.org&#039;]);&quot;  href=&quot;http://amanda.org&quot;&gt;Amanda&lt;/a&gt;, but it&#039;s not for lack of interesting topics!&lt;/p&gt;

&lt;p&gt;Today I want to talk a little bit about Amanda&#039;s development.  Historically, Amanda has always had a small, core group of developers who do the lion&#039;s share of the development work.  There are probably lots of reasons for this, not least of which is that a backup application isn&#039;t the sexiest project on which to spend your spare time.  But I think there&#039;s a deeper reason, and it has to do with hackability.
 Amanda was originally written in C, which means any changes require the full set of developer tools.  For example, just to fix a typo in an error message, you would need to find the source, configure and compile it, find and fix the error message, and recompile to test.  Fixing something more substantial in the highly interdependent Amanda codebase also requires a deep understanding of many parts of Amanda - from the obscure configuration interface to the oddly interlinked disklist structure.  This level of programming skill is not common among Amanda&#039;s user base (systems administrators), and I can count the people who understand the disklist structure on one hand.&lt;/p&gt;

&lt;p&gt;The result has been a paltry flow of patches from anyone but the core hackers.  Furthermore, no entry path has been available by which newcomers could work their way up to being core developers.  While I don&#039;t want to disparage the work of any of the great programmers who have written Amanda over the years, it&#039;s a shame that there have been so few at any time, and I worry about what would happen if the number were to reach zero.  So what to do?&lt;/p&gt;

&lt;h1&gt;$hackability++&lt;/h1&gt;

&lt;p&gt;We&#039;ve done a few things to try to make Amanda more hackable.  Probably the biggest change is to rewrite parts of Amanda in Perl.  I&#039;m asked &quot;why&quot; quite often, and while we had a lot of reasons, two relate directly to hackability.  First, more sysadmins know Perl than C, because Perl is quite often used to build the &quot;glue&quot; that links systems together.  Interestingly, based on many conversations, it seems that Python may also have been a good choice, as I &lt;a href=&quot;http://code.v.igoro.us/archives/12-Perl-vs.-Python.html&quot;&gt;suspected when I first proposed the rewrite&lt;/a&gt;.  But it&#039;s too late now!&lt;/p&gt;

&lt;p&gt;Second, and more importantly, Perl code can be hacked in place.  If &lt;a onclick=&quot;_gaq.push([&#039;_trackPageview&#039;, &#039;/extlink/wiki.zmanda.com/man/amvault.8.html&#039;]);&quot;  href=&quot;http://wiki.zmanda.com/man/amvault.8.html&quot;&gt;amvault&lt;/a&gt; isn&#039;t acting the way you want it to, just open up &lt;tt&gt;/usr/sbin/amvault&lt;/tt&gt; and tweak away.  No need to download the source, no need to compile, no segmentation faults, just hacking.  When you&#039;re done, run a quick &lt;tt&gt;diff&lt;/tt&gt; and send the results to amanda-hackers.&lt;/p&gt;

&lt;p&gt;Even users who do not know Perl can take advantage of this &lt;i&gt;in-situ&lt;/i&gt; hackability.  Within Amanda&#039;s C code, if I want a user to try a patch, that user must figure out how to download Amanda&#039;s source, apply the patch, configure, compile, and install.  None of those steps are trivial.  With Perl code, I can often provide a patch that is simple enough to be applied directly to the installed executables by hand, or with a simple application of &lt;tt&gt;patch&lt;/tt&gt;.  Everyone stays focused on the bug under investigation, and the user&#039;s backups are up and running that much more quickly.&lt;/p&gt;

&lt;h1&gt;New APIs&lt;/h1&gt;

&lt;p&gt;As I mentioned before, historically Amanda&#039;s code has been highly interdependent.  Details of the implementation of the holding disk were spread over most of the files in the server implementation.  The dumplevel -987 has a special meaning that is documented nowhere, but referenced in several source files.  All of this makes new development difficult, because it&#039;s impossible to &quot;slice off&quot; and study a portion of Amanda in isolation.&lt;/p&gt;

&lt;p&gt;The solution here is to create abstract interfaces, where new functionality can be &quot;plugged in&quot; and Amanda can use it without changes.  The Amanda developers have abused the term &quot;API&quot; for these interfaces, and we now have quite a few:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a onclick=&quot;_gaq.push([&#039;_trackPageview&#039;, &#039;/extlink/wiki.zmanda.com/index.php/Application_API&#039;]);&quot;  href=&quot;http://wiki.zmanda.com/index.php/Application_API&quot;&gt;Application API&lt;/a&gt; - an abstraction of backup clients, e.g., &lt;a href=&quot;http://code.v.igoro.us/archives/50-Whats-New-in-Amanda-Postgres-Backups.html&quot;&gt;ampgsql&lt;/a&gt; for Postgres;
&lt;li&gt;&lt;a onclick=&quot;_gaq.push([&#039;_trackPageview&#039;, &#039;/extlink/wiki.zmanda.com/index.php/Device_API&#039;]);&quot;  href=&quot;http://wiki.zmanda.com/index.php/Device_API&quot;&gt;Device API&lt;/a&gt; - an abstraction of backend storage devices, such as tape, disk, cloud, or DVD-RW;
&lt;li&gt;&lt;a onclick=&quot;_gaq.push([&#039;_trackPageview&#039;, &#039;/extlink/wiki.zmanda.com/index.php/Changer_API&#039;]);&quot;  href=&quot;http://wiki.zmanda.com/index.php/Changer_API&quot;&gt;Changer API&lt;/a&gt; - an abstraction of tape changers and other mechanisms for selecting from a set of volumes; and
&lt;li&gt;&lt;a onclick=&quot;_gaq.push([&#039;_trackPageview&#039;, &#039;/extlink/wiki.zmanda.com/index.php/Script_API&#039;]);&quot;  href=&quot;http://wiki.zmanda.com/index.php/Script_API&quot;&gt;Script API&lt;/a&gt; - a means of invoking scripts before or after certain events during a backup.
&lt;/ul&gt;

&lt;p&gt;This strategy has already paid off: we have seen several new scripts and applications contributed, and the DVD-RW device arrived out of the blue as a contribution from someone who found it useful.&lt;/p&gt;

&lt;h1&gt;Other Changes&lt;/h1&gt;

&lt;p&gt;In the interest of greater accessibility to new hackers, we have also put Amanda on &lt;a onclick=&quot;_gaq.push([&#039;_trackPageview&#039;, &#039;/extlink/github.com/zmanda/amanda&#039;]);&quot;  href=&quot;http://github.com/zmanda/amanda&quot;&gt;github&lt;/a&gt; and created a set of good &quot;beginner&quot; projects.  Zmanda has even offered to &lt;a href=&quot;http://code.v.igoro.us/archives/53-Want-to-work-on-Amanda.html&quot;&gt;pay people to hack on Amanda&lt;/a&gt;, as a way of easing the cost of entry.  I also try to point out interesting projects on the Amanda mailing list, particularly projects that Jean-Louis and I probably will not find time to work on.&lt;/p&gt;

&lt;p&gt;The idea here is to encourage new hackers to pick up a well-scoped project to become familiar with Amanda.  The hackers can then move on to more sophisticated projects that meet their particular backup needs or address a particular interest.&lt;/p&gt;

&lt;h1&gt;Will You Join Me?&lt;/h1&gt;

&lt;p&gt;So Amanda is ready for you.  When can you start?&lt;/p&gt;
 
    </content:encoded>

    <pubDate>Thu, 01 Jul 2010 22:29:00 -0500</pubDate>
    <guid isPermaLink="false">http://code.v.igoro.us/archives/45-guid.html</guid>
    
</item>
<item>
    <title>Modern Multiprocessing</title>
    <link>http://code.v.igoro.us/archives/29-Modern-Multiprocessing.html</link>
            <category>multiprocessing</category>
    
    <comments>http://code.v.igoro.us/archives/29-Modern-Multiprocessing.html#comments</comments>
    <wfw:comment>http://code.v.igoro.us/wfwcomment.php?cid=29</wfw:comment>

    <slash:comments>0</slash:comments>
    <wfw:commentRss>http://code.v.igoro.us/rss.php?version=2.0&amp;type=comments&amp;cid=29</wfw:commentRss>
    

    <author>nospam@example.com (Dustin J. Mitchell)</author>
    <content:encoded>
    &lt;p&gt;I&#039;ve been thinking a lot lately about the way we accomplish multiprocessing.  We&#039;ve seen a significant change in the operation of Moore&#039;s law for CPU speeds: today&#039;s CPUs are about the same speed as those of a few years ago, but they have more cores, and more virtual processors on those cores.  This is great for heavily-loaded servers, which have plenty of distinct tasks to place on those cores and VCPUs, but not so useful for users working with single-threaded applications.&lt;/p&gt;

&lt;p&gt;Why are most applications still single-threaded?  There are lots of good reasons. Threaded code is harder to write, and not just because it requires careful analysis and use of synchronization primitives: many common tasks are difficult to meaningfully parallelize without careful control over inter-thread communication, and in a portable application you don&#039;t have that kind of control.  Threaded code generally performs badly on single-CPU systems, which are still common.  Some popular languages still make threading difficult, at least in a portable fashion.  And threads are still relatively heavyweight entities in most operating systems: you don&#039;t spawn ten threads to mergesort a 100-item array.&lt;/p&gt;

&lt;p&gt;Some of these problems will go away with a little more time, but some will get worse.  &lt;a onclick=&quot;_gaq.push([&#039;_trackPageview&#039;, &#039;/extlink/en.wikipedia.org/wiki/Non-Uniform_Memory_Access&#039;]);&quot;  href=&quot;http://en.wikipedia.org/wiki/Non-Uniform_Memory_Access&quot;&gt;NUMA&lt;/a&gt; architectures can make sharing data between threads slow.  Hyperthreading and its interaction with processor caches adds yet another level of unpredictability.&lt;/p&gt;

&lt;p&gt;We know how to build massively parallel systems that run massively parallel algorithms.  What is still unknown is how to build portable, simple software that can run efficiently across a variety of architectures.  This is a problem of practice, not theory, and there&#039;s lots of interesting work going on in this area.&lt;/p&gt;

&lt;p&gt;Of course, there are languages designed explicitly to support communication, such as &lt;a onclick=&quot;_gaq.push([&#039;_trackPageview&#039;, &#039;/extlink/www.vitanuova.com/inferno/papers/limbo.html&#039;]);&quot;  href=&quot;http://www.vitanuova.com/inferno/papers/limbo.html&quot;&gt;Limbo&lt;/a&gt;, &lt;a onclick=&quot;_gaq.push([&#039;_trackPageview&#039;, &#039;/extlink/www.erlang.org/index.html&#039;]);&quot;  href=&quot;http://www.erlang.org/index.html&quot;&gt;Erlang&lt;/a&gt;, &lt;a onclick=&quot;_gaq.push([&#039;_trackPageview&#039;, &#039;/extlink/www.haskell.org/&#039;]);&quot;  href=&quot;http://www.haskell.org/&quot;&gt;Haskell&lt;/a&gt;, and &lt;a onclick=&quot;_gaq.push([&#039;_trackPageview&#039;, &#039;/extlink/clojure.org/&#039;]);&quot;  href=&quot;http://clojure.org/&quot;&gt;Clojure&lt;/a&gt;.  For the most part, these languages are structured as communicating sequential processes, which is to say that they represent multiprocessing as a set of sequential threads that pass information to one another.  Problems of thread safety are subsumed by the languages, but mapping the parallelism to available resources is generally left to the programmer or administrator.&lt;/p&gt;
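
&lt;p&gt;To make the &quot;sequential threads that pass information to one another&quot; idea concrete, here is a minimal sketch in Python 3.  It is only an imitation of the style using the standard &lt;tt&gt;threading&lt;/tt&gt; and &lt;tt&gt;queue&lt;/tt&gt; modules, not how any of the languages above implement it, and the names &lt;tt&gt;producer&lt;/tt&gt;, &lt;tt&gt;squarer&lt;/tt&gt;, &lt;tt&gt;run&lt;/tt&gt; and &lt;tt&gt;DONE&lt;/tt&gt; are made up for the example.&lt;/p&gt;

&lt;pre&gt;
```python
import queue
import threading

DONE = object()  # sentinel marking the end of the stream


def producer(out_q, items):
    """Sequential process 1: emit each item, then the sentinel."""
    for item in items:
        out_q.put(item)
    out_q.put(DONE)


def squarer(in_q, out_q):
    """Sequential process 2: square whatever arrives and pass it on."""
    while True:
        item = in_q.get()
        if item is DONE:
            out_q.put(DONE)
            return
        out_q.put(item * item)


def run(items):
    """Wire the two processes together and collect the results."""
    a, b = queue.Queue(), queue.Queue()
    threads = [
        threading.Thread(target=producer, args=(a, items)),
        threading.Thread(target=squarer, args=(a, b)),
    ]
    for t in threads:
        t.start()
    results = []
    while True:
        item = b.get()
        if item is DONE:
            break
        results.append(item)
    for t in threads:
        t.join()
    return results


squares = run([1, 2, 3])
```
&lt;/pre&gt;

&lt;p&gt;The threads share nothing but the queues, so there are no locks in the user code.  Deciding how many such threads to start, and where they run, is still left to the programmer.&lt;/p&gt;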

&lt;p&gt;One interesting project is Apple&#039;s &lt;a onclick=&quot;_gaq.push([&#039;_trackPageview&#039;, &#039;/extlink/developer.apple.com/mac/articles/cocoa/introblocksgcd.html&#039;]);&quot;  href=&quot;http://developer.apple.com/mac/articles/cocoa/introblocksgcd.html&quot;&gt;Grand Central Dispatch&lt;/a&gt;.  It defines a simple but highly expressive closure syntax (a block) and a mechanism to dynamically schedule execution of such closures (queues).  Critically, the GCD library takes care of scaling the parallelism of the queue processing appropriately to the underlying hardware.  On a single-threaded CPU, this amounts to cooperative multitasking, but on parallel hardware the operating system can dynamically allocate virtual CPUs to applications needing more parallelism.&lt;/p&gt;
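
&lt;p&gt;GCD itself is a C-level library with its own block syntax, so the following is only a rough analogy in Python 3: closures are submitted to a pool, and the pool (not the caller) is sized to the hardware.  The helper names &lt;tt&gt;make_block&lt;/tt&gt; and &lt;tt&gt;run_blocks&lt;/tt&gt; are hypothetical, and a thread pool sized once at startup is a much cruder mechanism than the dynamic scheduling described above.&lt;/p&gt;

&lt;pre&gt;
```python
import os
from concurrent.futures import ThreadPoolExecutor


def make_block(n):
    """Return a closure that captures n, loosely like a GCD block."""
    return lambda: n * n


def run_blocks(blocks):
    """Run every closure on a pool sized to the number of CPUs."""
    workers = os.cpu_count() or 1  # cpu_count() may return None
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(block) for block in blocks]
        # results come back in submission order
        return [future.result() for future in futures]


results = run_blocks([make_block(n) for n in range(5)])
```
&lt;/pre&gt;

&lt;p&gt;The calling code never mentions a thread count; it would run unchanged on one core or sixteen.&lt;/p&gt;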

&lt;p&gt;This topic seems to come up often in my various pursuits, so I will return to it again. &lt;/p&gt;
 
    </content:encoded>

    <pubDate>Mon, 12 Apr 2010 15:24:00 -0500</pubDate>
    <guid isPermaLink="false">http://code.v.igoro.us/archives/29-guid.html</guid>
    
</item>
<item>
    <title>Want to work on Amanda?</title>
    <link>http://code.v.igoro.us/archives/53-Want-to-work-on-Amanda.html</link>
            <category>amanda</category>
    
    <comments>http://code.v.igoro.us/archives/53-Want-to-work-on-Amanda.html#comments</comments>
    <wfw:comment>http://code.v.igoro.us/wfwcomment.php?cid=53</wfw:comment>

    <slash:comments>0</slash:comments>
    <wfw:commentRss>http://code.v.igoro.us/rss.php?version=2.0&amp;type=comments&amp;cid=53</wfw:commentRss>
    

    <author>nospam@example.com (Dustin J. Mitchell)</author>
    <content:encoded>
    &lt;p&gt;I&#039;ve not made any secret of the fact that I want more people hacking on Amanda.  This is both for selfish reasons -- many hands make light work -- and for altruistic reasons -- a broader community of developers can provide better governance for the project and long-term continuity.  With a few notable exceptions, I haven&#039;t had a lot of satisfaction.&lt;/p&gt;

&lt;p&gt;I think part of the reason is that Amanda has a steep learning curve, even within the new Perl code.  The time to climb that curve is a big investment, and folks with only a small itch to scratch can&#039;t afford it.&lt;/p&gt;

&lt;p&gt;In an effort to sweeten the pot, we (Zmanda) are offering to pay for flexible work on Amanda.  Part-time or full-time, on your own schedule.  Your choice of projects.  Support and gratitude from the other hackers.  And the option to become a full Zmanda employee if that&#039;s your bent.&lt;/p&gt;

&lt;p&gt;Here are some possible projects, to pique your interest:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;MySQL application (to round out the set with &lt;tt&gt;&lt;a href=&quot;http://code.v.igoro.us/archives/50-Whats-New-in-Amanda-Postgres-Backups.html&quot;&gt;ampgsql&lt;/a&gt;&lt;/tt&gt;)&lt;/li&gt;
&lt;li&gt;&lt;a onclick=&quot;_gaq.push([&#039;_trackPageview&#039;, &#039;/extlink/cyrusimap.web.cmu.edu/&#039;]);&quot;  href=&quot;http://cyrusimap.web.cmu.edu/&quot;&gt;Cyrus Imapd&lt;/a&gt; application (gnutar doesn&#039;t deal well with the application&#039;s tiny files and hard links)&lt;/li&gt;
&lt;li&gt;OpenSSL for network transport, using certificates and keys for authentication&lt;/li&gt;
&lt;li&gt;Database-backed backup catalog&lt;/li&gt;
&lt;li&gt;Amvault upgrade&lt;/li&gt;
&lt;li&gt;Handle Logical EOM (LEOM) on all devices that support it, drastically reducing the number of parts Amanda writes&lt;/li&gt;
&lt;li&gt;Support for more cloud backends than just S3&lt;/li&gt;
&lt;li&gt;Parallel writes to multiple devices&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you&#039;re interested, contact me (&lt;tt&gt;dustin@zmanda.com&lt;/tt&gt;) and we&#039;ll work something out! &lt;/p&gt;
 
    </content:encoded>

    <pubDate>Mon, 12 Apr 2010 14:04:34 -0500</pubDate>
    <guid isPermaLink="false">http://code.v.igoro.us/archives/53-guid.html</guid>
    
</item>
<item>
    <title>What's New in Amanda: Postgres Backups</title>
    <link>http://code.v.igoro.us/archives/50-Whats-New-in-Amanda-Postgres-Backups.html</link>
            <category>amanda</category>
    
    <comments>http://code.v.igoro.us/archives/50-Whats-New-in-Amanda-Postgres-Backups.html#comments</comments>
    <wfw:comment>http://code.v.igoro.us/wfwcomment.php?cid=50</wfw:comment>

    <slash:comments>0</slash:comments>
    <wfw:commentRss>http://code.v.igoro.us/rss.php?version=2.0&amp;type=comments&amp;cid=50</wfw:commentRss>
    

    <author>nospam@example.com (Dustin J. Mitchell)</author>
    <content:encoded>
    &lt;p&gt;This is the second installment in a series of posts about recent work on &lt;a onclick=&quot;_gaq.push([&#039;_trackPageview&#039;, &#039;/extlink/amanda.org&#039;]);&quot;  href=&quot;http://amanda.org&quot;&gt;Amanda&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The Application API allows Amanda to back up &quot;structured&quot; data -- data that cannot be handled well by &#039;dump&#039; or &#039;tar&#039;.  Most databases fall into this category, and with the 3.1 release, Amanda ships with &lt;tt&gt;&lt;a onclick=&quot;_gaq.push([&#039;_trackPageview&#039;, &#039;/extlink/wiki.zmanda.com/man/ampgsql.8.html&#039;]);&quot;  href=&quot;http://wiki.zmanda.com/man/ampgsql.8.html&quot;&gt;ampgsql&lt;/a&gt;&lt;/tt&gt;, which supports backing up &lt;a onclick=&quot;_gaq.push([&#039;_trackPageview&#039;, &#039;/extlink/www.postgresql.org&#039;]);&quot;  href=&quot;http://www.postgresql.org&quot;&gt;Postgres&lt;/a&gt; databases using the software&#039;s &lt;a onclick=&quot;_gaq.push([&#039;_trackPageview&#039;, &#039;/extlink/www.postgresql.org/docs/current/static/continuous-archiving.html&#039;]);&quot;  href=&quot;http://www.postgresql.org/docs/current/static/continuous-archiving.html&quot;&gt;point-in-time recovery&lt;/a&gt; mechanism.&lt;/p&gt;

&lt;p&gt;The how-to for this application is &lt;a onclick=&quot;_gaq.push([&#039;_trackPageview&#039;, &#039;/extlink/wiki.zmanda.com/index.php/How_To:Use_Amanda_to_Back_Up_PostgreSQL&#039;]);&quot;  href=&quot;http://wiki.zmanda.com/index.php/How_To:Use_Amanda_to_Back_Up_PostgreSQL&quot;&gt;on the Amanda wiki&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;Operation&lt;/h2&gt;

&lt;p&gt;Postgres, like most &quot;advanced&quot; databases, uses a logging system to ensure consistency even in the face of (some) hardware failures.  In essence, it writes every change that it makes to the database to the logfile &lt;i&gt;before&lt;/i&gt; changing the database itself.  This is similar to the operation of logging filesystems.  The idea is that, in the face of a failure, you just replay the log to re-apply any potentially corrupted changes.&lt;/p&gt;

&lt;p&gt;Postgres calls its log files WAL (write-ahead log) files.  By default, they are 16MB.  Postgres runs a shell command to &quot;archive&quot; each logfile when it is full.&lt;/p&gt;

&lt;p&gt;So there are two things to back up: the data itself, which can be quite large, and the logfiles.  A full backup works like this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Execute &lt;tt&gt;PG_START_BACKUP(ident)&lt;/tt&gt; with some unique identifier.&lt;/li&gt;
&lt;li&gt;Dump the data directory, excluding the active WAL logs.  Note that the database is still in operation at this point, so the dumped data, taken alone, will be inconsistent.&lt;/li&gt;
&lt;li&gt;Execute &lt;tt&gt;PG_STOP_BACKUP()&lt;/tt&gt;.  This archives a text file with the suffix &lt;tt&gt;.backup&lt;/tt&gt; that indicates which WAL files are needed to make the dumped data consistent again.&lt;/li&gt;
&lt;li&gt;Dump the required WAL files&lt;/li&gt;
&lt;/ul&gt;
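
&lt;p&gt;The four steps above can be sketched as follows, in Python 3.  This is not the ampgsql code (which is Perl); &lt;tt&gt;full_backup&lt;/tt&gt; and its three hook arguments are hypothetical stand-ins, and I am assuming the active WAL files live in the &lt;tt&gt;pg_xlog&lt;/tt&gt; subdirectory of the data directory.  The dry run at the end only records the order of operations and touches no database.&lt;/p&gt;

&lt;pre&gt;
```python
def full_backup(run_sql, dump_data_dir, dump_wal_files, label):
    """Run the four steps in order; every hook is supplied by the caller."""
    run_sql("SELECT pg_start_backup(%s)", (label,))
    try:
        # the server keeps running, so this copy is inconsistent on its own
        dump_data_dir(exclude=["pg_xlog"])
    finally:
        # always leave backup mode, even if the dump failed
        run_sql("SELECT pg_stop_backup()", ())
    # the archived WAL files make the dumped data consistent again
    dump_wal_files()


# dry run with recording stand-ins instead of a real database
calls = []
full_backup(
    run_sql=lambda sql, params: calls.append(sql),
    dump_data_dir=lambda exclude: calls.append("dump data directory"),
    dump_wal_files=lambda: calls.append("dump WAL files"),
    label="nightly",
)
```
&lt;/pre&gt;

&lt;p&gt;The &lt;tt&gt;try&lt;/tt&gt;/&lt;tt&gt;finally&lt;/tt&gt; is a design choice of the sketch: a failed dump should not leave the server stuck in backup mode.&lt;/p&gt;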

&lt;p&gt;An incremental backup, on the other hand, only requires backing up the already-archived WAL files.&lt;/p&gt;

&lt;p&gt;A restore is still a manual operation -- a DBA would usually want to perform a restore very carefully.  The process is described on the wiki page linked above, but boils down to restoring the data directory and the necessary WAL files, then providing postgres with a shell command to &quot;pull&quot; the WAL files it wants.  When postgres next starts up, it will automatically enter recovery mode and replay the WAL files as necessary.&lt;/p&gt;

&lt;h2&gt;Quiet Databases&lt;/h2&gt;

&lt;p&gt;On older Postgres versions, making a full backup of a quiet database is actually impossible.  After &lt;tt&gt;PG_STOP_BACKUP()&lt;/tt&gt; is invoked, the final WAL file required to reconstruct a consistent database is still &quot;in progress&quot; and thus not archived yet.  Since the database is quiet, postgres does not get any closer to archiving that WAL file, and the call hangs (or, in the case of ampgsql, times out).&lt;/p&gt;

&lt;p&gt;Newer versions of Postgres do the obvious thing: &lt;tt&gt;PG_STOP_BACKUP()&lt;/tt&gt; &quot;forces&quot; an early archiving of the current WAL file.&lt;/p&gt;

&lt;p&gt;The best solution for older versions is to make sure transactions are being committed to the database all the time.  If the database is truly silent during the dump (perhaps it is only accessed during working hours), then this may mean writing garbage rows to a throwaway table:&lt;/p&gt;

&lt;pre&gt;
CREATE TABLE push_wal AS SELECT * FROM GENERATE_SERIES(1, 500000);
DROP TABLE push_wal;
&lt;/pre&gt;

&lt;p&gt;Note that using &lt;tt&gt;CREATE TEMPORARY TABLE&lt;/tt&gt; will not work, as temporary tables are not written to the WAL file.&lt;/p&gt;

&lt;p&gt;As a brief encounter in &lt;tt&gt;#postgres&lt;/tt&gt; taught me, another option is to upgrade to a more modern version of Postgres!&lt;/p&gt;

&lt;h2&gt;Log Incremental Backups&lt;/h2&gt;

&lt;p&gt;DBAs and backup admins generally want to avoid making frequent full backups, since they&#039;re so large.  The usual pattern is to make a full backup and then dump the archived log files on a nightly basis for a week or two.  As the log files are dumped, they can be deleted from the database server, saving considerable space.&lt;/p&gt;

&lt;p&gt;In Amanda terms, each of these dumps is an &quot;incremental&quot;, and is based on the previous night&#039;s backup.  That means that the dump after the full is level 1, the next is level 2, and so on.  Amanda currently supports 99 levels, but this limit is fairly arbitrary and can be increased as necessary.&lt;/p&gt;

&lt;p&gt;The problem in ampgsql, as implemented, is that it allows Amanda to schedule incremental levels however it likes.  Amanda considers a level-&lt;i&gt;n&lt;/i&gt; backup to be everything that has changed since the last level-&lt;i&gt;n-1&lt;/i&gt; backup.  This works great for GNU tar, but not so well for Postgres.  Consider the following schedule:&lt;/p&gt;

&lt;table&gt;
&lt;tr&gt;&lt;th&gt;Monday&lt;/th&gt;&lt;td&gt;level 0&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;Tuesday&lt;/th&gt;&lt;td&gt;level 1&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;Wednesday&lt;/th&gt;&lt;td&gt;level 2&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;Thursday&lt;/th&gt;&lt;td&gt;level 1&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;

&lt;p&gt;The problem is that the dump on Thursday, as a level 1, needs to capture all changes since the previous level 0, on Monday.  That means that it must contain all WAL files archived since Monday, so those WAL files must remain on the database server until Thursday.&lt;/p&gt;

&lt;p&gt;The fix to this is to only perform level 0 or level-&lt;i&gt;n+1&lt;/i&gt; dumps, where &lt;i&gt;n&lt;/i&gt; is the level of the last dump performed.  In the example above, this means either a level 0 or level 3 dump on Thursday.  A level 0 is a full backup and requires no history.  A level 3 would only contain WAL files archived since the level 2 dump on Wednesday, so any WAL files before that could be deleted from the database server.&lt;/p&gt;
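
&lt;p&gt;Here is a small Python 3 sketch of that reasoning.  It models only the rule stated above (a level-&lt;i&gt;n&lt;/i&gt; dump is based on the most recent level-&lt;i&gt;n-1&lt;/i&gt; dump); the function names are invented for the example and are not part of Amanda or ampgsql.&lt;/p&gt;

&lt;pre&gt;
```python
def base_of(levels, i):
    """Index of the dump that dump i is based on, or None for a full."""
    level = levels[i]
    if level == 0:
        return None
    # walk backwards to the most recent dump one level lower
    for j in range(i - 1, -1, -1):
        if levels[j] == level - 1:
            return j
    raise ValueError("no level %d dump before dump %d" % (level - 1, i))


def allowed_next_levels(last_level):
    """The proposed fix: either a full, or one level deeper than last time."""
    return [0, last_level + 1]


days = ["Monday", "Tuesday", "Wednesday", "Thursday"]

# the problematic schedule: Thursday reaches all the way back to Monday
problem = [0, 1, 2, 1]
problem_base = days[base_of(problem, 3)]

# with the fix, Thursday is a level 3 and only reaches back to Wednesday
fixed = [0, 1, 2, 3]
fixed_base = days[base_of(fixed, 3)]
```
&lt;/pre&gt;

&lt;p&gt;WAL files archived before the base dump are no longer needed on the database server, so the closer the base, the sooner they can be deleted.&lt;/p&gt;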

&lt;p&gt;[EDIT: replaced &quot;corrupt&quot; with the more accurate &quot;inconsistent&quot;; clarified final example]&lt;/p&gt;
 
    </content:encoded>

    <pubDate>Thu, 25 Mar 2010 18:53:00 -0500</pubDate>
    <guid isPermaLink="false">http://code.v.igoro.us/archives/50-guid.html</guid>
    
</item>
<item>
    <title>Happy Ada Lovelace Day</title>
    <link>http://code.v.igoro.us/archives/52-Happy-Ada-Lovelace-Day.html</link>
            <category>Code</category>
    
    <comments>http://code.v.igoro.us/archives/52-Happy-Ada-Lovelace-Day.html#comments</comments>
    <wfw:comment>http://code.v.igoro.us/wfwcomment.php?cid=52</wfw:comment>

    <slash:comments>0</slash:comments>
    <wfw:commentRss>http://code.v.igoro.us/rss.php?version=2.0&amp;type=comments&amp;cid=52</wfw:commentRss>
    

    <author>nospam@example.com (Dustin J. Mitchell)</author>
    <content:encoded>
    &lt;p&gt;Yesterday, March 24, was Ada Lovelace Day.  I was at &lt;a onclick=&quot;_gaq.push([&#039;_trackPageview&#039;, &#039;/extlink/pumpingstationone.org/2010/03/lady-ada-lovelace-day-at-psone-tomorrow/&#039;]);&quot;  href=&quot;http://pumpingstationone.org/2010/03/lady-ada-lovelace-day-at-psone-tomorrow/&quot;&gt;Pumping Station: One&lt;/a&gt;, and decided to spend an hour or so writing something to honor the first computer programmer.  I was feeling singularly uninspired, and googling for &quot;Ada Lovelace&quot; didn&#039;t turn up anything interesting.  But it did give me an idea: write a program that googles for you!&lt;/p&gt;

&lt;p&gt;I haven&#039;t written much JavaScript lately, but I&#039;ve heard a lot about the work Google&#039;s done to provide easy JavaScript libraries and APIs.  I thought it&#039;d be interesting to try out some of these APIs.  It was!  I hacked up a quick HTML page, using a &lt;a onclick=&quot;_gaq.push([&#039;_trackPageview&#039;, &#039;/extlink/code.google.com/apis/ajaxsearch/documentation/reference.html#_intro_GSearch&#039;]);&quot;  href=&quot;http://code.google.com/apis/ajaxsearch/documentation/reference.html#_intro_GSearch&quot;&gt;Searcher&lt;/a&gt; object, to search for Lovelace images.  Here&#039;s the code:&lt;/p&gt;

&lt;script src=&quot;http://gist.github.com/343932.js&quot;&gt;&lt;/script&gt;

&lt;p&gt;The only downside I&#039;ve found is that even if you request a &quot;large&quot; result set, you only get 8 results.  I don&#039;t know of a way to get subsequent results.  So this really only cycles through a few images.  Still, it only took me an hour, and most of that was remembering how to use the DOM.&lt;/p&gt;
 
    </content:encoded>

    <pubDate>Thu, 25 Mar 2010 13:35:38 -0500</pubDate>
    <guid isPermaLink="false">http://code.v.igoro.us/archives/52-guid.html</guid>
    
</item>
<item>
    <title>Solving an Encoding Mystery</title>
    <link>http://code.v.igoro.us/archives/39-Solving-an-Encoding-Mystery.html</link>
            <category>Code</category>
    
    <comments>http://code.v.igoro.us/archives/39-Solving-an-Encoding-Mystery.html#comments</comments>
    <wfw:comment>http://code.v.igoro.us/wfwcomment.php?cid=39</wfw:comment>

    <slash:comments>0</slash:comments>
    <wfw:commentRss>http://code.v.igoro.us/rss.php?version=2.0&amp;type=comments&amp;cid=39</wfw:commentRss>
    

    <author>nospam@example.com (Dustin J. Mitchell)</author>
    <content:encoded>
    &lt;p&gt;I don&#039;t write about it here, but I&#039;ve been getting into brewing beer.  I downloaded an app for my iPhone, &lt;a onclick=&quot;_gaq.push([&#039;_trackPageview&#039;, &#039;/extlink/www.ibrewmaster.com/iBrewMaster/Welcome.html&#039;]);&quot;  href=&quot;http://www.ibrewmaster.com/iBrewMaster/Welcome.html&quot;&gt;iBrewMaster&lt;/a&gt;, which helps me store recipes and track batches of homebrew through the brewing, fermenting, and serving stages.&lt;/p&gt;

&lt;p&gt;I recently decided to make a clone of Dogfish Head&#039;s &lt;a onclick=&quot;_gaq.push([&#039;_trackPageview&#039;, &#039;/extlink/www.dogfish.com/brews-spirits/the-brews/year-round-brews/raison-detre.htm&#039;]);&quot;  href=&quot;http://www.dogfish.com/brews-spirits/the-brews/year-round-brews/raison-detre.htm&quot;&gt;Raison D&#039;être&lt;/a&gt;.  This beer is fantastic, but that&#039;s beside the point.  I added the recipe to the app, and clicked save.  In the menu, however, I saw &quot;Raison D&#039;√™tre&quot;.  Not pretty.  The app has a feature where you create a &quot;batch&quot; from a particular recipe.  I did so, and the name of the batch appeared as &quot;Raison D&#039;‚àö‚Ñ¢tre&quot;.  Even worse!
 I emailed the app developer, who replied almost immediately regarding how he was handling encodings.  I won&#039;t go into detail, except to say that he is careful to always encode strings before inserting them into the internal SQLite database.&lt;/p&gt;

&lt;p&gt;I wanted to give him more information than a simple &quot;well, it doesn&#039;t work and you should fix it!&quot;  So I set about trying to replicate this particular sequence of characters.  I know that Macs (and the iPhone is a Mac) have good support for encodings, so I assume the UI is not at fault.  I know that the strings go into SQLite in UTF-8, and that SQLite just treats them as bytestrings, so a later SELECT will return the same UTF-8 bytestring as was specified in the INSERT.  So the error must occur somewhere between the SELECT and displaying the string on-screen.&lt;/p&gt;

&lt;h2&gt;Character Encodings&lt;/h2&gt;

&lt;p&gt;A word about encodings, with a bit of revisionist history.  In the beginning, there was the Unicode Character Set.  Every funny squiggle that the monks knew how to make on paper had a number - its Unicode codepoint.  There are a &lt;i&gt;lot&lt;/i&gt; of Unicode characters - &lt;a onclick=&quot;_gaq.push([&#039;_trackPageview&#039;, &#039;/extlink/www.i18nguy.com/unicode/char-count.html&#039;]);&quot;  href=&quot;http://www.i18nguy.com/unicode/char-count.html&quot;&gt;95156&lt;/a&gt; at last count.&lt;/p&gt;

&lt;p&gt;Then computers were invented, and we needed to represent, or encode, these squiggles in only 7 bits each.  Someone (who probably only spoke English) decided &quot;well, you can only use the first 128 characters,&quot; and thus ASCII was born.  If you want to write an &quot;ñ&quot; in ASCII, you&#039;ll have to make do with &quot;n&quot;.  Don&#039;t even ask about &quot;葉&quot;.  We soon got a bit less stingy (get it?), and with 8 bits available, everyone rushed to put their favorite characters at code points 128-255 - regardless of what Unicode put there.  For example, in latin-1, &quot;ñ&quot; is at code point 241.  On a Mac, it&#039;s 150.  Trying to decode a byte in the range 128-255 was a challenge, because the encoding was usually unknown.  Those were dark days, as chaos reigned.&lt;/p&gt;

&lt;p&gt;Finally, some enlightened souls (Rob Pike and Ken Thompson) scratched out a new encoding &lt;a onclick=&quot;_gaq.push([&#039;_trackPageview&#039;, &#039;/extlink/www.cl.cam.ac.uk/~mgk25/ucs/utf-8-history.txt&#039;]);&quot;  href=&quot;http://www.cl.cam.ac.uk/~mgk25/ucs/utf-8-history.txt&quot;&gt;on a placemat&lt;/a&gt;.  In this encoding, called UTF-8, all of the 7-bit ASCII characters still fit in one byte, and look exactly the same.  But all of the other characters take more than one byte, with some cleverness applied to make the encoding both compact and easy to decode reliably.  As a side note, the Unicode characters 128-255 match the latin-1 character set.&lt;/p&gt;

&lt;p&gt;Now, if you have a sequence of unicode characters, say &quot;青蛙吃我的餃子&quot; that you want to store digitally, then you need to &lt;i&gt;encode&lt;/i&gt; them, preferably in UTF-8, with the result being a bytestring.  When you want the unicode characters back (perhaps to do some hyphenation), you perform the reverse operation, and &lt;i&gt;decode&lt;/i&gt; from UTF-8 to Unicode Characters.&lt;/p&gt;
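
&lt;p&gt;As a quick illustration of encoding and decoding, here is a sketch in Python 3 syntax (the interpreter sessions later in this post are Python 2).  It encodes one character, &quot;ñ&quot; (written as the escape &lt;tt&gt;\u00f1&lt;/tt&gt;), with three codecs from the standard library, then decodes it again; the variable names are just for the example.&lt;/p&gt;

&lt;pre&gt;
```python
text = "\u00f1"  # LATIN SMALL LETTER N WITH TILDE

# one character, three different byte representations
as_utf8 = text.encode("utf-8")      # two bytes
as_latin1 = text.encode("latin-1")  # one byte, value 241
as_mac = text.encode("mac-roman")   # one byte, value 150

# decoding with the matching codec recovers the character
round_trip = as_utf8.decode("utf-8")

# ASCII has no representation for it at all
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
```
&lt;/pre&gt;

&lt;p&gt;Nothing in the bytes themselves says which codec produced them, which is why decoding with the wrong one goes unnoticed until a human looks at the result.&lt;/p&gt;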

&lt;h2&gt;Back to the Chase&lt;/h2&gt;

&lt;p&gt;A common mistake with multi-byte encodings is to assume that each byte is a distinct character, perhaps with a &lt;tt&gt;for&lt;/tt&gt; loop indexing an array of bytes.  Since one character (&quot;ê&quot;) turned into two (&quot;√™&quot;), this was a reasonable guess.  A little Python (in my Unicode-enabled terminal) shows me the encoded form of &quot;ê&quot;:&lt;/p&gt;

&lt;pre&gt;
&amp;gt;&amp;gt;&amp;gt; print `u&quot;ê&quot;` 
u&#039;\xc3\xaa&#039;
&lt;/pre&gt;

&lt;p&gt;Treating those as Unicode characters instead of bytes is simple:&lt;/p&gt;

&lt;pre&gt;
&amp;gt;&amp;gt;&amp;gt; print u&quot;Raison D&#039;\u00c3\u00aatre&quot;
Raison D&#039;Ãªtre
&lt;/pre&gt;

&lt;p&gt;No dice.  Another common mistake is to encode with one encoding, and decode with another.  This is especially common when the programming environment &quot;automatically&quot; performs encodings or decodings.  For example, Python has an annoying habit of implicitly encoding to ASCII, which produces the infamous &lt;tt&gt;UnicodeEncodeError&lt;/tt&gt;.&lt;/p&gt;

&lt;p&gt;So let&#039;s try this out, guessing at the encoding that&#039;s used on the way out.&lt;/p&gt;

&lt;pre&gt;
&amp;gt;&amp;gt;&amp;gt; orig = u&quot;Raison D&#039;être&quot;
&amp;gt;&amp;gt;&amp;gt; print orig.encode(&#039;utf-8&#039;).decode(&#039;latin-1&#039;)
Raison D&#039;Ãªtre
&lt;/pre&gt;

&lt;p&gt;The result is the same as the single-byte treatment above.  Why?  Recall that the latin-1 encoding is identical to Unicode in the range 128-255, so treating a byte as a Unicode character is the same as treating it as a latin-1 character.&lt;/p&gt;

&lt;p&gt;At this point, I perused the list of encodings Python supports, and &quot;mac-roman&quot; jumped out as a potential culprit.&lt;/p&gt;

&lt;pre&gt;
&amp;gt;&amp;gt;&amp;gt; print orig.encode(&#039;utf-8&#039;).decode(&#039;mac-roman&#039;)
Raison D&#039;√™tre
&lt;/pre&gt;

&lt;p&gt;A match!  What about the longer string of nonsense in the batch name?&lt;/p&gt;

&lt;pre&gt;
&amp;gt;&amp;gt;&amp;gt; once = orig.encode(&#039;utf-8&#039;).decode(&#039;mac-roman&#039;)
&amp;gt;&amp;gt;&amp;gt; print once.encode(&#039;utf-8&#039;).decode(&#039;mac-roman&#039;)
Raison D&#039;‚àö‚Ñ¢tre
&lt;/pre&gt;

&lt;p&gt;Another match.&lt;/p&gt;

&lt;p&gt;I don&#039;t know much about iPhone internals, but I assume that the string library treats a bytestring without any attached encoding as being in the Mac-Roman character set.  When the value was selected out of the recipes table, this decoding was done implicitly, followed by an explicit UTF-8 encoding when inserting into the batches table, and another implicit Mac-Roman decoding when selecting the batch for display.&lt;/p&gt;
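
&lt;p&gt;Since each bad step is just &quot;encode as UTF-8, decode as Mac-Roman&quot;, the damage can be undone by running the steps backwards.  Here is a sketch in Python 3 syntax; &lt;tt&gt;mangle&lt;/tt&gt; and &lt;tt&gt;repair&lt;/tt&gt; are names I made up, and this reproduces my guess about what happens inside the app, not its actual code.  The non-ASCII characters are written as escapes so the snippet survives any further encoding accidents.&lt;/p&gt;

&lt;pre&gt;
```python
def mangle(text):
    """Encode as UTF-8, then wrongly decode the bytes as Mac-Roman."""
    return text.encode("utf-8").decode("mac-roman")


def repair(text):
    """Undo exactly one round of mangle()."""
    return text.encode("mac-roman").decode("utf-8")


orig = "Raison D'\u00eatre"

once = mangle(orig)    # what the recipe menu showed
twice = mangle(once)   # what the batch name showed

recovered = repair(repair(twice))
```
&lt;/pre&gt;

&lt;p&gt;Two rounds of &lt;tt&gt;repair&lt;/tt&gt; bring the batch name back to the original, which supports the theory that the same mistake was applied twice.&lt;/p&gt;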
 
    </content:encoded>

    <pubDate>Mon, 15 Mar 2010 10:52:00 -0500</pubDate>
    <guid isPermaLink="false">http://code.v.igoro.us/archives/39-guid.html</guid>
    
</item>
<item>
    <title>What's New in Amanda: Transfer Architecture</title>
    <link>http://code.v.igoro.us/archives/49-Whats-New-in-Amanda-Transfer-Architecture.html</link>
            <category>amanda</category>
    
    <comments>http://code.v.igoro.us/archives/49-Whats-New-in-Amanda-Transfer-Architecture.html#comments</comments>
    <wfw:comment>http://code.v.igoro.us/wfwcomment.php?cid=49</wfw:comment>

    <slash:comments>0</slash:comments>
    <wfw:commentRss>http://code.v.igoro.us/rss.php?version=2.0&amp;type=comments&amp;cid=49</wfw:commentRss>
    

    <author>nospam@example.com (Dustin J. Mitchell)</author>
    <content:encoded>
    &lt;p&gt;Amanda&#039;s primary mission in life is to move large quantities of data around.  Historically, this has been done through a patchwork of methods, each written separately and with its own quirks.  POSIX pipes, TCP sockets, shared memory, on-disk cache files -- Amanda&#039;s done it all.  But these multiple implementations were error-prone, difficult to maintain, and often not the most efficient approach.&lt;/p&gt;

&lt;p&gt;In an effort to remedy this, we introduced the &lt;a onclick=&quot;_gaq.push([&#039;_trackPageview&#039;, &#039;/extlink/wiki.zmanda.com/index.php/Transfer_Architecture&#039;]);&quot;  href=&quot;http://wiki.zmanda.com/index.php/Transfer_Architecture&quot;&gt;transfer architecture&lt;/a&gt;, abbreviated XFA.  This was technically included in Amanda-2.6.1, but was only used by &lt;i&gt;amvault&lt;/i&gt;.  In the upcoming Amanda-3.1 release, however, the XFA is central to all recovery operations, and is used internally by the taper (the portion of the backup system that writes to devices).&lt;/p&gt;

&lt;p&gt;This post highlights some of the features of the transfer architecture, and some of the improvements we&#039;d like to make.&lt;/p&gt;

&lt;h2&gt;Transfers and Elements&lt;/h2&gt;

&lt;p&gt;A transfer is pretty simple: it moves data from one place to another.  It is built from a list of transfer elements, the first being the data source and the last the destination.  Any elements in the middle are filters, and could apply compression or encryption, for example.  Elements are automatically connected to one another using the most efficient means available.&lt;/p&gt;
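
&lt;p&gt;The source / filter / destination shape can be sketched with Python 3 generators.  To be clear, this is a toy model and not the XFA itself (which lives in Amanda&#039;s C and Perl code); every name below is hypothetical, and the elements here are simply chained in-process rather than connected by &quot;the most efficient means available&quot;.&lt;/p&gt;

&lt;pre&gt;
```python
import zlib


def source(data, chunk_size):
    """Source element: yield the data in fixed-size chunks."""
    for start in range(0, len(data), chunk_size):
        yield data[start:start + chunk_size]


def compress_filter(chunks):
    """Filter element: compress the stream as it passes through."""
    compressor = zlib.compressobj()
    for chunk in chunks:
        out = compressor.compress(chunk)
        if out:
            yield out
    yield compressor.flush()


def destination(chunks):
    """Destination element: collect everything into one bytestring."""
    return b"".join(chunks)


def run_transfer(src, filters, dest):
    """Connect source -> filters -> destination and run to completion."""
    stream = src
    for element in filters:
        stream = element(stream)
    return dest(stream)


payload = b"amanda " * 1000
stored = run_transfer(source(payload, 512), [compress_filter], destination)
```
&lt;/pre&gt;

&lt;p&gt;Swapping in a different source or adding a second filter does not touch the other elements, which is the property the real architecture is after.&lt;/p&gt;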

&lt;p&gt;Transfers operate &lt;i&gt;asynchronously&lt;/i&gt;, meaning that ordinary execution of Amanda continues in parallel to the movement of data.  When the transfer needs Amanda&#039;s attention, it sends a message, and Amanda reacts accordingly.&lt;/p&gt;

&lt;p&gt;The beauty of this architecture is in the variety of elements that can be connected.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Sources
&lt;ul&gt;
&lt;li&gt;File or socket&lt;/li&gt;
&lt;li&gt;DirectTCP Connection&lt;/li&gt;
&lt;li&gt;Holding Disk&lt;/li&gt;
&lt;li&gt;Spanned dumpfile&lt;/li&gt;
&lt;li&gt;Random or repeated patterns (for testing)&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;Filters
&lt;ul&gt;
&lt;li&gt;In-process compression or encryption (e.g., libgz)&lt;/li&gt;
&lt;li&gt;External utilities (e.g., gzip or amgpgcrypt)&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;Destinations
&lt;ul&gt;
&lt;li&gt;File or socket&lt;/li&gt;
&lt;li&gt;DirectTCP Connection&lt;/li&gt;
&lt;li&gt;Spanned Dumpfile&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The full list is given &lt;a onclick=&quot;_gaq.push([&#039;_trackPageview&#039;, &#039;/extlink/wiki.zmanda.com/pod/Amanda/Xfer.html&#039;]);&quot;  href=&quot;http://wiki.zmanda.com/pod/Amanda/Xfer.html&quot;&gt;in the POD&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;Benefits&lt;/h2&gt;

&lt;p&gt;The advantage of the transfer architecture is that it massively simplifies the process of transferring data.  Nowhere is this more obvious than in the splitting and re-joining of dumpfiles.&lt;/p&gt;

&lt;p&gt;The &lt;i&gt;Amanda::Xfer::Recovery::Source&lt;/i&gt; element, which reads from spanned dumpfiles, cooperates with the &lt;i&gt;&lt;a onclick=&quot;_gaq.push([&#039;_trackPageview&#039;, &#039;/extlink/wiki.zmanda.com/pod/Amanda/Recovery/Clerk.html&#039;]);&quot;  href=&quot;http://wiki.zmanda.com/pod/Amanda/Recovery/Clerk.html&quot;&gt;Clerk&lt;/a&gt;&lt;/i&gt;, via transfer messages, to load the proper volumes and seek to the proper files to recover an entire dumpfile, even if it is distributed over more than one source volume.  Similarly, &lt;i&gt;Amanda::Xfer::Taper::Dest&lt;/i&gt; works with a &lt;i&gt;&lt;a onclick=&quot;_gaq.push([&#039;_trackPageview&#039;, &#039;/extlink/wiki.zmanda.com/pod/Amanda/Taper/Scribe.html&#039;]);&quot;  href=&quot;http://wiki.zmanda.com/pod/Amanda/Taper/Scribe.html&quot;&gt;Scribe&lt;/a&gt;&lt;/i&gt; to load volumes and update the catalog while spanning dumpfiles.&lt;/p&gt;

&lt;p&gt;So the Amanda taper is a simple wrapper around a transfer from holding disk (FILE-WRITE) or socket (PORT-WRITE) to a spanned dumpfile, using a scribe.  Similarly, all of the recovery tools use a clerk and a transfer to read dumpfiles off the appropriate volumes.  And &lt;a onclick=&quot;_gaq.push([&#039;_trackPageview&#039;, &#039;/extlink/wiki.zmanda.com/man/amvault.8.html&#039;]);&quot;  href=&quot;http://wiki.zmanda.com/man/amvault.8.html&quot;&gt;amvault&lt;/a&gt; combines the two to simultaneously read from one volume and write to another.&lt;/p&gt;

&lt;h2&gt;Future&lt;/h2&gt;

&lt;p&gt;At this point, the transfer architecture is a reliable abstraction, but it is not yet terribly efficient.  The advantage of the abstraction, though, is that as it is made more efficient, all of the components of Amanda that make use of it will immediately become faster, with no changes.&lt;/p&gt;

&lt;p&gt;There are plenty of places in Amanda where the transfer architecture will be useful, and certainly plenty of work to do to make it faster.  If you&#039;re interested in helping out, please let me know!&lt;/p&gt;
 
    </content:encoded>

    <pubDate>Fri, 12 Mar 2010 14:06:47 -0600</pubDate>
    <guid isPermaLink="false">http://code.v.igoro.us/archives/49-guid.html</guid>
    
</item>
<item>
    <title>What's New in Amanda: Automated Tests</title>
    <link>http://code.v.igoro.us/archives/40-Whats-New-in-Amanda-Automated-Tests.html</link>
            <category>amanda</category>
    
    <comments>http://code.v.igoro.us/archives/40-Whats-New-in-Amanda-Automated-Tests.html#comments</comments>
    <wfw:comment>http://code.v.igoro.us/wfwcomment.php?cid=40</wfw:comment>

    <slash:comments>0</slash:comments>
    <wfw:commentRss>http://code.v.igoro.us/rss.php?version=2.0&amp;type=comments&amp;cid=40</wfw:commentRss>
    

    <author>nospam@example.com (Dustin J. Mitchell)</author>
    <content:encoded>
    &lt;p&gt;This is the first in what will be a series of posts about recent work on &lt;a onclick=&quot;_gaq.push([&#039;_trackPageview&#039;, &#039;/extlink/amanda.org&#039;]);&quot;  href=&quot;http://amanda.org&quot;&gt;Amanda&lt;/a&gt;.  Amanda has a reputation as old and crusty -- not so!  Hopefully this series will help to illustrate some of the new features we&#039;ve completed, and what&#039;s coming up.  I&#039;ll be cross-posting these on &lt;a onclick=&quot;_gaq.push([&#039;_trackPageview&#039;, &#039;/extlink/www.zmanda.com/blogs/&#039;]);&quot;  href=&quot;http://www.zmanda.com/blogs/&quot;&gt;the Zmanda Team Blog&lt;/a&gt; too.&lt;/p&gt;

&lt;div style=&quot;float: left&quot;&gt;&lt;img src=&quot;http://chart.apis.google.com/chart?cht=bvg&amp;chs=180x150&amp;chtt=Test+Count&amp;chbh=40,8,8&amp;chd=t:91,915,2936&amp;chds=0,4000&amp;chxt=x,r&amp;chxl=0:|v2.6.0|v2.6.1|v3.1.0&amp;chxr=1,0,3500,1000&amp;chco=008000&amp;chxs=0,000000,12,0,lt|1,000080,10,1,lt&quot;&gt;&lt;/div&gt;

&lt;p&gt;Among open-source applications, Amanda is known for being stable and highly reliable.  To ensure that Amanda lives up to this reputation, we&#039;ve constructed an automated testing framework (using &lt;a onclick=&quot;_gaq.push([&#039;_trackPageview&#039;, &#039;/extlink/buildbot.net&#039;]);&quot;  href=&quot;http://buildbot.net&quot;&gt;Buildbot&lt;/a&gt;) that runs on every commit.  I&#039;ll give some of the technical details after the jump, but I think the numbers speak for themselves.  The latest release of Amanda (which will soon be 3.1.0) has 2936 tests!&lt;/p&gt;

&lt;p&gt;These tests range from highly-focused unit tests, for example to ensure that all of Amanda&#039;s spellings of &quot;true&quot; are parsed correctly, all the way up to full integration: runs of amdump and the recovery applications. The tests are implemented with Perl&#039;s &lt;tt&gt;Test::More&lt;/tt&gt; and &lt;tt&gt;Test::Harness&lt;/tt&gt;.  The result for the &lt;a onclick=&quot;_gaq.push([&#039;_trackPageview&#039;, &#039;/extlink/github.com/zmanda/amanda/commit/deb1d40c203906bd949789c4fab08172d54c49cc&#039;]);&quot;  href=&quot;http://github.com/zmanda/amanda/commit/deb1d40c203906bd949789c4fab08172d54c49cc&quot;&gt;current trunk&lt;/a&gt; looks like this:&lt;/p&gt;

&lt;pre style=&quot;font-size: 50%&quot;&gt;
=setupcache.....................ok
Amanda_Archive..................ok
Amanda_Changer..................ok
Amanda_Changer_compat...........ok
Amanda_Changer_disk.............ok
Amanda_Changer_multi............ok
Amanda_Changer_ndmp.............ok
Amanda_Changer_null.............ok
Amanda_Changer_rait.............ok
Amanda_Changer_robot............ok
Amanda_Changer_single...........ok
Amanda_ClientService............ok
Amanda_Cmdline..................ok
Amanda_Config...................ok
Amanda_Curinfo..................ok
Amanda_DB_Catalog...............ok
Amanda_Debug....................ok
Amanda_Device...................ok
        211/428 skipped: various reasons
Amanda_Disklist.................ok
Amanda_Feature..................ok
Amanda_Header...................ok
Amanda_Holding..................ok
Amanda_IPC_Binary...............ok
Amanda_IPC_LineProtocol.........ok
Amanda_Logfile..................ok
Amanda_MainLoop.................ok
Amanda_NDMP.....................ok
Amanda_Process..................ok
Amanda_Recovery_Clerk...........ok
Amanda_Recovery_Planner.........ok
Amanda_Recovery_Scan............ok
Amanda_Report...................ok
Amanda_Tapelist.................ok
Amanda_Taper_Scan...............ok
Amanda_Taper_Scan_traditional...ok
Amanda_Taper_Scribe.............ok
Amanda_Util.....................ok
Amanda_Xfer.....................ok
amadmin.........................ok
amarchiver......................ok
amcheck.........................ok
amcheck-device..................ok
amcheckdump.....................ok
amdevcheck......................ok
amdump..........................ok
amfetchdump.....................ok
amgetconf.......................ok
amgtar..........................ok
amidxtaped......................ok
amlabel.........................ok
ampgsql.........................ok
        40/40 skipped: various reasons
amraw...........................ok
amreport........................ok
amrestore.......................ok
amrmtape........................ok
amservice.......................ok
amstatus........................ok
amtape..........................ok
amtapetype......................ok
bigint..........................ok
mock_mtx........................ok
noop............................ok
pp-scripts......................ok
taper...........................ok
All tests successful, 251 subtests skipped.
Files=64, Tests=2936, 429 wallclock secs (155.44 cusr + 31.48 csys = 186.92 CPU)
&lt;/pre&gt;

&lt;p&gt;The skips are due to tests that require external resources - tape drives, database servers, etc.  The first part of the list contains tests for almost all Perl packages in the &lt;tt&gt;Amanda&lt;/tt&gt; namespace.  These are generally unit tests of the new Perl code, although some tests integrate several units due to limitations of the interfaces.  The second half of the list contains tests of Amanda command-line tools.  These are integration tests, and ensure that all of the documented command-line options are present and working, and that the tool&#039;s behavior is correct.  The integration tests are necessarily incomplete, as it&#039;s simply not possible to test every permutation of this highly flexible package.&lt;/p&gt;

&lt;p&gt;The &lt;tt&gt;=setupcache&lt;/tt&gt; test at the top is interesting: because most of the Amanda applications need some dumps to work against, we &quot;cache&quot; a few completed amdump runs using tar, and re-load them as needed during the subsequent tests.  This speeds things up quite a bit, and also removes some variability from the tests (there are a &lt;i&gt;lot&lt;/i&gt; of ways an amdump can go wrong!).&lt;/p&gt;
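&lt;p&gt;The caching idea can be sketched in a few lines.  This is only an illustration, written in Python rather than the Perl that Amanda&#039;s tests use; the helper names &lt;tt&gt;save_cache&lt;/tt&gt; and &lt;tt&gt;load_cache&lt;/tt&gt; and the directory layout are hypothetical, not taken from the Amanda source:&lt;/p&gt;

&lt;pre&gt;
```python
import os
import shutil
import tarfile
import tempfile


def save_cache(config_dir, cache_file):
    """Archive the state left behind by a completed run (hypothetical helper)."""
    with tarfile.open(cache_file, "w:gz") as tar:
        # arcname="." stores member paths relative to config_dir
        tar.add(config_dir, arcname=".")


def load_cache(cache_file, config_dir):
    """Replace config_dir with the archived state so a test starts clean."""
    if os.path.isdir(config_dir):
        shutil.rmtree(config_dir)
    os.makedirs(config_dir)
    with tarfile.open(cache_file, "r:gz") as tar:
        tar.extractall(config_dir)


if __name__ == "__main__":
    work = tempfile.mkdtemp()
    config_dir = os.path.join(work, "TESTCONF")
    os.makedirs(config_dir)
    # Stand-in for the output of an expensive dump run
    with open(os.path.join(config_dir, "log.0"), "w") as f:
        f.write("dump finished\n")

    cache_file = os.path.join(work, "cache.tgz")
    save_cache(config_dir, cache_file)

    # A later test disturbs the state, then restores the cached copy
    os.remove(os.path.join(config_dir, "log.0"))
    load_cache(cache_file, config_dir)
    print(os.listdir(config_dir))
    shutil.rmtree(work)
```
&lt;/pre&gt;

&lt;p&gt;Restoring from an archive means each test pays only for an unpack rather than a full dump, and every test sees the same starting state.&lt;/p&gt;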

&lt;p&gt;The entire test suite is run at least 54 times for every commit by &lt;a onclick=&quot;_gaq.push([&#039;_trackPageview&#039;, &#039;/extlink/buildbot.net&#039;]);&quot;  href=&quot;http://buildbot.net&quot;&gt;Buildbot&lt;/a&gt;.  We test on 42 different architectures - about a dozen Linux distros, in both 32- and 64-bit varieties, plus Solaris 8 and 10, and Darwin-8.10.1 on both x86 and PowerPC.  The remaining tests are for special configurations -- server-only, client-only, special runs on a system with several tape drives, and so on.&lt;/p&gt;
 
    </content:encoded>

    <pubDate>Fri, 12 Mar 2010 12:03:52 -0600</pubDate>
    <guid isPermaLink="false">http://code.v.igoro.us/archives/40-guid.html</guid>
    
</item>
<item>
    <title>Testing Legacy Code</title>
    <link>http://code.v.igoro.us/archives/38-Testing-Legacy-Code.html</link>
            <category>amanda</category>
            <category>buildbot</category>
    
    <comments>http://code.v.igoro.us/archives/38-Testing-Legacy-Code.html#comments</comments>
    <wfw:comment>http://code.v.igoro.us/wfwcomment.php?cid=38</wfw:comment>

    <slash:comments>0</slash:comments>
    <wfw:commentRss>http://code.v.igoro.us/rss.php?version=2.0&amp;type=comments&amp;cid=38</wfw:commentRss>
    

    <author>nospam@example.com (Dustin J. Mitchell)</author>
    <content:encoded>
    &lt;p&gt;I just read Roy Osherove&#039;s &lt;a onclick=&quot;_gaq.push([&#039;_trackPageview&#039;, &#039;/extlink/www.amazon.com/Art-Unit-Testing-Examples-Net/dp/1933988274&#039;]);&quot;  href=&quot;http://www.amazon.com/Art-Unit-Testing-Examples-Net/dp/1933988274&quot;&gt;The Art of Unit Testing with Examples in .NET&lt;/a&gt;, on the advice of a slashdot review.  I was not terribly impressed with the book, but reading it did help me to solidify my thinking about testing and test-driven development, and put words to concepts I had come to on my own.&lt;/p&gt;

&lt;p&gt;Rather late in the book, Osherove describes three properties of good tests. 
&lt;ul&gt;
&lt;li&gt;&lt;i&gt;Trustworthiness&lt;/i&gt; - Do developers believe that passing tests mean things are working? Do developers believe that failing tests indicate a real bug?&lt;/li&gt;
&lt;li&gt;&lt;i&gt;Maintainability&lt;/i&gt; - Do developers think that tests are easy to add and maintain, or are they likely to avoid writing tests when rushed?&lt;/li&gt;
&lt;li&gt;&lt;i&gt;Readability&lt;/i&gt; - Do developers often consult the unit tests to see how the system under test is supposed to work?&lt;/li&gt;
&lt;/ul&gt;&lt;/p&gt;

&lt;p&gt;What most struck me was that these properties were related to developers&#039; perceptions of the tests, not the tests themselves.  Tests are as much a social artifact of a project as a technical tool.&lt;/p&gt;

&lt;h2&gt;Buildbot&#039;s Tests&lt;/h2&gt;

&lt;p&gt;Around the time I was reading this, one of the more prolific Buildbot contributors commented, &quot;I try not to change the tests - they scare me.&quot;  Buildbot&#039;s tests were badly isolated, slow, and failed intermittently.  As maintainer, I had grown accustomed to saying &quot;oh, that test fails sometimes, don&#039;t worry about it&quot; - a trustworthiness failure.  Because of the terrible isolation, changing just about anything in Buildbot would cause dozens of tests to fail, requiring repetitive editing to fix - not maintainable.  And the tests consisted of long sequences of operations and assertions, written in the Twisted style, which is already not readable.  As a result, even I don&#039;t know what most of the tests are actually testing.  This was a bad situation for any application, but particularly embarrassing for a popular testing tool!&lt;/p&gt;

&lt;p&gt;So I &lt;b&gt;blew the tests away&lt;/b&gt;.  Well, not really - I moved them to &lt;tt&gt;buildbot/broken_test/&lt;/tt&gt; in hopes they can be useful in writing new tests, and so that the braver souls among us can still run them.  Now our &lt;a onclick=&quot;_gaq.push([&#039;_trackPageview&#039;, &#039;/extlink/buildbot.net/metabuildbot/tgrid&#039;]);&quot;  href=&quot;http://buildbot.net/metabuildbot/tgrid&quot;&gt;metabuildbot&lt;/a&gt; is green, and I can legitimately ask for unit tests for new code.&lt;/p&gt;

&lt;p&gt;There are costs associated with this move, too. A lot of people have worked very hard to write tests that have now been categorically labeled &quot;broken,&quot; to whom all I can say is &quot;I&#039;m sorry&quot;.  With far fewer tests and thus far worse coverage, it&#039;s also difficult to have confidence that Buildbot really works.  The short-term workaround is to make a few &lt;a onclick=&quot;_gaq.push([&#039;_trackPageview&#039;, &#039;/extlink/comments.gmane.org/gmane.comp.python.buildbot.devel/5703&#039;]);&quot;  href=&quot;http://comments.gmane.org/gmane.comp.python.buildbot.devel/5703&quot;&gt;beta releases&lt;/a&gt; and rely on real-world testing to suss out any problems.&lt;/p&gt;

&lt;p&gt;So this is only the first step.  We - I - still need to write real tests for the vast majority of the Buildbot code.  That&#039;s particularly complicated because Buildbot&#039;s units are badly isolated, and interfaces are ill-defined.  I will need to do a good bit of refactoring to bring it into compliance.&lt;/p&gt; 
 
    </content:encoded>

    <pubDate>Wed, 17 Feb 2010 12:26:15 -0600</pubDate>
    <guid isPermaLink="false">http://code.v.igoro.us/archives/38-guid.html</guid>
    
</item>
<item>
    <title>Revising the allowForce option</title>
    <link>http://code.v.igoro.us/archives/37-Revising-the-allowForce-option.html</link>
            <category>buildbot</category>
    
    <comments>http://code.v.igoro.us/archives/37-Revising-the-allowForce-option.html#comments</comments>
    <wfw:comment>http://code.v.igoro.us/wfwcomment.php?cid=37</wfw:comment>

    <slash:comments>0</slash:comments>
    <wfw:commentRss>http://code.v.igoro.us/rss.php?version=2.0&amp;type=comments&amp;cid=37</wfw:commentRss>
    

    <author>nospam@example.com (Dustin J. Mitchell)</author>
    <content:encoded>
    &lt;p&gt;Buildbot&#039;s WebStatus display has, for a long time, had an &lt;tt&gt;allowForce&lt;/tt&gt; option which controls what kind of mayhem can be wrought via the web interface.  Historically, this has been a boolean option: either web users can do everything (force builds and shut down slaves) or nothing.  &lt;a onclick=&quot;_gaq.push([&#039;_trackPageview&#039;, &#039;/extlink/buildbot.net/trac/ticket/701&#039;]);&quot;  href=&quot;http://buildbot.net/trac/ticket/701&quot;&gt;Bug 701&lt;/a&gt; asks that we change that to give more granular access control.&lt;/p&gt;

&lt;p&gt;Buildbot has an interesting way of separating the status display from the control functionality.  It has two parallel interface hierarchies, IStatus and IControl, implementing the necessary methods.  The IStatus hierarchy is illustrated with the orange bubbles here:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;http://djmitche.github.com/buildbot/docs/0.7.12/images/status.png&quot; alt=&quot;&quot; width=&quot;80%&quot; /&gt;&lt;/p&gt;

&lt;p&gt;The IControl hierarchy is similar, although it only goes down to the Build level right now.&lt;/p&gt;

&lt;p&gt;When &lt;tt&gt;allowForce&lt;/tt&gt; is true, the WebStatus object adapts the buildmaster to the IControl interface and adds a link to the result in its &lt;tt&gt;control&lt;/tt&gt; attribute.  Forcing a build or shutting down a slave then uses this object to navigate to the appropriate control object and calls a method from the corresponding interface.  If the &lt;tt&gt;control&lt;/tt&gt; attribute is None, no access is allowed.&lt;/p&gt;

&lt;p&gt;This scheme has the advantage that it is difficult to accidentally expose functionality, since when &lt;tt&gt;allowForce&lt;/tt&gt; is false, the control methods are inaccessible.  However, it has the disadvantage of not allowing any more granular level of access control.&lt;/p&gt;
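&lt;p&gt;The all-or-nothing pattern is easy to sketch.  The classes below are hypothetical stand-ins, not Buildbot&#039;s actual code or interfaces; they only illustrate how a &lt;tt&gt;control&lt;/tt&gt; attribute of None leaves the control methods unreachable:&lt;/p&gt;

&lt;pre&gt;
```python
class MasterControl(object):
    """Hypothetical stand-in for an object providing the control interface."""

    def __init__(self):
        self.forced = []

    def force_build(self, builder_name):
        self.forced.append(builder_name)
        return "forced build on %s" % builder_name


class WebStatusSketch(object):
    """Hypothetical stand-in for the web status display."""

    def __init__(self, master_control, allow_force=False):
        # The only reference to the control object lives here; when
        # allow_force is false there is nothing to call through.
        if allow_force:
            self.control = master_control
        else:
            self.control = None

    def handle_force(self, builder_name):
        if self.control is None:
            return "forcing builds is not allowed"
        return self.control.force_build(builder_name)


if __name__ == "__main__":
    master = MasterControl()
    print(WebStatusSketch(master, allow_force=True).handle_force("linux"))
    print(WebStatusSketch(master, allow_force=False).handle_force("linux"))
```
&lt;/pre&gt;

&lt;p&gt;Because the decision is made once, when the status object is constructed, there is no place in this sketch to say &quot;this user may force builds but not shut down slaves&quot;.&lt;/p&gt;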

&lt;p&gt;I &lt;a onclick=&quot;_gaq.push([&#039;_trackPageview&#039;, &#039;/extlink/github.com/djmitche/buildbot/commit/7572c5bdad4a09393b665fff2939e605df58deb1&#039;]);&quot;  href=&quot;http://github.com/djmitche/buildbot/commit/7572c5bdad4a09393b665fff2939e605df58deb1&quot;&gt;just reworked&lt;/a&gt; the web status to have a more flexible authorization mechanism, and while I wasn&#039;t able to remove the IControl hierarchy entirely, I was able to marginalize it to only those code blocks that need to perform controlled actions, instead of passing control objects all over the place. &lt;/p&gt;
 
    </content:encoded>

    <pubDate>Sat, 13 Feb 2010 15:08:29 -0600</pubDate>
    <guid isPermaLink="false">http://code.v.igoro.us/archives/37-guid.html</guid>
    
</item>

</channel>
</rss>
