Wednesday, November 16. 2011

IT and Community

Mozilla's IT team is pivoting to a more community-focused approach. Our director of IT, mrz, has been writing extensively about it over the last few weeks.

As you can imagine, the difficult part of this is to balance security with accessibility. We'd like to be open, but we can't give the keys to the kingdom out to anyone who promises to help. The approach we're taking is to treat volunteers as we would part-time employees - post positions, interview, and then supervise to gain trust. This is a fairly common model, actually, for any organization with volunteers and a need for security. Youth programs, for example, generally do an interview and background check with new volunteers, and those volunteers will be paired with senior volunteers or staff for a while.

However, it's a bit cumbersome, both for Mozilla and for potential volunteers. We must design entire positions - ongoing tasks or roles that a volunteer can work on for an extended period of time - and then select a limited number of volunteers to fill those roles. For potential volunteers, an application and interview can mean a long time (weeks?) before they get to do anything hands-on. It also carries the risk that we'd have to turn a qualified volunteer away due to lack of suitable positions.

So what to do?

We need a more fluid way of interacting with potential contributors. Since our bug database is public, we can begin by simply tagging a few bugs that are appropriate for newcomers -- things that don't require sensitive access and are well-encapsulated so they can be completed without extensive knowledge of Mozilla's infrastructure.

Here's the list.

It's a bit short right now. There are a few things that may help:

  • We can get better about identifying appropriate tasks and projects and making bugs out of them.
  • We can identify a means of giving limited or sandboxed access to a new volunteer.
  • Consumers of Mozilla's IT resources can begin tagging bugs, where Mozilla can provide the resources and volunteers can do the heavy lifting - got any ideas?

Friday, September 2. 2011

Subscribe to a google group with a different address?

Google Groups is one place where, IMHO, Google pushes its hegemony too far, making it difficult to use. I wanted to subscribe to puppet-users with my Mozilla address, but since I have a Google account, Groups assumes I want to subscribe with that address. No!

I found the fix with a bit of Googling (some irony there). It involves editing a URL:

http://groups.google.com/group/puppet-users/boxsubscribe?email=email@domain.com

where you'd substitute the name of the group you want for puppet-users and add your email at the end.
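
If you do this for more than one group, the substitution is easy to script. Here is a minimal sketch in Python 3; the function name `subscribe_url` is my own invention, and the only thing it relies on is the URL pattern shown above.

```python
from urllib.parse import quote


def subscribe_url(group, email):
    """Build the boxsubscribe URL for a group and an arbitrary address."""
    # The group name goes in the path; the address is a query parameter.
    # "@" is left unescaped so the result matches the hand-edited URL.
    return "http://groups.google.com/group/%s/boxsubscribe?email=%s" % (
        quote(group, safe=""),
        quote(email, safe="@"),
    )


print(subscribe_url("puppet-users", "email@domain.com"))
```

Paste the printed URL into a browser to finish the subscription.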

Friday, May 20. 2011

Nagios NSCA from Python

I've been working on improving the monitoring of the build slaves at Mozilla. As part of this project, I needed to be able to submit passive check results to the Nagios servers via NSCA during system startup. I'm doing this from a Python script that needs to run on a wide array of systems using whatever random Python is available. We run some oddball stuff, so the common denominator is Python 2.4.

It turns out that there's no Python NSCA library, although there is Net::Nsca in Perl. So, I wrote one, and put it on github: https://github.com/djmitche/pynsca.

At the moment, this only knows XOR, and only does service checks. That's all I need, but hopefully it can be easily expanded to cover other purposes. The one thing I want to avoid is adding mandatory requirements -- this should work, at least in plain-text and XOR modes, on a plain-vanilla Python installation.
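
To give a feel for why XOR mode needs nothing beyond the standard library, here is a rough sketch of the idea. It assumes that XOR mode combines the packet with a server-supplied initialization vector and then with the shared password, each repeated as needed. The function name `xor_obfuscate` is hypothetical and this is not code from pynsca; it is also written for Python 3 rather than the Python 2.4 the real script targets.

```python
from itertools import cycle


def xor_obfuscate(packet, iv, password=b""):
    """XOR packet bytes against a repeating IV, then a repeating password.

    Assumption: this mirrors the general shape of NSCA's XOR mode; it is
    not taken from the NSCA or pynsca sources.
    """
    if not iv:
        raise ValueError("iv must not be empty")
    out = bytes(b ^ k for b, k in zip(packet, cycle(iv)))
    if password:
        out = bytes(b ^ k for b, k in zip(out, cycle(password)))
    return out


# XOR is its own inverse, so applying the function twice restores the input.
scrambled = xor_obfuscate(b"service OK", b"\x10\x20\x30", b"secret")
print(xor_obfuscate(scrambled, b"\x10\x20\x30", b"secret"))
```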

By the way, the startup script I'm working on is runslave.py, which includes a modified copy of pynsca and does a number of other housekeeping jobs as well. More on that in a subsequent post.

Saturday, January 22. 2011

Amanda's Transfer Mechanisms

There's been a bit of confusion on the mailing list and IRC about how Amanda assembles transfers out of transfer elements, and how transfer mechanisms influence that.

In the final form of a transfer, any two adjacent elements must have the same mechanism. For example, an upstream element speaking XFER_MECH_PUSH_BUFFER cannot talk to a downstream element using XFER_MECH_READFD (nor, more confusingly, XFER_MECH_PULL_BUFFER). So each mechanism is an isolated definition of "here's how upstream and downstream should talk". They come in pairs because generally anything upstream can do to downstream (e.g., upstream can write to downstream's fd) can occur in reverse (e.g., downstream can read from upstream's fd).

What makes this confusing is that if you specify a set of elements which can't talk directly to one another, then xfer.c will add "glue" elements between the specified elements. To make that concrete, imagine you specify a transfer as

source-holding --> filter-xor --> dest-fd
(if you like practical examples, then imagine filter-xor is a buffer-based decompression filter, and you're pulling data from holding disk, decompressing, and sending to a pipe -- something amfetchdump would do). Here are the mechanisms supported by each element:
source-holding:
 XFER_MECH_PULL_BUFFER
filter-xor:
 XFER_MECH_PULL_BUFFER (input) and
 XFER_MECH_PULL_BUFFER (output)
or
 XFER_MECH_PUSH_BUFFER (input) and
 XFER_MECH_PUSH_BUFFER (output)
dest-fd:
 XFER_MECH_WRITEFD (input)

In putting these together, source-holding and filter-xor can use the same mechanism (PULL_BUFFER). This leaves filter-xor using PULL_BUFFER for output, but dest-fd does not support this. So xfer.c adds a glue element that can speak PULL_BUFFER on input and WRITEFD on output. This element basically loops in a thread, calling upstream->pull_buffer and write(downstream->input_fd, buffer). So the final xfer looks like

 source-holding --(PULL_BUFFER)--> filter-xor --(PULL_BUFFER)--> glue --(WRITEFD)--> dest-fd

Hopefully that helps to explain how the glue works.

Note that one of the cool things about this arrangement is that in most cases the complexity is in the glue, not the elements. In fact, in this case the glue provides the only thread that's required to run this transfer, so the other three element implementations don't need to manage threads at all.
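
For those who think better in code, here is a toy model of the linking step in Python. It is only a sketch: the data model, the `plan` function, and the brute-force search for the arrangement with the fewest glue elements are all my own assumptions for illustration, not how xfer.c is written. The element and mechanism names come from the example above.

```python
from itertools import product

PULL_BUFFER, PUSH_BUFFER, WRITEFD = "PULL_BUFFER", "PUSH_BUFFER", "WRITEFD"

# Each element lists the (input, output) mechanism pairs it supports.
# None marks the side a source or destination does not have.
ELEMENTS = {
    "source-holding": [(None, PULL_BUFFER)],
    "filter-xor": [(PULL_BUFFER, PULL_BUFFER), (PUSH_BUFFER, PUSH_BUFFER)],
    "dest-fd": [(WRITEFD, None)],
}


def plan(names):
    """Pick one mechanism pair per element, adding glue where they differ."""

    def glue_count(combo):
        # A glue element is needed wherever upstream's output mechanism
        # differs from downstream's input mechanism.
        return sum(1 for up, down in zip(combo, combo[1:]) if up[1] != down[0])

    # Hypothetical strategy: try every combination, keep the one needing
    # the least glue.
    best = min(product(*(ELEMENTS[n] for n in names)), key=glue_count)

    parts = [names[0]]
    for name, up, down in zip(names[1:], best, best[1:]):
        if up[1] == down[0]:
            parts.append("--(%s)--> %s" % (up[1], name))
        else:
            parts.append("--(%s)--> glue --(%s)--> %s" % (up[1], down[0], name))
    return " ".join(parts)


print(plan(["source-holding", "filter-xor", "dest-fd"]))
```

The printed transfer matches the one drawn above: filter-xor settles on PULL_BUFFER because that choice needs one glue element, where PUSH_BUFFER would need two.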

Thursday, November 11. 2010

virtualenv for Perl

I absolutely love virtualenv for Python development. It allows me to develop Buildbot against several versions of Python and several versions of its dependencies, without modifying my system's Python installation at all!

Now, I need to do the same thing in Perl. So I thought I'd compare the two side-by-side.


Continue reading "virtualenv for Perl"

Wednesday, November 10. 2010

Firefox 4.0b7 - a few tweaks

The new beta of Firefox 4.0 was released today. I'm not quite willing to run Minefield (nightlies), so I've been eagerly awaiting this beta to fix some nagging but not show-stopper bugs in 4.0b6. One of those involved bad interactions of App Tabs with Panorama. Now the app tabs nicely decorate the side of each tab set in the panorama view.

Another nice thing is that the Option-Space key combination, which opened panorama in 4.0b6, no longer does so. That's OK - I found that to be too easy to press anyway. It's now bound to Command-E (right there at the top of the "View" menu).

Panorama has also been re-bound to swipe-up and swipe-down, which makes me less happy. In most apps on the Mac, those swipes are equivalent to the "Home" and "End" keystrokes -- they scroll to the top or bottom of the current page. So with a little help from my new co-workers, I discovered the settings to fix that.

The full list of gesture bindings is written up here, but the two I needed to change are browser.gesture.swipe.up and .down. The scrolling commands to bind to them are cmd_scrollTop and cmd_scrollBottom.
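
If you would rather keep these settings in a file than click through about:config, the same two preferences can be written as `user_pref` lines. The sketch below just prints those lines; putting them in a `user.js` file in the profile directory is my assumption about how you might apply them, and the helper name `user_js_lines` is made up.

```python
import json

# Preference names and commands are the ones mentioned above.
GESTURE_PREFS = {
    "browser.gesture.swipe.up": "cmd_scrollTop",
    "browser.gesture.swipe.down": "cmd_scrollBottom",
}


def user_js_lines(prefs):
    """Render preferences as user_pref(...) lines, sorted by name."""
    # json.dumps supplies the double quotes and escaping for each string.
    return [
        "user_pref(%s, %s);" % (json.dumps(name), json.dumps(value))
        for name, value in sorted(prefs.items())
    ]


for line in user_js_lines(GESTURE_PREFS):
    print(line)
```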

4.0 has a number of other great UI enhancements, too. I'm excited to see 4.0 finally released!

[edit: fixed formatting]

Friday, October 22. 2010

irssi settings for status-in-nick

I am getting started in releng at Mozilla, and IRC provides a central meeting-place for the group. As such, indicating your status to this group is important, so others can know whether you're nearby to answer a question or take care of a problem. This is generally done by adding suffixes to the IRC nickname, e.g., "dustin|afk" or "dustin|lunch".

Before I go further, I know that this is frowned upon, and even results in autoignores, in some corners of IRC. If that's the case for you, read no further.


Continue reading "irssi settings for status-in-nick"

Friday, July 16. 2010

IPv6 and Amanda

Amanda joined the IPv6 revolution in November 2006 - all of the BSD-style authentication mechanisms can support IPv6 endpoints. However, it's generally agreed that this was a mistake, and in this post I will talk about why that's the case.


Continue reading "IPv6 and Amanda"

Saturday, July 10. 2010

SSH With Snow Leopard

I just upgraded my Macbook to Snow Leopard, and the upgrade has changed the way SSH authentication works. I have set up a system I like quite a bit, now, and thought I would share.


Continue reading "SSH With Snow Leopard"

Thursday, July 8. 2010

What's New in Amanda: The End of Fragmentation

Most of my posts in this series have been about features that are available in a released version of Amanda. This time, I want to share a project I'm working on right now - one that will be available in Amanda-3.2. I'm reworking the way Amanda writes its data to tape (or any other kind of storage) to make it more efficient, more reliable, and simpler to configure.

Historically, Amanda's conservative approach to finicky tape hardware has meant that it wasted some space at the end of each tape. With the changes I'm working on, Amanda will no longer waste this space, and can also avoid some needless copying of data in most cases, with a minimum of additional risk.


Continue reading "What's New in Amanda: The End of Fragmentation"

Thursday, July 1. 2010

What's New in Amanda: Hackability

It's been a while since I've posted about recent development in Amanda, but it's not for lack of interesting topics!

Today I want to talk a little bit about Amanda's development. Historically, Amanda has always had a small, core group of developers who do the lion's share of the development work. There are probably lots of reasons for this, not least of which is that a backup application isn't the sexiest project on which to spend your spare time. But I think there's a deeper reason, and it has to do with hackability.


Continue reading "What's New in Amanda: Hackability"

Sunday, June 27. 2010

IPv6 Configuration

I've been meaning to get IPv6 set up on my local network for some time. My only practical reason is that Amanda supports IPv6 and I should test that support. It was also a good chance to re-immerse myself in network configuration, and Hurricane Electric has a neat certification process to add some motivation.


Continue reading "IPv6 Configuration"

Saturday, April 17. 2010

LCD Display and TMP102 sensor

It is incredibly easy to throw things together with an Arduino. It's common to see criticism of the device when it's used in projects that don't require even a fraction of its power, and that might be justified. As a flexible platform for test-driving complex modules, though, the Arduino hits the mark perfectly on flexibility and usability.

I don't have a particular project in mind for my Arduino, but since I don't have a multimeter or an oscilloscope, I want to use it as a test harness for various basic components as I experiment with them. To this end, I bought a cute 16x2 character amber LCD display. With thoughts of building a fermentation-temperature monitor and learning about the I2C bus, I also bought a relatively cheap TMP102 and breakout board. With a little bit of reading, I was able to stitch them together quickly and easily.


Continue reading "LCD Display and TMP102 sensor"

Monday, April 12. 2010

Modern Multiprocessing

I've been thinking a lot lately about the way we accomplish multiprocessing. We've seen a significant change in the operation of Moore's law for CPU speeds: today's CPUs are about the same speed as those of a few years ago, but they have more cores, and more virtual processors on those cores. This is great for heavily-loaded servers, which have plenty of distinct tasks to place on those cores and VCPUs, but not so useful for users working with single-threaded applications.

Why are most applications still single-threaded? There are lots of good reasons. Threaded code is harder to write, and not just because it requires careful analysis and use of synchronization primitives: many common tasks are difficult to meaningfully parallelize without careful control over inter-thread communication, and in a portable application you don't have that kind of control. Threaded code generally performs badly on single-CPU systems, which are still common. Some popular languages still make threading difficult, at least in a portable fashion. And threads are still relatively heavyweight entities in most operating systems: you don't spawn ten threads to mergesort a 100-item array.

Some of these problems will go away with a little more time, but some will get worse. NUMA architectures can make sharing data between threads slow. Hyperthreading and its interaction with processor caches adds yet another level of unpredictability.

We know how to build massively parallel systems that run massively parallel algorithms. What is still unknown is how to build portable, simple software that can run efficiently across a variety of architectures. This is a problem of practice, not theory, and there's lots of interesting work going on in this area.

Of course, there are languages designed explicitly to support communication, such as Limbo, Erlang, Haskell, and Clojure. For the most part, these languages are structured as communicating sequential processes, which is to say that they represent multiprocessing as a set of sequential threads that pass information to one another. Problems of thread safety are subsumed by the languages, but mapping the parallelism to available resources is generally left to the programmer or administrator.
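
The communicating-sequential-processes style can be imitated in a general-purpose language, too. Here is a small sketch in Python, using threads for the sequential processes and `queue.Queue` for the channels. The stage names and the `None` end-of-stream marker are my own conventions for the example, and nothing here enforces the isolation that those languages build in.

```python
import queue
import threading


def producer(out):
    """Emit a short stream of numbers, then an end-of-stream marker."""
    for n in range(5):
        out.put(n)
    out.put(None)


def squarer(inp, out):
    """Square each number received; pass the end marker along and stop."""
    while True:
        n = inp.get()
        if n is None:
            out.put(None)
            return
        out.put(n * n)


def run_pipeline():
    """Wire the two stages together and collect what comes out."""
    first, second = queue.Queue(), queue.Queue()
    threads = [
        threading.Thread(target=producer, args=(first,)),
        threading.Thread(target=squarer, args=(first, second)),
    ]
    for t in threads:
        t.start()
    results = []
    while True:
        item = second.get()
        if item is None:
            break
        results.append(item)
    for t in threads:
        t.join()
    return results


print(run_pipeline())  # [0, 1, 4, 9, 16]
```

Each stage is plain sequential code that shares no state with the others; all coordination happens through the queues. Note that the number of threads is still fixed by the programmer, which is exactly the mapping problem mentioned above.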

One interesting project is Apple's Grand Central Dispatch. It defines a simple but highly expressive closure syntax (a block) and a mechanism to dynamically schedule execution of such closures (queues). Critically, the GCD library takes care of scaling the parallelism of the queue processing appropriately to the underlying hardware. On a single-threaded CPU, this amounts to cooperative multitasking, but on parallel hardware the operating system can dynamically allocate virtual CPUs to applications needing more parallelism.
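
As a loose analogy only (this is not GCD's API), the "submit closures to a queue and let the library size the parallelism" idea looks something like the following in Python. The `run_all` helper is hypothetical, and the pool size is fixed from `os.cpu_count()` when the pool is created, whereas GCD adjusts dynamically with help from the operating system.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def run_all(tasks):
    """Run zero-argument callables on a pool sized to the hardware."""
    # os.cpu_count() may return None, so fall back to a single worker.
    workers = os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(task) for task in tasks]
        # Results come back in submission order, whatever order tasks ran in.
        return [f.result() for f in futures]


# Each lambda is a small closure over its own value of n.
print(run_all([lambda n=n: n * n for n in range(8)]))
```

The calling code never mentions a thread count; it only describes the units of work, which is the part of the GCD model I find appealing.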

This topic seems to come up often in my various pursuits, so I will return to it again.

Want to work on Amanda?

I've not made any secret of the fact that I want more people hacking on Amanda. This is both for selfish reasons -- many hands make light work -- and for altruistic reasons -- a broader community of developers can provide better governance for the project and long-term continuity. With a few notable exceptions, I haven't had a lot of satisfaction.

I think part of the reason is that Amanda has a steep learning curve, even within the new Perl code. The time to climb that curve is a big investment, and folks with only a small itch to scratch can't afford it.

In an effort to sweeten the pot, we (Zmanda) are offering to pay for flexible work on Amanda. Part-time or full-time, on your own schedule. Your choice of projects. Support and gratitude from the other hackers. And the option to become a full Zmanda employee if that's your bent.

Here are some possible projects, to pique your interest:

  • MySQL application (to round out the set with ampgsql)
  • Cyrus Imapd application (gnutar doesn't deal well with the application's tiny files and hard links)
  • OpenSSL for network transport, using certificates and keys for authentication
  • Database-backed backup catalog
  • Amvault upgrade
  • Handle Logical EOM (LEOM) on all devices that support it, drastically reducing the number of parts Amanda writes
  • Support for more cloud backends than just S3
  • Parallel writes to multiple devices

If you're interested, contact me (dustin@zmanda.com) and we'll work something out!

Notice

The postings on this site are my own and don't necessarily represent the opinions of Zmanda, Inc.