2014/11/06

Printing function calls in Perl

Lately my favourite way to interpolate function calls into printed strings is to use the ${\ ... } notation for embedding arbitrary expressions into strings:

print "f($number) = ${\ fibonacci($number) }\n";

Other techniques include separate arguments:

print "f($number) = ", fibonacci($number), "\n";

or a helper variable:

my $result = fibonacci($number);
print "f($number) = $result\n";

or even printf:

printf "f(%d) = %d\n", $number, fibonacci($number);

Of all these techniques I tend to prefer either of the first two, because they lead to putting the expressions "in-line" with the rest of the text string, whereas in the latter two they sit elsewhere, making it harder to see at a glance what gets printed where. Especially with printf's positional arguments, it can be easy to be "off-by-one" with a large number of arguments, and put everything in the wrong place.

In addition, the first technique is not limited to print calls: it relies only on ordinary string interpolation, so it works just as well for constructing strings in general.
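As a small sketch of that last point, here is the same notation used to build strings without printing them directly. The fibonacci() body is only a stand-in I've assumed so the example runs on its own; the related @{[ ... ]} form for lists is a standard Perl idiom rather than something discussed above.

```perl
use strict;
use warnings;

# Stand-in implementation, assumed here so the example is self-contained
sub fibonacci {
    my ($n) = @_;
    my ( $x, $y ) = ( 0, 1 );
    ( $x, $y ) = ( $y, $x + $y ) for 1 .. $n;
    return $x;
}

my $number = 10;

# ${\ ... } interpolates a single scalar result into any string
my $line = "f($number) = ${\ fibonacci($number) }";

# @{[ ... ]} interpolates a whole list, joined by $" (a space by default)
my $list = "first five: @{[ map { fibonacci($_) } 1 .. 5 ]}";

print "$line\n$list\n";
```

One thing to keep in mind is that the expression inside ${\ ... } is evaluated in list context, so a function that behaves differently in scalar context may need an explicit scalar().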

(This post was originally written to answer a question on StackOverflow.)

2014/09/24

Perl module idea for making bitfields

Background

I've been doing a lot of hardware IO work lately, which involves lots of talking to hardware devices, where there are byte-wide registers that store little bitfields, sometimes of individual and unrelated bits packed together. To make this easier I tend to write myself lots of pairs of functions to pack/unpack the fields in one of these bytes; for instance:

use constant {
      MASK_RX_RD      => 1<<6,
      MASK_TX_DS      => 1<<5,
      MASK_MAX_RT     => 1<<4,
      EN_CRC          => 1<<3,
      CRCO            => 1<<2,
      PWR_UP          => 1<<1,
      PRIM_RX         => 1<<0,
};

sub unpack_CONFIG
{
   my ( $config ) = @_;
   return
      MASK_RX_RD  => !!( $config & MASK_RX_RD ),
      MASK_TX_DS  => !!( $config & MASK_TX_DS ),
      MASK_MAX_RT => !!( $config & MASK_MAX_RT ),
      EN_CRC      => !!( $config & EN_CRC ),
      CRCO        => $CRCOs[!!( $config & CRCO )],
      PWR_UP      => !!( $config & PWR_UP ),
      PRIM_RX     => !!( $config & PRIM_RX );
}

sub pack_CONFIG
{
   my %config = @_;
   return
      ( $config{MASK_RX_RD}  ? MASK_RX_RD  : 0 ) |
      ( $config{MASK_TX_DS}  ? MASK_TX_DS  : 0 ) |
      ( $config{MASK_MAX_RT} ? MASK_MAX_RT : 0 ) |
      ( $config{EN_CRC}      ? EN_CRC      : 0 ) |
      ( ( _idx_of $config{CRCO}, @CRCOs )
         // croak "Unsupported 'CRCO'" ) * CRCO |
      ( $config{PWR_UP}      ? PWR_UP      : 0 ) |
      ( $config{PRIM_RX}     ? PRIM_RX     : 0 );
}

This convenient pair of functions bidirectionally converts bytes into key/value lists. As well as containing 6 simple boolean values, there's also an enumerated field, represented by the values in the array @CRCOs. These functions convert those too.

I'm getting tired of writing these pairs of functions. Hey, this is Perl right? :) I should get Perl to write them for me.

TL;DR - I propose the following

I'd like instead to write something like:

use bitfield;

bitfield CONFIG =>
   MASK_RX_RD  => boolfield(6),
   MASK_TX_DS  => boolfield(5),
   MASK_MAX_RT => boolfield(4),
   EN_CRC      => boolfield(3),
   CRCO        => enumfield(2, qw( values here )),
   PWR_UP      => boolfield(1),
   PRIM_RX     => boolfield(0);

The idea here is that the 'bitfield' module automatically exports a function called bitfield, along with the various *field() functions that create field definitions. boolfield() declares a single true/false bit position; enumfield() declares a range of bits that hold an enumeration. I guess an intfield() would also be required, to use a range of bits as an integer, and finally most likely a customfield() taking a pair of conversion CODE refs or somesuch.
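To make the proposal a little more concrete, here is a minimal sketch of how such field definitions could be turned into a pack/unpack pair. Everything in it is an assumption for illustration: make_bitfield() returns two closures instead of installing named functions, the to_bits/from_bits keys are invented, and the enum values are placeholders.

```perl
use strict;
use warnings;
use Carp;

# A field definition is a pair of closures: value -> bits and bits -> value
sub boolfield {
    my ($bit) = @_;
    return {
        to_bits   => sub { $_[0] ? ( 1 << $bit ) : 0 },
        from_bits => sub { !!( $_[0] & ( 1 << $bit ) ) },
    };
}

sub enumfield {
    my ( $shift, @values ) = @_;

    # Number of bits needed to index every entry of @values
    my $width = 1;
    $width++ while ( 1 << $width ) < @values;
    my $mask = ( ( 1 << $width ) - 1 ) << $shift;

    my %idx;
    @idx{@values} = 0 .. $#values;

    return {
        to_bits => sub {
            my $i = $idx{ $_[0] } // croak "Unsupported value '$_[0]'";
            return $i << $shift;
        },
        from_bits => sub { $values[ ( $_[0] & $mask ) >> $shift ] },
    };
}

# Hypothetical generator: takes name => field pairs, returns ( $pack, $unpack )
sub make_bitfield {
    my @defs = @_;
    my ( @names, %field );
    while ( my ( $name, $def ) = splice @defs, 0, 2 ) {
        push @names, $name;
        $field{$name} = $def;
    }

    my $pack = sub {
        my %values = @_;
        my $byte   = 0;
        $byte |= $field{$_}{to_bits}->( $values{$_} ) for @names;
        return $byte;
    };
    my $unpack = sub {
        my ($byte) = @_;
        return map { ( $_ => $field{$_}{from_bits}->($byte) ) } @names;
    };
    return ( $pack, $unpack );
}

my ( $pack_CONFIG, $unpack_CONFIG ) = make_bitfield(
    EN_CRC => boolfield(3),
    CRCO   => enumfield( 2, qw( one two ) ),    # placeholder enum values
    PWR_UP => boolfield(1),
);

my $byte   = $pack_CONFIG->( EN_CRC => 1, CRCO => "two", PWR_UP => 0 );
my %fields = $unpack_CONFIG->($byte);
```

A real module would presumably install pack_CONFIG and unpack_CONFIG into the caller's package by name; returning closures just keeps the sketch short.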

What does anyone think to that?

2014/09/07

Released: Tickit 0.47

New up in Tickit 0.47, some fairly small and incremental updates:

  • Support the 'blink' terminal attribute

    Both the libtickit C library and the Tickit perl module now support the blink attribute, much to my hesitation. ;)

    I'm not sure I want to encourage this sort of thing, but the Neovim project said they wanted this, so I've reluctantly added support for it all the same.

  • Bugfix for renderbuffer 'get*' methods

    When offset and clipping are applied, previously the get* methods didn't pay attention to this, fetching content relative to the toplevel, or segfaulting if requested out of bounds. This has now been fixed.

  • Tickit::Widget::HBox and ::VBox have now been moved to the Tickit-Widgets distribution.

    This dist is now linked to explicitly from the documentation. This supports the longterm goal of turning 'Tickit' into purely the window-layer downwards, and having all the widget support live in its own distribution, backed eventually by its own C library.

  • Nicer handling of fallback terminfo attributes for definitions missing them.

    Certain distributions of terminfo databases seem to be lacking certain essential attributes for some termtypes. For example, the upstream 'screen' terminfo is lacking erase_chars. libtickit now includes some fallback "likely to work" strings to handle these cases. This turns it from an instant failure on startup, to an at-worst wrong output, but hopefully most terminals should understand these standard strings. At least, if they don't I suspect libtickit is far from the only place that's broken, if they don't supply a more correct string in their terminfo.

2014/07/27

Event Reflexivity - A Design Pattern Pattern?

The previous series of posts on the topic of Event Reflexivity each posed a question about what the general shape of the design pattern actually is. Over the following months after the posts I thought about the pattern a lot more and eventually came to the conclusion that a lot of the questions aren't a simple matter of choosing what's correct - the reason these questions are hard to answer is that both options could be correct. What I have here is in fact not a single unclear design pattern, but instead a whole family of possible choices on a design pattern, with different specifics being more useful to specific cases.

The specific design choices that a particular implementation takes should answer such questions as:

  • Ordering control: Are subscribers of some named action invoked in a controllable fashion?
  • Invocation functions: What kinds of actions or ways of invoking them are actually provided?
  • Hierarchical actions: Do the types of actions occupy a simple flat namespace, or is there some hierarchical structure to them? Can a subscriber catch an entire subspace of actions at once?
  • Explicit or implicit subscription: Do subscribers have to explicitly list every action (or action subspace) they wish to receive?
  • Filterable arguments: Does the implementation offer a built-in way to filter specific values of arguments by some pre-declared pattern?

Perhaps the only real design question for the Event-Reflexive Design Pattern Pattern is to decide on some neat concise language that implementations can use to explain their specific choices on these issues.

2014/06/06

List::Util additions in Perl 5.20

For a while now, I have been maintaining List::Util and Scalar::Util, the utility modules that ship with core Perl. After a while of getting used to what's where, I've actually now started adding things to them again; mostly by surveying what's commonly used from some other utility modules, and bringing them in so they can be nicely implemented in XS for efficiency. Now that Perl 5.20 is out, all of these latest updates ship with core perl.

From List::MoreUtils, I have taken the four shortcutting reduce-like boolean test functions of all, any, none and notall. These are all similar to grep, in that they take a block that evaluates some predicate test on each element of the list. Where they differ from grep is that grep will count the total number of items in the list for which the block returned true, whereas these four functions will simply indicate what the overall result was, allowing them to short-circuit as soon as the result is determined.

use List::Util 1.33 qw( any );

if( any { $_ == 0 } @numbers ) {
  say "The list of numbers includes zero";
}
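For comparison, here is a short sketch of all four functions side by side with grep. The list of numbers is made up for the example.

```perl
use strict;
use warnings;
use List::Util 1.33 qw( all any none notall );

my @numbers = ( 3, 1, 4, 1, 5 );

my $all_positive = all    { $_ > 0 }      @numbers;    # true
my $has_even     = any    { $_ % 2 == 0 } @numbers;    # true, stops at the 4
my $no_zeroes    = none   { $_ == 0 }     @numbers;    # true
my $not_all_odd  = notall { $_ % 2 }      @numbers;    # true, stops at the 4

# grep, by contrast, visits every element and (in scalar context) counts matches
my $odd_count = grep { $_ % 2 } @numbers;              # 4
```

On an empty list, all and none return true while any and notall return false, which is worth remembering when the list might have nothing in it.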

As this module ships as part of the Perl core, it can reliably make use of the C compiler to build it, so most of the functions it contains are implemented in efficient XS code. Specifically these four also use an optimisation technique called MULTICALL, which improves the efficiency of functions of this form, where a given small block of code is repeatedly executed many times, with $_ set differently every time.

Another set of functions copied from elsewhere are the pair* functions taken and extended from List::Pairwise. These are all functions that interpret their list as an even-sized list of pairs, executing the code block with $a and $b set to the first and second value of each pair. This could be used to operate on regular perl hashes (by assigning keys to $a and their associated values to $b), though there is no requirement that it really be a hash. The functions will preserve the order of the pairs, and won't get upset if the "keys" are not plain strings, or not unique. As the result is also returned in a list of pairs, it could be assigned into a hash, or used elsewhere.

use List::Util 1.33 qw( pairgrep pairmap );

# Take a subset of a hash whose keys are ALLCAPS
my %capitals = pairgrep { $a =~ m/^[A-Z]+$/ } %hash;

# Rename keys in a hash
my %renamed = pairmap { ($a =~ s/^foo_/bar_/r), $b } %hash;

(This latter example also makes use of the perl 5.14 s///r flag, to return the result of the substitution instead of editing in place.)
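The point about order and non-unique keys can be seen with a plain list in place of a hash. This is just an illustrative sketch with invented data:

```perl
use strict;
use warnings;
use List::Util 1.33 qw( pairgrep pairmap );

# Not a hash: the "key" x appears twice, and the order matters
my @pairs = ( x => 1, y => 2, x => 3 );

# Keep only the pairs whose value is greater than 1
my @kept = pairgrep { $b > 1 } @pairs;       # ( y => 2, x => 3 )

# Turn each pair into a single string
my @labels = pairmap { "$a=$b" } @pairs;     # ( "x=1", "y=2", "x=3" )
```

Assigning @pairs into a hash first would have lost both the ordering and the first x pair.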

use List::Util 1.33 qw( pairs );

foreach ( pairs %hash ) {
   # $_ will be a 2-element ARRAY ref
   say "$_->[0] has value $_->[1]";
}

As of version 1.39 (so a little too late to make it into perl 5.20.0, but still available on CPAN), pairs returns blessed array references that respond to methods called key and value (inspired by DCONWAY's Var::Pairs), as well as being accessible by array indexing.

use List::Util 1.39 qw( pairs );

foreach ( pairs %hash ) {
   say $_->key, " has value ", $_->value;
}

I have many more ideas for functions that could be added, though some care will need to be taken not to invent experimental ideas, but instead to take inspiration from tried-and-tested code on CPAN, as all of these have done, so as to bring existing ideas into core and standardise them.

One other thing I have my sights set on is to implement further-optimised versions of at least some of the functions in Scalar::Util and List::Util as custom ops on perl 5.16 onwards. This will give them an even further performance boost, as they won't even be regular XS functions any more, so will completely remove the expensive call-time overhead of the ENTERSUB/LEAVESUB pair.


I should also add, that for a while now I've been a self-employed IT contractor, which has given me a lot more free time to be able to write such things as named above. If anyone is interested in supporting or sponsoring similar work on Perl, contact me by email. I'd be happy to give most reasonable Perl jobs a consideration. For that matter, I also work in C, or other languages, and I've even been known to build small-scale electronics projects.

2014/03/20

Event-Reflexivity in Static Languages

In the previous article, we looked at the idea of explicit registration of handlers for events in an event-reflexive system, and touched on the idea that it may be more useful (or in fact, required) when dealing with a static language like C, rather than a dynamic language like Perl. Today's story will look at the different requirements for static languages in more detail.

In the examples in previous posts we have been able to use dynamic language features to easily implement named actions by simply creating functions of the right name, and dispatch to them simply by passing that string name to the central dispatch functions.

# In a handler module
sub add_user
{ ... }

# In dispatching code
run_events "add_user", @args;

This became especially useful when creating dynamic action names in the IRC cases:

# In a handler module
sub on_message_JOIN
{ ... }

sub on_message
{ ... }

# In dispatching code
run_events [ "on_message", "JOIN" ], @args;

To be clear here, we have used the following abilities of Perl (though similar should apply to most dynamic languages) in order to easily implement this system:

  • The ability to invoke a function in a module based on a dynamic string at runtime
  • The ability to pass a variable list of arguments to a function as a simple list, without the intermediate dispatcher function having to understand them
  • The ability to return any kind of value or list of values from a function

In a static language, this simply isn't going to work. We'll need something much stronger to bind all these pieces together. We'll need a way to more strongly identify the named actions as hook points, some way to pass the arguments around between them, and some way to interpret the return values for possible methods of combination or short-circuit return.

The first idea to handle this is simply to number the events in some sort of globally-defined enumeration. But this of course creates a single global numbering, and half of the point of event-reflexivity in the first place was to avoid this kind of centralisation - a central numbering would mean that plugin modules couldn't themselves create and emit new events. They can't just invent new numbers because they might collide with existing ones.

Perhaps what is required here is that the event-reflexivity core can allocate contiguous blocks of numbered events, and allow some kind of association between event numbers and friendly string names for the convenience of programmers and users. When a new module wishes to allocate some events, it can request a block from the core, and be given its starting number. While that number would be dynamically allocated between different instances of the system, or even different runs on the same machine, it would at least be constant throughout one program run, allowing other modules to bind or invoke them. The friendly naming system would then exist to allow programs to look up the current number for a known event name, to bind or invoke it.

typedef int ERcore_event_id;

ERcore_event_id ercore_allocate_events(int n_events,
                                       const char *evname[]);
ERcore_event_id ercore_lookup_event(const char *name);
const char *ercore_event_name(ERcore_event_id id);
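The bookkeeping behind those three functions is small. As a rough sketch, written in Perl purely for brevity and not as a proposed implementation, it could look like the following. The Perl function names simply mirror the C prototypes above, and the event names are invented.

```perl
use strict;
use warnings;
use Carp;

my @name_of;    # event ID => friendly name
my %id_of;      # friendly name => event ID

# Allocate one contiguous block of IDs and return the first of them
sub allocate_events {
    my @names = @_;
    foreach my $name (@names) {
        croak "Event '$name' is already allocated" if exists $id_of{$name};
    }

    my $first = scalar @name_of;
    foreach my $name (@names) {
        $id_of{$name} = scalar @name_of;
        push @name_of, $name;
    }
    return $first;
}

# Look up the current number for a known event name (undef if unknown)
sub lookup_event {
    my ($name) = @_;
    return $id_of{$name};
}

sub event_name {
    my ($id) = @_;
    return $name_of[$id];
}

# Two modules each request a block; the numbers depend on load order
my $irc_base  = allocate_events(qw( on_connect on_message ));
my $user_base = allocate_events(qw( add_user del_user ));
```

The IDs are only stable for one run of the program, which is exactly why the name lookup is needed by any module that wants to bind or invoke an event it did not allocate itself.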

Having a way to create the identity of named events, we next need a way to register a function to actually handle them. This is where our second problem arises - the problem of how to pass event arguments. Perhaps the simplest is simply to allow a single void * pointer argument, on the basis that the named event would document somewhere what this was supposed to point at - likely a structure of some kind. Because C lacks the ability to create true closures, it may be necessary to pass a second pointer argument at binding time, and to pass both that and the event argument to the invoked functions at dispatch time.

void ercore_bind_event(ERcore_event_id id,
                       void (*fn)(ERcore_event_id id,
                                  void *args,
                                  void *data),
                       void *data);
void ercore_run_event(ERcore_event_id id,
                      void *args);

Here we've only defined the simplest of the invocation functions, run_events, because any of the others would require some consideration of the return value as well. Here we start to run into problems of needing to know how to interpret the meanings of these values. This means we can't implement any of the more interesting invocation methods as seen in the second post (Kinds of Invocation in Event-Reflexive Programming).

Our handling of arguments isn't very satisfactory, forcing invoking code to always pack their arguments into a structure, and all the handling code to unpack them from it again. We've also made our arguments totally opaque to the actual dispatch system, meaning we can't do any of the interesting tricks we could do in dynamic languages where these are visible (such as those seen in the third post, Instrumentation and Logging in Event-Reflexive Programming).

Finally, by making the event identity a simple ID number and having opaque argument structures that the event dispatch core cannot inspect, we have lost our ability to perform dynamic dispatch based on some of the arguments, as seen in the fourth post (Hierarchies of Actions in Event-Reflexive Programming).

We have seen in previous posts that all of these abilities are useful things to have; in combination they define the essential nature of what Event-Reflexivity is really all about. It would be nice if we could have these things in static languages as well as dynamic ones.

This then leads to the final, and most expansive question of this series of posts:

How do you create a powerful system of event-reflexivity in a static language such as C? How do you cope with combining return values and short-circuit evaluation in those dispatch modes that require them? How do you pass different kinds of arguments to invoked functions in a useful and simple way? And how do you perform dynamic multiple dispatch on pieces of the event identity?

2014/03/13

Event Registration in Event-Reflexive Programming

Continuing our recent theme of an IRC bot, the next step in the story concerns the suggestion someone once made to me that given the way most of the internals of this bot worked, it shouldn't be too hard to have these events broadcast over some kind of IPC socket or similar, between multiple processes, to allow parts of the bot to be written out-of-process. Indeed, given the right kind of serialisation, there's no reason these extra parts had to even be perl, they could be implemented in a different language.

This idea has come up twice now in two different contexts, so I decided to think about it in some more detail. In principle the idea is sound enough, but as ever the devil is in the details. If every event was serialised and broadcast to every listening process, the IPC overheads could get very large, because most of the time most of the processes would ignore it. The simple form of event-reflexivity we have been using up to now has relied heavily on the very cheap (virtually free) cost of introspection within the code of one process, but now we need to find a better way to implement it.

The obvious way to start this is some kind of registration system. When each process connects to the central core, it starts off telling the core which events it is interested in, perhaps in a set of strings, or regexp matches, or something. This is a good first step in cutting down plenty of unwanted noise over the serialisation links, and generally improves things. This filtering doesn't have to be perfect as each connected process can still state it isn't interested in specific events it still manages to receive, but anything we can do on the core side to cut that down will obviously help.
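A minimal in-process sketch of that core-side filtering might look like this. The register_interest() and broadcast() names are invented for the example, and the deliver callback stands in for whatever would really serialise the event onto a socket.

```perl
use strict;
use warnings;
use List::Util 1.33 qw( any );

my @subscribers;

# A connecting process declares its interests as exact names or regexps
sub register_interest {
    my ( $deliver, @patterns ) = @_;
    push @subscribers, {
        deliver  => $deliver,
        patterns => [ map { ref $_ ? $_ : qr/^\Q$_\E$/ } @patterns ],
    };
}

# The core only hands an event to subscribers with a matching pattern,
# and returns how many deliveries it made
sub broadcast {
    my ( $event, @args ) = @_;
    my $sent = 0;
    foreach my $subscriber (@subscribers) {
        next unless any { $event =~ $_ } @{ $subscriber->{patterns} };
        $subscriber->{deliver}->( $event, @args );
        $sent++;
    }
    return $sent;
}

my ( @logger_saw, @greeter_saw );
register_interest( sub { push @logger_saw,  $_[0] }, qr/^on_message/ );
register_interest( sub { push @greeter_saw, $_[0] }, "on_message_JOIN" );

broadcast( "on_message_JOIN",    "#perl", "alice" );
broadcast( "on_message_PRIVMSG", "#perl", "hello" );
```

Here the hypothetical logger sees both events, while the greeter is only ever sent the JOIN, so nothing is serialised for it on the PRIVMSG.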

However, further consideration of the specific domain of interest in being an IRC bot starts to suggest we can do something more powerful. Within IRC, it's quite likely that most events of interest to plugged-in processes will concern some specific IRC channel or user. It's also quite likely that at least some plugins may be interested only in events on specific channels or users, or matching only specific text, or some other criteria. If we could get the core event distribution mechanism to filter on these as well, we can further cut down on pointless IPC overheads.

The full implications and decisions of how this might work aren't really related to event-reflexivity, but what is of interest here is that this kind of event registration system doesn't have to be only for out-of-process management. In fact, as soon as we start to consider how event-reflexive programming might be implemented in a static language like C, as compared to a dynamic language like Perl, we fairly soon conclude that there must be some kind of registration call, to hook up pieces of code and help in the event dispatch process on some level or another.

This leads us on to in fact two questions this time:

How useful is it to implement event reflexivity using explicit registrations of interest in events?
Does the answer depend on whether the language is static or dynamic? Can explicit registrations provide useful abilities even in dynamic languages, or do they just add unnecessary complication?

2014/03/06

Hierarchies of Actions in Event-Reflexive Programming

So far in this series we have seen the introduction to event-reflexive programming, and a couple of use-cases it would appear in. This time our story continues in chronological fashion, following the development of various IRC-related systems.

The first attempt at an IRC bot was a large soup of various event-reflexive concepts, and was the experimentation bench for a lot of my first ideas about it. One pattern I found very useful was to include partly-dynamic data in with action names. That is, to use information at runtime to direct the flow of event handling. In particular in IRC, the most obvious one comes from considering the command name in incoming IRC messages.

The simplest implementation of this could be expressed something like the following, presuming our underlying IRC implementation gives us simple objects to represent each message:

my $message = ...;
my $command = $message->command;

run_plugins "on_message_$command", $message;

With this simple mechanism we now have a way for each plugin to react to specifically-named IRC events, without them having to capture all the events and filter for only the ones they care about.

However, it turns out that in a number of places we actually want to capture all the messages (for example, debugging and logging). No great problem here; we can simply make a second call to a generic on_message instead and pass in the command string itself as the first argument:

my $message = ...;
my $command = $message->command;

run_plugins "on_message_$command", $message;
run_plugins "on_message", $command, $message;

A pattern seems to be emerging here. We can extend this further, for example to handle the specific CTCP message verbs in IRC CTCP messages (for now, don't worry if you don't know what CTCP means; just consider that it's a second sub-hierarchy of messages):

sub on_message_PRIVMSG {
  my ($message) = @_;

  if( is_ctcp( $message ) ) {   # is_ctcp(): hypothetical predicate, not defined here
    my $verb = ...;

    run_plugins "on_message_ctcp_$verb", $message;
    run_plugins "on_message_ctcp", $verb, $message;
    run_plugins "on_message", "ctcp", $verb, $message;
  }
}

However, the mechanism we've built here still seems a little unsatisfactory. Any given plugin could handle more than one of these cases, meaning it would be called multiple times. Maybe it would be better to build it such that we only call the most-specific event handler that each plugin defines. To do that we would have to build that logic right in to the basic definition of run_plugins.

One possible idea would be to pass an array reference containing pieces of event name, which should be joined by underscores (_) until a suitable handling method is found, and the remaining pieces would be passed as the first positional arguments. Thus a call such as:

run_plugins [ "on_message", "ctcp", $verb ], $message;

would invoke handlers similarly to the previous example, except that it will call at-most one action handler per plugin, meaning that specific handlers "override" more generic ones that plugin provides.
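One way this most-specific-first lookup could be written is sketched below. It is only an illustration of the idea, not the code from the bot: the two plugin packages and the @calls log are invented so that the example runs on its own.

```perl
use strict;
use warnings;

our @calls;    # records what ran, for demonstration only

{
    package Plugin::Logger;     # hypothetical: only a generic handler
    sub on_message { shift; push @main::calls, "Logger on_message(@_)" }
}
{
    package Plugin::Version;    # hypothetical: specific and generic handlers
    sub on_message_ctcp_VERSION {
        shift; push @main::calls, "Version on_message_ctcp_VERSION(@_)";
    }
    sub on_message { shift; push @main::calls, "Version on_message(@_)" }
}

my @plugins = qw( Plugin::Logger Plugin::Version );

sub run_plugins {
    my ( $name, @args ) = @_;
    my @parts = ref $name ? @$name : ($name);

    foreach my $plugin (@plugins) {
        # Try the longest joined name first, then progressively shorter ones
        foreach my $n ( reverse 1 .. scalar @parts ) {
            my $method = join "_", @parts[ 0 .. $n - 1 ];
            next unless $plugin->can($method);

            # Unused name pieces become the leading positional arguments
            $plugin->$method( @parts[ $n .. $#parts ], @args );
            last;    # at most one handler per plugin
        }
    }
}

run_plugins( [ "on_message", "ctcp", "VERSION" ], "the-message" );
```

After this call the logger plugin has received on_message with "ctcp" and "VERSION" as leading arguments, while the other plugin has had only its specific on_message_ctcp_VERSION handler invoked and not its generic one.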

The main question of this post is therefore

To what extent should arguments be interpreted as part of the dispatch of action handlers themselves? Should some arguments be allowed to take part in forming the action name itself, to allow a degree of override-like dispatch logic on a per-plugin basis?

2014/02/27

Instrumentation and Logging in Event-Reflexive Programming

The previous two posts in this series have introduced the idea of event-reflexive programming, and started to investigate a little into its properties, and design decisions about creating a system using it in terms of ordering considerations and ways to invoke individual actions.

I said this story would continue chronologically, following the history of this idea through various systems I've encountered or built using it. So far we have an ISP provisioning system, and an IRC bot. Quite different in size, scope and semantics, they did however have some common features. The one I want to talk about today is the debugging logging and instrumentation part.

At the ISP, every top-level provisioning action was identified by its name, a plain string, and a named set of arguments, the value of each argument also a string. This made it trivially simple to encode the action over a simple TCP socket we had at the time (this being years before the explosion of YAML and JSON-driven systems, and before, even, the peak of the XML craze). While it wasn't strictly required, it turned out that keeping this property for all of the inner reflexive events as well made logging very simple. The logging was also aware of the nesting level of the event-reflexive core, allowing it to print simple logs showing the full tree-shaped nature of the events as they took place. I forget all the inner details of the logging format, but a hypothetical example could look something like:

Action 'add_user': reseller=foo username=a-new-user product=shiney
+- Action(1) 'make_user_config': reseller=foo username=a-new-user product=shiney
+- Action(1) 'make_user_homedir': reseller=foo username=a-new-user product=shiney
|  +- Action(2) 'copy_skeleton': reseller=foo username=a-new-user product=shiney
...

It wasn't too long before I started finding this logging system simply not powerful enough. At least in the "leaf" events, these often actually did useful things - performed LDAP reads or writes, interacted with the filesystem, talked to various 3rd-party systems, and so on. The logging system then gained the ability to write these as well.

While most logging/debugging systems we have currently use a simple linear scale of "verboseness", the logged items in our logging system were tagged with any of a wide set of possible categories. The set of categories in effect at any time was given by an environment variable. For example, to log just the reads and writes to the LDAP directory, and the attributes of a write, one could set:

DEBUG_INSTRUMENT_FLAGS=LDAPr,LDAPw,LDAPwa

This becomes a much more powerful logging system because it allows the programmer/operator to choose not simply the level of verbosity of the logging, but to more finely-tune where in the code the logging is more detailed. A few current logging systems also possess this ability now.
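The category-based approach needs very little code. Here is a minimal sketch of it; the parse_flags() and instrument() names are assumptions for the example and not the functions of the original system, and the logged messages are invented.

```perl
use strict;
use warnings;

# Parse a comma-separated category list into a lookup set
sub parse_flags {
    my ($spec) = @_;
    my %set = map { ( $_ => 1 ) } grep { length } split m/,/, $spec // "";
    return \%set;
}

# Log a message only if its category is currently enabled;
# returns true if something was written
sub instrument {
    my ( $flags, $category, $message ) = @_;
    return 0 unless $flags->{$category};
    print STDERR "[$category] $message\n";
    return 1;
}

# In real use the spec would come from $ENV{DEBUG_INSTRUMENT_FLAGS}
my $flags = parse_flags("LDAPr,LDAPw,LDAPwa");

instrument( $flags, "LDAPw", "modify uid=a-new-user" );    # logged
instrument( $flags, "IMAP",  "login frank" );              # silently skipped
```

Because each call site names its own category, new kinds of logging can be added without having to decide where they rank on a single verbosity scale.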

The full power of our instrumentation system here, however, was that it was integrated with the rest of the event-reflexive core. This meant that it could peek into the action names and parameters, and test the values of them. These values could then be used to set or change the logging flags. A particularly powerful example could be:

A=add_user,reseller=foo,product=gold{LDAPw,LDAPwa};reseller=bar,user=frank{IMAP}

In this example, we are interested in logging the LDAP writes, and the attributes they contain, that happen during any attempt to add a new user to the foo reseller for their gold product (perhaps because we are investigating some issue with this one), or separately, any IMAP-related activity performed by the bar reseller's user called frank. Because these are interpreted by the core of the event-reflexive system, they can apply fully recursively; applying to any nested inner action performed as part of the ultimate add_user action, for example.
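To show roughly how such a specification could be interpreted, here is a small sketch. The grammar is my reading of the example string and should be treated as an assumption: rules are separated by semicolons, each rule is a comma-separated list of key=value tests followed by flags in braces, and the key A tests the action name. It does not model the recursive application to nested actions.

```perl
use strict;
use warnings;
use List::Util 1.33 qw( all );

# Split the specification into rules of { conds => {...}, flags => [...] }
sub parse_rules {
    my ($spec) = @_;
    my @rules;
    foreach my $rule ( split m/;/, $spec ) {
        my ( $conds, $flags ) = $rule =~ m/^(.*?)\{(.*)\}$/
            or die "Cannot parse rule '$rule'\n";
        my %conds = map { ( split m/=/, $_, 2 ) } split m/,/, $conds;
        push @rules, { conds => \%conds, flags => [ split m/,/, $flags ] };
    }
    return @rules;
}

# Return the flags switched on by every rule whose tests all match
sub flags_for {
    my ( $action, $args, @rules ) = @_;
    my %subject = ( %$args, A => $action );
    my %flags;
    foreach my $rule (@rules) {
        my $conds = $rule->{conds};
        next unless all {
            defined $subject{$_} and $subject{$_} eq $conds->{$_}
        } keys %$conds;
        $flags{$_} = 1 for @{ $rule->{flags} };
    }
    return sort keys %flags;
}

my @rules = parse_rules(
    "A=add_user,reseller=foo,product=gold{LDAPw,LDAPwa};reseller=bar,user=frank{IMAP}"
);

my @gold_flags = flags_for( "add_user",
    { reseller => "foo", product => "gold", username => "a-new-user" }, @rules );
my @frank_flags = flags_for( "login",
    { reseller => "bar", user => "frank" }, @rules );
```

This only works as simply as it does because the action name and every argument are plain strings, which is the property the rest of this post goes on to examine.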

Being strings, these tests could also be performed using regexps, though I never got around to implementing that test as part of this code, so they remained only simple string equality tests. However, even this far gets us an enormous amount of expressive power very simply, and virtually for free in the main body of action-handling code in the plugins, because almost all of it exists only once, at the very core. However, much of its power has come exactly because of the limited expressiveness of the individual arguments, being simple strings.

Over in my IRC bot, meanwhile, many of the arguments and return values being passed around between the named events were application-level objects expressing such concepts as IRC users, channels, and so on. These objects made it convenient to write powerful code in the action handlers, but limited the ability of the event-reflexive core to introspect into them, to provide such abilities as detailed debug logging similar to that seen above for the ISP provisioning system.

This leads to my third question about event-reflexivity:

To what extent should the arguments of event-reflexive calls be understood and interpreted by the event-reflexive system itself, above simply passing them on to invoked action handlers? Do such abilities as powerfully flexible instrumentation logging justify limiting the expressiveness of the parameters that may be passed?

2014/02/22

Kinds of Invocation in Event-Reflexive Programming

In the previous post I introduced the idea of Event-Reflexive programming, and discussed the first use-case I had for it; driving the user provisioning system at an ISP. I said this story would continue in chronological order.

A couple of years into this job, I felt I had learned Perl enough to do what surely pretty-much any Perl developer does at this time. Yes, I decided to write an IRC bot. It's one of those rites of passage that every developer goes through at some point. Of course, even this early in my programming career I had already seen several dozen terrible attempts at this, so I was quite determined to ensure mine wouldn't suffer quite as many of those mistakes. In my head, of course, I knew I wouldn't suffer many mistakes because I, of course, was armed with Event-Reflexive Programming.

I ended up with, I thought, the most amazing (it wasn't), the most powerful (it wasn't) and the most flexible (it wasn't) IRC bot the world had ever seen (it isn't). However, in the process of building it I had expanded on the original concept of event-reflexivity considerably.

In the previous post I introduced the most basic two forms of invoking the plugins in an event-reflexive system, run_plugins and run_plugins_reverse. In the course of developing this IRC bot, I found it necessary to create a number of other variations on this basic theme.

Recalling the original two functions, both of these simply execute action hooks defined by the plugins. Neither of them returns an interesting result. What I found while implementing the IRC bot was that as well as merely requesting that work be performed, I was also using the event-reflexive core to abstract out a number of query-like operations - such as abstracting away the specific mechanism of database used to store information about registered users. At this point, the event-reflexive core needs a way to pose a question to the list of plugins, and return an answer as soon as one has been provided:

sub ask_plugins {
  my ( $query, @args ) = @_;

  foreach my $plugin ( @plugins ) {
    next unless $plugin->can( $query );
    my $ret = $plugin->$query( @args );
    return $ret if defined $ret;
  }

  return undef;
}

Perl being Perl, it's only a short matter of time before we want a list-valued return from some of these queries. And once we're returning a list, we're not restricted to returning the result from a single plugin - we can run them all:

sub ask_plugins_list {
  my ( $query, @args ) = @_;

  my @ret;
  foreach my $plugin ( @plugins ) {
    next unless $plugin->can( $query );
    push @ret, $plugin->$query( @args );
  }

  return @ret;
}
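
As a small illustration of the list-valued form, the following sketch gathers results from every plugin that implements the query. The plugin classes and the commands query are hypothetical, chosen only to show how the lists are concatenated in plugin order.

```perl
use strict;
use warnings;

# Hypothetical plugins, each contributing zero or more items
package Plugin::Greeter { sub commands { return qw( hello goodbye ) } }
package Plugin::Clock   { sub commands { return qw( time ) } }
package Plugin::Silent  { sub commands { return } }   # empty list adds nothing

package main;

my @plugins = qw( Plugin::Greeter Plugin::Clock Plugin::Silent );

sub ask_plugins_list {
  my ( $query, @args ) = @_;

  my @ret;
  foreach my $plugin ( @plugins ) {
    next unless $plugin->can( $query );
    # The method call is in list context, so each plugin may return many items
    push @ret, $plugin->$query( @args );
  }

  return @ret;
}

my @commands = ask_plugins_list( 'commands' );
print join( ", ", @commands ), "\n";
```

Unlike ask_plugins, a plugin cannot decline here by returning undef; an empty list is the way to say "nothing to add".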

The final and most interesting invocation function was called scatter_plugins. This being written years before I had encountered the concept of Futures, it was initially written with a complex combination of additional code reference arguments, before I managed to neaten it up somewhat with the creation of Async::MergePoint. I won't give the implementation here, but the point of this particular call was to account for the fact that some plugin actions are going to be asynchronous, and only return a result later. What we'd like to do is start all the operations concurrently, then await their eventual completion before continuing.

These days, I would instead implement this operation using Future. In fact, at this point a case could be made for implementing all of them using Future. If the entire event-reflexive core were based on futures, then it would trivially cope with any synchronous or asynchronous kind of work environment (due to the universal suitability of futures). If we set aside for now our previous question of plugin ordering, and assert that ordering doesn't matter, then all the remaining operations besides reverse can be expressed on top of a single idea:

sub _call_all_plugins {
  my ( $method, @args ) = @_;

  return map {
    my $plugin = $_;
    $plugin->$method( @args );
  } grep { $_->can( $method ) } @plugins;
}

sub run_plugins_concurrently {
  Future->needs_all( _call_all_plugins( @_ ) )
        ->then_done( "" ); # return an empty result
}

sub ask_plugins_concurrently {
  Future->needs_any( _call_all_plugins( @_ ) )
}

sub ask_all_plugins {
  Future->needs_all( _call_all_plugins( @_ ) )
}

In fact at this point our previous idea of scatter_plugins becomes totally redundant - the universal expressiveness of Futures has allowed this to be expressed even more simply. But this incredibly simple implementation has come at a cost - we've lost the sequential lazy-evaluation nature of ask_plugins. Additionally, whatever stop-on-error semantics we might have wanted out of run_plugins have been lost.
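
The loss of lazy evaluation can be seen even without any futures involved. The sketch below is a simplification under stated assumptions: the hypothetical plugins return plain values rather than Future instances, and the find_user query is invented for the example. It only compares which plugins get invoked by the sequential ask_plugins against the map-based _call_all_plugins.

```perl
use strict;
use warnings;

my @called;   # records which plugins were invoked, in order

# Hypothetical plugins; plain return values stand in for futures here
package Plugin::Cache {
  sub find_user { push @called, 'Cache';    return 'cached-record' }
}
package Plugin::Database {
  sub find_user { push @called, 'Database'; return 'db-record' }
}

package main;

my @plugins = qw( Plugin::Cache Plugin::Database );

sub ask_plugins {
  my ( $query, @args ) = @_;

  foreach my $plugin ( @plugins ) {
    next unless $plugin->can( $query );
    my $ret = $plugin->$query( @args );
    return $ret if defined $ret;
  }

  return undef;
}

sub _call_all_plugins {
  my ( $method, @args ) = @_;

  return map {
    my $plugin = $_;
    $plugin->$method( @args );
  } grep { $_->can( $method ) } @plugins;
}

# Sequential form: stops as soon as the cache has answered
@called = ();
my $first      = ask_plugins( find_user => 'alice' );
my @lazy_calls = @called;

# Map-based form: every capable plugin is invoked before anything is inspected
@called = ();
my @all         = _call_all_plugins( find_user => 'alice' );
my @eager_calls = @called;

print "lazy:  @lazy_calls\n";
print "eager: @eager_calls\n";
```

In the futures-based version the same thing happens one level up: every plugin's operation has already been started by the time needs_any gets to look at the results.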

Perhaps instead we decide we need ordering, at least in some cases. This brings to mind some additional invocation functions, that themselves are also wrappers around a single common idea:

use Future::Utils qw( repeat );

sub _call_each_plugin {
  my ( $reverse, $while, $method, @args ) = @_;

  repeat {
    my ( $plugin ) = @_;
    $plugin->$method( @args );
  } foreach => [ grep { $_->can( $method ) }
                 $reverse ? reverse(@plugins) : @plugins ],
    while => $while;
}

sub run_plugins_sequentially {
  _call_each_plugin( 0, sub { 1 }, @_ );
}

sub run_plugins_sequentially_reverse {
  _call_each_plugin( 1, sub { 1 }, @_ );
}

sub ask_plugins_sequentially {
  _call_each_plugin( 0, sub { not shift->get }, @_ );
}

Keeping in mind my first question from the previous post, on the subject of ordering between plugins, this motivates a choice of second questions:

If ordering guarantees are not required, are the concurrent invocation functions given above sufficient to express any order-less possibly-asynchronous operation in an event-reflexive system?
If ordering is required, are the additional sequential invocation functions sufficient to express any ordered possibly-asynchronous operation?


2014/02/20

Event-Reflexive Programming

I am relatively certain that nobody reading this article for the first time will know what "event-reflexive" means, because it's a term I just made up. Or rather, one I made up about a year ago while thinking over a bunch of related ideas that I felt needed a name. I'm hoping I didn't invent the basic concept, because the reason for this post is to ask people a bunch of questions about it (as well as to get the idea out of my head and onto at least some virtual paper).

Event-reflexive programming is a technique to manage the collection of a number of loosely-connected and perhaps interchangeable pieces into a larger coherent form, to work together to solve some common task. It is perhaps most illustrative to explain this story in roughly chronological order.

My first ever post-University job (and as it happens, the one where I learned Perl, and picked up such controversial habits as three-space indent) was centred around a large Internet Service Provider back-end provisioning system. In this system we had a large number of plugins to provide hooks into various components that made up a user's account. This was all hung together, at its core, by basically a big foreach loop looking something like

sub run_plugins {
  my ( $action, @args ) = @_;

  foreach my $plugin ( @plugins ) {
    $plugin->$action( @args ) if $plugin->can( $action );
  }
}

At its very basic level, this was all there was to it. Plugins are just classes; each can hook into the processing of some named action merely by having a method of that name. The list of plugins was determined by a config file, one per virtual reseller the ISP hosted users for. This made for a very powerful yet very simple way to customise and vary the capabilities per reseller - simply provide different plugin modules for them.
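
A minimal, self-contained sketch of how this hangs together might look like the following. The plugin class names, the action names and the @log array are hypothetical stand-ins; the real system's plugins and config loading are not shown in this post.

```perl
use strict;
use warnings;

my @log;   # records what each hook did, in order

# Hypothetical plugins: a hook is simply a method named after the action
package Plugin::Database {
  sub create_user     { my ( $class, %args ) = @_; push @log, "Database: create $args{user}" }
  sub change_password { my ( $class, %args ) = @_; push @log, "Database: password $args{user}" }
}
package Plugin::HomeDir {
  sub create_user     { my ( $class, %args ) = @_; push @log, "HomeDir: create $args{user}" }
}
package Plugin::Mail {
  sub change_password { my ( $class, %args ) = @_; push @log, "Mail: password $args{user}" }
}

package main;

# Assumption: hard-coded here; the real list came from a per-reseller config file
my @plugins = qw( Plugin::Database Plugin::HomeDir Plugin::Mail );

sub run_plugins {
  my ( $action, @args ) = @_;

  foreach my $plugin ( @plugins ) {
    $plugin->$action( @args ) if $plugin->can( $action );
  }
}

run_plugins( create_user     => user => "alice" );
run_plugins( change_password => user => "alice" );

print "$_\n" for @log;
```

Each plugin only sees the actions it has a method for; varying a reseller's capabilities amounts to changing the contents of @plugins.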

I should perhaps at this point explain the meaning of the word "event". An event here could mean a reaction to some external stimulus - the toplevel events used in this ISP provisioning system were such requests as "create a user" or "change this user's password". However, most large actions were really broken down into a sequence of smaller steps that themselves were run through the plugin system. A simple action like "change the user's password" is likely to be handled directly by each plugin module that cares, but an action as large as creating a user is probably broken down into several stages - create the initial user config in a database, mkdir their home directory, fill it with skeleton files, and so on.

It is at this point that the first of the awkward architectural questions of this approach comes to light. We started off with a list of the plugins stored in an array, and as we know, arrays are ordered containers. This approach necessarily imparts an ordering onto the plugins. The order each plugin reacts in this ISP system is determined by the order the names of the plugins appear in a configuration file. This allows us a certain degree of control, but also brings up the first large problem - it is the plugins that have order applied to them, but sometimes it is the individual action hooks that you wish to order.

For example, the action to create a new user has the stages of creating the user's config in a database, then mkdiring their home directory, filling it with skeleton files, and so on. The ordering of these steps is fixed - we can't fill skeleton files in their home until it exists, and we can't create the directory until we have the basic user config in the database to look up where their home is. So naturally this leads to a good ordering relationship between the plugins that are responsible for these steps.

But now let's consider what happens when we want to delete a user again. By this plugin ordering we'd first remove their config from the database, before we rmdir their home, but if we did it that way we'd forget where their home was. Clearly, the solution here is that some events (such as deleting a user) have to be executed in reverse. This leads fairly quickly to the invention of a second control function:

sub run_plugins_reverse {
  my ( $action, @args ) = @_;

  foreach my $plugin ( reverse @plugins ) {
    $plugin->$action( @args ) if $plugin->can( $action );
  }
}
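
The create/delete asymmetry can be sketched with two hypothetical plugins, using in-memory hashes as stand-ins for the database and the filesystem. All names here are assumptions made for the example.

```perl
use strict;
use warnings;

my %db;     # stand-in for the user config database
my %dirs;   # stand-in for the filesystem
my @log;    # order in which the steps actually ran

package Plugin::Database {
  sub create_user {
    my ( $class, %args ) = @_;
    $db{ $args{user} } = { home => "/home/$args{user}" };
    push @log, "db create";
  }
  sub delete_user {
    my ( $class, %args ) = @_;
    delete $db{ $args{user} };
    push @log, "db delete";
  }
}
package Plugin::HomeDir {
  # Both hooks need the database entry to find out where the home is
  sub create_user {
    my ( $class, %args ) = @_;
    my $home = $db{ $args{user} }{home};
    $dirs{$home} = 1;
    push @log, "mkdir $home";
  }
  sub delete_user {
    my ( $class, %args ) = @_;
    my $home = $db{ $args{user} }{home};
    delete $dirs{$home};
    push @log, "rmdir $home";
  }
}

package main;

my @plugins = qw( Plugin::Database Plugin::HomeDir );

sub run_plugins {
  my ( $action, @args ) = @_;
  foreach my $plugin ( @plugins ) {
    $plugin->$action( @args ) if $plugin->can( $action );
  }
}

sub run_plugins_reverse {
  my ( $action, @args ) = @_;
  foreach my $plugin ( reverse @plugins ) {
    $plugin->$action( @args ) if $plugin->can( $action );
  }
}

run_plugins(         create_user => user => "alice" );   # database first
run_plugins_reverse( delete_user => user => "alice" );   # database last

print "$_\n" for @log;
```

Had delete_user gone through run_plugins instead, the hypothetical Plugin::HomeDir would have looked up the home directory after the database entry was already gone.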

Some actions have more interesting ordering constraints even than this, however. The most complex action we had was the action used to regrade a user out of one product and into a different one. This basically involved a partial tear-down of the user, followed by some shuffling around of data, then a recreation again to add back the pieces that were removed, now in a different product. There is no simple way to express this in terms of plugin ordering, as a lot of plugins are needed both near the beginning and again near the end of the process.

The solution to this was to re-invoke the plugin action system again as part of handling a particular action. By decomposing this large action into three smaller steps, the plugin system's action ordering abilities help us ensure things all run correctly.

sub regrade_user {
  my %args = @_;

  run_plugins_reverse teardown_user_for_regrade => (
    user => $args{user},
    product => $args{old_product},
    ...
  );

  run_plugins regrade_user_inner => (
    user => $args{user},
    ...
  );

  run_plugins setup_user_for_regrade => (
    user => $args{user},
    product => $args{new_product},
    ...
  );
}

This toplevel action has virtually nothing to do - the above code is a slight over-simplification of the real implementation of course, but in principle all it contained was three recursive calls back down to the plugin management system to execute the three main stages of regrading a user. This is where the "reflexive" part of the name of the technique comes from - a lot of its power derives from the way that events are broken down into smaller pieces reflected back through the core of the system a second time, or further, to react to the whole thing as a large nested tree of individual steps.

I didn't remain at this job for long enough to properly see this idea through to its logical conclusion, but had I continued further on this route, I may well have decided to break these actions themselves into smaller chunks with small wrapper functions like the one given above, to split stages of other actions - for example, to split the stages of creating a user in the first place into the configuration stage, the home directory stage, and the provisioning of 3rd-party services stage. Continuing to break code down into smaller chunks like this leads to the idea that actually, the order in which the individual plugins run starts not to matter, because the order the steps are performed in is controlled by the way it is reflexively decomposed.
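
To illustrate that last claim, here is a rough sketch of what such a decomposition of user creation could look like. The stage names (create_user_config, create_user_home, create_user_skel) and the plugin classes are hypothetical; this is not how the real system was written. The point is only that the steps come out in the same order whichever way round the plugin list is.

```perl
use strict;
use warnings;

my @log;   # order in which the steps actually ran

# Hypothetical plugins, each hooking exactly one stage
package Plugin::HomeDir  { sub create_user_home   { push @log, "mkdir home" } }
package Plugin::Skeleton { sub create_user_skel   { push @log, "copy skeleton files" } }
package Plugin::Database { sub create_user_config { push @log, "write db config" } }

package main;

my @plugins;

sub run_plugins {
  my ( $action, @args ) = @_;
  foreach my $plugin ( @plugins ) {
    $plugin->$action( @args ) if $plugin->can( $action );
  }
}

# The wrapper fixes the order of the stages; plugin order no longer matters
sub create_user {
  my %args = @_;

  run_plugins( create_user_config => %args );
  run_plugins( create_user_home   => %args );
  run_plugins( create_user_skel   => %args );
}

my @orders = (
  [ qw( Plugin::HomeDir Plugin::Skeleton Plugin::Database ) ],
  [ qw( Plugin::Database Plugin::Skeleton Plugin::HomeDir ) ],
);

my @results;
foreach my $order ( @orders ) {
  @plugins = @$order;
  @log     = ();
  create_user( user => "alice" );
  push @results, join( "; ", @log );
}

print "$_\n" for @results;
```

This only holds while each stage is hooked by plugins that are independent of one another; two plugins hooking the same stage would still run in list order.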

This now leads me on to ask the first of several questions I still have about this as a general programming technique:

Is it ever required to actually have strict ordering control of plugins in an event-reflexive system, or can any ordering problem always be resolved by breaking down actions into ordered requests to perform sequences of sub-actions? Even if so, is it ever desirable to have such an ordering control anyway; does it make any kinds of system neater or easier to create, at the cost of having to manage that order?



Edit 2014/06/09: I am now leaning towards the conclusion that if any kind of ordering were to be guaranteed, it starts to defeat the entire purpose of using event-reflexivity in the first place. The basic premise is to build a large system by loosely glueing together smaller components which are largely independent. If you have to think about which other components exist and what order they come in relative to the current one, it becomes harder to write that component, and you might as well have written it by some other method. I think therefore, the answer to this question ought to be:

No; the base of an event-reflexive system should not attempt to give any kind of ordering guarantees between plugins. Any strict ordering that is required should be constructed out of explicit inter-component reflexive calls, and not by relying on the implicit side-effect of plugin execution order.