Latest libvterm / pangoterm work

Since it's been a while since I wrote anything, and I've lately been doing some work on both pangoterm and the underlying libvterm I thought I'd write a little post summarising the recent updates.


Two new features now supported in pangoterm are bracketed paste mode, and configurable basic and high-brightness palettes.

Bracketed paste mode is a feature that applications can enable, which causes text pasted into the terminal via the clipboard to be surrounded by special markers indicating this fact. Applications like text editors can make use of these markers to interpret the keystrokes that arrive as literal text, overriding whatever other meaning they might wish to apply during that time.
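
As a rough illustration of what an application sees, here is a minimal Perl sketch. The control sequences are the usual xterm-style ones for this mode. The split_pasted helper is a hypothetical name of my own, and it assumes a whole paste arrives in a single string, which real input handling can't rely on.

```perl
use strict;
use warnings;

# xterm-style control sequences for bracketed paste mode
use constant {
   ENABLE_BRACKETED_PASTE  => "\e[?2004h",   # application -> terminal
   DISABLE_BRACKETED_PASTE => "\e[?2004l",   # application -> terminal
   PASTE_START             => "\e[200~",     # terminal -> application
   PASTE_END               => "\e[201~",     # terminal -> application
};

# Hypothetical helper: split a chunk of terminal input into
# [ kind, text ] segments, where kind is 'typed' or 'paste'.
# An unterminated paste simply falls through as 'typed' text.
sub split_pasted
{
   my ( $input ) = @_;
   my $start = quotemeta( PASTE_START );
   my $end   = quotemeta( PASTE_END );

   my @segments;
   while( length $input ) {
      if( $input =~ s/\A${start}(.*?)${end}//s ) {
         # Everything between the markers is literal pasted text
         push @segments, [ paste => $1 ];
      }
      elsif( $input =~ s/\A(.+?)(?=${start}|\z)//s ) {
         # Anything else is ordinary keyboard input
         push @segments, [ typed => $1 ];
      }
   }
   return @segments;
}

my $input = "ls" . PASTE_START . "echo hi\n" . PASTE_END . "x";
printf "%-5s %s\n", $_->[0], join " ", map { sprintf "%02x", ord } split //, $_->[1]
   for split_pasted( $input );
```

A text editor would insert the 'paste' segments verbatim, and only run its usual key bindings on the 'typed' ones.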

There are also a few bugfixes: interpretation of numeric keypad events and IME handling (written by FireFly on Freenode), and a fix for cells with the maximally-supported set of combining chars applied (such as so-called Zalgotext).


On the libvterm front, there is now a second set of callbacks at the State and Screen layers, to allow embedding applications to at least inspect CSI, DCS or OSC strings that libvterm itself doesn't understand, thus allowing some method of extension by applications. I'm not terribly happy with the interface shape of these yet - it will likely change in future - but for now it serves its purpose to satisfy Neovim.

There's a fix for an awkward bug I've been trying to isolate for a while, which involved the Screen layer in its most aggressive optimisation setting, where there is a pending scrollrect and pending damage, and another scrollrect arrives. This elusive bug was the cause of occasional rendering failures during scroll after a quick line edit in command shells, and similar situations.


libtickit terminal size handling

I've recently been working on the Window layer in the C library libtickit (a migration to C from my Tickit perl distribution). It's very-almost ready now - I think we have all the code and tests ported from Perl (with the single exception of the extended drag-and-drop events whose names start "drag_..."). I haven't yet done the docs though.

I've also been adding/fixing a few other bits in the core code. Two new term-related functions make actually writing programs a little neater:

void tickit_term_observe_sigwinch(TickitTerm *tt, bool observe);
TickitTerm *tickit_term_open_stdio(void);

This latter convenience function makes all the examples a few lines shorter, and while it's at it, enables the SIGWINCH behaviour. That ensures that the TickitTerm instance actually sends EV_RESIZE events when the terminal is resized. Previously C programs couldn't actually detect that one; it was done in the Perl layer using a handler on $SIG{WINCH}.
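
For anyone who hasn't seen the Perl-side approach, this is roughly the shape of it. It is a minimal sketch and not the actual Tickit code; the take_resize helper and the $resize_pending flag are hypothetical names, and the self-sent signal just stands in for a real terminal resize.

```perl
use strict;
use warnings;

my $resize_pending = 0;

# Keep the handler minimal: just note that a resize happened
$SIG{WINCH} = sub { $resize_pending = 1 };

# Hypothetical helper for a main loop: true once per batch of resizes,
# at which point the program would re-query the terminal size
sub take_resize
{
   my $was_pending = $resize_pending;
   $resize_pending = 0;
   return $was_pending;
}

# Simulate the terminal being resized by signalling ourselves
kill WINCH => $$;

print "terminal was resized\n" if take_resize();
```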


Meta-Meta Documentation

When any project of mine gets over a certain size and people start actually using it for things, inevitably people ask me for more documentation. Often there is a problem here in that I'm never really quite sure what to write about. I find it very hard to provide tutorial-style documentation "up-front" explaining the hows and the whys (as compared to reference-style docs, which just list all the pieces that exist).

Let me give a random example: Consider Parser::MGC, a base for writing grammar parsers. Someone might want to use this for parsing some sort of expression language, and they'd find a lack of documentation explaining how to do that. It's not that I don't know how to explain that, it's simply that it hasn't occurred to me to write that documentation as a discrete subject. If they asked me on IRC I could explain at great length, because they'd have given me that initial impulse to give them an answer. But at some point someone will say "Well why isn't that written in documentation?" or "You should write a tutorial", and the same cycle of conversation will usually end with something like

But I don't know what documentation to write for you.

I have sometimes asked people to explain to me what documentation they want, and usually people don't like doing that because they feel since they don't already know the answers, they can't provide the questions. This article is my attempt to break out of that cycle.

Here now then is my rather direct plea to anyone who would ask me for more documentation. Here is your documentation on how to document what documentation I should write - a meta-meta-document:

If you explain to me what problems you want to solve, or what situations you want documentation for, that's a good start. It would be better if you can provide me some introduction text around your problem, your background, and then I can write something better for you specifically, that still stands nicely on its own.

But, here's the killer revelation:

If you write some documentation wherein you write the English waffly-wordy bits that explain the background of the problem and introduce the reader to the background context of the code, and then write a block of code that just says
banana banana banana
then I will find that about a hundred times easier to write. All I have to do is fill in that blob of perl code, and hey presto; there's your documentation.

I know this is a lot to ask of people, and I don't imagine it will happen very often, but that would be by far the quickest way to get more documentation out of me.


Turning a Hard Problem Into an Easy One

Recently I continued my stewardship (for want of a better word) of List::Util by finally adding the last function in a useful set of the pairwise key/value list management functions, called unpairs.

I had been intending to add to the set of pairfirst, pairgrep and pairmap, a sorting function called pairsort. To manage that I would first have to get around a slightly awkward interface requirement. Perl's regular grep and map pass the single items in via $_, whereas pairgrep and pairmap operate not on single values of their input list, but odd/even sets of two elements each, consisting of a notional "key" and "value". They have to pass the key and value items in via $a and $b. Since regular sort already uses $a and $b, there'd be no convenient place to pass the second key and value in to pairsort - perl doesn't provide a $c and $d for this purpose.

It would be easier, in hindsight, if the pair* versions of these functions had continued using the $_ variable to pass both key and value by using the Pair objects in the list provided by the pairs function. If this were the case, then pairsort could be implemented using just $a and $b, as the comparison function would have access to all four items using $a->key, $a->value, $b->key and $b->value.

I started by thinking about how a pairsort function would do this, taking the values in the list and splitting them into key/value pairs, packing them into Pair objects, and... Wait a moment - this is exactly the task that the pairs function already does. Rather than writing a pairsort function as a native XS function from scratch, it would be far easier to reuse pairs and regular sort. All I'd need is a function to take the sorted output, that list of Pair objects, and turn it back into the even-sized list of keys and values. An inverse of the pairs function - call it unpairs.

Rather than a complex XS function to implement sorting a list of keys and values, all I'd need is the simpler pairs and unpairs functions, wrapped around a regular sort. This could easily be done in Perl code:

sub pairsort(&@)
{
   my $code = shift;
   # $code sees sort's $a and $b, provided the caller shares this package
   return unpairs sort { $code->() } pairs @_;
}
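
To show it in use, here is a small self-contained sketch. It assumes a List::Util recent enough to provide unpairs (1.42 or later), and that pairsort is defined in the same package as the code calling it, so that both agree on which $a and $b they mean. The example data is made up.

```perl
use strict;
use warnings;
use List::Util 1.42 qw( pairs unpairs );

# Perl-level pairsort built from pairs, sort and unpairs
sub pairsort(&@)
{
   my $code = shift;
   return unpairs sort { $code->() } pairs @_;
}

my @kvlist = ( banana => 3, apple => 1, cherry => 2 );

# Inside the block $a and $b are Pair objects, not plain values
my @by_key   = pairsort { $a->key   cmp $b->key   } @kvlist;
my @by_value = pairsort { $a->value <=> $b->value } @kvlist;

print "by key:   @by_key\n";
print "by value: @by_value\n";
```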

In fact, by writing them out in a little table we can see that the two functions pairs and unpairs are in fact quite sufficient to let us implement any of the higher-order pairwise functions, using only the regular single-element functions perl already provides:

pairfirst { COND } LIST  ===   unpairs first { COND } pairs LIST
pairgrep { COND } LIST   ===   unpairs grep { COND }  pairs LIST
pairmap { FUNC } LIST    ===   unpairs map { FUNC }   pairs LIST
pairsort { CMP } LIST    ===   unpairs sort { CMP }   pairs LIST
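
The pairgrep row is easy to check directly. In this sketch (again assuming List::Util 1.42 or later, with made-up data) the one difference to note is how the condition is spelled: pairgrep sees the value as $b, whereas the plain grep sees a Pair object in $_.

```perl
use strict;
use warnings;
use List::Util 1.42 qw( pairs unpairs pairgrep );

my @kvlist = ( one => 1, two => 2, three => 3 );

# The dedicated pairwise function
my @direct = pairgrep { $b > 1 } @kvlist;

# The same filter built from pairs, regular grep and unpairs
my @via_pairs = unpairs grep { $_->value > 1 } pairs @kvlist;

print "pairgrep:           @direct\n";
print "unpairs grep pairs: @via_pairs\n";
```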

You might at this point start to notice some similarities with the Schwartzian Transform; a method of sorting a list by the values returned by some transformation function on each item. To remind ourselves what this looks like, here's an example that sorts a list of strings by the number of vowels each has:

my @sorted_strings = map { $_->[0] }
                     sort { $a->[1] <=> $b->[1] }
                     map { [ $_, length( $_ =~ s/[^aeiou]+//gr ) ] }
                     @strings;   # the input list of strings

This starts to look rather similar to sorting an even-sized kvlist by, say, a simple string order on its keys:

my @sorted_kvlist = unpairs
                    sort { $a->key cmp $b->key }
                    pairs @kvlist;   # the input key/value list

What both of these have in common is their method of attacking a problem that is in some way inconvenient to process by changing the form of the data into a shape that is easier to solve. This is done in a reversible way so that at the end we can undo the initial transformation and return back to a list in the form we started with, but now in the right order.

In the Schwartzian's case, we take the original list of values and turn them into two-element array references containing the transformed value we'd like to sort by and the original value. In the key/value pair case, we take the original list in sets of two elements and bundle these keys and values into Pair objects. Each transformation can be easily undone, but before we undo it, we first pass the list through our required sort function, to obtain the order that we'd like.

What we have achieved then, is a way to create new sorting behaviours by just providing a pair of functions that let us use a regular sort. In the Schwartzian's case it simply gives us a more efficient way to achieve this sort by reducing the number of times we need to transform items into their sorting values, but in the pairsort case it lets us perform what previously we could not, as perl's native sort does not understand how to take items in key/value pairs from its input list. In both cases, we can write in a more generalised form, as:

my @output = UNTRANSFORM
             sort { COMPARE($a, $b) }
             TRANSFORM @input;

This works because the two functions of TRANSFORM and UNTRANSFORM are inverses of each other. We transform the input list from one type into a list of a different type, perform the sorting operation on that type instead, then undo that transformation and return back to our original type now in a new order.

There is of course nothing unique to strings or pairs, or indeed even sorting, in this pattern. It can apply to many other kinds of list-processing function, or even problems on other kinds of data. Whenever you have a problem that at first glance seems hard, it pays to consider whether the problem can be turned into one that is inherently easier by changing the values from one type into another, and then back again.
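
To make the generalised form concrete, here is one way it might be written as real code. The sort_via helper is purely hypothetical (nothing like it exists in List::Util), and passing the comparison its two items as arguments rather than via $a and $b is just a choice that keeps the sketch short.

```perl
use strict;
use warnings;
use List::Util 1.42 qw( pairs unpairs );

# Hypothetical helper: transform, sort in the transformed shape,
# then undo the transformation
sub sort_via
{
   my ( $transform, $untransform, $compare, @input ) = @_;

   my @transformed = $transform->( @input );
   my @sorted      = sort { $compare->( $a, $b ) } @transformed;
   return $untransform->( @sorted );
}

# A Schwartzian transform: sort strings by their length
my @by_length = sort_via(
   sub { map { [ $_, length($_) ] } @_ },   # TRANSFORM
   sub { map { $_->[0] } @_ },              # UNTRANSFORM
   sub { $_[0][1] <=> $_[1][1] },           # COMPARE
   qw( pear fig banana ),
);

# A pairsort: sort a key/value list by its keys
my @by_key = sort_via(
   sub { pairs @_ },
   sub { unpairs @_ },
   sub { $_[0]->key cmp $_[1]->key },
   ( b => 2, a => 1 ),
);

print "@by_length\n";
print "@by_key\n";
```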


Printing function calls in Perl

Lately my favourite way to interpolate function calls into printed strings is to use the ${\ ... } notation for embedding arbitrary expressions into strings:

print "f($number) = ${\ fibonacci($number) }\n";

Other techniques include separate arguments:

print "f($number) = ", fibonacci($number), "\n";

or a helper variable:

my $result = fibonacci($number);
print "f($number) = $result\n";

or even printf:

printf "f(%d) = %d\n", $number, fibonacci($number);

Of all these techniques I tend to prefer either of the first two, because they lead to putting the expressions "in-line" with the rest of the text string, whereas in the latter two they sit elsewhere, making it harder to see at a glance what gets printed where. Especially with printf's positional arguments, it can be easy to be "off-by-one" with a large number of arguments, and put everything in the wrong place.

In addition, the first technique is not limited to print calls. It relies only on plain string interpolation, so it works just as well for constructing strings in general.
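
For instance, a quick sketch building a string with no print involved. The fibonacci here is a naive stand-in of my own, only there so that the example runs.

```perl
use strict;
use warnings;

# Naive stand-in implementation, just for the example
sub fibonacci
{
   my ( $n ) = @_;
   return $n < 2 ? $n : fibonacci( $n - 1 ) + fibonacci( $n - 2 );
}

my $number = 10;

# Plain assignment using the ${\ ... } notation
my $message = "f($number) = ${\ fibonacci($number) }";

# The sprintf equivalent, for comparison
my $formatted = sprintf "f(%d) = %d", $number, fibonacci($number);

print "$message\n";
print "$formatted\n";
```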

(This post was originally written to answer a question on StackOverflow.)


Perl module idea for making bitfields


I've been doing a lot of hardware IO work lately, which involves lots of talking to hardware devices, where there are byte-wide registers that store little bitfields, sometimes of individual and unrelated bits packed together. To make this easier I tend to write myself lots of pairs of functions to pack/unpack the fields in one of these bytes; for instance:

use constant {
      MASK_RX_RD      => 1<<6,
      MASK_TX_DS      => 1<<5,
      MASK_MAX_RT     => 1<<4,
      EN_CRC          => 1<<3,
      CRCO            => 1<<2,
      PWR_UP          => 1<<1,
      PRIM_RX         => 1<<0,
};

sub unpack_CONFIG
{
   my ( $config ) = @_;
   return
      MASK_RX_RD  => !!( $config & MASK_RX_RD ),
      MASK_TX_DS  => !!( $config & MASK_TX_DS ),
      MASK_MAX_RT => !!( $config & MASK_MAX_RT ),
      EN_CRC      => !!( $config & EN_CRC ),
      CRCO        => $CRCOs[!!( $config & CRCO )],
      PWR_UP      => !!( $config & PWR_UP ),
      PRIM_RX     => !!( $config & PRIM_RX );
}

sub pack_CONFIG
{
   my %config = @_;
   return (
      ( $config{MASK_RX_RD}  ? MASK_RX_RD  : 0 ) |
      ( $config{MASK_TX_DS}  ? MASK_TX_DS  : 0 ) |
      ( $config{MASK_MAX_RT} ? MASK_MAX_RT : 0 ) |
      ( $config{EN_CRC}      ? EN_CRC      : 0 ) |
      ( ( _idx_of $config{CRCO}, @CRCOs )
         // croak "Unsupported 'CRCO'" ) * CRCO |
      ( $config{PWR_UP}      ? PWR_UP      : 0 ) |
      ( $config{PRIM_RX}     ? PRIM_RX     : 0 ) );
}

This convenient pair of functions bidirectionally converts bytes into key/value lists. As well as containing 6 simple boolean values, there's also an enumerated field, represented by the values in the array @CRCOs. These functions convert those too.

I'm getting tired of writing these pairs of functions. Hey, this is Perl right? :) I should get Perl to write them for me.

TL;DR - I propose the following

I'd like instead to write something like:

use bitfield;

bitfield CONFIG =>
   MASK_RX_RD  => boolfield(6),
   MASK_TX_DS  => boolfield(5),
   MASK_MAX_RT => boolfield(4),
   EN_CRC      => boolfield(3),
   CRCO        => enumfield(2, qw( values here )),
   PWR_UP      => boolfield(1),
   PRIM_RX     => boolfield(0);

The operation here is that 'bitfield' automatically exports a function called bitfield, along with the various *field() functions that create field definitions. boolfield() declares a single true/false boolean bit position; enumfield() declares a range of bits that give an enumeration. I guess an intfield() would also be required, to use a range of bits as an integer, and finally most likely a customfield() taking a pair of conversion CODE refs or somesuch.
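
To get a feel for whether this is workable, here is a very rough single-file sketch of how it might hang together. No such module exists yet, so every name and behaviour here is an assumption: field definitions are just pairs of closures, bitfield installs pack_NAME and unpack_NAME into the calling package, and the enum values 8 and 16 are placeholders.

```perl
use strict;
use warnings;
use Carp;

# Each field constructor returns [ packer, unpacker ] closures

sub boolfield
{
   my ( $bit ) = @_;
   my $mask = 1 << $bit;
   return [
      sub { $_[0] ? $mask : 0 },           # value -> bits
      sub { ( $_[0] & $mask ) ? 1 : 0 },   # bits  -> value
   ];
}

sub enumfield
{
   my ( $shift, @values ) = @_;

   # Number of bits needed to hold an index into @values
   my $width = 1;
   $width++ while ( 1 << $width ) < @values;
   my $mask = ( ( 1 << $width ) - 1 ) << $shift;

   my %index_of = map { $values[$_] => $_ } 0 .. $#values;
   return [
      sub {
         my ( $value ) = @_;
         my $idx = $index_of{$value} // croak "Unsupported value '$value'";
         return $idx << $shift;
      },
      sub { $values[ ( $_[0] & $mask ) >> $shift ] },
   ];
}

sub bitfield
{
   my ( $name, %def ) = @_;   # field name => field definition
   my $pkg = caller;

   no strict 'refs';
   *{"${pkg}::pack_$name"} = sub {
      my %values = @_;
      my $bits = 0;
      foreach my $field ( keys %values ) {
         my $fielddef = $def{$field} or croak "Unknown field '$field'";
         $bits |= $fielddef->[0]->( $values{$field} );
      }
      return $bits;
   };
   *{"${pkg}::unpack_$name"} = sub {
      my ( $bits ) = @_;
      return map { $_ => $def{$_}[1]->( $bits ) } keys %def;
   };
}

bitfield( CONFIG =>
   EN_CRC => boolfield(3),
   CRCO   => enumfield(2, 8, 16),   # placeholder enum values
   PWR_UP => boolfield(1),
);

my $byte   = pack_CONFIG( EN_CRC => 1, CRCO => 16, PWR_UP => 1 );
my %fields = unpack_CONFIG( $byte );

printf "packed = 0x%02X\n", $byte;
print  "$_ = $fields{$_}\n" for sort keys %fields;
```

Fields left out of a pack_CONFIG call simply contribute zero bits in this version; whether that or an error is the better default is one of the things I'd want opinions on.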

What does anyone think to that?


Released: Tickit 0.47

New up in Tickit 0.47, some fairly small and incremental updates:

  • Support the 'blink' terminal attribute

    Both the libtickit C library and the Tickit Perl module now support the blink attribute, much to my hesitation. ;)

    I'm not sure I want to encourage this sort of thing, but the Neovim project said they wanted this, so I've reluctantly added support for it all the same.

  • Bugfix for renderbuffer 'get*' methods

    When offset and clipping are applied, previously the get* methods didn't pay attention to this, fetching content relative to the toplevel, or segfaulting if requested out of bounds. This has now been fixed.

  • Tickit::Widget::HBox and ::VBox have now been moved to the Tickit-Widgets distribution.

    This dist is now linked to explicitly from the documentation. This supports the longterm goal of turning 'Tickit' into purely the window-layer downwards, and having all the widget support live in its own distribution, backed eventually by its own C library.

  • Nicer handling of fallback terminfo attributes for definitions missing them.

    Certain distributions of terminfo databases seem to be lacking certain essential attributes for some termtypes. For example, the upstream 'screen' terminfo lacks erase_chars. libtickit now includes some fallback "likely to work" strings to handle these cases. This turns an instant failure on startup into, at worst, wrong output, but hopefully most terminals should understand these standard strings. At least, if they don't and their terminfo doesn't supply a more correct string, I suspect libtickit is far from the only place that's broken.