NAME

quine - The Pentomino Solving Quine in Perl


SYNOPSIS

CAUTION! CAUTION! This program can use a lot of computing resources for very long periods of time. It may take an hour or more just to find the first of thousands of possible solutions. CAUTION! CAUTION!

    $  perl  quine

For more output, try adding the --verbose command line option. This will cause the program to print incomplete results every few seconds, to help you see what it is doing. These partial results are usually full quines, but this isn't guaranteed. Full solutions are always quines.


WHAT IS A QUINE?

In programming terms, a quine is a program which, when run, reproduces its own source code. There are very many ways to do this in Perl, since it is a language which excels at interpreting live code. A very short quine in Perl, attributed to a V. Vinay, is as follows:

    $_=q(print"\$_=q($_);eval;");eval;

The term is named after logician and author Willard V. Quine.


WHAT ARE PENTOMINOS?

A domino is made up of pieces adjoining two squares. The set of pentominos, or pentominoes, are those pieces which can be constructed similarly of five squares. Taking five squares and arranging them edge to edge, one may discover that there are twelve uniquely shaped varieties.

         @      @                                #
     #   @  ##  @   #    @   ##  @@@   ##   @    #   @@
    ##   @  #   @@  ##   @   #   @    ##   @@@  ##   @
     ##  @  #    @  ##  @@@  ##  @    #     @    #  @@
         @  #

The most common puzzle which can be played with pentominos is to fill a given space with some or all of the pentomino pieces. Since there are twelve different pieces, shapes with 60 square units are popular.

For example, a 6x10 rectangle can fit all twelve pentomino pieces easily. And in fact, all twelve can fit in the 6x10 in an astounding 2336 different ways! This makes the pentominos puzzle seem like trivial child's play.

Well... not so fast. Try it by hand sometime, using some graph paper. Most people can find a way to fit 11 pieces in a few minutes, but many never succeed in arranging all 12 at once. A lucky few can find a handful of unique solutions.


SOURCE AND OUTPUT

This pentomino game solver is also a quine. The program plays with the pentomino pieces, and reproduces itself on every solution found. Each reproduction differs in the whitespace, but is a complete copy which can be saved and executed as-is. Here is an example of the output, which can also be run to solve pentominos on another terminal.

 #!/usr/bin/perl
 $_=$w=q#
 $w=~s=\s|\n==sg ;++$L{'                 FfFffFF fFf'};($b,$B,$bb,$BB)=(10,6,8,6
 );$q=int($b/2); ($*,$",                 $/,$:,$ ;)=split//,"\040:\n\043_";$D=$"
 x($b+2);$D=$D.( $".($*x                 $b).$") x$B.$D;$Z=sub{reverse@_};++$L{'
 IIIIi'};$T=sub{ my($x,$                 P,$T,$O ,$F)=(0,$b*$q+$q,@_);$_=$"x($b+
 $b*$b+$b);my@T= (-2-$b,                 1,2+$b, -1);@T=&$Z(@{T})if$F;while($T){
 $X=substr($T,0, 1,'');$                 x=lc${X                         }if!$x;
 substr($_,$P,1, $x);(${X}ne$x)?         ($P+=$T [$O]):($O=($O+1)%(@T))} s=\A$"+
 ==s;s=$"+\z==s; $_};++$L{'LlLLL         l'};$W= sub{$_=shift;eval(qq%y/ $"f-z/.
 $*/%);s-(\.{4,} )-'.{'.length($         1).'}'- ge;qr/$_/};++$L{'NNnnnN nNn'};+
 +$L{'PPpPpPp'}; $r=sub{local($_         ,$m)=@_ ;s/$"//g;s/(.)/$1x$bb/g e;$x=$b
 b*$b;s|(.{$x})| $1x$BB|eg;s~(?<         =(.))(. )~($1eq$2)?$2:$*~eg;$x- -;s~(?<
 =(.).{$         x})(.)~                 ($2ne$1                 )?$*:$2
 ~ge;s|. (.{$x})|\1$/|g;                 s!.*?$/ !!s;$x=$w.$;.$w ;s:\S:substr($x
 ,0,1,'' ):ge;print"$:!$                 ^X$/",q @$_=$w=q@,$:,$/ ,$_,"$:",q@=>s=
 \s|\n|@ ,$;,$;,q@.*==sg                 =>eval@ ,$/,"$;$;DATA$; $;$*$m$/";};++$
 L{'TTtT ttTTt'};$L=sub{                 my($Z,$ O,$F);for$x(key s%L){$L{$x}={};
 for$O(0 ..3){for$F(0,1)                 {$Z=&$T ($x,$O,$F);$L{$ x}{$Z}=&$W($Z);
                                         }}}};++         $L{'UuU
 UuUu'};($L5,$L4 ,$L3,$L         L)=($b- 4,$b-3,         $b-2,$b-1);($S, $s)=("S
 -S$LL:","S--S$L L:");$h         =["S$b: S-S$b:S         ","S$b:${S}S-S$ b:S","S
 S$LL:${s}SS","S $b:${S}         ${S}S-S $b:S","         SSS$L4:S---S$L4 :SS","S
 $b:${S}${s}SS", "S$b:S-         S$L3:${ s}SS","         SS$LL:S--S$L3:S -S$b:S"
 ,"SS$LL:${s}S-S $b:S","         S$b:$S$ S${S}S-         S$b:S","SSS$L5: S----S$
 L5:SS",         "SS$LL:         S--S$L3 :${s}SS                 ","S$b: $S$S$S$
 S${S}S- S$b:S"];for(@$h){s=S=\1 34S=g;s =\-=$*=                 g;s=(\d +):=.{$
 1}=g;$_ =qr/$_/}++$L{'VVvVVv'}; $H=sub{ my$X=sh                 ift;for $x(@$h)
 {return ''if$X=~/$x/}1;};++$L{' WwWwwwW wWw'};$                 V=sub{m y($I,$D
 ,$q,$Q) =(0,@_);for$I(0..-1+len gth($Q) ){$_=su                 bstr($Q ,$I,1);
 if($"eq $_){next}substr($D,$I+$ q,1,$_) ;}&$H($                 D)&&&$A ($D);};
 ++$L{'X         XxxXxXx         xXXx'};                                 $S=$c=0
 ;++$L{'YyYyyYYY y'};$A= sub{++$c;my($D)=@_;!@A&         &(&$r($D,"SOLUTION".++$
 S),return);@ARG V&&(@A< 3)&&&$r($D,"$c$*moves")         ;my($aa,$a)=('',pop@A);
 for$aa(sort(key s%{$L{$ a}})){while($D=~m/(?=$L         {$a}{$aa})/g){&$V($D,po
 s($D),$aa);++po s($D);} }push@A,$a;};++$L{'ZzZZ         zzzZz'};&$L;@A=&$Z(sort
 (keys%L));&$A($ D);__@@ *%&(*%(*@&%(*&@^_$w=~s=         \s|\n==sg;++$L{'FfFffFF
 #=>s=\s|\n|__.*==sg=>eval
 __DATA__                                            [ e d @ h a l l e y . c c ]

(This particular output isn't a full solution-- it only got ten of the twelve pieces in place. But it's still a full quine; run it to get some complete solutions.)


DATA PREPARATION

In this program, the pentomino problem is solved Perl-style, thinking ``out of the box'' to organize a 2D problem into a series of 1D string elements that benefit from a regular expression engine. However, this takes some creative but straightforward data preparation.

An empty pentominoes board $D is arranged as a one-dimensional string of whitespace, broken into ``rows'' with non-whitespace so pieces can't be placed across the wrapped ``edges.'' Though there are no newlines in this string, we can wrap here for clarity, rendering the board thus.

    $D = "::::::::::::
          :          :
          :          :
          :          :   # a 6 x 10 rectangle inside box, in 1D
          :          :
          :          :
          :          :
          ::::::::::::";

Next, a library of pieces is created in a hash %L. This is a two-stage process. There are twelve pentomino pieces, often named after the letters they most resemble.

         i  l
     ff  i  l    n  pp  ttt       v    w     x    y  zz
    ff   i  l   nn  pp   t   u u  v    ww   xxx  yy   z
     f   i  ll  n   p    t   uuu  vvv   ww   x    y   zz
         i      n                                 y

(I could have gone a lot to further to erode these naming clues in the obfuscation, but decided not to bother with swapping names for the wrong shapes or other hacks.)

For each piece, the name of the piece is a mixed-case string like $t = 'TTtTttTTt'. This odd-looking name is actually an instruction tape for a special tape-driven drawing routine &$T(). This routine acts like a LOGO turtle, with two possible instructions. The simulated pen ``draws'' forward in a blank scratch buffer for capital letters, and rotates 90 degrees for lowercase letters. Leading and trailing empty paper is stripped, leaving the essential characters and the 1D structural spacing.

                   "::::::::::::
                    ::::::::::::
                    :::::uu:::::
    'UuUUuUu'  -->  :::::u::::::  -->  'uu::::::::::u:::::::::::uu'
                    :::::uu:::::
                    ::::::::::::
                    ::::::::::::"

Up to eight orientations (four rotations and two flips) are collected for each piece, but duplicates due to rotational symmetry are discarded. We'll call each of these orientations a ``glyph string.'' Next, each such glyph string is rewritten with &$W() into a precompiled regular expression which can later be used to find suitable empty space in the playing board. The glyph string and regex are paired in a hash in the library. The 'W' piece is thus ultimately stored in the library %L as four unique pairs.

    $L{'WwWwwwWwWw'} = {
        'w:::::::::::ww:::::::::::ww' => qr/(?-xism: .{11}  .{11}  )/,
        'ww:::::::::::ww:::::::::::w' => qr/(?-xism:  .{11}  .{11} )/,
        'ww:::::::::ww::::::::::w' => qr/(?-xism:  .{9}  .{10} )/,
        'w::::::::::ww:::::::::ww' => qr/(?-xism: .{10}  .{9}  )/
    };

The preparation is done, and the solver can now play pentominoes.


SOLVER ALGORITHM

The solver consists of three pieces: an Assayer in &$A(), a Validator in &$V() and an optional Heuristic in &$H() to avoid wasting a ton of runtime on unprofitable attempts. The Assayer and Validator are called recursively up to twelve levels deep, once per piece until the board is full.

A runtime stack @A initially holds the names of all of the twelve pieces. The ``main'' execution just calls &$A() with a blank board.

The Assayer just iterates through all possible positions on the board for every possible hole to fit the next piece. Since Perl makes this so easy, it can skip any place that has other pieces or board edges in the way, by using each of the available regular expressions for that piece.

For each position on the board, the Assayer calls the Validator. The &$V() draws a piece on a copy of the board, replacing only spaces with letters as copied from the left-hand glyph string.

    $D = "::::::::::::
          :  fuui   p:
          :  ffui vpp:
          : ffuuilvpp:   # f, i, l, n, p, t, u, v...
          :    tilvvv:
          :  nntil   :
          :nnntttll  :
          ::::::::::::";

Once it's drawn, the Validator calls the &$H() Heuristic for any hints that the board is not going to succeed. If there's any little islands of surrounded space on the board which are too small for any valid piece to fit, then we know the board won't solve. The program can save huge amounts of time by not trying to prove impossible solutions the hard way. This routine uses a few hard-coded regular expressions to look for singleton, double- or triple-holes. (An overly verbose improvement could extend this to look for all nineteen quadruple-holes, and even some common six-hole spaces.)

    $h = [ ...
           qr/(?-xism:\S\S.{9}\S  \S.{9}\S\S)/,
           qr/(?-xism:\S.{10}\S \S.{9}\S \S.{9}\S \S.{10}\S)/,
           qr/(?-xism:\S\S\S.{7}\S   \S.{7}\S\S)/,
           ...
         ];

Lastly, for worthwhile board positions, the Validator recurses to call the Assayer again. Whenever the Assayer finds there's no @A pieces remaining, a solution has been found, and will be rendered.

If any arguments are found in @ARGV, the Assayer also renders the incomplete board in progress, with some speed statistics, every few seconds. These status boards are not 100% guaranteed to be full quines; only full solutions are code-complete.

Left alone for many hours, the program would theoretically arrive at all possible solutions to the given board. This version does not weed out symmetrically identical solutions, though I may add a small feature to do this. There should be 2336 unique solutions for the 6x10 board.


INTERRUPTIONS

This is a massive search space to brute force. I was impressed with Perl's muscle without my really trying hard to optimize the problem further. On a 1.4GHz dual Athlon box running Perl 5.8.0 on Linux [April 2003], some statistics outputs showed that it was placing around 1200 pieces per second with full heuristics. (It could place as many as 4500 without heuristics, but timing the solution-to-solution progress showed vast numbers of wasted moves.) However, it's still not going to find all of the possible solutions to a given board in one evening.

Since this program could take days or weeks to completely find all solutions for a given board size, the chances of running that long on a fickle whim are nearly astronomically small. Most people (including myself) will let it run for a few minutes or hours, but then kill it.

It would be fairly straightforward but cumbersome to make this program safely interruptable and resumable. Since this is just a curiosity effort, I have not implemented that feature in this version.

An interruptable version would encode the state of the piece stack @A into the quine, along with pos() information for each piece that had already been placed. The quine would have to do some extra recovery work when started; it would have to shortcut the recursion through V and A until it ``caught up'' with the current position, and then resume the effort.


RENDERING

The final step is simply to reproduce. This program outputs its own complete source code whenever it solves the puzzle. The source code is formatted to show the solution to the puzzle graphically; all solutions represent the same program source code with different whitespace, and any such copy can run the whole series to generate all other possible solutions of the same board size.

The quine nature is not done by rewinding the DATA handle, but by having most of the program in a scalar variable that's evaluated. The remaining infrastructure from the shebang to the data token are replicated in the rendering routine. The main string is stripped of all of its whitespace before being evaluated, and for the board itself.

The &$R() routine renders the small 1D board string as a full-fledged quine-like output. The board is scaled up to ensure that it requires at least as many non-whitespace characters as are in the program. Any smaller, and the quine would not be complete. Since the board may be larger than the program, the program is duplicated following a __ marker to fill in any remaining puzzle image. A given message argument is printed after the __DATA__ token, for solutions or status. The original has a signature here which is not replicated since it's not considered source code.


OBFUSCATION

This program is not really heavily obfuscated, beyond a few basic features. Running perltidy or B::Deparse on one string gives away the whole show, syntactically speaking. In fact, emacs isn't even very confused about how to do the syntax coloring in the fully compressed form.

Due to the quine's use of whitespace, all of the program functions within are thus limited to forms of Perl syntax which do not depend on any whitespace between tokens. The whole program has to fit within the puzzle space, as well, but there's still a bit of room for ``playing golf'' or reducing the overall bulk. These constraints lead to some interesting syntax.

Some built-in Perl variables such as $", $* and $/ are used instead of character or numeric constants. A few creative terminators are used in the s/// and m// operators. Some semicolons and commas have been replaced with other separators where possible.

For visual interest in the quine, the %L initialization is spread out across the program a little bit.


AUTHOR

Ed Halley <ed@halley.cc> http://www.halley.cc/ed/


COPYING

Copyright (C) 2003 Ed Halley

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself. If you publish this work, please include this analysis documentation along with the obfuscated quine source code.