@Tyil
Last active November 16, 2019 16:38

Making a simple bot in Raku

Making IRC bots is incredibly simple in Raku, thanks to IRC::Client. It allows you to create a very simple bot in about 20 lines of code. There's a plugin system that allows easy re-use of code between multiple bots, and adding customized features can be as easy as dropping in an anonymous class.

So, let's get to it!

Get your dependencies installed

Raku uses zef as the standard module installer, and if you're reading this, I'm assuming you have it available to you. Install IRC::Client with zef, and you should be good to get started.

zef install IRC::Client

Setting up the bot

To set up the bot, we'll need a nickname to use, a server to connect to and a list of channels to join. To make it easier to run this as a program from your shell, I'll be using a MAIN sub as well.

use IRC::Client;

sub MAIN ()
{
    IRC::Client.new(
        nick => 'raku-advent',
        host => 'irc.darenet.org',
        channels => < #advent >,
    ).run;
}

Let's save this in a file called bot.pl6, and run it.

perl6 bot.pl6

This will run, and if you're in the channel you specified in channels, you should see the bot joining in a short moment. However, the program itself doesn't seem to provide any output. It would be highly convenient, especially during development, to show what it's doing. This is possible by enabling the debug mode. Add debug => True to the new method call, making it look as follows.

IRC::Client.new(
    nick => 'raku-advent',
    host => 'irc.darenet.org',
    channels => < #advent >,
    debug => True,
).run;

If you restart the application now, you will see there's a lot of output all of a sudden, showcasing the IRC commands the bot is receiving and sending in response. Now all we need to do is add some functionality.

Making the bot work

As described earlier, functionality of the bot is added in using plugins. These can be any class that implements the right method names. For now, we'll stick to irc-to-me, which is a convenience method which is triggered whenever the bot is spoken to in a private message, or directly addressed in a channel.

The simplest example to get started with here is to simply have it respond with the message you sent to the bot. Let's do this by adding an anonymous class as a plugin to the new method call.

IRC::Client.new(
    nick => 'raku-advent',
    host => 'irc.darenet.org',
    channels => < #advent >,
    debug => True,
    plugins => [
        class {
            multi method irc-to-me ($e) {
                $e.text
            }
        }
    ],
).run;

When you restart the bot and talk to it on IRC, you will see it responding to you with the same message you sent it.

<@tyil> raku-advent: hi
<raku-advent> tyil, hi
<@tyil> raku-advent: how are you doing
<raku-advent> tyil, how are you doing
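
Because a plugin is just a class with the right method names, its logic can be tried out without connecting to a server. The following is only a sketch under assumptions: Shouter and FakeEvent are hypothetical names made up for this illustration, and FakeEvent merely imitates the text method used above; it is not part of IRC::Client.

```raku
# Hypothetical plugin: replies with the message in upper case.
class Shouter {
    multi method irc-to-me ($e) {
        $e.text.uc
    }
}

# Hypothetical stand-in for the event object; it only provides .text.
class FakeEvent {
    has $.text;
}

my $reply = Shouter.new.irc-to-me(FakeEvent.new(text => 'how are you doing'));
say $reply;   # HOW ARE YOU DOING
```

In the bot itself, Shouter.new would go into the plugins list in place of the anonymous class.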

Adding some real features

So, you've seen how easy it is to get started with a simple IRC bot in just over a dozen lines. Let's add two features that you may want your bot to support.

For convenience's sake, I will only cover the class implementing the features, not the entire IRC::Client.new block.

Uptime

First off, let's make the bot able to show the time it's been running for. For this, I'll make it respond to people asking it for uptime. We can use the irc-to-me convenience method for this again. After all, we probably don't want it to respond every time someone discusses uptime, only when the bot is asked directly about it.

In Raku, there's a special variable called $*INIT-INSTANT, which contains an Instant of the moment the program started. We can use this to easily get the Duration that the program has been running for.

class {
    multi method irc-to-me ($ where *.text eq 'uptime') {
        my $response = "I've been alive for";
        my ($seconds, $minutes, $hours, $days, $weeks) =
            (now - $*INIT-INSTANT).polymod(60, 60, 24, 7);

        $response ~= " $weeks weeks" if $weeks;
        $response ~= " $days days" if $days;
        $response ~= " $hours hours" if $hours;
        $response ~= " $minutes minutes" if $minutes;
        $response ~= " $seconds seconds" if $seconds;

        $response ~ '.';
    }
}

Now, whenever you ask the bot for uptime, it will respond with a human friendly uptime notification.

<@tyil> uptime
<@tyil> raku-advent: uptime
<raku-advent> tyil, I've been alive for 5 minutes 8 seconds.
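
The polymod call is easier to follow with a fixed number. Below is a minimal sketch, assuming a hypothetical helper named format-uptime that is not part of the plugin above. It truncates the elapsed time to whole seconds first, since a Duration may carry a fractional part that would otherwise show up in the seconds.

```raku
# Hypothetical helper: build the uptime message from a number of seconds.
sub format-uptime (Real $elapsed --> Str) {
    # Truncate to whole seconds, then split into units, smallest first.
    my ($seconds, $minutes, $hours, $days, $weeks) =
        $elapsed.Int.polymod(60, 60, 24, 7);

    my @parts;
    @parts.push("$weeks weeks")     if $weeks;
    @parts.push("$days days")       if $days;
    @parts.push("$hours hours")     if $hours;
    @parts.push("$minutes minutes") if $minutes;
    @parts.push("$seconds seconds") if $seconds;

    # Avoid an empty message when less than a second has passed.
    @parts.push('0 seconds') unless @parts;

    "I've been alive for " ~ @parts.join(' ') ~ '.';
}

say format-uptime(308);   # I've been alive for 5 minutes 8 seconds.
say format-uptime(now - $*INIT-INSTANT);
```

Here 308 seconds splits into a remainder of 8 seconds and 5 minutes, with the larger units all zero, so only two parts end up in the message.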

User points

Most channels have a bot that keeps track of user points, or karma as it's sometimes referred to. There's a module already that does this for us, called IRC::Client::Plugin::UserPoints. We don't have to do much apart from installing it and adding it to the list of plugins.

zef install IRC::Client::Plugin::UserPoints

Once this finishes, the module can be used in your code. You will need to import it with a use statement, which you can put directly under the use IRC::Client line.

use IRC::Client;
use IRC::Client::Plugin::UserPoints;

Now, in the list of plugins, add it as a new entry.

plugins => [
    IRC::Client::Plugin::UserPoints.new,
    class {
        ...
    },
],

This plugin makes the bot respond to !scores, !sum and whenever a nick is given points using a ++ suffix, for instance, tyil++.

<@tyil> raku++
<@tyil> raku++
<@tyil> !scores
<raku-advent> tyil, « raku » points: main: 2

Finding plugins

All plugins for IRC::Client that are shared with the community have the prefix IRC::Client::Plugin::, so you can search for that on modules.perl6.org to find plugins to use. Of course, you can easily add your own plugins to the ecosystem as well!

Winding down

As you can see, with some very simple code you can add some fun or important tools to your IRC community using the Raku programming language. Try it out and have some fun, and share your ideas with others!


Parsing Firefox' user.js with Raku

One of the simplest ways to properly configure Firefox, and make the configuration syncable between devices without the need for 3rd party services, is through the user.js file in your Firefox profile. This is a simple JavaScript file that generally contains a list of user_pref function calls. Today, I'll be showing you how to use the Raku programming language's Grammars to parse the content of a user.js file. Tomorrow, I'll be expanding on the basis created here, to allow people to programmatically interact with the user.js file.

The format

Let's take a look at the format of the file first. As an example, let's use the startup page configuration setting from my own user.js.

user_pref("browser.startup.homepage", "https://searx.tyil.nl");

Looking at it, we can deconstruct one line into the following elements:

  • Function name: in our case this will almost always be the string user_pref;
  • Opening bracket;
  • List of arguments, separated by ,;
  • Closing bracket;
  • A ; ending the statement.

We can also see that string arguments are enclosed in ". Integers, booleans and null values aren't quoted in JavaScript, so that's something we need to take into account as well. But let's set those aside for now, and first get the example line parsed.

Setting up the testing grounds

I find one of the easiest ways to get started with writing a Grammar is to just write a small Raku script that I can execute to see if things are working, and then extend the Grammar step by step. The starting situation would look like this.

grammar UserJS {
    rule TOP { .* }
}

sub MAIN () {
    my @inputs = ('user_pref("browser.startup.homepage", "https://searx.tyil.nl");');

    for @inputs {
        say UserJS.parse($_);
    }
}

Running this script should yield a single Match object containing the full test string.

「user_pref("browser.startup.homepage", "https://searx.tyil.nl");」

The 「 and 」 markers indicate that we have a Match object, which in this case signifies that the Grammar parsed the input correctly. This is because of the placeholder .* that we're starting out with. Our next steps will be to add rules in front of the .* until that particular bit doesn't match anything anymore, and we have defined explicit rules for all parts of the user.js file.

Adding the first rule

Since the example starts with the static string user_pref, let's start on matching that with the Grammar. Since this is the name of the function, we'll add a rule named function-name to the grammar, which just has to match a static string.

rule function-name {
    'user_pref'
}

Next, this rule needs to be incorporated into the TOP rule, so it will actually be used. Whitespace in the body of a rule is not matched literally (a rule treats it as optional whitespace in the input), so you can lay out the TOP rule with all the elements we're looking for one after another, each on its own line. This will make it more readable in the long run, as more things will be tacked on as we continue.

rule TOP {
    <function-name>
    .*
}

Running the script now will yield a little more output than before.

「user_pref("browser.startup.homepage", "https://searx.tyil.nl");」
 function-name => 「user_pref」

The first line is still the same, which is the full match. It's still matching everything, which is good. If it didn't, the match would fail and it would return a Nil. This is why we keep the .* at the end.

There's an extra line this time, though. This line shows the function-name rule having a match, and the match being user_pref. This is in line with our expectations, as we told it to match that literal, exact string.
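
To see why the .* placeholder matters, here is a small sketch with a hypothetical grammar named FunctionNameOnly, made up for this illustration, that leaves the placeholder out. The parse method has to match the whole input, while subparse only has to match from the start of the string.

```raku
# Hypothetical grammar: the same function-name rule, but no .* placeholder.
grammar FunctionNameOnly {
    rule TOP { <function-name> }
    rule function-name { 'user_pref' }
}

my $line = 'user_pref("browser.startup.homepage", "https://searx.tyil.nl");';

# .parse fails, because everything after user_pref is left unmatched.
say FunctionNameOnly.parse($line).defined;   # False

# .subparse is allowed to stop early, so the rule itself can be checked.
say ~FunctionNameOnly.subparse($line)<function-name>;   # user_pref
```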

Parsing the argument list

The next part to match is the argument list, which consists of an opening bracket, a matching closing bracket and a number of arguments in between them. Let's make another rule to parse this part. It may be a bit naive for now; we will improve on this later.

rule argument-list {
    '('
    .+
    ')'
}

Of course, the TOP rule will need to be expanded to include this as well.

rule TOP {
    <function-name>
    <argument-list>
    .*
}

Running the script will yield another line, indicating that the argument-list rule matches the entire argument list.

「user_pref("browser.startup.homepage", "https://searx.tyil.nl");」
 function-name => 「user_pref」
 argument-list => 「("browser.startup.homepage", "https://searx.tyil.nl")」

Now that we know this basic rule works, we can try to improve it to be more accurate. It would be more convenient if we could get a list of arguments out of it, and not include the brackets. Removing the brackets is the easier part, so let's do that first. You can use the <( and )> markers to indicate where the result of the match should start and end respectively.

rule argument-list {
    '('
    <( .+ )>
    ')'
}

You can see that the output of the script now doesn't show the brackets on the argument-list match. Now, to make a list of the arguments, it would be easiest to create an additional rule to match a single argument, and match the , as a separator for the arguments. We can use the % operator for this.

rule argument-list {
    '('
    <( <argument>+ % ',' )>
    ')'
}

rule argument {
    .+
}

However, when you try to run this, all you'll see is a Nil as output.

Debugging a grammar

Grammars are quite a hassle to debug without any tools, so I would not recommend trying that. Instead, let's use a module that makes this much easier: Grammar::Tracer. This will show information on how the Grammar is matching all the stuff. If you use Rakudo Star, you already have this module installed. Otherwise, you may need to install it.

zef install Grammar::Tracer

Now you can use it in the script by adding use Grammar::Tracer at the top of the script, before the grammar declaration. Running the script now will yield some content before you see the Nil.

TOP
|  function-name
|  * MATCH "user_pref"
|  argument-list
|  |  argument
|  |  * MATCH "\"browser.startup.homepage\", \"https://searx.tyil.nl\");"
|  * FAIL
* FAIL

Looking at this, you can see that an argument is being matched, but it's being too greedy. It matches all characters up until the end of the line, so the argument-list can't match the closing bracket anymore. To fix this, we must update the argument rule to be less greedy. For now, we're just matching strings that appear within double quotes, so let's change the rule to more accurately match that.

rule argument {
    '"'
    <( <-["]>+? )>
    '"'
}

This rule matches a starting ", then any character that is not a ", then another ". There's also <( and )> in use again to make the surrounding " not end up in the result. If you run the script again, you will see that the argument-list contains two argument matches.

「user_pref("browser.startup.homepage", "https://searx.tyil.nl");」
 function-name => 「user_pref」
 argument-list => 「"browser.startup.homepage", "https://searx.tyil.nl"」
  argument => 「browser.startup.homepage」
  argument => 「https://searx.tyil.nl」
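
The difference between the two versions of the argument rule can also be seen outside of a Grammar. The following sketch uses plain regexes on the argument list from the example. One detail worth knowing is that rules and tokens do not backtrack (they imply the :ratchet adverb), which is why the greedy .+ made the whole match fail instead of giving characters back.

```raku
my $input = '"browser.startup.homepage", "https://searx.tyil.nl"';

# A plain regex backtracks, so .+ gives back just enough to find the LAST quote.
say ~($input ~~ / '"' <( .+ )> '"' /);
# browser.startup.homepage", "https://searx.tyil.nl

# With :ratchet, as in a rule or token, .+ keeps everything and the match fails.
say ?($input ~~ / :ratchet '"' .+ '"' /);   # False

# A negated character class stops at the first closing quote by itself.
say ~($input ~~ / '"' <( <-["]>+ )> '"' /);
# browser.startup.homepage
```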

I'm ignoring the output of Grammar::Tracer for now, since there are no problems arising. I would generally suggest just leaving it in there until you're completely satisfied with your Grammars, so you can immediately see what's going wrong where during development.

The statement's end

Now all that's left to explicitly match in the TOP rule is the statement terminator, ;. This can replace the .*, since it's the last character of the string.

rule TOP {
    <function-name>
    <argument-list>
    ';'
}

The final Grammar should look like this.

grammar UserJS {
    rule TOP {
        <function-name>
        <argument-list>
        ';'
    }

    rule function-name {
        'user_pref'
    }

    rule argument-list {
        '('
        <( <argument>+ % ',' )>
        ')'
    }

    rule argument {
        '"'
        <( <-["]>+? )>
        '"'
    }
}

Now, the problem here is that it's still quite naive. It won't deal with double quotes inside strings, nor with boolean values or integers. The current Grammar is also not capable of matching multiple lines. All of these problems can be solved, some more easily than others. Come back here tomorrow to learn how!


Parsing Firefox' user.js with Raku

Yesterday, we made a short Grammar that could parse a single line of the user.js that Firefox uses. Today, we'll be adding a number of testcases to make sure everything we want to match will match properly. Additionally, the Grammar can be expanded to match multiple lines, so we can let the Grammar parse an entire user.js file in a single call.

Adding more tests

To get started with matching other argument types, we should extend the list of test cases that are defined in MAIN. Let's add a couple to match true, false, null and integer values.

my @inputs = (
    'user_pref("browser.startup.homepage", "https://searx.tyil.nl");',
    'user_pref("extensions.screenshots.disabled", true);',
    'user_pref("browser.search.suggest.enabled", false);',
    'user_pref("i.have.no.nulls", null);',
    'user_pref("browser.startup.page", 3);',
);

I would suggest updating the for loop as well, to indicate which input it is currently trying to match. Things will fail to match, and it will be easier to see which output belongs to which input if we just print it out.

for @inputs {
    say "\nTesting $_\n";
    say UserJS.parse($_);
}

If you run the script now, you'll see that only the first test case is actually working, while the others all fail on the argument. Let's fix each of these tests, starting at the top.

Matching other types

To make it easy to match all sorts of types, let's introduce a proto regex. This will help keep everything in small, manageable blocks. Let's also rename the argument rule to constant, which will more aptly describe the things we're going to match with it. Before adding new functionality, let's see what the rewritten structure would be.

rule argument-list {
    '('
    <( <constant>+ % ',' )>
    ')'
}

proto rule constant { * }

rule constant:sym<string> {
    '"'
    <( <-["]>+? )>
    '"'
}

As you can see, I've given the constant the sym adverb named string. This makes it easy to see for us that it's about constant strings. Now we can also easily add additional constant types, such as booleans.

rule constant:sym<boolean> {
    | 'true'
    | 'false'
}

This will match both the bare words true and false. Adding just this and running the script once more will show you that the next two test cases are now working. Adding the null type is just as easy.

rule constant:sym<null> {
    'null'
}

Now all we need to pass the 5th test case is parsing numbers. In JavaScript, every number is a float, so let's stick to that for our Grammar as well. Let's accept one or more digits, optionally followed by both a dot and another set of digits. Of course, we should also allow a - or a + in front of them.

rule constant:sym<float> {
    <[+-]>? \d+ [ "." \d+ ]?
}
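
The number pattern can be checked on its own against a few candidates. This is a sketch, assuming a lexical token named float that is declared only for this test. It is a token rather than a rule, so unlike the Grammar version it does not allow whitespace between the sign and the digits.

```raku
# Same pattern as constant:sym<float>, declared as a standalone token.
my token float { <[+-]>? \d+ [ '.' \d+ ]? }

for ('3', '+3', '-1.5', '1.', 'abc') -> $candidate {
    # Anchor on both ends so the whole candidate has to be a number.
    my $ok = ?($candidate ~~ / ^ <&float> $ /);
    say "$candidate: $ok";
}
```

The first three candidates are accepted. The last two are rejected, since a dot has to be followed by at least one digit.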

Working out some edge cases

It looks like we can match all the important types now. However, there are some edge cases that are allowed but aren't going to work yet. A big one is of course a string containing a ". If we add a test case for this, we can see it failing when we run the script.

my @inputs = (
    ...
    'user_pref("double.quotes", "\"my value\"");',
);

To fix this, we need to go back to constant:sym<string>, and alter the rule to take escaped double quotes into account. Instead of looking for any character that is not a ", we can alter it to look for any character that is not directly following a \, because that would make it escaped.

rule constant:sym<string> {
    '"'
    <( .*? <!after '\\'> )>
    '"'
}
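
As an aside, there is another common way to handle escapes, which is not the approach used in this article: consume each backslash together with the character that follows it. The sketch below assumes a standalone regex stored in a hypothetical variable named $string. It also accepts a string that ends in an escaped backslash, a case where the closing quote directly follows a \ and the lookbehind would reject it.

```raku
# Alternative sketch: the body is any run of "ordinary" characters
# (neither quote nor backslash) or backslash-plus-one-character pairs.
my $string = / ^ '"' ( [ <-["\\]> | '\\' . ]* ) '"' $ /;

my @candidates =
    '"plain"',
    '"\"my value\""',
    '"ends with a backslash\\\\"',
    '"unterminated\"';

for @candidates -> $s {
    say $s ~~ $string ?? "body: $0" !! "no match: $s";
}
```

The first three candidates match, and the last one does not, because its only closing quote is escaped.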

Parsing multiple lines

Now that it seems we are able to handle all the different user_pref values that Firefox may throw at us, it's time to update the script to parse a whole file. Let's move the inputs we have right now to user.js, and update the MAIN subroutine to read that file.

sub MAIN () {
    say UserJS.parse('user.js'.IO.slurp);
}

Running the script now will print a Nil value on STDOUT, but if you still have Grammar::Tracer enabled, you'll also notice that it has no complaints. It's all green!

The problem here is that the TOP rule is currently instructed to only parse a single user_pref line, but our file contains multiple such lines. The parse method of the UserJS Grammar expects to match the entire string it is told to parse, and that's causing the Grammar to ultimately fail.

So, we'll need to alter the TOP rule to allow matching of multiple lines. The easiest way is to wrap the current contents into a group, and add a quantifier to that.

rule TOP {
    [
        <function-name>
        <argument-list>
        ';'
    ]*
}

Now it matches all lines, and correctly extracts the values of the user_pref statements again.

Any comments?

There is another edge case to cover: comments. These are allowed in the user.js file, and when looking up such files online for preset configurations, they're often making extensive use of them. In JavaScript, comments start with // and continue until the end of the line.

We'll be using a token instead of a rule for this, since that doesn't handle whitespace for us. The newline is a whitespace character, and is significant for a comment to denote its end. Additionally, the TOP rule needs some small alteration again to accept comment lines as well. To keep things readable, we should move the current contents of the matching group over to its own rule.

rule TOP {
    [
    | <user-pref>
    | <comment>
    ]*
}

token comment {
    '//'
    <( <-[\n]>* )>
    "\n"
}

rule user-pref {
    <function-name>
    <argument-list>
    ';'
}

Now you should be able to parse comments as well. It shouldn't matter whether they are on their own line, or after a user_pref statement.

Make it into an object

What good is parsing data if you can't easily play with it afterwards? So, let's make use of Grammar Actions to transform the Match objects into a list of UserPref objects. First, let's declare what the class should look like.

class UserPref {
    has $.key;
    has $.value;

    submethod Str () {
        my $value;

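        given ($!value) {
            when Str     { $value = "\"$!value\"" }
            # Bool comes before Numeric, since True and False are numeric too.
            when Bool    { $value = $!value ?? 'true' !! 'false' }
            # +$/ yields an Int or a Rat rather than a Num, so match any Numeric.
            when Numeric { $value = $!value }
            when Any     { $value = 'null' }
        }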

        sprintf('user_pref("%s", %s);', $!key, $value);
    }
}

A simple class containing a key and a value, and some logic to turn it back into a string usable in the user.js file. Next, let's create an Action class to make these objects. An Action class is like any regular class. All you need to pay attention to is to name the methods the same as the rules used in the Grammar.

class UserJSActions {
    method TOP ($/) {
        make $/<user-pref>.map({
            UserPref.new(
                key => $_<argument-list><constant>[0].made,
                value => $_<argument-list><constant>[1].made,
            )
        })
    }

    method constant:sym<boolean> ($/) {
        make (~$/ eq 'true' ?? True !! False)
    }

    method constant:sym<float> ($/) {
        make +$/
    }

    method constant:sym<null> ($/) {
        make Any
    }

    method constant:sym<string> ($/) {
        make ~$/
    }
}

The value methods convert the values as seen in the user.js to Raku types. The TOP method maps over all the user_pref statements that have been parsed, and turns each of them into a UserPref object. Now all that is left is to add the UserJSActions class as the Action class for the parse call in MAIN, and use its made value.

sub MAIN () {
    my $match = UserJS.parse('user.js'.IO.slurp, :actions(UserJSActions));

    say $match.made;
}
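
The interplay of make and made is easier to see in a much smaller setting. The sketch below is a hypothetical key=value grammar, with the names KeyValue and KeyValueActions made up for this illustration. Each action method attaches a value to its match with make, and the method for the enclosing rule picks it up again with made.

```raku
# Hypothetical mini grammar: a dotted key, an equals sign and a number.
grammar KeyValue {
    token TOP   { <key> '=' <value> }
    token key   { <[\w.]>+ }
    token value { \d+ }
}

# Action methods carry the same names as the tokens they belong to.
class KeyValueActions {
    method TOP ($/)   { make ($<key>.made => $<value>.made) }
    method key ($/)   { make ~$/ }   # keep the key as a string
    method value ($/) { make +$/ }   # turn the digits into a number
}

my $pair = KeyValue.parse('browser.startup.page=3', :actions(KeyValueActions)).made;

say $pair.key;         # browser.startup.page
say $pair.value + 1;   # 4
```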

Now we can also do things with it. For instance, we can sort all the user_pref statements alphabetically.

sub MAIN () {
    my $match = UserJS.parse('user.js'.IO.slurp, :actions(UserJSActions));
    my @prefs = $match.made;

    for @prefs.sort(*.key) {
        .Str.say
    }
}

Sorting alphabetically may be a bit boring, but you have all sorts of possibilities now, such as filtering out certain options or comments, or merging in multiple files from multiple sources.
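
Filtering and merging could look something like the following sketch. It is not tied to the parser: Pref is a hypothetical stand-in class with just a key and a value, and both lists are made-up data standing in for the results of parsing two files.

```raku
# Hypothetical stand-in for the parsed preference objects.
class Pref {
    has $.key;
    has $.value;
}

my @base =
    Pref.new(key => 'browser.startup.page', value => 3),
    Pref.new(key => 'extensions.screenshots.disabled', value => True);

my @override =
    Pref.new(key => 'browser.startup.page', value => 1),
    Pref.new(key => 'browser.search.suggest.enabled', value => False);

# Merge: index by key, so entries later in the list replace earlier ones.
my %merged = (|@base, |@override).map(-> $p { $p.key => $p });

# Filter: keep only the preferences under the browser. prefix, sorted by key.
my @browser = %merged.values.grep(*.key.starts-with('browser.')).sort(*.key);

for @browser -> $p {
    say "{$p.key} = {$p.value}";
}
```

Putting the override list last means its value for browser.startup.page wins over the one from the base list.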

I hope this has been an interesting journey into parsing a whole other programming language using Raku's extremely powerful Grammars!

The complete code

parser.pl6

class UserPref {
    has $.key;
    has $.value;

    submethod Str () {
        my $value;

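        given ($!value) {
            when Str     { $value = "\"$!value\"" }
            # Bool comes before Numeric, since True and False are numeric too.
            when Bool    { $value = $!value ?? 'true' !! 'false' }
            # +$/ yields an Int or a Rat rather than a Num, so match any Numeric.
            when Numeric { $value = $!value }
            when Any     { $value = 'null' }
        }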

        sprintf('user_pref("%s", %s);', $!key, $value);
    }
}

class UserJSActions {
    method TOP ($/) {
        make $/<user-pref>.map({
            UserPref.new(
                key => $_<argument-list><constant>[0].made,
                value => $_<argument-list><constant>[1].made,
            )
        })
    }

    method constant:sym<boolean> ($/) {
        make (~$/ eq 'true' ?? True !! False)
    }

    method constant:sym<float> ($/) {
        make +$/
    }

    method constant:sym<null> ($/) {
        make Any
    }

    method constant:sym<string> ($/) {
        make ~$/
    }
}

grammar UserJS
{
    rule TOP {
        [
            | <user-pref>
            | <comment>
        ]*
    }

    token comment {
        '//' <( <-[\n]>* )> "\n"
    }

    rule user-pref {
        <function-name>
        <argument-list>
        ';'
    }

    rule function-name {
        'user_pref'
    }

    rule argument-list {
        '('
        <( <constant>+ % ',' )>
        ')'
    }

    proto rule constant { * }

    rule constant:sym<string> {
        '"'
        <( .*? <!after '\\'> )>
        '"'
    }

    rule constant:sym<boolean> {
        | 'true'
        | 'false'
    }

    rule constant:sym<null> {
        'null'
    }

    rule constant:sym<float> {
        <[+-]>? \d+ [ "." \d+ ]?
    }
}

sub MAIN () {
    my $match = UserJS.parse('user.js'.IO.slurp, :actions(UserJSActions));
    my @prefs = $match.made;

    for @prefs.sort(*.key) {
        .Str.say
    }
}

user.js

// Comments are welcome!

user_pref("browser.startup.homepage", "https://searx.tyil.nl");
user_pref("extensions.screenshots.disabled", true); //uwu
user_pref("browser.search.suggest.enabled", false);
user_pref("i.have.no.nulls", null);
user_pref("browser.startup.page", +3);
user_pref("double.quotes", "\"my value\"");