Free Software, Free Society!
Thoughts of the FSFE Community (English)

Monday, 12 April 2021

Submit your talks now for Akademy 2021!

As you can see at https://akademy.kde.org/2021/cfp, the Call for Participation for Akademy 2021 (which will take place online from Friday 18 June to Friday 25 June) is already open.

 

You have until Sunday 2 May 2021, 23:59 UTC, to submit your proposals, but you will make our (the talks committee's) lives much easier if you start sending proposals *now* instead of sending them all at the last minute in two weeks ;)


I promise I'll buy you a $preferred_beverage$ at next year's Akademy (which we're hoping will happen in person) if you send a talk before the end of this week (and you send me a mail about it).


Saturday, 03 April 2021

Why Signature Verification in OpenPGP is hard

An Enigma cipher machine which is less secure but also easier to understand than OpenPGP.
Photo by Mauro Sbicego on Unsplash.

When I first thought about signature verification in OpenPGP I thought “well, it cannot be that hard, right?”. In the end, all you have to do is check if a signature was made by the given key and if that signature checks out (is cryptographically correct). Oh boy, was I wrong.

The first major realization that struck me was that there are more than just two factors involved in the signature verification process. While the first two are pretty obvious – the signature itself and the key that created it – another major role is played by the point in time at which a signature is verified. OpenPGP keys change over the course of their lifespan. Subkeys may expire or be revoked for different reasons. A subkey might be eligible to create valid signatures until its binding signature expires. From that point in time all new signatures created with that key must be considered invalid. Keys can be rebound, so an expired key might become valid again at some point, which would also make (new) signatures created with it valid once again.

But what does it mean to rebind a key? How are expiration dates set on keys, and what role does the reason for a revocation play?

The answer to the first two questions is – Signatures!

OpenPGP keys consist of a primary key and subkeys, user-ID packets (which contain email addresses, names and so forth) and lastly a bunch of signatures which tie all this information together. The root of this bunch is the primary key – a key with the ability to create signatures, or rather certifications. Signatures and certifications are basically the same; they just have a different semantic meaning, which is quite an important detail. More on that later.

The main use of the primary key is to bind additional data to it. First and foremost user-ID packets, which can (along with the primary key's key-ID) be used to identify the key. Alice might for example have a user-ID packet on her key which contains her name and email address. Keys can have more than one user-ID, so Alice might also have an additional user-ID packet with her work email or her chat address added to the key.

But simply adding the packet to the key is not enough. An attacker might simply take her key, change the email address and hand the modified key to Bob, right? Wrong. Certifications to the rescue!

   Primary-Key
      [Revocation Self Signature]
      [Direct Key Signature...]
      [User ID [Signature ...] ...]
      [User Attribute [Signature ...] ...]
      [[Subkey [Binding-Signature-Revocation]
              Subkey-Binding-Signature ...] ...]

Information is not just loosely appended to the key. Instead it is cryptographically bound to it with the help of a certification. Certifications are signatures which can only be created by a key that is allowed to create certifications. If you take a look at any OpenPGP (v4) key, you will most likely see that every single primary key is able to create certifications. So basically the primary key is used to certify that a piece of information belongs to the key.

The same goes for subkeys. They are also bound to the primary key with the help of a certification. Here the certification has a special type and is called a “subkey binding signature”. The concept though is mostly the same: the primary key certifies with a signature that a subkey belongs to it.

Now it slowly becomes complicated. As you can see, up to this point the binding relations have been uni-directional. The primary key claims to be the dominant part of the relation. This might however introduce the risk of an attacker using a primary key to claim ownership of a subkey which was used to make a signature over some data. It would then appear as if the attacker is also the owner of that signature. That’s the reason why a signing-capable subkey must somehow prove that it belongs to its primary key. Again, signatures to the rescue! A subkey binding signature that binds a signing capable subkey MUST contain a primary key binding signature made by the subkey over the primary key. Now the relationship is bidirectional and attacks such as the one mentioned above are mitigated.

So, about certifications – when is a key allowed to create certifications? How can we specify what capabilities a key has?

The answer is Signature… Subpackets!
Those are pieces of information added to a signature that give it more semantic meaning beyond the signature type. Examples of signature subpackets are key flags, signature/key creation/expiration times, preferred algorithms and many more. Those subpackets can reside in two areas of the signature. The unhashed area is not covered by the signature itself, so packets here can be added or removed without breaking the signature. The hashed area on the other hand gets its name from the fact that subpackets placed there are taken into account when the signature is calculated. They cannot be modified without invalidating the signature.

So the unhashed area shall only contain advisory information or subpackets which are “self-authenticating” (meaning information which is validated as a side-effect of validating the signature). An example of a self-authenticating subpacket would be the issuer key-ID subpacket, which contains the key-ID of the key that created the signature. This piece of information can be verified by checking whether the denominated key really created the signature. There is no need to cover this information with the signature itself.
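
If you want to see these subpackets for yourself, GnuPG can dump them. A quick illustration (the address is a placeholder for any key in your keyring, and the exact output differs between versions):

# Dump the packets and signature subpackets of a key with GnuPG.
# Lines starting with "hashed subpkt" live in the hashed area.
gpg --export alice@example.org | gpg --list-packets
# The same works for a detached signature over a file:
gpg --list-packets document.txt.sig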

Another really important subpacket type is the key flags subpacket. It contains a bit mask that declares what purposes a key can be used for, or rather what purposes the key is ALLOWED to be used for. Such purposes are encryption of data at rest, encryption of data in transit, signing data, certifying data, and authentication. Additionally there are key flags indicating that a key has been split by a key-splitting mechanism or that a key is being shared by more than one entity.

Each signature MUST contain a signature creation time subpacket, which states at which date and time the signature was created. Optionally a signature might contain a signature expiration time subpacket which denotes at which point in time the signature expires and becomes invalid. So far so good.

Now, those subpackets can also be placed on certifications, e.g. subkey binding signatures. If a subkey binding signature contains a key expiration time subpacket, this indicates that the subkey expires at a certain point in time. An expired subkey must not be used anymore, and signatures created by it after it has expired must be considered invalid. It gets even more complicated if you consider that a subkey binding signature might contain a key expiration time subpacket along with a signature expiration time subpacket. That can lead to funny situations. For example a subkey might have two subkey binding signatures. One simply binds the key indefinitely, while the second one has an expiration time. Here the latest binding signature takes precedence, meaning the subkey might expire at, let's say, t+3, while at t+5 the signature itself expires, meaning that the key regains validity, as now the former binding signature is “active” again.

Not yet complicated enough? Consider this: Whether or not a key is eligible to create signatures is denoted by the key flags subpacket which again is placed in a signature. So when verifying a signature, you have to consult self-signatures on the signing key to see if it carries the sign-data key flag. Furthermore you have to validate that self-signature and check if it was created by a key carrying the certify-other key flag. Now again you have to check if that signature was created by a key carrying the certify-other key flag (given it is not the same (primary) key). Whew.

Lastly there are key revocations. If a key gets lost or stolen or is simply retired, it can be revoked. It then depends on the revocation reason what impact the revocation has on past and/or future signatures. If the key was revoked using a “soft” revocation reason (the key has not been compromised), the revocation is mostly handled as if it were an expiration. Past signatures are still good, but the key must not be used anymore. If it however has a “hard” revocation reason (or no reason at all) this could mean that the key has been lost or compromised. This means that any signature (future and past) made by this key now has to be considered invalid, since an attacker might have forged it.

Now, a revocation can only be created by a certification capable key, so in order to check if a revocation is valid, we have to check if the revoking key is allowed to revoke this specific subkey. Permitted revocation keys are either the primary key, or an external key denoted in a revocation key subpacket on a self-signature. Can you see why this introduces complexity?

Revocation signatures have to be handled differently from other signatures, since if the primary key is revoked, is it eligible to create revocation signatures in the first place? What if an external revocation key has been revoked and is now used to revoke another key?

I believe the correct way to tackle signature validity is to first evaluate the key (primary and subkeys) at signature creation time. Evaluating the key at a given point in time tn means we reject all signatures made after tn (except hard revocations) as those are not yet valid. Furthermore we reject all signatures that are expired at tn as those are no longer valid. We also remove all signatures that are superseded by another, more recent signature. We do this for all signatures on all keys in the “correct” order. What we are left with is a canonicalized key ring, which we can now use to verify the signature in question.

So let's try to summarize every step we have to take in order to verify a signature's validity.

  • First we have to check if the signature contains a creation time subpacket. If it does not, we can already reject it.
  • Next we check if the signature is expired by now. If it is, we can again reject.
  • Now we have to evaluate the key ring that contains the signature's signing key at the time at which the signature was created.
    • Is the signing key properly bound to the key ring?
      • Was it created before the signature?
      • Was it bound to the key ring before the signature was made?
      • Is the binding signature not expired?
      • Is the binding signature not revoked?
      • Is the subkey binding signature carrying a valid primary key binding signature?
      • Are the binding signatures using acceptable algorithms?
    • Is the subkey itself not expired?
    • Is the primary key not expired?
    • Is the primary key not revoked?
    • Is the subkey not revoked?
  • Is the signing key capable of creating signatures?
  • Was the signature created with acceptable algorithms? Reject weak algorithms like SHA-1.
  • Is the signature correct?

Lastly of course, the user has to decide if the signing key is trustworthy or not, but luckily we can leave this decision up to the user.
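
All of this complexity hides behind a single short command in everyday use. For example with GnuPG (file names here are placeholders), the exit status reflects whether the whole chain checked out:

# Verify a detached signature; gpg exits with status 0 only if the signature is good.
if gpg --verify document.txt.sig document.txt; then
    echo "signature verified"
else
    echo "verification failed"
fi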

As you can see, this is not at all trivial and I’m sure I missed some steps and/or odd edge cases. What makes implementing this even harder is that the specification is deliberately sparse in places. What subpackets are allowed to be placed in the unhashed area? What MUST be placed in the hashed area instead? Furthermore the specification contains errors which make it even harder to get a good picture of what is allowed and what isn’t. I believe what OpenPGP needs is a document that acts as a guide to implementors. That guide needs to specify where and where not certain subpackets are to be expected, how a certain piece of semantic meaning can be represented and how signature verification is to be conducted. It is not desirable that each and every implementor has to digest the whole specification multiple times in order to understand what steps are necessary to verify a signature or to select a valid key for signature creation.

Did you implement signature verification in OpenPGP? What are your thoughts on this? Did you go through the same struggles that I did?

Lastly I want to give a shout-out to the devs of Sequoia-PGP, who have a pretty awesome test suite that covers lots and lots of edge cases and interoperability concerns of implementations. I definitely recommend everyone who needs to work with OpenPGP to throw their implementation against the suite to see if there are any shortcomings and problems with it.

Thursday, 01 April 2021

Microphone settings - how to deactivate webcam microphone

Like many people out there, in the last months I have participated in more remote video conferences than most likely in my whole life before. In those months I improved my audio and video hardware and the lighting, learned new shortcuts for those tools, and in general tried to optimise a few things.

One of the problems I encountered on this journey was the selection of the correct microphone. The webcam, which luckily I got already before the pandemic, has an integrated microphone. The sound quality is ok, but compared to the microphone in the new headset it is awful. The problem was that whenever I plugged in the webcam, its microphone would be selected as the new default. So I had to manually change the sound settings every time I plugged or unplugged the webcam.

I asked about that problem on Mastodon (automatically forwarded to this proprietary microblogging service). There were several suggestions on how to fix it. In the end I decided to use PulseAudio Volume Control, as I thought that is also a solution other people around me could easily implement. There, under the Configuration tab, you can switch the Profile for the webcam to Off, as seen in the screenshot.

Screenshot of the PulseAudio Volume Control Configuration tab with the webcam profile set to Off
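
If you prefer the command line, the same can be achieved with pactl; a small sketch (the card name below is just an example, so list your own cards first):

# List the sound cards PulseAudio knows about, then switch the webcam's
# card profile to "off" so its microphone no longer shows up as an input.
pactl list short cards
pactl set-card-profile alsa_card.usb-046d_HD_Pro_Webcam_C920-02 off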

This way I can plug in and unplug the webcam without having to switch the microphone settings every time. That saves quite some time with such a high number of video calls every day.

Thanks a lot to all who provided suggestions on how to fix this problem, and to Lennart Poettering for writing PulseAudio Volume Control and publishing it under a Free Software license. Hopefully in the future such settings for the default sound and video hardware will be included directly in the general settings of the desktop environment as well.

Wednesday, 31 March 2021

MotionPhoto / MicroVideo File Formats on Pixel Phones

  • Losca
  • 11:24, Wednesday, 31 March 2021

Google Pixel phones support what they call ”Motion Photo” which is essentially a photo with a short video clip attached to it. They are quite nice since they bring the moment alive, especially as the capturing of the video starts a small moment before the shutter button is pressed. For most viewing programs they simply show as static JPEG photos, but there is more to the files.

I’d really love proper Shotwell support for these file formats, so I posted a longish explanation with many of the details in this blog post to a ticket there too. Examples of the newer format are linked there too.

Info posted to Shotwell ticket

There are actually two different formats, an old one that is already obsolete, and a newer current format. The older ones are those that your Pixel phone recorded as ”MVIMG_[datetime].jpg", and they have the following meta-data:

Xmp.GCamera.MicroVideo                          XmpText  1  1
Xmp.GCamera.MicroVideoVersion                   XmpText  1  1
Xmp.GCamera.MicroVideoOffset                    XmpText  7  4022143
Xmp.GCamera.MicroVideoPresentationTimestampUs   XmpText  7  1331607

The offset is actually from the end of the file, so one needs to calculate accordingly. But it is exact otherwise, so one can simply extract the video with that meta-data information:

#!/bin/bash
#
# Extracts the microvideo from a MVIMG_*.jpg file

# The offset is from the ending of the file, so calculate accordingly
offset=$(exiv2 -p X "$1" | grep MicroVideoOffset | sed 's/.*\"\(.*\)"/\1/')
filesize=$(du --apparent-size --block=1 "$1" | sed 's/^\([0-9]*\).*/\1/')
extractposition=$(expr $filesize - $offset)
echo offset: $offset
echo filesize: $filesize
echo extractposition=$extractposition
dd if="$1" skip=1 bs=$extractposition of="$(basename -s .jpg "$1").mp4"

The newer format is recorded in filenames called ”PXL_[datetime].MP.jpg”, and they have a _lot_ of additional metadata:

Xmp.GCamera.MotionPhoto                                    XmpText   1  1
Xmp.GCamera.MotionPhotoVersion                             XmpText   1  1
Xmp.GCamera.MotionPhotoPresentationTimestampUs             XmpText   6  233320
Xmp.xmpNote.HasExtendedXMP                                 XmpText  32  E1F7505D2DD64EA6948D2047449F0FFA
Xmp.Container.Directory                                    XmpText   0  type="Seq"
Xmp.Container.Directory[1]                                 XmpText   0  type="Struct"
Xmp.Container.Directory[1]/Container:Item                  XmpText   0  type="Struct"
Xmp.Container.Directory[1]/Container:Item/Item:Mime        XmpText  10  image/jpeg
Xmp.Container.Directory[1]/Container:Item/Item:Semantic    XmpText   7  Primary
Xmp.Container.Directory[1]/Container:Item/Item:Length      XmpText   1  0
Xmp.Container.Directory[1]/Container:Item/Item:Padding     XmpText   1  0
Xmp.Container.Directory[2]                                 XmpText   0  type="Struct"
Xmp.Container.Directory[2]/Container:Item                  XmpText   0  type="Struct"
Xmp.Container.Directory[2]/Container:Item/Item:Mime        XmpText   9  video/mp4
Xmp.Container.Directory[2]/Container:Item/Item:Semantic    XmpText  11  MotionPhoto
Xmp.Container.Directory[2]/Container:Item/Item:Length      XmpText   7  1679555
Xmp.Container.Directory[2]/Container:Item/Item:Padding     XmpText   1  0

Sounds like fun and lots of information. However I didn't see why the “Length” in the first item is 0, and I didn't see how to use the latter Length info. But I can use the mp4 headers to extract it:

#!/bin/bash
#
# Extracts the motion part of a MotionPhoto file PXL_*.MP.jpg

# Find the byte offset of the embedded mp4 by searching for its "ftypmp42" header
extractposition=$(grep --binary --byte-offset --only-matching --text \
-P "\x00\x00\x00\x18\x66\x74\x79\x70\x6d\x70\x34\x32" "$1" | sed 's/^\([0-9]*\).*/\1/')

dd if="$1" skip=1 bs=$extractposition of="$(basename -s .jpg "$1").mp4"

UPDATE: I wrote most of this blog post earlier. When now actually getting to publishing it a week later, I see the obvious, i.e. the ”Length” is again simply the offset from the end of the file, so one could do the same less brute-force approach as for MVIMG. I'll leave the above as is however for the ❤️ of binary grepping.
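
For completeness, a minimal and untested sketch of that less brute-force variant, assuming exiv2 prints the Item:Length attributes in its raw XMP dump as shown above:

#!/bin/bash
#
# Untested sketch: extract the motion part of a PXL_*.MP.jpg by reading the
# MotionPhoto Item:Length from the XMP and cutting that many bytes from the
# end of the file, just like in the MVIMG case above.

# Take the largest Item:Length value (the primary image entry is 0).
offset=$(exiv2 -p X "$1" | grep -o 'Item:Length="[0-9]*"' | grep -o '[0-9]\+' | sort -n | tail -n 1)
filesize=$(du --apparent-size --block=1 "$1" | sed 's/^\([0-9]*\).*/\1/')
extractposition=$(expr $filesize - $offset)
dd if="$1" skip=1 bs=$extractposition of="$(basename -s .jpg "$1").mp4"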

(cross-posted to my other blog)

What's in a name?

Often when people conceptualise transgender people, there is a misery inherent to our identity. There is the everyday discrimination, gender dysphoria, the arduous road of transition, and the sort of identity crisis that occurs when we let go of one name and choose another. And while being transgender certainly can be all of that, it’s a pity that the joyous aspects are often forgotten, or disappear behind all the negative clouds that more desperately require society’s attention. For this International Transgender Day of Visibility, I want to talk about those happy things. I want to talk about names, introspection, and the mutability of people.

Sometimes I mention to people that I have chosen my own name. In response, people often look at me as though I had just uttered an impossibility. What? Why would you? How would you do that? Can you even do that? I find this a little funny because it’s the most normal thing for me, but a never-even-thought-about-that for the vast majority of people. Such an unlikely thing, in fact, that some people consequently ask me for my real name. A self-selected name cannot be a real name, right?

But I want to make a case for changing one’s name, even if you’re not trans.

Scrooge

Imagine Scrooge from A Christmas Carol. In your mind’s eye, you will likely picture him as he was introduced to you: “a squeezing, wrenching, grasping, scraping, clutching, covetous, old sinner! Hard and sharp as flint, from which no steel had ever struck out generous fire; secret, and self-contained, and solitary as an oyster.”

But that Scrooge—the Scrooge which everyone knows—is the man from the beginning of the novel. By the end of the novel, Scrooge has changed into a compassionate man who wishes joy to humanity and donates his money to the poor. And even though everyone knows the story, almost no one thinks “good” when they hear the name Scrooge.

I don’t really know why people can’t let go of the image of Scrooge. Maybe his awfulness was so strong that one can’t just forget about it. Maybe we haven’t spent enough time with the new Scrooge yet. Maybe his name simply sounds essentially hideous, and whatever he did to deserve his reputation doesn’t even matter.

Or maybe his name has become akin to a meme for stingy people, similar to today’s “Chad” for handsome, popular, probably not-so-smart men, or “Karen” for vile, selfish, bourgeois women (although that meme quickly devolved into a sexist term meaning “any woman I don’t like”, but whatever).

All names are memes

Ultimately, all names are memes. They evoke diverse feelings from their related clichés. The clichés can relate to country, region, language, age, gender, class, or any other arbitrary thing, and any combination thereof.

The fact that names are memes likely isn’t a great thing—all the above assumptions can be harmful in their own ways—but it’s also probably unavoidable. If most people with the name Thijs are Dutch men, then it follows that people are going to take note of this. And sometimes, someone becomes famous enough that they become the prime instance of their name: it’s difficult to imagine a Napoléon who isn’t the 19th-century leader of France.

More interestingly, though, we ourselves memeify our names through existence. If you’re the only person with a certain name inside of a community, then all members of that community will subconsciously base their associations with that name on you. You effectively become the prime instance of that name within your community.

Memes don’t change; people do

A trait of memes—certainly popular ones—is that they are incredibly well-polished. Like a Platonic Form, a meme embodies the essence of something very specific at the intersection of all of its instances. Because of this, memes very rarely change in their meanings. So what to do when an instance of the meme changes?

Although the journeys of trans people all wildly vary, I’ve yet to meet a trans person whose journey did not include an almost unbearable amount of introspection. A deep investigation of the self, not just as who you are, but who you want to be. And inevitably, you end up asking yourself this question: “If you could be anybody at all, who would you want to be?”

Invariably the answer is something along the lines of “myself, but different”. For trans people, this involves a change in gender presentation, and society mandates that people who present a certain gender should have a name that reflects that. So we choose our own name. With any luck, we choose a cool name.

But what if we extended that line of thinking? Going through a period of introspection and coming out of it a different person is not something that is entirely unique to trans people. We ultimately only get to occupy a single body in this life, so we might as well make that person resemble the kind of person we would really fancy being. So why not change your name? Get rid of the old meme, and start a new one.

Now, understandably, we cannot change into anybody at all. There are limits to our mutability, although those limits are often broader than our imagination. Furthermore, we may never become the perfect person we envision when we close our eyes. And that’s okay, but we can get a little closer. And for me, the awareness of the mutability of names excites the awareness of the mutability—and consequently the potential for improvement—of people.

Happy International Transgender Day of Visibility.

Saturday, 13 March 2021

KDE Gear 21.04 releases branches created

 Make sure you commit anything you want to end up in the 21.04 releases to them

We're already past the dependency freeze.

The Feature Freeze and Beta is this Thursday, 18 March.

More interesting dates
   April  8: 21.04 RC (21.03.90) Tagging and Release
   April 15: 21.04 Tagging
   April 22: 21.04 Release

https://community.kde.org/Schedules/KDE_Gear_21.04_Schedule

Okular: Should continuous view be an okular setting or a document setting?

In Okular:

 

Some settings are okular-wide: if you change them, they will be changed in all future okular instances. An easy example is changing the shortcut for saving from Ctrl+S to Ctrl+Shift+E.

 

Some other settings are document-specific, for example zoom: if you change the zoom of a document, it will only be restored when opening the same document again, not if you open a different one. There's also a "default zoom value for documents you've never opened before" in the settings.


Some other settings like "Continuous View" are a bit of a mess and are both. "Continuous View" wants to be a global setting (i.e. so that if you hate continuous view you always get a non-continuous view) but it is also restored to the state it had when you closed the document you're just opening.


That's understandably a bit confusing for users :D


My suggestion for continuous view would be to make it work like zoom: be purely document-specific, but also have a default option in the settings dialog for people that hate continuous view.

 

I'm guessing this should cover all our bases?

 

Opinions? Anything I may have missed?

Thursday, 04 March 2021

Is okular-devel mailing list the correct way to reach the Okular developers? If not what do we use?

After my recent failure to gain traction in getting people to join a potential Okular Virtual Sprint, I wondered: is the okular-devel mailing list representative of the current Okular contributors?

 

Looking at the sheer number of subscribers, one would think it probably is. There are currently 128 people subscribed to the okular-devel mailing list, and we definitely don't have that many contributors, so it would seem the mailing list is a good place to reach all the contributors. But let's look at the actual numbers.

 

Okular git repo has had 46 people contributing code[*] in the last year.


Only 17% of those are subscribed to the okular-devel mailing list.


If we count commits instead of committers, the number rises to 65%, but that's just because I account for more than 50% of the commits; if you remove me from the equation the number drops to 28%.


If we don't count people that only committed once (thinking that they may not be really interested in the project), the numbers are still only 25% of committers and 30% of commits (ignoring me again) subscribed to the mailing list.


So it would seem that the answer is leaning towards "no, I can't use okular-devel to contact the Okular developers".


But if not the mailing list, what am I supposed to use? I don't see any other method that would be better.


Suggestions welcome!



[*] Yes, I'm limiting contributors to git committers at this point; it's the only thing I can easily count. I understand there are more contributions than code contributions.

Saturday, 20 February 2021

How to build your own dyndns with PowerDNS

I upgraded my home internet connection and as a result I had to give up my ~15y static IP. Having an ephemeral dynamic IP means I need to use a dynamic DNS service to access my home PC. Although the ISP's CPE (router) supports a few public dynamic DNS services, I chose to create a simple solution on my own self-hosted DNS infra.

There are a couple of ways to do that. PowerDNS supports Dynamic Updates, but I do not want to open PowerDNS to the internet for this kind of operation. I just want to use cron with a simple curl over https.

PowerDNS WebAPI

To enable and use the Built-in Webserver and HTTP API we need to update our configuration:

/etc/pdns/pdns.conf

api-key=0123456789ABCDEF
api=yes

and restart the PowerDNS authoritative server.
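
On a systemd-based distribution that is usually something like this (the unit name may differ per distribution):

# Restart the PowerDNS authoritative server so the API settings take effect
sudo systemctl restart pdns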

verify it

ss -tnl 'sport = :8081'
State   Recv-Q  Send-Q  Local Address:Port  Peer Address:Port
LISTEN      0       10      127.0.0.1:8081             *:*

WebServer API in PHP

Next, we build our API in PHP.

Basic Auth

Using https means that the transport layer is encrypted, so we only need to create a basic auth mechanism.

<?php
  if ( !isset($_SERVER["PHP_AUTH_USER"]) ) {
      header("WWW-Authenticate: Basic realm='My Realm'");
      header("HTTP/1.0 401 Unauthorized");
      echo "Restricted area: Only Authorized Personnel Are Allowed to Enter This Area";
      exit;
  } else {
    // code goes here
  }
?>

By sending Basic Auth headers, the $_SERVER PHP array will contain two extra variables:

$_SERVER["PHP_AUTH_USER"]
$_SERVER["PHP_AUTH_PW"]

We do not need to set up an external IDM/LDAP or any other user management system just for this use case (single user access),

and we can use something like:

<?php
  if (($_SERVER["PHP_AUTH_USER"] == "username") && ($_SERVER["PHP_AUTH_PW"] == "very_secret_password")){
    // code goes here
  }
?>

RRSet Object

We need to create the RRSet Object

Here is a simple example:

<?php
  $comments = array(
  );

  $record = array(
      array(
          "disabled"  => False,
          "content"   => $_SERVER["REMOTE_ADDR"]
      )
  );

  $rrsets = array(
      array(
          "name"          => "dyndns.example.org.",
          "type"          => "A",
          "ttl"           => 60,
          "changetype"    => "REPLACE",
          "records"       => $record,
          "comments"      => $comments
      )
  );

  $data = array (
      "rrsets" => $rrsets
  );

?>

Running this data set through json_encode should return something like this:

{
  "rrsets": [
    {
      "changetype": "REPLACE",
      "comments": [],
      "name": "dyndns.example.org.",
      "records": [
        {
          "content": "1.2.3.4",
          "disabled": false
        }
      ],
      "ttl": 60,
      "type": "A"
    }
  ]
}

Be sure to verify that records, comments and rrsets are also arrays!

Stream Context

The next thing is to create our stream context:

$API_TOKEN = "0123456789ABCDEF";
$URL = "http://127.0.0.1:8081/api/v1/servers/localhost/zones/example.org";

$stream_options = array(
    "http" => array(
        "method"    => "PATCH",
        "header"    => "Content-type: application/json \r\n" .
                        "X-API-Key: $API_TOKEN",
        "content"   => json_encode($data),
        "timeout"   => 3
    )
);

$context = stream_context_create($stream_options);

Be aware of the " \r\n" . in the header field; this took me more time than it should have! To have multiple header fields in the http stream, you need (I don't know why) to separate them with a carriage return/line feed.
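
Before wiring this into PHP, the PATCH request itself can be tested directly with curl, using the example API key, zone and record from above:

# Send the REPLACE rrset straight to the PowerDNS HTTP API
# (values are the examples used throughout this post).
curl -s -X PATCH \
  -H 'X-API-Key: 0123456789ABCDEF' \
  -H 'Content-Type: application/json' \
  -d '{"rrsets":[{"name":"dyndns.example.org.","type":"A","ttl":60,"changetype":"REPLACE","records":[{"content":"1.2.3.4","disabled":false}]}]}' \
  http://127.0.0.1:8081/api/v1/servers/localhost/zones/example.org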

Get Zone details

Before continuing, let's make a small script to verify that we can successfully talk to the PowerDNS HTTP API with PHP:

<?php
  $API_TOKEN = "0123456789ABCDEF";
  $URL = "http://127.0.0.1:8081/api/v1/servers/localhost/zones/example.org";

  $stream_options = array(
      "http" => array(
          "method"    => "GET",
          "header"    => "Content-type: application/jsonrn".
                          "X-API-Key: $API_TOKEN"
      )
  );

  $context = stream_context_create($stream_options);

  echo file_get_contents($URL, false, $context);
?>

by running this:

php get.php | jq .

we should get the records of our zone in json format.
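
The same check can be done straight from the shell, with the API key from the configuration above:

# Fetch the zone as JSON directly from the PowerDNS HTTP API
curl -s -H 'X-API-Key: 0123456789ABCDEF' \
  http://127.0.0.1:8081/api/v1/servers/localhost/zones/example.org | jq .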

Cron Entry

You should be able to put the entire codebase together by now, so let's work on the last component of our self-hosted dynamic DNS server: how to update our record via curl.

curl -sL https://username:very_secret_password@example.org/dyndns.php

Running it every minute should do the trick:

# dyndns
* * * * * curl -sL https://username:very_secret_password@example.org/dyndns.php

That’s it !

Tag(s): php, curl, dyndns, PowerDNS

Panel - Free Software development for the public administration

On 17 February I participated in a panel discussion about opportunities, hurdles, and incentives for Free Software in the public administration. The panel was part of the event "digital state online", focusing on topics like digital administration, digital society and digital sovereignty. Patron of the event is the German Federal Chancellor's Office. Here is a quick summary of the points I made and some quick thoughts about the discussion.

The "Behördenspiegel" meanwhile published the recordings of the discussion moderated by Benjamin Stiebel (Editor, Behörden Spiegel) with Dr. Hartmut Schubert (State Secretary, Thuringian Ministry of Finance), Reiner Bamberger (developer at Higher Administrative Court of Rhineland-Palatinate), Dr. Matthias Stürmer (University of Bern), and myself (as always you can use youtube-dl).

We were asked to make a short 2 minutes statement at the beginning in which I focused on three theses:

  • Due to digitalization, the public administration's actions are less transparent than they used to be. We need to balance this.
  • The sharing and re-use of software in public administrations together with small and medium-sized enterprises must be better promoted.
  • Free Software (also called Open Source) in administration is important in order to maintain control over state action and Free Software is an important building block of a technical distribution of powers in a democracy in the 21st century.

Furthermore, the "Public Money? Public Code!" video was played:

In the discussion, we also talked about why there is not yet more Free Software in public administrations. State Secretary Dr. Schubert's point was that he is not aware of legal barriers and that the main problem is in the implementation phase (as it is for policies most of the time). I still mentioned a few hurdles here:

  • Buying licences is still sometimes easier than buying services and some public administrations have budgets for licences which cannot be converted to buy services. This should be more flexible; and it was good to hear from State Secretary Dr. Schubert that they changed this in Thuringia.
  • Some proprietary vendors can simply be procured from framework contracts. For Free Software that option is often missing. It could be helpful if other governments follow the example of France and provide a framework contract which makes it easy to procure Free Software solutions - including from smaller and medium-sized companies.

One aspect I noticed in the discussion and the questions we received in the chat: sometimes Free Software is presented in a way that suggests that, in order to use it, a public administration would have to look at code repositories and click through online forums to find out the specifics of the software. Of course, they could do that, but they could also - as they do for proprietary software, and as is more common if you do not have in-house contributors - simply write in an invitation to tender that a solution must, for example, be data-protection compliant, and that you want the rights to use, study, share, and improve the software for every purpose. So as a public administration you do not have to do such research yourself, as you would for your private hobby project; you can - and depending on the level of in-house expertise often really should - involve external professional support to implement a Free Software solution. This can be support from other public administrations or from companies providing Free Software solutions (on purpose I am not writing "Free Software companies" here; for further details see the previous article "There is no Free Software company - But!").

We also still need more statements by government officials, politicians, and other decision makers on why Free Software is important, like the recent resolution on the use of Free Software at the conservative CDU's party convention in Germany, or the statement by the German Chancellor Merkel about Free Software below. This is important so that people in the public administration who want to move to more Free Software can better justify and defend their actions. In order to increase the speed towards more digital sovereignty, decision makers need to reverse the situation: it should change from "nobody gets fired for buying Microsoft" to "nobody gets fired for procuring Free Software".

I also pleaded for a different error culture in the public administration. Experimentation clauses would allow innovative approaches to be tested without every piece of bad feedback immediately suggesting that a project has to be stopped. We should think about how to incentivize the sharing and reuse of Free Software. For example, if public administrations document good solutions and support others in benefiting from those solutions as well, could they get a budget bonus for future projects? Could we provide smaller budgets which can be used more flexibly to experiment with Free Software, e.g. by providing small payments to Free Software offers even if they do not yet meet all the criteria to be used productively for the tasks envisioned?

One point we also briefly talked about was centralization vs decentralization. We have to be careful that "IT consolidation" efforts do not lead to a situation of more monopolies and more centralization of powers. For Germany, I argued that the bundling of IT services and expertise in some authorities should not go so far that federal states like Thuringia or other levels and parts of government lose their sovereignty and become dependent on a service centre controlled by the federal government or another part of the state. Free Software provides the advantage that for example the federal state of Bavaria can offer a software solution for other federal states. But if they abuse their power over this technology, other federal states like Thuringia could decide to host the Free Software solution themselves, and contract a company to make modifications, so they can have it their way. The same applies for other mechanisms for distribution of power like the separation between a legislature, an executive, and a judiciary. All of them have to make sure their sovereignty is not impacted by technology - neither by companies (as more often discussed) nor by other branches of government. For a democracy in the 21st century such a technological distribution of power is crucial.

PS: In case you read German, Heise published an article titled "The public administration's dependency on Microsoft & Co is 'gigantic'" (in German) about several of the points from the discussion. And if you do not know it yet, have a look at the expert brochure to modernise public digital infrastructure with public code, currently available in English, German, Czech, and Brazilian Portuguese.

Sunday, 14 February 2021

My history with free software – a story told on #ilovefs day

In October 2019, I went to Linuxhotel in Essen, as I had been invited to attend that year’s General Assembly in the FSFE as a prospective member. I had a very enjoyable weekend, where I met new people and renewed older acquaintances, and it was confirmed to me what good and idealistic people are behind that important part of the European free software movement.

On the photo you see Momo, the character from Michael Ende’s eponymous novel – a statue which I was delighted to see, given that “Momo”  has been one of my favorite children’s novels for decades.

I first met the concept of free software at the university, as a student of physics and computer science in the early nineties. As students, we had to work on  the old proprietary SunOS and HP-UX systems; we had to use the Emacs editor and GNU C compiler (thanks, Richard Stallman and GNU team!) as well as the LaTeX text processing system (thanks, Donald Knuth and Leslie Lamport!)

Some of my fellow students were pretty interested in the concepts of free software and the struggle against software patents, but not me – to be honest, at the time I was not interested in software or computing at all. Computer science to me was mainly algorithmics and fundamental concepts, invariants and termination functions (thanks, Grete Hermann!) as well as Turing machines, formal languages and the halting theorem (thanks, Alan Turing and Noam Chomsky!). The fact that computer programs could be written and run was, I thought, mainly a not very interesting byproduct of these intellectual pursuits. In my spare time I was interested in physics as well as in topics more to the “humanities” side – I spent a lot of afternoons studying Sanskrit and Latin and, for a time, even biblical Hebrew, and read Goethe’s Faust and Rabelais’ Gargantua and Pantagruel in the original languages. My main, overarching interests in those years were art in the widest sense, epistemology (specifically, the epistemology of physics) and the history of religion. I also read a lot of comic books and science fiction novels.

After leaving the university, however, I got a job as a developer and worked mainly with internal software at a huge corporation and with proprietary software at a major vendor to the newspaper industry. It was at that time, in 2005, that I once again stumbled over the concept of free software – as explained by Richard Stallman – and started using GNU/Linux at home (Ubuntu, thanks, Mark Shuttleworth, Ian Murdoch, Linus Torvalds, Ingo Molnar and everybody else involved in creating Ubuntu and its building blocks Linux and Debian!)

I suddenly realized as someone that had become interested in software and its potential impact on society, that Stallman’s analysis of the situation is correct: If we want to build society’s infrastructure on software – and that seems to be happening – and we still want a free society, software must be free in the FSF sense – users, be they individuals, public authorities or corporations, must have the four freedoms. If this is not the case, all of these users will be at the mercy of the software vendors, and this lack of transparency in the public infrastructure may, in turn, be used by governments and corporations to oppress users – which has happened time and time again.

Free software enables users (once again – be they governments, companies or actual people) to protect themselves against this kind of abuse and gives them the freedom to understand and participate in the public infrastructure. By allowing changing and redistributing software products it also reverses the power relations, giving the users the power they should have and ensures that vendors can no longer exploit monopolies as their private money printing machines (no thanks, Microsoft, for that lesson !)

After discovering the concept of free software and the fact that I could practically use free software only in my daily existence, I started  blogging and communicating about it – a lot.

In 2009, a friend of mine started the “Ubuntu Community Day”, an initiative to further the use of Ubuntu in Aarhus, Denmark, and give back to the community that created this OS.

As such, in 2010 we helped co-found Open Space Aarhus, along with a group of hardware hackers. After some years, this group petered out, and I had discovered the FSFE and become a Fellow (that which now is called Supporter). As such, I was more interested in addressing real political work for free software than in Ubuntu advocacy (as good a path as this is for getting people acquainted with free software), and in 2012 I started an FSFE Local Group in Aarhus, with regular meetings in the hacker space. This group existed until 2015, where we co-organized that year’s LibreOffice conference (thanks to Florian Effenberger and Leif Lodahl and everyone else involved!) but ended up stopping the meetings, as I had become busy with other, non-software related things and none of the other members were ready to assume the responsibility of running it.

As mentioned, when I discovered free software, I was in a job writing proprietary software. While I could live with that as long as the salary was good and I was treated well, making a living from free software had become not only a dream; it now seemed to be the best and possibly the only ethically defensible way of working with software.

It also became clear to me that while we may admire volunteer-driven and community-driven free software projects, these are not enough if software freedom is to become the accepted industry standard for all software, as is after all the goal. Some software may be fun to write, but writing and maintaining domain-specific software for governments and large corporations to use in mission-critical scenarios is not fun – it is work, and people will not and cannot do it in their spare time. We need actual companies supplying free software only, and we need many of them.

After some turbulence in my employment, including a period of unemployment in the wake of the financial crisis in 2009, in 2011 I joined my current employer, Magenta ApS. This is the largest Scandinavian company producing only free software – a software vendor  that has never delivered any product to any customer under a proprietary license and has no plans to do this either, ever. With 40 employees, we are currently selling products to local governments and similar organizations that used to be the sole province of huge corporations with shady ethical practices – and I’m proud to say that this means that in our daily jobs, we’re actually changing things to the benefit of these organizations, and of Danish society at large. (Thanks to Morten Kjærsgaard for founding the company and being its motor and main driving force for all these years!)

And in the FSFE General Assembly  of 2020, I was confirmed as a permanent member of the GA. I’d like to thank the founders of the FSFE and the community around it (too many to list all the names, so y’all are included!) for this confidence and hope to be able to continue contributing to free software, a cause I discovered 15 years ago,  for at least the next 15 years as well.

Shelter - take a break from work

On today's "I love Free Software Day" I would like to thank "PeterCXY" and others who contributed to shelter.

Until recently I used two separate phones: one for the FSFE and one private one. The reason is that I prefer to have the ability to switch off the work phone when I do not want to be available for work but focus on my private life. After the FSFE phone had not received security updates for a long time, I was facing a decision: should I get a new phone for work -- but waste resources on the hardware -- or should I continue to use the old one with the known security issues?

Thanks to Torsten Grote, the FSFE volunteer who started the FSFE's Free Your Android campaign, I was made aware of another option: use Shelter, which leverages the work profile feature in Android. With this solution I have the latest security updates from my private phone and can easily switch off the work applications.

Screenshot of apps in work profile

You just clone installed apps into the work profile. If you do not also use them privately, remove them from your personal profile afterwards. Once that is set up, you can disable notifications from all those apps by pausing the work profile.

Screenshot of paused work profile

This is just one of the use cases of shelter, you can also use it to

  1. Isolate apps which you do not trust (e.g. if you are forced to use some proprietary apps on your phone) so they cannot access your data / files outside the profile

  2. Disable "background-heavy, tracking-heavy or seldom-used apps when you don't need them."

  3. Clone apps to use two accounts on one device. Something many people have asked about for messenger apps which do not allow setting up more than one account (like most Matrix or XMPP clients) in one instance of the app.

If you want to read more about it and speak German, I can recommend the Shelter tutorial by Moritz Tremmel; unfortunately I have not found anything comparable in English yet.

So a big thank you to PeterCXY and others contributing to the Free Software app shelter! Please keep up your work for software freedom!

Shelter logo

Beside that thanks to countless other Free Software contributors who work on other components of that setup:

  • CalyxOS: for providing an operating system which you can also recommend to non-tech-savvy people without being their point of support afterwards.
  • LineageOS: for providing builds to liberate your phones on many devices.
  • Replicant: for working hard to remove and replace proprietary components from Android phones.
  • F-Droid: for making it easy to install shelter as well as many other apps on liberated phones.
  • OpenMoko: for doing the pioneering work for Free Software on mobile phones.
  • Librem phone, Pine Phone, Ubuntu Phone, and others who are working on non-android Free Software solutions for phones.
  • Finally: Torsten Grote and many other FSFE volunteers who helped people to liberate their phones with the FSFE's Free Your Android project.

Friday, 12 February 2021

Destination status quo

I recently happened upon an article1 that argued against the four freedoms as defined by the Free Software Foundation. I don’t actually want to link to the article—its tone is rather rude and unsavoury, and I do not want to end up in a kerfuffle—but I’ll include an obfuscated link at the end of the article for the sake of integrity.

The article—in spite of how much I disagree with its conclusions—inspired me to reflect on idealism and the inadequacy of things. Those are the things I want to write about in this article.

So instead of refuting all the points with arguments and counter-arguments, my article is going to work a little differently. I’m going to concede a lot of points and truths to the author. I’m also going to assume that they are ultimately wrong, even though I won’t make any arguments to the contrary. That’s simply not what I want to do in this article, and smarter people than I have already made a great case for the four freedoms. Rather, I want to follow the author’s arguments to where they lead, or to where they do not.

The four freedoms

The four freedoms of free software are four conditions that a program must meet before it can be considered free. They are—roughly—the freedoms to (1.) use, (2.) study, (3.) share, and (4.) improve the program. The assertion is that if any of these conditions is not met, the user is meaningfully and helplessly restricted in how they can exercise their personal liberties.

The aforementioned article views this a little differently, however. Specifically, I found its retorts on the first and second freedoms interesting.

The first freedom

The first freedom—in full—is “the freedom to run the program as you wish, for any purpose”. The retort goes a little like this, using generous paraphrasing:

The freedom to use the program for any purpose is a meaningless freedom. It is the programmer—and by extension, the program—that determines the parameters of the program’s purpose. If it is the program’s purpose to display image files, then try as you might, but it’s not going to send any e-mails. Furthermore, the “free” program might even contain purposes that you find undesirable whereas a “non-free” program might not.

There’s one very interesting thing about this retort. You see, the author doesn’t actually make any factual errors on the face of it. An image viewer cannot send e-mails, and some non-free programs exhibit less undesirable behaviours than their free counterparts. These things are true, but truths is all they are. Something is missing…

The second freedom

In the article, the author points towards one free program that they consider harmful or malicious, accusing it of containing a lot of anti-features. Furthermore, they emphasise the uselessness of the second freedom to study and change the program. Paraphrased:

Even supposing that you have the uncommon ability to read and write source code, this program is so thoroughly complex that you can’t meaningfully study or alter it. And supposing that you do manage to adjust the program, you would have to keep pace with upstream’s updates, which is such a labour-intensive, nigh-insurmountable task that you might end up wondering what use this freedom is to you if it’s practically impossible to make any use of it.

The author goes on to add that there exist other, better programs that achieve approximately the same thing as this malicious free program, but without the anti-features. To the author, the fact that the malicious program is free is a useless distinction—it’s malicious, after all—and they could not care one way or another whether the better program is free or not. They care that it’s better. What use is freedom if you have to suffer worse programs?

And, you know, the author is right—again. All the logic follows. But this is all very matter-of-fact. It’s stating the obvious. It’s repeating the way that things are, and concluding that that’s how it is. Which it is. And there may well be a lot to this state of affairs. But, well, that’s all it is. There’s nothing more to it than that which is, in fact, the case.

Am I losing you yet?

An intermission: the future is behind us

In some languages, when people use gestures to aid in speech, they gesture ahead of them to signal the past, and behind them to signal the future. If you’re like me, this may seem absurd. Surely the past is behind us and the future ahead of us. We move forward through time, don’t we? Ever onward.

But then you look at the English language, and it starts to make a modicum of sense. We use the words “before” and “behind/after” both in spatial and temporal contexts. If we imagine a straight narrow hallway containing me and a cat, and I am looking towards the cat, then we would say that the cat is standing before me. If we also imagine a straight line of time, and the cat jumps first, and I jump second, then we would also say that the cat jumped before I did.

/img/catspacetime.png

The above graphic should make this idea a little clearer. In both the perspectives, I am looking at the before. As a matter of fact, in the temporal perspective, it’s the only perceivable direction. If the cat turns around in the hallway, it can see me. If the cat turns around in the timeline, it can see nothing—just the great uncertainty of the future that has not yet happened. It needs to wait until I jump in order to perceive it, by which time it’s looking towards the past—the before—again.

The future is necessarily behind us, after us.

Staring at one’s feet

Let’s create an analogy and stretch it beyond its limits. If we place the author on the aforementioned timeline, then I’m going to assert that the author is neither looking ahead towards the past to learn from history, nor turns their head to look behind to the future to imagine what it could look like. Rather, they are looking down, staring at their feet on the exact spot in time which they occupy.

There’s a Machiavellian immediacy to the author’s arguments—faced with a certain set of circumstances, it’s the author’s responsibility to choose what’s best at this moment in time. But the immediacy is also immensely short-sighted. The article contains no evaluation of the past—no lessons drawn from the past abuses of non-free software—and the article neither contains a coherent vision of the future. If not free software, then what?

Better software, the author responds.

A stiff neck

“If I had asked people what they wanted, they would have said faster horses” is a quote misattributed to Henry Ford of automobile fame. It’s often used to emphasise that people don’t actually know what they want, but I like it more as a metaphor for the blindness towards the future and an inability to imagine. People don’t know what could be, so when prompted about a better future, their response is effectively “like the status quo, but better”.

This sentiment is echoed in Mark Fisher’s Capitalist Realism: Is There No Alternative? (2009). The book talks about “the widespread sense that not only is capitalism the only viable political and economic system, but also that it is now impossible even to imagine a coherent alternative to it”. “It is easier to imagine an end to the world than an end to capitalism”, Fisher attributes to Fredric Jameson and Slavoj Žižek.

In this sense, I think that horses and capitalism are analogous to non-free software. There exists time before all these things, and there exists a time after them. But because our backs are turned to the future, we can’t see it—we must imagine it.

This strikes at the heart of why the author inspired me to write this article. To me, the author demonstrates a rigid, stubborn incapability of turning their head and imagining a future that isn’t just the present, but better. The author tackles an ideology—a way of imagining a different, better future—without ever lifting their head from staring at their feet.

And that’s fascinating.

Painting in the mind’s eye

Now, suppose I could visit the author and (gently!) turn their head to face the future. Without intending to come across as insulting, I’m not entirely certain that the author would see anything other than a void. Of course, seeing a void is entirely reasonable—when we turn our eyes to the future, we’re trying to see something that does not exist.

Looking at the future, therefore, is an exercise in creativity. The void is a potentially blank canvas on which one can paint. And like painting, it’s easiest to paint something you’re currently looking at, harder to paint something from memory, and hardest to paint something you’ve never before seen. And frankly, it’s really uncomfortable to twist your neck to look behind you.

But sometimes, even simply lifting one’s head feels like a struggle. Because the past isn’t any different from the future in one important aspect—it is not immediately perceptible. We can look at the present by opening our eyes, but we need either artefacts or imagination to paint the past. The fewer artefacts or memories we have, the harder it becomes to perceive the past.

La langue anglaise

If you’re reading this, chances are you’re invested in software freedom, and you don’t exactly struggle to see the common vision of a future with free software. But I want to nonetheless try to demonstrate how difficult it is to see something that does not yet exist, and how difficult it is to remember that the past exists when considering a change to the status quo. Furthermore, I want to demonstrate why someone might be hostile to our painting of the future even if they were able to see it.

This article is written in English. English, of course, is the common international language. Everybody—or at least everybody with an interest in interacting with the globe—speaks it. Surely. And what a boon this language is to the world. We can all talk to each other and understand each other, and the language is rich and flexible and– why are you looking at the Wikipedia page on English orthography? Why are you looking at the Wikipedia page on the British Empire?

You see, the English language isn’t exactly great, and comes with a lot of disadvantages. Its spelling is atrocious, the vocabulary is bigger than it has any right to be, it unfairly gives native speakers an advantage, it unfairly gives countries that use English as an official language extra cultural influence, and some people might deservedly have some opinions on using the language of their oppressors.

Now, suppose we could imagine any future at all. Would we like to keep English as the common tongue? Well, there surely are some disadvantages to doing this. All of modern engineering—modern life—is built on top of English, so we’d have to convert all of that. It would also inevitably make art and resources from this time period less accessible. And, you know, people would have to learn a different language. Who has time for that? We’ve already settled on English, so we might as well ride it out.

These are all arguments from the status quo, however. If we equate English to non-free software, and a better auxiliary language to free software, then these arguments are the equivalent of saying that you really just want to do X, and the non-free program is simply better at doing X. Besides, you’re already using this non-free program, and it would be a hassle to switch. This line of thought is incapable of imagining a better future, and dismissive of morality. The morality of the matter was never even addressed (although I realise that I am writing my own strawman here!).

Furthermore, the arguments were entirely dismissive of the past. I take great joy in the knowledge that English is today’s lingua franca, meaning “French language” in Latin. Latin and French were, of course, the common tongues in their respective time periods. The time of the French language wasn’t even that long ago—barely over a century ago! So we’ve obviously switched languages at least twice before, and the world hasn’t at all ended, but it still seems so unthinkable to imagine a future without English.

It is easier to imagine an end to the world than an end to the English language.

Solidarity

Here’s the frustrating thing—you don’t even need to be able to see the future nor participate in its creation to be sympathetic to the idea that change might be desirable, or to acknowledge that a problem exists. It is entirely feasible to say that non-free software isn’t a great status quo even if you still depend on it. The least that can be asked of you is to not stand in the way of progress, and the most that can be asked is that you participate in bringing about change, but none of these are necessary for a simple act of solidarity.

So too it goes with many other things. There are a great many undesirable status quos in the world, and I don’t have the energy or capacity to look over my shoulder to imagine potential better futures for all of them, and I most certainly don’t have the capacity to participate in effecting change for all of them, but I do my bit where I can. And more importantly, I try not to get in the way.

If the wheels of time err ever on the side of progress, then one day we’ll live in a post-proprietary world. And if the wheels keep churning as they do, the people of the future will see the free software advocates of today as regressive thinkers that were at least moderately better than what came before, but worse than what came after.

In the meantime, I’m still not sure what to do about people who are staring at their feet, but at least I slightly understand where they’re coming from and where they’re headed—destination status quo.


  1. <https colon slash slash digdeeper dot neocities dot org slash ghost slash freetardism dot html> ↩︎

Friday, 05 February 2021

Using Fedora Silverblue for development

I recently switched to Fedora Silverblue for my development machine. I want to document approximately how I do this (and why it’s awesome!).

Fedora Silverblue

This article is not an introduction to Fedora Silverblue, but a short summary is in order: Fedora Silverblue is an immutable operating system that upgrades atomically. Effectively, the root filesystem is mounted read-only, with the exception of /var, /home, and /etc. The system is upgraded by mounting a new read-only snapshot as the root filesystem.

There are three methods of installing software on Fedora Silverblue:

  • Expanding the immutable base image using rpm-ostree. The base image is intended for system components, but technically you can put anything in it.

  • Installing graphical applications using Flatpak. This is sometimes-sandboxed, sometimes-not-so-sandboxed, but generally quite stable. If a layman were to install Fedora Silverblue, this would be the only installation method they would care about, aside from updating the base image when prompted.

  • Installing CLI tools using toolbox. This is a Podman (read: Docker) container of Fedora that mounts the user’s home directory. Because it’s a Podman image, you can install any RPM using Fedora’s package manager, DNF.

This article by Fedora Magazine goes into slightly more detail.

The basic development workflow

Instead of littering the base operating system with all means of development tools, development takes place within a toolbox. This looks a little like:

carmenbianca@thinkpad-x395 ~ $ ls
Elŝutoj  Labortablo  Nextcloud  Projektoj  Publike  Ŝablonoj
carmenbianca@thinkpad-x395 ~ $ toolbox create my-project
Created container: my-project
Enter with: toolbox enter my-project
carmenbianca@thinkpad-x395 ~ $ toolbox enter my-project
⬢[carmenbianca@toolbox ~]$
⬢[carmenbianca@toolbox ~]$ ls
Elŝutoj  Labortablo  Nextcloud  Projektoj  Publike  Ŝablonoj
⬢[carmenbianca@toolbox ~]$ # We're still in the same directory, which means
⬢[carmenbianca@toolbox ~]$ # that we still have our GPG and SSH keys, and
⬢[carmenbianca@toolbox ~]$ # other configuration files!
⬢[carmenbianca@toolbox ~]$
⬢[carmenbianca@toolbox ~]$ sudo dnf groupinstall "Development Tools"
[...]

The nice thing now is that you now have full freedom to mess with absolutely anything, carefree. If your program touches some files in /etc, you can mess with those files without affecting your operating system. If you want to test your program against a custom-built glibc, you can simply do that without fear of breaking your computer. At worst you’ll break the toolbox, from which you can easily recover by recreating it. And if you need more isolation from e.g. your home directory, you can do that inside of a non-toolbox Podman container.

The editor problem

There is a slight problem with the above workflow, however. Unless you install your code editor inside of the toolbox, the editor has no access to the development tools you’ve installed. Instead, the editor only has access to the tools that are available in the base system image, or the editor only has access to the tools in its Flatpak runtime.

There are several ways to get around this, but they’re dependent on the editor you use. I’ll document how I circumvented this problem.

VSCodium in a Flatpak

I use the Flatpak version of VSCodium. VSCodium has an integrated terminal emulator. Unfortunately the Flatpak shell is extremely barebones—it doesn’t even have vi! Fortunately, we can tell VSCodium that we want to run a different program as our shell. Change settings.json to include:

  [...]
  "terminal.integrated.shell.linux": "/usr/bin/env",
  "terminal.integrated.shellArgs.linux": [
    "--",
    "flatpak-spawn",
    "--host",
    "toolbox",
    "enter",
    "main-toolbox"
  ],
  [...]

main-toolbox here is an all-purpose toolbox that has heaps of tools installed. You can adjust these settings on a per-workspace or per-project level, so a given project might use a different toolbox than main-toolbox.

The way the above configuration works is a little roundabout. It breaks out of the Flatpak and into the base installation using flatpak-spawn --host, and then enters a toolbox using toolbox enter. The end result is that you have an integrated terminal with a functional, fully featured shell.

This doesn’t solve everything, however. The editor itself also has some integration such as running tests. Because I mainly do Python development, it is fairly easy to bootstrap this functionality. The Flatpak runtime ships with Python 3.8. This means that I can create a Python 3.8 virtualenv and tell VSCodium to use this virtualenv for all of its Python stuff, which ends up working out just fine. The virtualenv is shared between the Flatpak and the toolbox, because both environments have access to the same file system.


For non-Python endeavours, Flatpak SDK extensions can be installed. This looks a little like:

$ flatpak install flathub org.freedesktop.Sdk.Extension.dotnet
$ flatpak install flathub org.freedesktop.Sdk.Extension.golang
$ FLATPAK_ENABLE_SDK_EXT=dotnet,golang flatpak run com.vscodium.codium

The final problem I ran into was using VSCodium to edit my Git commit messages. VSCodium has a tiny box in the UI where you can write commit messages, but it’s rather tiny and fiddly and not great. So instead I do git config --global --add core.editor 'codium --wait'. This should launch codium --wait path/to/git/commit/message when git commit is run, allow VSCodium to edit the message, and wait until the file is saved and exited to evaluate the message.

The problem is that codium only exists as an executable inside of the Flatpak. It does not exist in the toolbox or the base operating system.

I circumvented this problem by creating a custom script in .local/bin/codium. It tries to detect its current environment by checking whether a certain file exists, and tries to use that environment’s method of accessing the VSCodium Flatpak. This looks like:

#!/bin/bash

# Will still need to pass "--wait" to make it behave nicely.
# In order to get --wait to work in a Flatpak, run
# `flatpak override --user --env=TMPDIR=/var/tmp com.vscodium.codium`.

if [ -f /app/bin/codium ]
then
    exec /app/bin/codium "$@"
elif [ -f /usr/bin/codium ]
then
    exec /usr/bin/codium "$@"
elif [ -f /usr/bin/flatpak-spawn ]
then
    exec /usr/bin/flatpak-spawn --host flatpak run com.vscodium.codium "$@"
elif [ -f /usr/bin/flatpak ]
then
    exec /usr/bin/flatpak run com.vscodium.codium "$@"
else
    for arg do
        shift
        [ "$arg" = "--wait" ] && continue
        set -- "$@" "$arg"
    done
    exec vi "$@"
fi

This allows you to run codium [--wait] from anywhere and expect a functional editor to pop up. There’s a fallback to vi, which I’ve not yet hit.

I hope this helps anybody looking for solutions to these problems :) It’s more than a little bit of bother, but it’s incredibly reassuring to know that your operating system won’t ever break on you.

Thursday, 04 February 2021

21.04 releases schedule finalized

It is available at the usual place https://community.kde.org/Schedules/release_service/21.04_Release_Schedule

Dependency freeze is in five weeks (March 11) and Feature Freeze a week after that, make sure you start finishing your stuff!

Tuesday, 26 January 2021

ROC and Precision-Recall curves - How do they compare?

The accuracy of a model is often criticized for not being informative enough to understand its performance trade-offs. One has to turn to more powerful tools instead. Receiver Operating Characteristic (ROC) and Precision-Recall (PR) curves are standard tools used to measure the performance of binary classification models and find an appropriate decision threshold. But how do they relate to each other?

What are they for?

Often, the result of binary classification (with a positive and negative class) models is a real number ranging from 0 to 1. This number can be interpreted as a probability. Above a given threshold, the model is considered to have predicted the positive class. This threshold often defaults to 0.5. While sound, this default may not be the optimal value. Fine-tuning it can impact the balance between false positives and false negatives, which is especially useful when they don’t have the same importance. This fine-tuning can be done with ROC and PR curves, and is also useful as a performance indicator.
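
To make the thresholding idea concrete, here is a minimal scikit-learn sketch of applying a custom decision threshold. The toy dataset, model and the 0.3 threshold are made up purely for illustration; they are not from the article.

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Toy data and model, only to illustrate thresholding.
X, y = make_classification(n_samples=1000, random_state=0)
clf = LogisticRegression().fit(X, y)

# Probability of the positive class for each sample.
probabilities = clf.predict_proba(X)[:, 1]

# The default decision corresponds to a 0.5 threshold; moving it trades
# false positives against false negatives.
threshold = 0.3
predictions = (probabilities >= threshold).astype(int)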

How to make a ROC and PR curve?

Both curves are based on the same idea: measuring the performance of the model at different threshold values. They differ in the performance measures they use. The ROC curve measures both the ability of the model to correctly classify positive examples and its ability to minimize false positive errors. The PR curve, on the other hand, focuses exclusively on the positive class and ignores correct predictions of the negative class, making it a compelling measure for imbalanced datasets. While the two curves are different, it has been proved that they are equivalent: although the true negatives (correct predictions of the negative class) are not taken into account by the PR curve, they can be deduced from the other measures.

Receiver Operating Characteristic (ROC) curve

ROC curves measure the True Positive Rate (among the positive samples, how many were correctly identified as positive), and the False Positive Rate (among the negative samples, how many were falsely identified as positive):

$$TPR = \frac {TP} {TP + FN}$$

$$FPR = \frac {FP} {FP + TN}$$

A perfect predictor would be able to maximize the TPR while minimizing the FPR.

Precision-Recall (PR) curve

The Precision-Recall curve uses the Positive Predictive Value, or precision (among the samples which the model predicted as positive, how many were correctly classified), and the True Positive Rate (also called recall):

$$PPV = \frac {TP} {TP + FP}$$

A perfect predictor would both maximize the TPR and the PPV at the same time.
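
As a quick worked example, with a made-up confusion matrix of $TP = 8$, $FN = 2$, $FP = 1$ and $TN = 9$ (numbers chosen purely to illustrate the formulas):

$$TPR = \frac{8}{8 + 2} = 0.8, \quad FPR = \frac{1}{1 + 9} = 0.1, \quad PPV = \frac{8}{8 + 1} \approx 0.89$$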

ROC and Precision-Recall curves in Python

With scikit-learn and matplotlib (both are Free Software), creating these curves is easy.

from matplotlib import pyplot as plt
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import plot_roc_curve, plot_precision_recall_curve

X, y = make_classification(n_samples=1000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, random_state=42, test_size=0.2
)
lr = LogisticRegression().fit(X_train, y_train)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(14, 5))
plot_roc_curve(lr, X_test, y_test, ax=ax1)
plot_precision_recall_curve(lr, X_test, y_test, ax=ax2)
ax1.set_title("ROC curve")
ax2.set_title("Precision-Recall curve")
fig.suptitle("Comparaison of ROC and P-R curves")
plt.show()

Plot of ROC and Precision-Recall curves

Lines 7-11 create a sample dataset with a binary target, split it into a training set and a testing set, and train a logistic regression model. The important lines are 14 and 15: the calls to plot_roc_curve() and plot_precision_recall_curve() automatically compute the performance measures at different threshold values.

How to read the curves?

Both curves offer two useful pieces of information: how to choose the positive class prediction threshold, and what the overall performance of the classification model is. The former is determined by selecting the threshold which yields the best trade-off, in line with the prediction task and operational needs. The latter is done by measuring the area under the curve, which indicates how good the model is: the area under the ROC curve is the probability that a sample from the negative class receives a lower score than a sample from the positive class.

With scikit-learn, the values can be computed either by using the roc_auc attribute of the object returned by plot_roc_curve() or by calling roc_auc_score() directly for ROC curves and by using the average_precision attribute of the object returned by plot_precision_recall_curve() or by calling average_precision_score() directly for PR curves.
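
As a minimal sketch, reusing the lr, X_test and y_test objects from the snippet above, the two summary values can also be computed directly:

from sklearn.metrics import roc_auc_score, average_precision_score

# Scores for the positive class from the fitted logistic regression above.
y_scores = lr.predict_proba(X_test)[:, 1]

print("ROC AUC:", roc_auc_score(y_test, y_scores))
print("Average precision:", average_precision_score(y_test, y_scores))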

Tuesday, 15 December 2020

AVIF support for KImageFormats just landed

Thanks to Daniel Novomeský we will have support for AVIF images in KImageFormats starting in the next release.


We have (and by we I mean him) also added the avif code to be fuzzed under oss-fuzz so we'll be helping the upstream libavif/libaom to find potential memory issues in their code.


https://invent.kde.org/frameworks/kimageformats/-/merge_requests/8

https://github.com/google/oss-fuzz/pull/4850

Tuesday, 08 December 2020

Dutch Public Money? Public Code! video released!

In my last blogpost I wrote about how we created a Dutch video translation of the Public Money? Public Code! campaign video. Well, you can now watch it yourself, as it has been released! On the 25th of November we held our Netherlands online get-together in which we showed it as a sneak preview, before release. At this meeting Matthias also joined to congratulate us with the result and to thank us for our efforts. This was a welcome surprise. Our next online get-together will be held on the 23rd of December, feel free to join and have a chat.

Thursday, 03 December 2020

BTRFS and RAID1 over LUKS

Hi! I’m writing this article as a mini-HOWTO on how to set up a btrfs-raid1 volume on encrypted disks (luks). This page serves as my personal guide/documentation, although you can use it with little intervention.

Disclaimer: Be very careful! This is a mini-HOWTO article, do not copy/paste commands. Modify them to fit your environment.

$ date -R
Thu, 03 Dec 2020 07:58:49 +0200

wd40purz.jpg

Prologue

I had to replace one of my existing data/media setups (btrfs-raid0) due to some random hardware errors on one of the disks. The existing disks are 7.1-year-old WD Green 1TB drives and the new disks are WD Purple 4TB.

Western Digital Green  1TB, about  70€ each, SATA III (6 Gbit/s), 7200 RPM, 64 MB Cache
Western Digital Purple 4TB, about 100€ each, SATA III (6 Gbit/s), 5400 RPM, 64 MB Cache

This will give me about 3.64T (up from 1.86T). I had concerns about the slower RPM, but at the end of this article you will see some related stats.

My primary daily use is streaming media (video/audio/images) via minidlna instead of cifs/nfs (samba), although the service is still up & running.

Disks

It is important to use disks with the exact same size and speed. Usually for Raid 1 purposes, I prefer using the same model. One can argue that a diversity of models and manufacturers should be preferable, to reduce the chance of firmware issues affecting a specific series. When working with Raid 1, the most important things to consider are:

  • Geometry (size)
  • RPM (speed)

and all the disks should have the same specs, otherwise size and speed will be downgraded to the smallest and slowest disk.

Identify Disks

The two (2) Western Digital Purple 4TB disks are manufacturer model: WDC WD40PURZ

The system sees them as:

$ sudo find /sys/devices -type f -name model -exec cat {} \;

WDC WD40PURZ-85A
WDC WD40PURZ-85T

try to identify them from the kernel with list block devices:

$ lsblk

NAME         MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sdc            8:32   0   3.6T  0 disk
sde            8:64   0   3.6T  0 disk

verify it with hwinfo

$ hwinfo --short --disk
disk:
  /dev/sde             WDC WD40PURZ-85A
  /dev/sdc             WDC WD40PURZ-85T

$ hwinfo --block --short

  /dev/sde             WDC WD40PURZ-85A
  /dev/sdc             WDC WD40PURZ-85T

with list hardware:

$ sudo lshw -short | grep disk

/0/100/1f.5/0        /dev/sdc   disk           4TB WDC WD40PURZ-85T
/0/100/1f.5/1        /dev/sde   disk           4TB WDC WD40PURZ-85A

$ sudo lshw -class disk -json | jq -r .[].product

WDC WD40PURZ-85T
WDC WD40PURZ-85A

Luks

Create Random Encrypted keys

I prefer to use random generated keys for the disk encryption. This is also useful for automated scripts (encrypt/decrypt disks) instead of typing a pass phrase.

Create a folder to save the encrypted keys:

$ sudo mkdir -pv /etc/crypttab.keys/

create keys with dd against urandom:

WD40PURZ-85A

$ sudo dd if=/dev/urandom of=/etc/crypttab.keys/WD40PURZ-85A bs=4096 count=1

1+0 records in
1+0 records out
4096 bytes (4.1 kB, 4.0 KiB) copied, 0.00015914 s, 25.7 MB/s

WD40PURZ-85T

$ sudo dd if=/dev/urandom of=/etc/crypttab.keys/WD40PURZ-85T bs=4096 count=1

1+0 records in
1+0 records out
4096 bytes (4.1 kB, 4.0 KiB) copied, 0.000135452 s, 30.2 MB/s

verify that the two (2) 4K random keys exist in the above directory with list files:

$ sudo ls -l /etc/crypttab.keys/WD40PURZ-85*

-rw-r--r-- 1 root root 4096 Dec  3 08:00 /etc/crypttab.keys/WD40PURZ-85A
-rw-r--r-- 1 root root 4096 Dec  3 08:00 /etc/crypttab.keys/WD40PURZ-85T

Format & Encrypt Hard Disks

It is time to format and encrypt the hard disks with Luks

Be very careful, choose the correct disk, type uppercase YES to confirm.

$ sudo  cryptsetup luksFormat /dev/sde --key-file /etc/crypttab.keys/WD40PURZ-85A

WARNING!
========
This will overwrite data on /dev/sde irrevocably.

Are you sure? (Type 'yes' in capital letters): YES
$ sudo  cryptsetup luksFormat /dev/sdc --key-file /etc/crypttab.keys/WD40PURZ-85T

WARNING!
========
This will overwrite data on /dev/sdc irrevocably.

Are you sure? (Type 'yes' in capital letters): YES

Verify Encrypted Disks

print block device attributes:

$ sudo  blkid | tail -2

/dev/sde: UUID="d5800c02-2840-4ba9-9177-4d8c35edffac" TYPE="crypto_LUKS"
/dev/sdc: UUID="2ffb6115-09fb-4385-a3c9-404df3a9d3bd" TYPE="crypto_LUKS"

Open and Decrypt

opening encrypted disks with luks

  • WD40PURZ-85A
$ sudo  cryptsetup luksOpen /dev/disk/by-uuid/d5800c02-2840-4ba9-9177-4d8c35edffac WD40PURZ-85A -d /etc/crypttab.keys/WD40PURZ-85A
  • WD40PURZ-85T
$ sudo  cryptsetup luksOpen /dev/disk/by-uuid/2ffb6115-09fb-4385-a3c9-404df3a9d3bd WD40PURZ-85T -d /etc/crypttab.keys/WD40PURZ-85T

Verify Status

  • WD40PURZ-85A
$ sudo  cryptsetup status   /dev/mapper/WD40PURZ-85A

/dev/mapper/WD40PURZ-85A is active.

  type:         LUKS2
  cipher:       aes-xts-plain64
  keysize:      512 bits
  key location: keyring
  device:       /dev/sde
  sector size:  512
  offset:       32768 sectors
  size:         7814004400 sectors
  mode:         read/write
  • WD40PURZ-85T
$ sudo  cryptsetup status   /dev/mapper/WD40PURZ-85T

/dev/mapper/WD40PURZ-85T is active.

  type:         LUKS2
  cipher:       aes-xts-plain64
  keysize:      512 bits
  key location: keyring
  device:       /dev/sdc
  sector size:  512
  offset:       32768 sectors
  size:         7814004400 sectors
  mode:         read/write

BTRFS

Current disks

$sudo btrfs device stats /mnt/data/

[/dev/mapper/western1T].write_io_errs     28632
[/dev/mapper/western1T].read_io_errs      916948985
[/dev/mapper/western1T].flush_io_errs     0
[/dev/mapper/western1T].corruption_errs   0
[/dev/mapper/western1T].generation_errs   0
[/dev/mapper/western1Tb].write_io_errs    0
[/dev/mapper/western1Tb].read_io_errs     0
[/dev/mapper/western1Tb].flush_io_errs    0
[/dev/mapper/western1Tb].corruption_errs  0
[/dev/mapper/western1Tb].generation_errs  0

There are a lot of write/read errors :(

btrfs version

$ sudo  btrfs --version
btrfs-progs v5.9

$ sudo  mkfs.btrfs --version
mkfs.btrfs, part of btrfs-progs v5.9

Create BTRFS Raid 1 Filesystem

by using mkfs, selecting a disk label, choosing raid1 metadata and data to be on both disks (mirror):

$ sudo mkfs.btrfs \
  -L WD40PURZ \
  -m raid1 \
  -d raid1 \
  /dev/mapper/WD40PURZ-85A \
  /dev/mapper/WD40PURZ-85T

or in one-liner (as-root):

mkfs.btrfs -L WD40PURZ -m raid1 -d raid1 /dev/mapper/WD40PURZ-85A /dev/mapper/WD40PURZ-85T

format output

btrfs-progs v5.9
See http://btrfs.wiki.kernel.org for more information.

Label:              WD40PURZ
UUID:               095d3b5c-58dc-4893-a79a-98d56a84d75d
Node size:          16384
Sector size:        4096
Filesystem size:    7.28TiB
Block group profiles:
  Data:             RAID1             1.00GiB
  Metadata:         RAID1             1.00GiB
  System:           RAID1             8.00MiB
SSD detected:       no
Incompat features:  extref, skinny-metadata
Runtime features:
Checksum:           crc32c
Number of devices:  2
Devices:
   ID        SIZE  PATH
    1     3.64TiB  /dev/mapper/WD40PURZ-85A
    2     3.64TiB  /dev/mapper/WD40PURZ-85T

Notice that both disks have the same UUID (Universal Unique IDentifier) number:

UUID: 095d3b5c-58dc-4893-a79a-98d56a84d75d

Verify block device

$ blkid | tail -2

/dev/mapper/WD40PURZ-85A: LABEL="WD40PURZ" UUID="095d3b5c-58dc-4893-a79a-98d56a84d75d" UUID_SUB="75c9e028-2793-4e74-9301-2b443d922c40" BLOCK_SIZE="4096" TYPE="btrfs"
/dev/mapper/WD40PURZ-85T: LABEL="WD40PURZ" UUID="095d3b5c-58dc-4893-a79a-98d56a84d75d" UUID_SUB="2ee4ec50-f221-44a7-aeac-aa75de8cdd86" BLOCK_SIZE="4096" TYPE="btrfs"

once more, be aware of the same UUID: 095d3b5c-58dc-4893-a79a-98d56a84d75d on both disks!

Mount new block disk

create a new mount point

$ sudo  mkdir -pv /mnt/WD40PURZ
mkdir: created directory '/mnt/WD40PURZ'

append the below entry in /etc/fstab (as-root)

echo 'UUID=095d3b5c-58dc-4893-a79a-98d56a84d75d    /mnt/WD40PURZ    auto    defaults,noauto,user,exec    0    0' >> /etc/fstab

and finally, mount it!

$ sudo  mount /mnt/WD40PURZ

$ mount | grep WD
/dev/mapper/WD40PURZ-85A on /mnt/WD40PURZ type btrfs (rw,nosuid,nodev,relatime,space_cache,subvolid=5,subvol=/)

Disk Usage

check disk usage and free space for the new encrypted mount point

$ df -h /mnt/WD40PURZ/

Filesystem                Size  Used Avail Use% Mounted on
/dev/mapper/WD40PURZ-85A  3.7T  3.4M  3.7T   1% /mnt/WD40PURZ

btrfs filesystem disk usage

$ btrfs filesystem df /mnt/WD40PURZ | column -t

Data,           RAID1:   total=1.00GiB,  used=512.00KiB
System,         RAID1:   total=8.00MiB,  used=16.00KiB
Metadata,       RAID1:   total=1.00GiB,  used=112.00KiB
GlobalReserve,  single:  total=3.25MiB,  used=0.00B

btrfs filesystem show

$ sudo btrfs filesystem show /mnt/WD40PURZ

Label: 'WD40PURZ'  uuid: 095d3b5c-58dc-4893-a79a-98d56a84d75d
    Total devices 2 FS bytes used 640.00KiB
    devid    1 size 3.64TiB used 2.01GiB path /dev/mapper/WD40PURZ-85A
    devid    2 size 3.64TiB used 2.01GiB path /dev/mapper/WD40PURZ-85T

stats

$ sudo  btrfs device stats /mnt/WD40PURZ/

[/dev/mapper/WD40PURZ-85A].write_io_errs    0
[/dev/mapper/WD40PURZ-85A].read_io_errs     0
[/dev/mapper/WD40PURZ-85A].flush_io_errs    0
[/dev/mapper/WD40PURZ-85A].corruption_errs  0
[/dev/mapper/WD40PURZ-85A].generation_errs  0
[/dev/mapper/WD40PURZ-85T].write_io_errs    0
[/dev/mapper/WD40PURZ-85T].read_io_errs     0
[/dev/mapper/WD40PURZ-85T].flush_io_errs    0
[/dev/mapper/WD40PURZ-85T].corruption_errs  0
[/dev/mapper/WD40PURZ-85T].generation_errs  0

btrfs fi disk usage

btrfs filesystem disk usage

$ sudo  btrfs filesystem usage /mnt/WD40PURZ

Overall:
    Device size:                  7.28TiB
    Device allocated:             4.02GiB
    Device unallocated:           7.27TiB
    Device missing:                 0.00B
    Used:                         1.25MiB
    Free (estimated):             3.64TiB   (min: 3.64TiB)
    Data ratio:                      2.00
    Metadata ratio:                  2.00
    Global reserve:               3.25MiB   (used: 0.00B)
    Multiple profiles:                 no

Data,RAID1: Size:1.00GiB, Used:512.00KiB (0.05%)
   /dev/mapper/WD40PURZ-85A    1.00GiB
   /dev/mapper/WD40PURZ-85T    1.00GiB

Metadata,RAID1: Size:1.00GiB, Used:112.00KiB (0.01%)
   /dev/mapper/WD40PURZ-85A    1.00GiB
   /dev/mapper/WD40PURZ-85T    1.00GiB

System,RAID1: Size:8.00MiB, Used:16.00KiB (0.20%)
   /dev/mapper/WD40PURZ-85A    8.00MiB
   /dev/mapper/WD40PURZ-85T    8.00MiB

Unallocated:
   /dev/mapper/WD40PURZ-85A    3.64TiB
   /dev/mapper/WD40PURZ-85T    3.64TiB

Speed

Using hdparm to test/get some speed stats

$ sudo  hdparm -tT /dev/sde

/dev/sde:
 Timing cached reads:    25224 MB in  1.99 seconds = 12662.08 MB/sec
 Timing buffered disk reads: 544 MB in  3.01 seconds = 181.02 MB/sec

$ sudo  hdparm -tT /dev/sdc

/dev/sdc:
 Timing cached reads:    24852 MB in  1.99 seconds = 12474.20 MB/sec
 Timing buffered disk reads: 534 MB in  3.00 seconds = 177.85 MB/sec

$ sudo  hdparm -tT /dev/disk/by-uuid/095d3b5c-58dc-4893-a79a-98d56a84d75d

/dev/disk/by-uuid/095d3b5c-58dc-4893-a79a-98d56a84d75d:
 Timing cached reads:   25058 MB in  1.99 seconds = 12577.91 MB/sec
 HDIO_DRIVE_CMD(identify) failed: Inappropriate ioctl for device
 Timing buffered disk reads: 530 MB in  3.00 seconds = 176.56 MB/sec

These are the new disks with 5400 rpm; let’s see what the old 7200 rpm disks show here:

/dev/sdb:
 Timing cached reads:    26052 MB in  1.99 seconds = 13077.22 MB/sec
 Timing buffered disk reads: 446 MB in  3.01 seconds = 148.40 MB/sec

/dev/sdd:
 Timing cached reads:    25602 MB in  1.99 seconds = 12851.19 MB/sec
 Timing buffered disk reads: 420 MB in  3.01 seconds = 139.69 MB/sec

So even though these new disks are 5400 rpm, they seem to be faster than the old ones!
Also, I have mounted the problematic Raid-0 setup as read-only.

Rsync

I am now moving some data to measure time

  • Folder-A
du -sh /mnt/data/Folder-A/
795G   /mnt/data/Folder-A/
time rsync -P -rax /mnt/data/Folder-A/ Folder-A/
sending incremental file list
created directory Folder-A
./
...

real  163m27.531s
user    8m35.252s
sys    20m56.649s
  • Folder-B
du -sh /mnt/data/Folder-B/
464G   /mnt/data/Folder-B/
time rsync -P -rax /mnt/data/Folder-B/ Folder-B/
sending incremental file list
created directory Folder-B
./
...

real    102m1.808s
user    7m30.923s
sys     18m24.981s

Control and Monitor Utility for SMART Disks

Last but not least, some smart info with smartmontools

$sudo smartctl -t short /dev/sdc

smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.79-1-lts] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: "Execute SMART Short self-test routine immediately in off-line mode".
Drive command "Execute SMART Short self-test routine immediately in off-line mode" successful.
Testing has begun.
Please wait 2 minutes for test to complete.
Test will complete after Thu Dec  3 08:58:06 2020 EET
Use smartctl -X to abort test.

result :

$sudo smartctl -l selftest /dev/sdc

smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.79-1-lts] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%         1         -

details

$sudo smartctl -A  /dev/sdc

smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.79-1-lts] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   100   253   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   100   253   021    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       1
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   100   253   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       1
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       1
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       0
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       1
194 Temperature_Celsius     0x0022   119   119   000    Old_age   Always       -       31
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0

Second disk

$sudo smartctl -t short /dev/sde

smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.79-1-lts] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: "Execute SMART Short self-test routine immediately in off-line mode".
Drive command "Execute SMART Short self-test routine immediately in off-line mode" successful.
Testing has begun.
Please wait 2 minutes for test to complete.
Test will complete after Thu Dec  3 09:00:56 2020 EET
Use smartctl -X to abort test.

selftest results

$sudo smartctl -l selftest /dev/sde

smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.79-1-lts] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%         1         -

details

$sudo smartctl -A  /dev/sde

smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.79-1-lts] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   100   253   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   100   253   021    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       1
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   100   253   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       1
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       1
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       0
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       1
194 Temperature_Celsius     0x0022   116   116   000    Old_age   Always       -       34
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0

that’s it !

-ebal

Tag(s): btrfs, raid, raid1, luks

Shifting Emphasis

I joined the Debian project in late 1994, well before the first stable release was issued, and have been involved in various ways continuously ever since. Over the years, I adopted a number of packages that are, or at least were at one time, fundamental to the distribution.

But, not surprisingly, my interests have shifted over time. In the more than quarter century I've contributed to Debian, I've adopted existing packages that needed attention, packaged new software I wanted to use that wasn't yet in Debian, offered packages up for others to adopt, and even sometimes requested the removal of packages that became obsolete or replaced by something better. That all felt completely healthy.

But over the last couple weeks, I realized I'm still "responsible" for some packages I'd had for a very long time, that generally work well but over time have accumulated bugs in functionality I just don't use, and frankly haven't been able to find the motivation to chase down. As one example, I just noticed that I first uploaded the gzip package 25 years ago today, on 2 December 1995. And while the package works fine for me and most other folks, there are 30 outstanding bugs and 3 forwarded bugs that I just can't muster up any energy to address.

So, I just added gzip to a short list of packages I've offered up for adoption recently. I'm pleased that tar already has a new maintainer, and hope that both sudo and gzip will get more attention soon.

It's not that I'm less interested in Debian. I've just been busy recently packaging up more software I use or want to use in designing high power model rockets and the solid propellant motors I fly in them, and would rather spend the time I have available for Debian maintaining those packages and all their various build dependencies than continuing to be responsible for core packages in the distribution that "work fine for me" but could use attention.

I'm writing about this partly to mark the passing of more than a quarter century as a package maintainer for Debian, partly to encourage other Debian package maintainers with the right skills and motivation to consider adopting some of the packages I'm giving up, and finally to encourage other long-time participants in Debian to spend a little time evaluating their own package lists in a similar way.

Wednesday, 02 December 2020

CMake: Use new style imported targets to link libraries

 What does that mean?


It means this https://invent.kde.org/graphics/okular/-/commit/3129b642f996589f1c7fcee513741e1993924241


That is, you add JPEG::JPEG to target_link_libraries, and that's it, no need to add the includes, the compile_definitions and whatnot.

 

This has multiple advantages to the old mode:

  • You can't misspell it, since these are actual targets that CMake knows about, not strings, so if you write something like JPEG::JPG it will fail at the cmake stage
  • You won't forget to add the include (most of us did at some point) because you don't need to :)
  • The include path is just for that particular target (the old method added it to "everyone")
  • It's basically fewer lines, so it's harder to get wrong

 

To know if you can use that you'll need to use the cmake help, for example the imported target is described at https://cmake.org/cmake/help/latest/module/FindJPEG.html


Now, we probably don't depend on cmake 3.19, do we? So we need to make sure the cmake version we say we support already has that imported target. Use the version selector at the top of that page to find which release introduced it; for JPEG we can see it was cmake 3.12.

Planned obsolescence for Android phones

Four years ago I bought a OnePlus 2 for 319€; nowadays it is still good hardware, better than or as good as most of the mid to low end phones on the market.

 

But it has a big-ish problem: it is stuck on Android 6, which is not only old and probably full of security holes, but is also starting to lose support from some apps.


Solution: LineageOS.


It provides Android 10 for my phone and works flawlessly (from what I can see, I've only been using it for 4 days), so the fact that "some randos" (with all my love, I'm a rando) can make it work means that the manufacturer could have done it if they wanted to; they just chose not to.


It's really sad, because if I wasn't a geek who knew how to put LineageOS on the device I would have probably ended up buying a new phone, filling the world with more junk.


Let's hope the more open devices like the PinePhone KDE Community Edition put an end to this :)

Tuesday, 10 November 2020

st, xft and ubuntu 20.04.1

Some time ago I switched to AwesomeWM and with that came another change, my default terminal emulator. Having used GNOME terminal for years, I soon switched to Terminator back in the day. Leaving GNOME behind, in search for a more lean desktop with less frills and more keyboard centric features, I also had to ditch that terminal emulator (it has too many dependencies for my use case). Eventually I stumbled upon st, which fit the bill.

st still seems almost perfect for me and I'm sticking with it, for now. There is one annoying bug though, which came to light when I started receiving e-mails with emoticons. Those emoticons crashed my 'st' instance!

This is actually caused by an upstream Xft bug. When emoticons are displayed, they crash st. I had to resort to using xterm sometimes, which is, well, not a great experience nowadays. I set out on a journey to fix my desktop.

FAQ

So I checked the FAQ of st and found an answer to my issue:

 ## BadLength X error in Xft when trying to render emoji

 Xft makes st crash when rendering color emojis with the following error:

 "X Error of failed request:  BadLength (poly request too large or internal Xlib length error)"
   Major opcode of failed request:  139 (RENDER)
   Minor opcode of failed request:  20 (RenderAddGlyphs)
   Serial number of failed request: 1595
   Current serial number in output stream:  1818"

 This is a known bug in Xft (not st) which happens on some platforms and
 combination of particular fonts and fontconfig settings.

 See also:
 https://gitlab.freedesktop.org/xorg/lib/libxft/issues/6
 https://bugs.freedesktop.org/show_bug.cgi?id=107534
 https://bugzilla.redhat.com/show_bug.cgi?id=1498269

 The solution is to remove color emoji fonts or disable this in the fontconfig
 XML configuration.  As an ugly workaround (which may work only on newer
 fontconfig versions (FC_COLOR)), the following code can be used to mask color
 fonts:

     FcPatternAddBool(fcpattern, FC_COLOR, FcFalse);

 Please don't bother reporting this bug to st, but notify the upstream Xft
 developers about fixing this bug.

The solution

Checking issue 6 at xft shows me that this is an active issue. Reading the posts I found this merge request which solves the issue in xft, but it is still being worked on by Maxime Coste.

Waiting for the patch to be finalized in xft, then released, and then used in my desktop distribution of choice (currently Ubuntu 20.04) will take too long (yes, I am impatient). So, I decided to patch libxft2 manually on my system, using the patch by Maxime (thank you Maxime!). I also created my own patch file, since I had merge errors. Here are the instructions:

apt-get source libxft2
patch -p0 < modified_for_ubuntu_20.04.1.patch
debuild -b -uc -us
sudo dpkg -i ../libxft2_2.3.3-0ubuntu1_amd64.deb


Sunday, 08 November 2020

20.12 releases branches created

Make sure you commit anything you want to end up in the 20.12 releases to them


We're already past the dependency freeze.


The Feature Freeze and Beta is this Thursday 12 of November.


More interesting dates

November 26, 2020: 20.12 RC (20.11.90) Tagging and Release

December 3, 2020: 20.12 Tagging

December 10, 2020: 20.12 Release


https://community.kde.org/Schedules/release_service/20.12_Release_Schedule


Cheers,

Albert



Tuesday, 03 November 2020

What does a transformer do?

Transformers are giant robots coming from Cybertron. There are two Transformer tribes: the Autobots and the Decepticons. They have been fighting each other over the Allspark, a mythical artifact capable of building worlds and mechanical beings. Well, there is also another kind of Transformers, but those are not about warfare. However they are pretty good at language understanding. Let’s see how!

Attention is all you need!

To understand what the transformer does, one first needs to understand the principle of Attention in neural networks. Attention is a mechanism generating a weights vector $A = \alpha_1,\dotsc, \alpha_n , \ \alpha \in \mathbb{R}$ allowing the neural network to focus on specific parts of an input of length $n$. Since the relationship between two words is not commutative (how important a word $w_1$ is w.r.t. a word $w_2$ is different from how important $w_2$ is w.r.t. $w_1$), there need to be two sets of weights associated with a word $w$:

  1. How $w$ is important w.r.t every other word
  2. How every other word is important w.r.t $w$
  3. … and there needs to be a third set of weights which is used to compute the final vector $A$, after having considered 1. and 2.

Each word $w$ is therefore transformed to an embedding $x$ of dimension $d$ and then multiplied with three weight matrices. The results of these operations are respectively named $Q$, $K$ and $V$. To get the attention weights, we use the following formulas:

$$\text{Attention}(Q, K, V) = \sigma\left(\frac{QK^\mathrm{T}}{\sqrt{d}}\right)V$$

$$\sigma(t_i) = \frac{e^{t_i}}{\sum_{j=1}^{N}{e^{t_j}}}$$

We perform dot products between the rows of $Q$ and the rows of $K$ (each entry of $QK^\mathrm{T}$ is one such dot product $q_i \cdot k_j$) and divide by $\sqrt{d}$, which is proportional to the size of the embedding vectors, to scale down the results so that they do not grow too large. We then apply the softmax function $\sigma$ so that the values sum up to 1, and multiply by $V$. This allows the network to understand the relative importance of each word, and is parallelizable.
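
To make the formula above concrete, here is a minimal NumPy sketch of scaled dot-product attention. The sizes and random matrices are made up for the demonstration, and this is only the single-head case, not the full multi-head machinery.

import numpy as np

def softmax(t, axis=-1):
    # Numerically stable softmax, applied row-wise.
    e = np.exp(t - t.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # sigma(Q K^T / sqrt(d)) V, as in the formula above.
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)
    return softmax(scores) @ V

# Toy input: n = 4 words, embedding dimension d = 8 (made-up numbers).
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(4, 8)) for _ in range(3))
print(attention(Q, K, V).shape)  # (4, 8)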

Attention in Transformers

Transformers work by stacking blocks onto each other. Each block is composed of an attention unit, a layer normalization, a dense layer and a second layer normalization. Residual connections are added after the attention unit and after the dense layer.

Diagram of an attention block.

Layer normalization

The goal of layer normalization is to speed up the convergence of the network by making sure the means of the embeddings are 0. This makes the learning faster, because otherwise the gradients would not be centered and the weight updates would not be well scaled. As the name says, in layer normalization, normalization is done layer-wise, meaning that the mean and variance of the layer are computed for every training example.1

$$LayerNorm(x) = \gamma\Big(\frac{x - \mu_x}{\sqrt{var(x) + \epsilon}}\Big) + \beta$$

$\gamma$ and $\beta$ are learnable parameters that allow the network to learn a distribution potentially more optimal than the normal distribution. They are initially set to 1 and 0, respectively.
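
A minimal NumPy sketch of the layer normalization formula above; the toy input is made up, and $\gamma$ and $\beta$ are kept at their initial values of 1 and 0.

import numpy as np

def layer_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    # Normalize each example over its feature dimension.
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

x = np.random.default_rng(0).normal(loc=3.0, scale=2.0, size=(2, 8))
print(layer_norm(x).mean(axis=-1))  # approximately 0 for every example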

Dense layer

Dense layers (also known as feedforward layers) are simple neural networks where every input is connected to $H$ perceptrons.

Diagram of a dense layer in an attention block. The input is of the same size as $d$ and $H$ is the size of the dense layer.

In case of transformers, the value of the $j$th perceptron is:

$$\sum_{i=1}^d ReLu(w_{ij} \cdot \alpha_i + b_j)$$

$$ReLu(x) = \max(0, x)$$

Where $w$ and $b$ are the weight and bias terms of the perceptrons.
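
The same kind of sketch for the dense layer with ReLU described above; the sizes and random weights are again made up for illustration.

import numpy as np

def dense_relu(a, W, b):
    # One dense layer with ReLU activation: max(0, aW + b).
    return np.maximum(0.0, a @ W + b)

rng = np.random.default_rng(0)
a = rng.normal(size=(4, 8))    # 4 words, embedding size d = 8
W = rng.normal(size=(8, 32))   # H = 32 perceptrons
b = np.zeros(32)
print(dense_relu(a, W, b).shape)  # (4, 32)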

Positional encoding

As the transformer blocks are not stateful, they are not aware of the order in which the words come. Yet this is obviously important for language understanding, so we need a way to encode the position of the words inside the model, prior to feeding them to the transformer blocks. For this, we can encode the position of each word in the input of size $n$, and add the encoding values to the embeddings. The encoding may be done with a custom function mapping each word position $1,\dotsc, n$ to continuous values, or with embeddings of the same dimension as the word embeddings, to facilitate the broadcast operation during the sum.
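
As one example of such a custom function, here is a sketch of the sinusoidal positional encoding from the original "Attention Is All You Need" paper, shown purely as one possible choice rather than necessarily what the author has in mind:

import numpy as np

def sinusoidal_positional_encoding(n, d):
    # One position vector of dimension d per input position 0..n-1.
    positions = np.arange(n)[:, None]            # shape (n, 1)
    dims = np.arange(d)[None, :]                 # shape (1, d)
    angles = positions / np.power(10000, (2 * (dims // 2)) / d)
    encoding = np.zeros((n, d))
    encoding[:, 0::2] = np.sin(angles[:, 0::2])  # even dimensions
    encoding[:, 1::2] = np.cos(angles[:, 1::2])  # odd dimensions
    return encoding

# Added to the word embeddings before the first transformer block.
embeddings = np.random.default_rng(0).normal(size=(10, 16))
embeddings = embeddings + sinusoidal_positional_encoding(10, 16)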

Conclusion

Transformers have revolutionized Natural Language Processing by reducing the time required to train the models compared to Recurrent Neural Networks. Because they are not sequential, attention-based models can process words in parallel, which is a major speed improvement. This allows the model to scale better, both in terms of parameters and dataset size. This is also useful for interpretability, because attention weights allow one to easily understand which parts of the input contributed the most to the predictions.


  1. As opposed to batch normalization where the mean and variance are computed for the whole batch. ↩︎

Monday, 02 November 2020

Recording a Public Money! Public Code? video translation

A Dutch translation of the Public Money? Public Code! campaign video is in the works and close to being released. The video was initially released in English and has been translated into many languages already: German, French, Italian, Polish and Russian. And there is an even greater number of subtitles available. Getting a voice-over translation for the video was one of this year’s goals for the Netherlands local group, to help us advocate for this cause. Getting a voice-over translation can be much more involved than a textual translation, so that’s why I want to explain how we did it. And by showing others the way, hopefully there will be more audio translations in the future.

Getting quality

What makes a good voice-over translation? It should be clearly spoken, be comfortable to listen to, be a correct translation, have a timing that matches the sound effects and visuals, have a varying tone that matches the message, and keep a rhythm to it to hold the attention. As you can tell, there are many factors that have to be balanced, requiring an iterative process. A good translation has to be adjusted if it doesn’t work for the required timing, and the best way to check the timing is by rendering the complete video with sound effects. And so one has to be able to adjust parameters on the fly. Especially because arranging a voice actor and recording setup can be difficult and costly, you should be able to record it in about 5 to 10 takes. So you need a good preparation and the flexibility to make adjustments.

Process overview

Let me sum up the approach we took in the Netherlands:

  1. Subtitle translation: Translating the English subtitles into Dutch. Working with these .srt subtitle files has the benefit of having a timing attached to them. You’ll see the benefit of that in a minute.
  2. Adjusting translations for a voice-over: Speaking the translated subtitles to the video to get a feel for the timing. Focusing on long sentences especially. The ones where you need to speed up. Those should be shortened to enable silences and a slower pace for clear pronunciation.
  3. Record a demo to validate: Just like a singer, we recorded a ‘demo’. We put the modified subtitle translation file in a subtitle editor to have a consistent timing (more on that later) and recorded a voice track. No fancy equipment, just a phone headset microphone and Audacity recording software. There were still some misspoken words and false timings in it, but it was good enough. This demo allowed us to validate the translation in the team, to be certain we were ready for a recording. We also used it to show the voice actor what to expect.
  4. Arranging the recording: We contacted a befriended couple for the recording. She has a quality voice, he has the technical knowledge and equipment for the recording. We had backup plans like renting equipment, reaching out to a local broadcasting station, or getting a professional to do it for us.
  5. The recording: This was the most critical, but also the most fun part of the process. Seeing the preparation pay off and getting a quality recording. More about the technical side of the recording further down in this article.
  6. Mixing: As we used two types of microphones for a stereo effect, they had to be balanced and mixed to get a nice sound. This was mostly done during the process of the recording. Also a gate and compressor were applied to reduce the noise during silences but keep a constant volume.
  7. Editing: Despite having a practical auto-cue from the subtitles file, it took a couple of hours of editing to get the timing right. I used the English recording, the sound effects track, and the video to check the timing. Mostly I just had to move sentences one or two seconds in the timing. But some parts required breaking down sentences to leave more space between words, to reduce the pace of the rhythm. Although the largest part of the recording was from the last take, some parts had to be ‘repaired’ with pieces of earlier takes.
  8. Mastering: The PMPC video has a strong sound effects track. This does require the voice to cut through that for the audience to hear it. I had to apply more compression on the voice to further increase the volume, and had to EQ the female voice with a recommended boost at 400Hz and 4kHz to make it stand out more. Now both tracks could be combined into a single one to be added to the video.
  9. Release: Adding the audio to the video to actually publish it.

In this case I was involved in the recording, mixing, editing and mastering. A professional would probably do the mixing as well as the recording, but I’m not sure about the editing and mastering. Please look into this when you want to do it.

Autocue

Early on I realized that reading translations from a paper wouldn’t cut it. Timing has to be correct, even though you can make corrections in the editing process. Having a timed text will help you keep the correct pace and eases the editing process.

First I tried reading from subtitles. Although that contains the timing, each time the subtitles pop up, you are surprised by the content and have to build a sentence. There is no way to view the next line of the translations, so you have to stop and pause until the next line shows up. This leads to a stop-and-go recording with bad rhythm.

As an alternative I looked into autocue software and apps, but couldn’t find any that fit my needs. Most were made for speeches, where there is no requirement on timing; they would just scroll at a certain number of words per minute. But this use-case required exact timing.

Then I found subtitle editors. Most have a preview where you can see an overview of the lines next to the video. That worked quite well. The best one I found was Subtitle Composer from the KDE suite of applications. Subtitle Composer has one major feature for this use-case: an auto-scrolling waveform.

You load Subtitle Composer with the translation and the PMPC video file and can just press play. The subtitles appear on the video, but also on the side at the scrolling waveform. The scrolling waveform has the benefit of showing a bar indicating the current time, passing through the boxed-off subtitles. This gives you a feel for whether you are leading or lagging, and for how much time is reserved for a sentence. It works similarly to the interface of games like Dance Dance Revolution or Guitar Hero, which solve the issue of timing in the same way.

Thinking about it now, I could also have looked into karaoke solutions, because timing is critical there too. I’m not sure whether they provide a similar option to look ahead to upcoming lines of text.

I made two adjustments to the settings of Subtitle Composer to enhance the experience of the auto-scrolling waveform:

  • Auto Scroll Padding: set to the maximum to prevent the waveform from jumping per page, which would cause the voice actor to lose their place. With the maximum padding it scrolls continuously.
  • Subtitle text font size: The normal font size is quite small, so I increased it to improve readability. Note that the waveform switches to a horizontal layout when its pane is stretched to more than about half the window size, and in that form it becomes unusable for this purpose, so the waveform is limited to roughly half the screen. I found that 14pt was the largest size I could use before words ended up outside the waveform pane.

Subtitle Composer is designed to make changes to the subtitles. Use that feature if you find that the current translation isn’t working in practice. For the Dutch translation we still had a typo, some commas that were confusing the voice actress, and she preferred to change the order of some sentences. We changed these bits immediately when we found them, so they were correct in the next take. This allowed us to iterate quickly. Because of these modifications the last recording was used as the base, as it has the final version of the text.

Recording

Sound proofing

As any engineer will tell you: garbage in, garbage out. If you start with low quality, don’t expect it to end up much better at the end. Let’s start with acoustics. We recorded in the study at my friends’ place. The room is box-shaped, filled with two desks and a few closets. It has laminate flooring and a huge window pane off to the side, so plenty of surfaces to reflect sound and little to disperse or dampen it. We did some sound-proofing:

  • Hung a blanket at the wall behind the voice actress
  • Closed the curtains in front of the window pane
  • Used sheets of sound dampening foam to build a box around the microphone with an opening to the voice actress

We did this with stuff that was already available in the house and it made a huge difference to the audio quality. It reduced the echo in the room and blocked out noise from us moving on our chairs and from spinning computer fans.

Microphones

Perhaps we over-engineered this part a bit. We used a Shure SM58 as the main voice microphone, combined with a matched pair of RØDE M5 microphones to pick up the stereo effect of certain vowels. This all went into an M-Audio interface connected to the recording computer. We used the non-free Reaper software on Windows, as my friend was used to it and had everything configured and ready to go. I guess we could just as well have used Ardour, which I used for the editing and mastering. Perhaps something for a future recording. (I ended up with the WAV files of the recordings and the mixed recordings, so I could always recreate it if needed.)

The Shure SM58 has a built-in pop filter, to reduce the amount of air blowing into the microphone when consonants like P, B, S, T and F are spoken. This blast of air creates a high-volume rumbling which is hard to remove in post-processing. The words ‘ProPrieTary SoFTware’ are really in need of a pop filter. In hindsight it would have been better to mount an additional pop filter on the microphone, to further reduce the pops. I still consider the end result perfectly usable, but would encourage you to take care of this if you arrange the equipment yourself.

We recorded at 48,000 Hz, matching the audio in the video, and at 24 bits to keep plenty of detail.
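
If you want to verify that the recorded WAV files actually match those settings, ffprobe (which comes with ffmpeg, used later for the final muxing anyway) can report the sample rate and bit depth. The file name here is just a placeholder:

ffprobe -hide_banner recording-take1.wav

The audio stream line in the output should then show something like pcm_s24le and 48000 Hz.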

Keeping notes

Another tip is to keep notes during the process. If you notice a word being mispronounced, hear an unintended noise, or notice a wrong intonation, just write it down. I printed the text before the recording and kept it at hand during the recording, editing and mastering. As you can see, I used it quite a bit.

Reviewing

During the recording we had the video and the sound effects track ready to go, to verify the timing. Granted, a lot could be done afterwards ‘in post’, but it is nice to be sure you have everything you need before breaking down the studio. Perhaps there was an option to synchronize the video with the recording software, but we just pressed the play buttons of audio and video at the same time. I’d like to think the intermediary review helped the voice actress to better understand the meaning and timing of the words, leading to a better result.

Editing and mastering

I used to use Audacity for editing recordings. But in this case I had to work with multiple tracks and add some effects, and I prefer to do that in a non-destructive way so I have more flexibility while editing. As far as I know Audacity cannot do this, so it was a nice opportunity for me to get familiar with Ardour. I had some initial problems running Ardour, because it didn’t have the permissions to set realtime priorities for JACK. On Debian these permissions can be granted during installation or afterwards, as described in the Debian JACK documentation. I was surprised that Ardour was actually more performant than Audacity on my computer whilst packing more features.
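
For reference, my understanding of the Debian JACK documentation is that granting those permissions afterwards boils down to something like the two commands below; treat this as a sketch and check the documentation for your own setup:

sudo dpkg-reconfigure -p high jackd2   # answer ‘yes’ when asked to enable realtime priorities
sudo adduser $USER audio               # log in again for the group change to take effect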

I used four tracks for editing the recording:

  1. The final recording, used as the base of the sound track
  2. A track containing parts of other recordings to ‘repair’ certain parts
  3. The English audio track from the video, including sound effects, to compare timing of text
  4. The sound effects track to check timing and for mastering into a single file

The waveform of the English track helped me to pinpoint where certain parts of the audio had to start.

As you can see, some sentences were really cut into pieces to increase the duration of silences between words. These are the small details that make it a more pleasant listen.

Besides fixing timing and repairing text, I also cut out some noises like deep inhaling or throat clearing in between sentences.

Pay attention to the end marker in Ardour, as that will determine the length of the exported audio. I set that to the length of the sound effects track.

For mastering I added an equalizer to boost 400 Hz and 4 kHz and used the VocalLeveller mode of the compressor to boost the volume. The sound effects track was mastered up to zero dB, hitting that level with the typing sound at the end of the video. The English voice also seemed to be mastered up to zero dB, so I did the same.
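
If you prefer to experiment on the command line, the same idea can be roughly approximated with ffmpeg’s equalizer and acompressor filters. The gain and ratio values below are made up for illustration, and this is not what I actually ran; the real mastering was done with plugins inside Ardour:

ffmpeg -i voice.wav -af "equalizer=f=400:t=q:w=1:g=3,equalizer=f=4000:t=q:w=1:g=3,acompressor=ratio=4:makeup=2" -c:a pcm_s24le voice-mastered.wav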

Release

The mastering resulted in a single .wav file to be added in the video. It already had the right length, as the end marker was set to the length of the sound effects track.

I initially added the sound to the video using OpenShot. Although that worked, it resulted in a stuttering video that wasn’t pleasant to watch. Perhaps it had something to do with my process or setup. Anyhow, I ended up choosing a different solution: using the power of ffmpeg to replace the audio while keeping the video as is. This was also a lot quicker. I used the instructions from this helpful blogpost. This resulted in the following command, taking ‘pmpc_desktop.mp4’ as the video and ‘pmpc-nl-mastered.wav’ as the audio, and producing ‘pmpc_desktop_nl.mp4’:

ffmpeg -i pmpc_desktop.mp4 -i pmpc-nl-mastered.wav -map 0:0 -map 1:0 -shortest -c:v copy -c:a aac -b:a 256k pmpc_desktop_nl.mp4
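
In this command, -map 0:0 selects the first stream of the first input (the video), -map 1:0 selects the first stream of the second input (the new voice track), -c:v copy keeps the video untouched without re-encoding, -c:a aac -b:a 256k encodes the audio to AAC at 256 kbit/s, and -shortest ends the output when the shortest input ends.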

Considering that all mastered tracks of the video are kept stored at the FSFE, the core team probably also has a method to add the audio.

Final words

I would like to thank my friends for their help. The video is close to being released. Just a few checks and we should be able to publish it.

I enjoyed the process of making the video and the final result. It took more time than I originally anticipated, mostly because I had to work out how to do it. That’s why I wrote this blogpost: to encourage you to do it too and to save you time by suggesting a methodology. In the process I learned some new skills and got to use some free software that was new to me.

We will continue the Public Money? Public Code! campaign in the Netherlands and the video will help us. And as a reminder, if you haven’t already, please sign the open letter if you agree with its content.

Tuesday, 20 October 2020

Make sure KDE software is usable in your language, join KDE translations!

Translations are a vital part of software. More technical people often overlook them because they understand English well enough to use the software untranslated, but only 15% of the world understands English, so it's clear we need good translations to make our software more useful to the rest of the world.

Translations are a place that [almost] always needs help, so I would encourage you to contact me (aacid@kde.org) if you are interested in helping.

Sadly, some of our teams are not very active, so you may find yourself alone. It can be a bit daunting at the beginning, but the rest of us in kde-i18n-doc will help you along the way :)

This is a list of teams sorted by how many translation commits have happened in the last year. More commits doesn't mean better, and even teams with lots of commits will probably welcome help, maybe not in pure translation but in reviewing. You can also check the statistics at https://l10n.kde.org/stats/gui/trunk-kf5/team/

More than 250 commits


Azerbaijani
Basque
Brazilian Portuguese
Catalan
Estonian
French
Interlingua
Lithuanian
Dutch
Portuguese
Russian
Slovak
Slovenian
Swedish
Ukrainian

Between 100 and 250 commits


German
Greek
Italian
Norwegian Nynorsk
Spanish

Between 50 and 100 commits


Asturian
Catalan (Valencian)
Czech
Finnish
Hungarian
Indonesian
Korean
Norwegian Bokmal
Polish
Vietnamese
Chinese Traditional

Between 10 and 50 commits


British English
Danish
Galician
Hindi
Icelandic
Japanese
Malayalam
Northern Sami
Panjabi/Punjabi
Romanian
Tajik
Chinese Simplified

Between 0 and 10 commits


Albanian
Belarusian
Latvian
Serbian
Telugu
Turkish

No commits


Afrikaans
Arabic
Armenian
Assamese
Bengali
Bosnian
Bulgarian
Chhattisgarhi
Crimean Tatar
Croatian
Esperanto
Farsi
Frisian
Georgian
Gujarati
Hausa
Hebrew
Irish Gaelic
Kannada
Kashubian
Kazakh
Khmer
Kinyarwanda
Kurdish
Lao
Low Saxon
Luxembourgish
Macedonian
Maithili
Malay
Maltese
Marathi
Nepali
Northern Sotho
Occitan
Oriya
Pashto
Scottish Gaelic
Sinhala
Tamil
Tatar
Thai
Tswana
Upper Sorbian
Uyghur
Uzbek
Venda
Walloon
Welsh
Xhosa

P.S.: Please don't mention web-based translation workflows in the comments to this blog, it's not the place to discuss that.

Monday, 19 October 2020

Akademy-es call for papers expanded to October 27

This year Akademy-es is a bit special since it is happening on the Internet, so you don't need to travel to Spain to participate.

If you're interested in giving a talk please visit https://www.kde-espana.org/akademy-es-2020/presenta-tu-charla for more info.

Tuesday, 13 October 2020

The stories we tell each other

I have recently been working on a conversion document that adapts Dungeons & Dragons’ Eberron campaign setting to the Savage Worlds system. I’m not a game designer and I’m not a particularly prolific writer, so this was a bit of a challenge for me. One of the most challenging things to pull off was converting the races. Through writing the document, I think I developed a deeper understanding for the racism inherent to fantasy fiction. And I’d like to talk about that.

Two disclaimers to start off:

  1. Real-world racism is incomparably worse than fictitious racism against fictitious beings.
  2. Fantasy fiction unfortunately uses the word “race” when describing species. I’ll be using both terms interchangeably.

The human problem

Before the rest of this article can make sense, I have to establish that games are generally predicated on some kind of balance. If I choose a dwarf character and you choose an elf character, the advantages and disadvantages we gain from our choices should balance out, and neither of us should be strictly better than the other. This can be difficult to precisely quantify, but it’s possible to at least be reasonably balanced. This is fine, because games require some sort of balance to remain fun for everyone.

Of course, there’s a problem, and it’s humans. What are humans if not strictly worse elves? Elves get to see in the dark, have keen hearing, amazing agility, meditate instead of sleeping, and—in D&D—have resistance against certain types of magic. What do humans get to counterbalance that? From a narrative perspective: squat.

From a mechanical perspective, game designers often give humans a flat increase to everything, or allow the player to assign some extra points to things of their own choice. The thinking goes that this evens the balancing scales—which is fine, because it’s a game after all. The narrative justification is that humans are exceptionally “adaptive” or “ambitious”.

It’s a lazy justification, but because we might like some explanation, maybe it will do, especially considering that there are worse offenders.

Conflation of culture and species

All elves in D&D receive proficiency in swords and bows. The thinking goes that these weapons hold some significant status in elven society, and therefore it is reasonable to assume that all elves raised in such a society would have received training in these weapons. This is wrong for so many reasons:

  • It assumes that there is such a thing as an elven society and culture.
  • It assumes that that culture values proficiency in swords and bows.
  • It assumes that the player character grew up in this culture.
  • It assumes that the player character actually took classes in these weapons.

If any of the above assumptions are false, then tough luck—you’re going to be good at those weapons regardless, whether or not you want to, as if elves are born with innate knowledge of swinging swords and loosing arrows.

Moreover, if you’re a human living in this elven culture—in spite of it making no narrative sense—you do not get automatic proficiency in these weapons.

Not by the rules, anyway. Put a pin in that.

The bell curve

If we can’t or shouldn’t conflate culture and species, then perhaps we should just stick to biology. This seems promising at first. It doesn’t seem so weird that elves might be a species with better eyesight and hearing—after all, dogs are a species with a veritably better sense of smell. And maybe it’s not so weird that elves can get their night’s rest through meditation instead of a human sleeping pattern.

The most difficult biological trait to justify would be the elves’ heightened dexterity. The stereotype of elves is that they are these highly agile beings with a certain finesse. And before elaborating on that, I think it’s good to stop for a moment to appreciate that this is aesthetically cool. It’s pleasing to imagine cool people doing cool things.

Having stopped to appreciate the aesthetics, we can move on to question them. Are all elves dexterous? Judging by the rules—yes, all elves get a bonus to dexterity. But narratively, surely, this can’t be true. Maybe an elf was born with a physical disability, or acquired such a disability later in life. Maybe they simply don’t get any exercise in. It does not require a lot of imagination to come up with counter-examples. But like with the weapon proficiency, an elf is going to get their bonus to dexterity whether or not they want it. Hold onto that pin.

Let’s loop back to the other biological traits. If it was so easy to find a counter-example to dexterity, maybe it’s possible to do the same for the other traits. To counter improved eyesight, maybe the elf is myopic and requires glasses to see. To counter improved hearing, maybe the elf has some sort of hearing disability. Countering their meditative sleep is possibly the hardest, but it’s not too far-fetched to imagine an excitable elf with an immensely short attention span who never quite got into the whole meditation thing. This elf might still technically biologically be capable of meditative sleep, but if they’ve never done it before, it’s a distinction without a difference.

If we now put all those pieces together, someone might play an elf who requires glasses to see, and gladly uses those glasses to read every magical tome they can find. In their advanced age, they have stopped exercising, and have slowly become hard of hearing. Instead of meditating, they often fall asleep on top of their books, long after the appropriate time to go to bed.

It’s worth noting that the above elf is not an especially exceptional character. They’re an old wizard with glasses who has no time for doing anything other than reading. Nevertheless, this elf gets all the bonuses that run contrary to their concept. And that just can’t be right.

The rationale for the racial bonuses of elves being as they are often comes down to the bell curve. The thinking goes that those bonuses represent the median elf at the centre of the statistical bell curve of the distribution of those traits. If you take the median elf and the median human, the elf will simply be more dexterous as a result of their natural physiology. And if you take the lowest 5th percentile of elves and compare them to the lowest 5th percentile of humans, those elves would still be more dexterous.

This of course completely ignores that the player character can be anywhere on the bell curve. The low-dexterity elf wizard from above could have a highly dexterous human companion. As a matter of fact, that human companion could even be more dexterous than the most dexterous elf. This would be statistically unlikely if we assume that the bell curve is real, but odds have never stopped fantasy heroes.


A professionally drawn bell curve of the distribution of dexterity in humans and elves.

Note that the median (most common) elf is more dexterous than the median human, but that there exist humans on the right side of the curve that are more dexterous than elves on the left side of the curve.

In conclusion to this section: even if traits in fantasy races are distributed along these bell curves, there would still be heaps of exceptions, and the system should support that.

Additionally, I’d like to put a differently coloured pin in the very concept of bell curves. I’ll get back to that later.

Race is real

So far, the problems posed in this article have been fictitious and trivial in nature. It’s past time to get to the point.

Orcs are the traditional fantasy bogeyman. They’re a species that are characterised by their superlatively barbarous and savage state, brutish virility to the point of bestiality, low intelligence, and dark brown-grey skin tones.

Unless you have been living under a rock, you may notice a blatant parallel to the real world. The above paragraph could—verbatim—be a racist description of real-world black people. And indeed, part of that paragraph was paraphrased from Types of Mankind (1854), and another part was paraphrased from White on Black: images of Africa and Blacks in Western popular culture (1992).

And, you know, that’s bad. Like really, really bad. And it only gets worse. Because unlike in the real world, that characterisation of orcs is real.

In the default campaign setting of Dungeons & Dragons, orcs really are barbarous monsters that roam the lands to plague civilisation as raiders and pillagers. Quoting from Volo’s Guide to Monsters (2016):

[Orcs are] savage and fearless, [and] are ever in search of elves, dwarves, and humans to destroy. […] Orcs survive through savagery and force of numbers. Orcs aren’t interested in treaties, trade negotiations or diplomacy. They care only for satisfying their insatiable desire for battle, to smash their foes and appease their gods.

[…]

[The orcs are led] on a mission of ceaseless slaughter, fuelled by an unending rage that seeks to lay waste to the civilised world and revel in its anguish. […] Orcs are naturally chaotic and [disorganised], acting on their emotions and instincts rather than out of reason and logic.

[…]

In order to replenish the casualties of their endless warring, orcs breed prodigiously (and they aren’t choosy about what they breed with, which is why such creatures as half-orcs and ogrillons are found in the world). Females that are about to give birth are […] taken to the lair’s whelping pens.

Orcs don’t take mates, and no pair-bonding occurs in a tribe other than at the moment when coupling takes place.

It is difficult to overstate how absolutely “savage” and evil these orcs are, as depicted. The above quotation outright states that orcs eschew civilisation in favour of war and destruction, and heavily implies that orcs know no love and leave human women pregnant with half-orcs in their wake. Note also the use of the words “females”, “whelping pens”, and “mates”, as though orcs are nothing short of beasts.

The rationale for this sheer evil is that orcs are under the influence of their creator Gruumsh, an evil god of destruction. “Even those orcs who turn away from his worship can’t fully escape his influence”, says the Player’s Handbook (2014).

If things couldn’t get any worse, they do, because the above is compounded by D&D’s alignment system, which is a system of absolute deterministic morality of good-and-evil that can be measured by magic. A character in D&D can be—by their very essence—veritably evil. In this case, the entire orc species is evil owing to the influence of this evil god.

Cutting to the chase, this effectively means that it is morally justifiable for people to kill orcs on sight. They are evil, after all, without a shred of a doubt.

Worse still, this means that Dungeons & Dragons has effectively created a world in which the most wicked theories of racists are actually true:

  • “Race is real”, meaning that there are measurable differences in physiology between the races. This is represented by the game’s racial traits, which I earlier demonstrated don’t make a whole heap of sense.
  • Some races are evil and/or inferior. This is represented by the utter evil of orcs, and their inferior intelligence in the game’s racial traits.

This is where it might be expedient to take a look at that differently coloured pin with regard to the bell curve. The Bell Curve, as it happens, is a 1994 book written by two racists that states that intelligence is heritable through genes, and that the median intelligence of black people is lower than the median intelligence of white people by mere virtue of those genes. This claim is wrong in the real world, but it appears to be true in the fantasy world.

Now, if one could play pretend in any world, I think I’d like to play pretend in a world in which the racists are wrong. But that’s not the world of Dungeons & Dragons.

Redemption actually makes things worse

This section is a small tangent. Earlier I mentioned that the entire orc species is evil, thereby morally justifying killing them on sight. There is no real-world analogy for this—there exists no species on Earth whose sole purpose is the destruction of humans. But if such a species did hypothetically exist, driving it to extinction could realistically be justified as an act of self-defence, lest that species succeed in its goal of wiping out humans.

It’s a questionable thing to focus one’s story on, but at least adventurers in Dungeons & Dragons can rest easy after clearing an entire cave of orcs.

There’s just one small problem: orcs can be good, actually. In the fantasy world, it is possible for an orc to free themself of the corrupting influence of Gruumsh, and become “one of the good ones”. If even a single orc is capable of attaining moral good, this means that moral determinism is false. The rules are wrong and they can be broken. Therefore, every orc is potentially capable of attaining moral good. This twist just turned an uncomplicated story of a fight against objective evil into a story of the justified genocide of slaves who are forced to fight for their master.

And, you know, that’s a lot to take in. And it’s built right into the game’s premise, and it didn’t take a lot of thinking to come to this utterly disturbing conclusion. Heroes are supposed to slay orcs without giving it much thought, but burdened with this knowledge, is that a morally justifiable thing to do at all?

More importantly, is this a story we want to be telling?

Missing the point

I’m not the first person to point out Dungeons & Dragons’ problem with race and racism, especially as it pertains to orcs and drow (dark elves). Recently, the people behind the game have begun to take some small steps towards solving these fundamental issues, and that’s a good thing. But a lot of people disagree.

I’ve read far too many criticisms of these steps in the writing of this article. Altogether, I think that their arguments can be boiled down to these points:

  • Orcs are not analogous to real-world black people. They’re not even people. Orcs are orcs are orcs.
  • As a matter of fact, if you think that orcs are analogous to real-world black people, you are the racist for seeing the similarities.
  • I really just want monsters to slay in my game. Why are you overthinking this?

I feel that these arguments monumentally miss the point, and countering them head-on would be a good way to waste your time. One could pussyfoot about and argue about the first two points, and although I vehemently disagree with these conclusions, it really wouldn’t matter if one conceded these points. The third point isn’t even an argument—it’s the equivalent of throwing a tantrum because other people are discussing something you don’t like.

The reason the arguments miss the point is that the point of contention is not whether orcs specifically are “bad”. Rather, the point of contention is that they—and other races like them—exist at all in the way that they do, because they tell a story of justified racism. That is to say: orcs would be bad even if they didn’t mirror real-world depictions of black people so closely. The fact that they do just makes the story of justified racism worse.

Given a choice between anything at all, why choose racism?

When we play pretend, we could imagine any world at all. The only limit is our own imagination, and this is—crudely put—really cool. And when we play pretend, we tell each other stories. And again, we could be telling any story at all. Storytelling being as it is, we will require some conflict to drive the story forward, and we can do this through antagonists—the “bad guys”. Now, not all stories actually require conflict, but I’m going to let that be.

And here’s the point: a story in which the antagonists—the bad guys—are a race of sapient human-like people that are inherently evil through moral determinism is a shitty story.

Phrased differently, the story of “we must kill these people of a different race because that race is inherently evil” is a bad story that is far too close for comfort to real-world stories of racism and genocide. That story is especially bad because—within the context of the story—the protagonists are completely justified in their racial hatred and violence. This is in stark contrast to the real world, where racism is always completely and utterly unjustifiable.

Phrased differently again: given a choice to tell any sort of story whatsoever, why choose to tell a racist story?

Systemic racism

I think we’re past the central thesis of this article, but I want to try to actually answer the above question—why choose racism? The lazy answer would be to suppose that the players of the game are racists—wittingly or not. But I’m not very satisfied by that answer.

In an attempt to answer this question, I want to return to the first pin regarding playing by the rules. As a light refresher: We were creating an elf wizard, but none of the racial bonuses were suitable for our elf. The rules forced us to play a certain way, even if that way didn’t make sense for our character.

But that’s a lie, of course. Nobody is forcing us to play a certain way. We could just discard the book; ignore the system and go our own way. We can tweak away to our heart’s content.

But before we do that, I want to emphasise how unlikely it was that we found a flaw in the system in the first place. For most people, when they create a new character, it’s like going to an ice cream parlour. There’s a large selection of flavours available, and you simply pick and choose from the things that appeal to you. By the end, you leave the shop and have your ice cream—close the book and enjoy your character. You may add some additional custom flair, but that is usually after you have already chosen the foundation for your character.

For our elf wizard, this process went differently—atypically. Instead of choosing from a list of available options, we created a character in a free-form manner. Then when it was time to open the book, we found that the options did not support our concept. I want to emphasise here also two additional things: we may not have noticed the discrepancy between our concept and the rules in the first place, and simply gone ahead; or we may have noticed the discrepancy and thereafter discarded the concept in favour of something else that might work.

Regardless, having come so far, it’s time to begin the tweaking. There’s just a small problem… Nobody at the table is a game designer or has ever balanced a race before. Furthermore, the rules don’t exactly give robust hints on how to go about doing this. And if we’re discarding all of the elf racial traits, why are we an elf again? Why is nobody else tweaking their character’s race? Everyone else was perfectly capable of creating their characters within the constraints set by the rules, so why aren’t we? Is it such a big deal that the number next to Dexterity on our character sheet is a little higher? Can’t we simply ignore the additional weapon proficiency? If we never pick up a bow, it will be as if that proficiency was never there.

That is to say: breaking the rules is hard. There’s a heavy inertia to overcome, and that inertia can stop creativity dead in its tracks.

In summary, any of the following things can stop a person from creating their character outside of the rules:

  • They simply stick to the options provided by the book.
  • They come up with a concept of their own, and just pick from the options provided by the book afterwards.
  • They come up with a concept of their own, find that it is not supported by the book, and discard the concept.
  • They come up with a concept of their own, and—due to peer pressure or fear of being the odd one out—do not want to pursue creating custom mechanics.
  • They come up with a concept of their own, find that the books do not give them any guidance on creating custom mechanics, and give up there.
  • They come up with a concept of their own, successfully create or tweak the mechanics to suit that concept, but the GM does not allow this type of customisation.

One can only conclude that the rulebook—the system—enables certain outcomes much more than others. Even if you encounter a problem with the way that Dungeons & Dragons handles race, the odds of doing anything about it are very much stacked against you.

Involuntary racism

So why choose racism? Because the system has chosen for you. The system all-but-assures that the players will buy into its racism. In this system, all elves are dexterous, all humans are adaptive and ambitious, and all orcs are big and strong. There’s no choice in the system, and any choice to the contrary has to be outside of the system, for which the rulebook offers little to no guidance.

This is further compounded by the default campaign setting, Forgotten Realms, that creates a world in which orcs are unambiguously evil—barbarous savages reminiscent of the worst racist depictions of real-world peoples. It systematically enables a story of justified genocide against a people—a story that might as well be a wet dream for this world’s racists.

And, you know, that sucks.

Creating a better system

I want to end this article with my personal solution. I like fantasy, even though I spent the last however-many words comparing it to racism of the highest order, and I would like to enjoy it without its worst aspects.

A better system has heaps of requirements, but I think it boils down to the following two things:

  • The campaign setting mustn’t enable justified racism, and must be playable without racism entirely.
  • Players must be able to easily decouple mechanics from races.

For the campaign setting, I chose Eberron. I’m not sure if the Forgotten Realms are salvageable. Perhaps Gruumsh could be slain and all orcs could be freed, but there would still be a lot of other racisms that need solving in that campaign setting.

Eberron, on the other hand, is a lot more redeemable. The world is divided into rival nations, and the world’s races are more-or-less evenly distributed throughout the nations, creating a cosmopolitan feel. Moreover, there are no deterministically evil peoples in the world—Eberron’s original druids were orcs, and orcs can be as good or as evil as any other person. Even more importantly, culture and race in Eberron can be completely decoupled. An elf from the main continent is generally of the local nation’s culture, and an elf from the “elven continent” will generally be of one of the two local cultures, and this racial-cultural fluidity is explicitly called out in the campaign setting’s books.

Of course, there are some less likeable aspects of the campaign setting. There exist people with heritable magical tattoos, effectively making them an objectively superior breed. There’s also the fact that the “elven continent” exists at all, when it could instead be mixed-race like the rest of the world (although the racism on this continent is called out as being bad in Exploring Eberron (2020)). There is also racism against the world’s robot people and shape changers, which may not be a theme you want to play around with. But by and large, that’s it, and it’s a huge improvement over other settings.

For the mechanics, I ditched Dungeons & Dragons. Savage Worlds is a system that—unlike Dungeons & Dragons—truly gives you the tools to tweak the system if something is not to your liking. It has an entire section on modifying and creating races, and the rulebook is littered with reminders that you can change things to fit your game, and suggestions on how to do that.

Of course, Savage Worlds is not perfect. Its name is a little ‘eh’, its first campaign setting imagines a world in which the Confederate States of America seceded, and it has this extremely annoying Outsider feature that makes no sense whatsoever. Moreover, for our purposes, it does not explicitly tell the player that they can freely adjust their character’s racial features, but it does give the player the tools to do so, so I guess that’s good enough. Perfect is the enemy of good, and Savage Worlds’ flaws are trivially easy to work around.

Just one problem remains: this imaginary world still holds on to the bell curve. It still imagines a world in which the racists are sort of right—where elves are more dexterous and orcs are taller and stronger. And although the player characters are no longer bound by the bell curve, it still feels a little wrong.

And in truth, I have no solution for this whatsoever. If we want to play in a world where the racists are wrong, then maybe we shouldn’t tell a story in which their central theory of race holds true. It’s completely possible to tell a fantasy story composed of just humans, after all.

But I also feel that we would be losing something if we simply ditched the fantasy concept of race. Earlier in this article, I stopped to appreciate that dexterous elves are cool. And I think that appreciation bears repeating—not just for elves, but for all the fantasy races. When I play an orc, maybe I want to lean into the really cool concept of being inhumanly strong—or break with that stereotype to explore what it means to be weak in a society where everybody can effortlessly lift a hundred kilos.

After all, it’s about the stories we tell each other. And doesn’t a world in which radically different peoples live together and work to oppose bad actors make for a beautiful story?
