Thoughts of the FSFE Community

Wednesday, 13 December 2017

FSFE asks to include software into the list of re-usable public sector information

polina's blog | 16:40, Wednesday, 13 December 2017

The Directive on the re-use of public sector information (Directive 2003/98/EC, revised in 2013 by Directive 2013/37/EU – ‘PSI Directive’) establishes a common legal framework for a European market for government-held data (public sector information). It is built around two key pillars of the internal market: transparency and fair competition.

The PSI Directive focuses on the economic aspects of the re-use of information gathered by governments. While it does mention some societal impact of such re-use, its main focus is on contributing to a cross-border European data economy by making re-usable data held by governments accessible for both commercial and non-commercial purposes (i.e. “open data”). The objective of the PSI Directive is not to establish a truly “open government” as such, although it does contribute to that goal by demanding the re-usability of government-held data based on open and machine-readable formats.

For Free Software the PSI Directive is important because it covers the re-use of documents such as texts, databases, audio files and film fragments, yet recital 9 of Directive 2003/98/EC explicitly excludes “computer programmes” from its scope, for no apparent reason.

However, despite this explicit exclusion of software in the PSI Directive recital, EU member states are not precluded from creating their own rules for opening up data held by public bodies and from including “software” in the list of re-usable government-held information. First, the PSI Directive establishes “minimum” requirements for member states to follow when opening up their data; second, the exclusion of computer programmes from the scope of the Directive is enshrined in its non-legislative part, the recitals, which act solely as guidance for the interpretation of the legislative part, the articles.

A recent case in France is a good example of why there is no evident reason for EU member states to exclude software from the list of re-usable and open data held by governments. In particular, France’s “Digital Republic” law, adopted in 2016 (LOI n° 2016-1321 du 7 octobre 2016 pour une République numérique), considers source code a possible administrative document that must be made available in an open standard format that can be easily reused and processed.

Therefore, our response to the PSI Directive public consultation can be summarised as follows:

  • Consider source code owned by a public administration as a ‘document’ within the scope of the Directive.
  • Algorithmic accountability in government decision-making processes is a must for a truly transparent government; therefore, software developed for the public sector and used in delivering tasks of public interest, whether by a publicly owned or a private company, should be made available as Free Software.
  • Free Software is crucial for scientific verification of research results, and it is absolutely necessary to make sure that Open Science policies include the requirement to publish software tools and applications produced during publicly funded research under Free Software licences.
  • No special agreements with private services for delivering tasks of public interest shall ever preclude the re-usability of government-held data by both commercial and non-commercial Free Software. Public bodies shall focus on making data available in open and accessible formats.
  • Sui generis database rights cannot be invoked in order to preclude the re-usability of government-held data.
  • A minimum level of harmonisation of the relationship between Freedom of Information (FoI) laws and the PSI Directive is needed in order to bring the EU closer to a cross-border market for public sector information.

Please find our submission to the public consultation in full here.

Image: CC0

Tuesday, 12 December 2017

Report about the FSFE Community Meeting 2017

English Planet – Dreierlei | 14:09, Tuesday, 12 December 2017

Two weeks ago we had our first general community meeting as an opportunity for all people engaged in the FSFE to come together, share knowledge, grow projects, hack, discuss and get active. An integral part and topic of the meeting was the sharing of knowledge about FSFE-related tools and processes. Find some notes and pictures in this report.

For the first time, we merged our annual German-speaking team meeting this year with the bi-annual coordinators meeting into one bigger meeting for all active people of the FSFE community. Active people in this context means that any member of any team was invited, be it a local or a topical one. Altogether, we met on the weekend of November 25 and 26 at Endocode in Berlin.

An integral part and topic of the meeting was the sharing of knowledge about FSFE-related tools and processes. For this, we had several slots in the agenda in which participants could host a knowledge- or tool-sharing session they were interested in, or one they are an expert in and would like to share their knowledge about. In a next step everyone could mark their interest in the proposed sessions, and based on that we arranged the agenda.

We saw particularly high interest in giving input on the FSFE’s plans to grow its membership, in tips for implementing our Code of Conduct, in strategies to increase diversity, and in introductions to tools offered by the FSFE such as LimeSurvey and Git.

The feedback about the meeting was very positive, in particular about the dynamic agenda and the productive sessions that left participants with the feeling of having got something done. As the organiser, this year’s meeting left me with the good feeling that we not only got something done but will also see further collaboration among participants on several topics as a result of this meeting.

Personally, it makes me happy again and again to be part of such a friendly and accommodating community. A community in which participants respect each other in a natural way and no one tries to overrule others.

The productive feeling and the unique atmosphere already make me look forward to organising the next community meeting in 2018.

Here are some pictures of this year’s event:

Participants of the FSFE community meeting 2017

Session about implementing our Code of Conduct.

Session about updates of our Free Your Android campaign.

Session about diversity.

The blue board shows the number of session proposals (one on each yellow card) during the community meeting.

Breaks are always good for a chat.

One of our lightning talks, by Paul Hänsch.

Lightning talks audience.

Sunday, 10 December 2017

How a single unprivileged app can brick the whole Android system

Daniel's FSFE blog | 19:04, Sunday, 10 December 2017

This article is highly subjective and only states the author’s opinion based on actual observations and “wild” assumptions. Better explanations and corrections are warmly welcome!

Motivation

After updating an app from the F-Droid store (OpenCamera), my Android device was completely unusable. In this state, the only feasible option for a typical end-user (one who does not know how to get into safe mode in order to remove or downgrade the app [5]) to recover the device would have been to wipe data in recovery, losing all data.

How can such a disaster happen? In this article, I argue why I have serious doubts about the memory management approach taken in Android.

The failure

After updating the OpenCamera app to the recently released version 1.42, my Android device ran into a bootloop that was hard to recover from. I was able to repeatedly reproduce the failure on a different device, namely the following:

  • Device: Samsung Galaxy S3 (i9300)
  • ROM: Lineage OS 13 (Android 6.0), freshly built from latest sources, commit 42f4b851c9b2d08709a065c3931f6370fd78b2b0 [1]

Steps to reproduce:

  1. wipe all data and caches
  2. newly configure the device using the first-use wizard
  3. install the F-Droid store
  4. search for “Open Camera”
  5. install Open Camera version 1.42

Expected:

The install completes and the app is available. If the installation fails (for whatever reason), an error message is shown but the device keeps working.

Actual:

The install freezes, the LineageOS splash screen appears and all apps are re-initialized; this happens several times, and after approximately 10-15 minutes the device is back to “working”; when trying to start apps, they crash, or even the launcher (“Trebuchet”) crashes. After rebooting the device, it is stuck in an infinite loop initializing apps.

The fault (what happens under the hood?)

When installing OpenCamera, the following is printed in the log:

12-10 14:48:30.915  4034  5483 I ActivityManager: START u0 {act=org.fdroid.fdroid.installer.DefaultInstaller.action.INSTALL_PACKAGE dat=file:///data/user/0/org.fdroid.fdroid/files/Open Camera-1.42.apk cmp=org.fdroid.fdroid/.installer.DefaultInstallerActivity (has extras)} from uid 10070 on display 0
12-10 14:48:30.915  4034  5483 W ActivityManager: startActivity called from non-Activity context; forcing Intent.FLAG_ACTIVITY_NEW_TASK for: Intent { act=org.fdroid.fdroid.installer.DefaultInstaller.action.INSTALL_PACKAGE dat=file:///data/user/0/org.fdroid.fdroid/files/Open Camera-1.42.apk cmp=org.fdroid.fdroid/.installer.DefaultInstallerActivity (has extras) }
12-10 14:48:30.925  4034  5483 D lights  : set_light_buttons: 2
12-10 14:48:30.955  4034  5649 I ActivityManager: START u0 {act=android.intent.action.INSTALL_PACKAGE dat=file:///data/user/0/org.fdroid.fdroid/files/Open Camera-1.42.apk cmp=com.android.packageinstaller/.PackageInstallerActivity (has extras)} from uid 10070 on display 0
12-10 14:48:31.085  6740  6740 W ResourceType: Failure getting entry for 0x7f0c0001 (t=11 e=1) (error -75)
12-10 14:48:31.700  4034  4093 I ActivityManager: Displayed com.android.packageinstaller/.PackageInstallerActivity: +724ms (total +758ms)
12-10 14:48:36.770  4034  4362 D lights  : set_light_buttons: 1
12-10 14:48:36.840  4034  4938 I ActivityManager: START u0 {dat=file:///data/user/0/org.fdroid.fdroid/files/Open Camera-1.42.apk flg=0x2000000 cmp=com.android.packageinstaller/.InstallAppProgress (has extras)} from uid 10018 on display 0
12-10 14:48:36.850  3499  3895 D audio_hw_primary: select_output_device: AUDIO_DEVICE_OUT_SPEAKER
12-10 14:48:36.955  6863  6874 D DefContainer: Copying /data/user/0/org.fdroid.fdroid/files/Open Camera-1.42.apk to base.apk
12-10 14:48:37.100  4034  4093 I ActivityManager: Displayed com.android.packageinstaller/.InstallAppProgress: +251ms
12-10 14:48:37.155  6740  6753 D OpenGLRenderer: endAllStagingAnimators on 0x486226f0 (RippleDrawable) with handle 0x48604d28
12-10 14:48:37.170  4034  4100 W ResourceType: Failure getting entry for 0x7f0c0001 (t=11 e=1) (error -75)
12-10 14:48:37.465  4034  4100 I PackageManager.DexOptimizer: Running dexopt (dex2oat) on: /data/app/vmdl872450731.tmp/base.apk pkg=net.sourceforge.opencamera isa=arm vmSafeMode=false debuggable=false oatDir = /data/app/vmdl872450731.tmp/oat bootComplete=true
12-10 14:48:37.585  7205  7205 I dex2oat : Starting dex2oat.
12-10 14:48:37.585  7205  7205 E cutils-trace: Error opening trace file: No such file or directory (2)
12-10 14:48:42.405  7205  7205 I dex2oat : dex2oat took 4.815s (threads: 4) arena alloc=5MB java alloc=2023KB native alloc=13MB free=1122KB
12-10 14:48:42.415  4034  4100 D lights  : set_light_buttons: 2
12-10 14:48:42.680  4034  4100 V BackupManagerService: restoreAtInstall pkg=net.sourceforge.opencamera token=3 restoreSet=0
12-10 14:48:42.680  4034  4100 W BackupManagerService: Requested unavailable transport: com.google.android.gms/.backup.BackupTransportService
12-10 14:48:42.680  4034  4100 W BackupManagerService: No transport
12-10 14:48:42.680  4034  4100 V BackupManagerService: Finishing install immediately
12-10 14:48:42.705  4034  4100 W Settings: Setting install_non_market_apps has moved from android.provider.Settings.Global to android.provider.Settings.Secure, returning read-only value.
12-10 14:48:42.705  4034  4100 I art     : Starting a blocking GC Explicit
12-10 14:48:42.805  4034  4100 I art     : Explicit concurrent mark sweep GC freed 52637(2MB) AllocSpace objects, 20(424KB) LOS objects, 33% free, 14MB/21MB, paused 2.239ms total 96.416ms
12-10 14:48:42.835  4034  4363 I InputReader: Reconfiguring input devices.  changes=0x00000010
12-10 14:48:42.935  5420  5420 D CarrierServiceBindHelper: Receive action: android.intent.action.PACKAGE_ADDED
12-10 14:48:42.940  5420  5420 D CarrierServiceBindHelper: mHandler: 3
12-10 14:48:42.940  5420  5420 D CarrierConfigLoader: mHandler: 9 phoneId: 0
12-10 14:48:42.945  4034  4034 F libc    : invalid address or address of corrupt block 0x120 passed to dlfree
12-10 14:48:42.945  4034  4034 F libc    : Fatal signal 11 (SIGSEGV), code 1, fault addr 0xdeadbaad in tid 4034 (system_server)
12-10 14:48:42.950  3496  3496 I DEBUG   : property debug.db.uid not set; NOT waiting for gdb.
12-10 14:48:42.950  3496  3496 I DEBUG   : HINT: adb shell setprop debug.db.uid 100000
12-10 14:48:42.950  3496  3496 I DEBUG   : HINT: adb forward tcp:5039 tcp:5039
12-10 14:48:42.975  3496  3496 F DEBUG   : *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
12-10 14:48:42.975  3496  3496 F DEBUG   : LineageOS Version: '13.0-20171125-UNOFFICIAL-i9300'
12-10 14:48:42.975  3496  3496 F DEBUG   : Build fingerprint: 'samsung/m0xx/m0:4.3/JSS15J/I9300XXUGMJ9:user/release-keys'
12-10 14:48:42.975  3496  3496 F DEBUG   : Revision: '0'
12-10 14:48:42.975  3496  3496 F DEBUG   : ABI: 'arm'
12-10 14:48:42.975  3496  3496 F DEBUG   : pid: 4034, tid: 4034, name: system_server  >>> system_server <<<
12-10 14:48:42.975  3496  3496 F DEBUG   : signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0xdeadbaad
12-10 14:48:43.030  3496  3496 F DEBUG   : Abort message: 'invalid address or address of corrupt block 0x120 passed to dlfree'
12-10 14:48:43.030  3496  3496 F DEBUG   :     r0 00000000  r1 00000000  r2 00000000  r3 00000002
12-10 14:48:43.030  3496  3496 F DEBUG   :     r4 00000120  r5 deadbaad  r6 404e0f38  r7 40005000
12-10 14:48:43.030  3496  3496 F DEBUG   :     r8 00000128  r9 bee01b0c  sl 40358be3  fp 40358bec
12-10 14:48:43.030  3496  3496 F DEBUG   :     ip 404db5d8  sp bee019f8  lr 404abfab  pc 404abfaa  cpsr 60070030
12-10 14:48:43.045  3496  3496 F DEBUG   :
12-10 14:48:43.045  3496  3496 F DEBUG   : backtrace:
12-10 14:48:43.045  3496  3496 F DEBUG   :     #00 pc 00030faa  /system/lib/libc.so (dlfree+1285)
12-10 14:48:43.045  3496  3496 F DEBUG   :     #01 pc 000158df  /system/lib/libandroidfw.so (_ZN7android13ResStringPool6uninitEv+38)
12-10 14:48:43.045  3496  3496 F DEBUG   :     #02 pc 0001662b  /system/lib/libandroidfw.so (_ZN7android10ResXMLTree6uninitEv+12)
12-10 14:48:43.045  3496  3496 F DEBUG   :     #03 pc 00016649  /system/lib/libandroidfw.so (_ZN7android10ResXMLTreeD1Ev+4)
12-10 14:48:43.045  3496  3496 F DEBUG   :     #04 pc 00013373  /system/lib/libandroidfw.so (_ZN7android12AssetManager10getPkgNameEPKc+258)
12-10 14:48:43.045  3496  3496 F DEBUG   :     #05 pc 000133cf  /system/lib/libandroidfw.so (_ZN7android12AssetManager18getBasePackageNameEj+62)
12-10 14:48:43.045  3496  3496 F DEBUG   :     #06 pc 00088b33  /system/lib/libandroid_runtime.so
12-10 14:48:43.045  3496  3496 F DEBUG   :     #07 pc 72cb9011  /data/dalvik-cache/arm/system@framework@boot.oat (offset 0x1f78000)
12-10 14:48:50.095  3496  3496 F DEBUG   :
12-10 14:48:50.095  3496  3496 F DEBUG   : Tombstone written to: /data/tombstones/tombstone_00
12-10 14:48:50.185  1912  1912 I ServiceManager: service 'statusbar' died
12-10 14:48:50.185  1912  1912 I ServiceManager: service 'netstats' died
12-10 14:48:50.185  1912  1912 I ServiceManager: service 'power' died
12-10 14:48:50.185  1912  1912 I ServiceManager: service 'media_projection' died
12-10 14:48:50.185  1912  1912 I ServiceManager: service 'network_management' died
12-10 14:48:50.185  1912  1912 I ServiceManager: service 'window' died
12-10 14:48:50.185  1912  1912 I ServiceManager: service 'consumer_ir' died
12-10 14:48:50.185  1912  1912 I ServiceManager: service 'telecom' died
12-10 14:48:50.185  1912  1912 I ServiceManager: service 'cmpartnerinterface' died
12-10 14:48:50.185  1912  1912 I ServiceManager: service 'package' died
12-10 14:48:50.185  1912  1912 I ServiceManager: service 'user' died

Since Open Camera needs some background service and is started on bootup, I assume that after installation the system tries to restart this service. However, there appears to be some memory issue with the app: it requests so much memory that Android starts killing other apps to make that memory available. If Android does not manage to provide the space, the device reboots. Since OpenCamera is started at bootup, it again tries to allocate (too much) memory, and the device is stuck in an infinite loop.

Looking at Android’s memory management

I expected that the following excerpt from the log above might lead to some useful hints:

12-10 14:48:42.945  4034  4034 F libc    : invalid address or address of corrupt block 0x120 passed to dlfree
12-10 14:48:42.945  4034  4034 F libc    : Fatal signal 11 (SIGSEGV), code 1, fault addr 0xdeadbaad in tid 4034 (system_server)

After searching on the net, I found an interesting discussion [2] suggesting the following:


“A likely cause of this is that you have ran out of memory, maybe because a memory leak or simply used up all memory. This can be caused by a bug you are using in a plugin that uses native C/C++ code through NDK.”

To rule out hardware issues, I also exchanged the storage (I run /data from an sdcard) and compiled memtester [3] to test the device’s RAM. When experimenting with memtester, I noticed a striking difference between running memtester on a regular GNU/Linux system and running it on Android/LineageOS. When giving memtester less memory than is actually available, there is no difference. However, when giving memtester *more* RAM than is actually available, the following happens on GNU/Linux:

# free -h
              total        used        free      shared  buff/cache   available
Mem:            28G        124M         28G        8.5M        219M         28G
Swap:            0B          0B          0B
# memtester 40G
memtester version 4.3.0 (64-bit)
Copyright (C) 2001-2012 Charles Cazabon.
Licensed under the GNU General Public License version 2 (only).

pagesize is 4096
pagesizemask is 0xfffffffffffff000
want 40960MB (42949672960 bytes)
got  29075MB (30488387584 bytes), trying mlock ...Killed
#
Killed
[1]+  Stopped                 sh

On Android, by contrast, the device suddenly reboots after memtester tries to mlock the memory:

root@i9300:/ # free -h
                total        used        free      shared     buffers
Mem:             828M        754M         74M           0        1.3M
-/+ buffers/cache:           752M         75M
Swap:            400M         18M        382M

root@i9300:/ # /sbin/memtester 2G
memtester version 4.3.0 (32-bit)
Copyright (C) 2001-2012 Charles Cazabon.
Licensed under the GNU General Public License version 2 (only).

pagesize is 4096
pagesizemask is 0xfffff000
want 2048MB (2147483648 bytes)
got  2008MB (2105921536 bytes), trying mlock ...

This is what is printed to logcat:

01-01 01:10:29.485  4933  4933 D su      : su invoked.
01-01 01:10:29.485  4933  4933 E su      : SU from: shell
01-01 01:10:29.490  4933  4933 D su      : Allowing shell.
01-01 01:10:29.490  4933  4933 D su      : 2000 /system/bin/sh executing 0 /system/bin/sh using binary /system/bin/sh : sh
01-01 01:10:29.490  4933  4933 D su      : Waiting for pid 4934.
01-01 01:10:44.840  2478  3264 D LightsService: Excessive delay setting light: 81ms
01-01 01:10:44.925  2478  3264 D LightsService: Excessive delay setting light: 82ms
01-01 01:10:45.010  2478  3264 D LightsService: Excessive delay setting light: 82ms
01-01 01:10:45.090  2478  3264 D LightsService: Excessive delay setting light: 82ms
01-01 01:10:45.175  2478  3264 D LightsService: Excessive delay setting light: 82ms
01-01 01:10:45.260  2478  3264 D LightsService: Excessive delay setting light: 82ms
01-01 01:10:45.340  2478  3264 D LightsService: Excessive delay setting light: 82ms
01-01 01:10:50.735  2478  2538 I PowerManagerService: Going to sleep due to screen timeout (uid 1000)...
01-01 01:10:50.785  2478  2538 E         : Device driver API match
01-01 01:10:50.785  2478  2538 E         : Device driver API version: 29
01-01 01:10:50.785  2478  2538 E         : User space API version: 29
01-01 01:10:50.785  2478  2538 E         : mali: REVISION=Linux-r3p2-01rel3 BUILD_DATE=Tue Aug 26 17:05:16 KST 2014
01-01 01:10:52.000  2478  2538 V KeyguardServiceDelegate: onScreenTurnedOff()
01-01 01:10:52.040  2478  2538 E libEGL  : call to OpenGL ES API with no current context (logged once per thread)
01-01 01:10:52.045  2478  2536 I DisplayManagerService: Display device changed: DisplayDeviceInfo{"Integrierter Bildschirm": uniqueId="local:0", 720 x 1280, modeId 1, defaultModeId 1, supportedModes [{id=1, width=720, height=1280, fps=60.002}], colorTransformId 1, defaultColorTransformId 1, supportedColorTransforms [{id=1, colorTransform=0}], density 320, 304.8 x 306.71698 dpi, appVsyncOff 0, presDeadline 17666111, touch INTERNAL, rotation 0, type BUILT_IN, state OFF, FLAG_DEFAULT_DISPLAY, FLAG_ROTATES_WITH_CONTENT, FLAG_SECURE, FLAG_SUPPORTS_PROTECTED_BUFFERS}
01-01 01:10:52.060  1915  1915 D SurfaceFlinger: Set power mode=0, type=0 flinger=0x411dadf0
01-01 01:10:52.160  2478  2538 I PowerManagerService: Sleeping (uid 1000)...
01-01 01:10:52.165  2478  3231 D WifiConfigStore: Retrieve network priorities after PNO.
01-01 01:10:52.170  1938  3241 E bt_a2dp_hw: adev_set_parameters: ERROR: set param called even when stream out is null
01-01 01:10:52.170  2478  3231 E native  : do suspend false
01-01 01:10:52.175  2478  3231 D WifiConfigStore: No blacklist allowed without epno enabled
01-01 01:10:52.190  3846  4968 D NfcService: Discovery configuration equal, not updating.
01-01 01:10:52.435  2478  3231 D WifiConfigStore: Retrieve network priorities before PNO. Max priority: 0
01-01 01:10:52.435  1938  1938 E bt_a2dp_hw: adev_set_parameters: ERROR: set param called even when stream out is null
01-01 01:10:52.440  2478  3231 E WifiStateMachine:  Fail to set up pno, want true now false
01-01 01:10:52.440  2478  3231 E native  : do suspend true
01-01 01:10:52.670  2478  3231 D WifiStateMachine: Disconnected CMD_START_SCAN source -2 3, 4 -> obsolete
01-01 01:10:54.160  2478  2538 W PowerManagerService: Sandman unresponsive, releasing suspend blocker
01-01 01:10:55.825  2478  3362 D CryptdConnector: SND -> {3 cryptfs getpw}
01-01 01:10:55.825  1903  1999 D VoldCryptCmdListener: cryptfs getpw
01-01 01:10:55.825  1903  1999 I Ext4Crypt: ext4 crypto complete called on /data
01-01 01:10:55.825  1903  1999 I Ext4Crypt: No master key, so not ext4enc
01-01 01:10:55.830  1903  1999 I Ext4Crypt: ext4 crypto complete called on /data
01-01 01:10:55.830  1903  1999 I Ext4Crypt: No master key, so not ext4enc
01-01 01:10:55.830  2478  2798 D CryptdConnector: RCV  {4 cryptfs clearpw}
01-01 01:10:55.835  1903  1999 D VoldCryptCmdListener: cryptfs clearpw
01-01 01:10:55.835  1903  1999 I Ext4Crypt: ext4 crypto complete called on /data
01-01 01:10:55.835  1903  1999 I Ext4Crypt: No master key, so not ext4enc
01-01 01:10:55.835  2478  2798 D CryptdConnector: RCV <- {200 4 0}
01-01 01:10:55.925  3417  3417 D PhoneStatusBar: disable:
01-01 01:10:56.020  3417  3417 D PhoneStatusBar: disable:
01-01 01:10:56.330  3417  3417 D PhoneStatusBar: disable:
01-01 01:11:44.875  2478  4667 I ActivityManager: Process com.android.messaging (pid 4607) has died
01-01 01:11:44.920  2478  4667 D ActivityManager: cleanUpApplicationRecord -- 4607
01-01 01:11:45.860  2478  3356 W art     : Long monitor contention event with owner method=void com.android.server.am.ActivityManagerService$AppDeathRecipient.binderDied() from ActivityManagerService.java:1359 waiters=0 for 907ms
01-01 01:11:45.890  2478  3356 I ActivityManager: Process org.cyanogenmod.profiles (pid 4593) has died
01-01 01:11:45.900  2478  3356 D ActivityManager: cleanUpApplicationRecord -- 4593
01-01 01:11:45.955  2478  2529 W art     : Long monitor contention event with owner method=void com.android.server.am.ActivityManagerService$AppDeathRecipient.binderDied() from ActivityManagerService.java:1359 waiters=1 for 914ms
01-01 01:11:45.960  1913  1913 E lowmemorykiller: Error opening /proc/3662/oom_score_adj; errno=2
01-01 01:11:45.970  2478  2529 I ActivityManager: Process com.android.exchange (pid 3662) has died
01-01 01:11:45.970  2478  2529 D ActivityManager: cleanUpApplicationRecord -- 3662
01-01 01:11:45.985  2478  3943 W art     : Long monitor contention event with owner method=void com.android.server.am.ActivityManagerService$AppDeathRecipient.binderDied() from ActivityManagerService.java:1359 waiters=2 for 611ms
01-01 01:11:45.995  2478  3943 I ActivityManager: Process com.android.calendar (pid 4415) has died
01-01 01:11:45.995  2478  3943 D ActivityManager: cleanUpApplicationRecord -- 4415
01-01 01:11:46.000  2478  2532 W art     : Long monitor contention event with owner method=void com.android.server.am.ActivityManagerService$AppDeathRecipient.binderDied() from ActivityManagerService.java:1359 waiters=3 for 537ms
01-01 01:11:46.025  2478  3362 W art     : Long monitor contention event with owner method=void com.android.server.am.ActivityManagerService$AppDeathRecipient.binderDied() from ActivityManagerService.java:1359 waiters=4 for 378ms
01-01 01:11:46.045  2478  3362 I ActivityManager: Process org.lineageos.updater (pid 4449) has died
01-01 01:11:46.045  2478  3362 D ActivityManager: cleanUpApplicationRecord -- 4449
01-01 01:11:46.045  1913  1913 E lowmemorykiller: Error writing /proc/3938/oom_score_adj; errno=22
01-01 01:11:46.050  2478  3413 W art     : Long monitor contention event with owner method=void com.android.server.am.ActivityManagerService$AppDeathRecipient.binderDied() from ActivityManagerService.java:1359 waiters=5 for 372ms
01-01 01:11:46.505  2478  3232 D WifiService: Client connection lost with reason: 4
01-01 01:11:47.165  2478  4666 D GraphicsStats: Buffer count: 3
01-01 01:11:47.400  2478  2532 W art     : Long monitor contention event with owner method=int com.android.server.am.ActivityManagerService.broadcastIntent(android.app.IApplicationThread, android.content.Intent, java.lang.String, android.content.IIntentReceiver, int, java.lang.String, android.os.Bundle, java.lang.String[], int, android.os.Bundle, boolean, boolean, int) from ActivityManagerService.java:17497 waiters=0 for 667ms
01-01 01:11:47.465  2478  4664 W art     : Long monitor contention event with owner method=int com.android.server.am.ActivityManagerService.broadcastIntent(android.app.IApplicationThread, android.content.Intent, java.lang.String, android.content.IIntentReceiver, int, java.lang.String, android.os.Bundle, java.lang.String[], int, android.os.Bundle, boolean, boolean, int) from ActivityManagerService.java:17497 waiters=1 for 858ms
01-01 01:11:47.465  2478  3412 W art     : Long monitor contention event with owner method=int com.android.server.am.ActivityManagerService.broadcastIntent(android.app.IApplicationThread, android.content.Intent, java.lang.String, android.content.IIntentReceiver, int, java.lang.String, android.os.Bundle, java.lang.String[], int, android.os.Bundle, boolean, boolean, int) from ActivityManagerService.java:17497 waiters=2 for 859ms
01-01 01:11:47.475  2478  4665 I ActivityManager: Process com.android.providers.calendar (pid 4434) has died
01-01 01:11:47.480  2478  4665 D ActivityManager: cleanUpApplicationRecord -- 4434
01-01 01:11:47.545  1913  1913 E lowmemorykiller: Error opening /proc/3938/oom_score_adj; errno=2
01-01 01:11:47.545  1913  1913 E lowmemorykiller: Error opening /proc/4014/oom_score_adj; errno=2
01-01 01:11:47.550  1913  1913 E lowmemorykiller: Error opening /proc/4542/oom_score_adj; errno=2
01-01 01:11:47.550  2478  3943 W art     : Long monitor contention event with owner method=int com.android.server.am.ActivityManagerService.broadcastIntent(android.app.IApplicationThread, android.content.Intent, java.lang.String, android.content.IIntentReceiver, int, java.lang.String, android.os.Bundle, java.lang.String[], int, android.os.Bundle, boolean, boolean, int) from ActivityManagerService.java:17497 waiters=3 for 894ms
01-01 01:11:47.560  2478  3943 I ActivityManager: Process org.cyanogenmod.themes.provider (pid 3497) has died
01-01 01:11:47.560  2478  3943 D ActivityManager: cleanUpApplicationRecord -- 3497
01-01 01:11:47.560  2478  2529 W art     : Long monitor contention event with owner method=int com.android.server.am.ActivityManagerService.broadcastIntent(android.app.IApplicationThread, android.content.Intent, java.lang.String, android.content.IIntentReceiver, int, java.lang.String, android.os.Bundle, java.lang.String[], int, android.os.Bundle, boolean, boolean, int) from ActivityManagerService.java:17497 waiters=4 for 673ms
01-01 01:11:47.570  2478  2529 I ActivityManager: Process com.svox.pico (pid 4014) has died
01-01 01:11:47.570  2478  2529 D ActivityManager: cleanUpApplicationRecord -- 4014
01-01 01:11:48.325  2478  2529 W ActivityManager: Scheduling restart of crashed service com.svox.pico/.PicoService in 1000ms
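
To get a feeling for how little it takes to end up in this situation, the following minimal C++ sketch (my own illustration, not part of the original experiment; the 64 MiB step size is arbitrary) simply keeps allocating and touching memory until the allocator gives up or the kernel steps in:

#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <vector>

int main()
{
    const std::size_t chunk_size = 64 * 1024 * 1024;    // 64 MiB per step
    std::vector<char *> chunks;

    for (;;)
    {
        char * p = static_cast<char *>(std::malloc(chunk_size));
        if (p == nullptr)
        {
            std::puts("malloc failed, stopping");
            break;
        }
        std::memset(p, 0xAA, chunk_size);                // touch the pages so they really consume RAM
        chunks.push_back(p);
        std::printf("allocated %zu MiB so far\n", chunks.size() * 64);
    }

    for (char * p : chunks)                              // never reached if the process is killed first
        std::free(p);
}

On a regular GNU/Linux system the OOM killer usually terminates this process itself; the logcat output above suggests that on the device it is other processes and system services that die instead.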

Verdict

I wasted lots of time on this issue, but I was finally able to reproduce it and to recover all of my data. At least I now have an explanation for various random reboots I experienced in the past under similar low-memory conditions.

Overall, I am really shocked that a simple, unprivileged Android app that is scheduled to start on bootup can ruin a working system so badly. Further research indicates that there are more apps known to cause such behavior [4]. I hope that a device based on a GNU/Linux system instead of Android (such as the announced Librem5) will not suffer from such a severe flaw.

References

[1] https://review.lineageos.org/#/c/197305/
[2] https://stackoverflow.com/questions/25069186/invalid-address-passed-to-dlfree
[3] https://github.com/royzhao/memtester4Android
[4] https://gitlab.com/fdroid/fdroiddata/issues/979
[5] https://gitlab.com/fdroid/fdroiddata/issues/979#note_48990149

Saturday, 09 December 2017

The - surprisingly limited - usefulness of function multiversioning in GCC

Posts on Hannes Hauswedell's homepage | 16:00, Saturday, 09 December 2017

Modern CPUs have quite a few features that generic amd64/intel64 code cannot make use of, simply because they are not available everywhere and including them would break the code on platforms that do not support them. The solution is to not use these features, or to ship different specialised binaries for different target CPUs. The problem with the first approach is that you miss out on possible optimisations, and the problem with the second approach is that most users don’t know which features their CPUs support, possibly picking a wrong executable (which won’t run → bad user experience) or a less optimised one (which is again problem 1). But there is an elegant GCC-specific alternative: function multiversioning!

But does it really solve our problems? Let’s have a closer look!

Prerequisites

  • basic understanding of C++ and compiler optimisations (you should have heard of “inlining” before, but you don’t need to know assembler, in fact I am not an assembly expert either)
  • Most code snippets are demonstrated via Compiler Explorer, but the benchmarks require you to have GCC ≥ version 7 locally installed.
  • You might want to open a second tab or window to display the Compiler Explorer alongside this post (two screens work best 😎).

Population counts

Many of the CPU features used in machine-optimised code relate to SIMD, but for our example I will use a simpler operation: population count, or popcount for short.

The popcount of an integral number is the number of bits that are set to 1 in its bit-representation. [More details on Wikipedia if you are interested.]

Popcounts are used in many algorithms, and are important in bioinformatics (one of the reasons I am writing this post). You could implement a naive popcount by iterating over the bits, but GCC already has us covered with a “builtin” called __builtin_popcountll (the “ll” at the end is for long long, i.e. 64-bit integers). Here’s an example:

1  __builtin_popcountll(6ull) // == 2, because 6ull's bit repr. is `...00000110`
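
For comparison, the naive bit-iterating version mentioned above could look like this (just an illustration, not something you would want to use in practice):

#include <cstdint>

// naive popcount, for illustration only
uint64_t naive_popcount(uint64_t v)
{
    uint64_t count = 0;
    for (; v != 0; v >>= 1)   // look at one bit per iteration
        count += v & 1u;
    return count;
}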

To get a feeling for how slow/fast this function is, we are going to call it a billion times. The golden rule of optimisation is to always measure and not make wild assumptions about what you think the compiler or the CPU is/isn’t doing!

 1  #include <cstdint>
 2
 3  uint64_t pc(uint64_t const v)
 4  {
 5      return __builtin_popcountll(v);
 6  }
 7
 8  int main()
 9  {
10      for (uint64_t i = 0; i < 1'000'000'000; ++i)
11          volatile uint64_t ret = pc(i);
12  }

view in compiler explorer

This code should be fairly easy to understand; the volatile modifier is only used to make sure that this code is always generated (without it, the compiler would see that the return values are never used and optimise all the code away!). In any case, before you compile this locally, click on the link and check out the Compiler Explorer. Using the colour code you can easily see that our call to __builtin_popcountll in line 5 is internally translated into a call to another function, __popcountdi2. Before you continue, add -O3 to the compiler arguments in Compiler Explorer; this will add machine-independent optimisations. The assembly code should change, but you will still be able to find __popcountdi2.

This is a generic function that works on all amd64/intel64 platforms and counts the set bits. What does it actually do? You can search the net and find explanations that say it does some bit-shifting and table-lookups, but the important part is that it performs multiple operations to compute the popcount in a generic way.

Modern CPUs, however, have a feature that does popcount in hardware (or close to it). Again, we don’t need to know exactly how this works, but we would expect that this single-instruction popcount is better than anything we can do ourselves (for very large bit-vectors this is not entirely true, but that’s a different issue).

How do we use this magic builtin? Just go back to the Compiler Explorer and add -mpopcnt to the compiler flags; this tells GCC to expect this feature from the hardware and to optimise for it. Voilà, the generated assembly now resolves to popcnt rsi, rsi instead of call __popcountdi2 (GCC is smart and its builtin resolves to whatever is best on the architecture we are targeting).

But how much better is this actually? Compile both versions locally and measure, e.g. with the time command.
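
For example, assuming the file is saved as popcount.cpp (the file name is mine, use whatever you like):

g++ -O3          popcount.cpp -o popcount_generic && time ./popcount_generic
g++ -O3 -mpopcnt popcount.cpp -o popcount_native  && time ./popcount_native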

compiler flags    time on my PC
-O3               3.1s
-O3 -mpopcnt      0.6s

A speed-up of 5x, nice!

But what happens when the binary is run on a CPU that doesn’t have builtin popcnt? The program crashes with “Illegal hardware instruction” 💀

Function multiversioning

This is where function multiversioning (“FMV”) comes to the rescue. It is a GCC-specific feature that inserts a branching point in place of our original function and then dispatches to one of the available “clones” at run-time. You can specify how many of these “clones” you want and for which features or architectures each is built; the dispatching function then chooses the most highly optimised one automatically. You can even manually write different function bodies for the different clones, but we will focus on the simpler kind of FMV where you just compile the same function body with different optimisation strategies.

Enough of the talking, here is our adapted example from above:

 1  #include <cstdint>
 2
 3  __attribute__((target_clones("default", "popcnt")))
 4  uint64_t pc(uint64_t const v)
 5  {
 6      return __builtin_popcountll(v);
 7  }
 8
 9  int main()
10  {
11      for (uint64_t i = 0; i < 1'000'000'000; ++i)
12          volatile uint64_t ret = pc(i);
13  }

view in compiler explorer

The only difference is that line 3 was inserted. The syntax is quite straightforward, insofar as anything in the C++ world is straightforward 😉 :

  • We are telling GCC that we want two clones for the targets “default” and “popcnt”.
  • Everything else gets taken care of.

Follow the link to the Compiler Explorer and check the assembly code (please make sure that you are not specifying -mpopcnt!). It is a little longer, but via the colour code of __builtin_popcountll(v) we immediately see that two functions are generated, one with the generic version and one with the optimised version, similar to what we had above, but now in one program. The “function signatures” in the assembly code also tell us that one of them is “the original” and one is the “popcnt clone”. Some further analysis will reveal a third function, the “clone .resolver”, which is the dispatching function. Even without knowing any assembly you might be able to pick out the statement that looks up the CPU feature and calls the correct clone.
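
Conceptually, the dispatch is equivalent to a hand-written version along these lines (a simplified sketch of the idea only, not the code GCC actually emits, which uses an ifunc resolver; the clone names are made up). __builtin_cpu_supports is GCC’s builtin for this kind of run-time feature test:

#include <cstdint>

// two hand-written "clones" (names invented for the sketch)
uint64_t pc_default(uint64_t const v) { return __builtin_popcountll(v); }

__attribute__((target("popcnt")))
uint64_t pc_popcnt(uint64_t const v)  { return __builtin_popcountll(v); }

// what the generated dispatcher does, conceptually:
uint64_t pc(uint64_t const v)
{
    // a run-time decision stands between the caller and the real work
    if (__builtin_cpu_supports("popcnt"))
        return pc_popcnt(v);
    return pc_default(v);
}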

Great! So we have a single binary that is as fast as possible and works on older hardware. But is it really as fast as possible? Compile and run it!

version     compiler flags    time on my PC
original    -O3               3.1s
original    -O3 -mpopcnt      0.6s
FMV         -O3               2.2s

Ok, we are faster than the original generic code so we are probably using the optimised popcount call, but we are nowhere near our 5x speed-up. What’s going on?

Nested function calls

We have replaced the core of our computation, the function pc(), with a dispatcher that chooses the best implementation. As noted above, this decision happens at run-time (it has to, because we can’t know beforehand whether the target CPU supports native popcount; that is the whole point of the exercise), but now it happens one billion times!

Wow, this check seems to be more expensive than the actual popcount call. If you write a lot of optimised code, this won’t be a surprise: decision making at run-time is simply very expensive.

What can we do about it? Well, we could decide between the generic and the optimised version before running our algorithm, instead of deciding inside the algorithm on every iteration:

 1  #include <cstdint>
 2
 3  uint64_t pc(uint64_t const v)
 4  {
 5      return __builtin_popcountll(v);
 6  }
 7
 8  __attribute__((target_clones("default", "popcnt")))
 9  void loop()
10  {
11      for (uint64_t i = 0; i < 1'000'000'000; ++i)
12          volatile uint64_t ret = pc(i);
13  }
14
15  int main()
16  {
17      loop();
18  }

view in compiler explorer

The assembly of this gets a little messier, but you can follow the jmp instructions or just scan the assembly for our above-mentioned instructions, and you will see that we still have the two versions (although the actual pc() function is not called, because it was inlined and moved around a bit).

Compile the code and measure the time:

version       compiler flags    time on my PC
original      -O3               3.1s
original      -O3 -mpopcnt      0.6s
FMVed pc()    -O3               2.2s
FMVed loop    -O3               0.6s

Hurray, we are back to our original speed-up!

If you expected this, then you have likely dealt with heavily templated code before and have also heard of tag-dispatching, a technique that translates an arbitrary run-time decision into different code paths, beneath which the run-time decision can be treated as a compile-time one.
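
For readers who have not seen it, tag-dispatching in its simplest form looks something like this (a generic illustration, not tied to the popcount example):

#include <type_traits>

void algorithm_impl(std::true_type)  { /* optimised code path */ }
void algorithm_impl(std::false_type) { /* generic code path   */ }

void algorithm(bool optimised)
{
    // the run-time decision is translated into a type exactly once;
    // below this point each path is compiled separately and can be
    // treated as if the decision were known at compile-time
    if (optimised)
        algorithm_impl(std::true_type{});
    else
        algorithm_impl(std::false_type{});
}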

Our simplified callgraph for the above cases looks like this (the dotted line is where the dispatching takes place):

In real-world code the graph is of course bigger, but it should become obvious that by moving the decision making further to the left, the code becomes faster (because we have to decide less often), but the size of the generated executable also becomes larger (because more functions are actually compiled). [There are corner cases where a bigger executable actually makes certain things slower, but let’s not get into that now.]

Anyway, I thought that FMV would be like dispatching a tag down the call-graph, but it’s not! In fact we just got lucky in our example above, because the pc() call was inlined. Inlining means that the function itself is optimised away entirely and its code is inserted at the place in the calling function where the function call would otherwise have been. Only because pc() is inlined do we actually get the optimisation!

How do we know? Well, you can force GCC not to inline pc():

 1  #include <cstdint>
 2
 3  __attribute__((noinline))
 4  uint64_t pc(uint64_t const v)
 5  {
 6      return __builtin_popcountll(v);
 7  }
 8
 9  __attribute__((target_clones("default", "popcnt")))
10  void loop()
11  {
12      for (uint64_t i = 0; i < 1'000'000'000; ++i)
13          volatile uint64_t ret = pc(i);
14  }
15
16  int main()
17  {
18      loop();
19  }

view in compiler explorer

Just add the third line to your previous Compiler Explorer window, or open the above link. You can see that the optimised popcnt call has disappeared from the assembly and that pc() only appears once. So in fact our callgraph now contains no optimised pc():

But how serious is this, you may ask? Didn’t the compiler inline automatically? Well, the problem with inlining is that it is entirely up to the compiler whether it inlines a function or not (prefixing the function with inline does in fact not force it to). The deeper the call-graph gets, the less likely it is that the compiler inlines all the way down from the FMV invocation point.

Trying to save FMV for our use case

Below are some more complex but futile attempts.

It’s possible to force the compiler to inline, but that is also non-standard, and it obviously doesn’t work if the called functions are not customisable by us (e.g. stable interfaces or external code / a library). Furthermore, it might not even be desirable to force-inline every function or function template, because they might be used in other places or with differently typed arguments, resulting in an even larger increase in executable size.
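
For reference, forcing inlining in GCC looks roughly like this (a sketch; as said above, the attribute is non-standard and only helps for code we control):

// GCC reports an error if it cannot honour always_inline
__attribute__((always_inline)) inline
uint64_t pc(uint64_t const v)
{
    return __builtin_popcountll(v);
}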

An alternative to inlining would be to use the original form of FMV where you actually have different function bodies and in those add a custom layer of (tag-)dispatching yourself:

 1  #include <cstdint>
 2
 3  template <bool is_optimised>
 4  __attribute__((noinline))
 5  uint64_t pc(uint64_t const v)
 6  {
 7      return __builtin_popcountll(v);
 8  }
 9
10  __attribute__((target("default")))
11  void loop()
12  {
13      for (uint64_t i = 0; i < 1'000'000'000; ++i)
14          volatile uint64_t ret = pc<false>(i);
15  }
16
17  __attribute__((target("popcnt")))
18  void loop()
19  {
20      for (uint64_t i = 0; i < 1'000'000'000; ++i)
21          volatile uint64_t ret = pc<true>(i);
22  }
23
24  int main()
25  {
26      loop();
27  }

view in compiler explorer

In this code example we have turned pc() into a function template, customisable by a bool variable. This means that two versions of this function can be instantiated. We then also implement the loops separately and make each pass a different bool value to pc() as a template argument. If you look at the assembly in compiler explorer you can see that two functions are created for pc(), but unfortunately they both contain the unoptimised popcount call¹. This is due to the compiler not knowing/assuming that one of the functions is only called in an optimised context. → This method won’t solve our problem.

And while it is of course possible to add C++17’s if constexpr to pc() and start hacking custom code into the function depending on the template parameter, it further complicates the solution, moving us further and further away from our original goal of a thin dispatching layer.

¹ Since the resulting function bodies are the same, they are actually merged into a single one at optimisation levels > 1 (but this is independent of our problem).

Summary

  • Function multiversioning is a good thing, because it aims to solve an actual problem: delivering optimised binary code to users who can’t or don’t want to build the software themselves.
  • Unfortunately it does not multiversion the functions called by a versioned function, forcing developers to move FMV very close to the intended function call.
  • This has the drawback of invoking the dispatch much more often than theoretically needed, possibly incurring a penalty in run-time that might exceed the gain from more highly optimised code.
  • It would be great if GCC developers could address this by adding a version of FMV that recursively clones the indirectly invoked functions (without further branching), as well as providing the machine-aware context to these clones, i.e. the presumed CPU features.

Further reading

On popcnt and CPU specific features:

On FMV:

Thursday, 07 December 2017

2017 in Review

Paul Boddie's Free Software-related blog » English | 18:07, Thursday, 07 December 2017

On Planet Debian there seem to be quite a few regularly-posted articles summarising the work done by various people in Free Software over the month that has most recently passed. I thought it might be useful, personally at least, to review the different things I have been doing over the past year. The difference between this article and many of those others is that the work I describe is not commissioned or generally requested by others, instead relying mainly on my own motivation for it to happen. The rate of progress can vary somewhat as a result.

Learning KiCad

Over the years, I have been playing around with Arduino boards, sensors, displays and things of a similar nature. Although I try to avoid buying more things to play with, sometimes I manage to acquire interesting items regardless, and these aren’t always ready to use with the hardware I have. Last December, I decided to buy a selection of electronics-related items for interfacing and experimentation. Some of these items have yet to be deployed, but others were bought with the firm intention of putting different “spare” pieces of hardware to use, or at least to make them usable in future.

One thing that sits in this category of spare, potentially-usable hardware is a display circuit board that was once part of a desk telephone, featuring a two-line, bitmapped character display, driven by the Hitachi HD44780 LCD controller. It turns out that this hardware is so common and mundane that the Arduino libraries already support it, but the problem for me was being able to interface it to the Arduino. The display board uses a cable with a connector that needs a special kind of socket, and so some research is needed to discover the kind of socket needed and how this might be mounted on something else to break the connections out for use with the Arduino.

Fortunately, someone else had done all this research quite some time ago. They had even designed a breakout board to hold such a socket, making it available via the OSH Park board fabricating service. So, to make good on my plan, I ordered the mandatory minimum of three boards, also ordering some connectors from Mouser. When all of these different things arrived, I soldered the socket to the board along with some headers, wired up a circuit, wrote a program to use the LiquidCrystal Arduino library, and to my surprise it more or less worked straight away.
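
The test program involved is tiny; a minimal sketch along these lines is enough to see output (the pin numbers and the 16×2 geometry are assumptions for illustration, they depend entirely on the actual wiring and display):

#include <LiquidCrystal.h>

// RS, E, D4, D5, D6, D7: pin numbers assumed, adjust to the actual wiring
LiquidCrystal lcd(12, 11, 5, 4, 3, 2);

void setup()
{
  lcd.begin(16, 2);            // columns and rows of the display (assumed)
  lcd.print("hello, world");
}

void loop()
{
  lcd.setCursor(0, 1);         // move to the second line
  lcd.print(millis() / 1000);  // show the uptime in seconds
}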

Breakout board for the Molex 52030 connector

Hitachi HD44780 LCD display boards driven by an Arduino

This satisfying experience led me to consider other boards that I might design and get made. Previously, I had only made a board for the Arduino using Fritzing and the Fritzing Fab service, and I had held off looking at other board design solutions, but this experience now encouraged me to look again. After some evaluation of the gEDA tools, I decided that I might as well give KiCad a try, given that it seems to be popular in certain “open source hardware” circles. And after a fair amount of effort familiarising myself with it, with a degree of frustration finding out how to do certain things (and also finding up-to-date documentation), I managed to design my own rather simple board: a breakout board for the Acorn Electron cartridge connector.

Acorn Electron cartridge breakout board (in 3D-printed case section)

In the back of my mind, I have vague plans to do other boards in future, but doing this kind of work can soak up a lot of time and be rather frustrating: you almost have to get into some modified mental state to work efficiently in KiCad. And it isn’t as if I don’t have other things to do. But at least I now know something about what this kind of work involves.

Retro and Embedded Hardware

With the above breakout board in hand, a series of experiments were conducted to see if I could interface various circuits to the Acorn Electron microcomputer. These mostly involved 7400-series logic chips (ICs, integrated circuits) and featured various logic gates and counters. Previously, I had re-purposed an existing ROM cartridge design to break out signals from the computer and make it access a single flash memory chip instead of two ROM chips.

With a dedicated prototyping solution, I was able to explore the implementation of that existing board, determine various aspects of the signal timings that remained rather unclear (despite being successfully handled by the existing board’s logic), and make it possible to consider a dedicated board for a flash memory cartridge. In fact, my brother, David, also wanting to get into board design, later adapted the prototyping cartridge to make such a board.

But this experimentation also encouraged me to tackle some other items in the electronics shipment: the PIC32 microcontrollers that I had acquired because they were MIPS-based chips, with somewhat more built-in RAM than the Atmel AVR-based chips used by the average Arduino, that could also be used on a breadboard. I hoped that my familiarity with the SoC (system-on-a-chip) in the Ben NanoNote – the Ingenic JZ4720 – might confer some benefits when writing low-level code for the PIC32.

PIC32 on breadboard with Arduino programming circuit (and some LEDs for diagnostic purposes)

I do not need to reproduce an account of my activities here, given that I wrote about the effort involved in getting started with the PIC32 earlier in the year, and subsequently described an unusual application of such a microcontroller that seemed to complement my retrocomputing interests. I have since tried to make that particular piece of work more robust, but deducing the actual behaviour of the hardware has been frustrating, the documentation can be vague when it needs to be accurate, and much of the community discussion is focused on proprietary products and specific software tools rather than techniques. Maybe this will finally push me towards investigating programmable logic solutions in the future.

Compiling a Python-like Language

As it happened, the above hardware activities were actually distractions from something I have been working on for a long time. But at this point in the article, they can serve as a diversion from all the things that seem to involve hardware or low-level software development. Many years ago, I started writing software in Python. Over the years since, alternative implementations of the Python language (the main implementation being CPython) have emerged and seen some use, some continuing to be developed to this day. But around fifteen years ago, it became a bit more common for people to consider whether Python could be compiled to something that runs more efficiently (and more quickly).

I followed some of these projects enthusiastically for a while. Starkiller promised compilation to C++ but never delivered any code for public consumption, although the associated academic thesis might have prompted the development of Shed Skin which does compile a particular style of Python program to C++ and is available as Free Software. Meanwhile, PyPy elevated to prominence the notion of writing a language and runtime library implementation in the language itself, previously seen with language technologies like Slang, used to implement Squeak/Smalltalk.

Although other projects have also emerged and evolved to attempt the compilation of Python to lower-level languages (Pyrex, Cython, Nuitka, and so on), my interests have largely focused on the analysis of programs so that we may learn about their structure and behaviour before we attempt to run them, this alongside any benefits that might be had in compiling them to something potentially faster to execute. But my interests have also broadened to consider the evolution of the Python language since the point fifteen years ago when I first started to think about the analysis and compilation of Python. The near-mythical Python 3000 became a real thing in the form of the Python 3 development branch, introducing incompatibilities with Python 2 and fragmenting the community writing software in Python.

With the risk of perfectly usable software becoming neglected, its use actively (and destructively) discouraged, it becomes relevant to consider how one might take control of one’s software tools for long-term stability, where tools might be good for decades of use instead of constantly changing their behaviour and obliging their users to constantly change their software. I expressed some of my thoughts about this earlier in the year having finally reached a point where I might be able to reflect on the matter.

So, the result of a great deal of work, informed by experiences and conversations over the years related to previous projects of my own and those of others, is a language and toolchain called Lichen. This language resembles Python in many ways but does not try to be a Python implementation. The toolchain compiles programs to C which can then be compiled and executed like “normal” binaries. Programs can be trivially cross-compiled by any available C cross-compilers, too, which is something that always seems to be a struggle elsewhere in the software world. Unlike other Python compilers or implementations, it does not use CPython’s libraries, nor does it generate in “longhand” the work done by the CPython virtual machine.

One might wonder why anyone should bother developing such a toolchain given its incompatibility with Python and a potential lack of any other compelling reason for people to switch. Given that I had to accept some necessary reductions in the original scope of the project and to limit my level of ambition just to feel remotely capable of making something work, one does need to ask whether the result is too compromised to be attractive to others. At one point, programs manipulating integers were slower when compiled than when they were run by CPython, and this was incredibly disheartening to see, but upon further investigation I noticed that CPython effectively special-cases integer operations. The design of my implementation permitted me to represent integers as tagged references – a classic trick of various language implementations – and this overturned the disadvantage.
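
The trick referred to here is, in its simplest form, to reserve a bit in each machine word to distinguish a small integer from a pointer to a heap object, so that integer arithmetic never needs an allocation. The following is a generic illustration of that idea, not Lichen’s actual representation (sign handling is simplified):

#include <cstdint>

using value = std::uintptr_t;   // one machine word per value

// lowest bit set   -> the word holds a small integer
// lowest bit clear -> the word holds an (aligned) pointer to a heap object
inline value box_int(std::intptr_t i)   { return (static_cast<value>(i) << 1) | 1u; }
inline bool  is_int(value v)            { return (v & 1u) != 0; }
inline std::intptr_t unbox_int(value v) { return static_cast<std::intptr_t>(v) >> 1; }

inline value add(value a, value b)
{
    if (is_int(a) && is_int(b))                      // the fast integer path
        return box_int(unbox_int(a) + unbox_int(b));
    return 0;  // fall back to the generic object protocol (not shown)
}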

For me, just having the possibility of exploring alternative design decisions is interesting. Python’s design is largely done by consensus, with pronouncements made to settle disagreements and to move the process forward. Although this may have served the language well, depending on one’s perspective, it has also meant that certain paths of exploration have not been followed. Certain things have been improved gradually but not radically due to backwards compatibility considerations, this despite the break in compatibility between the Python 2 and 3 branches where an opportunity was undoubtedly lost to do greater things. Lichen is an attempt to explore those other paths without having to constantly justify it to a group of people who may regard such exploration as hostile to their own interests.

Lichen is not really complete: it needs floating point numbers and other useful types; its library is minimal; it could be made more robust; it could be made more powerful. But I find myself surprised that it works at all. Maybe I should have more confidence in myself, especially given all the preparation I did in trying to understand the good and bad aspects of my previous efforts before getting started on this one.

Developing for MIPS-based Platforms

A couple of years ago I found myself wondering if I couldn’t write some low-level software for the Ben NanoNote. One source of inspiration for doing this was “The CI20 bare-metal project”: a series of blog articles discussing the challenges of booting the MIPS Creator CI20 single-board computer. The Ben and the CI20 use CPUs (or SoCs) from the same family: the Ingenic JZ4720 and JZ4780 respectively.

For the Ben, I looked at the different boot payloads, principally those written to support booting from a USB host, but also the version of U-Boot deployed on the Ben. I combined elements of these things with the framebuffer driver code from the Linux kernel supporting the Ben, and to my surprise I was able to get the device to boot up and show a pattern on the screen. Progress has not always been steady, though.

For a while, I struggled to make the CPU leave its initial exception state without hanging, and with the screen as my only debugging tool, it was hard to see what might have been going wrong. Some careful study of the code revealed the problem: the code I was using to write to the framebuffer was using the wrong address region, meaning that as soon as an attempt was made to update the contents of the screen, the CPU would detect a bad memory access and an exception would occur. Such exceptions will not be delivered in the initial exception state, but with that state cleared, the CPU will happily trigger a new exception when the program accesses memory it shouldn’t be touching.

Debugging low-level code on the Ben NanoNote (the hard way)

I have since plodded along introducing user mode functionality, some page table initialisation, trying to read keypresses, eventually succeeding after retracing my steps and discovering my errors along the way. Maybe this will become a genuinely useful piece of software one day.

But one useful purpose this exercise has served is that of familiarising myself with the way these SoCs are organised, the facilities they provide, how these may be accessed, and so on. My brother has the Letux 400 notebook containing yet another SoC in the same family, the JZ4730, which seems to be almost entirely undocumented. This notebook has proven useful under certain circumstances. For instance, it has been used as a kind of appliance for document scanning, driving a multifunction scanner/printer over USB using the enduring SANE project’s software.

However, the Letux 400 is already an old machine, with products based on this hardware platform being almost ten years old, and when originally shipped it used a 2.4 series Linux kernel instead of a more recent 2.6 series kernel. Like many products whose software is shipped as “finished”, this makes the adoption of newer software very difficult, especially if the kernel code is not “upstreamed” or incorporated into the official Linux releases.

As software distributions such as Debian evolve, they depend on newer kernel features, but if a device is stuck on an older kernel (because the special functionality that makes it work on that device is specific to that kernel) then the device, unable to run the newer kernels, gradually becomes unable to run newer versions of the distribution as well. Thus, Debian Etch was the newest distribution version that would work on the 2.4 kernel used by the Letux 400 as shipped.

Fortunately, work had been done to make a 2.6 series kernel work on the Letux 400, and this made Debian Lenny functional. But time passes and even this is now considered ancient. Although my brother David was running some software successfully, there was other software that really needed a newer distribution to run, and this meant considering what it might take to support Debian Squeeze on the hardware. So he set to work adding patches to the 2.6.24 kernel to try to take it within the realm of Squeeze support, pushing it beyond the bare minimum of 2.6.29 and into the “release candidate” territory of 2.6.30. And this was indeed enough to run Squeeze on the notebook, at least supporting the devices needed to make the exercise worthwhile.

Now, at a much earlier stage in my own experiments with the Ben NanoNote, I had tried without success to reproduce my results on the Letux 400. And I had also made a rather tentative effort at modifying Ben NanoNote kernel drivers to potentially work with the Letux 400 from some 3.x kernel version. David’s success in updating the kernel version led me to look again at the tasks of familiarising myself with kernel drivers, machine details and of supporting the Letux 400 in even newer kernels.

The outcome of this is uncertain at present. Most of the work on updating the drivers and board support has been done, but actual testing of my work still needs to be done, something that I cannot really do myself. That might seem strange: why start something I cannot finish by myself? But how I got started in this effort is also rather related to the topic of the next section.

The MIPS Creator CI20 and L4/Fiasco.OC

Low-level programming on the Ben NanoNote is frustrating unless you modify the device and solder the UART connections to the exposed pads in the battery compartment, thereby enabling a serial connection and allowing debugging information to be sent to a remote display for perusal. My soldering skills are not that great, and I don’t want to damage my device. So debugging was a frustrating exercise. Since I felt that I needed a bit more experience with the MIPS architecture and the Ingenic SoCs, it occurred to me that getting a CI20 might be the way to go.

I am not really a supporter of Imagination Technologies, producer of the CI20, due to the company’s rather hostile attitude towards Free Software around their PowerVR technologies, meaning that of the different graphics acceleration chipsets, PowerVR has been increasingly isolated as a technology that is consistently unsupportable by Free Software drivers. However, the CI20 is well-documented and has been properly supported with Free Software, apart from the PowerVR parts of the hardware, of course. Ingenic were seemingly persuaded to make the programming manual for the JZ4780 used by the CI20 publicly available, unlike the manuals for other SoCs in that family. And the PowerVR hardware is not actually needed to be able to use the CI20.

The MIPS Creator CI20 single-board computer

I had hoped that the EOMA68 campaign would have offered a JZ4775 computer card, and that the campaign might have delivered such a card by now, but with both of these things not having happened I took the plunge and bought a CI20. There were a few other reasons for doing so: I wanted to see how a single-board computer with a decent amount of RAM (1GB) might perform as a working desktop machine; having another computer to offload certain development and testing tasks, rather than run virtual machines, would be useful; I also wanted to experiment with and even attempt to port other operating systems, loosening my dependence on the Linux monoculture.

One of these other operating systems involves two components: the Fiasco.OC microkernel and the L4 Runtime Environment (L4Re). Over the years, microkernels in the L4 family have seen widespread use, and at one point people considered porting GNU Hurd to one of the L4 family microkernels from the Mach microkernel it then used (and still uses). It seems to me like something worth looking at more closely, and fortunately it also seemed that this software combination had been ported to the CI20. However, it turned out that my expectations of building an image, testing the result, and then moving on to developing interesting software were a little premature.

The first real problem was that GCC produced position-independent code that was not called correctly. This meant that upon trying to get the addresses of functions, the program would end up loading garbage addresses and trying to call any code that might be there at those addresses. So some fixes were required. Then, it appeared that the JZ4780 doesn’t support a particular MIPS instruction, meaning that the CPU would encounter this instruction and cause an exception. So, with some guidance, I wrote a handler to decode the instruction and generate the rather trivial result that the instruction should produce. There were also some more generic problems with the microkernel code that had previously been patched but which had not appeared in the upstream repository. But in the end, I got the “hello” program to run.

With a working foundation I tried to explore the hardware just as I had done with the Ben NanoNote, attempting to understand things like the clock and power management hardware, general purpose input/output (GPIO) peripherals, and also the Inter-Integrated Circuit (I2C) peripherals. Some assistance was available in the form of Linux kernel driver code, although the style of code can vary significantly, and it also takes time to “decode” various mechanisms in the Linux code and to unpick the useful bits related to the hardware. I had hoped to get further, but in trying to use the I2C peripherals to talk to my monitor using the DDC protocol, I found that the data being returned was not entirely reliable. This was arguably a distraction from the more interesting task of enabling the display, given that I know what resolutions my monitor supports.

However, all this hardware-related research and detective work at least gave me an insight into mechanisms – software and hardware – that would inform the effort to “decode” the vendor-written code for the Letux 400, making certain things seem a lot more familiar and increasing my confidence that I might be understanding the things I was seeing. For example, the JZ4720 in the Ben NanoNote arranges its hardware registers for GPIO configuration and access in a particular way, but the code written by the vendor for the JZ4730 in the Letux 400 accesses GPIO registers in a different way.

Initially, I might have thought that I was missing some important detail: are the two products really so different, and if not, then why is the code so different? But then, looking at the JZ4780, I encountered another scheme for GPIO register organisation that is different again, but which does have similarities to the JZ4730. With the JZ4780 being publicly documented, the code for the Letux 400 no longer seemed quite so bizarre or unfathomable. With more experience, it is possible to have a little more confidence in one’s understanding of the mechanisms at work.

I would like to spend a bit more time looking at microkernels and alternatives to Linux. While many people presumably think that Linux is running on everything and has “won”, it is increasingly likely that the Linux one sees on devices does not completely control the hardware and is, in fact, virtualised or confined by software systems like L4/Fiasco.OC. I also have reservations about the way Linux is developed and how well it is able to handle the demands of its proliferation onto every kind of device, many of them hooked up to the Internet and being left to fend for themselves.

Developing imip-agent

Alongside Lichen, a project that has been under development for the last couple of years has been imip-agent, allowing calendar-based scheduling activities to be integrated with mail transport agents. I haven’t been able to spend quite as much time on imip-agent this year as I might have liked, although I will also admit that I haven’t always been motivated to spend much time on it, either. Still, there have been brief periods of activity tidying up, fixing, or improving the code. And some interest in packaging the software led me to reconsider some of the techniques used to deploy the software, in particular the way scheduling extensions are discovered, and the way the system configuration is processed (since Debian does not want “executable scripts” in places like /etc, even if those scripts just contain some simple configuration setting definitions).

It is perhaps fairly typical that a project that tries to assess the feasibility of a concept accumulates the necessary functionality in order to demonstrate that it could do a particular task. After such an initial demonstration, the effort of making the code easier to work with, more reliable, more extensible, must occur if further progress is to be made. One intervention that kept imip-agent viable as a project was the introduction of a test suite to ensure that the basic functionality did indeed work. There were other architectural details that I felt needed remedying or improving for the code to remain manageable.

Recently, I have been refining the parts of the code that support editing of calendar objects and the exchange of updates caused by changes to calendar events. Such work is intended to make the Web client easier to understand and to expose such functionality to proper testing. One side-effect of this may be the introduction of a text-based client for people using e-mail programs like Mutt, as well as a potentially usable library for other mail clients. Such tidying up and fixing does not show off fancy new features or argue the case for developing such software in the first place, but I suppose it makes me feel better about the software I have written.

Whither Moin?

There are probably plenty of other little projects of my own that I have started or at least contemplated this year. And there are also projects that are not mine but which I use and which have had contributions from me over the years. One of these is the MoinMoin wiki software that powers a number of Free Software and other Web sites where collaborative editing is made available to the communities involved. I use MoinMoin – or Moin for short – to publish content on the Web myself, and I have encouraged others to use it in the past. However, it worries me now that maintenance has fallen to a level where updates for faults in the software are not likely to be forthcoming and where it is no longer clear where such updates should be coming from.

Earlier in the year, having previously read queries about the static export output from Moin, which can be rather basic and not necessarily resemble the appearance of the wiki the output came from, I spent some time considering my own use of Moin for documentation publishing. For some of my projects, I don’t take advantage of the “through the Web” editing of the solution when publishing the public documentation. Instead, I use Moin locally, store the pages in a separate repository, and then make page packages that get installed on a public instance of Moin. This means that I do not have to worry about Web-based authentication and can just have a wiki as a read-only resource.

Obviously, the parts of Moin that I really need here are just the things that parse the wiki formatting (which I regard as more usable than other document markup formats in various respects) and that format the content as HTML. If I could format it as static content with some pages, some stylesheets, some images, with some Web server magic to make the URLs look nice, then that would probably be sufficient. For some things like the automatic generation of SVG from Graphviz-format files, I would also need to have the relevant parsers available, too. Having a complete Web framework, which is what Moin really is, is rather unnecessary with these diminished requirements.

But I do use Moin as a full wiki solution as well, and so it made me wonder whether I shouldn’t try and bring it up to date. Of course, there is already the MoinMoin 2.0 effort that was intended to modernise and tidy up the software, but since this effort made a clean break from Moin 1.x, it was never an attractive choice for those people already using Moin in anything more than a basic sense. Since there wasn’t an established API for extensions, it was not readily usable for many existing sites that rely on such extensions. In a way, Moin 2 has suffered from something that Python 3 only avoided by having a lot more people working on it, including people being paid to work on it, together with a policy of openly shaming those people who had made Python 2 viable – by developing software for it – into spending time migrating their code to Python 3.

I don’t have an obvious plan of action here. Moin perhaps illustrates the fundamental problem facing many Free Software projects, this being a theme that I have discussed regularly this year: how they may remain viable by having people able to dedicate their time to writing and maintaining Free Software without this work being squeezed in around the edges of people’s “actual work” and thus burdening them with yet another obligation in their lives, particularly one that is not rewarded by a proper appreciation of the sacrifice being made.

Plenty of individuals and organisations benefit from Moin, but we live in an age of “comparison shopping” where people will gladly drop one thing if someone offers them something newer and shinier. This is, after all, how everyone ends up using “free” services where the actual costs are hidden. To their credit, when Moin needed to improve its password management, the Python Software Foundation stepped up and funded this work rather than dropping Moin, which is what I had expected given certain Python community attitudes. Maybe other, more well-known organisations that use Moin also support its development, but I don’t really see much evidence of it.

Maybe they should consider doing so. The notion that something else will always come along, developed by some enthusiastic developer “scratching their itch”, is misguided and exploitative. And a failure to sustain Free Software development can only undermine Free Software as a resource, as an activity or a cause, and as the basis of many of those organisations’ continued existence. Many of us like developing Free Software, as I hope this article has shown, but motivation alone does not keep that software coming forever.

Monday, 04 December 2017

Install Signal Desktop to Archlinux

Evaggelos Balaskas - System Engineer | 22:41, Monday, 04 December 2017

How to install Signal Desktop on Arch Linux.

Download Signal Desktop

e.g. the latest version, v1.0.41:

$ curl -s https://updates.signal.org/desktop/apt/pool/main/s/signal-desktop/signal-desktop_1.0.41_amd64.deb \
    -o /tmp/signal-desktop_1.0.41_amd64.deb

Verify Package

There is a way to manually verify the integrity of the package, by checking the hash value of the file against a GPG-signed file. To do that, we need to add a few extra steps to our procedure.

Download Key from the repository

$ wget -c https://updates.signal.org/desktop/apt/keys.asc

--2017-12-11 22:13:34--  https://updates.signal.org/desktop/apt/keys.asc
Loaded CA certificate '/etc/ssl/certs/ca-certificates.crt'
Connecting to 127.0.0.1:8118... connected.
Proxy request sent, awaiting response... 200 OK
Length: 3090 (3.0K) [application/pgp-signature]
Saving to: ‘keys.asc’

keys.asc                          100%[============================================================>]   3.02K  --.-KB/s    in 0s      

2017-12-11 22:13:35 (160 MB/s) - ‘keys.asc’ saved [3090/3090]

Import the key to your gpg keyring

$ gpg2 --import keys.asc

gpg: key D980A17457F6FB06: public key "Open Whisper Systems <support@whispersystems.org>" imported
gpg: Total number processed: 1
gpg:               imported: 1

You can also fetch and verify the public key from a known key server:

$ gpg2 --verbose --keyserver pgp.mit.edu --recv-keys 0xD980A17457F6FB06

gpg: data source: http://pgp.mit.edu:11371
gpg: armor header: Version: SKS 1.1.6
gpg: armor header: Comment: Hostname: pgp.mit.edu
gpg: pub  rsa4096/D980A17457F6FB06 2017-04-05  Open Whisper Systems <support@whispersystems.org>
gpg: key D980A17457F6FB06: "Open Whisper Systems <support@whispersystems.org>" not changed
gpg: Total number processed: 1
gpg:              unchanged: 1

Here the key is already in place, so there are no changes.

Download Release files

$ wget -c https://updates.signal.org/desktop/apt/dists/xenial/Release

$ wget -c https://updates.signal.org/desktop/apt/dists/xenial/Release.gpg

Verify Release files

$ gpg2 --no-default-keyring --verify Release.gpg Release

gpg: Signature made Sat 09 Dec 2017 04:11:06 AM EET
gpg:                using RSA key D980A17457F6FB06
gpg: Good signature from "Open Whisper Systems <support@whispersystems.org>" [unknown]
gpg: WARNING: This key is not certified with a trusted signature!
gpg:          There is no indication that the signature belongs to the owner.
Primary key fingerprint: DBA3 6B51 81D0 C816 F630  E889 D980 A174 57F6 FB06

That means the Release file is signed by Open Whisper Systems and the integrity of the file has not been changed or compromised.

Download Package File

We need one more file, the Packages file, which contains the hash values of the deb packages.

$ wget -c https://updates.signal.org/desktop/apt/dists/xenial/main/binary-amd64/Packages

But has this file been compromised?
Let’s check it against the Release file:

$ sha256sum Packages

ec74860e656db892ab38831dc5f274d54a10347934c140e2a3e637f34c402b78  Packages

$ grep ec74860e656db892ab38831dc5f274d54a10347934c140e2a3e637f34c402b78 Release

 ec74860e656db892ab38831dc5f274d54a10347934c140e2a3e637f34c402b78     1713 main/binary-amd64/Packages

yeay !

Verify deb Package

Finally, we are ready to manually verify the integrity of the deb package:

$ sha256sum signal-desktop_1.0.41_amd64.deb

9cf87647e21bbe0c1b81e66f88832fe2ec7e868bf594413eb96f0bf3633a3f25  signal-desktop_1.0.41_amd64.deb

$ egrep 9cf87647e21bbe0c1b81e66f88832fe2ec7e868bf594413eb96f0bf3633a3f25 Packages

SHA256: 9cf87647e21bbe0c1b81e66f88832fe2ec7e868bf594413eb96f0bf3633a3f25

Perfect, we are now ready to continue
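
As a side note, the manual checks above could be wrapped into a small script. The following is a rough, untested sketch (not part of the original procedure) which assumes keys.asc, Release, Release.gpg, Packages and the deb file have already been downloaded into the current directory:

#!/bin/sh
# Untested sketch: chain the verification steps shown above.
set -e

# Import the signing key and verify the signed Release file
gpg2 --import keys.asc
gpg2 --verify Release.gpg Release

# The Packages hash must appear in the signed Release file
grep -q "$(sha256sum Packages | awk '{print $1}')" Release

# The deb hash must appear in the verified Packages file
grep -q "$(sha256sum signal-desktop_1.0.41_amd64.deb | awk '{print $1}')" Packages

echo "All integrity checks passed."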

Extract under tmp filesystem

$ cd /tmp/

$ ar vx signal-desktop_1.0.41_amd64.deb

x - debian-binary
x - control.tar.gz
x - data.tar.xz

Extract data under tmp filesystem

$ tar xf data.tar.xz

Move Signal-Desktop under root filesystem

$ sudo mv opt/Signal/ /opt/Signal/

Done

Actually, that’s it!

Run

Run signal-desktop as a regular user:

$ /opt/Signal/signal-desktop

Signal Desktop

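Optionally (this step is not part of the original instructions), you could also put the launcher on your PATH so that Signal can be started simply as signal-desktop:

$ sudo ln -s /opt/Signal/signal-desktop /usr/local/bin/signal-desktop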

Proxy

Define your proxy settings in your environment:

declare -x ftp_proxy="proxy.example.org:8080"
declare -x http_proxy="proxy.example.org:8080"
declare -x https_proxy="proxy.example.org:8080"
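
These variables only affect programs started from the shell in which they are set, so (with proxy.example.org:8080 standing in for your real proxy) you would export them right before launching the client:

$ declare -x https_proxy="proxy.example.org:8080"
$ /opt/Signal/signal-desktop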

Signal



Friday, 01 December 2017

Hacking with posters and stickers

DanielPocock.com - fsfe | 20:27, Friday, 01 December 2017

The FIXME.ch hackerspace in Lausanne, Switzerland has started this weekend's VR Hackathon with a somewhat low-tech 2D hack: using the FSFE's Public Money Public Code stickers in lieu of sticky tape to place the NO CLOUD poster behind the bar.

Get your free stickers and posters

FSFE can send you these posters and stickers too.

Friday, 24 November 2017

Free software in the snow

DanielPocock.com - fsfe | 08:31, Friday, 24 November 2017

There are an increasing number of events for free software enthusiasts to meet in an alpine environment for hacking and fun.

In Switzerland, Swiss Linux is organizing the fourth edition of the Rencontres Hivernales du Libre in the mountain resort of Saint-Cergue, a short train ride from Geneva and Lausanne, 12-14 January 2018. The call for presentations is still open.

In northern Italy, not far from Milan (Malpensa) airport, Debian is organizing a Debian Snow Camp, a winter getaway for developers and enthusiasts in a mountain environment where the scenery is as diverse as the Italian culinary options. It is hoped the event will take place 22-25 February 2018.

Wednesday, 22 November 2017

VR Hackathon at FIXME, Lausanne (1-3 December 2017)

DanielPocock.com - fsfe | 19:25, Wednesday, 22 November 2017

The FIXME hackerspace in Lausanne, Switzerland is preparing a VR Hackathon on the weekend of 1-3 December.

Competitors and visitors are welcome, please register here.

Some of the free software technologies in use include Blender and Mozilla VR.

Wednesday, 15 November 2017

Linking hackerspaces with OpenDHT and Ring

DanielPocock.com - fsfe | 19:57, Wednesday, 15 November 2017

Francois and Nemen at the FIXME hackerspace (Lausanne) weekly meeting are experimenting with the Ring peer-to-peer softphone:

Francois is using Raspberry Pi and PiCam to develop a telepresence network for hackerspaces (the big screens in the middle of the photo).

The original version of the telepresence solution is using WebRTC. Ring's OpenDHT potentially offers more privacy and resilience.

KVM-virtualization on ARM using the “virt” machine type

Daniel's FSFE blog | 17:20, Wednesday, 15 November 2017

Introduction

A while ago, I described how to run KVM-based virtual machines on libre, low-end virtualization hosts on Debian Jessie [1]. For emulating the ARM board, I used the vexpress-a15 machine type, which complicates things as it requires the specification of compatible DTBs. Recently, I came across Peter’s article [2], which describes how to use the generic “virt” machine type instead of vexpress-a15. This promises to give some advantages, such as the ability to use PCI devices, and makes the process of creating VMs much easier.

It was also reported to me that my instructions cause trouble on Debian Stretch (virt-manager generates incompatible configs when choosing the vexpress-a15 target). So I spent some time trying to find out how to run VMs using the “virt” machine type with virt-manager (Peter’s article only described the manual way using command-line calls). This involved several traps, so I decided to write up this article. It gives a brief overview of how to create a VM using virt-manager on an ARMv7 virtualization host such as the Cubietruck or the upcoming EOMA68-A20 computing card.

Disclaimer

All data and information provided in this article is for informational purposes only. The author makes no representations as to accuracy, completeness, currentness, suitability, or validity of any information on this article and will not be liable for any errors, omissions, or delays in this information or any losses, injuries, or damages arising from its display or use. All information is provided on an as-is basis.

In no event will the author be liable for any loss or damage, including without limitation indirect or consequential loss or damage, or any loss or damage whatsoever arising from loss of data or profits arising out of, or in connection with, the use of this article.

Tested VMs

I managed to successfully create and boot up the following VMs on a Devuan (Jessie) system:

  • Debian Jessie Installer
  • Debian Stretch Installer
  • Debian Unstable Installer
  • Fedora Core 27 (see [3] for instructions on how to obtain the necessary files)
  • Arch Linux, using the latest ARMv7 files available at [4]
  • LEDE 17.0.1.4

I was able to reproduce the steps for the Debian guests on a Debian Stretch system as well (I did not try with the other guests).

Requirements / Base installation

This article assumes you have setup a working KVM virtualization host on ARM. If you don’t, please work through my previous article [1].

Getting the necessary files

Depending on the system you want to run in your Guest, you typically need an image of the kernel and the initrd. For the Debian-unstable installer, you could get the files like this:

wget http://http.us.debian.org/debian/dists/unstable/main/installer-armhf/current/images/netboot/vmlinuz -O vmlinuz-debian-unstable-installer

and

wget http://http.us.debian.org/debian/dists/unstable/main/installer-armhf/current/images/netboot/initrd.gz -O initrd-debian-unstable-installer.gz

Creating the Guest

Now, fire up virt-manager and start the wizard for creating a new Guest. In the first step, select “Import existing disk image” and the default settings, which should use the Machine Type “virt” already:

In the second step, choose a disk image (or create one) and put in the paths to the kernel and initrd that you downloaded previously. Leave the DTB path blank and put “console=ttyAMA0” as kernel arguments. Choose an appropriate OS type or just leave the default (which may negatively impact the performance of your guest, or cause other issues, such as your virtual network card not being recognized by the installer):

Next, select memory and CPU settings as required by your Guest:

Finally, give the VM a proper name and select the “Customize configuration before install” option:

In the machine details, make sure the CPU model is “host-passthrough” (enter it manually if you can’t select it in the combo box):

In the boot options tab, make sure the parameter “console=ttyAMA0” is there (otherwise you will not get any output on the console). Depending on your guest, you might also need more parameters, such as for setting the rootfs:

Finally, click “begin installation” and you should see your VM boot up:
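
For reference, roughly the same guest could presumably also be created non-interactively with virt-install. The following is an untested sketch rather than a verified recipe; the VM name, disk path and sizes are placeholders that you would adapt to your own setup:

virt-install \
  --connect qemu:///system \
  --name debian-unstable-virt \
  --arch armv7l --machine virt \
  --cpu host-passthrough \
  --memory 512 --vcpus 2 \
  --import \
  --disk path=/var/lib/libvirt/images/debian-unstable.img,size=8 \
  --boot kernel=/path/to/vmlinuz-debian-unstable-installer,initrd=/path/to/initrd-debian-unstable-installer.gz,kernel_args="console=ttyAMA0" \
  --graphics none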

Post-Installation steps

Please note that after installing your guest, you must extract the kernel and initrd from the installed guest image (you want to boot the real system, not the installer) and change your VM configuration to use these files instead.

Eventually, I will provide instructions on how to do this for a few guest types. Meanwhile, you can find instructions for extracting the files from a Debian guest in Peter’s article [2].

[1] https://blogs.fsfe.org/kuleszdl/2016/11/06/installing-a-libre-low-power-low-cost-kvm-virtualization-host/
[2] https://translatedcode.wordpress.com/2016/11/03/installing-debian-on-qemus-32-bit-arm-virt-board
[3] https://fedoraproject.org/wiki/QA:Testcase_Virt_ARM_on_x86
[4] http://os.archlinuxarm.org/os/ArchLinuxARM-armv7-latest.tar.gz

Monday, 13 November 2017

Software freedom in the Cloud

English on Björn Schießle - I came for the code but stayed for the freedom | 22:00, Monday, 13 November 2017

Looking for Freedom

How to stay in control of the cloud? - Photo by lionel abrial on Unsplash

What does software freedom actually mean in a world where more and more software no longer runs on our own computers but in the cloud? I have been thinking about this topic for quite some time, and from time to time I run into discussions about it, for example a few days ago on Mastodon. Therefore I think it is time to write down my thoughts on this topic.

“Cloud” is a huge marketing term which can mean a lot of things. In the context of this article, the cloud is understood as something quite similar to SaaS (software as a service). This article will use the two terms interchangeably, because these are also the two terms the Free Software community uses to discuss this topic.

The original idea of software freedom

In the beginning, all software was free. In the 80s, when computers became widely used and people started to make software proprietary in order to maximise their profit, Richard Stallman came up with an incredible hack. He used copyright to re-establish software freedom by defining these four essential freedoms:

  1. The freedom to run the software for every purpose
  2. The freedom to study how the program works and adapt it to your needs
  3. The freedom to distribute copies
  4. The freedom to distribute modified versions of the program

Every piece of software licensed in a way that grants the user these four freedoms is called Free Software. These are the basic rules to establish software freedom in the world of traditional computing, where the software runs on our own devices.

Today almost no company can exist without using at least some Free Software. This huge success was possible due to a pragmatic move by Richard Stallman, driven by a vision of how a freedom-respecting software world should look. His idea was the starting point for a movement which came up with a completely new set of software licenses and various Free Software operating systems. It enabled people to continue to use computers in freedom.

SaaS and the cloud

Today we no longer have just one computer. Instead we have many devices such as smartphones, tablets, laptops, smartwatches, small home servers, IoT devices and maybe still a desktop computer at the office. We want to access our data from all these devices and switch between them seamlessly while we work. That’s one of the main reasons why software as a service (SaaS) and the cloud became popular: the software runs on a server and all the devices can connect to it. But of course this comes with a price; it means that we are relying more and more on someone else’s computer instead of running the programs on our own computer. We lose control. This is not completely new. Some of these solutions are quite old, others are rather new; examples include mail servers, social networks, source code hosting platforms, file sharing services, platforms for collaborative work and many more. Many of these services are built with Free Software, but the software only runs on the server of the service provider, so the freedom never arrives at the user. The user stays helpless. We hand over our data to servers we don’t control. We have no idea what happens to our data, and for many services we have no way to get our data out of the service again. Even if we can export the data, we are often helpless, because without the software which runs the original service we can’t perform the same operations on our own servers.

We can’t turn back the time

We can’t stop the development of such services. History tells us that we can’t stop technological progress, whether we like it or not. Telling people not to use these services will not have any notable impact. Quite the opposite: we, the Free Software movement, would lose the reputation we have built over the last decades, and with it any influence. We would no longer be able to change things for the better. Think again about what Richard Stallman did about thirty years ago. He grew up in a world where software was free by default. When computers became a mass-market product, more and more manufacturers turned software into a proprietary product. Instead of developing the powerful idea of Free Software, Richard Stallman could have decided to no longer use these modern computers and asked people to follow him. But would many people have joined him? Would it have stopped the development? I don’t think so. We would still have all the computers as we know them today, but without Free Software.

That’s why I strongly believe that, like thirty years ago, we again need a constructive and forward-looking answer to the new challenges brought to us by the cloud and SaaS. We, the Free Software community, need to be the driving force that leads this new way of computing in a direction that respects the users’ freedom, just as Richard Stallman did back then by starting the Free Software movement. All this is done by people, so it’s people like us who can influence it.

Finding answers to these questions requires us to think in new directions. The software license is still the cornerstone: without the software being Free Software, everything else is void. But being Free Software is by no means enough to establish freedom in the world of the cloud.

What does this mean to software freedom?

Having a close look at cloud solutions, we realise that most of the time they contain two categories of software: software that runs on the server itself, and software served by the server but executed on the user’s computer, typically JavaScript.

Following the principle of the well-established definition of software freedom, the software distributed to the user needs to be Free Software. I would call this the necessary precondition. But by just looking at the license of the JavaScript code, we are trying to solve today’s problems with the tools of the past, completely ignoring that in the world of SaaS your computer is no longer the primary device. Getting the source code of the JavaScript under a Free Software license is nice, but it is not enough to establish software freedom. The JavaScript is tightly connected to the software which runs on the server, so users can’t change it much without breaking the functionality of the service. Further, with each page reload the user gets the original version of the JavaScript again. This means that, with respect to the user’s freedom, access to the JavaScript code alone is insufficient. Free JavaScript has mainly two benefits: first, the user can study the code and learn how it works, and second, maybe reuse parts of it in their own projects. But to establish real software freedom, a service needs to fulfil more criteria.

The user needs access to the whole software stack, both the software which runs on the server and the software which runs in the browser. Without the right to use, study, share and improve the whole software stack, freedom will not be possible. That’s why the GNU AGPLv3 is incredibly important. Without going into too much detail, the big difference is how the license defines the meaning of “distribute”. This term is critical to the Free Software definition: it defines at which point the rights to use, study, share and improve the software get transferred to a user. Typically that happens when the user gets a copy of the software. But in the world of SaaS you no longer get a real copy of the software; you just use it over a network connection. The GNU AGPLv3 makes sure that this kind of usage already entitles you to the source code. Only if both the software which runs on the server and the software which runs in the browser are Free Software can users start to consider exercising their freedom. Therefore my minimal definition of a freedom-respecting service would be that the whole software stack is Free Software.

But I don’t think we should stop here. We need more in order to drive innovation forward in a freedom-respecting way. This is also important because various software projects are already working on it. Telling them that these extra steps are only “nice to have” but not really important sends the wrong message.

If the whole software stack is Free Software, we have achieved the minimum requirement to allow everyone to set up their own instance. But in order to avoid building many small islands, we need to enable the instances to communicate with each other, a feature called federation. We already see this in the area of freedom-respecting social networks and in the area of file sync and share. About a year ago I wrote an article arguing that this is a feature needed for next-generation code hosting platforms as well, and I’m happy to see that GitLab has started to look into exactly this. Only if many small instances can communicate with each other, completely transparently for the user so that it feels like one big service, does exercising your freedom to run your own server become really interesting. Think for a moment about the World Wide Web: if you browse the Internet, it feels like one gigantic universe, the web. It doesn’t matter whether the page you navigate to is located on the same server or on a different one, thousands of kilometres away.

If we reach the point where the technology is built and licensed in a way that lets people decide freely where to run a particular service, there is one missing piece: we need a way to migrate from one server to another. Let’s say you start using a service provided by someone, but at some point you want to move to a different provider or decide to run your own server. In this case you need a way to export your data from the first server and import it into the new one, ideally in a way which allows you to keep the connection to your friends and colleagues, in the case of a service which provides collaboration or social features. Initiatives like the User Data Manifesto have already thought about this and given some valuable answers.

Conclusion

How do we achieve practical software freedom in the world of the cloud? In my opinion these are the cornerstones:

  1. Free Software: the whole software stack, meaning the software which runs on the server and in the user’s browser, needs to be free. Only then can people exercise their freedom.

  2. Control: people need to stay in control of their data and need to be able to export and import it in order to move.

  3. Federation: being able to exercise your freedom to run your own instance of a service without creating small islands and losing the connection to your friends and colleagues.

This is my current state of thinking on this subject. I’m happy to hear more opinions about this topic.

Introducing: forms

free software - Bits of Freedom | 17:30, Monday, 13 November 2017


In this post, I will introduce you to the FSFE's forms API, a way to send emails and manage sign-ups on web pages used in the FSFE community.

For our Public Money, Public Code campaign, as well as for two of our other initiatives which launched not too long ago (Save Code Share and REUSE), we needed a way to process form submissions from static web sites in a way which allowed for some templating and customisation for each web site.

This is not a new problem by any means, and there are existing solutions like formspree and Google Forms which allow you to submit a form to another website and get a generated email or similar with the results. Some of these are proprietary, others not.

We decided to expand upon the idea of formspree and create a utility which not only turns form submissions into emails, but also allows for storing submissions in JSON format (so we can easily process them afterwards) and for customising the mails sent. So we built forms.

The idea is simple: on a static website, put a <form> whose action submits it to forms.fsfe.org. Our forms system then processes the request, acts according to the configuration for that particular website, and then redirects the user back to the static website again.

Some of the use cases where this can be employed are:

  • Signup for a newsletter or mailing list
  • Adding people to an open letter
  • Sending templated e-mails to politicians on behalf of others
  • Contact forms of various kinds

Each of these can be made to happen with or without confirmation of an email address. We typically require confirmed e-mail addresses though, and this is the recommended practice. It means the person submitting a form will get an email asking them to click a link to confirm their submission. Only when they click that link will any action be taken.

Here's what a form on a webpage could look like:

<form method="POST" action="https://forms.fsfe.org/email">  
  <input type="hidden" name="appid" value="totick2">
  Your name: <input type="text" name="name">
  Your e-mail: <input type="email" name="from" />
  Your message: <input type="text" name="msg" />
  <input type="submit" value="Send" />
</form>

You will notice that aside from the text and email fields, which the visitor can fill in, there's also a hidden field called appid. This is the identifier used in the forms API to separate different forms, to know how to behave in each case, and to know which templates to use.
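
For illustration (this example is not from the original post), such a submission is just an ordinary POST request, so the same form data could in principle also be sent from the command line:

$ curl -X POST https://forms.fsfe.org/email \
    -d appid=totick2 \
    -d name="Jane Doe" \
    -d from="jane@example.com" \
    -d msg="Hello FSFE"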

The configuration, if you want to have a look at it, is in https://git.fsfe.org/FSFE/forms/src/master/src/configuration/applications.json. For a simple contact form, it can look like this:

  "contact": {
    "to": [
      "contact@fsfe.org"
    ],
    "include_vars": true,
    "redirect": "https://fsfe.org/contact/contact-thankyou",
    "template": "contact-form"
  },

This does not have any confirmation of the sender's email, and simply says that upon submission, an email should be sent to contact@fsfe.org using the template contact-form, including the extra variables (like name, email, etc.) which were included in the form. The submitter is then redirected to https://fsfe.org/contact/contact-thankyou, where there would presumably be some thank-you note.

The templates are a bit magical, and they are defined using a two-step process. First, you give the identifier of the template in applications.json. Then, in templates.json in the same directory, you define the actual template:

  "contact-form": {
    "plain": {
      "content": "{{ msg }}. Best regards, {{ name }}"
    },
    "required_vars": [
      "msg", "name"
    ]
  },

This simply says that we need the msg and name variables from the form, and we include them in the content of the email which is sent on submit. You can also specify an html fragment, which would then complement the plain part, and instead of content you can specify filename, so the template isn't included in the JSON but loaded from an external file.
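
As an illustration only (this entry is not taken from the actual FSFE configuration, and the exact key name for the HTML part is my assumption), a template using external files might look roughly like this:

  "contact-form": {
    "plain": {
      "filename": "contact-form.txt"
    },
    "html": {
      "filename": "contact-form.html"
    },
    "required_vars": [
      "msg", "name"
    ]
  },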

Now, back to our web form. The form we created contained from, name and msg input fields. The latter two were created by us for this particular form, but from is part of a set of form values which control the behaviour of the forms API.

In this case, from is understood by the forms API to be the value of the From: field of the email it is supposed to send. This variable can be set either in the <form> as an input variable, hidden field, or similar, or it can be set in the application config.

If it appears in the application config, this takes precedence and anything submitted in the form with the same name will be ignored. These are the variables which can be included either in the <form> or in the application config:

  • from
  • to
  • reply-to
  • subject
  • content
  • template

Each of them does pretty much what it says: it defines the headers of the email sent, or the template or content to be used for the email.

The application config in itself can define a number of additional options, which control how the forms API functions. The most frequently used ones are given below (you can see the whole list and some examples in the README).

  • include_vars, which we also touched upon, makes extra variables from the form available to the template if set to true.
  • confirm, if set to true, means the form must contain an input field called confirm with a valid email address, and that this address will receive a confirmation mail with a link to click before any action is taken.
  • redirect is the URL to which to redirect the user after submitting the form.
  • redirect-confirmed is the URL to which to redirect the user after clicking a confirmation link in an email.
  • confirmation-template is the template id for the confirmation mail.
  • confirmation-subject is the subject of the confirmation mail.

Let's look at a more complete example. We can use the form which at one point was used to sign people up for a mailing list about our REUSE initiative.

<form method="post" action="https://forms.fsfe.org/email">  
  <input type="hidden" name="appid" value="reuse-signup" />
  <input type="email" name="confirm" size="45" id="from" placeholder="Your email address" /><br />
  <input type="checkbox" name="optin" value="yes"> I agree to the <a href="/privacy">privacy</a> policy.
  <input type="submit" value="Send" /><br />
</form>  

You can see here that we use the trick of naming the email field confirm in order to use the confirmation feature. Otherwise, we mostly include a checkbox called optin for the user to confirm they've agreed to the privacy policy.

The application config for this is:

  "reuse-signup": {
    "from": "no-reply@fsfe.org",
    "to": [ "jonas@fsfe.org" ],
    "subject": "New signup to REUSE",
    "confirm": true,
    "include_vars": true,
    "redirect": "https://reuse.software/sign-confirm",
    "redirect-confirmed": "https://reuse.software/sign-success",
    "template": "reuse-signup",
    "store": "/store/reuse/signups.json",
    "confirmation-template": "reuse-signup-confirm",
    "confirmation-subject": "Just one step left! Confirm your email for REUSE!"
  }

What this says is that upon submission, we want to confirm the email address (confirm == TRUE). So when someone submits this form, instead of acting on it right away, the system will redirect the user to a webpage (redirect) and send an email to the address entered into the webform. That email will have the subject Just one step left! Confirm your email for REUSE! (confirmation-subject) and the following content (which is defined in the templates):

Thank you for your interest in our work on making copyrights and licenses computer readable!

There's just one step left: you need to confirm we have the right email for you, so we know where to reach you to stay in touch about our work on this.

Click the link below and we'll be on our way!

  {{ confirmation_url }}


Thank you,

Jonas Öberg  
Executive Director

FSFE e.V. - keeping the power of technology in your hands.  
Your support enables our work, please join us today http://fsfe.org/join  

The confirmation_url will be replaced with the URL the submitter has to click.

When the submitter gets this mail and clicks on the link given, they will be redirected to another website (redirect-confirmed), and an email will be sent according to the specifications:

The content of this mail will be

{{ confirm }};{{ optin }}

For instance:

jonas@example.com;yes  

Since it's an email to myself, I didn't bother to make it fancy. But we could easily have written a template such as:

Great news! {{ confirm }} just signed up to hear more from your awesome project!  

And that's it. But if you paid attention, you'll have noticed another defined variable which we didn't explain yet:

    "store": "/store/reuse/signups.json",

This isn't terribly useful unless you have access to the server where the API runs, but essentially, it makes sure that not only is an email sent to me upon confirmation, but the contents of that email and all the form variables are also stored away in a JSON structure, which we can then use to do something more automated with the information.

If you're interested in the code behind this, it's available in Git, where there are also a bunch of issues for improvement we've thought about. In the long term, it would be nice if:

  • the confirmation and templating options which grew over time were reviewed to make them clearer; right now, adding a template often involves messing with three files (the application config, the template config, and the template file itself),
  • the storage of the signups were accessible without having direct access to the server running it.

But those are bigger tasks, and at least for now, the forms API does what it needs to do. If you want to use it too, the best way would be to clone the Git repository, update the application and template config and send a pull request. We can help you review your configuration and merge it into master.

The 2% discussion - "Free Software" or "Open Source Software"

Matthias Kirschner's Web log - fsfe | 07:34, Monday, 13 November 2017

Scott Peterson from Red Hat this week published an article "Open Source or Free Software". It touches on a very important misunderstanding; people still believe that the terms "Open Source Software" and "Free Software" are referring to different software: they are not! Scott asked several interesting questions in his article and I thought I should share my thoughts about them here and hopefully provoke some more responses on an important topic.

Is it a car?

The problem described in the article is that "Free Software" and "Open Source Software" are associated with certain values.

One of the questions was whether it would be useful to have a neutral term to describe this software. Yes, I think so; it would be useful. The question which I read between the lines is: is it possible to have a neutral term? ("Or is the attempt to separate the associated values a flawed goal?") Here I see a huge challenge, and I doubt it is possible. Almost all terms develop a connection with values over time.

In my talks I often use the example of a car. A lot of people say "car", but there are still many other terms which are used depending on the context, e.g. the brand name. They say let's take my (Audi, BMW, Mercedes, Peugeot, Porsche, Tesla, Toyota, Volkswagen, ...). This puts the emphasis on the manufacturer. Some people might call it an "auto"; if you call it an "automobile", "vehicle", "vessel", "craft", or "motor-car" you might have different values, or at least be perceived in a different way (maybe "old-school" or conservative), than someone who always calls it a "car". Some go with "computer on four wheels", which highlights other aspects again, and is also not neutral, as you most likely want to shift the focus to a certain aspect.

Which brings me to Scott's next question: "What if someone wants to refer to this type of software without specifying underlying values?" I doubt it will be possible to find such a term and keep it neutral in the long run, especially if there is already an existing term. You have to explain to people why they should use the new term, and it is difficult to do that without associating the new term with a value opposite to that of the existing term ("so you don't agree that freedom / availability of source code is important?").

As a side note, Scott mentioned FOSS or FLOSS as possible neutral terms. This might work (and for some projects the FSFE also used those terms, as some organisations would otherwise not have participated). It might also mean that people who prefer "Free Software" and people who prefer "Open Source Software" will both be unhappy with you. The problem I see with the combined terms FOSS and FLOSS is that they deepen the misunderstanding that Open Source Software and Free Software are different software. Why else would you have to combine them? (Would you say "car automobile vehicle"? If you did, would you be seen as more neutral?)

My main question is: do we really need a neutral term? Why can't everybody just choose the term which is closer to their values? Whatever term someone else is using, we should treat them with respect and without any prejudice. Instead of trying to find something neutral, shouldn't we work on making that happen?

Why is it a problem if one person uses Free Software and another uses Open Source Software, if we agree that it is the same thing we are talking about? Do we see a problem if one person says car and the other vehicle? (It would be different if people could not agree whether that thing is a BMW or a Volkswagen.)

I would be interested in your thoughts about this.

Besides that, a lot of those discussions happen at an expert level and sometimes assume that other people also choose those terms deliberately. I want to challenge that assumption. I have met many people who use one of the terms, and after talking with them I realised that they are more in line with the values I would have associated with the other term. That is why I think it is important to keep in mind that you will most likely not know the values of your conversation partner just from them saying "Open Source" or "Free Software"; you need to invest more effort to understand the other person.

There are also many people in our community who use completely different terms, as they mainly speak in their native language, which is not English. They might say Vapaat ohjelmistot, Logiciels Libres, Software Libero, Ελεύθερο Λογισμικό, Fri software, Software Libre, Özgür Yazılım, Fri mjukvara, Software-i i Lirë, Свободные программы, 自由暨開源軟, Software Livre, Freie Software, Offene Software, ... Some of them might have a slightly different meaning than the corresponding English term. What values do the people who use them have? And if we assume we could find a neutral English term, would we ever find neutral words for people who do not speak English?

Let's also keep in mind that there are people discussing underlying principles and values without using any of the terms "Open Source Software", "Free Software", "Libre Software", FOSS, or FLOSS. They rather discuss the principles by saying: we need to make sure the software does not restrict what we can do with it, or for how long. We need to be able to understand what it does, or to ask others to do so, without asking anyone else for permission. We should be able to give it to our business partners or put it on as many of our servers / products as we want, scale our installations, ... without restrictions. We need to make sure that we, or someone else, can always adapt the software to our (new or changing) needs.

They might never once mention any of the terms, although Free Software is the solution for those topics. They might discuss them under labels like digital sovereignty, digital sustainability, vendor neutrality, agility, reliability or other terms. I am sure that if a concept is successful, this will often happen -- and it is not a bad sign when it does. So we do not have to see it as a problem if someone else uses a different term than we do ourselves, especially if they agree with us on most of our goals and values.

Finally, my biggest concern is people who (deliberately or by mistake) say something is Free Software or Open Source Software when the software simply is not. And let us not forget the more than 98% of people around the world who do not know that Free Software -- or whatever else you call it -- exists, or what exactly it means. For me, that is the part we have to concentrate our efforts on.

Thanks for reading and I am looking forward to your comments.

PS: On this topic I highly recommend Björn Schießle, 2012, "Free Software, Open Source, FOSS, FLOSS - same same but different", which we use heavily in the FSFE when people have questions about the topic (thanks to the FSFE's translation team, the article is now available in four languages). You might also be interested in Simon Phipps, 2017, "Free vs Open".

Saturday, 11 November 2017

Digital preservation of old crap

free software - Bits of Freedom | 13:17, Saturday, 11 November 2017

I've collected a lot of crap over the years. Most of it in subdirectories of subdirectories. Of subdirectories of subdirectories. I recently made some useful discoveries in /home/jonas/own/_private/Arkiv/Ancient/Arkiv/ancient-archive/Salvage/misc/14. The stash of documents in this place originated on old floppy disks from my youth, which I salvaged at some point and placed in an archive directory. Which got placed in another archive directory. Which was ancient, so I placed it in an ancient directory, which was placed in an archive directory.

Over the years, I've made some attempts at sorting this out, and possibly around 7 years ago I even made a tool which would help me tag and index archived material. It didn't last long. But it itself contains a handful of archived documents which I clearly felt were important at the time: notes from the FSFE General Assembly in Manchester in 2006, a Swedish translation of the Tibetan song Rangzen, and then this:

 Archive ID: 2d8f7304
Description: Kiosk computer image for Tekniska Museet

#  Filename             Filetype                  Tags                          
-------------------------------------------------------------------------------
1  kiosk.zip            application/zip           work[gu, fsfe]  
-------------------------------------------------------------------------------

This is the image file used for an exhibition at the Swedish National Museum of Science and Technology for which I once helped create a Freedom Toaster. I doubt it has any historical value, but I couldn't manage to part with it. And this is how you end up with paths such as ./Arkiv/Ancient/Ancient/Programming/Gopher/GopherHistory/data/raw/gopher.coyote.org/70/0/Slashdot/Archive/1999-08-14/

(That directory contains a copy of an 18 year-old Slashdot article talking about how SCO might start offering Linux support. The article was snarfed up, archived and included in my Gopher mirror of Slashdot at the time.)

Either way, back to the point of this posting: I'm looking for recommendations. What I would like to have is a tool which would allow me to organise my archive in some sensible way. I feel a need to be able to add tags (like my previous tool did), but I also feel I need to add more metadata and stories to it.

The entire Gopher project, I would probably wrap into one big file and archive it as a collection. But I would want to add to this some information about what that collection actually contains, when it's from, and how I ended up having it.

Ideally in a way such that parts of the archive which are public, and which could be interesting for others, can easily and automatically be published in an inviting way.

Let me have your thoughts. Do I really need to look at tools such as Omeka or Collective Access or can I wing it, and avoid having to pay for an archivist?

Tuesday, 07 November 2017

Software Archaeology

David Boddie - Updates (Full Articles) | 23:32, Tuesday, 07 November 2017

Just over 21 years ago I took a summer job between university courses. Looking back at it now I find it surprising that I was doing contract work. These days I tend to think that I'm not really cut out for that kind of thing but, when you're young, you tend to think you can do anything. Maybe it's just a case of having enough confidence, even if that can get you into trouble sometimes.

The software itself was called Zig Zag - Ancient Greeks and was written for the Acorn RISC OS platform that, in 1996, was still widely used in schools. Acorn had dominated the education market since the introduction of the BBC Micro in the early 1980s but the perception of the PC, particularly in its Windows incarnation, as an "industry standard" continuously undermined Acorn's position with decision-makers in education. Although Acorn released the RiscPC in 1994 with better-than-ever PC compatibility, it wasn't enough to halt the decline of the platform and, despite a boost from the launch of a StrongARM CPU upgrade in 1996, the original lifespan of the platform ended in 1998.

The history of the platform isn't really very relevant, except that Acorn's relentless focus on the education market, while potentially lucrative for the company, made RISC OS software seem a bit uncool to aspiring students and graduates. Perhaps that might explain why I didn't seem to face much competition when I applied for a summer job writing an educational game.

Back to BASICs

This article isn't about the design of the software, or the process of making it, though maybe I should try and make an effort to dig through the sources a bit more. Indeed, because the game was written in a dialect of BBC BASIC called ARM BASIC which was the standard BASIC on the Archimedes series of computers, and fortunately wasn't obfuscated, it's still possible to look at it today. Today, the idea of writing a multi-component, multi-tasking educational experience in BASIC makes me slightly nervous. However, at that time in my life, I was very comfortable writing non-trivial BASIC programs and, although a project of this scope and complexity wasn't something I'd done before, it just seemed more of a challenge than anything else.

Apart from some time spent in the Logotron office at the beginning and end of the project, most of the work was done from home with floppy disks and documents being sent back and forth between Nicola Bradley, the coordinator, and myself. I would be told what should happen in each activity, implement it, send it back, and get feedback on what needed changing. Despite mostly happening remotely and offline, it all got done fairly quickly. It wasn't really any more efficient when I was in Cambridge working in the office.

Everything you need to make olive oil.

The other people involved in the project were also working remotely, so being in the office didn't mean that I would be working alongside them. I only met the artist, Howard Taylor, when Nicola and I went to discuss some work with him. I didn't meet Peter Oxley, the historian responsible for the themes and accuracy of the software, at all. In some ways, apart from the ongoing discussion with Nicola about each revision of the activities, it was Howard with whom I was working most closely. The graphics he created are very much of their time - for a screen of 640 by 256 pixels with 16 colours - but still charming today.

One of the limitations that we encountered was that the software needed to fit onto two 800K floppy disks. Given that all the artwork had been created and tested for each individual activity, and we couldn't do much about the code to implement the behaviour of the activities, that required some kind of compression. I wanted to use the Squash tool that Acorn supplied with their operating system but this apparently wasn't an option. Perhaps Acorn couldn't sublicense its distribution - it was based around the LZW algorithm which was presumably affected by patents in the UK. We ended up using a tool with a fairly vague, permissive license to compress the images and shipped the corresponding decompression code with the software. I believe that the algorithm used was Lempel-Ziv with Huffman coding, though I would have to disassemble the code to find out because it was only supplied in binary form.

As you can see above, I have a way of viewing these images today. As the author of the software, I had the original images but I wanted to view the ones that had been compressed for the release version of the software. This required the use of the RPCEmu emulator to execute a few system calls to get the original images out of the compressed data. However, once extracted, how can we view images stored in an old, proprietary file format?

Worlds Collide

Fortunately, I prepared the ground for handling images in this format a long time ago. My Spritefile Python module was created many years ago so that I could access images I wanted to keep from my teenage years. I've used it in other projects, too, so that I could view more complex files that used this format to store bitmapped images.

In keeping with my more recent activities, I wanted to see if I could create an application for Android that allows the user to browse the contents of any spritefiles they might still have. Since I'm stubborn and have my own toolchain for writing applications on Android, this meant writing it in a Python-like language, but that worked to my advantage since the Spritefile module is written in Python. It just meant that I would have to fix it up so that it fitted into the constraints imposed by the runtime environment on Android.

Blessed are the cheesemakers.

However, things are never quite that simple, though it has to be said that ensuring that the core algorithms ran on Android was a lot easier than getting the application's GUI to behave nicely, and certainly easier than getting the application to run when the user tries to open a file with the appropriate file extension. Free Software desktops are so much more advanced than Android in this regard, and even old RISC OS has better support for mapping between MIME types and file extensions!

I've put the source code for the Sprite Viewer application up in a Mercurial repository. Maybe I'll create a binary package for it at some point. Maybe someone else will find it useful, or perhaps it will bring back fond memories of 1990s educational computing.

Categories: Python, Android, Free Software

Monday, 06 November 2017

Background for future changes to membership in FSFE e.V.

Repentinus » English | 22:25, Monday, 06 November 2017

At the general assembly in October the Executive Council sought the members’ consent to simplify and streamline the route to membership in FSFE e.V. The members gave it, and as a consequence, the Executive Council will prepare a constitutional amendment to remove the institution of Fellowship Representatives at the next general assembly. If this constitutional amendment is accepted, active volunteers meeting a yet-to-be-decided threshold will be expected to directly apply for membership in the FSFE e.V. The Executive’s reasoning for moving in this direction can be found below.

For the reasons listed below, the Council believes that the institution of Fellowship Representatives has ceased to serve its original purpose (and may indeed have never served its intended purpose). In addition, it has become a tool for arbitrarily excluding active contributors from membership, and has thus become harmful to the future development of the organization. Wherefore, the Council believes that the institution of Fellowship Representatives should be removed and asks for the members’ consent in preparing a constitutional amendment to eliminate the institution and resolve the future status of Fellowship Representatives in office at the time of removal. The proposal would be presented to the General Assembly for adoption at the next ordinary meeting.

The Council believes the following:

1) The Fellowship Representatives were introduced for the purpose of giving FSFE’s sustaining donors (known as the Fellowship) a say in how FSFE is operated. This is almost unprecedented in the world of nonprofits, and our community would have been justly outraged if we had introduced similar representation for corporate donors.

2) The elections have identified a number of useful additions to the GA. Most of them can be described as active volunteers with FSFE before their election. The Council believes that by identifying and encouraging active contributors to become GA members and better documenting the procedure of becoming a member, the FSFE would have attracted the same people.

3) We should either agree on including volunteers whose contribution exceeds a certain threshold (core team membership? local/topical team coordinatorship? active local/topical team contributor for a year? – the threshold is entirely up for debate) as members, or we should decline to extend membership on the basis of volunteering. It is simply wrong to pit volunteers against each other in a contest where a mixture of other volunteers and a minuscule fraction of solely financial contributors decide which of our volunteers are most deserving of membership. This unfortunate mechanism has excluded at least one current GA member from membership for several years, and it has been used to discourage a few coordinators from applying for membership in the past.

4) Reaching consensus on removing the Fellowship seats is always going to be difficult because we will keep electing new Fellowship Representatives who will understandably be hostile to the idea of eliminating the post. The current members who have been able to observe past and current Fellowship Representatives and their involvement in our activities need to decide if the institution serves a useful role or not, and hence whether to remove it or not. The Council believes it does not, and will prepare a constitutional amendment for GA2018 if the majority of the members feel likewise.

Sunday, 05 November 2017

In Defence of Mail

Paul Boddie's Free Software-related blog » English | 23:21, Sunday, 05 November 2017

A recent LWN.net article, “The trouble with text-only email“, gives us an insight through an initially-narrow perspective into a broader problem: how the use of e-mail by organisations and its handling as it traverses the Internet can undermine the viability of the medium. And how organisations supposedly defending the Internet as a platform can easily find themselves abandoning technologies that do not sit well with their “core mission”, not to mention betraying that mission by employing dubious technological workarounds.

To summarise, the Mozilla organisation wants its community to correspond via mailing lists but, being the origin of the mails propagated to list recipients when someone communicates with one of their mailing lists, it finds itself under the threat of being blacklisted as a spammer. This might sound counterintuitive: surely everyone on such lists signed up for mails originating from Mozilla in order to be on the list.

Unfortunately, the elevation of Mozilla to being a potential spammer says more about the stack of workaround upon workaround, second- and third-guessing, and the “secret handshakes” that define the handling of e-mail today than it does about anything else. Not that factions in the Mozilla organisation have necessarily covered themselves in glory in exploring ways of dealing with their current problem.

The Elimination Problem

Let us first identify the immediate problem here. No, it is not spamming as such, but the existence of dubious "reputation" services which cause mail to be blocked on opaque and undemocratic grounds. I encountered one of these a few years ago when trying to send a mail to a competition and finding that such a service had decided that my mail hosting provider's Internet address was somehow "bad".

What can one do when placed in such a situation? Appealing to the blacklisting service will not do an individual any good. Instead, one has to ask one’s mail provider to try and fix the issue, which in my case they had actually been trying to do for some time. My mail never got through in the end. Who knows how long it took to persuade the blacklisting service to rectify what might have been a mistake?

Yes, we all know that the Internet is awash with spam. And yes, mechanisms need to be in place to deal with it. But such mechanisms need to be transparent and accountable. Without these things, all sorts of bad things can take place: censorship, harassment, and forms of economic crime spring readily to mind. It should be a general rule of thumb in society that when someone exercises power over others, such power must be controlled through transparency (so that it is not arbitrary and so that everyone knows what the rules are) and through accountability (so that decisions can be explained and judged to have been properly taken and acted upon).

We actually need better ways of eliminating spam and other misuse of common communications mechanisms. But for now we should at least insist that whatever flawed mechanisms exist today uphold the democratic principles described above.

The Marketing Problem

Although Mozilla may have distribution lists for marketing purposes, its problem with mailing lists is something of a different creature. The latter are intended to be collaborative and involve multiple senders of the original messages: a many-to-many communications medium. Meanwhile, the former is all about one-to-many messaging, and in this regard we stumble across the root of the spam problem.

Obviously, compulsive spammers are people who harvest mail addresses from wherever they can be found, trawling public data or buying up lists of addresses sourced during potentially unethical activities. Such spammers create a huge burden on society's common infrastructure, but they are hardly the only ones cultivating that burden. Reputable businesses, even when following the law in communicating with their own customers, often employ what can be regarded as a "clueless" use of mail as a marketing channel without any thought to the consequences.

Businesses might want to remind you of their products and encourage you to receive their mails. The next thing you know, you get messages three times a week telling you about products that are barely of interest to you. This may be a "win" for the marketing department – it is like advertising on television but cheaper, because you don't have to bid for "eyeballs" against gambling companies (read: addiction-exploiting money launderers), consumer credit companies (read: debt sharks) or nutritional supplement companies (read: environment-trashing cure peddlers) – but it cheapens and worsens the medium for everybody who uses it for genuine interpersonal communication and not just for viewing advertisements.

People view e-mail and mail software as a lost cause in the face of wave after wave of illegal spam and opportunistic “spammy” marketing. “Why bother with it at all?” they might ask, asserting that it is just a wastebin that one needs to empty once a week as some kind of chore, before returning to one’s favourite “social” tools (also plagued with spam and surveillance, but consistency is not exactly everybody’s strong suit).

The Authenticity Problem

Perhaps to escape problems with the overly-zealous blacklisting services, it is not unusual, as a customer of a company, to get messages ostensibly from that company but actually originating from some kind of marketing communications service. The use of such a service may be excusable depending on how much information is shared, what kinds of safeguards are in place, and so on. What is less excusable is the way the communication is performed.

I actually experience this with financial institutions, which should be a significant area of concern for individuals, the industry and its regulators alike. First of all, the messages are not encrypted, which is what one might expect given that the sender would need some kind of public key information that I haven't provided. But provided that the message details are not sensitive (although sometimes they have been, which is another story), we might not set our expectations so high for these communications.

However, of more substantial concern is the way that when receiving such mails, we have no way of verifying that they really originated from the company they claim to have come from. And when the mail inevitably contains links to things, we might be suspicious about where those links, even if they are URLs in plain text messages, might want to lead us.

The recipient is now confronted with a collection of Internet domain names that may or may not correspond to the identities of reputable organisations, some of which they might know as a customer, others they might be aware of, but where the recipient must also exercise the correct judgement about the relationship between the companies they do use and these other organisations with which they have no relationship. Even with a great deal of peripheral knowledge, the recipient needs to exercise caution that they do not go off to random places on the Internet and start filling out their details on the say-so of some message or other.

Indeed, I have a recent example of this. One financial institution I use wants me to take a survey conducted by a company I actually have heard of in that line of business. So far, so plausible. But then, the site being used to solicit responses is one I have no prior knowledge of: it could be a reputable technology business or it could be some kind of “honeypot”; that one of the domains mentioned contains “cloud” also does not instil confidence in the management of the data. To top it all, the mail is not cryptographically signed and so I would have to make a judgement on its authenticity based on some kind of “tea-leaf-reading” activity using the message headers or assume that the institution is likely to want to ask my opinion about something.

The Identity Problem

With the possibly-authentic financial institution survey message situation, we can perhaps put our finger on the malaise in the use of mail by companies wanting our business. I already have a heavily-regulated relationship with the company concerned. They seemingly emphasise issues like security when I present myself to their Web sites. Why can they not at least identify themselves correctly when communicating with me?

Some banks only want electronic communications to take place within their hopefully-secure Web site mechanisms, offering “secure messaging” and similar things. Others also offer such things, either two-way or maybe only customer-to-company messaging, but then spew e-mails at customers anyway, perhaps under the direction of the sales and marketing branches of the organisation.

But if they really must send mails, why can they not leverage their “secure” assets to allow me to obtain identifying information about them, so that their mails can be cryptographically signed and so that I can install a certificate and verify their authenticity? After all, if you cannot trust a bank to do these things, which other common institutions can you trust? Such things have to start somewhere, and what better place to start than in the banking industry? These people are supposed to be good at keeping things under lock and key.

The Responsibility Problem

This actually returns us to the role of Mozilla. Being a major provider of software for accessing the Internet, the organisation maintains a definitive list of trusted parties through whom the identity of Web sites can be guaranteed (to various degrees) when one visits them with a browser. Mozilla’s own sites employ certificates so that people browsing them can have their privacy upheld, so it should hardly be inconceivable for the sources of Mozilla’s mail-based communications to do something similar.

Maybe S/MIME would be the easiest technology to adopt given the similarities between its use of certificates and certificate authorities and the way such things are managed for Web sites. Certainly, there are challenges with message signing and things like mailing lists, this being a recurring project for GNU Mailman if I remember correctly (and was paying enough attention), but nothing solves a longstanding but largely underprioritised problem better than a concrete need and the will to get things done. Mozilla has certainly tried to do identity management in the past, recalling initiatives like Mozilla Persona, and the organisation is surely reasonably competent in that domain.

In the referenced article, Mozilla was described as facing an awkward technical problem: their messages were perceived as being delivered indiscriminately to an audience of which large portions may not have been receiving or taking receipt of the messages. This perception of indiscriminate, spam-like activity is apparently one of the metrics employed by blacklisting services. The proposed remedy for potential blacklisting involved the elimination of plain text e-mail from Mozilla's repertoire and the deployment of HTML-only mail, with the latter employing links to images that would load upon the recipient opening the message. (Never mind that many mail programs prevent this.)

The rationale for this approach was that Mozilla would then know that people were getting the mail and that by pruning away those who didn’t reveal their receipt of the message, the organisation could then be more certain of not sending mail to large numbers of “inactive” recipients, thus placating the blacklisting services. Now, let us consider principle #4 of the Mozilla manifesto:

Individuals’ security and privacy on the Internet are fundamental and must not be treated as optional.

Given such a principle, why then is the focus on tracking users and violating their privacy, not on deploying a proper solution and just sending properly-signed mail? Is it because the mail is supposedly not part of the Web or something?

The Proprietary Service Problem

Mozilla can be regarded as having a Web-first organisational mentality which, given its origins, should not be too surprising. Although the Netscape browser was extended to include mail facilities and thus Navigator became Communicator, and although the original Mozilla browser attempted to preserve a range of capabilities not directly related to hypertext browsing, Firefox became the organisation’s focus and peripheral products such as Thunderbird have long struggled for their place in the organisation’s portfolio.

One might think that the decision-makers at Mozilla believe that mundane things like mail should be done through a Web site as webmail and that everyone might as well use an established big provider for their webmail needs. After all, the vision of the Web as a platform in its own right, once formulated as Netscape Constellation in more innocent times, can be used to justify pushing everything onto the Web.

The problem here is that as soon as almost everyone has been herded into proprietary service “holding pens”, expecting a free mail service while having their private communications mined for potential commercial value, things like standards compliance and interoperability suffer. Big webmail providers don’t need to care about small mail providers. Too bad if the big provider blacklists the smaller one: most people won’t even notice, and why don’t the users of the smaller provider “get with it” and use what everybody else is using, anyway?

If everyone ends up almost on the same server or cluster of servers or on one of a handful of such clusters, why should the big providers bother to do anything by the book any more? They can make all sorts of claims about it being more efficient to do things their own way. And then, mail is no longer a decentralised, democratic tool any more: its users end up being trapped in a potentially exploitative environment with their access to communications at risk of being taken away at a moment’s notice, should the provider be persuaded that some kind of wrong has been committed.

The Empowerment Problem

Ideally, everyone would be able to assert their own identity and be able to verify the identity of those with whom they communicate. With this comes the challenge in empowering users to manage their own identities in a way which is resistant to “identity theft”, impersonation, and accidental loss of credentials that could have a severe impact on a person’s interactions with necessary services and thus on their life in general.

Here, we see the failure of banks and other established, trusted organisations to make this happen. One might argue that certain interests, political and commercial, do not want individuals controlling their own identity or their own use of cryptographic technologies. Even when such technologies have been deployed so that people can be regarded as having signed for something, it usually happens via a normal secured Web connection with a button on a Web form, everything happening at arm’s length. Such signatures may not even be any kind of personal signature at all: they may just be some kind of transaction surrounded by assumptions that it really was “that person” because they logged in with their credentials and there are logs to “prove” it.

Leaving the safeguarding of cryptographic information to the average Internet user seems like a scary thing to do. People’s computers are not particularly secure thanks to the general neglect of security by the technology industry, nor are they particularly usable or understandable, especially when things that must be done right – like cryptography – are concerned. It also doesn’t help that when trying to figure out best practices for key management, it almost seems like every expert has their own advice, leaving the impression of a cacophony of voices, even for people with a particular interest in the topic and an above-average comprehension of the issues.

Most individuals in society might well struggle if left to figure out a technical solution all by themselves. But institutions exist that are capable of operating infrastructure with a certain level of robustness and resilience. And those institutions seem quite happy with the credentials I provide to identify myself with them, some of which are provided by bits of hardware they have issued to me.

So, it seems to me that maybe they could lead individuals towards some kind of solution whereupon such institutions could vouch for a person’s digital identity, provide that person with tools (possibly hardware) to manage it, and could help that person restore their identity in cases of loss or theft. This kind of thing is probably happening already, given that smartcard solutions have been around for a while and can be a component in such solutions, but here the difference would be that each of us would want help to manage our own identity, not merely retain and present a bank-issued identity for the benefit of the bank’s own activities.

The Real Problem

The LWN.net article ends with a remark mentioning that “the email system is broken”. Given how much people complain about it, yet the mail still keeps getting through, it appears that the brokenness is not in the system as such but in the way it has been misused and undermined by those with the power to do something about it.

That the metric of being able to get “pull requests through to Linus Torvalds’s Gmail account” is mentioned as some kind of evidence perhaps shows that people’s conceptions of e-mail are themselves broken. One is left with an impression that electronic mail is like various other common resources that are systematically and deliberately neglected by vested interests so that they may eventually fail, leaving those vested interests to blatantly profit from the resulting situation while making remarks about the supposed weaknesses of those things they have wilfully destroyed.

Still, this is a topic that cannot be ignored forever, at least if we are to preserve things like genuinely open and democratic channels of communication whose functioning may depend on decent guarantees of people’s identities. Without a proper identity or trust infrastructure, we risk delegating every aspect of our online lives to unaccountable and potentially hostile entities. If it all ends up with everyone having to do their banking inside their Facebook account, it would be well for the likes of Mozilla to remember that at such a point there is no consolation to be had any more that at least everything is being done in a Web browser.

Friday, 03 November 2017

EU Ministers call for more Free Software in governmental infrastructure

polina's blog | 15:46, Friday, 03 November 2017

On 6 October, 32 European Ministers in charge of eGovernment policy signed the Tallinn Declaration on eGovernment, which calls for more collaboration, interoperable solutions and sharing of good practices throughout public administrations and across borders. Amongst many other things, the EU ministers recognised the need to make more use of Free Software solutions and Open Standards when (re)building governmental digital systems.

The Tallinn Declaration, led by the Estonian presidency of the EU, was adopted on 6 October 2017. It is a ministerial declaration that marks a new political commitment at EU and EFTA (European Free Trade Area) level on priorities to ensure user-centric digital public services for both citizens and businesses across borders. While it has no legislative power, the ministerial declaration marks a political commitment to ensure the digital transformation of public administrations through a set of commonly agreed principles and actions.

The FSFE has previously submitted its input for the aforementioned declaration during the public consultation round, asking for greater inclusion of Free Software in delivering truly inclusive, trustworthy and interoperable digital services to all citizens and businesses across the EU.

The adopted Tallinn Declaration proves to be a forward-looking document that acknowledges the importance of Free Software in ensuring the principle of 'interoperability by default', and expresses the will of all signatory countries to:

make more use of open source solutions and/or open standards when (re)building ICT systems and solutions (among else, to avoid vendor lock-ins)[...]

Additionally, the signatories call upon the European Commission to:

consider strengthening the requirements for use of open source solutions and standards when (re)building of ICT systems and solutions takes place with EU funding, including by an appropriate open licence policy – by 2020

The last point is especially noteworthy, as it explicitly calls for the European Commission to make use of Free Software and Open Standards in building their ICT infrastructure with EU funds. This is in line with our "Public Money, Public Code" campaign, which demands that all publicly financed software developed for the public sector be made publicly available under a Free Software licence.

What’s next?

The Tallinn Declaration sets several deadlines for its implementation in the next few years, including an annual presentation on the progress of implementation in the respective countries across the EU and EFTA through the eGovernment Action Plan Steering Board. The signatories also called upon the Austrian Presidency of the Council of the EU to take stock of the implementation of the Tallinn Declaration in autumn 2018.

While it bears repeating that a ministerial declaration imposes no legislative obligations on the signatory countries, it nevertheless expresses the political will of the EU and EFTA countries to digitise their governments in the most user-friendly and efficient way. The fact that it explicitly recognises the role of Free Software and Open Standards for a trustworthy, transparent and open eGovernment at a high level, along with a demand for strengthened reuse of ICT solutions based on Free Software in the EU public sector, is a valuable step towards establishing a real "Public Money, Public Code" reality across Europe.

Hence, it is always better to have a 'good' declaration than no declaration at all. Now it all depends on proper implementation.

Thursday, 02 November 2017

Get ready for NoFlo 1.0

Henri Bergius | 00:00, Thursday, 02 November 2017

After six years of work, and a bunch of different projects done with NoFlo, we're finally ready for the big 1.0. The two primary pull requests for the 1.0.0 cycle landed today, so it is time to talk about how to prepare for it.

tl;dr If your project runs with NoFlo 0.8 without deprecation warnings, you should be ready for NoFlo 1.0

ES6 first

The primary difference between NoFlo 0.8 and 1.0 is that now we’re shipping it as ES6 code utilizing features like classes and arrow functions.

Now that all modern browsers support ES6 out of the box, and Node.js 8 is the long-term supported release, it should be generally safe to use ES6 as-is.

If you need to support older browsers, Node.js versions, or maybe PhantomJS, it is of course possible to compile the NoFlo codebase into ES5 using Babel.

We recommend new components to be written in ES6 instead of CoffeeScript.
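
To make that recommendation a bit more concrete, here is a minimal sketch of what an ES6 component using the Process API can look like. The component name, port names and the uppercasing behaviour are just illustrative assumptions, not an official example:

  // components/Uppercase.js -- a minimal ES6 Process API component sketch.
  const noflo = require('noflo');

  exports.getComponent = () => {
    const c = new noflo.Component({
      description: 'Uppercase incoming strings',
      inPorts: {
        in: { datatype: 'string' },
      },
      outPorts: {
        out: { datatype: 'string' },
      },
    });
    c.process((input, output) => {
      // Wait until a full data packet is available on the inport
      if (!input.hasData('in')) { return; }
      const data = input.getData('in');
      // Send the result and mark processing of this packet as done
      output.sendDone({ out: data.toUpperCase() });
    });
    return c;
  };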

Easier webpack builds

It has been possible to build NoFlo projects for browsers since 2013. Last year we switched to webpack as the module bundler.

However, at that stage there was still quite a lot of configuration magic happening inside grunt-noflo-browser. This turned out to be sub-optimal since it made integrating NoFlo into existing project build setups difficult.

Last week we extracted the difficult parts out of the Grunt plugin, and released the noflo-component-loader webpack loader. With this, you can generate a configured NoFlo component loader in any webpack build. See this example.

In addition to generating the component loader, your NoFlo browser project may also need two other loaders, depending on how your NoFlo graphs are built: json-loader for JSON graphs, and fbp-loader for graphs defined in the .fbp DSL.
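
As a rough illustration, a webpack configuration wiring up those loaders might look something like the sketch below. The entry point, the test patterns and the omission of loader options are assumptions, so check each loader's README for the options your project actually needs:

  // webpack.config.js -- a minimal sketch, not a drop-in replacement for the
  // grunt-noflo-browser setup.
  module.exports = {
    entry: './src/index.js',
    output: {
      filename: 'bundle.js',
    },
    module: {
      rules: [
        // JSON graph definitions
        { test: /\.json$/, loader: 'json-loader' },
        // Graphs written in the .fbp DSL
        { test: /\.fbp$/, loader: 'fbp-loader' },
        // noflo-component-loader generates the configured component loader;
        // its rule and options depend on your project, see its documentation.
      ],
    },
  };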

Removed APIs

There were several old NoFlo APIs that we marked as deprecated in NoFlo 0.8. In that series, usage of those APIs logged warnings. Now in 1.0 the deprecated APIs are completely removed, giving us a lighter, smaller codebase to maintain.

Here is a list of the primary API removals and the suggested migration strategy:

  • noflo.AsyncComponent class: use WirePattern or Process API instead
  • noflo.ArrayPort class: use InPort/OutPort with addressable: true instead
  • noflo.Port class: use InPort/OutPort instead
  • noflo.helpers.MapComponent function: use WirePattern or Process API instead
  • noflo.helpers.WirePattern legacy mode: now WirePattern always uses Process API internally
  • noflo.helpers.WirePattern synchronous mode: use async: true and callback
  • noflo.helpers.MultiError function: send errors via callback or error port
  • noflo.InPort process callback: use Process API
  • noflo.InPort handle callback: use Process API
  • noflo.InPort receive method: use Process API getX methods
  • noflo.InPort contains method: use Process API hasX methods
  • Subgraph EXPORTS mechanism: disambiguate with INPORT/OUTPORT

The easiest way to verify whether your project is compatible is to run it with NoFlo 0.8.

You can also make usage of deprecated APIs throw errors instead of just logging them by setting the NOFLO_FATAL_DEPRECATED environment variable. In browser applications you can set the same flag on window.
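
For example (a small sketch; the npm test command is an assumption about your project's setup):

  // In Node.js, run your suite with the flag set so deprecated calls throw:
  //   NOFLO_FATAL_DEPRECATED=true npm test
  // In the browser, set the same flag on window before NoFlo is loaded:
  window.NOFLO_FATAL_DEPRECATED = true;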

Scopes

Scopes are a flow isolation mechanism that was introduced in NoFlo 0.8. With scopes, you can run multiple simultaneous flows through a NoFlo network without a risk of data leaking from one scope to another.

The primary use case for scope isolation is building things like web API servers, where you want to isolate the processing of each HTTP request from each other safely, while reusing a single NoFlo graph.

Scope isolation is handled automatically for you when using Process API or WirePattern. If you want to manipulate scopes, the noflo-packets library provides components for this.

NoFlo in/outports can also be set as scoped: false to support getting out of scopes.

asCallback and async/await

noflo.asCallback provides an easy way to expose NoFlo graphs to normal JavaScript consumers. The produced function uses the standard Node.js callback mechanism, meaning that you can easily make it return promises with Node.js util.promisify or Bluebird. After this your NoFlo graph can be run via normal async/await.
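
A minimal sketch of how that can look; the graph name 'myproject/Enrich', its port names and the options are placeholder assumptions:

  const noflo = require('noflo');
  const { promisify } = require('util');

  // Wrap a NoFlo graph as a plain Node.js-style callback function.
  // Depending on your setup you may need to pass options such as
  // { baseDir: __dirname } as the second argument.
  const wrapped = noflo.asCallback('myproject/Enrich');

  // Turn the callback-style function into a promise-returning one.
  const run = promisify(wrapped);

  async function main() {
    // Keys of the inputs object map to the graph's exported inports;
    // the resolved value maps outport names to the data they produced.
    const result = await run({ in: 'some input' });
    console.log(result.out);
  }

  main().catch((err) => { console.error(err); });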

Component libraries

There are hundreds of ready-made NoFlo components available on NPM. By now, most of these have been adapted to work with NoFlo 0.8.

Once 1.0 ships, we’ll try to be as quick as possible to update all of them to run with it. In the meanwhile, it is possible to use npm shrinkwrap to force them to depend on NoFlo 1.0.

If you’re relying on a library that uses deprecated APIs, or hasn’t otherwise been updated yet, please file an issue in the GitHub repo of that library.

This pull request for noflo-gravatar is a great example of how to implement all the modernization recommendations below in an existing component library.

Recommendations for new projects

This post has mostly covered how to adapt existing NoFlo projects for 1.0. How about new projects? Here are some recommendations:

  • While NoFlo projects have traditionally been written in CoffeeScript, for new projects we recommend using ES6. In particular, follow the AirBnB ES6 guidelines
  • Use fbp-spec for test automation
  • Use NPM scripts instead of Grunt for building and testing
  • Make browser builds with webpack utilizing noflo-component-loader
  • Use Process API when writing components
  • If you expose any library functionality, provide an index file using noflo.asCallback for non-NoFlo consumers

The BIG IoT Node.js bridge is a recent project that follows these guidelines if you want to see an example in action.

There is also a project tutorial available on the NoFlo website.

Wednesday, 01 November 2017

Akademy 2018 site visit

TSDgeos' blog | 22:16, Wednesday, 01 November 2017

Last week I was part of the expedition by KDE (together with Kenny and Petra) to visit the local team that is helping us organize Akademy 2018 in Vienna.

I can say that I'm very happy :)

The accommodation that we're probably going to recommend is 15 minutes away from the airport by train, and 20 minutes on foot from the Venue (10 by metro), so it's quite convenient.

The Venue is modern and centrally located in Vienna (no more "if you miss the bus you're stranded")

Vienna itself seems to be also a nice city for late evening sight-seeing (or beers :D)

Hopefully "soon" we will have confirmation of the dates and we'll be able to publish them :)

Adoption of Free Software PDF readers in Italian Regional Public Administrations (fifth monitoring)

Tarin Gamberini | 08:56, Wednesday, 01 November 2017

The following monitoring shows that, in the last six months, eight Italian Regions have reduced advertisement of proprietary PDF readers on their websites, and that one Region has increased its support for Free Software PDF readers.

Monday, 30 October 2017

Technoshamanism and Free Digital Territories in Benevento, Italy

agger's Free Software blog | 09:30, Monday, 30 October 2017

From October 23 to 29, an international seminar about technoshamanism and the concept of "Digital Land" or "free digital territories" was held at the autonomous ecological project Terra Terra near Reino, Benevento, Italy. The event was organized by Vincenzo Tozzi and announced on the Bricolabs mailing list. The seminar was held in the form of a "Pajelança Quilombólica Digital", as it's called in Brazilian Portuguese, a "digital shamanism" brainstorming on the possibilities of using free digital territories to connect free real-world territories.

Vincenzo Tozzi is the founder of the Baobáxia project which is run by the Brazilian network Rede Mocambos, and the main point of departure was the future of Baobáxia – sustainability, development, paths for the future.

Arriving in Napoli, I had the pleasure of meeting Hellekin and Natacha Roussel from Brussels, who had received the call through the Bricolabs list. Vince had arranged that we could stay in the Mensa Occupata, a community-run squat in central Napoli that was occupied in 2012 and is the home of a hackerspace, a communal kitchen and a martial arts gym, the "Palestra Populare Vincenzo Leone", where I was pleased to see that my own favourite, capoeira, is among the activities.

The actual seminar took place in much more rural settings outside Reino, in the countryside known locally as "terra delle streghe" or "land of the witches". With respect to our desire to work with free territories and the inherent project of recuperating and learning from ancestral traditions, the area is interesting in the sense that the land is currently inhabited by the last generation of farmers to cultivate it with traditional methods supported by an oral tradition which has mostly been lost in the most recent decades. During the seminar, we had the opportunity to meet up with people from the local cooperative Lentamente, which is working to preserve and recuperate the traditional ways of growing crops and keeping animals without machines (hence the name "lentamente", slowly) as well as trying to preserve as much as possible of the existing oral traditions.

During the seminar, we adapted to the spirit of the territory and the settings by dividing the day into two parts: in the morning, we would go outside and work on the land until lunchtime, which would be around three o'clock. After dinner, we'd dedicate the evenings to more general discussions as well as to relaxing, often still covering important ground.

After lunch, hopefully properly awake and inspired by the fresh air and the beauty of the countryside, we would start looking at the technical side of things, delve into the code, discuss protocols and standards and explore possible pathways to the future. Among other things, we built some stairs and raised beds on a hillside leading up to the main buildings and picked olives for about twenty litres of oil.

As for the technical side of the encounter, we discussed the structure of the code, the backend repositories and the offline synchronization process with newcomers to the project, reviewed various proposals for the technical future of the project and installed two new mucuas. In the process, we identified some bugs and fixed a couple of them.

An important aspect of the concept of "free digital territories" is that we are looking for and using new metaphors for software development. Middle-class or otherwise well-off people who are used to having the means to employ servants or hire e.g. a lawyer whenever they need one may find it easy to conceive of a computer as a "server" whose life is dedicated to serving its "clients". For armies of office workers, having a computer pre-equipped with a "desktop" absolutely makes sense. But in the technoshamanism and quilombolic networks we're not concerned with perpetuating the values and structures of capitalist society. We wish to provide software for free territories, and thus our metaphors are not based on the notion of clients and servers, but of digital land: a mucúa or node of the Baobáxia system is not a "server", it's digital land for the free networks to grow and share their culture.

Another important result was that the current offline synchronization and storage using git and git-annex can be generalized to other applications. Baobáxia currently uses a data format whose backend representation and synchronization is fixed by convention, but we could build other applications using the same protocol, a protocol for "eventually connected federated networks". Other examples of applications that could use this technology for offline or eventually connected communications are wikis, blogs and calendars. One proposal is therefore to create an RFC for this communication, basically documenting and generalizing Baobáxia's current protocol. The protocol, which at present includes the offline propagation of requests for material not present on a local node of the system, could also be generalized to allow arbitrary messages and commands, e.g. requesting the performance of a service known to be running in another community, to be stored offline and performed when the connection actually happens. This RFC (or RFCs) should be supplemented by proof-of-concept applications, which should not be very difficult to write.

This blog post is a quick summary of my personal impressions, and I think there are many more stories to be told about the threads we tried to connect during those days in Benevento. All in all, the encounter was very fruitful and I was happy to meet new people and use these days to concentrate on the future of Baobáxia and related projects for free digital territories.

Sunday, 29 October 2017

Open Source Summit - Day 3

Inductive Bias | 08:35, Sunday, 29 October 2017

Wednesday at the Open Source Summit started with a keynote by members of the Banks family telling a packed room how they approached raising a tech family. The first hurdle that Keila (the teenage daughter of the family) talked about was something I personally had never actually thought about: communication tools like Slack that are in widespread use come with an age restriction excluding minors. So for a minor, simply trying to communicate with open source projects means entering illegality.

A bit more obvious was their advice for raising kids' engagement with tech: try to find topics that they can relate to. What works fairly often are reverse engineering projects that explain how things actually work.

The Banks are working with a goal-based model where the children get ten goals to pursue during the year, with regular quarterly reviews. An interesting twist though: eight of these ten goals are chosen by the children themselves, two are reserved for the parents to help with guidance. As obvious as this may seem, having clear goals and being able to influence them yourself is something that I believe is applicable in the wider context of open source contributor and project mentoring as well as employee engagement.

The speakers also talked about embracing children's fear. Keila told the story of how she was afraid to talk in front of adult audiences - in particular at the keynote level. The advice her father gave that did help her: you can trip on the stage, you can fall, all of that doesn't matter for as long as you can laugh at yourself. Also remember that no project is the perfect project - there's always something you can improve - and that's ok. This is fairly in line with the feedback given a day earlier during the Linux Kernel Panel, where people mentioned how today they would never accept the first patch they themselves had once written: be persistent, learn from the feedback you get and seek feedback early.

Last but not least, the speakers advised to not compare your family to anyone, not even to yourself. Everyone arrives at tech via a different route. It can be hard to get people from being averse to tech to embrace it - start with a tiny little bit of motivation, from there on rely on self motivation.

The family's current project turned business is to help L.A. schools support children in getting a handle on tech.

The TAO of Hashicorp

In the second keynote Hashimoto gave an overview of the Tao of Hashicorp - essentially the values and principles the company is built on. What I found interesting about the talk was the fact that these values were written down very early in the process of building up Hashicorp, when the company didn't have much more than five employees; they comprise vision, roadmap and product design pieces and have been applied to everyday decisions ever since.

The principles themselves cover the following points:
  • Workflows - not technologies. Essentially describing a UX first approach where tools are being mocked and used first before diving deeper into the architecture and coding. This goes as far as building a bash script as a mockup for a command line interface to see if it works well before diving into coding.
  • Simple, modular and composable. Meaning that tools built should have one clear purpose instead of piling features on top of each other in one product.
  • Communicating sequential processes. Meaning to have standalone tools with clear APIs.
  • Immutability.
  • Versioning through Codification. When having a question, the answer "just talk to X" doesn't scale as companies grow. There are several fixes to this problem. The one that Hashicorp decided to go for was to write knowledge down in code - instead of having a README.md detailing how startup works, have something people can execute.
  • Automate.
  • Resilient systems. Meaning to strive for systems that know their desired state and have means to go back to it.
  • Pragmatism. Meaning that the principles above shouldn't be applied blindly but adjusted to the problem at hand.


While the content itself differs, I find it interesting that Hashicorp decided to communicate in terms of their principles and values. This kind of setup reminds me quite a bit of the way the Amazon Leadership Principles are applied and used inside Amazon.

Integrating OSS in industrial environments - by Siemens

The third keynote was given by Siemens, a 170-year-old German corporation with 350k employees, focused on industrial appliances.

In their current projects they are using OSS in embedded projects related to power generation, rail automation (Debian), vehicle control, building automation (Yocto), medical imaging (xenomai on big machines).

Their reason for tapping into OSS more and more is to grow beyond their own capabilities.

A challenge in their applications relates to long-term stability, meaning supporting an appliance for 50 years and longer. Running their appliances unmodified for years is not feasible anymore today, due to policies and corporate standards that require updates in the field.

The trouble they are dealing with today lies in the cost of software forks - both self-inflicted and supplier-caused forks. The amount of cost attached to these is one of the reasons for Siemens to think upstream-first, both internally as well as when choosing suppliers.

Another reason for this approach is the wish to become part of the community, for three reasons: keeping talent; learning best practices from upstream instead of failing on their own; and better communication with suppliers through official open source channels.

One project Siemens is involved with at the moment is the so-called Civil Infrastructure Platform project.

Another huge topic within Siemens is software license compliance. Being a huge corporation they rely on Fossology for compliance checking.

Linus Torvalds Q&A

The last keynote of the day was an on stage interview with Linus Torvalds. The introduction to this kind of format was lovely: There's one thing Linus doesn't like: Being on stage and giving a pre-created talk. Giving his keynote in the form of an interview with questions not shared prior to the actual event meant that the interviewer would have to prep the actual content. :)

The first question asked was fairly technical: Are RCs slowing down? The reason that Linus gave had a lot to do with proper release management. Typically the kernel is released on a time-based schedule, with one release every 2.5 months. So if some feature doesn't make it into a release it can easily be integrated into the following one. What's different with the current release is Greg Kroah Hartman having announced it would be a long term support release, so suddenly devs are trying to get more features into it.

The second question related to a lack of new maintainers joining the community. The reasons Linus sees for this are mainly related to the fact that being a maintainer today is still a fairly painful job: you need experience to quickly judge patches so the flow doesn't get overwhelming. On the other hand you need to have shown the community that you are around 24/7, 365 days a year. What he wanted the audience to know is that despite occasional harsh words he loves maintainers, and the project does want more maintainers. What's important to him isn't perfection - but having people who will own up to their mistakes.

One fix to the heavy load mentioned earlier (which was also discussed during the kernel maintainers' panel a day earlier) revolved around the idea of having a group of maintainers responsible for any single sub-system in order to avoid volunteer burnout, allow for vacations to happen, share the load and ease hand-over.

Asked about kernel testing Linus admitted to having been sceptical about the subject years ago. He's a really big fan of random testing/ fuzzing in order to find bugs in code paths that are rarely if ever tested by developers.

Asked about what makes a successful project, his take was the ability to find commonalities that many potential contributors share, the ability to find agreement - which seems easier for systems with less user visibility. An observation that reminded me of the bikeshedding discussions.

Also he mentioned that the problem you are trying to solve needs to be big enough to draw a large enough crowd. When it comes to measuring success though his insight was very valuable: Instead of focussing too much on outreach or growth, focus on deciding whether your project solves a problem you yourself have.

Asked about what makes a good software developer, Linus mentioned that the community over time has become much less homogeneous compared to when he started out in his white, male, geeky, beer-loving circles. The things he believes are important for developers are caring about what they do, and being able to invest in their skills for a long enough period to develop perfection (much like athletes train a long time to become really successful). Also having fun goes a long way (though in his eyes this is no different when trying to identify a successful marketing person).

While Linus isn't particularly comfortable interacting with people face-to-face, e-mail for him is different. He does have side projects beside the kernel. Mainly for the reason of being able to deal with small problems, actually provide support to end-users, do bug triage. In Linux kernel land he can no longer do this - if things bubble up to his inbox, they are bound to be of the complex type, everything else likely was handled by maintainers already.

His reason for still being part of the Linux kernel community: He likes the people, likes the technology, loves working on stuff that is meaningful, that people actually care about. On vacation he tends to check his mail three times a day so he doesn't lose track and get overwhelmed when he gets back to work. There are times when he goes offline entirely - however typically after one week he is longing to be back.

Asked about what further plans he has, he mentioned that for the most part he doesn't plan ahead of time, spending most of his life reacting and being comfortable with this state of things.

Speaking of plans: It was mentioned that Linux 5.0 is likely to be released some time in summer 2018 - version numbers don't mean anything anyway.

Nobody puts Java in a container

Jörg Schad from Mesosphere gave an introduction to how container technologies like Docker really work and how that applies to software run in the JVM.

He started off by explaining the advantages of containers: Isolating what's running inside, supplying standard interfaces to deployed units, sort of the write once, run anywhere promise.

Compared to real VMs they are more lightweight, however with the caveat of using the host kernel - meaning that crashing the kernel means crashing all container instances running on that host as well. In turn they are faster to spin up, and need less memory and less storage.

So which properties do we need to look at when talking about running a JVM in a container? Resource restrictions (CPU, memory, device visibility, blkio etc.) are controlled by cgroups. Process spaces, e.g. for pid, net, ipc, mnt, users and hostnames, are controlled through libcontainer namespaces.
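
This wasn't part of the talk, but as a rough illustration of those two building blocks: on Linux a process can inspect the cgroups and namespaces it lives in via the proc filesystem. A minimal, Linux-only sketch:

    import java.io.IOException;
    import java.nio.file.DirectoryStream;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;

    // Minimal sketch: print the cgroups and namespaces the current process lives in.
    // Inside a container the cgroup paths and namespace ids differ from those on the host.
    public class ContainerIntrospection {
        public static void main(String[] args) throws IOException {
            // /proc/self/cgroup lists one line per cgroup hierarchy (cpu, memory, blkio, ...)
            for (String line : Files.readAllLines(Paths.get("/proc/self/cgroup"))) {
                System.out.println("cgroup: " + line);
            }
            // /proc/self/ns contains one symlink per namespace (pid, net, ipc, mnt, uts, user)
            try (DirectoryStream<Path> ns = Files.newDirectoryStream(Paths.get("/proc/self/ns"))) {
                for (Path p : ns) {
                    System.out.println("namespace: " + p.getFileName() + " -> " + Files.readSymbolicLink(p));
                }
            }
        }
    }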

Looking at cgroups there are two aspects that are obviously interesting for JVM deployments: For memory one can set hard and soft limits. However, much in contrast to the JVM, there is no such thing as an OutOfMemoryError being thrown when the limit is exhausted - the process simply gets killed. For the CPUs available there are two ways to configure limits: CPU shares let you give processes a relative priority weighting, cpusets let you pin groups to specific CPUs.

General advice is to avoid cpusets as it removes one degree of freedom from scheduling and often leads to less efficiency. However it's a good tool to avoid CPU bouncing and to maximise cache usage.

When trying to figure out the caveats of running JVMs in containers one needs to understand what the memory requirements of a JVM are: In addition to the well-known, configurable heap memory, each JVM needs a bit of native JRE memory, permgen/metaspace, space for JIT-compiled code, JNI and NIO space, as well as additional native space for threads. With permgen space having turned into native metaspace, this means that class loader leaks are capable of maxing out the memory of the entire machine - one good reason to lock JVMs into containers.
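
As a small sketch of how to peek at those non-heap areas from inside a running JVM, the standard MemoryPoolMXBean API can be used (not from the talk; note that native allocations such as thread stacks or JNI buffers don't show up here):

    import java.lang.management.ManagementFactory;
    import java.lang.management.MemoryPoolMXBean;
    import java.lang.management.MemoryType;

    // Minimal sketch: list the JVM's memory pools to see how much memory sits
    // outside the configured heap (Metaspace, code cache, compressed class space, ...).
    public class NonHeapMemory {
        public static void main(String[] args) {
            for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
                if (pool.getUsage() == null) continue; // pool currently not valid
                System.out.printf("%-30s %-8s used=%,d bytes%n",
                        pool.getName(),
                        pool.getType() == MemoryType.HEAP ? "heap" : "non-heap",
                        pool.getUsage().getUsed());
            }
        }
    }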

The caveats of putting JVMs into containers are related to JRE initialisation defaults being influenced by information like the number of cores available: It influences the number of JIT compilation threads, hotspot thresholds and limits.

One extreme example: When running ten JVM containers on a 32-core box this means that:
  • Each JVM believes it's alone on the machine, configuring itself for the maximum available CPU count.
  • Pre-Java-9 the JVM is not aware of cpusets, meaning it will think that it can use all 32 cores even if configured to use fewer than that.


Another caveat: JVMs typically need more resources on startup, leading to a need for overprovisioning just to get them started. Jörg promised that a blog post on how to deal with this question would appear on the DC/OS blog soon after the summit.

Also, for memory, Java 9 provides the option to take memory limits set through cgroups into account. The (still experimental) option for that: -XX:+UseCGroupMemLimitForHeap
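
A rough sketch of the mismatch described above, comparing what the JVM assumes with the limit the cgroup actually imposes; this assumes a cgroup-v1 layout mounted under /sys/fs/cgroup (the common case at the time) and is not taken from the talk:

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;

    // Minimal sketch: compare the JVM's view of the machine with the cgroup limit.
    public class ContainerLimits {
        public static void main(String[] args) throws IOException {
            // Before the JVM gained container awareness this reports the host's core
            // count, not the container's cpuset or CPU quota.
            System.out.println("availableProcessors: " + Runtime.getRuntime().availableProcessors());
            // The default max heap is derived from host memory unless -Xmx is passed
            // (or the experimental cgroup flag mentioned above is enabled).
            System.out.println("Runtime.maxMemory:   " + Runtime.getRuntime().maxMemory());

            // cgroup-v1 path; inside a memory-limited container this is the real ceiling.
            Path memLimit = Paths.get("/sys/fs/cgroup/memory/memory.limit_in_bytes");
            if (Files.isReadable(memLimit)) {
                System.out.println("cgroup memory limit: " + Files.readAllLines(memLimit).get(0));
            }
        }
    }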

As a conclusion: Containers don't hide the underlying hardware - which is both good and bad.

Goal - question - metric approach to community measurement

In his talk on applying the goal-question-metric approach to software development management, Jose Manrique Lopez de la Fuente explained how to successfully choose and use metrics in OSS projects.

He contrasted the OKR-based approach to goal setting with the goal-question-metric approach. In the latter one first thinks about a goal to achieve (e.g. "We want a diverse community."), goes from there to questions that help understand the path to that goal better ("How many people from underrepresented groups do we have?"), and from those to actual metrics that answer each question.
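
Not part of the talk, but a tiny sketch of how such a goal → question → metric hierarchy could be modelled in code, reusing the diversity example above (the metric name and its value are made up for illustration):

    import java.util.Arrays;
    import java.util.List;

    // Tiny illustrative model of a goal -> question -> metric hierarchy.
    public class GoalQuestionMetric {
        static class Metric { final String name; final double value;
            Metric(String name, double value) { this.name = name; this.value = value; } }
        static class Question { final String text; final List<Metric> metrics;
            Question(String text, List<Metric> metrics) { this.text = text; this.metrics = metrics; } }
        static class Goal { final String statement; final List<Question> questions;
            Goal(String statement, List<Question> questions) { this.statement = statement; this.questions = questions; } }

        public static void main(String[] args) {
            Goal diversity = new Goal("We want a diverse community.", Arrays.asList(
                    new Question("How many people from underrepresented groups do we have?", Arrays.asList(
                            // placeholder value; in practice this would come from tooling such as GrimoireLab
                            new Metric("new contributors from underrepresented groups per quarter", 3.0)))));
            for (Question q : diversity.questions)
                for (Metric m : q.metrics)
                    System.out.println(diversity.statement + " -> " + q.text + " -> " + m.name + " = " + m.value);
        }
    }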

Key to applying this approach is a cycle that integrates planning, making changes, checking results and acting on them.

Goals, questions and metrics need to be in line with project goals, involve management and involve contributors. Metrics themselves are only useful for as long as they are linked to a certain goal.

What it takes to make this approach successful is a mature organisation that understands the metrics' value and refrains from gaming the system. People will need training on how to use the metrics, as well as transparency about the metrics themselves.

Projects dealing with applying more metrics and analytics to OSS projects include GrimoireLab and CHAOSS (Community Health Analytics for OSS).

There are a couple of interesting books - Managing Inner Source Projects and Evaluating OSS Projects - as well as the GrimoireLab training, which are all freely available online.

Container orchestration - the state of play

In his talk Michael Bright gave an overview of current container orchestration systems, going into some detail for Docker Swarm, Kubernetes and Apache Mesos. Technologies he left out are things like Nomad, Cattle, Fleet, ACS, ECS, GKE, AKS, as well as managed cloud offerings.

What became apparent from his talk was that the high level architecture is fairly similar from tool to tool: Orchestration projects make sense where there are enough microservices to be unable to treat them like pets with manual intervention needed in case something goes wrong. Orchestrators take care of tasks like cluster management, micro service placement, traffic routing, monitoring, resource management, logging, secret management, rolling updates.

Often these systems build a cluster that apps can talk to, with masters managing communication (coordinated through some sort of distributed configuration management system, maybe some RAFT based consensus implementation to avoid split brain situations) as well as workers that handle requests.

Going into details Michael showed the huge uptake of Kubernetes compared to Docker Swarm and Apache Mesos, up to the point where even AWS joined the CNCF.

On Thursday I went to see Rich Bowen's keynote on the Apache Way at MesosCon. It was great to hear how people were interested in the greater context of what Apache provides to the Mesos project in terms of infrastructure and mentoring. Also, at their booth at MesosCon there were quite a few questions on what that thing called The Apache Software Foundation actually is.

Hopefully the initiative started on the Apache Community development mailing list on getting more information out on how things are managed at Apache will help spread the word even further.

Overall Open Source Summit, together with its sister events like KVM Forum and MesosCon, as well as co-located events like the OpenWRT summit, was a great chance to meet up with fellow open source developers and project leads, and to learn about technologies and processes both familiar as well as new (in my case the QEMU on UEFI talk clearly was above my personal comfort zone for understanding things - here it's great to be married to a spouse who can help fill the gaps after the conference is over). There was a fairly broad spectrum of talks, from Linux kernel internals, to container orchestration, to OSS licensing, community management, diversity topics, compliance, and economics.

Saturday, 28 October 2017

Dutch coalition agreement: where’s the trust in Free Software?

André Ockers on Free Software » English | 07:14, Saturday, 28 October 2017

The new Dutch government, consisting of liberal-conservatives (VVD), christian democrats (CDA), democrats (D66) and orthodox protestants (CU), published the new coalition agreement: Vertrouwen in de toekomst (“Trust in the future”). I searched through all sections of this document, searching for the word “software”.

According to the new government, software is a matter for the justice department. Software is not mentioned in any other section, including the economic, education, labor policy, innovation policy and living environment sections.

So it’s the minister of justice who deals with software. Software is mentioned at two places in the justice section:

  • The making of a cybersecurity agenda, including the stimulation of companies to make software safer through software liability.
  • Buying hacking software for the Dutch intelligence service.

This means software is being seen as:

  • Unsafe, and the state will ensure it’s going to be safer.
  • A tool to further build the Dutch surveillance and control state.

There’s a world of possibilities to use (existing!) Free Software to strengthen the economy, provide the youth with real education and turn the Netherlands into a more innovative and livable part of Europe. Apparently this is not a priority. Where’s the trust in Free Software?

Wednesday, 25 October 2017

Autonomous by Annalee Newitz

Evaggelos Balaskas - System Engineer | 11:09, Wednesday, 25 October 2017


Autonomous by Annalee Newitz

The year is 2144. A group of anti-patent scientists are working to reverse engineer drugs in free labs, for (poor) people to have access to them. Agents of the International Property Coalition are trying to find the lead pirate-scientist and stop any patent violation by any means necessary. In this era, without a franchise (citizenship), autonomous robots and people are slaves. But only a few of the bots are autonomous. Even then, can they be free? Can androids choose their own gender identity? Transhumanism and life extension drugs are helping people to live a longer and better life.

A science fiction novel without Digital Rights Management (DRM).

Tag(s): Autonomous, books

Open source summit - Day 2

Inductive Bias | 10:58, Wednesday, 25 October 2017

Day two of Open Source Summit for me started a bit slow for lack of sleep. The first talk I went to was on "Developer tools for Kubernetes" by Michelle Noorali and Matt Butcher. Essentially the two of them showed two projects (Draft and Brigade) to help ease developing apps for Kubernetes clusters. Draft here is the tool to use for developing long-running, daemon-like apps. Brigade has the goal of making event-driven app development easier - almost like providing shell-script-like composability to Kubernetes-deployed pipelines.

Kubernetes in real life

In his talk on K8s in real life Ian Crosby went over five customer cases. He started out by highlighting the promise of magic from K8s: Jobs should automatically be re-scheduled to healthy nodes, traffic re-routed once a machine goes down. As a project it came out of Google as a re-implementation of their internal, 15-year-old system called Borg. Currently the governance of K8s lies with the Cloud Native Computing Foundation, part of the Linux Foundation.

So what are some of the use cases that Ian saw talking to customers:
  • "Can you help us setup a K8s cluster?" - asked by a customer with one monolithic application deployed twice a year. Clearly that is not a good fit for K8s. You will need a certain level of automation, continuous integration and continuous delivery for K8s to make any sense at all.
  • There were customers trying to get into K8s in order to be able to hire talent interested in that technology. That pretty much gets the problem the wrong way around. K8s also won't help with organisational problems where dev and ops teams aren't talking with each other.
  • The first question to ask when deploying K8s is whether to go for on-prem, hosted externally or a mix of both. One factor pulling heavily towards a hosted solution is the level of time and training investment people are willing to make in K8s. Ian told the audience that he was able to migrate a complete startup to K8s within a short period of time by relying on a hosted solution, resulting in a setup that requires just one ops person to maintain. In that particular instance the tech that remained on-prem were Elasticsearch and Kafka as services.
  • Another client (government related, huge security requirements) decided to go for on-prem. They had strict requirements not to connect their internal network to the public internet, resulting in people carrying downloaded software on USB sticks from one machine to the other. The obvious recommendation to ease things here is to relax the security requirements at least a little bit.
  • In a third use case the customer tried to establish a prod cluster, a staging cluster, a test cluster and one dev cluster per developer - pretty much turning into a maintenance nightmare. The solution was to go for a one-cluster architecture, using shared resources, but namespaces to create virtual clusters, role based access control for security, network policies to restrict which services can talk to each other, and service level TLS to secure communications. Looking at CI this can be taken one level further even - spinning up clusters on the fly when they are needed for testing.
  • In another customer case Java apps were dying randomly - apparently because what was deployed was using the default settings. Lesson learnt: Learn how it works first, go to production after that.

Rebuilding trust through blockchains and open source

Having pretty much no background in blockchains - other than knowing that a thing like bitcoin exists - I decided to go to the introductory "Rebuilding trust through blockchains and open source" talk next. Marta started off by explaining how societies are built on top of trust. However today (potentially accelerated through tech) this trust in NGOs, governments and institutions is being eroded. Her solution to the problem is called Hyperledger, a trust protocol to build an enterprise-grade distributed database based on a permissioned blockchain with trust built in.

Marta went on to detail eight use cases:
  • Cross border payments: Currently, using SWIFT, these take days to complete, cost a lot of money, and are complicated to do. The goal with rolling out blockchains for this would be to make reconciliation real-time, and to put information on a shared ledger to make it auditable as well. At the moment ANZ, WellsFargo, BNP Paribas and BNY Mellon are participating in this POC.
  • Healthcare records: The goal is to put pointers to medical data on a shared ledger so that procedures like blood testing are done just once and can be trusted across institutions.
  • Interstate medical licensing: Here the goal is to make treatment reimbursement easier, probably even allowing for handing out fixed-purpose budgets.
  • Ethical seafood movement: Here the goal is to put information on supply chains for seafood on a shared ledger to make tracking easier, auditable and cheaper. The same applies to other supply chains, think diamonds, coffee etc.
  • Real estate transactions: The goal is to keep track of land title records on a shared ledger for easier tracking, auditing and access. The same could be done for certifications (e.g. of academic titles etc.)
  • Last but not least there is a POC on how to use shared ledgers to track ownership of creative works in a distributed way and take the middleman distributing money to artists out of the loop.

Kernel developers panel discussion

For the panel discussion Jonathan Corbet invited five different Linux kernel hackers in different stages of their career, with different backgrounds to answer audience questions. The panel featured Vlastimil Babka, Arnd Bergmann, Thomas Gleixner, Narcisa Vasile, Laura Abbott.

The first question revolved around how people had gotten started with open source and kernel development and what advice they would have for newbies. The one piece of advice shared by everyone, other than scratch your own itch and find something that interests you: Be persistent. Don't give up.

Talking about release cycles and moving too fast or too slow there was a comment on best practice to get patches into the kernel that I found very valuable: Don't get started coding right away. A lot of waste could have been prevented if people just shared their needs early on and asked questions instead of diving right into coding.

There was discussion on the meaning of long term stability. General consensus seemed to be that long term support really only includes security and stability fixes. No new features. Imagine adding current devices to a 20-year-old kernel that doesn't even support USB yet.

There was a lovely quote by Narcisa on the dangers and advantages of using C as a primary coding language: With great power come great bugs.

There was discussion on using "new-fangled" tools like GitHub instead of plain e-mail. Sure, e-mail is harder to get into as a new contributor. However current maintainer processes heavily rely on it as a tool for communication. There was a joke about implementing their own tool for that, just like was done with git. One argument for using something less flexible that I found interesting: Apparently it's hard to switch between subsystems just because workflows differ so much, so agreeing on a common workflow would make that easier.
Asked what would happen if Linus was eaten by a shark while scuba diving, the answer was interesting: Likely at first there would be a hiding game because nobody would want to take up his work load. Next a team of maintainers would likely develop, collaborating in a consensus based model to keep up with things.

In terms of testing - that depends heavily on hardware being available to test on. Things like the kernel CI community help a lot with that.

I closed the day going to Zaheda Bhorat's talk "Love what you do - everyday" on her journey in the open source world. It's a great motivation for people to start contributing to the open source community and become part of it - often for life, changing what you do in ways you would never have imagined before. Lots of love for The Apache Software Foundation in it.

Tuesday, 24 October 2017

Security by Obscurity

Planet FSFE on Iain R. Learmonth | 22:00, Tuesday, 24 October 2017

Today this blog post turned up on Hacker News, titled “Obscurity is a Valid Security Layer”. It makes some excellent points on the distinction between good and bad obscurity and it gives an example of good obscurity with SSH.

From the post:

    I configured my SSH daemon to listen on port 24 in addition to its regular port of 22 so I could see the difference in attempts to connect to each (the connections are usually password guessing attempts). My expected result is far fewer attempts to access SSH on port 24 than port 22, which I equate to less risk to my, or any, SSH daemon.

    I ran with this alternate port configuration for a single weekend, and received over eighteen thousand (18,000) connections to port 22, and five (5) to port 24.

Those of you that know me in the outside world will have probably heard me talk about how it’s insane we have all these services running on the public Internet that don’t need to be there, just waiting to be attacked.

I have previously given a talk at TechMeetup Aberdeen where I talk about my use of Tor’s Onion services to have services that only I should ever connect to be hidden from the general Internet.

Onion services, especially the client authentication features, can also be useful for IoT dashboards and devices, allowing access from the Internet but via a secure and authenticated channel that is updated even when the IoT devices behind it have long been abandoned.

If you’re interested to learn more about Onion services, you could watch Roger Dingledine’s talk from Def Con 25.

Monday, 23 October 2017

Public Money? Public Code!

Norbert Tretkowski | 22:11, Monday, 23 October 2017

Video: https://www.youtube.com/embed/iuVUzg6x2yo

Open Source Summit Prague 2017 - part 1

Inductive Bias | 11:18, Monday, 23 October 2017

Open Source Summit, formerly known as LinuxCon, this year took place in Prague. Drawing some 2000 attendees to the lovely Czech city, the conference focussed on all things Linux kernel, containers, community and governance.

Keynotes

The first day started with three crowded keynotes. The first one was given by Neha Narkhede on Apache Kafka and the Rise of the Streaming Platform. The second one was given by Reuben Paul (11 years old) on how hacking today really is just child's play: The hack itself might seem like toying around (getting into the protocol of children's toys in order to make them do things without using the app that was intended to control them). Taken into the bigger context of a world that is getting more and more interconnected - ranging from regular laptops over mobile devices to cars and the little sensors running your home - the lack of thought that goes into security when building systems today is both startling and worrying at the same time.

The third keynote of the morning was given by Jono Bacon on what it takes to incentivise communities - be it open source communities, volunteer-run organisations or corporations. From his perspective there are four major factors that drive human actions:

  • People strive for acceptance. This can be exploited when building communities: Acceptance is often displayed by some form of status. People are more likely to do what makes them proceed in their career, gain the next level on a leaderboard, or gain some form of real or artificial title.
  • Humans are a reciprocal species. Ever heard of the phrase "a favour given - a favour taken"? People who once received a favour from you are more likely to help in the long run.
  • People form habits through repetition - but it takes time to get into a habit: You need to make sure people repeat the behaviour you want them to show for at least two months until it becomes a habit that they themselves continue to drive without your help. If you are trying to roll out peer-review-based, pull-request-based working as a new model - it will take roughly two months for people to adopt it as a habit.
  • Humans have a fairly good bullshit radar. Try to remain authentic: instead of automated thank-yous, extend authentic (I would add qualified) thank-you messages.


When it comes to the process of incentivising people Jono proposed a three-step model: from hook to reason to reward.

Hook here means a trigger. What triggers the incentivising process? You can look at how people participate - number of pull requests, amount of documentation contributed, time spent giving talks at conferences. Those are all action based triggers. What's often more valuable is to look out for validation based triggers: Pull requests submitted, reviewed and merged. He showed an example of a public hacker leaderboard that had its evaluation system published. While that's lovely in terms of transparency, IMHO it has two drawbacks: It makes it much easier to evaluate known, wanted contributions than contributions people might not have thought of as valuable when setting up the leaderboard. With that it also heavily influences which contributions will come in and might invite a "hack the leaderboard" kind of behaviour.

When thinking about the reason there are two types of incentives: The reason could be invisible up-front - Jono called this submarine rewards. Without clear prior warning people get their reward for something that was wanted. Or the reason could be stated up front: "If you do that, then you'll get reward x". Which type to choose heavily depends on your organisation, the individual giving out the reward as well as the individual receiving the reward. The deciding factor often is to be found in which is more likely authentic to your organisation.

In terms of the reward itself: There are extrinsic motivators - swag like stickers, t-shirts, give-aways. Those tend to be expensive, in particular if shipping them is needed. Something that is often overlooked in professional open source projects are intrinsic rewards: A thank you goes a long way. So does a blog post. Or some social media mention. Invitations help. So do referrals to one's own network. Direct lines to key people help. Testimonials help.

Overall, measurement is key. So is focusing on incentivising shared value.

Limux - the loss of a lighthouse

In his talk, Matthias Kirschner gave an overview of Limux - the Linux rolled out in the Munich administration: how it started, what went wrong during evaluation, and which way political forces were pulling.

What I found very interesting about the talk were the questions that Matthias raised at the very end:

  • Do we suck at the desktop? Are there too many dependent apps?
  • Did we focus too much on the cost aspect?
  • Is the community supportive enough of people trying to monetise open source?
  • Do we harm migrations by volunteering - as in single people supporting a project without a budget, burning out in the process instead of setting up sustainable projects with a real budget? Instead of teaching the pros and cons of going for free software so people are in a good position to argue for a sustainable project budget?
  • Within administrations: Did we focus too much on the operating system instead of freeing the apps people are using on a day-to-day basis?
  • Did we focus too much on one star project instead of collecting and publicising many different free software based approaches?

As a lesson from these events, the FSFE launched an initiative to drive the development of code funded by public money under free licenses.

Dude, Where's My Microservice

In his talk Dude, Where's My Microservice? - Tomasz Janiszewski from Allegro gave an introduction to what projects like Marathon on Apache Mesos, Docker Swarm, Kubernetes or Nomad can do for your microservices architecture. While the examples given in the talk refer to specific technologies, they are intended to be general purpose.

Coming from a virtual machine based world where apps are tied to virtual machines which themselves are tied to physical machines, what projects like Apache Mesos try to do is to abstract that exact machine mapping away. As a first result of this decision, how to communicate between microservices becomes a lot less obvious. This is where service discovery enters the stage.

When running in a microservice environment one goal when assigning tasks to services is to avoid unhealthy targets. In terms of resource utilization, instead of overprovisioning the goal is to use just the right amount of your resources in order to avoid wasting money on idle resources. Individual service overload is to be avoided as well.

Looking at an example of three physical hosts running three services in a redundant manner, how can assigning tasks to these instances be achieved?

  • One very simple solution is to go for a proxy-based architecture. There is a single point of change, and there aren't any in-app dependencies to make this model work. You can implement fine-grained load balancing in your proxy. However this comes at the cost of having a single point of failure, one additional hop in the middle, and usually requires using a common protocol that the proxy understands.
  • Another approach would be to go for a DNS-based architecture: Have one registry that holds information on where services are located, but talk to them directly instead of through a proxy. The advantages here: No additional hop once the name is resolved, no single point of failure - services can work with stale data, and it's protocol independent. However it does come with in-app dependencies. Load balancing has to happen local to the app. You will want to cache name resolution results, but every cache needs some cache invalidation strategy. (A minimal sketch of this approach follows below.)
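
Not from the talk, but a minimal sketch of the DNS-based variant: resolve the service name, cache the result for a short TTL, and pick one instance per request as naive client-side load balancing. The service name is hypothetical (something a cluster DNS such as Mesos-DNS or Consul might serve):

    import java.net.InetAddress;
    import java.net.UnknownHostException;
    import java.util.Arrays;
    import java.util.List;
    import java.util.Map;
    import java.util.Random;
    import java.util.concurrent.ConcurrentHashMap;

    // Minimal sketch of DNS-based service discovery with naive client-side load
    // balancing and a time-based cache. The service name is made up for illustration.
    public class DnsServiceDiscovery {
        private static final long TTL_MILLIS = 5_000; // cache entries go stale after 5s

        private static class Entry { final List<InetAddress> addrs; final long fetchedAt;
            Entry(List<InetAddress> addrs, long fetchedAt) { this.addrs = addrs; this.fetchedAt = fetchedAt; } }

        private final Map<String, Entry> cache = new ConcurrentHashMap<>();
        private final Random random = new Random();

        // Resolve the service name and pick one of its instances at random.
        public InetAddress pickInstance(String serviceName) throws UnknownHostException {
            Entry entry = cache.get(serviceName);
            if (entry == null || System.currentTimeMillis() - entry.fetchedAt > TTL_MILLIS) {
                // One lookup may return several A records, one per service instance.
                entry = new Entry(Arrays.asList(InetAddress.getAllByName(serviceName)),
                                  System.currentTimeMillis());
                cache.put(serviceName, entry);
            }
            return entry.addrs.get(random.nextInt(entry.addrs.size()));
        }

        public static void main(String[] args) throws UnknownHostException {
            // Hypothetical name registered in the cluster's DNS.
            System.out.println(new DnsServiceDiscovery().pickInstance("payments.service.consul"));
        }
    }

The cache invalidation caveat from the list above shows up here as the TTL trade-off: set it too long and you keep routing to dead instances, too short and you hammer the DNS server.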


In both solutions you will also still have logic e.g. for de-registering services. You will have to make sure to register your service only once it has successfully booted up.

Enter the service mesh architecture, e.g. based on Linkerd or Envoy. The idea here is to have what Tomek called a sidecar added to each service that talks to the service mesh controller to take care of service discovery, health checking, routing, load balancing, authn/z, metrics and tracing. The service mesh controller holds information on which services are available, available load balancing algorithms and heuristics, retries, timeouts and circuit breaking, as well as deployments. As a result the service itself no longer has to take care of load balancing, circuit breaking, retry policies, or even tracing.

After that high level overview of where microservice orchestration can take you, I took a break, following a good friend to the introduction to SoC+FPGA talk. It's great to see Linux support for these systems - even if it's not quite as stable as it would be in an ideal world.

Trolling != Enforcement

The afternoon for me started with a very valuable talk by Shane Coughlan on how trolling doesn't equal enforcement. This talk was related to what was published on LWN earlier this year.

Shane started off by explaining some of the history of open source licensing, from times when it was unclear whether documents like the GPL would hold up in front of courts, to how projects like gplviolations.org proved that these are indeed valid legal contracts that can be enforced in court. What he made clear was that those licenses are the basis for equal collaboration: They are a common set of rules that parties not knowing each other agree to adhere to. As a result, following the rules set forth in those licenses does create trust in the wider community and thus leads to more collaboration overall. On the flipside, breaking the rules erodes this very trust. It leads to less trust in those companies breaking the rules. It also leads to less trust in open source if projects don't follow the rules as expected.

However when it comes to copyright enforcement, the case of Patrick McHardy does raise the question of whether all copyright enforcement is good for the wider community. In order to understand that question we need to look at the method that Patrick McHardy employs: He will get in touch with companies for seemingly minor copyright infringements, ask for a cease and desist to be signed and get a small sum of money out of his target. In a second step the process above repeats, except the sum extracted increases. Unfortunately what this approach has shown is that there is a viable business model here that hasn't been widely tapped into yet. So while the activities of Patrick McHardy probably aren't so bad in and of themselves, they do set a precedent that others might follow, causing way more harm.

Clearly there is no easy way out. Suggestions include establishing common norms for enforcement and ensuring that hostile actors are clearly unwelcome. For companies, steps that can be taken include understanding the basics of legal requirements, understanding community norms, and having processes and tooling to address both. As one step there is a project called OpenChain publishing material on the topic of open source copyright, compliance and compliance self-certification.

Kernel live patching

Following Tomas Tomecek's talk on how to get from Dockerfiles to Ansible Containers I went to a talk given by Miroslav Benes from SuSE on Linux kernel live patching.

The topic is interesting for a number of reasons: As early as 2008 MIT developed something called Ksplice which uses jumps patched into functions for call redirection. The project was acquired by Oracle - and discontinued.

In 2014 SuSE came up with something called kGraft for Linux live patching based on immediate patching but lazy migration. At the same time RedHat developed kpatch based on an activeness check.

In the case of kGraft the goal was to be able to apply limited-scope fixes to the Linux kernel (e.g. for security, stability or corruption fixes), require only minimal changes to the source code, have no runtime cost impact, cause no interruption to applications while patching, and allow for full review of patch source code.

The way it is implemented is fairly obvious - in hindsight: It's based on re-using the ftrace framework. kGraft uses the tracer for interception but then asks ftrace to return to a different address, namely the start of the patched function. So far the feature is available for x86 only.

Now while patching a single function is easy, making changes that affect multiple functions gets trickier. This means a need for lazy migration that ensures function type safety based on a consistency model. In kGraft this is based on a per-thread flag that marks all tasks in the beginning and makes waiting for them to be migrated possible.

From 2014 onwards it took a year to get the ideas merged into mainline. What is available there is a mixture of both kGraft and kpatch.

What are the limitations of the merged approach? There is no way right now to deal with data structure changes, in particular when thinking about spinlocks and mutexes. Consistency reasoning right now is done manually. Architectures other than x86 are still an open issue. Documentation and better testing are open tasks.
