1. Install OBS on Ubuntu 14.04

    If you've been using SimpleScreenRecorder like I have to stream games to Twitch, you know it has some annoying problems. OpenGL capture requires injecting memory addresses into the game on every launch (very tedious to change each time with Steam), and while performance is better than something like ffmpeg with x11 capture, it's still not as good as it could be. Mixing your microphone audio into the stream requires reconfiguring an external mixer (like PulseAudio).

    OBS running on Ubuntu 14.04

    Enter the Open Broadcaster Software (OBS) Studio rewrite. OBS has long been one of the major choices for streaming on Windows, and the project has been working on a cross-platform rewrite. For me, it's now faster than SimpleScreenRecorder, and the only configuration I had to do was select my audio devices in the menu and add my Twitch.tv stream key. I'm sure there are quite a few bugs left to squash, but it's already the best choice for streaming games on Linux.

    To get OBS running on Ubuntu 14.04, you need a couple of PPAs.

     sudo apt-add-repository ppa:jon-severinsson/ffmpeg
     sudo apt-add-repository ppa:obsproject/obs-studio

    Ubuntu 14.04 ships libav instead of ffmpeg (libav is a fork of ffmpeg), and it lacks some parts of ffmpeg that OBS needs, so the first PPA provides upstream ffmpeg builds for Ubuntu. The second PPA is OBS Studio itself.

     sudo apt-get update
     sudo apt-get install obs-studio

    That will install OBS. Run it from the command line with obs. Launch a game, then add it to the sources list for your scene; I used the Xcomposite source type.


  2. RPMB eMMC errors under Linux

    Newer eMMC flash devices have a small partition (several megabytes) used to store OEM security keys (for things like DRM, or encrypting private app data under something like Android). Linux implements support for this Replay Protected Memory Block (RPMB) partition in the form of ioctls. It's a pretty raw access layer; most of the implementation on these devices rightfully lives in userland instead of the kernel.

    If you have one of these devices, in some configurations you will see errors like these:

    [   11.361670] mmc0: Got data interrupt 0x00000002 even though no data operation was in progress.
    [   11.363818] mmcblk0rpmb: error -110 transferring data, sector 8064, nr 8, cmd response 0x900, card status 0xb00
    [   11.363932] mmcblk0rpmb: retrying using single block read
    [   11.365980] mmcblk0rpmb: timed out sending r/w cmd command, card status 0x400900
    [   11.368122] mmcblk0rpmb: timed out sending r/w cmd command, card status 0x400900
    [   11.370246] mmcblk0rpmb: timed out sending r/w cmd command, card status 0x400900
    [   11.372380] mmcblk0rpmb: timed out sending r/w cmd command, card status 0x400900
    [   11.374503] mmcblk0rpmb: timed out sending r/w cmd command, card status 0x400900
    [   11.376637] mmcblk0rpmb: timed out sending r/w cmd command, card status 0x400900
    [   11.376723] end_request: I/O error, dev mmcblk0rpmb, sector 8064
    [   11.376793] Buffer I/O error on device mmcblk0rpmb, logical block 1008

    On the Asus T100 tablet I was testing with, Ubuntu would hang for 5-10 seconds each time this happened during boot, and it would happen 3 or 4 times before the boot finished. Some later disk operations would also try to probe that partition and cause the same sort of I/O hangs. I went looking for a udev rule to fix it, but the errors kept happening even after removing the MMC setup rules. To finally fix it, I ended up writing this small kernel patch.

    From 0f5081c323c52ac842b01fd79df3b3c251f7aca9 Mon Sep 17 00:00:00 2001
    From: Nell Hardcastle <nell@dev-nell.com>
    Date: Thu, 29 May 2014 22:06:50 -0700
    Subject: [PATCH] eMMC: Don't initialize partitions on RPMB flagged areas.

    Prevents a lot of pointless hanging at boot on some devices.
    ---
     drivers/mmc/card/block.c | 2 +-
     1 file changed, 1 insertion(+), 1 deletion(-)
    diff --git a/drivers/mmc/card/block.c b/drivers/mmc/card/block.c
    index 452782b..dd85dcf 100644
    --- a/drivers/mmc/card/block.c
    +++ b/drivers/mmc/card/block.c
    @@ -2255,7 +2255,7 @@ static int mmc_blk_alloc_parts(struct mmc_card *card, struct mmc_blk_data *md)
     		return 0;
     
     	for (idx = 0; idx < card->nr_parts; idx++) {
    -		if (card->part[idx].size) {
    +		if (card->part[idx].size && !(card->part[idx].area_type & MMC_BLK_DATA_AREA_RPMB)) {
     			ret = mmc_blk_alloc_part(card, md,
     						card->part[idx].part_cfg,
     						card->part[idx].size >> 9,

    This isn't suitable for contributing to the kernel: if you actually need to use the RPMB partition for something, you may need the partition setup this removes. For anyone else hitting this problem, though, it might be useful. The right way to do this is probably deeper in the initialization, so that it doesn't disable the RPMB device entirely.


  3. Simple network multiplayer pong with PySDL2 and ZeroMQ

    I've been playing around with simple client synchronization for multiplayer games during Ludum Dare 29. I didn't end up building a game to submit but I did pull this tutorial together from some of what I was working on.

    Lockstep Synchronization of Input

    In any multiplayer real-time game, you need to keep the game state in sync between the players. The most obvious approach is to update the other client whenever you alter the game state. Imagine a game of chess: whenever you move a piece, the state of the board updates, so you can send a copy of the board to the other player and you'll both be playing the same game again. This works very well for small boards and simple collections of state. Computer games are rarely played on "small boards", though; they tend to have lots of moving state that has to be updated for all the clients somehow. Sending the entire state whenever it changes, sixty times a second, doesn't work if your state is 1MB and the players are connecting over the internet (that's 60MB/s of upstream bandwidth per peer).

    The alternative is to sync only a portion of the board: send just the difference between the current board and the previous one. Many games do this, syncing the whole board when you connect and then carefully managing updates to any kind of state.

    For many games, the smallest stream of data is the input made by the players. If your game simulation is deterministic, instead of sending changes in state you can forward each player's input to every other player and have all clients agree on the same simulation without ever exchanging state. This requires only one initial sync of the state. After that, each client takes input and relays it to all the others. Once everyone has seen the same input, the next step of the simulation runs and new input is forwarded.
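
    To see why determinism is the key requirement, here's a tiny self-contained sketch (not from the pong code; the state and step function are made up) of two "clients" that share only inputs yet stay identical:

    def step_state(state, inputs):
        # Toy deterministic simulation: two paddle positions and a ball value.
        # The same state and the same inputs always produce the same next state.
        p0, p1, ball = state
        return (p0 + inputs[0], p1 + inputs[1], (ball + p0 + p1) % 480)

    # The same input stream, as both clients see it after relaying
    input_stream = [(0, 3), (-3, 3), (-3, 0), (0, 0)]

    client_a = client_b = (200, 200, 240)
    for step_inputs in input_stream:
        client_a = step_state(client_a, step_inputs)
        client_b = step_state(client_b, step_inputs)

    # No state was ever exchanged, but the simulations agree exactly
    assert client_a == client_b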

    Using ZeroMQ

    The PySDL2 pong tutorial is the starting point for this. Read through and understand it first.

    First we'll initialize ZeroMQ and create two sockets: one for receiving messages and one for sending them. For testing, both players run on localhost; the receiving socket binds to this client's address, and the sending socket connects to the other player's. Which player this client controls is passed as a 0 or a 1 when starting the script: ./pong 0

    import sys

    import zmq

    def run():
        # The ports here are example values (the originals were lost);
        # any two free ports will do
        zmq_addresses = ["tcp://127.0.0.1:9000", "tcp://127.0.0.1:9001"]
        us = int(sys.argv[1])
        zmq_context = zmq.Context.instance()
        # Receive on our own address, push to the other player's
        in_sock = zmq_context.socket(zmq.PULL)
        in_sock.bind(zmq_addresses[us])
        out_sock = zmq_context.socket(zmq.PUSH)
        out_sock.connect(zmq_addresses[(us + 1) % 2])

    Next, we decide on a protocol for the two clients to use. Since we only have a few kinds of input events, and the only extra information associated with them is which key was pressed, we send a step value identifying which "frame" of game state the input is for, followed by the kind of event and the key, if one is associated with the event. This replaces the direct modification of the player state in the event handlers. ZeroMQ deals in bytes, not integers like the SDL constants, so the tokens chosen are short strings.

    def run():
        running = True
        step = 0
        while running:
            local_events = sdl2.ext.get_events()
            # Prepend the step for this frame
            send_events = [str(step)]
            # Then list events to forward as (type, key) pairs
            for event in local_events:
                if event.type == sdl2.SDL_QUIT:
                    send_events.extend(["QUIT", ""])
                elif event.type == sdl2.SDL_KEYDOWN:
                    if event.key.keysym.sym == sdl2.SDLK_UP:
                        send_events.extend(["KEYDOWN", "UP"])
                    elif event.key.keysym.sym == sdl2.SDLK_DOWN:
                        send_events.extend(["KEYDOWN", "DOWN"])
                elif event.type == sdl2.SDL_KEYUP:
                    if event.key.keysym.sym == sdl2.SDLK_UP:
                        send_events.extend(["KEYUP", "UP"])
                    elif event.key.keysym.sym == sdl2.SDLK_DOWN:
                        send_events.extend(["KEYUP", "DOWN"])
            # Forward our events to the other client
            out_sock.send_multipart(send_events)

    Each message ends up looking like this, the step followed by (type, key) pairs:

    ['3', 'KEYDOWN', 'UP']

    Now that we have all the local events, we ask the input socket for any events from the other client. The order is important to prevent deadlocks: we always send our own message before waiting for new ones.

    # Blocks until the other client's events for this frame arrive
    remote_events = in_sock.recv_multipart()
    # If the steps don't match, the clients have fallen out of lockstep
    assert remote_events[0] == str(step)

    We still need to handle the local events from earlier. They are now recorded in the same token format as the events arriving on the PULL socket, so a single handler works for both players; it returns False when a QUIT event ends the game.

    def handle_events(player, events):
        # Walk the flat token list as (type, key) pairs
        for event_type, event_key in zip(events[0::2], events[1::2]):
            if event_type == "QUIT":
                return False
            elif event_type == "KEYDOWN":
                if event_key == "UP":
                    player.velocity.vy = -3
                elif event_key == "DOWN":
                    player.velocity.vy = 3
            elif event_type == "KEYUP":
                if event_key in ("UP", "DOWN"):
                    player.velocity.vy = 0
        return True

    def run():
        # Apply the remote player's input, then our own
        running = handle_events(players[not us], remote_events[1:])
        running = handle_events(players[us], send_events[1:]) and running
        # Run the simulation after all events are handled
        step += 1

    Now you can start both pong clients, ./pong 0 and ./pong 1, and they'll begin running. Each client has to wait for the other's next message before continuing, so they stay in sync. Getting this working across a real network or the internet is also simple: specify the other client's address as an argument instead of choosing it from the hardcoded list.

    Example Code

    A working example is available on GitLab. It has a few minor extensions to the PySDL2 tutorial: the frame rate is limited to about 30 frames per second, and the ball and paddles move faster. I wrote it with PyPy, but CPython should do fine too.

    Some interesting ways to extend it: allowing a client to reconnect, keeping track of the score, or adding additional players. Maybe I'll continue the tutorial with those later.


  4. Brief Technical Overview of Merged Mining

    Such Merkle tree

    A lot of people seem confused about how merged mining of cryptocurrencies sharing a hashing algorithm works, so I thought I'd write this up. It's more of a brief process overview than an FAQ, as there have been several of those, but feel free to ask questions here or in the /r/dogecoin thread.

    Mining Basics

    To mine any block, you construct a block header that includes the previous block's header hash and a Merkle root hash. The Merkle root is the parent hash of a tree of hashes built from the potential transactions that will be included in the block. Each leaf of the tree (a transaction) is hashed, then each parent node is built by hashing its two child hashes. This continues up the tree until a single root hash is calculated that commits to all of the child hashes, meaning that if the Merkle root is valid, the entire block's worth of transactions must also be.
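
    As a concrete sketch, here's the Bitcoin-family tree construction (illustrative only; scrypt coins like Litecoin and Dogecoin use scrypt for the header proof of work, while the tree itself still uses double SHA-256):

    import hashlib

    def sha256d(data):
        # Double SHA-256, the hash used inside Bitcoin-family Merkle trees
        return hashlib.sha256(hashlib.sha256(data).digest()).digest()

    def merkle_root(tx_hashes):
        level = list(tx_hashes)
        while len(level) > 1:
            if len(level) % 2:
                level.append(level[-1])  # odd count: duplicate the last hash
            level = [sha256d(level[i] + level[i + 1])
                     for i in range(0, len(level), 2)]
        return level[0]

    # Four fake transaction hashes stand in for a block's contents
    root = merkle_root([sha256d(b"tx%d" % i) for i in range(4)])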

    The block header includes a few other important values: the difficulty target (which decreases as the difficulty rises), a nonce, and a timestamp. With rising difficulty, finding a valid hash becomes more time consuming, because a miner's block hash must be lower than the difficulty target to be accepted by the network; this shrinks the space of accepted hashes. Each time an invalid block hash is calculated, the nonce is incremented and the header is hashed again until a match is found.

    When the header nonce overflows (it's a 32-bit number), the Merkle root is recalculated with its own nonce. The timestamp or transaction contents may also be adjusted when generating this new header, and the search resumes on the new block header. This is what makes the search space large enough to find a sufficiently small block hash.
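
    The search loop itself is simple. Here's an illustrative toy version (real headers are fixed 80-byte structures, and scrypt coins hash them with scrypt rather than the SHA-256d used here):

    import hashlib
    import struct

    def mine(header_base, target):
        # Try every nonce; a hash that reads as a number below the target wins
        for nonce in range(2**32):
            header = header_base + struct.pack("<I", nonce)
            h = hashlib.sha256(hashlib.sha256(header).digest()).digest()
            if int.from_bytes(h, "little") < target:
                return nonce, h
        # Nonce space exhausted: rebuild the header (new Merkle nonce,
        # timestamp, or transaction set) and search again
        return None

    # A generous target finds a nonce almost immediately; lowering the
    # target (raising the difficulty) makes the search take longer
    print(mine(b"example header", 2**240))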

    Merged Mining Changes

    If two coins share a hashing algorithm, you can adopt a scheme that lets you construct a block header valid for both. The first step is selecting a parent and a child blockchain, as the child will need to be forked (more on why in a bit). In this example, consider Litecoin the parent and Dogecoin the child.

    To mine a merged block, you set the header's target to the higher difficulty of the two blockchains. Then you construct block headers for both. In the Litecoin Merkle tree, you include the Dogecoin block header hash as a transaction. This way, solving the Litecoin block also verifies the Dogecoin transactions with the same proof of work.
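
    Reusing sha256d and merkle_root from the sketch above, the commitment looks roughly like this (simplified to match the description here; the production AuxPoW scheme embeds the child hash in the parent's coinbase transaction rather than a standalone one):

    # The Dogecoin header hash is committed to in the Litecoin tree
    # as if it were just another transaction
    doge_header_hash = sha256d(b"...serialized dogecoin header...")
    litecoin_txids = [sha256d(b"ltc-tx%d" % i) for i in range(3)]
    merged_root = merkle_root(litecoin_txids + [doge_header_hash])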

    Miners' work units are generated from the Litecoin block header, and mining continues as usual, looking for a hash that satisfies that block. If we solve at Litecoin's difficulty, the block is submitted to the Litecoin network with the Litecoin header and the Litecoin transaction set, plus one extra transaction that the Litecoin network ignores (containing the Dogecoin block header hash). Assuming Litecoin has the higher difficulty, this also results in a Dogecoin block being discovered.

    If we solve at or above Dogecoin's difficulty, both block headers are submitted to the Dogecoin network along with Dogecoin's transaction set. This is why the fork is needed: the Dogecoin client must accept blocks containing both headers and treat the hash of the Litecoin header as proof of work (since that header commits to the Dogecoin block header).

    The result is slightly more data in Dogecoin's blockchain and one extra transaction per solved block in the Litecoin chain. A relatively small amount of overhead for the potential benefits of pooling the hash rate.


  5. Postfix email server with Salt configuration

    Here's a Salt repo for quickly configuring an email server on Ubuntu 13.10. It includes configuration for Postfix, Dovecot, Nginx, stud (SSL termination for Nginx), Roundcube, PostgreSQL, OpenDKIM, and DSPAM. The Git repo has my local configuration stripped out. This configuration is based on NSA-proof your e-mail in 2 hours by Drew Crawford.

    A few steps are not done automatically by Salt. You need to write a pillar configuration to fill in a lot of variables. For example - /srv/pillar/mail.sls:

      dbname: mailserver
      dbuser: postfix
      dbpass: (database_password)
      domain: example.com
      # the users/aliases key names below are reconstructed; check the
      # repo's example pillar for the exact structure
      users:
        - user1:
          email: user1@example.com
          password: (unix password hash)
        - user2:
          email: user2@example.com
          password: (another hash)
      aliases:
        root@example.com: user1@example.com
        postmaster@example.com: user1@example.com

    The encfs that stores all the email isn't mounted by default. That way the passphrase isn't stored on the server itself. Create the encfs at /var/mail/encrypted:

    encfs /var/mail/encrypted /var/mail/decrypted

    Also a top-level pillar config - /srv/pillar/top.sls:

      base:
        '*':
          - mail

    SSL certs need to be installed too: one set for Dovecot/Postfix, and one for stud (which needs the public and private components merged into a single PEM file).

    Then run Salt to set the rest up.

    salt-call --local state.highstate -l debug

    Hopefully that all works on a fresh Ubuntu 13.10 install. I've only tested it once and have made some changes since, so comment if you run into problems, or send a pull request with a fix! It should at least get you close to a working mailserver without a lot of effort.


  6. Patching 32-bit Windows executables for allocating memory beyond 2GB

    A lot of modern 32-bit Windows games are memory bound at times, and depending on the game, that can affect Wine more than native Windows. Final Fantasy XIV, for example, would crash under Wine on my system by running out of its 2GB address space. To work around this, it can be useful to load the process with the LARGE_ADDRESS_AWARE flag, which lets the application allocate memory beyond the 2GB boundary a 32-bit Win32 application is normally limited to (Windows divides the address space into a lower 2GB userspace range and an upper 2GB kernel range). The flag is set at link time when building the executable, and official Wine builds respect it.

    There are a couple of ways to force the flag on. You could patch Wine's loader to enable it for every 32-bit application, but that would break some of them due to assumptions about the size of the process address space. The other option is to patch the flag into the executable itself, letting you keep a common build of Wine and modify only the applications that need the flag.

    I wrote this Python port of a C# Stack Overflow solution for patching the flag in. It turns out that patching FFXIV doesn't actually work, because the game refuses to connect with a modified executable, but maybe it'll be useful to someone trying to run a different game in unmodified Wine.
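
    The linked script isn't reproduced here, but the core of the technique is small enough to sketch from memory (a rough reconstruction, not the exact port; note that flipping the flag invalidates the PE checksum and will upset anything that verifies the binary):

    import struct
    import sys

    IMAGE_FILE_LARGE_ADDRESS_AWARE = 0x0020

    def set_large_address_aware(path):
        with open(path, "r+b") as f:
            # e_lfanew at offset 0x3c points at the PE header
            f.seek(0x3c)
            pe_offset, = struct.unpack("<I", f.read(4))
            f.seek(pe_offset)
            if f.read(4) != b"PE\x00\x00":
                raise ValueError("not a PE executable")
            # Characteristics sits 18 bytes into the COFF file header,
            # which follows the 4-byte PE signature
            char_offset = pe_offset + 4 + 18
            f.seek(char_offset)
            characteristics, = struct.unpack("<H", f.read(2))
            f.seek(char_offset)
            f.write(struct.pack("<H",
                    characteristics | IMAGE_FILE_LARGE_ADDRESS_AWARE))

    if __name__ == "__main__":
        set_large_address_aware(sys.argv[1])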

    The solution for FFXIV was to use Andrew Church's patch to add an environment variable, WINEFORCELARGEADDRESSAWARE, to enable the flag without modifying the executable.


  7. Sony Vaio Pro 11 with Ubuntu

    TH05 eGPU

    Update - 2013-09-15: Both the wireless changes and the intel_pstate changes have been upstreamed into Linux 3.12, which is coming soon. Instead of patching the kernel yourself, you can just grab a 3.12 build for your particular distro.

    Just picked up a Vaio Pro 11 to replace my MacBook Air. It's another interesting piece of hardware: one of the first Haswell ultrabooks to be released, and in my preferred ~11" form factor. Following are some tips for getting Ubuntu running on these devices. The Pro 11 and Pro 13 are virtually identical, and some of this probably applies to all Haswell laptops.

    To install Ubuntu, you need the Ubuntu 13.10 daily image; the 13.04 image doesn't correctly set up the GPU at boot. Before installing, the easiest path is to switch the device to legacy mode in the "advanced bios options menu". This will prevent you from running Windows 8. EFI mode should work in theory, but after a day of EFI installs I couldn't get one running; the system does at least have the option to disable secure boot. Format the disk as MBR (not GPT) when installing in legacy mode, since syncing the MBR from the GPT layout doesn't work on this hardware.

    Hopefully I can figure out how to get it running in EFI mode, allowing dual boot setups and generally saner disk layouts. Using BIOS mode and MBR feels like a huge hack these days.

    Next up: no wireless. The Intel 802.11ac 7260 card included in the Pro 11 has a driver in the stock 13.10 kernel, but no firmware and no correct PCI IDs for this revision of the card. To solve this, build a kernel from Intel's iwlwifi git tree.

    git clone git://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwifi.git
    cd iwlwifi
    # Use your running Ubuntu kernel config
    cp /boot/config-$(uname -r) .config
    # Build packages and install them
    make -j4 deb-pkg
    sudo dpkg -i ../linux-headers* ../linux-image*

    The firmware comes from the LKML mailing list as a patch to the linux-firmware git tree. I've mirrored it here, iwlwifi-7260-7.ucode and iwlwifi-3160-7.ucode, following the license provided. Download the firmware and place it in /lib/firmware before rebooting into the new kernel.

    Now the system works, but you'll probably notice it's rather slow or running very inefficiently. On this kernel, Intel's new pstate CPU scaling driver is built, but it doesn't claim this CPU model, leaving the CPU stuck at 800MHz or 1.6GHz (I'm not sure what controls which state you end up in, but either can happen). A small change to the kernel enables the pstate driver.

    diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
    index 07f2840..1ce506a 100644
    --- a/drivers/cpufreq/intel_pstate.c
    +++ b/drivers/cpufreq/intel_pstate.c
    @@ -522,6 +522,7 @@ static const struct x86_cpu_id intel_pstate_cpu_ids[] = {
     	ICPU(0x2a, default_policy),
     	ICPU(0x2d, default_policy),
     	ICPU(0x3a, default_policy),
    +	ICPU(0x45, default_policy),
     	{}
     };
     MODULE_DEVICE_TABLE(x86cpu, intel_pstate_cpu_ids);

    Save this patch and apply it with git apply haswell-pstate.patch. Rebuild the kernel with make deb-pkg and install the new packages. When you reboot, CPU scaling should work as normal, though it may still appear wrong in tools that use the cpufreq interface. You can check that it's working by looking for the directory /sys/devices/system/cpu/intel_pstate. The actual speed of the CPU can be read with the i7z utility.

    To apply this patch for other CPUs that may support pstate but don't have the driver enabled, add a new line with the model number from /proc/cpuinfo for your particular CPU. intel_pstate.c uses hex values while /proc/cpuinfo displays decimal, so be sure to convert.
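
    For example, the Haswell CPU the patch above targets reports model 69 in /proc/cpuinfo:

    >>> hex(69)  # "model : 69" in /proc/cpuinfo is the 0x45 added above
    '0x45'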

    My experience with the laptop after that bit of kernel setup has been pretty good. The thermal design seems much better than the 2012 MacBook Air it replaced. The screen has a slight amount of grain due to the touchscreen overlay, which I find a little annoying and which will really bother some people. Besides wireless and power management, the rest of the hardware works with no special configuration. Performance is good enough to run most of my Steam games at 1920x1080 on low or medium settings. Build quality seems less than perfect but acceptable. I'll post again about how it holds up.


  8. More Thunderbolt / PCIe adapter info

    Some additional information continuing from my past post on the TH05.

    Thunderbolt's implementation in Apple's firmware does a few odd things that need to be worked around by Windows 7 and Linux. If your OS doesn't advertise itself as Darwin, the ACPI tables will not wake the controller unless a device is plugged in at boot. Exactly how the controller is woken up is still unknown, so waking it after boot isn't currently possible under Linux, and I'm unaware of any Windows support (though it may exist now that Windows laptops are finally shipping with Thunderbolt). Matthew Garrett has done the investigation into replicating what happens when the OS is detected as Darwin during EFI loading; what's missing is the later configuration of the controller once the OS has booted.

    The practical implication is that you have to boot with a device attached, or you won't get a usable PCIe link once Windows or Linux is up, and reconnecting a device doesn't work either. For an eGPU on Apple hardware, you're restricted to devices that don't interfere with the EFI loader or BIOS emulation, which on my 2012 MacBook Air rules out almost everything I've tried in BIOS mode. One AMD GPU (Radeon 5750) didn't crash the BIOS but failed to allocate memory under Linux; another (Radeon 5830) hangs it. Every Nvidia GPU I tried hangs the BIOS (9800 GT, GTS 450, GTX 550 Ti). All of them load fine in EFI mode, though I've only extensively tested the GTX 550 Ti.

    I'd like to try some other PCIe devices with less demanding memory requirements. Much like the laptops whose BIOSes crash when using the PE4L, memory demands are probably why all these GPUs hang Apple's fragile BIOS emulation.

