
grawity's journal

Untitled – on my bash prompt

Warning: One of those really boring posts in which I brag about my epic hax. (But well, that's the point of this whole site, isn't it?)

Like many Linux users, I waste a lot of time customizing the hell out of my terminal's appearance. Part of this is creating an awesome shell prompt. Everyone likes to put all sorts of information there – whether you're in a Git or Hg repository, what branch you're on, what the exit status of the last command was (sometimes even expressed in the form of elaborate Unicode emoticons)...

Mine is plain in comparison – it just shows the hostname, path, and branch. So far it looked pretty much like this:

rain ~/pkg/abs/telepathy-mission-control/git/src master
$ foo

Over time I ended up implementing various unusual things in it, however – for example, highlighting the last directory component, or collapsing the path when it becomes too wide for the terminal:

rain ~/pkg/abs/telepathy-mission-control/git/src/telepathy-mission-control master
$ cd tests/twisted/tools

rain ~/…sion-control/git/src/telepathy-mission-control/tests/twisted/tools master
$

It generally shows enough of the path to remember where I am, although I'll probably still adjust it a bit. Today I also changed the highlight to always start at the repository root, which makes things much clearer when dealing with nested repositories – however, it looks a bit ugly:

rain ~/pkg/abs master
$ cd telepathy-mission-control/git

rain ~/pkg/abs/telepathy-mission-control/git master
$ cd src/telepathy-mission-control

rain ~/pkg/abs/telepathy-mission-control/git/src/telepathy-mission-control master
$

The collapsing is implemented in roughly 50 lines of bash.
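For illustration, here's a minimal sketch of the collapsing idea in bash – this is not the actual 50-line implementation (which also handles the highlighting); the function name and column budget are made up:

```shell
#!/usr/bin/env bash
# Collapse a path to fit in $2 columns: keep a leading "~/" if present,
# replace the cut-off head with "…", and keep as much of the tail as fits.
#   collapse_path '~/pkg/abs/telepathy-mission-control/git/src' 24
#   -> ~/…ssion-control/git/src
collapse_path() {
    local path=$1 max=$2 head=""
    if [[ $path == "~/"* ]]; then
        head="~/"
        path=${path:2}
        max=$(( max - 2 ))
    fi
    if (( ${#path} <= max )); then
        printf '%s%s\n' "$head" "$path"
    else
        # keep the last (max - 1) characters; "…" takes the remaining column
        local keep=$(( max - 1 ))
        printf '%s…%s\n' "$head" "${path:$(( ${#path} - keep ))}"
    fi
}
```

The real version also has to subtract the width of the hostname and branch from the budget before deciding how much path fits.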

Irssi > *

(Disclaimer: Here is my point of view about the Irssi IRC client. It is not particularly objective. It is not about why everyone should suddenly drop Irssi. Nor is it meant to be blindly shoved to every Irssi user you talk with.)

I've used Irssi as my main IRC client for almost 5 years, before switching to Weechat. Despite being pretty much unmaintained (and lacking some features), Irssi is still a good client, but… it has a problem: the users.

Specifically, the users who always feel the need to declare that Irssi is better, that "irssi > *", that Irssi is perfection.

Most users of other IRC clients openly admit that there's some misfeature or something else that they don't like. For example, the way Weechat works, it must wrap overly long URLs into multiple lines, making them unclickable. Meanwhile, Irssi users (at least the vocal ones) insist that their chosen client is perfect, and if it doesn't have a feature, then it is only because said feature is a) "you don't want it" (obviously unnecessary), b) "why would anyone want it" (obviously stupid), or c) "just install a script :)" (can be implemented using the exposed API).

For example, the nick list. Most clients let you have a sidebar that lists all people currently in the channel, usually sorted by rank. Now, I don't care if it's useful or just clutter for you; that's not my point. My point is that Irssi users always say: "Oh, if I ever wanted a nicklist, I could just install nicklist.pl and have it."

What is always left unsaid is that Irssi does not actually have any API for creating vertical regions, so the script works only if you open a new terminal window running cat ~/.irssi/nicklist-fifo. Alternatively, if you happen to be using SCREEN, the script actually reconfigures Irssi's tty to be narrower than the SCREEN window, and draws directly into the blank space that appears... every single time Irssi's own area is updated. In contrast, even though Weechat has several such scripts (its nicklist is built-in, though), they do not have to do anything special; they simply create a "bar" and put text inside. (As far as the user is concerned, there is no difference between the built-in "nicklist" bar and the scripted "buffers" bar.)

And there are more examples like that – the cap_sasl.pl script, for instance, doesn't just implement the SASL cap; it has to implement all of capability negotiation on its own, and you cannot write your own scripts that use other capabilities unless you patch cap_sasl to request them. (Although I have an idea on how this could be done, if the CAP negotiation were split into a separate script.)

Somewhat similar to the nicklist example is implementing the server-time capability, which lets bouncers attach the original message timestamp when you connect and see the last messages being replayed. Yes, it is possible to do that from an Irssi script. But again, the only way it can be done is a hack upon a hack.

Is that a good example of "flexible API"? Not very. But again, it's not really the client itself that's the problem – all clients have all sorts of limitations (like Weechat lacking any sort of hook for defining custom SASL mechanisms) – but rather the users who basically worship it, refusing to admit any imperfections.

ICMP, IPsec, IRC, and other random notes

Recent versions of Linux translate incoming ICMPv6 "Administratively prohibited" errors (type 1 code 1) to local -EACCES ("Permission denied") errno's, which is an interesting way of being informed that the server's firewall is blocking you. Unfortunately, all other operating systems (Windows, older Linuxes, various BSDs) appear to just ignore these ICMP packets, which is a bit sad – I expected them to at least terminate the TCP connection attempt with something generic like "Connection reset by peer", but instead they just wait until the connection times out.

Then again, the other OSes often do the same even for ICMP "Port unreachable". Also sad. Also strange that even on Linux, only ICMPv6 uses this translation – the equivalent ICMPv4 "Communication administratively prohibited" (type 3 code 13) results in -EHOSTUNREACH, "No route to host".

Still, I really like the whole translating remote failures to local errno's thing. Somehow it actually makes me feel as if I'm using a network where everything is integrated and where I'm receiving feedback from the network, instead of just a bunch of computers exchanging data.

Similarly, the ping tool on Windows displays the message "Negotiating IP Security" whenever Windows is performing IPsec key exchange, which is a nice touch – when the same is happening in Linux, the packets just go nowhere. (I don't remember offhand if they're queued or discarded; either way, there's just no feedback.)

C:\>ping 10.42.0.1

Pinging 10.42.0.1 with 32 bytes of data:

Negotiating IP Security.
Negotiating IP Security.
Reply from 10.42.0.1: bytes=32 time=21ms TTL=128
...

(On a related note, IPsec with strongSwan is hella confusing at times.)


Spent the majority of the past year on IRC. Somehow I ended up being an operator in #archlinux, then in freenode's #irchelp, finally even in #systemd. Yes, #systemd was finally registered with network services after three years – for a project like this it's really surprising that the channel hasn't been attacked or invaded by trolls even once.

Kind of wondering why I now have +v in #inspircd, too. Given that I've only used InspIRCd for an hour or two, and I mostly just lurk in the channel... But I'm not complaining.


Messing around with Windows on the desktop PC while my sister's out somewhere. (I never got around to installing the TermiServ patch since the reinstall last month, so it only allows one user at a time.) It seems that the smaller disk is going to die sometime this year – SMART just started showing a large number of reallocations and failed writes. Which is a bit unexpected, because the disk hasn't been used for almost anything since the reinstall; it only has a tiny boot partition with NTLDR on it. (For some reason, NTLDR refuses to work at all when started from the larger disk – maybe 1 TB is too much for it?)

On the other hand, I did know that it wasn't going to live long – the Event Log started showing "controller errors" in 2010, and I moved all user files to the new disk in early 2012, so when the data corruption started occurring, I only had to reinstall the OS...and, well, everything else.

There was a time when I tried setting up backups on the desktop, but it was the same story again. WinRAR actually has several useful features – storing multiple versions, NTFS streams, file permissions, &c. – but it also turned out to be much slower than expected, and it could not deal with encrypted or locked files at all. RoboCopy was roughly the same, although much faster.

I even ended up writing my own tool in C#, which would just copy a directory tree but also worked with locked/in-use files (using temporary Shadow Copy snapshots, which XP happens to ... kind of "support"), insufficient permissions (using SeBackupPrivilege to bypass the checks), and even encrypted files (using EFS APIs to read the raw contents, without Windows trying to transparently decrypt them). But it was in C#, and the .NET runtime actually took way too long to even start. So in the end, I still have no real backups of the desktop PC, only a snapshot of F:\Users from before the reinstall.

Backup troubles

So I've spent the past week trying to find a good backup program. I still haven't found one.

It could be that my requirements are impossible. I want a tool that is reasonably fast both when copying bulk data and when adding a lot of small files; that has some form of deduplication, to avoid wasting gigabytes after I simply move files around; and that doesn't require a command-line tool just to access the backups. But apparently no tool can do all three at once.

I tried rsnapshot (which seems to be just a wrapper around rsync --link-dest), as well as plain rsync combined with btrfs snapshots. While rsync is fast enough, it turns out it is too dumb to detect moves and renames; so if I simply rename ~/Videos/anime to ~/Videos/Anime, or move a dozen CD images from ~/TODO to ~/Attic/Software/OS/WinNT, rsync thinks all the files are new and spends ages copying them again, instead of hardlinking from the previous snapshot as the --link-dest option normally would. (I'd be happy to learn I'm wrong on this one and it can actually detect renames.) Plus, copying to a btrfs partition is much slower than expected: only 15 MB/s instead of the usual 25-30 MB/s (that's over USB 2.0).
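For reference, the hardlink-snapshot scheme that rsnapshot automates boils down to roughly this (a simplified sketch of my own; the function and layout are not rsnapshot's actual code):

```shell
#!/usr/bin/env bash
# One hardlink-based snapshot, rsnapshot-style: rsync copies only files
# that changed since the previous snapshot and hardlinks the rest from it.
# (This is exactly where renames hurt: a renamed file looks brand new.)
snapshot() {
    local src=$1 dest=$2
    local stamp prev
    stamp=$(date +%Y%m%d-%H%M%S)
    prev=$(ls -1 "$dest" 2>/dev/null | tail -n 1)  # latest snapshot, if any
    mkdir -p "$dest/$stamp"
    rsync -a ${prev:+--link-dest="$dest/$prev"} "$src/" "$dest/$stamp/"
    echo "$dest/$stamp"
}
```

With --link-dest pointing at the previous snapshot, an unchanged file costs only a directory entry; a renamed file, however, still gets copied in full.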

I also used obnam for quite a while. It's fast, and it has deduplication built in, so I can easily keep a few dozen weekly snapshots. But I'm not exactly a fan of having to use obnam restore whenever I want my files out. While that's a rather minor problem, and there apparently is a FUSE plugin in the works, there's also the risk that obnam's repository will get corrupted and won't let me access anything anymore. Nor am I a fan of obnam growing to 1.5 GB of memory during its run – and that wasn't even the entire run; that was maybe a third of it when I finally killed it. (I do hope that's a bug.) Also, while adding bulk data is fast, obnam is slow at adding many files – in a directory with 200k smallish files, it goes at maybe 6-10 files per second, which means it takes hours to copy a mere 5.2 GB of Gale chatlogs.

Next option is ZFS with dedup enabled – either rsnapshot or plain rsync with ZFS snapshots would work. The problem with it, however, is that it's a pain in the ass to maintain on Arch. Every time I install a new kernel version from [testing], there are four packages I need to rebuild, and since they all have versioned dependencies (e.g. zfs 0.6.2_3.10.9-1 depends exactly on linux=3.10.9-1 and zfs-utils=0.6.2_3.10.9-1) it means I must remove all ZFS tools entirely, then upgrade my kernel, then start rebuilding ZFS. (Of course, I just wrote a shellscript that sed's the versions out of depends= lines, but that doesn't make it any less of a pain in the ass.)
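The version-stripping itself is a one-liner – something along these lines (a sketch, not my actual script; the depends= field name is real PKGBUILD syntax):

```shell
#!/usr/bin/env bash
# Strip "=version" constraints from a PKGBUILD's depends= line, e.g.
#   depends=('linux=3.10.9-1' 'zfs-utils=0.6.2_3.10.9-1')
#   -> depends=('linux' 'zfs-utils')
unversion_depends() {
    sed -E "/^depends=/ s/('[^'=]+)=[^']+'/\1'/g" "$1"
}
```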

For now, I guess, I'll just stick with rsnapshot and limit it to four, maybe three, snapshots at once... but fuck, how can there not be a backup tool that doesn't suck in some way or other?

daily crontab

A few years ago I wrote a cronjob for updating ~/.ssh/authorized_keys on various servers. (It ended up having the name update-authorized-keys after a few renames.) It basically downloaded my authorized_keys file over HTTP (using one of a dozen HTTP clients to be extra portable), checked if it had my PGP signature on it, and supported some cpp-ish filtering. I was extra careful to look for a specific PGP key by fingerprint and all that.
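The cpp-ish filtering part might look something like this – note that the directive syntax here is invented for illustration; I don't remember the exact format the script used:

```shell
#!/usr/bin/env bash
# Keep only the key lines meant for this host: plain lines always pass,
# "#if host <name>" ... "#endif" blocks pass only on the matching host.
filter_keys() {
    awk -v host="$1" '
        BEGIN        { keep = 1 }
        /^#if host / { keep = ($3 == host); next }
        /^#endif/    { keep = 1; next }
        keep
    '
}
```

Each machine would pipe the downloaded (and signature-checked) file through this with its own hostname before writing ~/.ssh/authorized_keys.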

And several months later, I wrote another cronjob – this time for updating my script collection and my dotfiles – called dist/pull this time. It first updated ~/code over Git, then exec'd the updated version of itself (just in case), which then updated ~/lib/dotfiles (also over Git). Sometimes I would patch dist/pull to do various cleanup jobs, and they would always run at midnight automatically. (As a bonus, it also ran the SSH key updater, instead of having two separate cronjobs.)

And I just realized that despite all my carefulness, I still ended up having an easily pwnable cronjob that automatically downloads and runs code every night without verification. Crap.

SASL authentication in Eggdrop

Many IRC networks now support SASL as the standard authentication method, which removes certain race conditions such as having your client auto-join channels before auth is complete – as a result your vhost/cloak would get applied too late, you might be denied entirely if the channel requires being authenticated, etc.

One day, out of boredom, I wrote a mostly-pure-Tcl implementation of IRCv3 CAP and SASL for the Eggdrop IRC bot. At the moment, it is located on GitHub Gist, and consists of three Tcl scripts – Base64; CAP negotiation and SASL PLAIN; plus a demo script for several other IRCv3 capabilities.

I say "mostly-pure-Tcl" because the CAP negotiation still needs a one-line patch to the core code. However, two days ago the "preinit-server" patch was merged into the main Eggdrop 1.8 repository, so the scripts can now be used without any modification.

Random notes: Setting up my virtual machine network

I'm still trying to set up a sane virtual machine network – one that puts VMs on both the laptop and the desktop in their own networks, routed to each other and to the real LAN, while still letting my own VMs access "LAN subnets only" services on the desktop, such as file sharing.

It's not going well – I ended up running Unbound, BIND, and dnsmasq on the same laptop. Unbound I already had running as my validating resolver; dnsmasq serves DHCP to the VM network and hosts a simple dynamic-DNS LAN domain for accessing random PCs; BIND hosts a static domain for accessing the two Active Directory realms installed in two VMs, because dnsmasq's static DNS settings are plain stupid. So now I have all my VMs nice and clean in their own net, routed to the real LAN – that is, routed and NATed so that LAN hosts see the VMs' real addresses but the LAN router/gateway/cheap-ass-DSL-modem can still do its own NAT thing properly. However, the desktop also needs to see the NATed addresses when VMs try to access shared files, so that its firewall lets them through... I might have written the stupidest NAT rules ever just to make this work:

# SMB to LAN hosts must appear to come from the LAN address range
-A POSTROUTING -s 10.7.0.0/16 -d 192.168.0.0/16 -p tcp --dport 445 -j MASQUERADE
-A POSTROUTING -s 10.7.0.0/16 -d 192.168.0.0/16 -p tcp --dport 139 -j MASQUERADE
# everything else to the LAN keeps the VMs' real addresses
-A POSTROUTING -s 10.7.0.0/16 -d 192.168.0.0/16 -j ACCEPT
# traffic leaving for the Internet gets NATed to the wireless address
-A POSTROUTING -s 10.7.0.0/16 -o wlan0 -j MASQUERADE

This whole mess turned out to be needed because my ISP configures its routerdems to have a "management" network in addition to "user" and "Internet", and that network happens to use 10.0.0.0/8. (It already confused the hell out of me once, when I wanted to connect to a VPN but traceroutes to 10.0.x.x addresses kept going through my ISP.) This makes the routerdem think that packets from my VMs aren't actually coming from inside the LAN, so it refuses to NAT them – so my laptop (the VM host) has to NAT all of them to the LAN address range... On the other hand, I still want all VMs to be reachable from the real LAN using their own IP addresses, hence the ACCEPT rule.

Aside: The Spooler service in Windows XP is rather picky about the hostname you use to access it. Apparently, the full UNC path of the printer is sent when connecting to it, so if you're trying to connect to \\snow.virt\FooPrinter but the server thinks it's \\snow.home (not \\snow.virt), it will return "Invalid printer name" to the OpenPrinter request – despite having already accepted an SMB connection to \\snow.virt\IPC$ without even a blink.

wmii, i3, and IPC protocols

I used to use the wmii tiler for a long time (before going back to GNOME), and recently it seems i3 has become popular, so I decided to try it out. I'm not going to comment on the usability, features, etc. – but I sometimes have really odd criteria for choosing software, so here's one such odd comment.

When I used wmii, it had a really sweet control interface, styled after various plan9 software: the configuration file was essentially a bash script (later ported to various other languages) that had its own event loop. The control interface was 9P over a Unix socket: read /event, write to /ctl, list files under /tags, and so on. You could even mount it as a local filesystem, using native 9p.ko.

(Later I went back to GNOME. The Shell isn't scriptable externally (at least not easily), but overall, almost all programs I run use DBus in some way or other. It's also somewhat nice and consistent.)

Then I tried i3, which claimed to be heavily inspired by wmii – and at least the appearance and the control keys were quite similar. (Although wmii has a simpler layout model – it always splits the screen into columns, similar to acme.) But I was somewhat disappointed at i3's IPC protocol – even though I have zero experience in designing such things, it still looks ugly to me.

There's a "command" message type, and six "get_foo" message types. There's the "check the highest bit to see whether it's an event or a normal reply" trick. There are no event names – just a list of magic number definitions in i3/ipc.h which has to be copied into your i3ipc implementation; this is not a problem by itself, of course, but only as long as the definitions are assumed to be stable – which, in this case, they aren't.
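To be fair, the framing itself is trivial: the magic string "i3-ipc", a 32-bit little-endian payload length, a 32-bit little-endian message type, then the payload. A throwaway sketch that just constructs and dumps one RUN_COMMAND (type 0) message:

```shell
#!/usr/bin/env bash
# Build one i3 IPC message: "i3-ipc" <u32le length> <u32le type> <payload>.
# Type 0 is RUN_COMMAND, whose payload is just the command string.
u32le() {
    # emit a 32-bit little-endian integer as four octal printf escapes
    printf '\\%03o\\%03o\\%03o\\%03o' \
        $(( $1 & 0xff )) $(( ($1 >> 8) & 0xff )) \
        $(( ($1 >> 16) & 0xff )) $(( ($1 >> 24) & 0xff ))
}
i3_msg() {
    local type=$1 payload=$2
    printf 'i3-ipc'
    printf "$(u32le ${#payload})$(u32le "$type")"
    printf '%s' "$payload"
}

i3_msg 0 'exit' | od -An -tx1
```

(In practice you'd also have to read replies back and strip the same header from them; the i3-msg tool does all of this for you.)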

So that's my impression of i3.

Untitled – on .plan files

Today while discussing home directory permissions and the 'finger' command, I mentioned the long list of users at @linerva.mit.edu. Someone quickly discovered a user or two having their contact information in .plan files, and the general reaction was:

<woddf2> User "marc" doxed himself. o_O
<woddf2> "amu" also doxed himself.
<woddf2> Apparently many users ha(d|ve) the habit of putting their dox in ~/.plan.

While I've never been at MIT, "public" seems to be the default there – not only that user's contact info, but their entire home directory is world-accessible over AFS. I often do the same; I consider contact information to be more-or-less public, as it has been in the past. So it is quite unusual for me to see other people finding a random user's phone number, and reacting as if it were a precious gem. Even calling it "doxing" just doesn't seem to fit here.

(Update: It's now @athena.dialup.mit.edu again that has all the users.)

On chat networks

Today I added my Yahoo IM account to Pidgin, just to see if it still works. It did – and as soon as it connected, I got ten messages from ten different spambots (apparently YMSG stores offline messages). Handily, Windows XP has this feature where you can Ctrl+click taskbar buttons to select multiple windows, the same way you would select multiple files, and then close them all at once (or tile/cascade the selected windows). It's something GNOME 3 still lacks.

I did this after Microsoft decided to kind-of shut down their MSN Messenger servers, to make more space for Skype. The standard servers are already refusing raw MSNP connections, although Pidgin can still connect using its "HTTP method". I'm somewhat amazed that even on various Linux geek channels on IRC, people are saying things along the lines of "good riddance", not realizing that Micros~1 is shutting down a sufficiently reverse-engineered IM protocol in favor of a secret one that requires a tightly locked down client. There are at least a dozen unofficial MSNP clients for both Windows and Linux. Hell, MSN Messenger had official XMPP servers. Meanwhile, who still remembers how the attempts to reverse-engineer Skype went? Not well.

Oh well. Maybe things will get better when Microsoft tries to integrate Skype into its build system and forgets to enable obfuscation, or writes an HTML5 client, or something. Meanwhile, Yahoo! Messenger is still online, as are ICQ and AOL Instant Messenger. I still remember my UIN, it seems. (And I've never had more than three contacts total over all four protocols, but that's off-topic.)


Recently I found another IM protocol, Gale, which feels somehow like a cross between XMPP, Zephyr and IRC.

In other words, Gale takes the best parts of all three, while keeping a very simple interface (and one much more scriptable than, say, XMPP). Similar to Zephyr, there's no full-blown client by default, only separate command-line tools for subscribing and for posting a message. You can compose a message in Vim and send it with :w !gsend pub.

rain ~/src/gale master
$ gsub -e test@nullroute.eu.org
! 2013-01-18 17:51:37 gsub notice: skipping default subscriptions
! 2013-01-18 17:51:37 gsub notice: subscription: "test@nullroute.eu.org"
! 2013-01-18 17:51:38 gsub notice: connected to decay.nullroute.eu.org
(at this point, gsub simply forks to background)

rain ~/src/gale master
$ echo This is a test. | gsend test@nullroute.eu.org
------------------------------------------------------------------------------
To: test@nullroute.eu.org
This is a test.
         -- grawity@nullroute.eu.org (Mantas Mikulėnas) 2013-01-18 17:51:45 --

Unlike IRC, it's possible to subscribe to the same address from many locations; join/part notifications do not exist; there's no way to know who's reading messages to a public address. The gsub client does support sending special "presence state" messages, but those are merely informative, not persistent. Addresses can be hierarchical – one could subscribe to pub@example.com or only to pub.tv.fox@example.com.

There's a downside, too. Gale messages can be encrypted, and to authenticate senders & receivers everyone has an RSA keypair; keys are verified hierarchically – the "ROOT" key signs TLD keys, the TLD keys sign domain keys, domain keys sign user and/or subdomain keys, and user keys can sign subkeys. To set up a new domain, one needs to email their domain's public key to the root key's owner and receive a signed key back. So far, all the signing has been done by the same person, Gale's creator Dan Egnor. There have been proposals for a notary, but nobody cares enough to finish them... Nevertheless, the scheme is better than Zephyr's Kerberos-based trust relationships, which simply do not scale beyond half a dozen realms.

Unfortunately, very few users of Gale remain by now. Maybe a dozen still post to pub@ofb.net; most of the rest have probably migrated to IRC or XMPP or Skype. Overall, it feels as if Gale should have received a lot more attention than it has.

Update: Since the CVS server described on Gale's website is now defunct, I've obtained a copy of the entire repository and imported it to Git – it's available at github.com/grawity/gale, with minor fixes such as better libc locale support.


The next post, if I ever get around to it, should be about IRC, Zephyr and PSYC.

Chaos.

Today, I tried accessing my laptop's files from the family desktop, running Windows XP. After typing the usual cd \\rain\grawity in Total Commander, I was greeted with a password prompt... which did not accept any of my usual passwords, for neither rain\Mantas nor rain\grawity.

At first I thought I screwed up my Samba's usermapping script, or that I forgot to configure Windows to use NTLMv2 (after it was reinstalled), but the configuration was right and curiously the usermapping script didn't seem to be executed at all. So I tried to take a look at the raw SMB traffic with Wireshark, and after filtering for smb I was greeted with a blank screen. Odd.

After expanding the filter to smb or netbios, I noticed that the desktop was sending NetBIOS name queries for RAIN, but wasn't receiving any responses... (I had forgotten to restart nmbd.service after killing a bit too many processes on the laptop.) Since the NetBIOS name query failed, Windows would fall back to good ol' DNS and look up rain.nullroute.eu.org – which had no IPv4 addresses, only an IPv6 one.

Since there was no IPv4 address, Windows skipped the LanmanWorkstation network provider entirely – it does not have IPv6 support in XP – and tried the next configured one. Since the second provider is WebClient, which implements WebDAV, Windows started poking around on the laptop's webserver. It completely ignored the lack of PROPFIND in the OPTIONS response, sent a PROPFIND request anyway, then interpreted "405 Method Not Allowed" as "access denied" rather than "I don't speak WebDAV".


This little problem reminded me that I still do not have proper hostname resolution set up on my LAN. On various occasions it's relying on NetBIOS (sucks), Bonjour (yet another daemon), global DNS (can't put local IPv4 addresses there), and router-provided *.home DNS (the router forgets hostnames, adds new-host-1.home and other stupid entries). Sometimes even /etc/hosts (ugh, manual updates). Ironically, of all those, NetBIOS has been the most reliable one so far. (Maybe I should just stop worrying about its inefficiency? The LAN is really quite small anyway.)


On the topic of consistency, I still haven't started doing consistent backups. On the laptop it's easy – just connect the external HD and run obnam ~ every now and then. On the desktop, it's harder, as 1) it runs Windows, 2) it has severely limited CPU resources, 3) it's inconvenient to carry the external HD there.

The largest problem might be #1: it being Windows, there is no quick and easy equivalent to tar – and I do want a good backup tool, not a lame Cygwin port. In particular, I need it to back up files currently in use (ignoring share flags requires filesystem snapshots), files which the "backup" tool's account cannot access (requires modifying the process security token and calling the low-level CreateFile() function with FILE_FLAG_BACKUP_SEMANTICS), and even EFS-encrypted files (requires an altogether different API to access).

I have, in fact, written a backup tool that does most of the above – even "integrating" it with Volume Shadow Copies despite the fact that Windows XP doesn't allow persistent VSS snapshots, and the API for making temporary snapshots is quite undocumented – but unfortunately it is in C#.NET, which conflicts with #2 "must be extremely light on CPU" (the desktop has a second-hand CPU that had been overheated many times).

As for #3, my latest plan is to have the tool create local snapshots every day (to help with recovering accidentally deleted files), and move old snapshots to external storage (maybe Obnam again?). When writing vssbackup, I had only Notepad2, the csc compiler (which came as part of .NET runtime), and online MSDN docs. Now I have Visual Studio installed, so maybe I should try porting it to unmanaged C++, get rid of some unnecessary parts...

Nah. Who needs backups?


Well, my sister did need backups just today, after accidentally permanently deleting some files. One of them was an e-book which I thought I had three copies of; but I could find none of them – and I had to search about four separate directories all named "Library". (Then I remembered I had the e-book in my website's /mirrors directory. Whew.) With the other, no such luck; it was a personal document that disappeared in the all-too-common case of "ah, it's only a copy, I can delete this" followed by "ah, but it was the last copy" – and it had taken over a week to write.

Yes, it turns out I have four "Libraries", three software archives (all a complete mess), two "TODO" directories, and one huge "Downloads" dump. Every time I try to organize all that stuff, I just end up with one more half-disorganized directory. Sigh.


Previously

year 2012

year 2011

year 2010

year 2009