Ken (Chanoch) Bloom's Blog

3rd February 2011

Digitally signing a PDF in Linux

So I don't forget, the instructions to digitally sign a PDF in Linux are as follows:

First create a key using Java's keytool if you don't already have one. The command is

keytool -genkey -keyalg RSA -keysize 4096 -alias alias -keystore .keystore

Then, use JSignPDF to sign the PDF file. Make sure to select "JKS" as the keystore type.

Permalink | linux.
20th September 2010

Using GNOME default browser setting without running gnome-session

GTK+ has an API gtk_show_uri for launching a web browser or other helper application to view URLs. This API call uses GIO to determine what programs to use for which protocols. However, this only works if you're running gnome-session. I don't want to run all of GNOME on my desktop, so I needed a different solution.

For those who want to use the GNOME default web browser to visit URIs in GTK+ applications, but do not want to run a full GNOME session yourself, the workaround is to set the environment variable GIO_USE_URI_ASSOCIATION=gconf. (I discovered this by reading and experimenting a bit).
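Concretely, the variable just has to be exported before any GTK+ program starts. A minimal sketch, assuming a Debian-style X session startup file (the file name is an assumption; put it wherever your session is initialized):

```shell
# In ~/.xsessionrc (or early in your window manager startup script):
# tell GIO to look up URI handlers in GConf instead of asking gnome-session
export GIO_USE_URI_ASSOCIATION=gconf
```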

I'm sure the procedure for using the XFCE4 default web browser is similar, but I don't know the name of the GDesktopAppInfoLookup implementation to use.

Permalink | linux.
1st December 2009

Using git with wdiff

To configure git to use wdiff to compare files (which can be useful when working with text files whose lines rewrap even after small edits), add the following to .git/config

[difftool "wdiff"]
    cmd = /usr/bin/wdiff -l ${LOCAL} ${REMOTE} | less

Then you can use wdiff by calling

git difftool --tool=wdiff ...

Update Mar 17, 2009 Stefano Zacchiroli shared the git diff --color-words option, which does the same thing, but is a little less clunky in how it uses the pager.
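For reference, the difftool section can also be registered with a single git config command, and the --color-words form needs no configuration at all. The throwaway repository here is only so the sketch runs anywhere; in your own repository you would just run the git config line once:

```shell
# Demonstration in a throwaway repository.
cd "$(mktemp -d)" && git init -q

# Register the wdiff difftool with one command instead of editing
# .git/config by hand:
git config difftool.wdiff.cmd 'wdiff -l "$LOCAL" "$REMOTE" | less'
git config difftool.wdiff.cmd   # prints the configured command back

# git's built-in word-level diff, no external tool required:
git diff --color-words
```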

Permalink | linux.
19th July 2009

Backup requirements and backup strategies

A Slashdot article asked what the best backup strategy is for home users in this day and age, when hard drive space and people's media collections far outpace the sizes of the removable media used for backups just a few years ago. I commented there, and then decided to turn my comment into a blog post, because the general principles are important.

To decide how best to back up, we must lay out the kinds of failures that can occur and the goals of a backup. (I keep my documents in a constellation of git repositories, so many of my backup needs are covered by replicating the repositories to several places, and some of my examples are based on my experience with git.)

  1. We would like to protect against mechanical drive failure. This can be done with a RAID.

  2. We may also want to protect against the failure of other components of the computer. My primary computer (the one that holds my master git repositories, serves as the center of my star for replicating live copies of my data, and takes care of my email downloading and mail filtering) recently died because its motherboard failed. The hard drive was totally intact, but it took about two weeks to get a new computer, and in the meantime I still needed something to perform this computer's functions without losing productivity. The new computer came with a brand new generation of most of the technologies on the motherboard, switching from x86 to AMD64 and from IDE to SATA, and only after the additional week it took me to borrow an appropriate adapter could I restore anything I wanted from the old hard drive.

  3. We would like to protect against accidental deletion of files, file corruption, or edits to a file that we have since reconsidered. This can be done with snapshotting. In source code, reconsidering an edit to a file is fairly common, which is why most programming projects use revision control systems. Other options like nilfs or ZFS snapshots can also fill this goal. This goal is accomplished more easily if the backups are automatic and the backup device is live on the system.

    Depending on your needs, this goal may be counterbalanced by a need to not retain the history of files for legal or other reasons, and this should inform your choice of backup strategy.

  4. We would like to protect against filesystem corruption, whether by an OS bug, or by accidentally doing cat /dev/random > /dev/hda. This can be done by having an extra drive of some sort that isn't normally hooked up to the computer. Tape drives, CDs, and DVDs have traditionally fulfilled this purpose, and this is where the use of additional hard drives is being suggested. Remote backups, via rsync or git, can also accomplish this. When deciding whether to do this remotely or locally, consider the amount of data you're backing up, the size of backing up incremental changes, the size of the initial upload, and whether you have a one-off way of getting more bandwidth for the initial upload.

  5. We would like to protect against natural disasters. For someone living in New Orleans, it would be nice to have a backup somewhere outside the path of Hurricane Katrina. Remote backups may be pretty much the only way to accomplish this, unless you're a frequent traveler and can hand-deliver backup media to remote locations.

  6. In addition to any of the above, the code you use to create said backup may be buggy, or may become buggy, misconfigured, or obsolete over time. Checking the integrity and restorability of your backups after creating them, and keeping several independent previous versions of a backup, at least for a short time, may help here.

You may not be concerned with the various modes of failure described here occurring simultaneously. For example, it may be unlikely that you need to deal with filesystem corruption at the same time that you regret an edit to one of your files. In that case, your offline backup device doesn't need to hold all of your snapshots.

Also, consider the importance of the data you are backing up, and your ability to regenerate it as needed. For example, I use Debian Linux. Pretty much any software I need to restore is available from Debian's mirrors (for free), so there's no need to back up the software I use or the operating system. I can content myself with backing up /etc and /home, and knowing that anything else is out there in the cloud because hundreds of other people are using it.

After that, there's stuff that's just not that important: I'm more willing to permanently lose 2GB of photos than the few megabytes that form the core of my Ph.D. thesis research.

And there's also a diary and GPG keys that (though important) I'd rather lose permanently than have anywhere other than my one primary computer.

No backup strategy is perfect. There's a story about how a five-year old password foiled one company's otherwise immaculate backup scheme.

Permalink | linux.
22nd May 2009

Ruby Odeum

Somewhere out there, in a far corner of the Web that I won't find again, there was a Gem of Ruby/Odeum 0.4, a Ruby binding for the QDBM inverted index APIs. I'm going to mirror it here until I come up with some better idea of what to do with it.

Permalink | linux.
17th May 2009

Using a JMicron JM20337 USB IDE adapter with linux

My old computer died, and I got a new one, and now I'm working on recovering documents from the old hard drive. As the new computer has SATA (and the old one had IDE), I needed to borrow a USB-IDE adapter to recover the drive.

Because I can't seem to find any documentation about the adapter on the web, I'm going to post a tip here about making it work.

The adapter is a JMicron JM20337 adapter. It shows in lsusb as

Bus 001 Device 002: ID 152d:2338 JMicron Technology Corp. / JMicron
USA Technology Corp. JM20337 Hi-Speed USB to SATA & PATA Combo Bridge

and it shows in dmesg as:

usb 1-1: new high speed USB device using ehci_hcd and address 2
usb 1-1: New USB device found, idVendor=152d, idProduct=2338
usb 1-1: New USB device strings: Mfr=1, Product=2, SerialNumber=5
usb 1-1: Product: USB to ATA/ATAPI Bridge
usb 1-1: Manufacturer: JMicron
usb 1-1: SerialNumber: 152D203380B6
usb 1-1: configuration #1 chosen from 1 choice
Initializing USB Mass Storage driver...
scsi0 : SCSI emulation for USB Mass Storage devices
usbcore: registered new interface driver usb-storage
USB Mass Storage support registered.
usb-storage: device found at 2
usb-storage: waiting for device to settle before scanning

It's still waiting for the device to settle, as I write this, because I haven't turned on power to the drive yet. When I first connected it last night (with the drive attached) and turned it on, I got the message:

scsi 4:0:0:0: Direct-Access                                    PQ: 0 ANSI: 2 CCS
sd 4:0:0:0: [sdb] Attached SCSI disk

No partition table, and if I tried to dd if=/dev/sdb (for example to try and recover the partition table), I got a zero length file.

After some failed googling and experimentation, I discovered that the drive has to be jumpered as an IDE slave. (Not a master, and cable select won't work.)

Then I get the partitions, and more information about the drive:

scsi5 : SCSI emulation for USB Mass Storage devices
scsi 5:0:0:0: Direct-Access     WDC WD12 00JB-00EVA0      5R15 PQ: 0 ANSI: 2 CCS
sd 5:0:0:0: [sdb] 234441648 512-byte hardware sectors: (120 GB/111 GiB)
sd 5:0:0:0: [sdb] Write Protect is off
sd 5:0:0:0: [sdb] 234441648 512-byte hardware sectors: (120 GB/111 GiB)
sd 5:0:0:0: [sdb] Write Protect is off
 sdb: sdb1 sdb2 < sdb5 > sdb3
sd 5:0:0:0: [sdb] Attached SCSI disk

You may also need to plug the USB cable in to the computer after turning on power to the disk for things to work properly. (If you plugged in the USB cable first, you can just unplug it and replug it.)

Permalink | linux.
30th January 2009

`bus' command for the CTA Bus tracker

So the CTA finally brought the bus tracker to busses running through my neighborhood. It gives you a nice Google map showing you where busses currently are on the route, and a separate web interface telling you how long before a bus reaches a given spot. Nice, but I don't want to open a web browser for all of that. (PACE has actually had my busses for a while.)

Enter the bus command:

#!/usr/bin/env ruby
require "yaml"
require "cgi"
require "open-uri"
require "time"

Location =,:lines)

CTA =,:dir,:stop,:id) do
  def url
    "" %
      [self.line, CGI.escape(self.dir), CGI.escape(self.stop),]
  end
  def html
    open(url).read
  end
  def times
    mins=html.scan(%r{<b>(\d+).*MIN</b>}).flatten.map{|x| x.to_i}
    times={|x| + x*60}
    mins.zip(times).map{|m,t| "#{t.hm} (#{m}m)"}.join(", ")
  end
  def description
    "%s %s (%s): %s" % [self.line, self.stop, self.dir, self.times]
  end
end

PACE =,:dir,:stop,:name) do
  def url
    "" %
      [self.line, self.dir, self.stop]
  end
  def html
    open(url).read
  end
  def times
    times=html.scan(%r{>(\d+:\d+ [AP]\.M\.)}).flatten.map{|x| Time.parse(x)}
    mins={|x| ((}
    mins.zip(times).map{|m,t| "#{t.hm} (#{m}m)"}.join(", ")
  end
  def description
    "#{}: #{self.times}"
  end
end

class Time
  def hm
    strftime "%H:%M"
  end
end

locations=YAML.load_file("#{ENV['HOME']}/.bus.yaml")
locations.each do |location|
  location.lines.each do |line|
    puts "  #{line.description}"
  end
end

And my ~/.bus.yaml

- !ruby/struct:Location 
  name: Home
  lines:
  - !ruby/struct:CTA 
    line: 96
    dir: East Bound
    stop: Lunt & California
    id: 12043
  - !ruby/struct:CTA 
    line: 96
    dir: West Bound
    stop: Lunt & California
    id: 12077
  - !ruby/struct:CTA 
    line: 93
    dir: North Bound
    stop: California & Lunt
    id: 11915
  - !ruby/struct:CTA 
    line: 93
    dir: South Bound
    stop: California & Lunt
    id: 11871
  - !ruby/struct:CTA
    line: 49B
    dir: South Bound
    stop: Western & Lunt
    id: 1698
  - !ruby/struct:PACE
    line: 290
    dir: 1
    stop: 22090
    name: 290 Touhy & California (East Bound)

The secret codes (id for the CTA, and stop and dir for PACE) need to be filled in by visiting the bus you're interested in on the bus tracker first. For PACE, name is the name of the line --- since most of the site's inner workings are through secret codes, you need to provide a human-readable name yourself.

Now I can just run the bus command and see all of the nearby busses at a glance:

  96 Lunt & California (East Bound): 14:46 (16m)
  96 Lunt & California (West Bound): 14:36 (6m), 14:54 (24m)
  93 California & Lunt (North Bound): 14:39 (9m), 14:51 (21m)
  93 California & Lunt (South Bound): 14:33 (3m), 14:55 (25m)
  49B Western & Lunt (South Bound): 14:37 (7m), 14:48 (18m), 14:58 (28m)
  290 Touhy & California (East Bound): 14:33 (2m), 14:44 (13m), 14:50 (19m)

Whoops. Here comes my #93 now. Better get going.

Permalink | linux.
15th November 2008

When you can't add local printers in CUPS

I bought a Samsung ML-2510 laser printer the other day, and when installing it using CUPS, I discovered that CUPS wasn't letting me add local printers of any kind. After some digging I discovered that this is because the available printing backends in CUPS are determined by which files are present in /usr/lib/cups/backend. So one can add a missing backend by copying it from /usr/lib/cups/backend-available to /usr/lib/cups/backend.

In Debian, one can also fix this by running dpkg-reconfigure cups.

Update Nov 18, 2008 As usual, I spoke too soon. The issue I was having is actually a CUPS permissions bug. I don't know why it worked for me Friday after fiddling with the above (and upgrading CUPS to Debian experimental), but today it was all broken again, so I had to fiddle some more.

Permalink | linux.
28th September 2008


Lucas Nussbaum gave me a tip that one can use the posix_fadvise(2) system call to let the Linux IO scheduler know what data you intend to use, so that it can fetch it before you have to block on the data. So I have written the fadvise RubyGem to make that system call accessible to Ruby.

File#fadvise(offset,len,advice) -> self

Advise the operating system how you intend to use the data in this file, starting from byte offset and counting len bytes, so that the kernel can schedule it to be fetched in the background while you do more processing. This call is intended to avoid blocking later when you actually read the data, but whether this actually happens is up to the kernel's IO scheduler. Valid values for advice are:

  • :normal
  • :sequential
  • :random
  • :no_reuse
  • :will_need
  • :dont_need

This call does not block.

See the posix_fadvise(2) manpage for more information about these types of advice.

Permalink | linux.
31st January 2008

Battery monitor with sysfs

Debian's packages of Linux kernel 2.6.24 have disabled CONFIG_ACPI_PROCFS_POWER, so the /proc interface for reading your battery power is no longer present. This breaks most of the battery monitors in Debian. Not to worry. I wrote my own.

#!/usr/bin/env ruby
require 'dockapp'

PREFIX="/sys/class/power_supply"

def readint file
  open(file){|f|}
end

def readstr file
  open(file){|f|}
end

def update
  charge_full=readint "#{PREFIX}/BAT1/charge_full"
  charge_now=readint "#{PREFIX}/BAT1/charge_now"
  @bat_pct.set_text "#{(100*charge_now/charge_full)}%"
  @bat_status.set_text readstr("#{PREFIX}/BAT1/status")
  @time.set_text "" #TODO: change me

  acad_online=readint "#{PREFIX}/ACAD/online"
  case acad_online
  when 0 then @cord.set_text "unplugged"
  when 1 then @cord.set_text "plugged"
  end
end

dockapp ="Power Status")
# four text widgets: battery %, status, clock, and AC cord state
@bat_pct    ="",9,1,0)
@bat_status ="",9,1,0)
@time       ="",9,1,0)
@cord       ="",9,1,0)
dockapp.add(2,45,@time)



Have a look at for the dockapp library.

Permalink | linux.
15th July 2007


I had been looking to configure an email server that

  1. accepted SMTP connections on localhost:25,
  2. accepted mails through /usr/bin/sendmail
  3. would use Gmail's TLS interface as a smarthost.
  4. wasn't difficult to configure, since I don't actually care about local delivery.

All of the standard choices for the first 3 requirements, including Exim and Postfix, are IMO difficult to configure, because they also give you lots of options for how to configure local delivery.

So I found masqmail, which handles 1, 2 and 4 admirably, and would do just fine with 3 if it didn't have to use TLS (masqmail doesn't support TLS). After exploring for a while, I found that masqmail has two options for delivery to a smarthost:

  1. send directly to an SMTP server
  2. pipe the message to an external program

So I began looking at an external program to use.

My previous mail configuration had used ssmtp as my MTA, but ssmtp only supported requirements 2, 3 and 4. My first experiment was to use ssmtp as masqmail's external program. To use ssmtp for this purpose, I had to compile it from scratch, since in Debian all of the sendmail providers conflict with each other. (There's no /etc/alternatives support for the MTA.)

I discovered in Debian's archives a similar program, esmtp. The maintainers of esmtp were smart enough to split the sendmail functionality from the rest of the package, creating the esmtp package with the mailer, and the esmtp-run package with the sendmail symlink.

So I installed esmtp (but not esmtp-run which provides the sendmail symlink) and set up masqmail as follows: in /etc/masqmail/masqmail.conf (among other options):

online_pipe="/bin/echo gmail" #arguably I should do something smarter here
connect_route.gmail="/etc/masqmail/gmail.route"

in /etc/masqmail/gmail.route:

pipe="/usr/sbin/esmtp ${rcpt}"

in /etc/esmtprc:
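A minimal /etc/esmtprc pointing at Gmail's TLS submission service looks roughly like this (the username and password are placeholders; check esmtprc(5) for the exact directive set):

```
hostname =
username = ""
password = "secret"
starttls = required
```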

Update Dec 30, 2008: Matan Ziv-Av pointed out to me that "when a message has multiple recipients, masqmail tries to deliver to each one, but esmtp -t sends to all recipient, according to the message's headers, so every recipient gets as much messages as there are recipients." I have updated /etc/masqmail/gmail.route to use esmtp ${rcpt}.

Permalink | linux.
31st May 2006


The ACM SIG documentclasses interact badly with the use of the geometry package for getting the paper size right. They also print very badly when the resulting A4-paper-size document is sent to letter paper. To solve this problem, put this in your LaTeX preamble. (You should probably remove it before submitting the paper electronically to the conference.)


\special{papersize=8.5in,11in}
\pdfpagewidth=8.5in
\pdfpageheight=11in
%remove these lines before submitting

This handles both LaTeX with the dvips driver, and pdfLaTeX.

Permalink | linux.