Pete Zaitcev's Journal [entries|friends|calendar]
Pete Zaitcev

[ userinfo | livejournal userinfo ]
[ calendar | livejournal calendar ]

How Mitchell Baker made me divorce [06 May 2015|02:10pm]

Well, nearly did. Deleting history in Firefox 37 is very slow and the UI locks up while you do that. "Very slow" means an operation that takes 13 minutes (not exaggerating - it's reproducible). The UI lock-up means a non-dismissable context menu floating over everything; Firefox itself being, of course, entirely unresponsive. See the screencap.

The screencap is from Linux where I confirmed the problem, but the story started on Windows, where my wife tried to tidy up a bit. So, when Firefox locked up, she killed it, and repeated the process a few times. And what else would you do? We are not talking about a hang of a few seconds - it literally was many minutes. Firefox did not pop a dialog with "Please wait, deleting 108,534 objects with separate SQLite transactions", a progress gauge, and a "Cancel" button. Instead, it pretended to lock up.

Interestingly enough, remember when Firefox defaulted to keeping the history for a week? This mode is gone now - FF keeps the history potentially forever. Instead, it offers a technical limit: at most 108,534 entries are saved in the "Places" database, in order to prevent SQLite from eating all your storage. Now I understand why my brown "visited" links never go back to blue anymore.

The problem is, there's no alternative. I tried to use Midori as my main browser for a month or two in early 2014, but it was a horrible crash city. I had no choice but to give up and go back to Firefox and its case of Featuritis Obesum.

[link] post comment

UAT on RTL-SDR update [15 Jan 2015|12:52pm]

About a year ago, when I started playing with ADS-B over 1090ES, I noticed that small airplanes heavily favour UAT on 978 MHz, because it's cheaper. Thus, for the purposes of independent on-board traffic, it would be important to tap directly into UAT. If I mingle with airliners and their 1090ES, it's in controlled airspace anyway (yes, I know that most collisions happen in controlled airspace, but I'm grasping at excuses to play with UAT here, okay).

UAT poses a large challenge for RTL-SDR because of its relatively high data rate: 1.041667 Mbit/s. The RTL2832U can only sample at 3.2 MS/s, and is only stable at 2.8 MS/s. Theoretically that should be enough, but everything I saw out there wants 8 samples per bit to pull out weak signals. So, for initial experiments, I thought to try a trick: self-clocking. I set the sample rate to 2083334, and then do no clock recovery whatsoever. It hits where it hits, and if the sample points land well within the bits, the packet is recovered; otherwise it's lost.
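In code, the self-clocking trick is tiny. A toy sketch (not the actual decoder): at 2,083,334 S/s there are exactly 2 samples per UAT bit, so we just slice every 2nd demodulated sample at a fixed offset.

```python
# Toy sketch of the "self-clocking" trick: 2 samples per bit, fixed
# offset, no clock recovery at all.
def slice_bits(soft_samples, offset=0):
    """soft_samples: FSK-demodulated values, >0 for mark, <0 for space."""
    return [1 if s > 0 else 0 for s in soft_samples[offset::2]]

# If the fixed sample points happen to land inside bit periods, the
# packet decodes; otherwise it is simply lost.
print(slice_bits([0.9, -0.1, -0.8, -0.7, 0.6, 0.5]))  # -> [1, 0, 1]
```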

I threw together some code, ran it, and it didn't work: it only received white noise. Oh, well. I moved the repo into a dusty corner of Github and forgot about it.

Fast forward a year, and a gentleman by the name of Oliver Jowett noticed a big problem: the phase was computed incorrectly. I fixed that up and suddenly saw some bits recovered (as it happened, zeroes and ones were swapped, which Oliver had to correct again).

After that, things started to move forward. Having bits recovered allowed me to measure reception, and I found that the antenna that I built for the 978 MHz band was much worse than the stock antenna for TV. Imagine my surprise and disappointment: all that soldering for nothing. I don't know where I screwed up, but some suggest that the computer and dongle produce RF noise that screws with the antenna, and a length of coax helps with that despite the losses incurred by the coax (/u/christ0ph liked that in particular).

This is bad.

This is good. Or better at least.

From here on, it's error correction. Unfortunately, I have no clue what this means:

The FEC parity generation shall be based on a systematic RS 256-ary code with 8-bit code word symbols. FEC parity generation for each of the six blocks shall be a RS (92,72) code.

Quoted from Annex 10, Volume III,
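Decoding the jargon a little (my reading of the quoted spec, not authoritative): an RS(92,72) code over GF(2^8) means each block carries 72 data bytes plus 20 parity bytes, and can correct up to half the parity count in corrupted byte symbols.

```python
# My reading of the quoted spec, not authoritative: RS(n, k) with 8-bit
# symbols adds n - k parity bytes and corrects up to (n - k) / 2
# corrupted symbols per block.
n, k = 92, 72
parity = n - k
correctable = parity // 2
print(parity, correctable)  # -> 20 10
```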

P.S. Observing Oliver's key involvement, the cynical may conclude that the whole premise of Open Source is that you write drek that does not work, upload it to Github, and someone fixes it up for you, free of charge. ESR wrote a whole book about it ("with enough eyes all bugs are shallow")!

[link] 1 comment|post comment

Swift and balance [08 Jan 2015|11:17am]

Swift is on the cusp of getting yet another intricate mechanism that regulates how partitions are placed: the so-called "overload". New users of Swift still keep asking what weights are, and now this? I am not entirely sure it's necessary, but here's a simple explanation of why we keep ending up with such complexity (thanks to John Dickinson on IRC).

Suppose you have a system that spreads your replicas (of partitions) across (failure) zones. This works great as long as your zones are about the same size, usually a rack. But then one day you buy a new rack with 8TB drives and suddenly the new zone is several times larger than the others. If you do not adjust anything, it ends up only a quarter full at best.

So, fine, we add "weights". Now a zone with weight 100.0 gets 2 times more replicas than a zone with weight 50.0. This allows you to fill zones better, but it must compromise your dispersion and thus durability. Suppose you only have 4 racks: three with 2TB drives and one with 8TB drives. Not an unreasonable size for a small cloud. So, you set weights to 25, 25, 25, 100. With a replication factor of 3, there's still a good probability (which I'm unable to calculate, although I feel it ought to be easy for someone better educated) that the bigger node will end up with 2 replicas for some partitions. Once that node goes down, you lose redundancy completely for those partitions.
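For what it's worth, a back-of-envelope sketch (assuming the ring honors weights exactly; this is not Swift's actual ring builder) suggests the doubling-up is not merely probable but forced for most partitions:

```python
# Back-of-envelope, assuming weights are honored exactly. The big zone
# "wants" 3 * 100/175 ~ 1.71 replicas per partition. A zone can hold at
# most 1 replica of a partition unless doubled up, so a fraction x of
# partitions must place 2 replicas there: 2x + (1 - x) = 1.71, x ~ 0.71.
weights = [25, 25, 25, 100]
replicas = 3
want = replicas * weights[-1] / float(sum(weights))  # replicas the big zone wants
doubled_fraction = max(0.0, want - 1.0)              # partitions with 2 replicas there
print(round(want, 2), round(doubled_fraction, 2))    # -> 1.71 0.71
```

So roughly 71% of partitions would keep 2 of their 3 replicas on the big rack.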

In the small-cloud example above, if you care about your customers' data, you have to eat the imbalance and underutilization until you retire the 2TB drives [1].

<clayg> torgomatic: well if you have 6 failure domains in a tier but their sized 10000 10 10 10 10 10 - you're still sorta screwed

My suggestion would be to just ignore all the complexity we thoughtfully provided for the people with "screwed" clusters. Deploy and maintain your cluster to make it easy for the placement and replication: have a good number of more or less uniform zones that are well aligned to natural failure domains. Everything else is a workaround -- even weights.

P.S. Kinda wondering how Ceph deals with these issues. It is more automagic when it decides what to store where, but surely there ought to be a good and bad way to add OSDs.

[1] Strictly speaking, other options exist. You can delegate to another tier by tying 2 small racks into a zone: yet another layer of Swift's complexity. Or, you could put new 8TB drives on trays and stuff them into existing nodes. But considering those options muddies the waters.

UPDATE: See the changelog for better placement in Swift 2.2.2.

[link] post comment

Going beyond ZFS by accident [05 Jan 2015|11:25am]

Yesterday, CKS wrote an article that tries to redress the balance a little in the coverage of ZFS. Apparently, detractors of ZFS were pulling quotes from his operational gripes, so he went forceful with the observation that ZFS remains the only viable advanced filesystem (on Linux or not). CKS has no time for your btrfs bullshit.

The situation where weselovskys of the world hold the only viable advanced filesystem hostage and call everyone else "jackass" is very sad for Linux, but it may not be quite so dire, because it's possible that events are overtaking btrfs and ZFS. I am talking about the march of super-advanced, distributed filesystems downstream.

It all started with the move beyond POSIX, which, admittedly, seemed very silly at the time. The early DHT was laughable and I remember how they struggled for years to even enable writes. However, useful software has been developed since then.

The poster child of it is Sage's Ceph, which relies on plain old XFS for back-end storage, composes an object store out of nodes (called RADOS), and layers a POSIX filesystem on top for those who want it. It is in field use at Dreamhost. I can easily see someone using it where otherwise a ZFS-backed NFS/CIFS cluster would be deployed.

Another piece of software that I place in the same category is OpenStack Swift. I know, Swift is not competing with ZFS directly. The consistency of its meta layer is not sufficient to emulate POSIX anyway. However, you get all those built-in checksums and all that durability jazz that CKS wants. And Swift aims even further up in scale than Ceph, being in field use at Rackspace. So, what seems to be happening is that folks who really need to go large are willing at times to forsake even compatibility with POSIX, in part to get the benefits that ZFS provides to CKS. Mercado Libre is one well-hyped case of migration from a pile of NFS filers to a Swift cluster.

Now that these systems are established and have proven themselves, I see constant efforts to take them downscale. The original Swift 1.0 did not even work right if you had fewer than 3 nodes (strictly speaking, if you had fewer zones than the replication factor). This has since been fixed by so-called "as good as possible placement" around 1.13 Havana, so you can have a 1-node Swift easily. Ceph, similarly, would not consider PGs on the same node healthy, and it's a bit of a PITA even in Firefly. So yea, there are issues, but we're working on it. And in the process, we're definitely coming for chunks of ZFS space.

[link] 2 comments|post comment

blitz2 for GNOME catastrophe [05 Jan 2015|09:27am]

After putting blitz2 on my Nexus and poking into GUI buttons, I reckoned that it might be time to stop typing in a terminal like a caveman on Linux too (I'm joking, but only just). And the project rolled smoothly for a while. What took me 2 months to accomplish in Android only took 2 days in GNOME. However, 2 lines of code from the end, it all came to an abrupt halt when I found out that it is impossible to access the clipboard from a JavaScript application. See GNOME bugs 579312, 712752.

[link] post comment

blitz2 for Android [31 Dec 2014|04:15pm]

An additional upside for blitz2 is that an HTTP client is available on about any platform. So if I want to share the clipboard with my Nexus tablet, I can, without running an sshd on it.

This is my first Android app, and I didn't touch Java in many years. So, first impressions.

I forgot how insanely wordy Java is. And doing anything takes effort, with all the factories, accessors, and whatnot.

I like checked exceptions. Too bad Python doesn't have them (probably impossible by the very nature of a dynamic language, but I've been bitten by an unexpected exception floating up from the depth of the stack before).

Android docs are excellent and one almost never needs to search for answers. Unfortunately, I managed to step into one such case: the so-called "Up" navigation. My chosen API level is 11. The contemporary docs explain how to emulate "Up" using compatibility libraries for APIs before mine, and explain how to use onNavigateUp(), which comes in API level 16. But there's absolutely nothing, nowhere, that tells how to do it in API 11. I was walking in circles for days. The answer actually involves a secret ID namespace,; I would never have figured it out if not for random pieces of code on the Internet. Good grief, Google. So close to perfect marks.

Oh, and one more thing: Googlers score good sanity points for reimplementing a stock Java API for HTTP (HttpURLConnection and friends). They could've easily rolled their own, but they didn't. They wrote their own runtime, but it's fully compatible with Oracle's, including the dark corners of SSL. It lets you mostly debug the difficult parts on a Linux box. Very nice. Just to see what it could be otherwise, look at their gratuitously incompatible Base64.

UPDATE: I forgot to mention that I started with Eclipse, but it was entirely unusable due to crashing all the time (about once an hour, for no discernible reason). I was on Fedora 20 at the time. So, I used the command-line tools, and that worked like a charm. There's a Makefile in the blitz2 repo linked above.

[link] post comment

Cheating around taskotron in Fedora [22 Dec 2014|09:44am]

Yesterday's ntp vulnerability uncovered a trick for Fedora maintainers. You know how it's super annoying that you cannot push an update to F20 without F21? You must herd updates and can never do them in parallel, or else taskotron ruins innocent updates. But at the time of this writing the fixes are live in F20, but not in F21. How does Miroslav do it?

The answer is easy: he keeps ntp intentionally a few releases back in older Fedora (4.2.6p5-19 in F20), so he can bump it with impunity without regard to the newer Fedora (4.2.6p5-25 in F21). Of course, if someone were to upgrade to F21 today, he'd go from a fixed ntp to a broken ntp, but hey... at least the automated checks are defeated.

This challenge is similar to writing super ugly OpenStack code that passes PEP8 checks, only the outcome is actually dangerous this time.

[link] post comment

blitz2 [12 Dec 2014|01:55pm]

You know how some people attach several monitors to one PC? I don't. I just have several PCs. But then I want copy-paste to work transparently (as transparently as possible). For several years I used blitz to copy the clipboard. It works well enough, but once you have 3 computers, it gets somewhat cumbersome to type the hostname. Also, it always bothered me how it rides on ssh authentication. I wanted something independent from ssh.

Behold blitz2. Instead of passing the clipboard to the host where it's needed directly, the clipboard is uploaded to an HTTP server. Seems more complex at first, but it's actually much better, because previously the PC where you copy had to authenticate to the PC where you paste. Now the authentication is symmetric. So, all clients are configured exactly the same, and all can upload and download the clipboard no matter who trusts what ssh keys.

[link] post comment

Laptop bleg [19 Oct 2014|10:29am]

I'm considering a laptop (actually two). Requirements:

  • 13" to 14" class.
  • Indestructible.
  • Display that is not too wide. Enough with 16:9 already! Aspect of 1.6 would be ideal (Lenovo T400 had that).
  • Light. Indestructible is more important, but it should be light: 2kg or less.
  • No nipple. No Lenovo.

Where this comes from is mostly my wife's Sony Vaio Z. I used to have a Z back in 2001 or so, when they were in the 12" format. It was the best laptop ever, but unfortunately it succumbed to a DC-DC converter failure. The modern Z is not like that Z. The most super annoying problem is that the screws holding the battery failed in an interesting way: it is impossible to remove the battery now. Also, the contact between the battery and the motherboard is marginal. I managed to fix the problem by manufacturing a finely shaped wooden wedge that I drove into a gap, and thus extended the life of that thing, but man, Sony, this is disappointing.

Unfortunately, I don't remember if it was Kota or Daisuke, but one of the Japanese guys at a recent Swift Hackathon in Boston had a Z of a similar vintage, and it looked impeccable. Maybe Sony figured that this is the predominant mode of care their wares receive, and that's why the modern Z is so much cheaper than the old, indestructible Z. But they still charge exorbitant prices.

Lenovo wins a special notice because I had a T400 for 3 years and swore never to deal with them ever again. The biggest problem is the keyboard layout, because I use the left pinky for the Control key. I could live with their idiotic placement of Escape, but I refuse to deal with 3 years of physical pain again. Also, their famous quality seems to be slipping, as my mouse button broke within 3 years. The battery died, too. However, the T400 had a very good display, and I would like another like that, if possible.

[link] 10 comments|post comment

Next stop, SMT [26 May 2014|06:50pm]

The little hardware project, mentioned previously, continues to chug along. A prototype is now happily blinking LEDs on a veroboard:


Now it's time to use real technology, so the thing can be used onboard a car or airplane. Since the heart of the design is a chip in an LGA-14 package, we are looking at soldering a surface-mounted element with 0.35 mm pitch.

I reached out to members of a local hackerspace for advice, and they suggested I forget what people post to the Internet about wave-soldering in an oven. Instead, buy a cheap 10x microscope, a fine tip for my iron, and some flux-solder paste, then do it all by hand. As Yoda said, there is only do and do not, there is no try.

[link] 2 comments|post comment

Digital life in 2014 [24 May 2014|09:58pm]

I do not get out often and my clamshell cellphone incurs a bill of $3/month. So, imagine my surprise when at a dinner with colleagues everyone pulled out a charger brick and plugged their smartphone into it. Or, actually, one guy did not have a brick with him, so someone else let him tap into a 2-port brick (pictured above). The same ritual repeated at every dinner!

I can offer a couple of explanations. One is that it is much cheaper to buy a smartphone with a poor battery life and a brick than to buy a cellphone with a decent battery life (decent being >= 1 day, so you could charge it overnight). Or, that market forces are aligned in such a way that there are no smartphones on the market that can last for a whole day (anymore).

UPDATE: My wife is a smartphone user and explained it to me. Apparently, sooner or later everyone hits an app which is an absolute energy hog for no good reason, and is unwilling to discard it (in her case it was Pomodoro, but it could be anything). Once that happens, no technology exists to pack enough battery into a smartphone. Or so the story goes. In other words, they buy a smartphone hoping that they will not need the brick, but inevitably they do.

[link] 4 comments|post comment

OpenStack Atlanta 2014 [23 May 2014|12:38pm]

The best part, I think, came during the Swift Ops session, when our PTL, John Dickinson, asked for a show of hands. For example -- without revealing any specific proprietary information -- how many people run clusters of less than 10 nodes, 10 to 100, 100 to 1000, etc. He also asked how old the production clusters were, and if anyone deployed Swift in 2014. Most clusters were older, and only 1 man raised his hand. John asked him what the difficulties were in setting it up, and the man said: "we had some, but then we paid Red Hat to fix it up for us, and they did, so it works okay now." I felt so useful!

The hardware show was pretty bare, compared to Portland, where Dell and OCP brought out cool things.

HGST showed their Ethernet drive, but they used a chassis so dire, I don't even want to post a picture. OCP did the same this year: they brought a German partner who demonstrated a storage box that looked like it was built in a basement in Kherson, Ukraine, while under siege by Kiev forces.

Here's a poor pic of cute Mellanox Ethernet wares: a switch, NICs, and some kind of modern equivalent of GBICs for fiber.

Interestingly enough, although Mellanox displayed Ethernet only, I heard in other sessions that Infiniband was not entirely dead. Basically if you need to step beyond bonded 10GbE, there's nothing else for your overworked Swift proxies: it's Infiniband or nothing. My interlocutor from SwiftStack implied that a router existed into which you could plug your Infiniband pipe, I think.

[link] post comment

Seagate Kinetic and SMR [15 May 2014|08:35am]

In the trivial things I never noticed department: Kinetic might have a technical merit. Due to the long history of vendors selling high-margin snake oil, I was somewhat sceptical of object-addressed storage. However, there was a session with Joe Arnold of SwiftStack and a corporate person from Seagate (with an excessively complicated title), where they mentioned off-hand that actually all this "intelligent" stuff is supposed to help with SMR. As everyone probably knows, shingled drives implement complicated read-modify-write cycles to support the traditional sector-addressed model, and the performance penalty is worse than that of a 4K drive with a 512-byte interface.

I cannot help thinking that it would be even better to find a more natural model to expose the characteristics of the drive to the host. Perhaps some kind of "log-structured drive". I bet there are going to be all sorts of overheads in the drive's filesystem that negate the increase in the areal density. After all, shingles only give you about 2x. The long-term performance of any object-addressed drive is also in doubt as the fragmentation mounts.

BTW, a Seagate guy swore to me that Kinetic is not patent-encumbered and that they really want other drive vendors to jump on the bandwagon.

UPDATE: Jeff Darcy brought up HGST on Twitter. The former-Hitachi guys (owned by WD now) do something entirely different: they allow apps, such as the Swift object server, to run directly on the drive. It's cute, but it does nothing to help with the block-addressing API being insufficient to manage a shingled drive. When software runs on the drive, it still has to talk to the rest of the drive somehow, and HGST did not add a different API to the kernel. All it does is kick the can down the road and hope a solution comes along.

UPDATE: Wow even Sage.

[link] 1 comment|post comment

Mental note [08 May 2014|01:25pm]

I went through the architecture manual for Ceph and penciled down a few ideas that could be applied to Swift. The biggest one is that we could benefit from some kind of massive proxy or a PACO setup.

Unfortunately, I see problems with a large PACO. Memcached efficiency will nosedive, for one. But also, how are we going to make sure clients are spread right? There's no cluster map, and thus clients can't know which proxy in PACO is closer to the location. In fact, we deliberately prevent them from knowing too much. They don't even know the cluster's partition size.

The reason this matters is that I feel that EC should increase requirements for CPU on proxies, which are CPU bound in most clusters already. Of course what I feel may not be what actually occurs, so maybe it does not matter.

[link] post comment

Get off my lawn [01 May 2014|12:19pm]

My only contact with Erasure Codes previously was in the field of data transmission (naturally). While a student at Lomonosov MSU, I worked in a company that developed a network for PCs, called "Micross" and led by Andrey Kinash, IIRC. Originally it ran on top of MS-DOS 3.30, and was ported to 4.01 and later versions over time.

Ethernet was absurdly expensive in the country back in 1985, so the hardware used the built-in serial port. A small box with a relay attached the PC to a ring, where a software token was circulated. The primary means of keeping the ring integrity was the relay, controlled by the DTR signal. In case that was not enough, the system implemented a double-ring isolation not unlike FDDI's.

The baud rate was 115.2 kbit/s, and that interfered with MS-DOS. I should note that notionally the system permitted data transfer while the PCs ran the usual office applications. But even if a PC was otherwise idle, a hit by the system timer would block out interrupts long enough to drop 2 or 3 characters. The solution was to implement a kind of erasure coding, which recovered not only corruption, but also a loss of data. The math and implementation in Modula-2 were done by Mr. Vladimir Roganov.
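Roganov's actual Modula-2 code was far more capable, of course, but the simplest flavour of the idea can be shown in a few lines: one XOR parity symbol per group recovers a single loss at a known position (an "erasure", like the characters dropped by the timer interrupt).

```python
# The simplest erasure code: one XOR parity byte per group recovers a
# single loss at a known position. Real codes (and Roganov's) handle
# more losses and corruption; this only shows the principle.
def add_parity(chunks):
    parity = 0
    for c in chunks:
        parity ^= c
    return chunks + [parity]

def recover(coded, lost_index):
    """Rebuild the symbol at lost_index by XOR-ing all the others."""
    value = 0
    for i, c in enumerate(coded):
        if i != lost_index:
            value ^= c
    return value

coded = add_parity([0x41, 0x42, 0x43])   # data bytes plus parity
print(recover(coded, 1))                  # -> 66, the dropped 0x42
```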

I remember that at the time, the ability to share directories over the LAN without a dedicated server was most impressive to the customers, but I thought that Roganov's EC was perhaps the most praiseworthy, from a nerdiness perspective. Remember that all of that ran on a 12 MHz 80286.

[link] post comment

gEDA [26 Apr 2014|07:41pm]

For the next stage, I switched from Fritzing to gEDA and was much happier for it. Certainly, gEDA is old-fashioned. The UI of gschem is really weird, and even has problems dealing with click-to-focus. But it works.

I put the phase onto a Veroboard, as I'm not ready to deal with ordering PCBs yet.

[link] 2 comments|post comment

VPN versus DNS [28 Mar 2014|12:26pm]

For years, I did my best to ignore the problem, but CKS inspired me to blog the curious networking banality, in case anyone has wisdom to share.

The deal is simple: I have a laptop with a VPN client (I use vpnc). The client creates a tun0 interface and some RFC 1918 routes. My home RFC 1918 routes are more specific, so routing works great. The name service does not.

Obviously, if we trust the DHCP-supplied nameserver, it has no work-internal names in it. The stock solution is to let vpnc install an /etc/resolv.conf pointing to the work-internal nameservers. Unfortunately this does not work for me, because I have a home DNS zone, zaitcev.lan. Work-internal DNS does not know about that one.

Thus I would like some kind of solution that routes DNS requests according to a configuration. Requests to work-internal namespaces (such as * would go to the nameservers delivered by vpnc (I think I can make it write something like /etc/vpnc/resolv.conf that does not conflict). Other requests go to the infrastructure name service, be it a hotel network or the home network. The home network is capable of serving its own private authoritative zones and forwarding the rest. That's the ideal, so how to accomplish it?

I attempted to apply a local dnsmasq, but could not figure out if it can do what I want and, if yes, how.

For now, I have some scripting that caches work-internal hostnames in /etc/hosts. That works, somewhat. Still, I cannot imagine that nobody thought of this problem. Surely, thousands are on VPNs, and some of them have home networks. And... nobody? (I know that a few people just run VPN on the home infrastructure; that does not help my laptop, unfortunately).
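For the record, dnsmasq can do this split with per-domain server lines. A sketch of the configuration (the work domain and nameserver addresses here are hypothetical placeholders, not my real ones):

```
# Hypothetical dnsmasq fragment; substitute the real work-internal
# domain and the vpnc-delivered nameserver address.
server=/           # work names -> VPN nameserver
server=/zaitcev.lan/         # home zone -> home router
# everything else follows the default upstreams from resolv-file
```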

UPDATE: Several people commented with interesting solutions. You can count on Mr. robbat2 to be on the bleeding edge and use unbound manually. I went with the NM magic as suggested by Mr. nullr0ute. In F20 it is required to edit /etc/NetworkManager/NetworkManager.conf and add "dns=dnsmasq" there. Then, NM runs dnsmasq with the following magic /var/run/NetworkManager/dnsmasq.conf:


It is exactly the syntax Ewen tried to impart with his comment, but I'm too stupid to add 2 and 2 this way, so I have NM do it.

NM also starts vpnc in such a way that it does not need to damage any of my old hand-made config in /etc/vpnc, which is a nice touch.

See also: bz#842037.

See also: Chris using unbound.

[link] 8 comments|post comment

Okay, it's all broken. Now what? [11 Mar 2014|08:02pm]

A rant in ;login was making the rounds recently (h/t @jgarzik), which I thought was not all that relevant... until I remembered that the Swift Power Calculator has mysteriously stopped working for me. Its creator is powerless to do anything about it, and so am I.

So, it's relevant all right. We're in big trouble even if Gmail kind of works most of the time. But the rant makes no recommendations, only observations. So it's quite unsatisfying.

BTW, it reminds me about a famous preso by Jeff Mogul, "What's wrong with HTTP and why it does not matter". Except Mogul's rant was more to the point. They don't make engineers like they used to, apparently. Also notably, I think, Mogul prompted development of RESTful improvements. But there's nothing we can do about excessive thickness of our stacks (that I can see). It's just spiralling out of control.

[link] 1 comment|post comment

AVR, Fritzing, Inkscape [09 Mar 2014|10:23pm]

I suppose everyone has to pass through a hardware phase, and mine is now, for which I implemented an LED blinker with an ATtiny2313. I don't think it even merits the usual blog laydown. Basically all it took was following tutorials to the letter.

For the initial project, I figured that learning gEDA would take too much, so I unleashed an inner hipster and used Fritzing. Hey, it allows you to plan breadboards, so there. And well, it was a learning experience and no mistake. Crashes, impossible-to-undo changes, UI elements outside of the screen, everything. Black magic everywhere: I could never figure out how to merge wires, dedicate a ground wire/plane, or edit labels (so all of them are incorrect in the schematic above). The biggest problem was the lack of library support together with an awful parts editor. Editing schematics in Inkscape was so painful that I resigned to doing a piss-poor job, evident in all the crooked lines around the ATtiny2313. I understand that Fritzing's main focus is the iPad, but this is just at the level of a typical outsourced Windows application.

Inkscape deserves a special mention due to the way Fritzing requires SVG files to be in a particular format. If you load and edit some of those, the grouping defeats Inkscape features, so one cannot even select elements at times. And editing the raw XML causes the weirdest effects, so it's not like LyX-on-TeX, edit and visualize. At least our flagship vector graphics package didn't crash.

The avr-gcc is awesome though. 100% turnkey: yum install and you're done. Same for avrdude. No muss, no fuss, everything works.

[link] 2 comments|post comment

Suddenly, Python Magic [06 Mar 2014|08:44pm]

Looking at a review by Solly today, I saw something deeply disturbing. A simplified version that I tested follows:

import unittest

class Context(object):
    def __init__(self):
        self.func = None
    def kill(self):
        # Stands in for delivering the signal: invoke the handler.
        if self.func is not None:
            self.func()

class TextGuruMeditationMock(object):

    # The .run() normally is implemented in the report.Text.
    def run(self):
        return "Guru Meditation Example"

    @classmethod
    def setup_autorun(cls, ctx, dump_with=None):
        ctx.func = lambda *args: cls.handle_signal(dump_with, *args)

    @classmethod
    def handle_signal(cls, dump_func, *args):
        try:
            res = cls().run()
        except Exception:
            dump_func("Unable to run")
        else:
            dump_func(res)

class TestSomething(unittest.TestCase):

    def test_dump_with(self):
        ctx = Context()

        class Writr(object):
            def __init__(self):
                self.res = ''

            def go(self, out):
                self.res += out

        target = Writr()
        TextGuruMeditationMock.setup_autorun(ctx, dump_with=target.go)
        ctx.kill()
        self.assertIn('Guru Meditation', target.res)

Okay, obviously we're setting a signal handler, which is a little lambda, which invokes the dump_with, which ... is a class method? How does it receive its self?!

I guess that the deep Python magic occurs in how the method target.go is prepared to become an argument. The only explanation I see is that Python creates some kind of activation record for this, which includes the instance (target) and the method, and that record is the object being passed down as dump_with. I knew that Python did it for scoped functions, where we have the global dict, local dict, and all that good stuff. But this is different, isn't it? How does it even know that go belongs to target? In what part of the Python spec is it described?

UPDATE: Commenters provided hints with the key idea being a "bound method" (a kind of user-defined method).

A user-defined method object combines a class, a class instance (or None) and any callable object (normally a user-defined function).

When a user-defined method object is created by retrieving a user-defined function object from a class, its im_self attribute is None and the method object is said to be unbound. When one is created by retrieving a user-defined function object from a class via one of its instances, its im_self attribute is the instance, and the method object is said to be bound.
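Or, condensed into a runnable crumb of my own (the __self__ attribute is the portable spelling of im_self):

```python
# Minimal demonstration of a bound method: retrieving target.go via the
# instance packages the instance together with the function.
class Writr(object):
    def __init__(self):
        self.res = ''
    def go(self, out):
        self.res += out

target = Writr()
dump_with = target.go                  # a bound method object
print(dump_with.__self__ is target)    # the instance rides along -> True
dump_with('Guru Meditation')           # no explicit self needed
print(target.res)                      # -> Guru Meditation
```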

Thanks, Josh et al.!

UPDATE: See also Chris' explanation and Peter Donis' comment re. unbound methods gone from py3.

[link] 3 comments|post comment

[ viewing | most recent entries ]
[ go | earlier ]