magic & metaprogramming
August 07, 2017

this post is about python descriptors

Instead of writing a much more involved post, spending a bunch of time justifying why I want to talk about this, and providing more professional examples of Python data model descriptors, I decided it would be way more fun to show a silly example of how this feature can be used.

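NOTE: the examples target Python 3.6 (they use f-strings). Descriptors work the same way for new-style classes in Python 2.7, although looking a function up on a class there returns an unbound method rather than a plain function.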

Enjoy!

a magical world with magical creatures

Imagine a world of magical creatures. In this world, all of our magical creatures have been granted the ability to interact with other creatures:

    class Creature:

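        def __init__(self, name):
            self.name = name

        def __repr__(self):
            # Lets f-strings render a creature by name, e.g. "waves at Merlin"
            return self.name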

        def interact(self, target):
            raise NotImplementedError()

    class Cat(Creature):

        def interact(self, target):
            return f"{self.name} is a cat and scratches {target}!"

    class Dog(Creature):

        def interact(self, target):
            return f"{self.name} is a dog and barks at {target}!"

It also has things like trees and flowers:

    class Tree:
        def __repr__(self):
            return "a tree"

    class Flower:
        def __repr__(self):
            return "a flower"

And of course, humans and wizards:

    class Human(Creature):

        def interact(self, target):
            return f"{self.name} is a normal human and waves at {target}!"

    class Wizard(Human):

        def interact(self, target):
            return f"{self.name} is a powerful wizard and scoffs at {target}!"

Creatures can interact with other objects in the world:

    >>> dog = Dog("Fido")
    >>> dog.interact(Tree())
    'Fido is a dog and barks at a tree!'

And wizards, as we know, have some spells. In particular, wizards can transform creatures into other creatures (don’t worry, we’ll talk about this at the end):

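    import inspect

    class Wizard(Human):

        ...  # interact() as defined above

        def transform(self, target, into):
            if not isinstance(target, Creature):
                raise ValueError(f"{type(target)} is not a {Creature.__name__}!")

            if not inspect.isclass(into) or Creature not in into.__mro__:
                raise ValueError(f"{type(into)} is not a {Creature.__name__}!")

            attrs = [attr for attr in dir(into) if not attr.startswith('__')]
            for attr in attrs:
                value = getattr(into, attr)
                # In Python 3 a method looked up on its class is a plain function;
                # skip anything else (data attributes, classmethods, ...)
                if not inspect.isfunction(value):
                    continue
                # Bind the function to the target instance via the descriptor protocol
                descriptor = value.__get__(target, type(target))
                # Store the bound method on the instance, shadowing the class's own
                setattr(target, attr, descriptor)

            print(f"I, {self.name}, have transformed {target.name} into a {into.__name__}!")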

merlin and his apprentice

Enter Merlin and his apprentice, a young King Arthur:

    >>> merlin = Wizard('Merlin')
    >>> arthur = Human('Arthur')

As a human, Arthur interacts with things by waving at them:

    >>> arthur.interact(merlin)
    'Arthur is a normal human and waves at Merlin!'

Whereas Merlin, a mighty wizard, has a less benevolent reaction:

    >>> merlin.interact(arthur)
    'Merlin is a powerful wizard and scoffs at Arthur!'

teaching arthur about the world

As a young wizard’s apprentice, it is important that Arthur understand the world around him by experiencing it in the form of various other magical creatures.

So to give Arthur this knowledge and experience, Merlin transforms him into a cat:

    >>> merlin.transform(arthur, into=Cat)
    I, Merlin, have transformed Arthur into a Cat!
    >>> assert type(arthur) is Human
    >>> # Still a human!
    >>> arthur.interact(Tree())
    'Arthur is a cat and scratches a tree!'
    >>> arthur.interact(Flower())
    'Arthur is a cat and scratches a flower!'

And then into a Dog:

    >>> merlin.transform(arthur, into=Dog)
    I, Merlin, have transformed Arthur into a Dog!
    >>> assert type(arthur) is Human
    >>> # Still a human!
    >>> arthur.interact(Tree())
    'Arthur is a dog and barks at a tree!'
    >>> arthur.interact(Flower())
    'Arthur is a dog and barks at a flower!'

After a sufficient amount of transforming Arthur from one animal to the next and interacting with the world, Merlin finally transforms Arthur back into a human and he’s back to his old hand-waving ways:

    >>> merlin.transform(arthur, into=Human)
    I, Merlin, have transformed Arthur into a Human!
    >>> assert type(arthur) is Human
    >>> # Still a human!
    >>> arthur.interact(Tree())
    'Arthur is a normal human and waves at a tree!'
    >>> arthur.interact(Flower())
    'Arthur is a normal human and waves at a flower!'

The end!

the science behind the magic

Once you understand Python descriptors, there isn’t actually too much to explain here. In short, descriptors are a way to take advantage of the machinery behind Python’s “new-style” classes (the default in 3.x).

In order to understand descriptors better, though, we first need to understand a bit about Python’s data model. Descriptors are a very special part of the data model and they are defined by three magic methods: __get__, __set__, and __delete__ (not to be confused with __del__, the finalizer). We’ll focus mostly on __get__; the other two are left up to the reader to research. :-)
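As a rough sketch of what that looks like in practice (the Loud and Spell names here are made up purely for illustration and are not part of the magical world above), an object becomes a descriptor simply by defining __get__ and living in a class body:

```python
class Loud:
    """Hypothetical descriptor: upper-cases a stored string on access."""

    def __init__(self, text):
        self.text = text

    def __get__(self, instance, owner):
        # instance is None when the attribute is looked up on the class itself
        if instance is None:
            return self
        return self.text.upper()


class Spell:
    # The descriptor lives in the class body, not on the instance
    name = Loud("abracadabra")


spell = Spell()
shouted = spell.name          # goes through Loud.__get__(spell, Spell)
raw = Spell.__dict__["name"]  # bypasses __get__ and returns the Loud object
```

Here shouted ends up as the upper-cased string, while raw is the Loud object itself. That difference between dot-style access and a direct __dict__ lookup is the whole trick.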

class members

There are two types of “member” objects of a Python class: values and functions. In most object-oriented languages, member values are typically referred to as attributes, whereas member functions are referred to as methods, and methods are usually reserved for use by an instance of a class, not the class itself.

With Python, class rules aren’t as strict and we can do some funny things at class definition time. For example, we can define a method for an instance of an object (i.e. with the implied first parameter self), or we can define it with @classmethod so that it can be used at the class level as well as the instance level.

The difference between these two things isn’t exactly subtle:

    class Foo:

        @classmethod
        def bar(cls):
            print(f"{cls} called classmethod bar")

        def baz(self):
            print(f"{self} called instance method baz")

    >>> Foo.bar()
    <class '__main__.Foo'> called classmethod bar
    >>> f = Foo()
    >>> f.baz()
    <__main__.Foo object at 0x101d28908> called instance method baz
    >>> f.bar()
    <class '__main__.Foo'> called classmethod bar

The first thing that almost every Python developer will immediately dismiss is that we can call both Foo.bar and Foo().bar. That is, we can call bar from both the class reference (Foo) and an instance of that class (Foo()). That’s little more than muscle memory at this stage, but descriptors are the key to understanding why this is possible.

the ubiquitous __dict__

Most of you have seen it before; you might have even used it directly at some point while debugging object values.

Python’s __dict__ is a special dictionary that is another core component of “new-style” classes. It works together with descriptors to provide the user-defined class system we are familiar with today.

In short: when you define a function as part of a class definition (e.g. def foo(self): print(self)), the function object itself is stored in the class’s __dict__, which is used to look up members when attributes are accessed. Functions are descriptors, so dot-style access does not simply hand back what is stored there; it goes through the stored object’s __get__ and gives you whatever that returns (for a function accessed through an instance, a bound method).

To better explain, here is some code using our class Foo from above to demonstrate this. First, let’s illustrate how dot-style access and __dict__ are related [1]:

    >>> Foo.bar
    <bound method Foo.bar of <class '__main__.Foo'>>
    >>> Foo.bar.__func__
    <function Foo.bar at 0x101baaf28>
    >>> Foo.__dict__['bar']
    <classmethod object at 0x101d28cc0>
    >>> Foo.__dict__['bar'].__func__
    <function Foo.bar at 0x101baaf28>
    >>> assert Foo.bar.__func__ == Foo.__dict__['bar'].__func__
    >>>

That’s interesting. We have three different types of objects associated with the same bar attribute of the Foo class:

  • function: Which is the actual function definition of bar.
  • classmethod: Which is the object created by the @classmethod decorator wrapping bar’s function definition. This is the descriptor that is stored in Foo.__dict__.
  • bound method: The product of a descriptor (hah!), which takes care of automagically passing the Foo class in as the cls parameter to bar upon invocation by dot-style access.

A function object behaves differently from a classmethod, which in turn behaves differently from a bound method. An example of this behavior:

    >>> Foo.bar()
    <class '__main__.Foo'> called classmethod bar
    >>> Foo.bar.__func__()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: bar() missing 1 required positional argument: 'cls'
    >>> Foo.__dict__['bar']()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: 'classmethod' object is not callable

So it makes sense that classmethod isn’t directly callable, but the only problem with Foo.bar.__func__ was its missing argument. Let’s try with a positional argument:

    >>> Foo.bar.__func__(Foo)
    <class '__main__.Foo'> called classmethod bar

A step forward! But it would be nice if we didn’t have to pass in Foo to every call. It turns out that with descriptors, we can generate a bound method just like @classmethod did above:

    >>> Foo.bar.__func__.__get__(Foo, Foo)
    <bound method Foo.bar of <class '__main__.Foo'>>

Hey! That looks familiar! So how about not having to pass that argument at all?

    >>> Foo.bar.__func__.__get__(Foo, Foo)()
    <class '__main__.Foo'> called classmethod bar

Bingo!

Part of the behavior of a function’s __get__ is to bind the function to the instance it is given and to supply that instance as the first argument whenever the resulting bound method is called. That is why we were not required to pass Foo into the bound method generated by __get__.
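To make that concrete, here is a rough pure-Python stand-in for @classmethod. This is only a sketch: ClassMethodSketch and Potion are names invented for this example, and the real built-in is implemented in C and handles more cases than this does.

```python
import types


class ClassMethodSketch:
    """Hypothetical stand-in for the built-in classmethod, written as a descriptor."""

    def __init__(self, func):
        self.__func__ = func

    def __get__(self, instance, owner=None):
        # Ignore the instance; always bind the function to the owning class
        if owner is None:
            owner = type(instance)
        return types.MethodType(self.__func__, owner)


class Potion:
    @ClassMethodSketch
    def brew(cls):
        return f"brewing in {cls.__name__}"


from_class = Potion.brew()       # __get__(None, Potion)
from_instance = Potion().brew()  # __get__(<Potion instance>, Potion)
```

Both calls receive the Potion class as cls, because __get__ throws the instance away and binds to the owner instead.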

Okay, so we have essentially just recreated @classmethod and how the underlying function bar works when accessed via the class Foo. But how does that help with instances of Foo? Read on!

descriptors and instances

In a more technical description, __get__ can be used to “bind” objects (almost arbitrarily) to an “owner” (and optional “instance”) [2]. The “instance” part is what will be provided as an implied first argument to the underlying function once the resulting bound method is called.

As a general rule for functions: when you pass instance=None, there is nothing to bind to, so __get__ hands back the plain function. Once that function is stored on the class, it behaves like any ordinary instance method and each instance is passed as the magic first argument. If you pass instance=Foo (or any other object, class or not), __get__ returns a bound method that passes that object as the magic first argument.
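A quick way to see what __get__ returns in each case (the Owl class and hoot function are throwaway names for this sketch, and the behavior shown is Python 3’s):

```python
class Owl:
    pass


def hoot(self):
    return f"hoot from {self!r}"


hedwig = Owl()

# instance=None: nothing to bind to, so Python 3 hands back the function itself
unbound = hoot.__get__(None, Owl)

# instance=an object: a bound method whose __self__ is that object
bound_to_instance = hoot.__get__(hedwig, Owl)

# instance=a class: still a bound method, but __self__ is the class
bound_to_class = hoot.__get__(Owl, Owl)
```

In other words, whatever is passed as the instance becomes __self__ on the bound method, and that is what gets handed to the function as its first argument.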

For example, here’s how we would use descriptors to create a new instance method for new instances of Foo:

    >>> def fizz(self):
    ...     print(f"Called fizz from instance {self}")
    ...
    >>> Foo.fizz = fizz.__get__(None, Foo)
    >>>
    >>> Foo.fizz
    <function fizz at 0x101d30158>
    >>> Foo.fizz()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: fizz() missing 1 required positional argument: 'self'
    >>> f = Foo()
    >>> f
    <__main__.Foo object at 0x101d28e10>
    >>> f.fizz()
    Called fizz from instance <__main__.Foo object at 0x101d28e10>

And a similar strategy for a “class method”:

    >>> def buzz(cls):
    ...     print(f"Called buzz from cls: {cls}")
    ...
    >>> Foo.buzz = buzz.__get__(Foo, Foo)
    >>> Foo.buzz()
    Called buzz from cls: <class '__main__.Foo'>
    >>> Foo().buzz()
    Called buzz from cls: <class '__main__.Foo'>

And that’s pretty much it.

the smoke and mirrors

So in our example at the top, we have a magical wizard who can transform things into other things, but how did it work?

This is the important snippet of code:

    attrs = [attr for attr in dir(into) if not attr.startswith('__')]
    for attr in attrs:
        value = getattr(into, attr)
        # In Python 3 a method looked up on its class is a plain function
        if not inspect.isfunction(value):
            continue
        descriptor = value.__get__(target, type(target))
        setattr(target, attr, descriptor)

So the first thing we do is grab the name of every attribute (filtering out scary stuff that .startswith("__")) from the class we’re going to turn our instance “into”.

Next, we look each attribute up on that class and skip anything that isn’t a plain function. For each function we generate a new bound method using __get__, bound to our target, with target’s original class (type(target)) as its owner.

NOTE: This does not modify target’s class definition. That is, new instances of type(target) will still be initialized with all of its original class definition.

Lastly, we set an attribute of the same name onto the target. Using setattr means we don’t have to know the actual written attribute name (i.e. foo.bar =) ahead of time. Now we have successfully copied the methods from into as instance methods on target.
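Stripped of the wizardry, the effect can be sketched with two bare classes. These are simplified stand-ins that reuse the names Cat and Human, not the full classes from the story. Because functions only define __get__, an attribute stored on the instance wins over the method defined on the class:

```python
class Cat:
    def interact(self, target):
        return f"cat scratches {target}"


class Human:
    def interact(self, target):
        return f"human waves at {target}"


arthur = Human()

# Bind Cat's function to the existing Human instance and store it on the instance
arthur.interact = Cat.interact.__get__(arthur, type(arthur))

as_cat = arthur.interact("a tree")        # found in arthur.__dict__ first
other_human = Human().interact("a tree")  # the Human class itself is untouched

# Removing the instance attribute lets the class's own method show through again
del arthur.interact
as_human = arthur.interact("a tree")
```

Deleting the instance attribute is another way to undo a transformation, as an alternative to copying Human’s methods back onto the instance the way Merlin does.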

Voila! No magic; just descriptors [3].


  1. I used equality (==) rather than identity (is) checks here to be consistent with the flow of the example. In fact, Foo.baz is Foo.__dict__['baz'], but due to the way instances work, f.baz is not Foo.baz.__get__(f, type(f)): each access builds a new bound method object, so the two compare equal without being identical.

  2. The signature for the “get” descriptor is __get__(self, instance, owner). “self” is the object __get__ is called from and is passed auto-magically.

  3. For what it’s worth, what I have blogged about here is a very small amount of descriptor usage. There’s a whole internet out there using and abusing it in fun new ways every day.