Saturday, 13 July 2013

Twisted Network Programming Essentials book review


Twisted Network Programming Essentials by Jessica McKellar & Abe Fettig (O'Reilly Media) gives an introduction to Twisted, a Python framework devoted to event-driven programming, and particularly it's application to networking. The book covers both high level general concepts of Twisted, as well as detailed examples covering some of the protocols Twisted supports, including my particular interests of HTTP and SSH.



In common with many technical books, things start slowly with a chapter on installation which (IMO) would have been better relegated to an Appendix. I like a technical book to start with motivating examples followed by an overview of the technology, and that's exactly how the second chapter 'Building Basic Clients and Servers' works - it is a really good introduction, describing Twisted's core architectural separation of Transports, Protocols, and the Reactor, with some solid introductory examples (including the obligatory echo server).

The next chapter, 'Writing Asynchronous Code with Deferreds', is slightly disappointing. According to the preface, this is a new chapter in the second edition (I've not read the first), and it certainly describes 'How' to use Deferreds, but I would have appreciated more on the 'Why' - the examples are contrived demonstrations of how things work, rather than demonstrating any real use. The chapter opens by stating 'Practice will help you develop an intuition for how to structure asynchronous code', which is undoubtedly true. But this chapter won't - and I'm not sure whether it's the book or Twisted that's at fault here. Again, maybe an appendix would have been more appropriate for this content, so the use could be seen in realistic examples first?

The remainder of the book covers a variety of protocols (HTTP, Mail, IRC, SSH) and various deployment and best practices, including authentication, integration with databases, and concurrency. The section on deployment was particularly useful, and I enjoyed learning about (and using) the range of features available 'out of the box' in the twistd program. The use of Twisted's 'manhole' functionality to provide Python shell access to a remote program over SSH was also a highlight.

Although the book hasn't yet motivated me to start using Twisted regularly, I do feel I now understand the basic approach and could apply it to the sort of tasks (primarily HTTP servers and clients) I'm interested in. The only things I think could have been improved would be to move chapters 1 & 3 to appendices, and some consideration about how Twisted fits into both the Python and wider event-driven world - to answer the question of why and when I should use Twisted rather than NodeJS or gevent, or for example how Twisted's deferreds compare to Python's own concurrent.futures, or Javascript's promises.

If you already know you are going to be using Twisted, but feel slightly apprehensive about it, I think this is an excellent place to start.


Thursday, 2 May 2013

The Dynamics of Catching Exceptions in Python

In which I discuss dynamism in catching exceptions - something which took me by surprise and could hide bugs - or enable fun...

The Problem Code

The following code - abstracted just slightly(!) from production code - looks perfectly good. It's calling a function to get some statistics and then process them in some way. Getting the values in the first place involves a socket connection, which could fail with a socket error. Since statistics aren't vital to the running of the system, we simply log the error and move on.

(Note I'm using doctest to check this article - this is representative of a script doing real things!)

>>> def get_stats():
...     pass
...
>>> def do_something_with_stats(stats):
...     pass
...
>>> try:
...     stats = get_stats()
... except socket.error:
...     logging.warning("Can't get statistics")
... else:
...     do_something_with_stats(stats)

The Find

Our tests didn't find anything wrong, but actually paying some attention to our static analysis reports showed a problem:

$ flake8 filename.py
filename.py:351:1: F821 undefined name 'socket'
filename.py:352:1: F821 undefined name 'logging'

The problem with the code was that the socket and logging modules weren't imported in the module - and we clearly weren't testing for that case. What surprised me was that this didn't cause a NameError up front - I had assumed that exception clauses would have some eager name lookup - after all, if it needs to catch these exceptions, it needs to know what they are!

It turns out not so much - except clause lookups are done lazily, only evaluated if an exception is raised. Not only are the name lookups lazy, but the 'argument' of an except statement can be any arbitrary expression.

This can be good, bad, or just downright ugly.

The Good

Exception specifications can be handed around as any other values. This allows dynamic specification of the exceptions to be caught.

>>> def do_something():
...    blob
...
>>> def attempt(action, ignore_spec):
...     try:
...         action()
...     except ignore_spec:
...         pass
...
>>> attempt(do_something, ignore_spec=(NameError, TypeError))
>>> attempt(do_something, ignore_spec=TypeError)
Traceback (most recent call last):
  ...
NameError: global name 'blob' is not defined

The Bad

The downside of this dynamism is that mistakes in exception specifications often won't be noticed until it's too late - when the exception is triggered. When using exceptions to catch rare events (failure to open a file for writing for example), unless there is a test for that specific case, it won't be known until an exception (any exception) is triggered, at which point a check kicks in to see whether an exception matches, and causes an error all of its own - typically a NameError.

>>> def do_something():
...     return 1, 2
...
>>> try:
...     a, b = do_something()
... except ValuError:  # oops - someone can't type
...     print("Oops")
... else:
...     print("OK!")   # we are 'ok' until do_something returns a triple...
OK!

The Ugly

>>> try:
...    TypeError = ZeroDivisionError  # now why would we do this...?!
...    1 / 0
... except TypeError:
...    print("Caught!")
... else:
...    print("ok")
...
Caught!

The exception specification needn't just be a name lookup - arbitrary expressions also work:

>>> try:
...     1 / 0
... except eval(''.join('Zero Division Error'.split())):
...     print("Caught!")
... else:
...     print("ok")
...
Caught!

Not only can the exception spec be decided at run-time, it can even use the active exceptions' information. The following is a convoluted way to always catch the exception which is being raised - but nothing else:

>>> import sys
>>> def current_exc_type():
...     return sys.exc_info()[0]
...
>>> try:
...     blob
... except current_exc_type():
...     print ("Got you!")
...
Got you!

Clearly this is what we are _really_ looking for when we write exception handlers, and this should immediately be suggested as best practice :-p

The (Byte) Code

To confirm how it appears that exception handling works, I ran dis.dis() on an exception example. (Note the disassembly here is under Python2.7 - different byte code is produced under Python 3.3, but it's basically similar):

>>> import dis
>>> def x():
...     try:
...         pass
...     except Blobbity:
...         print("bad")
...     else:
...         print("good")
...
>>> dis.dis(x)  # doctest: +NORMALIZE_WHITESPACE
  2           0 SETUP_EXCEPT             4 (to 7)
<BLANKLINE>
  3           3 POP_BLOCK
              4 JUMP_FORWARD            22 (to 29)
<BLANKLINE>
  4     >>    7 DUP_TOP
              8 LOAD_GLOBAL              0 (Blobbity)
             11 COMPARE_OP              10 (exception match)
             14 POP_JUMP_IF_FALSE       28
             17 POP_TOP
             18 POP_TOP
             19 POP_TOP
<BLANKLINE>
  5          20 LOAD_CONST               1 ('bad')
             23 PRINT_ITEM
             24 PRINT_NEWLINE
             25 JUMP_FORWARD             6 (to 34)
        >>   28 END_FINALLY
<BLANKLINE>
  7     >>   29 LOAD_CONST               2 ('good')
             32 PRINT_ITEM
             33 PRINT_NEWLINE
        >>   34 LOAD_CONST               0 (None)
             37 RETURN_VALUE

This shows the 'issue' with my original expectations. Exception handling is done exactly as it 'looks' in the Python itself. The setup doesn't need to know anything about the subsequent 'catching' clauses, and they will be completely ignored if no exception is raised. SETUP_EXCEPT doesn't care what happens, just that if there is an exception, the first handler should be evaluated, and then the second, and so on.

Each handler has two parts: getting an exception spec, and comparing it to the just-raised exception. Everything is lazy, and everything appears exactly as you might expect from just looking at the code line-by-line, thinking about things from the point of view of a naive interpreter. Nothing clever happens, and that's what suddenly makes it seem very clever.

Summary

The dynamism of exception specs caught me by surprise slightly, but it has some interesting applications. Of course actually implementing many of those would probably be a bad idea ;-)

It isn't always intuitive how much dynamism certain Python features support - for example it isn't obvious that both expressions and statements are happily accepted directly in class scope (rather than function, method, or global scope), but not everything is so flexible. Although (I think) it would be nice, expressions are forbidden when applying decorators - the following is a syntax error in Python:

@(lambda fn: fn)
def x():
   pass

Here's a final example of playing with dynamic exception specifications to only propagate the first exception of a given type, silently swallowing repeated exceptions:

>>> class Pushover(object):
...     exc_spec = set()
...
...     def attempt(self, action):
...         try:
...             return action()
...         except tuple(self.exc_spec):
...             pass
...         except BaseException as e:
...             self.exc_spec.add(e.__class__)
...             raise
...
>>> pushover = Pushover()
>>>
>>> for _ in range(4):
...     try:
...         pushover.attempt(lambda: 1 / 0)
...     except:
...         print ("Boo")
...     else:
...         print ("Yay!")
Boo
Yay!
Yay!
Yay!

Thursday, 3 January 2013

New pylibftdi release - 0.11

'pylibftdi' is a library for talking to FTDI devices via the libftdi library. FTDI make a wide range of chipsets and modules for interfacing to a number of protocols via USB, including 8-bit parallel and RS232 serial modes. They're a great way of interfacing to other electronics from your computer.

I've just released pylibftdi 0.11. I'm at the point where I'm looking at getting to RC and then stable status, which I'll release as 1.0 - at which point the API will be considered stable. While it isn't yet, I've taken the opportunity to tidy a couple of things, as well as add some improvements.

Raspberry Pi support; better documentation

Though it worked previously, I've taken the opportunity to test it a bit on Raspberry Pi, and I've updated the docs describing udev rules which allow access to the devices without needing sudo / root access. I think this is now a good option if you want a bidirectional 8 bit port for your Raspberry Pi, and it's certainly lower risk of damaging your Pi than using the GPIO pins directly.

BitBangDevice changes

The new latch property

BitBangDevices provide a simple abstraction of a parallel IO device; a 'direction' property which controls which line is an input or output, and a 'port' property for the actual reads and writes. This is based on systems going all the way back to the BBC micro user port and earlier. direction maps to the 'Data Direction Register' of the Beeb, the 'TRISx' register of the Microchip PIC, and so on. port maps to the 'data' register of the Beeb, or the PORTx register of the Microchip PIC. Just as the PIC18F series introduced the 'LATx' register, so too does pylibftdi 0.11 introduce the latch. Read the documentation for more information - in most cases you simply don't need to care about this.

Initialisation

If a physical FTDI device is not reset between program runs, then it retains its output register state; a pin set high in one run of the program would still be high when the device was opened in a subsequent program run. Prior to pylibftdi v0.11, this was not taken into account, and the assumed state of all output pins was that they were at the reset state, i.e. all low. This meant that operations such as read-modify-write on port bits would not reflect the current state, as they do not do a read based on the output state of the port, but based on the internal view of what output values are set to.

With the change, the following will work as expected:

$ python
>>> from pylibftdi import BitBangDevice
>>> bb = BitBangDevice()
>>> bb.port |= 1
>>> ^D
$ python
>>> from pylibftdi import BitBangDevice
>>> bb = BitBangDevice()
>>> bb.port |= 2
>>> ^D

Previously, the final state of the device pins would have been '2'; the read-modify-write implied by |= 2 would have used '0' is its 'read' source, and have output '2'. The new code initialises the internal latch state to the value read from the pins (it's possible to read the actual state of output pins as well as input pins). With the latest version, the final state of the pins after the above will be '3' - both D0 and D1 set high.

API changes

I've always said in the README for pylibftdi that the API won't be stable until version 1.0, and I've changed two parameters only introduced in 0.10.x to have clearer names.

The following two parameters to the Device constructor have changed name:

interface -> interface_select
I considered interface too generic and unintuitive here. The values and behaviour for this parameter (which selects which interface to use on a multi-interface device) haven't changed.
buffer_size -> chunk_size
This is the maximum number of bytes which will be written / read at a time in read/write calls to the libftdi library, designed to ensure we are regularly executing at least some Python byte code, which we can then interrupt (timely Ctrl-C interruption is the primary use-case for this parameter). It was never about buffering, so I've changed the name to reflect this.

Other changes

The bit_server example now works properly; this can be run as:

$ python -m pylibftdi.examples.bit_server

and will start a basic CGI-based web server, open a web-browser talking to it (on port 8008 by default), and allow you to control the state of each of the 8 output lines on the connected device (which it sets to async bit-bang mode).

This will be further developed in the future - it looks somewhat rough right now :)

The led_flash example has also gained a feature in taking a command line argument of the rate at which to flash - defaulting to 1 Hz. To cause an LED (or a piezo buzzer works just as well - and more annoyingly!) to flash at 10Hz, run:

$ python -m pylibftdi.examples.led_flash 10

Coming next

I'm still trying to improve test coverage. I spent some time trying to port the tests to the Mock library, though my efforts at effectively patching at the ctypes DLL level weren't very successful.

Documentation continues, and thanks to the wonderful readthedocs.org, the documentation isn't necessarily tied to the more sedate release cycle - it always shows the latest version from Bitbucket. If more API changes happen this could be counter-productive, but I'll try really hard to note if this is the case, and it makes things much nicer when updating things like installation instructions (which I have done, adding tested udev rules instructions etc).

libftdi 1.0 is just going through release candidate stage at the moment, so I'll test against that. I expect only the installation docs will need changes.

I've never tested pylibftdi on Windows, and I'm keen to do this in the near future, though I don't have regular access to a Windows machine, so no guarantees about this. I suspect it all 'just works'...

Thursday, 4 October 2012

Typing to Yourself

I had an awesome time at PyConUK last weekend. I went to my first code dojo where I helped write a text-based adventure game (with a disturbing plot!), played with using Python on a RaspberryPi to access the GPIO, started a new Python module for my own use, and gave my second ever lightning talk, titled ‘Typing to Yourself’. This is 'the blog of the talk'.

What's this about then?

I’d started finding that IM chat logs often gave a lot of information, and often the timestamps were useful. The conversational nature of the chats also often gave subtle and useful clues about things such as confidence levels which a more formal report would lose. I started to think that it would be worth having that even if I wasn’t chatting to someone else. And so the madness started….

Typing to yourself. About stuff. Preferably as it happens, in ‘real time’ (is there another kind?). I suppose some people use Twitter like this, but I (and I'm sure my employer) like it that I keep at least some things to myself.

I've been doing this for a few months now, and got a single file with about 1300 lines of info I've been writing. Originally I cleared it out every few days, but then thought that maybe keeping it all around would be of some benefit.

Why Type to Yourself?


Record snippets of new knowledge

There are hundreds of small things I’ll find out about and then not look at again for 6 months. And chances are, I’ll forget all about them. It’s worth recording that sort of stuff. Things like pv, a new and useful iptables rule, the name of a nice vim colour scheme.

Decouple recording from reporting

Part of a knowledge-based job, where part of the task involves continual learning and researching, is that there is always the risk of going off into some blind alleys, dead-ends, or things more interesting than what you / I should really be working on. Chances are, even if it’s tangential to the work you / I should be doing, it’s still useful in itself, and worth recording. If I’ve just spent half an hour reading about ZeroMQ, I’ll include that. I might not record it in a list of training activities for the week though. It defers disclosure, allowing selection to take place at a later point. And therefore encourages more interesting and accurate reporting. By separating out recording from (for example) time reporting systems, we can post-process and filter that raw data later. Same thing as RAW and JPEG files from a camera; it’s not a bad thing to have the RAW data even if the end result is somewhat different. We are likely to be more honest if we type to ourselves, including feelings, distractions, etc, some of which will be useful at a later point.

Record why decisions were made

We make dozens of design decisions every day, and the vast majority of these seem obvious at the time. But there are some that aren’t ever obvious, and some that won’t be tomorrow even if they are now. Recording why we make the choices we do is important, even if just to force us to make them consciously. And it can be very useful to document dead-end design decisions which we try and ultimately give up on, in the hope of avoiding repeating them in the future.

Overcome creative blocks

Writer’s block affects programmers as well as novelists. Or at least it affects me from time to time. Sometimes I sit there for minutes on end, simply staring at the screen. I’ve found that explaining my dilemma to myself through the medium of typing to myself can often overcome this. Sometimes any activity can be a key to being able to think clearly about a problem again. Not only that, but regularly writing down what you're doing can be a great antidote to distraction and procrastination. This comes back to being able to be honest with ourselves about what we're doing - writing this down makes us think about it, be able to criticise it, and therefore more quickly be able to change direction.

Rubber duck debugging

© Tom Morris / Wikimedia Commons / CC-BY-SA–3.0 / GFDL

This is a technique which uses vocalisation of a problem we’re facing to make us think more clearly about the problem; to take a step back, and explain to a toy rubber duck - ideally one with no previous knowledge of the problem we’re facing - what’s going on and how we’re trying to fix it. Just explaining it often helps us realise the problem. But rubber ducks are tricky to find at the crucial moment, and people think programmers are mad enough already without seeing us all talking to little ducks sitting on our desks. No, typing to ourselves, writing down the problem, is clearly much safer. After all, a programmer writing down a problem to themselves looks highly productive, rather than slightly mad.

Searchable history

We version control our code. Why not version our thoughts and activities? Write stuff down. Be able to go back in time and revisit those thoughts at a later date. Use it to store our short-term thoughts just before a meeting or break, so picking up where we left of is easy. That sort of stuff. Or to record surprising errors which we can’t reproduce and just put down to ‘something must have been set up wrong’. But then we start to find that we’ve already recorded it two months earlier…

How should I go about that?


Timestamped

Typing to yourself is an activity best done in real-time. Doing it later may still have some benefit, but the stream of consciousness brain-dump in the background has a lot of value which is lost if we’re just typing a historical report on what happened earlier. The point of typing to yourself is that having a record is useful; trying to remember stuff to record after the fact is lossy and a waste of time. Having things timestamped is a motivation (‘I’ve not written anything for 2 hours!’) and useful for searching history - finding out just when that bug appeared last.

Centralised

For a given context (e.g. work), there should be a single log on which you type to yourself. Perhaps there shouldn’t even be multiple contexts; everything should go in one big fat log. But it should be a single log, and yet available everywhere. Having to merge logs, or wondering where the latest version is, or knowing but not having access to it - all bad things. Dropbox is good.

Low friction

The whole point of ‘typing to yourself’ is that it shouldn’t be a context switch. I tend to keep a tmux pane open with editfile running (as track -t). Switching into it is just a case of Ctrl-A/cursor key. Then type stuff. Then Ctrl-A/cursor the other way. There’s no alt-tabbing, no windows changing focus or popping in front of each other. And importantly, I can see what’s there at all times, so it’s always in my consciousness - I don’t have to ‘swap it back in’ when I switch to it. Another aspect of low-friction is that the data itself should be widely available to programs to use, whether for searching, editing, or anything else. A text file is ideal.

An Implementation

I’m more keen about the ideas here than the implementation, but without an implementation it couldn’t work. I use my editfile program for almost all longer pieces of writing - blog posts, ideas, plans. And my ‘typing to myself’ log, which is just an editfile ‘instance’ used in ‘time track’ mode, which keeps a single file on Dropbox with all the content in a text file. I wrote about editfile in an earlier blog post. editfile started out as a very simple bash script:

#!/bin/bash
$EDITOR "~/Dropbox/editfile/$(basename $0)"

but is now a more complex bash script, including search, a two level hierarchy (I had that before iCloud decided it was a good idea!), command-line completion, and the time track mode I use for typing to myself.

The time-track mode has a couple of useful features - readline & history integration, and prompting and storing a timestamp. It’s not perfect; one of the key things is that the timestamp prompt doesn’t update in real time (although it does store the current time in the text file rather than the potentially out-of-date displayed time). The implementation of the time-track loop is the following:

now=$(date '+%Y/%m/%d %H:%M')
# read history from previous
history -r $HIST_FILE
while read -ep "$now >> " track_input ; do
    now=$(date '+%Y/%m/%d %H:%M')
    if [[ -z $track_input ]] ; then
        # don't store blank lines
        continue
    fi
    # use -- to indicate end to options e.g. if track_input
    # starts with '->' which previously caused errors
    history -s -- "$track_input"
    echo "$now $track_input" >> ${TARGET_PATH}
done
# append current session to history
history -a $HIST_FILE
# ensure bash prompt starts on a new line
echo

I use this every day at work, and it's got to the stage where I want to use it more. I've got plenty of ideas for things to integrate into my implementation, though the real essence of it doesn't need anything clever really.

Monday, 24 September 2012

Project Naming in a Google World

I’m a great fan of Python; not only do I think the language itself is clean and readable, the community polite and helpful, and the ecosystem diverse and fascinating, but also the Zen of Python resonates with me.

I think there is significant value in that ‘there should be one - and preferably only one - obvious way to do it’, and that ‘namespaces are one honking great idea’. To me, it is sad that this essence of Python philosophy isn’t applied more widely.

Of course there is an element of tension in the Zen - Namespaces are about nesting, but ‘Flat is better than nested’. Nevertheless, flat within namespaces isn’t the same as not having any namespaces at all.

Namespaces don’t exist in a Google world.

I bet that most project name searches on Google are a single word. ‘jquery’ would get me want I want. ‘requests’ gets me what I want. Even one of my own projects - ‘pylibftdi’ gets me where I want to go. Getting to this point is probably part of choosing a good name. But that’s exactly the problem: how do I choose a good name for my new project? It’s one thing already knowing what project I’m interested in and simply using Google to get me there (sometimes a language name qualifier helps, e.g. ‘python flask’), it’s quite another two problems a) searching for a project to meet a given problem, not knowing what might be available b) searching for a project name I can use for my shiny new thing.

Searchable Project Names

One of the technologies I use the most at work is SSH. I tend to use it mostly in a fairly crude way, via it’s normal client and server programs ssh and sshd with many configuration options, but I have used the paramiko library. Which works well, and has a great name - easily remembered, especially after reading about its etymology on the project page. And very easily searchable. Recently, however, it’s development has slowed. I read in some places that it is now ‘deprecated’, but I’m not sure about that - the github project was last updated 11 days ago as of now… Anyhow, recently it has been forked, and its ‘successor’, has the brilliant name of… wait for it… ‘ssh’. Yes, brilliant. No, actually, it isn’t that helpful. Search for ‘ssh’, and it obviously won’t be there, straightaway, on the first page. Search for ‘python ssh’, and it still won’t be there. I guess it might be in a few months or years once it (presumably) takes off as the ‘one way to do it’, but now? It’s not helpful. Maybe it’s only aimed at people who use the PyPI search engine? And even if / when it is ‘obvious’, it’s still going to be a pain to do web searches for problems relating to use of the package. If I want to know which to use, then ‘paramiko vs ssh’ is of no help. Is the new ssh module ‘preferred’ by the community going forward? Or is it just a random fork by the Fabric guys? Other than the download stats on PyPI, it’s difficult to tell, because searching for info about it is... tricky.

As another example, the pbs package has recently changed its name to sh. Now pbs might not be the bestest name, but changing it to sh causes exactly the same kind of problem as ssh. There can be a real feeling of ‘hijacking’ when something so domain specific is used for a general project name. Using such a name is a clear signal: this is the module you should want to use for this task - you’d would be crazy to try anything else! That may or may not be intended or justified, but when it is a trivial thing for anyone to do, we developers have to be very careful and deliberate. Domain-specific project names, with massively overloaded meanings, only make sense in a very defined namespace: in these cases, the set of Python packages on PyPI.

Except, in a Google world, there aren’t namespaces.

Finding a project name (or rather finding the absence of one)

One of the problems with project naming in a flat unified project namespace (because of course there is one namespace) is project name squatting. For a variety of reasons - good and bad - developers decide that ‘release early, release often’ is a good policy. And one of the first things needed for that first visible release - perhaps the only thing needed - is a project name. So names are snapped up in an eager race. Project names have become the new hot-property. So we have lots of great project ideas, which need and find an awesome project name, make that first release, … and then do nothing. Stagnate. Just like the dot-com crazy days, we have project-name squatting, and permanent project-name ‘under construction’ empty shells… And, like defunct satellites cluttering low-earth orbit, the debris of project names now unused is a danger to every other project, trying to find its own space and path through the knowledge-sphere, avoiding the no-man’s land which has been staked out and left barren, taking juicy spectrum in an interference causing blackout. Soon there will be no more names left and [Sorry, I seem to have got carried away. Ahem.]

So…?

The following are some more thoughts and examples. Most of this is subjective. Hurrah for being able to dump half-finished ideas in a well name-spaced environment!

Over-general names:

  • ‘node’ - really unhelpful.
  • ‘windows’ - key element in GUI programming. WIMP.
  • ‘dropbox’ - to a certain extent.
  • ‘color’ - remember them? Good thing they didn’t take this word away…
  • ‘word’ - a tool for writing words?
  • eliminate a name not just from the project namespace, but increasingly from the word namespace.
  • makes web searching harder

Unpleasant / generally bad names:

  • git
  • gimp
  • My[anything] ;-)
  • Any number of ‘offensive’ or ‘wrong connotation’ names, often leading to name changes, which help no one, except in an ‘any publicity is good publicity’ kind of way:

Duplicate projects with the same name:

Create or recognise our own namespaces:

  • blog articles: author + title
  • PyPI / CPAN etc
  • ‘hungarian notation’ e.g. pyxyz, where the ‘py’ prefix includes some indicator of what namespace it lives in.
  • domain name country code extensions - ‘.io’ etc
  • ‘file extension’ as part of project name: ‘node.js’ etc
  • identification by company or organisation: iOS / iPod / i*, gmail, google maps, etc
  • identification by well-known patterns: xUnit, [j/py]Query etc.

Summary

If I were to produce a new vacuum cleaner and call it ‘Vacuum’, then various people might get upset. We (in software development) don’t really want to have to deal with all the legal & trademark clutter - the fact that we can have an idea, create a project and ‘market’ it all in a weekend is awesome, but requires us to act responsibly. Just because we can launch a new project into the orbital (name)space around us, doesn’t mean we must. Though it is awfully tempting… In addition we need to recognise, use, and educate ourselves and others about the namespaces all around us.

So I guess what I’m really saying, is (to quote Tim Peters)...

Namespaces are one honking great idea - let’s do more of those!

Sunday, 10 June 2012

pylibftdi v0.10 released

I’ve recently released pylibftdi v0.10. pylibftdi is a ‘minimal Pythonic interface to libftdi’, and is intended to be (possibly?) the easiest way to get up and running for simple use cases of FTDI’s range of USB to serial and parallel interface chips and modules. v0.10 adds a couple of new features and bug fixes.

For a long time I suffered under the misapprehension that version numbers should follow the rules of decimal numbers, and that by all reasonable accounts v1.0 should have followed 0.9, and since I want(ed) 1.0 to be ‘stable’ (I currently classify it as ‘beta’), I’d reached an impasse. I can’t remember the exact moment, but I had a realisation that I didn’t have to approach 1.0 via smaller and smaller increments from 0.9 (as in Zeno’s race), but that I could go from 0.9 to 0.10. Anyway, I still want to do better documentation (and a few other things) before reaching 1.0.

Changes in v0.10:

  • Running the unit tests is now easier due to some reorganisation - just run python -m unittest discover in the top level directory.

  • Support for the FT232H device - this has a different USB product ID (PID) to the previous devices I’d been testing with and using - mainly FT245BM/RL, FT232R/RL. All those devices have a PID of 0x6001, while the newer FT232H has a PID of 0x6014. I experimented for a while with having (defaulted) extra parameters for specifying the VID and PID of the target device, but this pushed too much complexity up to the user - I really want pylibftdi to be something which can be used with default options and next-to-no set up code for most basic operations. The approach taken is to have two lists (USB_VID_LIST, USB_PID_LIST) and have the device finding code iterate over the cartesian product of these (i.e. a nested loop, but implemented through the wonderful itertools.product). So adding new PIDs in the future is as simple as appending to USB_PID_LIST, and a device can be opened with no parameters to the Device() constructor if it’s the only FTDI device on the USB bus.

  • Resetting the device to serial mode on open. There’s been discussion about implementing this logic in the library on the libftdi mailing list, but doing it in pylibftdi as well doesn’t hurt. This fixes the unexpected condition that if a previous application had used a device in BitBang mode, reopening it just using Device() would leave it in BitBang mode, rather than the expected serial mode (for devices which have support both).

  • Added a ‘buffer_size’ parameter to the Device() constructor (defaulted to zero, which retains previous behaviour) which chunks reads and writes into accesses of that length at most. This avoids the issue that a call of (for example) dev.write(‘hello’ * 100000) over a 9600 serial link would take an incredibly long time, and since it is all running in the library context (via a ctypes call), it wouldn’t be interruptible by Ctrl-C.

  • Removed the deprecated use of Driver() to be a synonym for Device().

  • Update: I've already done two maintenance releases in the hours since originally writing this - v0.10.2 is now current. One of the major changes is that the examples subpackage is now included in the sdist - so python -m pylibftdi.examples.led_flash should work if you have an LED attached to D0 on your device.

The plan for the next release is just more tidy-ups, examples and more documentation, but I might squeeze a few other things in there…

Friday, 8 June 2012

My First 3 hours with a Raspberry Pi

My Raspberry Pi arrived a couple of weeks ago, but I've been too busy to play with it til now. And tomorrow I'll be too busy with different things :-)

I'd intended to put Arch Linux on it, since that's what I use on my Pogoplug, and it works really nicely there. But, starting at the downloads page (http://www.raspberrypi.org/downloads) I decided that since the Debian install seemed to have some 'official'ness about it, I'd start there. It pointed to http://elinux.org/RPi_Easy_SD_Card_Setup, which contains a list of 'lazy/easiest/easy/...' and so on approaches to getting a card set up.

A link on that page led to a Mac-only (for now) script which downloads and writes an image to an SD card all in one operation - nothing too difficult to do manually, but I'm not one to ignore easy options. Mostly. http://exaviorn.com/raspiwrite/

I'd bought a Sandisk Extreme 8GB SDHC card - which is on 'the list' by the way - for use with my RPi. I ran the RasPiWrite script firstly with Debian, then Arch Linux. Neither worked. No 'OK' LED, only the red Power LED. Somewhat disappointed. As a side-note, the RasPiWrite script worked better if I reformatted (standard FAT) the card between attempts; otherwise it initially considered selecting my external hard drive, which wasn't great - but it did warn and I was aware, so no worries (for me) there.

I then wondered what the OK LED was supposed to indicate, and read (here: http://www.raspberrypi.org/phpBB3/viewtopic.php?f=28&t=6735) that it should be expected to at least flicker a little on power up - and it wasn't. I had another SDHC card (an old Transcend 4GB SDHC Class 6) and decided to try that. After re-runninge RasPiWrite on it with Debian again, ... the OK light did stuff! It appears to be an SD access light - this wasn't mentioned anywhere I noticed - but it isn't just constantly on when the system is booted, but flickers on activity. So I was sure it was finally booting. I'd read a bunch of things suggesting that as well as dodgy SD cards, flaky PSUs could be a fault, but the two I used (a Blackberry phone charger and an iPad charger) both worked fine in the end.

Sadly however, there was no output on HDMI, butfrom my router's DHCP client list that it was taking a lease. A quick OUI scan showed the Raspberry Pi foundation as the owner of a new MAC in the table :-) I'd foolishly assumed ssh would work out of the box, but nmap showed just port 111 open. Following the instructions here (http://www.youtube.com/watch?v=SmMMKojOE4U) to enable SSH worked great. (Basically 'ssh-keygen; service ssh start; update-rc.d ssh defaults' as root), and I could ssh in! Hurrah!

After reading a few places that sometimes the RCA could work even if HDMI didn't, I found an RCA-SCART adapter and magic - it was actually doing stuff! Finally, after searching and finding http://elinux.org/R-Pi_ConfigurationFile and http://www.raspberrypi.org/phpBB3/viewtopic.php?f=28&t=6762 I got HDMI working by (while logged in over SSH) by creating a /boot/config.txt file (there wasn't an old one) containing the following:

config_hdmi_boost=4
hdmi_force_hotplug=1

And that was it - task accomplished.

Since it's listed in the 'working SD card' list, presumably more recent kernels might work with the bigger / faster card I bought for the purpose of running my RPi; IMO it should be a priority to keep images on the downloads page current - containing latest kernels supporting as wide-a-range of SD cards as possible. I'm sure many have fewer hassles than me, and equally sure many have had many more problems. But for this whole endeavour to really work, it needs to get easier. Is RPi about teaching programming, fun, and 'computing', or about the harsh realities of embedded Linux? Fortunately there is an awesome community which I'm confident will keep pushing things forward - thanks!