miscoranda: by Sean B. Palmer

Mimulus

After just a couple of weeks spent making it, I've released Mimulus. It's an XHTML editor extension for Firefox, and it lets you edit content right inside the tab. It'll bring up a cursor so that you can just click inside a paragraph or heading and start typing; and it enables key combinations to make new elements, such as Ctrl+U to make a new unordered list, which you can then type into.

Though it's pretty lightweight, focussing on paragraphs and headings and lists and links rather than embedded content and tables and style, it has a source edit mode so that you can make more extensive changes almost as quickly if need be. The main advantage it brings is convenience, for when you need to edit lots of very simple notes very quickly. But though I wrote it mainly for creating notes, I've found that it's actually very suitable for editing random existing pages, and typing up documentation and so on.

It basically takes WYSIWYG to the limit, because the finished product is literally what you see in the browser: there are no toolbars or anything like that since all status information is displayed in the browser's status bar. It's minimally invasive too, so it's turned on per-tab rather than being activated for all of your tabs. And since it's an extension, security is handled transparently, and you can save edited documents to your local hard drive.

And of course it's an open source present, released under a BSD-style license. The hard work of doing the conversion to an extension was done by Christopher Schmidt, who also helped out with a bit of the coding and hosts the Subversion repository for the code; and the logos were designed by Cody Woodard, who just so happens to be awesome at that.

Please feel free to send me feedback on Mimulus, especially about bugs and so on. Please remember, if mailing feature requests, that the lightweight and minimally invasive characteristics of Mimulus are by design and should carry through to any new capabilities.

by Sean B. Palmer, at 2006-08-17 02:14:03.

Pluvo

For all of you who've been waiting for me to release this, and all of you who haven't, my Pluvo Programming Language project is now go. From the homepage: "Pluvo is a nascent experimental scripting language with an easy to use syntax, built in test facilities, and modern datatypes. High level and data structured, it makes things easier for the programmer by incorporating idioms from a wide range of languages in a consistent manner."

I had been working hard to get it to a level of maturity where it was at least a curiosity and had the major structural functions working so that people could get a feel for them. It's also, hopefully, at a level where people can actually have a go at adding bits themselves, not that I expect that to happen particularly. This means that I haven't really discussed much of the feature set with anyone, which has been very difficult to avoid!

So there you go: feel free to download it, poke at it, get it running and so on, but don't expect too much from it. It's mainly the concept of the thing and the ideas implemented in it that are the fun thing. It might even prove to be something that I continue to the point of actually maintaining various of my scripts in; especially, perhaps, CGIs which I think could turn out to be quite nice written in Pluvo.

by Sean B. Palmer, at 2006-07-01 01:06:15.

Antikythera Mechanism in Python

I've ported the Antikythera Mechanism to Python: antikythera.py. The Antikythera Mechanism is an ancient Greek astronomical calculator made from a gaggle of gears, so it's a kind of digital ratioing machine. I just modelled the gears in Python, then bound it all together per some schematics of the mechanism. Here's an example of how to use the script:

$ ./antikythera.py 20
Sun: 20.0°
Moon: 267.36842°
4 Year Dial: -5.0°
Synodic Month: -247.36842°
Lunar Year: -20.61404°

The input argument is the number of degrees clockwise through which to turn the drive wheel. The outputs are the number of degrees through which the respective output gears have been moved. The latter three go anticlockwise because you look at them from the other side of the mechanism, behind the base plate.
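
The figures in that sample run are consistent with the Metonic relation that the mechanism is usually described as encoding: 19 years contain 235 synodic months and 254 sidereal months. As a rough sketch only, the following reproduces the sample output from net ratios rather than from individual gears. The ratios are inferred from the output above, not taken from antikythera.py, and the names RATIOS, turn and report are hypothetical.

```python
from fractions import Fraction

# Net ratio of each output relative to the drive wheel.
# Assumption: inferred from the sample output, not from the script's gear train.
# Negative values mean the pointer turns anticlockwise (seen from the back).
RATIOS = {
    "Sun": Fraction(1),
    "Moon": Fraction(254, 19),            # sidereal months per 19 years
    "4 Year Dial": Fraction(-1, 4),
    "Synodic Month": Fraction(-235, 19),  # synodic months per 19 years
    "Lunar Year": Fraction(-235, 19 * 12),
}

def turn(degrees):
    """Return the rotation in degrees of each output for a given drive rotation."""
    return {name: float(ratio * Fraction(degrees)) for name, ratio in RATIOS.items()}

def report(degrees):
    """Format the rotations in the same style as the script's output."""
    return [f"{name}: {round(value, 5)}°" for name, value in turn(degrees).items()]

if __name__ == "__main__":
    print("\n".join(report(20)))
```

Using exact fractions keeps the ratios free of rounding error until the final conversion to a float for display.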

by Sean B. Palmer, at 2006-06-09 09:16:34.

GRDDL for XHTML Schemata Associations

For validation and as an editor hint in emacs's nxml-mode, I use a little RELAX NG Compact schema called xhtml.rnc which allows a subset of XHTML 1.0 Strict. So any documents that I write conforming to it should also hopefully be valid XHTML 1.0 Strict. But how do I make the association between an instance document and this schema formal?

The obvious choice would be to use the schema as a value of the profile attribute, which is designed either as a globally unique name for dispatch of arbitrary facilities, or as a namespace for @rel and @rev. My use of it here would be for the former purpose:

<html xmlns="http://www.w3.org/1999/xhtml">
<head profile="http://inamidst.com/proj/quality/xhtml.rnc">
   [...]

Sadly, though, this conflicts with languages which provide facilities under the latter purpose. In other words, if I use the profile attribute for my own devices, I won't be able to use it when some other language comes along that needs it. One such language that may well be very popular in the future, and is already specified, is GRDDL. But GRDDL is special in that it is itself a generalised mechanism for allowing arbitrary extra structure to be added to HTML, in such a way as to be clearly authorised by the creator of the document.

So it would be possible to use GRDDL to provide this schema hint, as long as we used the GRDDL mechanism properly. Here's what I envisage:

<html xmlns="http://www.w3.org/1999/xhtml">
<head profile="http://www.w3.org/2003/g/data-view">
<link rel="transformation" href="http://example.org/xhtml/transform" />
<link rel="schema" href="http://example.org/xhtml/schema" />

This is okay because HTML 4.01 says that "Authors may wish to define additional link types not described in this specification. If they do so, they should use a profile to cite the conventions used to define the link types." That means it's up to GRDDL to define what rel="schema" means, but GRDDL doesn't seem to mind as long as you obey its rel="transformation" way of doing things. So the definition for rel="schema" comes from the output of the transformation. The transform URI here could be used as a globally unique value itself, if you know the rel="schema" convention that it formalises.
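
To make the consumer's side concrete, here is a minimal sketch of how a tool such as an editor might pick up the schema hint. It assumes the tool already knows the rel="schema" convention and so only checks that the GRDDL profile and a transformation link are present; it does not fetch or run the transformation. The function name schema_hints and the completed example document are hypothetical.

```python
import xml.etree.ElementTree as ET

XHTML = "{http://www.w3.org/1999/xhtml}"
GRDDL_PROFILE = "http://www.w3.org/2003/g/data-view"

def schema_hints(document):
    """Return the rel="schema" hrefs of an XHTML document that opts in to GRDDL."""
    root = ET.fromstring(document)
    head = root.find(XHTML + "head")
    if head is None:
        return []
    # The profile attribute holds a whitespace-separated list of URIs.
    if GRDDL_PROFILE not in head.get("profile", "").split():
        return []
    links = {}
    for link in head.findall(XHTML + "link"):
        # rel is also a whitespace-separated list; link types are case-insensitive.
        for rel in link.get("rel", "").lower().split():
            links.setdefault(rel, []).append(link.get("href"))
    # Without a transformation there is nothing to define what rel="schema" means.
    if "transformation" not in links:
        return []
    return links.get("schema", [])

EXAMPLE = """\
<html xmlns="http://www.w3.org/1999/xhtml">
<head profile="http://www.w3.org/2003/g/data-view">
<title>Example</title>
<link rel="transformation" href="http://example.org/xhtml/transform" />
<link rel="schema" href="http://example.org/xhtml/schema" />
</head>
<body><p>Example</p></body>
</html>
"""

if __name__ == "__main__":
    print(schema_hints(EXAMPLE))
```

A stricter consumer would dereference the transformation and read the schema association from its output instead of trusting the link directly.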

For more information on this topic, see the #swig chat that I had with Dan Connolly as to whether it was valid to use rel attribute values not defined by GRDDL for your own use even when using the GRDDL profile.

by Sean B. Palmer, at 2006-04-25 16:04:56.

User-Agent Abuse

According to RFC 2616, the User-Agent header is a statistical datapoint and capability preference, allowing the receiving site to serve pages based on what the client is known to be able to receive: "This [header] is for statistical purposes, the tracing of protocol violations, and automated recognition of user agents for the sake of tailoring responses to avoid particular user agent limitations." So if the limitations of your user agent change, you can modify the User-Agent field that you send appropriately.

With this in mind, I often set my User-Agent header to "Mozilla/5.0 (Something)" when I'm using wget, curl, or urllib in Python, but I'm often told that this is a bad thing, even an abuse of the header. That's absurd; the abuse is usually on the server side, not the client. I fake the User-Agent because many sites don't allow download via curl or wget—two that spring immediately to mind are google.com and f2o.org. These sites have a legitimate practical reason to do so: presumably a high percentage of the hits they receive from these user agents are crawlers and bots. With Google especially, this is going to cost them a lot of money, so blocking is prudent.
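
For the Python case, a minimal sketch using the standard library's urllib.request (the present-day home of this functionality) looks something like the following. The helper names build_request and fetch are hypothetical, and the header value is the one quoted above.

```python
import urllib.request

USER_AGENT = "Mozilla/5.0 (Something)"

def build_request(url, user_agent=USER_AGENT):
    """Return a Request that carries a custom User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})

def fetch(url, user_agent=USER_AGENT, timeout=10):
    """Fetch the URL with the custom header and return the body as bytes."""
    request = build_request(url, user_agent)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()

if __name__ == "__main__":
    # No network access needed: just show the header that would be sent.
    # urllib stores header names capitalised as "User-agent".
    print(build_request("http://example.org/").get_header("User-agent"))
```

The command line equivalents are the -A option in curl and the -U option in wget.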

But bots should adhere to robots.txt, and I'll bet that a significant portion of the requests that sites banning curl and wget receive from those clients are legitimate. Their filtering is, therefore, a technical solution to a societal problem. It's a bit like banning Firefox on a framed site because Firefox can display the content unframed. So whilst I realise that banning the clients server side is something that pragmatically just has to be done, a hack to save a lot of money and bandwidth, it's an abuse of the User-Agent header, and it's taking place on the server. Getting around that by faking the User-Agent header client side is abuse by neither morals nor specification, as long as the client is being used legitimately.
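
For what adhering to robots.txt might look like on the client side, here is a small sketch using Python's urllib.robotparser. The helper name allowed, the sample rules, and the ExampleBot agent string are all hypothetical; a real client would download the site's own robots.txt before making requests.

```python
from urllib.robotparser import RobotFileParser

def allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

# Hypothetical rules for illustration only.
EXAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""

if __name__ == "__main__":
    for path in ("/notes/", "/private/diary"):
        url = "http://example.org" + path
        print(path, allowed(EXAMPLE_ROBOTS, "ExampleBot/1.0", url))
```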

(Tip of the hat to John Cowan.)

by Sean B. Palmer, at 2006-02-20 20:53:25.
