I've remarked privately from time to time how I couldn't possibly achieve the things I've been working on without a million bits of work put forth by others. The list is massive, everything from the OS I live in to the web framework I build on and the language it's written in, to my source-control management program & IDE, my browser, and a million smaller pieces. Not to mention the many web services, both commercial and not-for-profit, and various tools disguised as websites.
Most important of these, lately, seems to have been the git & github team. I'd never done much open-source development before picking up git. Even if I made changes to an existing project, it was too much of a fuss to publish them, and I was busy with other things. Busy enough that what changes I made, I kept for myself. But now it seems that git(hub) has changed that. Over the past few months I've made a number of contributions of minor significance, but I'm graduating into more interesting territory, namely ROXML 2.0.
ROXML is short for "Ruby Object to XML Mapping Library." XML goes in, Ruby objects come out. These days, with web services flying so freely across the web, and XML being one of their common languages, it seems useful to be able to declare a mapping in the form of a collection of objects, then extend those objects with functionality, and interact with them cleanly, as objects, not as structured text.
I had just this need, and came across ROXML and a number of other similar options. I picked ROXML because it was relatively clean, simple and to the point. I've been hacking away at it in the spare parts of my nights and weekends, and by now there's very little recognizable from the original library, so I figure it's time to release :-).
For an initial look at ROXML, you can see John Nunemaker's recent post, or the old site & the docs therein. I'll be using those as a baseline.
A syntax makeover
Now, I can be a stickler for syntax. As I've said before, syntax matters. It changes the way we interact with the system, it changes what is done, how it is done, and what can be done. Take, for example, the old syntax:
[sourcecode language='ruby']
class Posts
  include ROXML

  xml_attribute :user
  xml_attribute :tag
  xml_object :post, Post, ROXML::TAG_ARRAY
end
[/sourcecode]
Read-only-ability
There's something important missing in those definitions above. Quick, is :user modifiable, or no? What about :tag and :post? Surely it's one or the other, but which?
It turns out that the attributes above are writable, which is the default. To override this you'd have to write the following:
[sourcecode language='ruby']
xml_attribute :user, ROXML::TAG_READONLY
[/sourcecode]
As syntaxes go, this is a pretty obtuse barrier to const-correctness, and will likely lead to most developers simply leaving their attributes writable, even when a more restrictive setting would be correct. The Ruby community may have cast aside strict typing, but const-correctness is still a very important part of object-oriented programming, what with factoring being all about minimum exposure and minimum coupling, and it ought to be treated as such.
The solution is to treat writability the same way the standard attr methods do, by making it a key part of the declaration name. The type name is relegated to a parameter, which gives us flexibility we'll exploit later. In short, you end up with this:
[sourcecode language='ruby']
# read-only:
xml_reader :user, :attr
# writable:
xml_accessor :user, :attr
[/sourcecode]
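To make the parallel with attr_reader and attr_accessor concrete, here is a minimal sketch of how such declaration methods could be built. This is not ROXML's actual implementation: the XmlDeclarations module, the xml_declarations registry and the Post class are all made up for illustration, and no XML parsing is involved.

```ruby
# Hypothetical stand-in for ROXML's declaration macros; not the library's code.
module XmlDeclarations
  # Read-only: only a getter is defined, as with attr_reader.
  def xml_reader(name, type = :text)
    xml_declarations[name] = type
    attr_reader name
  end

  # Writable: a getter and a setter are defined, as with attr_accessor.
  def xml_accessor(name, type = :text)
    xml_declarations[name] = type
    attr_accessor name
  end

  # Remembers the declared type of each attribute, per class.
  def xml_declarations
    @xml_declarations ||= {}
  end
end

class Post
  extend XmlDeclarations

  xml_reader :user, :attr    # read-only
  xml_accessor :tag, :attr   # writable

  def initialize(user, tag)
    @user = user
    @tag = tag
  end
end

post = Post.new('ben', 'ruby')
post.tag = 'xml'             # fine: declared with xml_accessor
post.respond_to?(:user=)     # false: xml_reader defines no setter
```

The point of the design is that the restrictive choice costs no extra typing, so there is no reason to leave an attribute writable by accident.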
Object-tivity
Now you may notice above that :attr declares the referenced type as the second argument. This is consistent throughout, and there are several more types. They are:
- :attr: an XML attribute on the current node, returned as text
- :text: the contents of a named sub-node, returned as text
- :content: the contents of the current node, returned as text
- Object: any ROXML object can be provided to declare sub-types, including recursive types (provided recursion terminates)
- [Object], [:text]: put the type in an Array to declare that there are multiple instances of this type, which should be provided in a collection
- {}: a hash type can be populated with sub-nodes and attributes, in various ways
:text is the default if no type is declared.
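The difference between the three textual types is easiest to see against a real node. The following is a rough sketch using REXML from Ruby's standard distribution rather than ROXML itself; the extract and extract_all helpers and the sample document are assumptions made for illustration, and ROXML's internals may well look different.

```ruby
require 'rexml/document'

# Hypothetical helper: resolve one of the three textual types against a node.
def extract(node, type, name = nil)
  case type
  when :attr    then node.attributes[name]      # attribute on the current node
  when :text    then node.elements[name]&.text  # contents of a named sub-node
  when :content then node.text                  # contents of the current node
  else raise ArgumentError, "unknown type: #{type.inspect}"
  end
end

# Hypothetical helper for the [:text] case: every matching sub-node, as a collection.
def extract_all(node, name)
  node.get_elements(name).map(&:text)
end

xml = <<~XML
  <posts user="ben">
    <post>
      <description>ROXML 2.0</description>
      <tag>ruby</tag>
      <tag>xml</tag>
    </post>
  </posts>
XML

posts = REXML::Document.new(xml).root
post  = posts.elements['post']

user        = extract(posts, :attr, 'user')             # read from <posts user="...">
description = extract(post, :text, 'description')       # read from <description>
first_tag   = extract(post.elements['tag'], :content)   # the <tag> node's own text
all_tags    = extract_all(post, 'tag')                   # one entry per <tag>
```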
Named args & TAG_what?
The old ROXML uses only positional arguments and these TAG_ constants to declare aspects of the declaration. But the ROXML::TAG_ stuff is unnecessarily heavyweight, so the new ROXML uses symbols instead, e.g. :cdata rather than ROXML::TAG_CDATA.
Likewise, many optional arguments are now named, rather than positional. So rather than have to put in the default values for these parameters, or nil, you can simply omit them. So these:
[sourcecode language='ruby']
xml_text :name, 'NAME', ROXML::TAG_CDATA | ROXML::TAG_READONLY, 'USER'
xml_text :name, nil, ROXML::TAG_READONLY, 'USER'
[/sourcecode]
Become:
[sourcecode language='ruby']
xml_reader :name, :from => 'NAME', :in => 'USER', :as => :cdata
xml_reader :name, :in => 'USER'
[/sourcecode]
The options map as follows:
- :in: previously 'wrapper'
- :from: previously 'name'
- :else: used to declare a default value in case the entity is missing; previously unavailable
- :as: previously 'options'. Can be passed as a single symbol, or as multiple symbols in an array
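As a rough illustration of how the named options might combine during a lookup, here is a small sketch built on REXML. The read_text helper is hypothetical, and the semantics it assumes (:from overrides the tag name, :in names a wrapper node, :else is returned when nothing is found) are my reading of the list above rather than ROXML's code.

```ruby
require 'rexml/document'

# Hypothetical lookup: read the text of a sub-node, honoring :from, :in and :else.
def read_text(node, name, opts = {})
  tag   = opts.fetch(:from, name.to_s)                  # :from overrides the tag name
  scope = opts[:in] ? node.elements[opts[:in]] : node   # :in names a wrapper node
  found = scope && scope.elements[tag]
  found ? found.text : opts[:else]                      # :else supplies the default
end

person = REXML::Document.new('<person><USER><NAME>Ben</NAME></USER></person>').root

# Looks for <NAME> inside the <USER> wrapper.
name = read_text(person, :name, :from => 'NAME', :in => 'USER')

# No <nickname> node exists, so the :else value is used instead.
nickname = read_text(person, :nickname, :in => 'USER', :else => 'n/a')
```

Because every option is keyed by name, any of them can be left out without padding the call with nil placeholders.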
Hash attack!
One of the more important additions is the Hash base type mapping. Hash declarations have a syntax of their own which enables you to pull from attributes, contents, names and sub-nodes of a series of entries. This can be super-useful for web services which provide collections of named attributes, which fit naturally in this type. The ROXML documentation covers these cases well.
Here are a few examples of the syntax:
[sourcecode language='ruby']
xml_reader :definitions, {:attrs => ['dt', 'dd']}
xml_reader :definitions, {:key => {:attr => 'word'},
:value => :content}, :in => 'definitions'
[/sourcecode]
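For a sense of what those two declarations might produce, here is a sketch of the equivalent manual work with REXML. The helper names and the sample document are invented for illustration, and the interpretation (one hash entry per node, with the key and value pulled from the named attribute or from the node's content) is an assumption rather than a statement of ROXML's behaviour.

```ruby
require 'rexml/document'

# Hypothetical helpers; ROXML's own implementation may differ.

# {:attrs => [key_attr, value_attr]}: both key and value come from attributes.
def hash_from_attrs(nodes, key_attr, value_attr)
  nodes.each_with_object({}) do |node, hash|
    hash[node.attributes[key_attr]] = node.attributes[value_attr]
  end
end

# {:key => {:attr => key_attr}, :value => :content}: key from an attribute,
# value from the node's own text.
def hash_from_content(nodes, key_attr)
  nodes.each_with_object({}) do |node, hash|
    hash[node.attributes[key_attr]] = node.text
  end
end

xml = <<~XML
  <dictionary>
    <definitions>
      <definition word="gem">a packaged Ruby library</definition>
      <definition word="rake">a Ruby build tool</definition>
    </definitions>
  </dictionary>
XML

# :in => 'definitions' is assumed to scope the lookup to that wrapper node.
entries = REXML::Document.new(xml).root.get_elements('definitions/definition')
definitions = hash_from_content(entries, 'word')
```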