Discussion:
Hobix and Ruby1.8.6/REXML
Paul van Tilburg
2007-03-16 11:52:01 UTC
Hello all,

Something odd happened with Hobix after upgrading to Ruby 1.8.6.
Either something has changed in the REXML API or a bug has been
introduced, and it breaks the Atom feed generator. The traceback:

/usr/lib/ruby/1.8/hobix/out/atom.rb:59:in `load': undefined method
`attributes' for nil:NilClass (NoMethodError)
from /usr/lib/ruby/1.8/hobix/weblog.rb:617:in `retouch'
from /usr/lib/ruby/1.8/hobix/weblog.rb:584:in `each'
...
from /usr/bin/hobix:79

Line 59 of atom.rb is:
59: rssdoc.elements['/feed/link[@rel="alternate"]'].attributes['href']

Somehow this XPath query doesn't seem to find the appropriate XML
element anymore in the following (abbreviated) XML document:

<feed
xmlns="http://www.w3.org/2005/Atom"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xml:lang="en">
<title></title>
<link rel="alternate" type="text/html" href="" />
<link rel="self" type="application/atom+xml" href="" />
...
</feed>

However, when the xmlns attribute of the <feed/> element is removed,
everything works fine. What would be a possible fix or workaround for
this problem that neither breaks the spec nor stops it from working
with both Ruby 1.8.5 and 1.8.6?

Thanks in advance,
Paul

P.S. People on the REXML list, please Cc me, for I am not on the list.
--
Using the Power of Debian GNU/Linux <<< | GnuPG key ID: 0x50064181
MenTaLguY
2007-03-16 17:28:21 UTC
Post by Paul van Tilburg
Somehow this XPath query doesn't seem to find the appropriate XML
<feed
xmlns="http://www.w3.org/2005/Atom"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xml:lang="en">
<title></title>
<link rel="alternate" type="text/html" href="" />
<link rel="self" type="application/atom+xml" href="" />
...
</feed>
However, when the xmlns attribute of the <feed/> element is removed,
everything works fine. What would be a possible fix or workaround for
this problem that neither breaks the spec nor stops it from working
with both Ruby 1.8.5 and 1.8.6?
Hmm. I forget how REXML does namespaces, but I would imagine the query probably needs to be something more or less like:

rssdoc.elements['/atom:feed/atom:link[@rel="alternate"]', { 'atom' => "http://www.w3.org/2005/Atom" }].attributes['href']

I'm guessing the old REXML behavior which the existing atom.rb code relies upon was a bug.

(I'm also not subscribed to the rexml list, so please CC me)

-mental
MenTaLguY
2007-03-17 04:43:16 UTC
Post by MenTaLguY
=> "http://www.w3.org/2005/Atom" }].attributes['href']
It looks like we must use XPath.first here.

Paul, would you be interested in generating a patch? I've got my hands
pretty full lately.

-mental
Paul van Tilburg
2007-03-18 00:00:21 UTC
Post by MenTaLguY
Post by MenTaLguY
=> "http://www.w3.org/2005/Atom" }].attributes['href']
It looks like we must use XPath.first here.
What do you mean, exactly?
Note that we should check that this approach works for 1.8.6 too.
Post by MenTaLguY
Paul, would you be interested in generating a patch? I've got my hands
pretty full lately.
Yes, I'll try to hack something up tomorrow... I already made a quick fix
for myself yesterday, but it is quite dirty: it performs all the operations
as usual, except that the xmlns attribute is removed from the initial
document and added back at the very end. Not nice, but it works, sue me ;-)
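For anyone wanting to try the same stopgap, here is a minimal sketch of the idea. It is not the actual Hobix patch: the template string, the example.org URL, and the regular expression are assumptions made for illustration.

```ruby
require 'rexml/document'

ATOM_URI = 'http://www.w3.org/2005/Atom'

# Hypothetical stand-in for the feed template (not Hobix's real one).
template = <<~XML
  <feed xmlns="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/" xml:lang="en">
    <title></title>
    <link rel="alternate" type="text/html" href="" />
    <link rel="self" type="application/atom+xml" href="" />
  </feed>
XML

# Drop only the default namespace declaration; xmlns:dc is not matched
# because the pattern requires '=' directly after 'xmlns'.
stripped = template.sub(/\s+xmlns="[^"]*"/, '')
doc = REXML::Document.new(stripped)

# With no default namespace, the unprefixed query finds the element again.
doc.elements['/feed/link[@rel="alternate"]'].attributes['href'] = 'http://example.org/'

# Put the default namespace back on the root before serializing.
doc.root.add_namespace(ATOM_URI)
output = doc.to_s
```

The obvious downside is that the document is not valid Atom while it is being edited, so anything that inspects namespaces in between will see the wrong thing.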

Regards,
Paul
--
Using the Power of Debian GNU/Linux <<< | GnuPG key ID: 0x50064181
MenTaLguY
2007-03-18 17:18:33 UTC
Post by Paul van Tilburg
Post by MenTaLguY
Post by MenTaLguY
=> "http://www.w3.org/2005/Atom" }].attributes['href']
It looks like we must use XPath.first here.
What do you mean, exactly?
require 'rexml/xpath'

...

REXML::XPath.first( rssdoc, '/atom:feed/atom:link[@rel="alternate"]', { 'atom' => "http://www.w3.org/2005/Atom" } )

The 1.8.6 behavior is more compliant -- XPath is not supposed to match
namespaced elements without using a prefix bound to the right namespace.
Sadly, it turns out there's no way to specify namespaces when using
REXML::Elements#[], so REXML::XPath is the only way to go if you want
namespace support.
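A self-contained sketch of that lookup, for reference. The feed content and the example.org URLs are made up for illustration, and the document itself is passed as the context node:

```ruby
require 'rexml/document'
require 'rexml/xpath'

# Prefix-to-URI mapping handed to the XPath engine; the prefix name
# 'atom' is arbitrary, only the URI has to match the document.
ATOM_NS = { 'atom' => 'http://www.w3.org/2005/Atom' }

# Hypothetical feed, filled with example values.
xml = <<~XML
  <feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
    <title></title>
    <link rel="alternate" type="text/html" href="http://example.org/" />
    <link rel="self" type="application/atom+xml" href="http://example.org/index.atom" />
  </feed>
XML

rssdoc = REXML::Document.new(xml)

# Namespace-aware query: every step carries the bound prefix.
link = REXML::XPath.first(rssdoc, '/atom:feed/atom:link[@rel="alternate"]', ATOM_NS)
alternate_href = link.attributes['href']
```

Attributes such as `rel` and `href` carry no prefix in the query because unprefixed attributes are not in the default namespace.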

-mental
Paul van Tilburg
2007-03-18 20:35:17 UTC
Post by MenTaLguY
Post by Paul van Tilburg
Post by MenTaLguY
Post by MenTaLguY
=> "http://www.w3.org/2005/Atom" }].attributes['href']
It looks like we must use XPath.first here.
What do you mean, exactly?
require 'rexml/xpath'
...
The 1.8.6 behavior is more compliant -- XPath is not supposed to match
namespaced elements without using a prefix bound to the right namespace.
Sadly, it turns out there's no way to specify namespaces when using
REXML::Elements#[], so REXML::XPath is the only way to go if you want
namespace support.
The question for me is... why do we use all this searching after just
creating a template based on variables. Can't we just put these in the
string directly? I don't see the point. Well, maybe this seems (or
seemed at the time) more elegant, but with the NS code you wrote above,
it becomes QUITE a hassle to put a link in an attribute. Opinions?

Regards,
Paul
--
Using the Power of Debian GNU/Linux <<< | GnuPG key ID: 0x50064181
MenTaLguY
2007-03-18 23:25:30 UTC
Post by Paul van Tilburg
The question for me is... why do we use all this searching after just
creating a template based on variables. Can't we just put these in the
string directly?
As long as they're properly escaped, sure. I'm not too keen on the
current approach that the atom code uses really, but I've got no time
right now to rewrite it all.
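To make "properly escaped" concrete, here is a rough sketch of the direct-interpolation approach. The `atom_link` helper and the sample values are hypothetical, not taken from atom.rb, and it uses `CGI.escapeHTML` from the standard library for the escaping:

```ruby
require 'cgi'

# Hypothetical helper: builds one <link/> element, escaping every value
# so that '&', '<', '>' and quotes cannot break the markup.
def atom_link(rel, type, href)
  %(<link rel="#{CGI.escapeHTML(rel)}" type="#{CGI.escapeHTML(type)}" href="#{CGI.escapeHTML(href)}" />)
end

# Sample values chosen to contain characters that need escaping.
title = 'Tom & Jerry <blog>'
href  = 'http://example.org/?a=1&b="2"'

feed = <<~XML
  <feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
    <title>#{CGI.escapeHTML(title)}</title>
    #{atom_link('alternate', 'text/html', href)}
  </feed>
XML
```

This avoids the XPath lookups entirely, at the cost of having to remember the escaping at every interpolation point.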

-mental
