[redland-dev] Redland RSS changes (was Re: Fwd: [rss-media] Re:
Media extensions)
Dave Beckett
dave.beckett at bristol.ac.uk
Wed Jul 27 17:30:44 BST 2005
On Tue, 2005-07-26 at 17:28 +0200, Suzan Foster wrote to the rss-media
list:
> >
> > I have been tweaking the rss-tag-soup parser (librdf) which I can
> > get to do a reasonable job on multiple media:content elements and
> > loose properties [2], but can't handle the <media:group> element.
> > Nor the <media:credit>, <media:category> and <media:text> elements
> > properly yet.
This is a warning: I've made huge changes to the RSS Tag Soup parser in
CVS :) Copied to redland-dev since this is probably of interest there.
Firstly, I ripped the code into three parts - parser, serializer
(raptor_serializer_rss.c) and common (raptor_rss_common.c). The parser
now does atom 1.0 input and rewrites atom: terms into rss: ones where
they overlap. This is not complete, as it ignores atom:content for
now; that needs scraping from XML into a large RDF literal. Tedious.
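
To picture the atom: to rss: rewriting, here is a minimal sketch of a lookup table in C. It is not the code in CVS: the function name map_atom_to_rss and the struct are made up for illustration, and the atom:title pair is an assumption (the feed/channel and entry/item pairs follow the "aka" equivalences in this mail). Terms with no overlap pass through unchanged.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table of overlapping terms. The pairs are illustrative
 * assumptions, not the list used by the real parser. */
struct term_map {
  const char *atom_term;
  const char *rss_term;
};

static const struct term_map atom_to_rss[] = {
  { "atom:feed",  "rss:channel" },
  { "atom:entry", "rss:item"    },
  { "atom:title", "rss:title"   }   /* assumed overlap */
};

/* Return the rss: equivalent of an atom: term, or the term itself
 * when there is no overlap, so unknown terms pass through unchanged. */
const char *map_atom_to_rss(const char *term)
{
  size_t i;

  assert(term != NULL);
  for (i = 0; i < sizeof(atom_to_rss) / sizeof(atom_to_rss[0]); i++) {
    if (strcmp(term, atom_to_rss[i].atom_term) == 0)
      return atom_to_rss[i].rss_term;
  }
  return term;
}
```

With this sketch, map_atom_to_rss("atom:entry") gives "rss:item", while "atom:content" comes back untouched, which mirrors it being ignored for now.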
The serializer can now do atom 1.0 output separately from rss 1.0. (You
will have to configure with --enable-maintainer-mode to get this
activated.)
This means you can do (rss any|atom 0.3|atom 1.0) in and (rss 1.0|atom
1.0) out. With redland/rasqal in between you can query atom 1.0 directly
with sparql - see example #10 at http://librdf.org/query/ . Check your
mime types if you don't get what you expect.
The new common code is an internal rss_model class - this is NOT in the
public API. The model so far is:
1. the common items (channel aka atom:feed, image, textinput, ...)
2. the sequence of rss:item (aka atom:entry).
Additions like the media parts, atom:link (another sequence),
atom:content and so on would best be added to the common rss_model code
and then updated by the parser (input) and read by the serializer
(output).
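
The two-part shape of the model, with the parser writing and the serializer reading, might look roughly like the sketch below. This is not the internal rss_model class: every name here (rss_model_sketch, model_item, model_add_item, model_get_item, model_clear) is hypothetical, and the fields are assumptions chosen only to show the common items next to an ordered item sequence.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one feed object (channel, image, item, ...). */
typedef struct model_item {
  const char *type;          /* e.g. "rss:item" */
  const char *title;         /* borrowed pointer, not copied */
  struct model_item *next;   /* next entry in the item sequence */
} model_item;

/* Hypothetical model: part 1 is the common items, part 2 the sequence. */
typedef struct {
  model_item *channel;       /* common item, aka atom:feed */
  model_item *image;         /* common item, may be NULL */
  model_item *textinput;     /* common item, may be NULL */
  model_item *items;         /* head of the rss:item sequence */
  model_item *last;          /* tail, so appends keep feed order */
  int items_count;
} rss_model_sketch;

/* Parser side: append one item to the end of the sequence.
 * Returns 0 on success, non-zero if allocation fails. */
int model_add_item(rss_model_sketch *m, const char *title)
{
  model_item *it;

  assert(m != NULL);
  it = calloc(1, sizeof(*it));
  if (!it)
    return 1;
  it->type = "rss:item";
  it->title = title;
  if (m->last)
    m->last->next = it;
  else
    m->items = it;
  m->last = it;
  m->items_count++;
  return 0;
}

/* Serializer side: read the n-th item (0-based), NULL if out of range. */
const model_item *model_get_item(const rss_model_sketch *m, int n)
{
  const model_item *it;

  assert(m != NULL);
  if (n < 0)
    return NULL;
  for (it = m->items; it && n > 0; it = it->next)
    n--;
  return it;
}

/* Free the item sequence only; the common items are left to the caller. */
void model_clear(rss_model_sketch *m)
{
  model_item *it;

  assert(m != NULL);
  it = m->items;
  while (it) {
    model_item *next = it->next;
    free(it);
    it = next;
  }
  m->items = NULL;
  m->last = NULL;
  m->items_count = 0;
}
```

In a layout like this, something such as the media parts or atom:link would become another field or another sequence on the model struct, filled in by the parser and walked by the serializer.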
Dave