On Fri, Aug 28, 2009 at 5:33 PM, Vincent Fretin <[hidden email]> wrote:
> On Fri, Aug 28, 2009 at 12:40 AM, victor rajewski <[hidden email]> wrote:
>>
>> On Thu, Aug 27, 2009 at 8:04 AM, Vincent Fretin <[hidden email]> wrote:
>> > On Thu, Aug 27, 2009 at 7:40 AM, victor rajewski <[hidden email]> wrote:
>> >> I have a .pt file for creating RSS2 feeds. RSS2 is XML, but is not
>> >> HTML. In particular, the <link> tag in HTML is an empty tag (i.e.
>> >> never has any content), while in RSS2 it needs content (the URI of
>> >> some item). The problem is, i18ndude treats .pt (in fact .*pt) files
>> >> as HTML, and puts them through the HTMLTALParser, which fails when it
>> >> gets a non-empty <link> tag. This occurs around line 472 in
>> >> i18ndude/extract.py.
>> >
>> > I'm not sure, but is your link closed like this:
>> > <link attr1="" attr2="" />
>> > ?
>>
>> The link tag is not closed, because it is not meant to be. In RSS, the
>> link tag looks like:
>> <link>http://foo.bar/blah.html</link>
>> which is what my template has. However, this is not valid HTML, which
>> expects
>> <link />
>>
>> The issue is with i18ndude expecting all .*pt files to be HTML. It is
>> possible they are XML.
>>
>> Cheers,
>>
>> vik
>
> Oh yes, I understand. Sorry, I read your previous post too quickly.
>
> And if you replace HTMLTALParser with TALParser like you said, that
> solves the problem, right?
> What's the difference between the two parsers, by the way?

TALParser parses XML files; HTMLTALParser parses HTML files. I tried
getting rid of HTMLTALParser and using just TALParser, but since HTML
is not (necessarily) valid XML, this fails on some .pt files. So I
tried using python-magic to determine the file type from the content
rather than the extension. Detecting HTML works fine if the template
has <html> tags, but included templates might not, so I opted for the
reverse approach: for .*pt files it tries to work out if they are XML
and uses TALParser if so, otherwise it uses HTMLTALParser. Not a great
approach, and I don't think 'magic' is available for Windows systems,
but it worked for me.

Here is the diff to extract.py (lines marked '<' are my modified
version, lines marked '>' are the original):
458,462d457
< #for determining filetype based on more than extension
< import magic
< filetypes = magic.open(magic.MAGIC_NONE)
< filetypes.load()
<
478,481c473
< if filetypes.file(filename) == 'XML':
<     p = TALParser()
< else:
<     p = HTMLTALParser()
---
> p = HTMLTALParser()
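
If the 'magic' dependency is a problem (e.g. on Windows), a different way to make the same decision might be to test whether the template is well-formed XML using only the standard library. This is a rough sketch and not what the diff above does: the helper names `is_well_formed_xml` and `choose_parser_name` are made up for illustration, and it returns the parser's name as a string rather than touching i18ndude or zope.tal. I have not tried it inside extract.py.

```python
import xml.parsers.expat


def is_well_formed_xml(source):
    """Return True if `source` (str or bytes) parses as well-formed XML.

    Namespace processing is switched on, so a fragment that uses an
    undeclared prefix such as tal: is rejected as well.
    """
    parser = xml.parsers.expat.ParserCreate(namespace_separator=" ")
    try:
        parser.Parse(source, True)  # True: this is the final chunk
    except xml.parsers.expat.ExpatError:
        return False
    return True


def choose_parser_name(source):
    # Hypothetical helper: picks which parser class one would instantiate.
    # Well-formed XML goes to TALParser, everything else to HTMLTALParser.
    return "TALParser" if is_well_formed_xml(source) else "HTMLTALParser"


# An RSS2-style template with a non-empty <link> element.
rss = "<rss><channel><link>http://foo.bar/blah.html</link></channel></rss>"
# Typical HTML with an unclosed <link> and <br>, which is not valid XML.
html = "<html><head><link rel='x' href='y'></head><body><br></body></html>"

print(choose_parser_name(rss))
print(choose_parser_name(html))
```

The trade-off is that an HTML template which happens to be well-formed XML would also be sent to TALParser, so this may need the same kind of fallback as the magic-based version.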
ciao
vik
_______________________________________________
Plone-i18n mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/plone-i18n