I have installed TextIndexNG3 v3.2.8, which is working very well. However, I have a question about special characters. When I go to /Plone/portal_catalog/Indexes/SearchableText and browse the words under a letter, say b, I get words with 'special' characters:
búsquedas
bürointerne
būtiskajām
Yet when I click on one of these words, I get an error saying that the 'ascii' codec cannot decode the character:
Exception Type UnicodeDecodeError
Exception Value 'ascii' codec can't decode byte 0xc5 in position 1: ordinal not in range(128)
Has anyone seen this before, and does anyone have advice on how to either access these special characters or fix the index so it doesn't use them?
Regards
Kees
kees04
Re: TextIndexNG3 - Query on special characters
In addition, here is how my searchable text is configured.
TextIndexNG3 at /Plone/portal_catalog/Indexes/SearchableText
# indexed documents: 17794
# indexed words: 1735369
Languages: en
Fields: SearchableText
Default encoding: utf-8
Additional characters recognized by the splitter as part of a word: _-
Splitter: txng.splitters.simple
Stemming: False
Autoexpand: off
Autoexpand limit: 4
Parser: txng.parsers.en
Casefolding: True
Storage: txng.storages.term_frequencies
Dedicated storages: False
Ranking: True
Normalizer: False
Stopwords: False
Index unknown languages: True
ajung
Re: TextIndexNG3 - Query on special characters
--On 19. August 2008 01:21:46 -0700 kees04 <[hidden email]> wrote:
>
> Hi,
>
> I have installed TextIndexNG3 v3.2.8, which is working very well. I
> however have a query about special characters. When I go to
> /Plone/portal_catalog/Indexes/SearchableText and look through a letter,
> say b, I get words with 'special' characters
>
> búsquedas
> bürointerne
> būtiskajām
>
> Yet when I click on one of these words I get an error which tells me that
> ascii cannot decode the character.
>
> Exception Type UnicodeDecodeError
> Exception Value 'ascii' codec can't decode byte 0xc5 in position 1:
> ordinal not in range(128)
Provide the full traceback please. Likely only a UI problem. The UI
functionality does not affect the backend functionality.
-aj
Hi AJ,
Here is the traceback log
Traceback (innermost last):
  Module ZPublisher.Publish, line 119, in publish
  Module ZPublisher.mapply, line 88, in mapply
  Module ZPublisher.Publish, line 42, in call_object
  Module Products.Five.browser.metaconfigure, line 417, in __call__
  Module Shared.DC.Scripts.Bindings, line 313, in __call__
  Module Shared.DC.Scripts.Bindings, line 350, in _bindAndExec
  Module Products.PageTemplates.PageTemplateFile, line 129, in _exec
  Module Products.CacheSetup.patch_cmf, line 120, in PT_pt_render
  Module zope.tal.talinterpreter, line 271, in __call__
  Module zope.tal.talinterpreter, line 346, in interpret
  Module zope.tal.talinterpreter, line 855, in do_condition
  Module zope.tal.talinterpreter, line 346, in interpret
  Module zope.tal.talinterpreter, line 536, in do_optTag_tal
  Module zope.tal.talinterpreter, line 521, in do_optTag
  Module zope.tal.talinterpreter, line 516, in no_tag
  Module zope.tal.talinterpreter, line 346, in interpret
  Module zope.tal.talinterpreter, line 586, in do_setLocal_tal
  Module zope.tales.tales, line 696, in evaluate
   - URL: index
   - Line 8, Column 2
   - Expression: <PathExpr standard:'context/@@documents_for_word'>
   - Names:
      {'container': <TextIndexNG3 at /Plone/portal_catalog//SearchableText>,
       'context': <TextIndexNG3 at /Plone/portal_catalog//SearchableText>,
       'default':
ajung
Re: TextIndexNG3 - Query on special characters
--On 19. August 2008 01:48:02 -0700 kees04 <[hidden email]> wrote:
>
>
> Andreas Jung-5 wrote:
>>
>> Provide the full traceback please. Likely only a UI problem. The UI
>> functionality does not affect the backend functionality.
>>
>
Hm, I cannot reproduce this error, nor can I find a flaw in the code.
How can this be reproduced with a bare Plone 3 instance?
Andreas
I'm not sure how you'd reproduce this problem on your site.
I have added an external filesystem via reflector, which hosts all of our documentation. Do you think the problem may lie there?
My Plone site is hosted on a Linux server, and the external documentation is hosted on a Windows server which is mounted on Linux via NFS.
Thanks
ajung
Re: TextIndexNG3 - Query on special characters
--On 19. August 2008 02:38:05 -0700 kees04 <[hidden email]> wrote:
>
>
>
> Andreas Jung-5 wrote:
>>
>>
>> Hm..I can not reproduce this error nor can I figure out a flaw in the
>> code.
>> How can this be reproduced with a bare Plone 3 instance?
>>
>> Andreas
>>
>>
>
> I'm not sure how you'd reproduce this problem on your site.
> I have added in an external filesystem via reflector, which hosts all of
> our documentation, do you think the problem may lie here?
Hm, sorry, no idea. I need something concrete in my hands in order to
investigate further.
kees04 wrote at 2008-8-19 01:48 -0700:
> ...
> Module Products.TextIndexNG3.browser, line 82, in documents_for_word
> Module textindexng.lexicon, line 106, in getWordId
>UnicodeDecodeError: 'ascii' codec can't decode byte 0xc5 in position 1:
>ordinal not in range(128)
I read this as: the lexicon is mixing unicode and "str" together.
I would try to reproduce this problem in an interactive Python interpreter
("bin/zopectl debug" under *nix), then use "pdb.pm()" to analyse:
* the parameter passed to "getWordId" (it is likely an "str")
* the lexicon content (likely to be "unicode").
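One detail fits this reading: the UTF-8 encoding of "būtiskajām" has the byte 0xc5 at position 1, exactly as in the reported error. Below is a minimal sketch of that check. It is written for Python 3, where the Python 2 "str" is modelled as bytes and the implicit ASCII decode is made explicit; the helper name "ascii_decode_error" is only illustrative and not part of TXNG.

```python
# Python 3 sketch: Python 2's "str" (byte string) is modelled as bytes here.
word = "b\u016btiskaj\u0101m"   # the indexed word, as unicode text
raw = word.encode("utf-8")      # the same word as a UTF-8 byte string


def ascii_decode_error(data):
    """Return the message of the ASCII decode failure, or None on success."""
    try:
        data.decode("ascii")    # what Python 2 does implicitly when mixing types
    except UnicodeDecodeError as exc:
        return str(exc)
    return None


message = ascii_decode_error(raw)
print(hex(raw[1]))  # second byte of the UTF-8 form
print(message)
```

If the message printed here matches the one in the traceback, a UTF-8 encoded byte string reaching "getWordId" is a plausible candidate to look for in the debugger.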
> kees04 wrote at 2008-8-19 01:48 -0700:
>> ...
>> Module Products.TextIndexNG3.browser, line 82, in documents_for_word
>> Module textindexng.lexicon, line 106, in getWordId
>> UnicodeDecodeError: 'ascii' codec can't decode byte 0xc5 in position 1:
>> ordinal not in range(128)
>
> I read this as: the lexicon is mixing unicode and "str" together.
Never!
The lexicon of TXNG has a dedicated check for unicode strings.
--On 26. August 2008 00:54:54 -0700 kees04 <[hidden email]> wrote:
>
> On a slight side note, I have installed Flash Player 2.1 and uploaded a
> few FLV video files.
>
> Is it possible to have these indexed and searchable? Or is it not possible
> due to them being video files?
Video files and TextIndexNG? Makes no sense to me. For File content, TXNG
will only index the textual metadata.
Andreas
I know you cannot index the video file itself, as it's a text index. However, I would like to be able to index the name and description I have given to the file. Is that possible?
To elaborate further, I want to upload 'How To' videos for our users and have them searchable in our Plone site. I have tried adding them to a category, but this has not worked. Any advice on this?
ajung
Re: TextIndexNG3 - Query on special characters
--On 26. August 2008 01:06:31 -0700 kees04 <[hidden email]> wrote:
>
>
>
> Andreas Jung-5 wrote:
>>
>>
>> Video files and TextIndexNG? Makes no sense to me. For File content,
>> TXNG will only index the textual metadata.
>>
>> Andreas
>>
>>
>
> I know you cannot index the video file itself, as it's a text index.
> However I would like to be able to index the text name and description I
> have given to the file, is that possible?
>
--On 26. August 2008 01:23:04 -0700 kees04 <[hidden email]> wrote:
>
>
>
> Andreas Jung-5 wrote:
>>
>> Please read my reply once again :-)
>>
>> -aj
>>
>>
>>
>
> Ok thanks for your help, is there a package out there that you know would
> help me with my requirements?
You just have to read the TXNG Readme. It explains how to integrate TXNG
with other content-types.
-aj
I have just read the Readme.txt and found this section
How to make your custom content-types searchable
================================================
Most current Zope index implementations are built on the fact that an
index with id XX tries to lookup the indexable content either from an objects
XX attribute or by calling the method XX() of the object. Although TextIndexNG
V3 still supports this behaviour, the recommended way to make custom types
indexable through TXNG3 is through providing dedicated methods that return
indexable content. The API of these methods is defined in
src/textindexng/interfaces/indexable.py. Custom types must either implement the
IIndexableContent API directly or provide the interface through an adapter
registered through ZCML. The IndexContentCollector class should be used to
return indexable content either as unicode string or as binary stream (to be
transformed through external converters). Some example how to use the
indexing API can be found in src/textindexng/tests/mock.py (see classes
Mock, MockPDF and StupidMockAdapter)
Is this the section you are referring to? If so, I don't quite understand what I need to do. Would you be able to help?
ajung
Re: TextIndexNG3 - Query on special characters
--On 26. August 2008 01:48:18 -0700 kees04 <[hidden email]> wrote:
>
>
> Andreas Jung-5 wrote:
>>
>>
>> You just have to read the TXNG Readme. It explains you how to integrate
>> TXNG
>> with other content-types.
>>
>> -aj
>>
>
> I have just read the Readme.txt and found this section
>
>
>
>> How to make your custom content-types searchable
>> ================================================
>>
>> Most current Zope index implementations are built on the fact that an
>> index with id XX tries to lookup the indexable content either from an
>> objects
>> XX attribute or by calling the method XX() of the object. Although
>> TextIndexNG
>> V3 still supports this behaviour, the recommended way to make custom
>> types indexable through TXNG3 is through providing dedicated methods
>> that return indexable content. The API of these methods is defined in
>> src/textindexng/interfaces/indexable.py. Custom types must either
>> implement the
>> IIndexableContent API directly or provide the interface through an
>> adapter registered through ZCML. The IndexContentCollector class should
>> be used to return indexable content either as unicode string or as
>> binary stream (to be
>> transformed through external converters). Some example how to use the
>> indexing API can be found in src/textindexng/tests/mock.py (see classes
>> Mock, MockPDF and StupidMockAdapter)
>>
>
> Is this the section you are refering to? If so I don't quite understand
> what I need to do? Would you be able to help.
This section refers to basic Zope 3 technology such as adapters and components.
Sorry, but I won't explain Zope 3 technology here. You have to refer to the
related documentation, such as Philipp von Weitershausen's Zope 3 book,
or check the related unit tests in the TXNG 3 source code.
You need to know the basic Zope 3 concepts in order to proceed. Sorry, you
have to learn them.
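For orientation only, the adapter idea described in the Readme section can be sketched in plain Python without any Zope imports. Everything below is a hypothetical stand-in: "ContentCollector" merely plays the role of IndexContentCollector, and "HowToVideo", "VideoIndexableAdapter" and "indexable_content" are invented names. The real method names and signatures are the ones defined in src/textindexng/interfaces/indexable.py and shown in src/textindexng/tests/mock.py, and a real adapter would be registered through ZCML.

```python
class ContentCollector(object):
    """Hypothetical stand-in for TXNG's IndexContentCollector (not its real API)."""

    def __init__(self):
        self.entries = []

    def add_text(self, field, text, language):
        # Accept only unicode text, following the Readme's "unicode string" rule.
        if not isinstance(text, str):
            raise TypeError("indexable text must be a unicode string")
        self.entries.append((field, text, language))


class HowToVideo(object):
    """Hypothetical content type: a video file plus textual metadata."""

    def __init__(self, title, description, data):
        self.title = title
        self.description = description
        self.data = data  # binary video payload, never handed to the index


class VideoIndexableAdapter(object):
    """Hypothetical adapter that exposes only the metadata of a HowToVideo."""

    def __init__(self, context):
        self.context = context

    def indexable_content(self, fields, language="en"):
        collector = ContentCollector()
        text = " ".join([self.context.title, self.context.description])
        for field in fields:
            if field == "SearchableText":
                collector.add_text(field, text, language)
        return collector


video = HowToVideo("How to reset a password", "Step-by-step screencast", b"FLV\x01")
collected = VideoIndexableAdapter(video).indexable_content(["SearchableText"])
print(collected.entries)
```

The point of the pattern is that the index asks the adapter, not the video object, for text, so the binary data stays out of the index while title and description become searchable.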
>
>
>--On 23. August 2008 13:27:06 +0200 Dieter Maurer <[hidden email]>
>wrote:
>
>> kees04 wrote at 2008-8-19 01:48 -0700:
>>> ...
>>> Module Products.TextIndexNG3.browser, line 82, in documents_for_word
>>> Module textindexng.lexicon, line 106, in getWordId
>>> UnicodeDecodeError: 'ascii' codec can't decode byte 0xc5 in position 1:
>>> ordinal not in range(128)
>>
>> I read this as: the lexicon is mixing unicode and "str" together.
>
>Never!
>
>The lexicon of TXNG has a dedicated check for unicode strings.
The traceback tells us without any doubt:
In line 106, "getWordId" fails to decode an "str" to "unicode"
using the "ascii" encoding.
This means:
* there is some "str" and some "unicode" mixed together in
"lexicon.Lexicon.getWordId".
* the "str" can come from the caller ("documents_for_word")
or from the lexicon itself.
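If the "str" turns out to come from the caller, one common remedy is to decode it to unicode with the index's default encoding (utf-8 in the configuration posted earlier) before the lookup. The following is a minimal sketch of that idea in Python 3, where the Python 2 "str" corresponds to bytes; the dict-based lexicon and the names "to_text" and "get_word_id" are illustrative assumptions, not TXNG code.

```python
def to_text(value, encoding="utf-8"):
    """Decode byte strings to unicode text; pass unicode text through unchanged."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Stand-in lexicon mapping unicode words to word ids (not the TXNG data structure).
lexicon = {"b\u016btiskaj\u0101m": 42}


def get_word_id(word):
    """Look up a word id after normalising the argument to unicode."""
    return lexicon.get(to_text(word))


# Both the byte-string form and the unicode form resolve to the same entry.
print(get_word_id(b"b\xc5\xabtiskaj\xc4\x81m"))
print(get_word_id("b\u016btiskaj\u0101m"))
```

Normalising at the boundary keeps the lexicon itself purely unicode, so no implicit ASCII decode can be triggered inside the lookup.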