[Discussion] Annotations improvements (wider annotations target) and roadmap update

Anca Luca

Hi devs,

Following a discussion with Fabio about the second desired feature for
annotations, namely the ability to add annotations on any document no matter
how its content is generated, we came up with the solution described at
http://dev.xwiki.org/xwiki/bin/view/Design/AnnotationFeature#HSolution1storeannotationsasselectionandcontextovertransformeddocuments
The main idea is that an annotation would be defined by its selected text
and a context (as opposed to offsets), and would be located for rendering on
a serialization of the transformed XDOM of the document, this way
taking into account any macro rendering, document inclusion, etc.

WDYT about this solution?

Also, because the implementation of this, though relatively localized, comes
together with a refactoring and cleanup of the annotations module (updating
everything so that annotations don't store and use offsets anymore, removing
classes & functions which are not needed in this simplified process), I propose
to include this improvement in version 1.0 of the annotations module (so that we
don't clean up and release what we know for sure we'll delete) and to push the
1.0 release back to mid-to-late December.

Here's my +1 for this,
WDYT?


Happy coding,
Anca

_______________________________________________
devs mailing list
[hidden email]
http://lists.xwiki.org/mailman/listinfo/devs
eduard.moraru

Re: [Discussion] Annotations improvements (wider annotations target) and roadmap update

Hi Anca,

Here's my understanding of what you suggest:

Document text:
"word1 word2 word3 word4 word5 word6."

Annotation on "word4" results in:
- Selection: "word4" (unique within the context)
- Context: "word3 word4 word5" (unique within this document)

Then all you need to do to mark this annotation is to locate the
document-unique context and then mark the selection within the
document area defined by the context.
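
A minimal sketch of this two-step lookup, assuming the rendered document is
available as one plain string. The function name `locate` and the choice to
return None for a missing or ambiguous context are illustrative assumptions,
not part of the proposal on the design page:

```python
from typing import Optional, Tuple


def locate(document: str, context: str, selection: str) -> Optional[Tuple[int, int]]:
    """Return the (start, end) span of the selection in the document, or None.

    Hypothetical helper: step 1 finds the context, step 2 finds the
    selection inside it.
    """
    ctx_start = document.find(context)
    if ctx_start == -1:
        return None  # the context is not present in this rendering
    if document.find(context, ctx_start + 1) != -1:
        return None  # the context is not unique in the document
    sel_in_ctx = context.find(selection)
    if sel_in_ctx == -1:
        return None  # malformed annotation: selection not inside its context
    start = ctx_start + sel_in_ctx
    return (start, start + len(selection))


doc = "word1 word2 word3 word4 word5 word6."
span = locate(doc, "word3 word4 word5", "word4")
print(span, doc[span[0]:span[1]])
```

With the example above the span covers exactly "word4". On a document made
only of repeated "word4" tokens, the same call returns None because the
context occurs more than once.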

I see 2 major issues here:

1. The selection must itself be unique within the context, otherwise you
just have the same big problem at a smaller scale. This directly
restricts the context's size.
2. How can the document-uniqueness of a context be ensured?
  - A fixed-size context is not that practical.
  - Computing the context size at creation time in the JS client
becomes a must, but if it fails to identify such a unique context, you
risk having the whole document as context. Example document text: "word4
word4 word4 word4 word4 word4"
  - The matching is done *after* the dynamic part of the document
finishes executing. That dynamic part could potentially generate a copy
of the context and confuse the matching algorithm.

Maybe I misunderstood the proposal or missed some key detail; if so,
please let me know.

P.S.: I like examples :)

Thanks,
Eduard

Anca Luca

Re: [Discussion] Annotations improvements (wider annotations target) and roadmap update

Hi Edi,

On 10/30/2009 06:16 PM, Eduard Moraru wrote:

> Hi Anca,
>
> Here's my understanding of what you suggest:
>
> Document text:
> "word1 word2 word3 word4 word5 word6."
>
> Annotation on "word4" results in:
> - Selection: "word4" (unique within the context)
> - Context: "word3 word4 word5" (unique within this document)
>
> Then all you need to do to mark this annotation, is to locate the
> document-unique context and then to mark the selection within the
> document-area defined by the context.
>
> I see 2 major issues here:
>
> 1. The selection must be itself unique within the context, otherwise you
> just have the same big problem, at a smaller scale. This directly
> restricts the context's size.

Either use an offset for the position of the selection within the context, or
use contextLeft & contextRight (the whole context then becomes contextLeft +
selection + contextRight). Neither option constrains the context size.
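
As an illustration of the contextLeft / contextRight variant, here is a small
sketch. The `Annotation` class and the `selection_span` helper are
hypothetical names chosen for this example, and the rendered document is
assumed to be a plain string:

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Annotation:
    # Hypothetical model: the stored context is split around the selection.
    context_left: str
    selection: str
    context_right: str

    @property
    def context(self) -> str:
        return self.context_left + self.selection + self.context_right


def selection_span(document: str, ann: Annotation) -> Optional[Tuple[int, int]]:
    """Locate the selection using the split context; None if the context is absent."""
    ctx_start = document.find(ann.context)
    if ctx_start == -1:
        return None
    # The selection position follows from the length of the left context,
    # so the selection does not need to be unique inside the context.
    start = ctx_start + len(ann.context_left)
    return (start, start + len(ann.selection))


doc = "the cat and the dog"
ann = Annotation(context_left="the cat and ", selection="the", context_right=" dog")
print(selection_span(doc, ann))
```

Here the selection "the" appears twice inside its own context, yet the
returned span points at the second occurrence, because the position is
derived from the left context rather than from a text search.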

> 2. How can the document-uniqueness of a context be ensured?
>    - Fixed size context is not that practical.

Why?
It should work well in most real-life cases. My take is that a frame of 300-500
characters doesn't repeat in a regular document created in a wiki (of
course you can build test cases where it fails).

>    - Computing the context size at creation time in the JS client
> becomes a must, but if it fails to identify such a unique context, you
> risk having the whole document as context. Example document text: "word4
> word4 word4 word4 word4 word4"

If the whole document is not megabytes (in which case you should be able to find
a shorter context), I don't see why that is a problem.
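
To make the "grow the context until it is unique" idea concrete, here is a
rough sketch. The helper names and the symmetric, character-by-character
growth strategy are assumptions made for illustration; the client could just
as well grow by words or sentences:

```python
from typing import Tuple


def count_occurrences(text: str, needle: str) -> int:
    """Count occurrences of needle in text, including overlapping ones."""
    count, pos = 0, text.find(needle)
    while pos != -1:
        count += 1
        pos = text.find(needle, pos + 1)
    return count


def minimal_unique_context(document: str, start: int, end: int, step: int = 1) -> Tuple[int, int]:
    """Widen [start, end) on both sides until that slice occurs only once.

    Always terminates: in the worst case the slice becomes the whole
    document, which occurs exactly once in itself.
    """
    left, right = start, end
    while count_occurrences(document, document[left:right]) != 1:
        left = max(0, left - step)
        right = min(len(document), right + step)
    return left, right


doc = "word4 word4 word4 word4 word4 word4"
left, right = minimal_unique_context(doc, 12, 17)  # third "word4"
print(left, right, len(doc))
```

On this degenerate document the context has to grow to cover nearly the
entire text before it becomes unique, which is the worst case discussed
above. On ordinary prose the loop would normally stop after a few steps.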

>    - The matching is done *after* the dynamic part of the document
> finishes to execute. That dynamic part could potentially generate a copy
> of the context and confuse the matching algorithm.

Well, the whole point is to let the dynamic part execute and to store
annotations as text, so that annotations are defined by what the user sees.
This also ensures that an annotation can be displayed even if its position
moves from one execution to another. For example, take a scripted document
with a part which is displayed only to admins:
- If an annotation falls inside that part, it matches for admins and not for
regular users, which is ok: since the text doesn't appear, there's nothing the
user expects to be annotated.
- If an annotation falls outside the admin-only content, matching by text (and
not by position) allows us to find and display it even if its position moves
because of the text rendered only for admins.
- If it's half-half, it is displayed for admins because the content is there,
and not for regular users because the content isn't.

For the particular case of a dynamic part generating copies of the context, I'm
reiterating the idea that a couple of hundred characters should work in normal
cases. If a dynamic part duplicates the whole document, the only
problem would be that the annotation is matched and displayed on the first
occurrence of its context and not the second (is that a problem for what the
user sees and perceives? We could also display it twice). Also, this is a
particular case ("normal" documents shouldn't do that).

Of course there would be annotations which will fail to be matched and
represented but, if few enough and in particular enough cases, I think it's a
good tradeoff.

Note that I don't know yet how this performs in practice, but I think it's a
good direction to try as a balance between practical performance and speed of
implementation, and it's the only one so far (well, there's also the current
implementation but that only covers a subset of what this implementation would
cover).


>
> Maybe I misunderstood the proposal or missed some key detail, otherwise,
> please let me know.

The idea is that this algorithm is changeable in its key points, like detecting
and matching context. This first take is supposed to perform well *in practice*,
even if theoretically there are cases where it fails. If these failures make it
unusable or we decide we want atomic precision, then we can improve the key
points (like trying to match context better and others).


Thanks,
Anca

Enygma-3

Re: [Discussion] Annotations improvements (wider annotations target) and roadmap update

Context left and context right could be an idea.

However, what do you do about the static size of the context when, for
example, you have a 500 character document and you make only 2 annotations?
That results in storing 2x300 = 600 characters in just 2 annotations. That
is already duplicating the document's content in size. If you make
additional annotations, you duplicate the document several times.

The part where annotations appear depending on user rights sounds cool, but
how can you detect when the dynamic content changes and fix your
annotations (like you do for static content)?

While I'm not convinced about this approach, you may be right. Compared
with the existing one (which I did not take the time to understand in detail
as you have), and given the other issues you underlined in your reply, it sounds
like a start.

Thanks,
Eduard

Fabio Mancinelli-4

Re: [Discussion] Annotations improvements (wider annotations target) and roadmap update


On Oct 31, 2009, at 12:26 PM, Eduard Moraru wrote:

> Context left and context right could be an idea.
>
> However, what do you do about the static size of the context when, for
> example, you have a 500 character document and you make only 2  
> annotations?
> That results in storing 2x300 = 600 characters in just 2  
> annotations. That
> is already duplicating the document's content in size. If you make
> additional annotations, you duplicate the document several times.
>
That's a tradeoff of course...

The underlying problem is that many XWiki documents have the content
users finally see generated in some way (think of a blog post, or the
Watch news coming from objects).

We need to find a way to map what the user sees to where it comes from.

Now, since XWiki allows you to display everything using several
Turing-complete mechanisms (Groovy scripts, Velocity, etc.), making this
mapping implies being able to understand what's coded in a page (not
possible), or forcing the author to somehow "mark" the source of the
content in their script (impractical), or giving the user a constrained
scripting language where this information is made explicit (limiting).

The solution is to apply heuristics in order to retrieve annotations
in the text the users really see: we called this the "Canonical
Representation", which basically corresponds to the XDOM after the
transformations and before the rendering. In this way we don't really
care where the annotated content comes from. As long as it's there and
we are able to recognize and locate it, we can display it as annotated
content. If we are unable to do so then we simply don't display the
annotation.

Now the problem is: what are reasonable heuristics that work in the
most common cases? (80/20 rule) We proposed one.

> The part where annotations appear depending on user rights, sounds  
> cool, but
> how can you detect when the dynamic content changes and fix your
> annotations? (like you do for static content)
>
Again, heuristics.

In the case when you have no generated content (what the user sees is
all contained in a single page) you can rely on a diff against the
previous version of the page to understand what happened
(adjusting annotations accordingly). This, imho, should work perfectly.

In the case of generated content you might not be able to do a diff
(because you don't know where the content came from, and consequently
what changed) but you can still do some smart things in
order to "guess" what happened to your annotation. And if you are
unable to make this guess, then you display in a box that there are
"stale" annotations that were there before and that cannot be placed
anymore.
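
A rough sketch of the placed / stale split described above, assuming each
annotation is a plain dict with hypothetical "id", "selection" and "context"
keys, and that the canonical representation has been serialized to a string.
The "guessing" step is reduced here to an exact context search, which is only
the simplest possible heuristic:

```python
from typing import Dict, List, Tuple


def place_annotations(
    rendered_text: str, annotations: List[Dict[str, str]]
) -> Tuple[List[Tuple[str, int]], List[str]]:
    """Split annotations into placed (id, selection start) and stale (id)."""
    placed: List[Tuple[str, int]] = []
    stale: List[str] = []
    for ann in annotations:
        ctx_start = rendered_text.find(ann["context"])
        sel_in_ctx = ann["context"].find(ann["selection"])
        if ctx_start == -1 or sel_in_ctx == -1:
            # Cannot be located in this rendering: report it as stale
            # instead of guessing a position.
            stale.append(ann["id"])
        else:
            placed.append((ann["id"], ctx_start + sel_in_ctx))
    return placed, stale


text = "Hello admin panel. Public text here."
anns = [
    {"id": "a1", "selection": "Public", "context": "Public text"},
    {"id": "a2", "selection": "secret", "context": "the secret note"},
]
placed, stale = place_annotations(text, anns)
print(placed, stale)
```

The stale list is what would feed the box of annotations that can no longer
be placed.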

> While I'm not convinced about this approach, you may be right and,  
> comparing
> with the existing one (which I did not take the time to understand  
> in detail
> as you had), and other issues which you underlined in your reply, it  
> sounds
> like a start.
>
It's surely a start. But what was clear during our discussions with
Anca was that offsets are brittle and cumbersome when content comes
from different sources: if you want to use offsets and annotate a
blog post, for example, you should be able to say that the annotation
starts at offset X of the field Y of the object Z on the page P (or
any variant of this for any possible content source). Who gives you
this information if all that you can see in the requested page is a
#include('something')? How could you encode this information in a
standard way? Far too complicated.

Since we have a lot of use cases of this type (blog posts, Watch feeds,
and in general data coming from objects and displayed using
general-purpose languages) we should think about another, simpler solution.

The proposed one is not perfect (this is the price to pay for having
such a powerful wiki platform that allows you to do whatever is
computable) but it should work nicely in most cases. As I said before,
it should correctly cover all the cases where documents are
self-contained (i.e., all the use cases where the current annotation system
works).

Returning to your remark at the beginning about the storage... That's
a tradeoff. Certainly, the more data we store, the higher the
degree of correctness in the dynamic cases.

Hope that this clarified a little bit what we are trying to achieve.

Anyway, if you have more ideas/comments don't hesitate.

-Fabio

P.S.: Offsets could be useful in the heuristics too, and we could
continue to store them as well. In fact they could help to locate,
more or less, where the annotation was made. But they should only give
a hint, not precise information.
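
One possible way to use a stored offset purely as a hint, sketched under the
assumption that the rendered document is a plain string: when the context
occurs several times, pick the occurrence closest to where the annotation was
originally made. The function names are hypothetical:

```python
from typing import List, Optional


def find_all(text: str, needle: str) -> List[int]:
    """Return the start index of every occurrence of needle in text."""
    positions, pos = [], text.find(needle)
    while pos != -1:
        positions.append(pos)
        pos = text.find(needle, pos + 1)
    return positions


def locate_with_hint(text: str, context: str, hint_offset: int) -> Optional[int]:
    """Pick the occurrence of context nearest to the stored offset hint."""
    candidates = find_all(text, context)
    if not candidates:
        return None  # text matching failed; the hint alone is never trusted
    return min(candidates, key=lambda p: abs(p - hint_offset))


text = "see note. intro text. see note. more text. see note."
print(locate_with_hint(text, "see note.", hint_offset=20))
```

The offset never decides on its own whether an annotation exists; it only
breaks ties between text matches.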


Sergiu Dumitriu-2

Re: [Discussion] Annotations improvements (wider annotations target) and roadmap update

In reply to this post by Anca Luca
The problem with a context is that you must make it long enough to
properly determine the right position, and short enough not to require
too much storage space.

How about XPath expressions for the start and end positions on the XDOM?
This could fail for dynamic queries, where the number and position of
entries would change. But do we want to annotate this kind of elements
anyway? Like, why would somebody make annotations on the results of a
search?
--
Sergiu Dumitriu
http://purl.org/net/sergiu/
Fabio Mancinelli-4

Re: [Discussion] Annotations improvements (wider annotations target) and roadmap update


On Oct 31, 2009, at 9:21 PM, Sergiu Dumitriu wrote:

>
> The problem with a context is that you must make it long enough to
> properly determine the right position, and short enough not to require
> too much storage space.
>
> How about XPath expressions for the start and end positions on the  
> XDOM?
> This could fail for dynamic queries, where the number and position of
> entries would change. But do we want to annotate this kind of elements
> anyway? Like, why would somebody make annotations on the results of a
> search?

I don't think XPath is going to work.
Imagine that your XPath targets a node that is a "script". Then you
need an offset to identify the annotated content span. But the
content of this node could be anything once expanded: it depends, in
fact, on the outcome of the script (I am not talking about query
results). So you end up with the original "offset" problem you have
with dynamic content. Before you had this at the page level, now you
have it at the node level.

Unless I misunderstood your solution.

Anyway, is this context storage issue a real problem? I mean, even
if we put 1K of text into each annotation, is it really something that we
should avoid at all costs? (I agree, though, that it could be
redundant and inefficient.)

-Fabio
Anca Luca

Re: [Discussion] Annotations improvements (wider annotations target) and roadmap update

In reply to this post by Sergiu Dumitriu-2
Hi Sergiu, see below

On 10/31/2009 10:21 PM, Sergiu Dumitriu wrote:

> The problem with a context is that you must make it long enough to
> properly determine the right position, and short enough not to require
> too much storage space.

I agree, and that is why this algorithm should be easy to improve and adapt
as needed.

>
> How about XPath expressions for the start and end positions on the XDOM?

I think XPath expressions are almost as rigid as offsets. Imagine you have a
text displayed for one user but not for others, or a container of some sort.
Although the annotated text doesn't fall in this content, its presence in the
XDOM can influence the node resolution for the stored path and cause an
annotation to fail to display even though the annotated text (the selection) is
still there.
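
A small illustration of this fragility. The XDOM is not an XML document, so
the two XML snippets below are only stand-ins (an assumption made for this
example), and the positional path "./p[2]" is a hypothetical stored
annotation target:

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Stand-ins for the same page rendered for a regular user and for an admin.
regular_view = ET.fromstring("<doc><p>Intro</p><p>Annotated text</p></doc>")
admin_view = ET.fromstring(
    "<doc><p>Admin-only notice</p><p>Intro</p><p>Annotated text</p></doc>"
)

stored_path = "./p[2]"  # path recorded when the annotation was created


def text_at(root: ET.Element, path: str) -> Optional[str]:
    """Resolve a positional path and return the node text, if any."""
    node = root.find(path)
    return None if node is None else node.text


def contains_text(root: ET.Element, selection: str) -> bool:
    """Text-based matching: is the selection present anywhere in the rendering?"""
    return selection in "".join(root.itertext())


print(text_at(regular_view, stored_path))  # resolves to the annotated paragraph
print(text_at(admin_view, stored_path))    # resolves to a different paragraph
print(contains_text(admin_view, "Annotated text"))
```

The extra admin-only paragraph shifts the positional path onto another node,
while a search by text still finds the selection in both renderings.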

Also, right now, I don't like the idea of coupling the annotation model
with the XDOM one. I think an annotation should make sense on its own
(regardless of the other mechanisms behind a document rendering, an annotation
is just some selected text with some metadata).

Which makes me think, this proposal about rendering annotations on the
transformed XDOM would only work for syntax 2.0. Is that an issue at this point?
Do we want to give up this solution because of this or find a workaround when
we'd need to for the 1.0 syntax documents?

Thanks,
Anca

Sergiu Dumitriu-2

Re: [Discussion] Annotations improvements (wider annotations target) and roadmap update

On 11/02/2009 12:13 AM, Anca Luca wrote:

> Also, right now, I don't tend to like the idea of coupling the annotation model
> with the XDOM one, I think an annotation should make sense on its own
> (regardless of the other mechanisms behind a document rendering, an annotation
> is just some text selected with some meta infos).

Completely agree.
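
A minimal sketch of what such a standalone annotation record could look like, assuming the contextLeft / selection / contextRight split discussed elsewhere in this thread. The class name and the `author` and `comment` fields are hypothetical and are not taken from the annotations module:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """Hypothetical record: plain text plus meta infos, no XDOM or offsets."""

    selection: str      # the text the user selected
    context_left: str   # text immediately before the selection
    context_right: str  # text immediately after the selection
    author: str         # assumed meta info
    comment: str        # assumed meta info

    @property
    def context(self) -> str:
        # The whole context is contextLeft + selection + contextRight.
        return self.context_left + self.selection + self.context_right
```

Nothing in this record refers to a node path or a character offset, so it stays meaningful however the document is rendered.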

> Which makes me think, this proposal about rendering annotations on the
> transformed XDOM would only work for syntax 2.0. Is that an issue at this point?
> Do we want to give up this solution because of this or find a workaround when
> we'd need to for the 1.0 syntax documents?
>
> Thanks,
> Anca
>
>> This could fail for dynamic queries, where the number and position of
>> entries would change. But do we want to annotate this kind of elements
>> anyway? Like, why would somebody make annotations on the results of a
>> search?


--
Sergiu Dumitriu
http://purl.org/net/sergiu/
_______________________________________________
Anca Luca

Re: [Discussion] Annotations improvements (wider annotations target) and roadmap update

In reply to this post by Enygma-3
Hi Edi,

see below (although Fabio may already have made some of the same arguments in
the other mail)

On 10/31/2009 01:26 PM, Eduard Moraru wrote:

> Context left and context right could be an idea.
>
> However, what do you do about the static size of the context when, for
> example, you have a 500 character document and you make only 2 annotations?
> That results in storing 2x300 = 600 characters in just 2 annotations. That
> is already duplicating the document's content in size. If you make
> additional annotations, you duplicate the document several times.
>
> The part where annotations appear depending on user rights, sounds cool, but
> how can you detect when the dynamic content changes and fix your
> annotations? (like you do for static content)

If I understand correctly what this is about, this is one of the main advantages
of storing an annotation with selection and context: you don't need to update
the offsets every time a document changes. If the selected text is still in the
doc, it will be found & shown as annotated. The only issue is the context,
which, if edited, can prevent identification of an annotation even if the
annotated text is still there. The static case is simple: a diff can be used to
do this update. For the dynamic case, various strategies could be applied to
detect whether the change of context is small enough to consider the annotation
still valid (like edit distance, or equality of one side of the context but not
the other, etc).
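
A minimal sketch of such a tolerant matching strategy, assuming the annotation is stored as a left context, a selection and a right context. The function name `find_annotation`, the `min_ratio` threshold and the use of Python's `difflib.SequenceMatcher` as a stand-in for an edit-distance measure are illustrative assumptions, not part of the annotations module:

```python
from difflib import SequenceMatcher


def find_annotation(text, left, selection, right, min_ratio=0.8):
    """Return (start, end) of the selection in text, or None if not matched."""
    # 1. Exact match of the whole stored context (left + selection + right).
    exact = text.find(left + selection + right)
    if exact != -1:
        start = exact + len(left)
        return start, start + len(selection)

    # 2. Tolerant match: inspect every occurrence of the selection.
    pos = text.find(selection)
    while pos != -1:
        end = pos + len(selection)
        seen_left = text[max(0, pos - len(left)):pos]
        seen_right = text[end:end + len(right)]
        # Accept if one side of the context is unchanged ...
        if seen_left == left or seen_right == right:
            return pos, end
        # ... or if the surrounding text is still similar enough
        # (similarity ratio used here in place of a real edit distance).
        ratio = SequenceMatcher(None, left + right,
                                seen_left + seen_right).ratio()
        if ratio >= min_ratio:
            return pos, end
        pos = text.find(selection, pos + 1)
    return None
```

With the document from Edi's example, the annotation on "word4" is still found when text is inserted before it or when only one side of its context is edited, and it is dropped when the surrounding text changes completely.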

I am trying to imagine how a user perceives an annotation. If one annotated a
word in a phrase (or area of content in a page) and then the whole phrase (or
content area) changes completely but still happens to contain that word, would
the user really expect the word to be displayed as annotated? I would say it's
ok not to (which is again one of the advantages of storing an annotation as
"what the user sees").

Thanks,
Anca

>
> While I'm not convinced about this approach, you may be right and, comparing
> with the existing one (which I did not take the time to understand in detail
> as you had), and other issues which you underlined in your reply, it sounds
> like a start.
>
> Thanks,
> Eduard
>
> On Fri, Oct 30, 2009 at 7:25 PM, Anca Luca<[hidden email]>  wrote:
>
>> Hi Edi,
>>
>> On 10/30/2009 06:16 PM, Eduard Moraru wrote:
>>> Hi Anca,
>>>
>>> Here's my understanding of what you suggest:
>>>
>>> Document text:
>>> "word1 word2 word3 word4 word5 word6."
>>>
>>> Annotation on "word4" results in:
>>> - Selection: "word4" (unique within the context)
>>> - Context: "word3 word4 word5" (unique within this document)
>>>
>>> Then all you need to do to mark this annotation, is to locate the
>>> document-unique context and then to mark the selection within the
>>> document-area defined by the context.
>>>
>>> I see 2 major issues here:
>>>
>>> 1. The selection must be itself unique within the context, otherwise you
>>> just have the same big problem, at a smaller scale. This directly
>>> restricts the context's size.
>>
>> either use offset for position of selection in context, or use contextLeft &
>> contextRight (then the whole context becomes contextLeft + selection +
>> contextRight) and this doesn't affect context size.
>>
>>> 2. How can the document-uniqueness of a context be ensured?
>>>     - Fixed size context is not that practical.
>>
>> Why?
>> It should work well in most real-life cases. My take is that a frame of 3-5
>> hundred characters doesn't repeat in a regular document created in a wiki (of
>> course you can build test cases when it fails).
>>
>>>     - Computing the context size at creation time with the js client
>>> becomes a must, but if it fails to identify such a unique context, you
>>> risk having all the document as context. Example document text: "word4
>>> word4 word4 word4 word4 word4"
>>
>> If the whole document is not megabytes (in which case you should be able to
>> find a shorter context), I don't see why that is a problem.
>>
>>>     - The matching is done *after* the dynamic part of the document
>>> finishes executing. That dynamic part could potentially generate a copy
>>> of the context and confuse the matching algorithm.
>>
>> Well, the whole point is to get the dynamic part to execute and store
>> annotations as text, so that annotations are defined by what the user sees, as
>> a general idea. This also ensures that an annotation could be displayed even if
>> its position moved from one execution to another. For example, in a scripted
>> document, you'd have a part which would only be displayed to admins:
>> - If an annotation falls in that part, it would match for admins but not for
>>   regular users, which is ok: since the text doesn't appear, there's nothing
>>   the user expects to be annotated.
>> - If an annotation falls outside the content displayed only for admins, then
>>   matching by text (and not position) would allow us to find it and display it
>>   even if its position moves because of the text rendered only for admins.
>> - If it's half-half, then for admins it would be there because the content is,
>>   and for regular users not, because the content isn't.
>> For the particular case of a dynamic part generating copies of context, I'm
>> reiterating the idea that a couple of hundred characters should work in normal
>> cases. If there is a dynamic part which duplicates the whole document, the only
>> problem would be that the annotation would be matched and displayed on the
>> first encounter of its context, and not the second (is that a problem for what
>> the user sees and perceives? Or we could also display it 2 times). Also, this
>> is a particular case ("normal" documents shouldn't do that).
>>
>> Of course there would be annotations which will fail to be matched and
>> represented but, if few enough and in particular enough cases, I think it's a
>> good tradeoff.
>>
>> Note that I don't know yet how this performs in practice, but I think it's a
>> good direction to try as a balance between practical performance and speed of
>> implementation, and it's the only one so far (well, there's also the current
>> implementation but that only covers a subset of what this implementation would
>> cover).
>>
>>
>>>
>>> Maybe I misunderstood the proposal or missed some key detail, otherwise,
>>> please let me know.
>>
>> The idea is that this algorithm is changeable in its key points, like detecting
>> and matching context. This first take is supposed to perform well *in
>> practice*, even if theoretically there are cases where it fails. If these
>> failures make it unusable or we decide we want atomic precision, then we can
>> improve the key points (like trying to match context better and others).
>>
>>
>> Thanks,
>> Anca
>>
>>>
>>> P.S.: I like examples :)
>>>
>>> Thanks,
>>> Eduard
>>>
>>> On 10/30/2009 05:21 PM, Anca Luca wrote:
>>>> Hi devs,
>>>>
>>>> following a discussion with Fabio about the second desired feature for the
>>>> annotations, namely the ability to add annotations on any document, no matter
>>>> how its content is generated, we came up with the solution described at
>>>> http://dev.xwiki.org/xwiki/bin/view/Design/AnnotationFeature#HSolution1storeannotationsasselectionandcontextovertransformeddocuments
>>>> , the main idea being that annotations would be defined by their selected text
>>>> and a context (as opposed to offsets) and would be identified to be rendered in
>>>> a document on a serialization of the transformed XDOM of the document, this way
>>>> taking into account any macro rendering, document inclusion, etc.
>>>>
>>>> WDYT about this solution?
>>>>
>>>> Also, because the implementation of this, though relatively localized, comes
>>>> together with refactor and cleanup of the annotations module (update everything
>>>> so that annotations don't store and use offsets anymore, remove classes &
>>>> functions which are not needed in this simplified process), I propose to include
>>>> this improvement in version 1.0 of the annotations module (so that we don't
>>>> cleanup and release what we know for sure we'll delete) and push the 1.0 version
>>>> further to mid to end December.
>>>>
>>>> here's my +1 for this,
>>>> WDYT?
>>>>
>>>>
>>>> Happy coding,
>>>> Anca
>>>>
_______________________________________________
Anca Luca

Re: [Discussion] Annotations improvements (wider annotations target) and roadmap update

In reply to this post by Anca Luca
Hi devs,

I'm reviving this thread since I didn't really get any votes, and I'm about to
pick a development direction.

On 10/30/2009 05:21 PM, Anca Luca wrote:

> Hi devs,
>
> following a discussion with Fabio about the second desired feature for the
> annotations, namely the ability to add annotations on any document, no matter
> how its content is generated, we came up with the solution described at
> http://dev.xwiki.org/xwiki/bin/view/Design/AnnotationFeature#HSolution1storeannotationsasselectionandcontextovertransformeddocuments
> , the main idea being that annotations would be defined by their selected text
> and a context (as opposed to offsets) and would be identified to be rendered in
> a document on a serialization of the transformed XDOM of the document, this way
> taking into account any macro rendering, document inclusion, etc.
>
> WDYT about this solution?

Is there anything in this that you see as not doable? Or is any of its "cons" a
showstopper?

>
> Also, because the implementation of this, though relatively localized, comes
> together with refactor and cleanup of the annotations module (update everything
> so that annotations don't store and use offsets anymore, remove classes&
> functions which are not needed in this simplified process), I propose to include
> this improvement in version 1.0 of the annotations module (so that we don't
> cleanup and release what we know for sure we'll delete) and push the 1.0 version
> further to mid to end December.
>
> here's my +1 for this,
> WDYT?

Following a discussion with Fabio today, the plan at this moment is to:
1/ finish improving the tests (so that the power and limitations of the current
solution are checked by a set of tests)
2/ eliminate all the code related to implementations for specific document types
(watch's feed entry documents, for example, everything should be handled as an
xwiki doc), so that the remaining code is cleaner and easier to maintain
3/ implement a prototype of the solution presented above, hopefully passing at
least the tests that were passing at 1/
4/ minimal improvements on the js UI, making it usable in real life
5/ release 1.0

Is there a different strategy you would prefer?

Thanks,
Anca

>
>
> Happy coding,
> Anca
>
_______________________________________________
Fabio Mancinelli-4

Re: [Discussion] Annotations improvements (wider annotations target) and roadmap update


On Nov 10, 2009, at 6:20 PM, Anca Luca wrote:

>>
>> Also, because the implementation of this, though relatively localized, comes
>> together with refactor and cleanup of the annotations module (update everything
>> so that annotations don't store and use offsets anymore, remove classes&
>> functions which are not needed in this simplified process), I propose to include
>> this improvement in version 1.0 of the annotations module (so that we don't
>> cleanup and release what we know for sure we'll delete) and push the 1.0 version
>> further to mid to end December.
>>
>> here's my +1 for this,
>> WDYT?
>
> Following a discussion with Fabio today, the plan at this moment is to:
> 1/ finish improving the tests (so that the power and limitations of the current
> solution are checked by a set of tests)
> 2/ eliminate all the code related to implementations for specific document types
> (watch's feed entry documents, for example, everything should be handled as an
> xwiki doc), so that the remaining code is cleaner and easier to maintain
> 3/ implement a prototype of the solution presented above, hopefully passing at
> least the tests that were passing at 1/
> 4/ minimal improvements on the js UI, making it usable in real life
> 5/ release 1.0
>
> Is there a different strategy you would prefer?
>
As we discussed earlier, I am +1 for this.

-Fabio
_______________________________________________
Guillaume Lerouge

Re: [Discussion] Annotations improvements (wider annotations target) and roadmap update

In reply to this post by Anca Luca
Hi Anca,

On Tue, Nov 10, 2009 at 6:20 PM, Anca Luca <[hidden email]> wrote:

> Hi devs,
>
> I'm reviving this thread since I didn't really get any votes, and I'm about to
> pick some development direction.
>
> On 10/30/2009 05:21 PM, Anca Luca wrote:
> > Hi devs,
> >
> > following a discussion with Fabio about the second desired feature for the
> > annotations, namely the ability to add annotations on any document, no matter
> > how its content is generated, we came up with the solution described at
> > http://dev.xwiki.org/xwiki/bin/view/Design/AnnotationFeature#HSolution1storeannotationsasselectionandcontextovertransformeddocuments
> > , the main idea being that annotations would be defined by their selected text
> > and a context (as opposed to offsets) and would be identified to be rendered in
> > a document on a serialization of the transformed XDOM of the document, this way
> > taking into account any macro rendering, document inclusion, etc.
> >
> > WDYT about this solution?
>


> Is there anything you see not doable in this? or any of its "cons" is a
> showstopper?
>
> >
> > Also, because the implementation of this, though relatively localized, comes
> > together with refactor and cleanup of the annotations module (update everything
> > so that annotations don't store and use offsets anymore, remove classes &
> > functions which are not needed in this simplified process), I propose to include
> > this improvement in version 1.0 of the annotations module (so that we don't
> > cleanup and release what we know for sure we'll delete) and push the 1.0 version
> > further to mid to end December.
> >
> > here's my +1 for this,
> > WDYT?
>
> Following a discussion with Fabio today, the plan at this moment is to:
> 1/ finish improving the tests (so that the power and limitations of the current
> solution are checked by a set of tests)
> 2/ eliminate all the code related to implementations for specific document types
> (watch's feed entry documents, for example, everything should be handled as an
> xwiki doc), so that the remaining code is cleaner and easier to maintain
> 3/ implement a prototype of the solution presented above, hopefully passing at
> least the tests that were passing at 1/
> 4/ minimal improvements on the js UI, making it usable in real life
> 5/ release 1.0
>

Sounds good to me. Here's my +1

Guillaume


>
> Is there a different strategy you would prefer?
>
> Thanks,
> Anca
>
> >
> >
> > Happy coding,
> > Anca
> >
>



--
Guillaume Lerouge
Product Manager - XWiki SAS
Skype: wikibc
Twitter: glerouge
http://guillaumelerouge.com/
_______________________________________________
tmortagne

Re: [Discussion] Annotations improvements (wider annotations target) and roadmap update

In reply to this post by Anca Luca
On Tue, Nov 10, 2009 at 18:20, Anca Luca <[hidden email]> wrote:

> Hi devs,
>
> I'm reviving this thread since I didn't really get any votes, and I'm about to
> pick some development direction.
>
> On 10/30/2009 05:21 PM, Anca Luca wrote:
>> Hi devs,
>>
>> following a discussion with Fabio about the second desired feature for the
>> annotations, namely the ability to add annotations on any document, no matter
>> how its content is generated, we came up with the solution described at
>> http://dev.xwiki.org/xwiki/bin/view/Design/AnnotationFeature#HSolution1storeannotationsasselectionandcontextovertransformeddocuments
>> , the main idea being that annotations would be defined by their selected text
>> and a context (as opposed to offsets) and would be identified to be rendered in
>> a document on a serialization of the transformed XDOM of the document, this way
>> taking into account any macro rendering, document inclusion, etc.
>>
>> WDYT about this solution?
>
> Is there anything you see not doable in this? or any of its "cons" is a showstopper?

Sounds good, I don't see any important issue for now.

>
>>
>> Also, because the implementation of this, though relatively localized, comes
>> together with refactor and cleanup of the annotations module (update everything
>> so that annotations don't store and use offsets anymore, remove classes&
>> functions which are not needed in this simplified process), I propose to include
>> this improvement in version 1.0 of the annotations module (so that we don't
>> cleanup and release what we know for sure we'll delete) and push the 1.0 version
>> further to mid to end December.
>>
>> here's my +1 for this,
>> WDYT?
>
> Following a discussion with Fabio today, the plan at this moment is to:
> 1/ finish improving the tests (so that the power and limitations of the current
> solution are checked by a set of tests)
> 2/ eliminate all the code related to implementations for specific document types
> (watch's feed entry documents, for example, everything should be handled as an
> xwiki doc), so that the remaining code is cleaner and easier to maintain
> 3/ implement a prototype of the solution presented above, hopefully passing at
> least the tests that were passing at 1/
> 4/ minimal improvements on the js UI, making it usable in real life
> 5/ release 1.0
>
> Is there a different strategy you would prefer?
>
> Thanks,
> Anca
>
>>
>>
>> Happy coding,
>> Anca
>>
>



--
Thomas Mortagne
_______________________________________________