Textfilecontent54 Matches

Martin Thomas
Textfilecontent54 Matches
The schema documentation for subexpression in textfilecontent54
indicates that a regular expression can have many terms enclosed in
parentheses (or captured) - should every capture of every match be
returned? Or just the first capture?  I am also wondering how instance
numbering would work.

Using the example regex from the schema, abc(.*)mno(.*)xyp searching the
following lines:
abcdefmnokljxyp
abcaaamnoxyp

would match the first line and return two subexpressions (def and klj).
The second line would match and only return one non-null subexpression
(aaa). Would they be numbered as instances 1,2,3? That is, are instance
numbers 1 based or 0 based?

Given that multiline matching is a key part of the new
textfilecontent_test, we should also get a match that starts on the
first line and finishes on the second and the first capture would be
'defmnokljxyp\nabcaaa'.

Does all this make sense or did I miss something?

Thanks // Martin

To unsubscribe, send an email message to LISTSERV@... with
SIGNOFF OVAL-DEVELOPER-LIST
in the BODY of the message.  If you have difficulties, write to OVAL-DEVELOPER-LIST-request@....
Andrew Buttner
Re: Textfilecontent54 Matches
>The schema documentation for subexpression in textfilecontent54
>indicates that a regular expression can have many terms enclosed in
>parentheses (or captured) - should every capture of every match be
>returned? Or just the first capture?  I am also wondering how
>instance numbering would work.

The regex given in the 'pattern' entity is run against the text file.
Each match (of the entire regex) that is found would be an <item> in
the OVAL SC file, and each would have a different 'instance' value.
This all works regardless of how many subexpressions or captures were
defined in the regex.  The instance refers to a match of the entire
regex against the text file.
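
As a rough illustration of the numbering described above, here is a minimal Python sketch. It is not OVAL interpreter code: the helper name collect_items, the dictionary layout, and the sample text are all hypothetical, Python's re module merely stands in for the regex engine an interpreter would use, and the 1-based start reflects what is confirmed later in this thread.

```python
import re

def collect_items(pattern, text, flags=0):
    """Return one record per match of the ENTIRE pattern, numbered from 1.

    Hypothetical helper: the dict layout only mimics the idea of an
    OVAL SC <item>; it is not the real item schema.
    """
    items = []
    for instance, match in enumerate(re.finditer(pattern, text, flags), start=1):
        items.append({
            "instance": instance,                   # one instance per whole match
            "text": match.group(0),                 # the full matched block
            "subexpressions": list(match.groups()), # every capture of this match
        })
    return items

# Hypothetical sample text: three separate matches, one capture each.
items = collect_items(r"key=(\w+)", "key=a\nkey=b\nother\nkey=c")
for item in items:
    print(item)
```

The point of the sketch is that the instance counter advances per match of the whole regex, no matter how many capture groups the pattern defines.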





>Using the example regex from the schema, abc(.*)mno(.*)xyp searching
>the following lines:
>abcdefmnokljxyp
>abcaaamnoxyp
>
>would match the first line and return two subexpressions (def and
>klj). The second line would match and only return one non-null
>subexpression (aaa). Would they be numbered as instances 1,2,3? That
>is, are instance numbers 1 based or 0 based?
>
>Given that multiline matching is a key part of the new
>textfilecontent_test, we should also get a match that starts on the
>first line and finishes on the second and the first capture would be
>'defmnokljxyp\nabcaaa'.

Note that the schema documentation for 'pattern' says that it matches
the longest possible block.  So in your example above, the pattern
supplied would result in a single OVAL SC <item> covering the entire
set of text, since the 'pattern' matches the entire text.  The
'instance' would be 1.  (I will add to the schema doc the fact that
instance numbering is 1-based.)

If a 'subexpression' was used in a state, then the value of that
'subexpression' would be compared separately with the following two
subexpressions defined by the regex:  "defmnokljxyp\nabcaaa" and ""
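
The result above can be reproduced with a short Python sketch. This is only an approximation under stated assumptions: Python's re module stands in for the interpreter's regex engine, and the re.DOTALL flag is assumed to model the case where '.' is allowed to cross line boundaries; the variable names are illustrative.

```python
import re

PATTERN = r"abc(.*)mno(.*)xyp"
TEXT = "abcdefmnokljxyp\nabcaaamnoxyp"

# Assumption: DOTALL lets '.' match the newline, so the greedy pattern
# can span both lines and match the longest possible block.
whole_file = [m.groups() for m in re.finditer(PATTERN, TEXT, re.DOTALL)]

# For contrast: without DOTALL, '.' stops at the end of each line,
# so each line produces its own match.
per_line = [m.groups() for m in re.finditer(PATTERN, TEXT)]

print(whole_file)  # one match, two captures
print(per_line)    # two matches, two captures each
```

With '.' crossing newlines there is a single match whose captures are "defmnokljxyp\nabcaaa" and "", which agrees with the single item of instance 1 described above. The per-line variant instead gives two matches, with captures ("def", "klj") and ("aaa", "").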

Did this help?

Thanks
Drew

Martin Thomas
Re: Textfilecontent54 Matches
And numbering starts from 1, right?

>-----Original Message-----
>From: Buttner, Drew [mailto:abuttner@...]
>Sent: Thursday, July 03, 2008 11:47 AM
>To: OVAL-DEVELOPER-LIST@...
>Subject: Re: [OVAL-DEVELOPER-LIST] Textfilecontent54 Matches
>
>Note that the schema documentation for 'pattern' says that it matches
>the longest possible block.  So in your example above, the pattern
>supplied would result in one OVAL SC <item> focused around the entire
>set of text.  (since the 'pattern' matches the entire text)  The
>'instance' would be 1.  (I will add to the schema doc the fact that
>this is 1 based)

Andrew Buttner
Re: Textfilecontent54 Matches
Yes, I will make sure to add this to the schema documentation for the
next release.

Thanks
Drew


>-----Original Message-----
>From: Martin_Thomas@... [mailto:Martin_Thomas@...]
>Sent: Thursday, July 03, 2008 3:31 PM
>To: oval-developer-list OVAL Developer List/Closed Public Discussion
>Subject: Re: [OVAL-DEVELOPER-LIST] Textfilecontent54 Matches
>
>And numbering starts from 1, right?