Standard (and simple) format for conversion tests.

Landon Blake

Standard (and simple) format for conversion tests.


I will be helping Martin Davis on some testing and improvements to Proj4J. One of my tasks will be to test some of the improvements we are making to the coordinate conversion calculations. I think this testing is currently being done with Java unit tests. A while back on this list I remember we discussed a simple format for test data that could be provided to software tests. I think the goal would be to assemble a standard library of test data files that could be used by different coordinate conversion projects.

 

Is there still an interest in this?

 

I’d like to suggest the following format for a simple text file that could store the data:

 

Document Version Number

Source Coordinate Reference System Identifier (EPSG Code for the time being.)

Destination Coordinate Reference System Identifier (EPSG Code for the time being.)

Source Coordinates (Space Delimited and in X Y Z Order)

Destination Coordinates (Space Delimited and in X Y Z Order)

 

This is just my idea. We could tweak the format as needed to suit the needs of others.
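
A minimal sketch of how the five-line layout proposed above might be read, assuming one field per non-blank line and bare EPSG codes as identifiers. The names `ConversionTest`, `parse_xyz`, and `parse_conversion_test` are hypothetical, and the sample values are placeholders rather than a verified test case.

```python
from dataclasses import dataclass


@dataclass
class ConversionTest:
    version: str
    src_crs: str   # EPSG code of the source CRS
    dst_crs: str   # EPSG code of the destination CRS
    src_xyz: tuple
    dst_xyz: tuple


def parse_xyz(line):
    """Parse one space-delimited 'X Y Z' line into a tuple of three floats."""
    values = tuple(float(v) for v in line.split())
    if len(values) != 3:
        raise ValueError("expected X Y Z, got %r" % line)
    return values


def parse_conversion_test(text):
    """Parse the five-line layout; blank lines between fields are ignored."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if len(lines) != 5:
        raise ValueError("expected 5 non-blank lines, got %d" % len(lines))
    version, src_crs, dst_crs, src_line, dst_line = lines
    return ConversionTest(version, src_crs, dst_crs,
                          parse_xyz(src_line), parse_xyz(dst_line))


# Illustrative values only (assumption: not a verified test case).
SAMPLE = """1
27391
4273
20000 40000 0.0
6.397933 58.358709 0.0
"""

test = parse_conversion_test(SAMPLE)
print(test.src_crs, test.dst_xyz)
```

Keeping one field per line means the file can be read without a CSV or XML parser, at the cost of one test per file.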

 

I would be willing to maintain the library of test data files for MetaCRS if we decide to cooperate on this issue.

 

Landon

 



Warning:
Information provided via electronic media is not guaranteed against defects including translation and transmission errors.
If the reader is not the intended recipient, you are hereby notified that any dissemination, distribution or copying of this communication is strictly prohibited. If you have received this information in error, please notify the sender immediately.


_______________________________________________
MetaCRS mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/metacrs
Frank Warmerdam

Re: Standard (and simple) format for conversion tests.

Landon Blake wrote:

> I will be helping Martin Davis on some testing and improvements to
> Proj4J. One of my tasks will be to test some of the improvements we are
> making to the coordinate conversion calculations. I think this testing
> is currently being done with Java unit tests. A while back on this list
> I remember we discussed a simple format for test data that could be
> provided to software tests. I think the goal would be to assemble a
> standard library of test data files that could be used by different
> coordinate conversion projects.
>
>  
>
> Is there still an interest in this?

Landon,

I am interested in such a thing existing.  In my Python script for
testing PROJ.4 (through OGRCoordinateTransformation) I have:

###############################################################################
# Table of transformations, inputs and expected results (with a threshold)
#
# Each entry in the list should have a tuple with:
#
# - src_srs: any form that SetFromUserInput() will take.
# - (src_x, src_y, src_z): location in src_srs.
# - src_error: threshold for error when src_x/y is transformed into dst_srs and
#              then back into src_srs.
# - dst_srs: destination srs.
# - (dst_x,dst_y,dst_z): point that src_x/y should transform to.
# - dst_error: acceptable error threshold for comparing to dst_x/y.
# - unit_name: the display name for this unit test.
# - options: eventually we will allow a list of special options here (like one
#   way transformation).  For now just put None.
# - min_proj_version: string with minimum proj version required or None if unknown

transform_list = [ \

     # Simple straight forward reprojection.
     ('+proj=utm +zone=11 +datum=WGS84', (398285.45, 2654587.59, 0.0), 0.02,
      'WGS84', (-118.0, 24.0, 0.0), 0.00001,
      'UTM_WGS84', None, None ),

     # Ensure that prime meridian changes are applied.
     ('EPSG:27391', (20000, 40000, 0.0), 0.02,
      'EPSG:4273', (6.397933,58.358709,0.000000), 0.00001,
      'NGO_Oslo_zone1_NGO', None, None ),

     # Verify that 26592 "pcs.override" is working well.
     ('EPSG:26591', (1550000, 10000, 0.0), 0.02,
      'EPSG:4265', (9.449316,0.090469,0.00), 0.00001,
      'MMRome1_MMGreenwich', None, None ),
...

I think one important thing is to provide an acceptable error threshold with
each test in addition to the expected output value.  I also think each test
should support a chunk of arbitrary text which could be used to explain
the purpose of the test (special issues being examined) and point
to a ticket or other relevant document.
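
A rough sketch of how a runner might apply the two thresholds from the table above, assuming each of the x, y, z components is compared separately against the threshold (the original script may compare differently). `within_tolerance`, `run_entry`, and `toy_transform` are hypothetical names, and `toy_transform` is only a stand-in so the example runs without any CRS library.

```python
def within_tolerance(actual, expected, tolerance):
    """True if every component differs from the expected one by <= tolerance."""
    return all(abs(a - e) <= tolerance for a, e in zip(actual, expected))


def run_entry(entry, transform):
    """Check one 9-field entry: the forward result, then the round trip."""
    (src_srs, src_xyz, src_error,
     dst_srs, dst_xyz, dst_error,
     unit_name, options, min_proj_version) = entry

    # Forward: the source point should land within dst_error of dst_xyz.
    got_dst = transform(src_srs, dst_srs, src_xyz)
    forward_ok = within_tolerance(got_dst, dst_xyz, dst_error)

    # Round trip: transforming back should land within src_error of src_xyz.
    got_src = transform(dst_srs, src_srs, got_dst)
    round_trip_ok = within_tolerance(got_src, src_xyz, src_error)
    return unit_name, forward_ok, round_trip_ok


def toy_transform(src_srs, dst_srs, xyz):
    # Stand-in only (assumption): doubles coordinates from "A" to "B" and
    # halves them in every other direction. Not a real coordinate conversion.
    factor = 2.0 if (src_srs, dst_srs) == ("A", "B") else 0.5
    return tuple(v * factor for v in xyz)


entry = ("A", (1.0, 2.0, 0.0), 0.02,
         "B", (2.0, 4.0, 0.0), 0.00001,
         "toy_A_to_B", None, None)

print(run_entry(entry, toy_transform))
```

In a real runner the `transform` argument would wrap whichever library is under test, so the same table could be shared across projects.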

Actually one more thing is a name for the test, hopefully slightly
self-documenting.  I suppose if each test is a distinct file, we
could use meaningful filenames.

The other dilemma is how to define the coordinate systems.  I feel that
limiting things to EPSG defined coordinate systems is a problem, though of
course otherwise we have serious problems with defining the coordinate
system in an interoperable fashion.   So, perhaps starting with EPSG codes
is reasonable with an understanding that eventually some tests might need
to be done another way - perhaps OGC WKT.

If you wanted to roll out something preliminary I would be interested in
writing a Python script that would run the test against OGR/PROJ.4.

Best regards,
--
---------------------------------------+--------------------------------------
I set the clouds in motion - turn up   | Frank Warmerdam, [hidden email]
light and sound - activate the windows | http://pobox.com/~warmerdam
and watch the world go round - Rush    | Geospatial Programmer for Rent

Martin Davis

Re: Standard (and simple) format for conversion tests.

Sorry, this should have gone to the list.

Agreed, it would be nice to keep the number of CRS languages to a minimum.

Perhaps the mapping could be done separately to the actual unit test, to
keep the unit test file format reasonably simple.  This is essentially
what CSMap does - it has a separate CSV file mapping between a whole set
of CRS systems.  I imagine every lib will end up with its own custom set
of mappings.

On further reflection, perhaps the idea of having an authority prefix
string could also be accomplished by simply having separate test files
for each authority.  (Similar to how PROJ4 represents its CS registry,
with separate files for epsg, esri, etc).  That might make it a bit
easier for different projects to use only the unit tests they are
capable of ingesting.  Also keeps the file sizes more manageable, and
allows a (minor) simplification to the file format by eliminating the
prefix string.

Martin


Frank Warmerdam wrote:

> Martin Davis wrote:
>> That looks pretty good to me, Frank.
>> It would be nice to have the format of this data in an easy-to-parse
>> format.  CSMap uses CSV, which certainly meets this criteria (but
>> maybe fails the readability test).  The format you give is a bit more
>> work, but maybe not impossible to deal with.  And of course there's
>> always XML....  Keeping to a single-table structure would be highly
>> preferable, I think.
>
> Martin,
>
> I wasn't proposing my format which is really just a Python structure.
> I think XML might be nice in that it could be formatted to be more readable
> than a csv with lots of free text.  Ultimately the entries will likely be
> hand created.
>
>> As for CS definition, I'm beginning to appreciate what a can of worms
>> that is!  My temptation would be to be highly pragmatic about this.  
>> I think a minimum is to have the notion of an authority and a code
>> (eg "epsg:12345" or "esri:3333").  This could be extended to allow
>> descriptions as well as code (eg "proj4:+proj=utm +zone=11" or
>> "Oracle-WKT:-----").  I think this encapsulates enough information
>> about the "language" being used to describe the CS to allow most libs
>> to grok the CSes they know about.
>
> This is a reasonable approach, though ideally there would not be many
> CRS languages used as it limits the universality of the test.
>
> I could imagine a way of providing a CRS *override* for a particular
> package.  So for instance if EPSG:3785 is not properly supported by
> PROJ.4 it would be nice to be able to specify the PROJ.4 string that
> it ought to use as an override.  To me this level of flexibility suggests
> XML.
>
> Hmm, I see this wasn't on the list.  Any particular reason?
> Feel free to forward my response on list.
>
> Best regards,

--
Martin Davis
Senior Technical Architect
Refractions Research, Inc.
(250) 383-3022

Landon Blake

RE: Standard (and simple) format for conversion tests.

Martin wrote: "On further reflection, perhaps the idea of having an
authority prefix string could also be accomplished by simply having
separate test files for each authority."

This sounds reasonable. It is really just a matter of duplicating some
simple text files, after all.

Landon
Office Phone Number: (209) 946-0268
Cell Phone Number: (209) 992-0658

Norm Olsen

RE: Standard (and simple) format for conversion tests.

In reply to this post by Frank Warmerdam
Hello All . . .

I too am interested in a general test format and universal test case database.  I believe a set of standard test cases would be a great thing for the MetaCRS project.  For legal and other reasons, I believe we should simply qualify the "target" values of all of our test cases as suggested results with some generous tolerance values; and issue a disclaimer as to the accuracy of the published results.

Having wrestled with this problem for many years, I have some comments:

1> I prefer a simple .CSV type of test file format.  The test file would then be a totally portable, non-binary text file (limited to 8-bit characters for more portability?), be easily parsed in any language or application, and be easily maintained using something like Excel, MySQL, etc. (anything that can export a simple .CSV file).

2> I would like to see a "test type" field in the record format which will support testing things in addition to the basic "convert this coordinate" test.  Thus, datum shift tests, geoid height tests, grid scale, convergence, vertical datum tests, etc. could all be included in a single database.

3> We should strive for a "Source of Test Data" field requirement in the database which indicates the source of the test.  That is, where did the test case data come from.  The source should always (?) be something outside of the MetaCRS project base.

4> Test cases derived from the various projects of MetaCRS could/should be included and classified as being regression tests only.

5> Some sort of environment field would be nice.  That is, a bit map sort of thing that would enable a program to skip certain tests based on the environment (e.g. the presence of the Canadian NTv2 data file).

6> Separate tolerances on the source and target is a nice idea enabling an automatic inverse test for each test case.  A simpler database would result if we require separate entries in the database to test both the forward and inverse cases.  I prefer the latter, as inverse testing is not always appropriate and it supports item 9 below.

7> Test data values should be entered in the same form as the source material (to the degree possible), implying (for example) that geographic coordinates may be entered as degrees, minutes, and seconds or as decimal degrees.

8> Tolerances in the test database should be based on the quality or nature of the "Source of Test Data".  It could be a serious legal issue if we publish something suggesting that this is the correct result.

9> None of our projects will produce the exact same result, nor will any other library match any of ours precisely.  At this level I do not think it appropriate for MetaCRS to make the call as to which is the correct one.  Therefore I suggest that the format be designed such that any library (MetaCRS or otherwise) be able to simply publish a file with the result produced by the library, as opposed to a Boolean condition indicating whether or not it meets the MetaCRS standards.  It is then up to the consumer of that information to decide which one is correct.  This may be an important legal issue as well.  (Notice that EPSG has never included test cases in their database.)

10> Coordinate system references should be by EPSG number wherever possible.  I suggest a format of the "EPSG:3745" type.  In cases where this won't work, the test database should include a namespace qualifier and then the definition:

        CSMAP:LL84
        PROJ4:'+proj=utm +zone=11 +datum=WGS84'
        ORACLE:80114
        .
        .
        .
Test applications would, of course, skip any test whose referenced CRSs they are incapable of deciphering.
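
The namespace-qualified references and the skip rule in item 10 could be sketched roughly as follows. This assumes the namespace is everything before the first colon and that single quotes around a definition are optional; `parse_crs_reference`, `can_decipher`, and `select_runnable` are hypothetical names, and the test names are invented for illustration.

```python
def parse_crs_reference(ref):
    """Split 'NAMESPACE:definition' at the first colon, dropping outer quotes."""
    namespace, sep, definition = ref.strip().partition(":")
    definition = definition.strip().strip("'")
    if not sep or not namespace or not definition:
        raise ValueError("not a NAMESPACE:definition reference: %r" % ref)
    return namespace.upper(), definition


def can_decipher(ref, supported):
    """True if the reference uses a namespace the library understands."""
    namespace, _ = parse_crs_reference(ref)
    return namespace in supported


def select_runnable(tests, supported):
    """Partition (name, src_ref, dst_ref) tests into runnable and skipped names."""
    runnable, skipped = [], []
    for name, src_ref, dst_ref in tests:
        if can_decipher(src_ref, supported) and can_decipher(dst_ref, supported):
            runnable.append(name)
        else:
            skipped.append(name)
    return runnable, skipped


tests = [
    ("utm_case", "EPSG:3745", "PROJ4:'+proj=utm +zone=11 +datum=WGS84'"),
    ("csmap_case", "CSMAP:LL84", "EPSG:3745"),
    ("oracle_case", "ORACLE:80114", "EPSG:3745"),
]

# A library that only understands EPSG codes and PROJ.4 strings.
print(select_runnable(tests, {"EPSG", "PROJ4"}))
```

A skipped test is reported rather than counted as a failure, so a library is not penalised for namespaces it never claimed to support.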

The CS-MAP distribution includes a test data file named TEST.DAT which includes a couple thousand test cases.  The comments in this file usually indicate the "Source of Test Data" to some degree.  Many need to be commented out due to environmental reasons, thus item 5 above.
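
One way the "bit map" environment field from item 5 might look, as a sketch only. The flag names and bit assignments below are assumptions made for illustration and are not taken from TEST.DAT or any MetaCRS project.

```python
from enum import Flag


class Env(Flag):
    # Hypothetical bit assignments; a real format would have to agree on these.
    NTV2_CANADA = 1   # Canadian NTv2 grid shift file is installed
    GEOID_MODEL = 2   # a geoid height model is installed


def should_run(required, available):
    """Run a test only if every required bit is also set in `available`."""
    return (required & available) == required


# The field would be stored in the test record as a plain integer.
required = Env(1)                     # this test needs the NTv2 file
print(should_run(required, Env(0)))   # nothing installed
print(should_run(required, Env.NTV2_CANADA | Env.GEOID_MODEL))
```

With such a field, tests that depend on optional data files could stay in the database and be skipped at run time instead of being commented out.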

Norm

Norm Olsen

RE: Standard (and simple) format for conversion tests.

In reply to this post by Landon Blake
If the definition has an EPSG equivalent, then that ID should be used.  Thus, the test file would also verify that whatever library is being tested can correctly map the EPSG code to a definition, in addition to performing the actual coordinate conversion.  This will be of great value in and of itself.  (Name/ID/definition mapping is a project unto itself now, SpatialReference.org, is it not?)

Otherwise, maintenance of the test file will be a nightmare, and the mapping of names/ID/definitions will be such a big problem the primary purpose of the project will suffer badly.

I suspect that all libraries will support things for which there is no EPSG number.  If a test application cannot map the "official" CRS reference (the CRS reference chosen by the committer who added the test to the database), it either skips the test or the tech lead for that project adds another copy of the test (perhaps with the same test number/name) with a CRS reference it can interpret.

If we have more than one file, maintenance will be a big problem.  As we are all volunteers, that implies the maintenance _may_ not get done at all.

NOTE: A single .CSV file maintained in Subversion, and therefore under version control, is my preferred manner for this database.  Once I get the .CSV out of Subversion, I can use Excel, Access, MySQL, Python, Perl, vi, Notepad, or whatever I want to make my changes.  I simply convert back to .CSV before committing the changes.  To me, this means one less contentious issue for us to agree upon.
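
As an illustration of how little code a single .CSV database would need on the consuming side, here is a sketch using Python's standard `csv` module. The column names are assumptions (no column layout was agreed in this thread), and the single row is illustrative rather than a verified test case.

```python
import csv
import io

# Hypothetical column layout; the quoted last field shows that free text
# containing commas survives a round trip through standard CSV quoting.
SAMPLE = (
    "name,src_crs,src_x,src_y,src_z,dst_crs,dst_x,dst_y,dst_z,tolerance,comment\n"
    "NGO_Oslo_zone1_NGO,EPSG:27391,20000,40000,0.0,"
    "EPSG:4273,6.397933,58.358709,0.0,0.00001,"
    "\"Prime meridian check, illustrative values only\"\n"
)


def load_tests(stream):
    """Read test records from an open CSV stream into a list of dicts."""
    tests = []
    for row in csv.DictReader(stream):
        tests.append({
            "name": row["name"],
            "src_crs": row["src_crs"],
            "src_xyz": tuple(float(row[k]) for k in ("src_x", "src_y", "src_z")),
            "dst_crs": row["dst_crs"],
            "dst_xyz": tuple(float(row[k]) for k in ("dst_x", "dst_y", "dst_z")),
            "tolerance": float(row["tolerance"]),
            "comment": row["comment"],
        })
    return tests


tests = load_tests(io.StringIO(SAMPLE))
print(tests[0]["name"], tests[0]["dst_xyz"], tests[0]["comment"])
```

A header row keeps the file self-describing, which matters if the same file is edited in a spreadsheet by one volunteer and in a text editor by another.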

Norm

Landon Blake

RE: Standard (and simple) format for conversion tests.

In reply to this post by Norm Olsen
I have some responses to Norm's excellent comments below.

Norm wrote: "1> I prefer a simple .CSV type of test file format.  The
test file would then be a totally portable, non-binary, text file
(limited to 8 bit characters for more portability?), be easily parsed in
any language or application, and it can be easily maintained using
something like Excel, MySql, etc (anything that can export a simple .CSV
file)."

I think CSV is a little less readable, but I don't have a problem using
it.

Norm wrote: " I would like to see a "test type" field in the record
format which will support testing things in addition to the basic "convert
this coordinate" test.  Thus, datum shift tests, geoid height tests,
grid scale, convergence, vertical datum tests, etc. could all be
included in a single database."

Good idea, Norm. Can you send me a quick list of the types you think
should be included? I think I may see an informal file format
description in the works, and it would be good to list these types
there.

Norm wrote: "We should strive for a "Source of Test Data" field
requirement in the database which indicates the source of the test.
That is, where did the test case data come from.  The source should
always (?) be something outside of the MetaCRS project base."

Do you mean the expected coordinate values that verify a passing test?
This presents us with an interesting chicken-or-egg problem. The
only way to verify some conversions may be with our own
code/calculations. I agree we need to make sure tests aren't incorrect,
but to some degree this may have to come from good management of the
test database. What if we have a system where we keep two sets of test
data files? The first set will be "official" and the second set will be
"experimental". We won't move a test data file from the experimental set to
the official set until there has been a peer review of some type.

Norm wrote: " Some sort of environment field would be nice.  That is, a
bit map sort of thing that would enable a program to skip certain tests
based on the environment (i.e. presence of the Canadian NTv2 data file
for example)."

It sounds like we might need some sort of "test execution requirements"
line that outlines the dependencies for proper test execution. I don't
know if we can do this with a bit map or not. I think I would favor
plain English names for requirements. We could agree on the names for
requirements and keep a list of them in the file format
spec.

I think as we package more of this information, CSV will become less
handy. How would we separate a list of requirements in a CSV file? By
using another delimiter? This can quickly get messy. Just a thought...
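
To make the second-delimiter question concrete, here is a sketch of a requirements list packed into one CSV field. It assumes a semicolon separator and the convention that requirement names never contain one; the requirement names and helper functions are hypothetical.

```python
import csv
import io

REQ_SEP = ";"  # assumed secondary delimiter inside a single CSV field


def encode_requirements(names):
    """Join plain-English requirement names into one CSV field."""
    for name in names:
        if REQ_SEP in name:
            raise ValueError("requirement name may not contain %r" % REQ_SEP)
    return REQ_SEP.join(names)


def decode_requirements(field):
    """Split the field back into names; an empty field means no requirements."""
    return [name for name in field.split(REQ_SEP) if name]


# Write one record, then read it back through the csv module.
buf = io.StringIO()
csv.writer(buf).writerow(
    ["ntv2_case", encode_requirements(["NTv2 grid file", "geoid model"])])
buf.seek(0)
name, req_field = next(csv.reader(buf))
print(name, decode_requirements(req_field))
```

This works, but it does mean a reader has to know two delimiters and their escaping rules, which is the kind of mess the paragraph above is worried about.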

Norm wrote: " A simpler database would result if we require separate
entries in the database to test both the forward and inverse cases.  I
prefer the latter, as inverse testing is not always appropriate and it
supports item 9 below."

I agree. I think we should keep "forward" and "backward" test data in
separate files.

Norm wrote: "Test data values should be entered in the same form as the
source material (to the degree possible), implying (for example) that
geographic coordinates may be entered as degrees, minutes, and seconds
or as decimal degrees."

I didn't think about units. Do we need to specify units as part of the
test data file, or will the units be defined by the CRS? Are there any
situations where a CRS doesn't specify a unit? If you are saying we
should require source and expected coordinate values to be in the same
units as defined by the CRS, then I think this is a good idea.
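
If test data were allowed in degrees, minutes, and seconds, each test runner would need a normalisation step before comparing against a library's decimal-degree output. A minimal sketch, assuming a hemisphere letter carries the sign; `dms_to_decimal` is a hypothetical helper.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds plus N, S, E or W to signed decimal degrees."""
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError("hemisphere must be N, S, E or W")
    if degrees < 0 or not (0 <= minutes < 60) or not (0 <= seconds < 60):
        raise ValueError("degrees, minutes or seconds out of range")
    value = degrees + minutes / 60.0 + seconds / 3600.0
    # South and west are negative by the usual convention.
    return -value if hemisphere in ("S", "W") else value


print(dms_to_decimal(24, 30, 0, "N"))   # 24.5
print(dms_to_decimal(118, 0, 0, "W"))   # -118.0
```

Requiring values in the units of the CRS instead would push this conversion onto whoever enters the test, which keeps runners simpler but moves the data one step away from its source material.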

Norm wrote: "Tolerances in the test database should be based on the
quality or nature of the "Source of Test Data".  It could be a serious
legal issue if we publish something suggesting that this is the correct
result.

None of our projects will produce the exact same result, nor will any
other library match any of ours precisely.  At this level I do not think
it appropriate for MetaCRS to make the call as to which is the correct
one.  Therefore I suggest that the format be designed such that any
library (MetaCRS or otherwise) be able to simply publish a file with the
result produced by the library as opposed to a Boolean condition
indicating whether or not they meet the MetaCRS standards.  It
is then up to the consumer of that information to decide which one is
correct.  This may be an important legal issue as well.  (Notice that
EPSG has never included test cases in their database.)"

To be completely honest, I'm not worried about this liability very much.
We use all sorts of open source software with the understanding that we
use it at our own risk.

I think we simply document how we determine if a test passes or fails as
you suggested, and then release the test data files under the public
domain or creative commons. I know this might be an issue for some
larger corporations, but I'm not worried about getting sued because
someone who built a bridge in the wrong place blames it on one of my
test data files.

I will, however, follow the wishes of the majority in this regard. I
just don't want to take something simple and make it overbearing because
of liability concerns.

Norm wrote: " Coordinate system references should be by EPSG number
wherever possible.  I suggest a format of the "EPSG:3745" type.  In
cases where this won't work, the test database should include a
namespace qualifier and then the definition:

        CSMAP:LL84
        PROJ4:'+proj=utm +zone=11 +datum=WGS84'
        ORACLE:80114
        .
        .
        .
Test applications would, of course, skip any test whose referenced
CRSs they are incapable of deciphering.

The CS-MAP distribution includes a test data file named TEST.DAT which
includes a couple thousand test cases.  The comments in this file
usually indicate the "Source of Test Data" to some degree.  Many need to
be commented out due to environmental reasons, thus item 5 above."

I'd like to keep the file format for test data as simple as we can. I
don't really want to stick long winded CRS definitions in them. I think
we should (1) keep separate files for separate definition systems as
Martin suggested or (2) use a mapping as Martin suggested.
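
For comparison, the namespace-qualified references in Norm's proposal could be handled roughly as below. This is only a sketch: the function names and the SUPPORTED set are hypothetical, and the quoting rule (optional single quotes around a long definition) is inferred from the PROJ4 example above.

```python
def split_crs_reference(reference):
    """Split 'NAMESPACE:definition' into (namespace, definition)."""
    namespace, separator, definition = reference.partition(":")
    if not separator or not namespace or not definition:
        raise ValueError("expected NAMESPACE:definition, got %r" % reference)
    # Drop the optional single quotes around a long definition.
    if len(definition) >= 2 and definition[0] == definition[-1] == "'":
        definition = definition[1:-1]
    return namespace.upper(), definition

SUPPORTED = {"EPSG", "PROJ4"}  # namespaces this hypothetical runner understands

def can_run(reference):
    """A runner skips any test whose CRS namespace it cannot decipher."""
    return split_crs_reference(reference)[0] in SUPPORTED
```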

Only my 2 cents. I'm glad we got a conversation started. It is also good
to see an Autodesk guy on this list, as an Autodesk user. :]

Landon
 

-----Original Message-----
From: Norm Olsen [mailto:[hidden email]]
Sent: Wednesday, November 04, 2009 12:53 PM
To: Frank Warmerdam (External); Landon Blake
Cc: [hidden email]
Subject: RE: [MetaCRS] Standard (and simple) format for conversion
tests.

Hello All . . .

I too am interested in a general test format and universal test case
database.  I believe a set of standard test cases would be a great thing
for the MetaCRS project.  For legal and other reasons, I believe we
should simply qualify the "target" values of all of our test cases as
suggested results with some generous tolerance values; and issue a
disclaimer as to the accuracy of the published results.

Having wrestled with this problem for many years, I have some comments:

1> I prefer a simple .CSV type of test file format.  The test file would
then be a totally portable, non-binary, text file (limited to 8 bit
characters for more portability?), be easily parsed in any language or
application, and it can be easily maintained using something like Excel,
MySql, etc (anything that can export a simple .CSV file).

2> I would like to see a "test type" field in the record format which
will support testing things in addition to the basic "convert this
coordinate test".  Thus, datum shift tests, geoid height tests, grid
scale, convergence, vertical datum tests, etc. could all be included in
a single database.

3> We should strive for a "Source of Test Data" field requirement in the
database which indicates the source of the test.  That is, where did the
test case data come from.  The source should always (?) be something
outside of the MetaCRS project base.

4> Test cases derived from the various projects of MetaCRS could/should
be included and classified as being regression tests only.

5> Some sort of environment field would be nice.  That is, a bit map
sort of thing that would enable a program to skip certain tests based on
the environment (i.e. presence of the Canadian NTv2 data file for
example).

6> Separate tolerances on the source and target is a nice idea enabling
an automatic inverse test for each test case.  A simpler database would
result if we require separate entries in the database to test both the
forward and inverse cases.  I prefer the latter, as inverse testing is
not always appropriate and it supports item 9 below.

7> Test data values should be entered in the same form as the source material
(to the degree possible), implying (for example) that geographic
coordinates may be entered as degrees, minutes, and seconds or decimal
degrees.

8> Tolerances in the test database should be based on the quality or
nature of the "Source of Test Data".  It could be a serious legal issue
if we publish something suggesting that this is the correct result.

9> None of our projects will produce the exact same result, nor will any
other library match any of ours precisely.  At this level I do not think
it appropriate for MetaCRS to make the call as to which is the correct
one.  Therefore I suggest that the format be designed such that any
library (MetaCRS or otherwise) be able to simply publish a file with the
result produced by the library as opposed to a Boolean condition
indicating whether or not they meet the MetaCRS standards.  It
is then up to the consumer of that information to decide which one is
correct.  This may be an important legal issue as well.  (Notice that
EPSG has never included test cases in their database.)

10> Coordinate system references should be by EPSG number wherever
possible.  I suggest a format of the "EPSG:3745" type.  In cases where
this won't work, the test database should include a namespace qualifier
and then the definition:

        CSMAP:LL84
        PROJ4:'+proj=utm +zone=11 +datum=WGS84'
        ORACLE:80114
        .
        .
        .
Test applications would, of course, skip any test whose referenced
CRSs they are incapable of deciphering.

The CS-MAP distribution includes a test data file named TEST.DAT which
includes a couple thousand test cases.  The comments in this file
usually indicate the "Source of Test Data" to some degree.  Many need to
be commented out due to environmental reasons, thus item 5 above.

Norm

-----Original Message-----
From: [hidden email]
[mailto:[hidden email]] On Behalf Of Frank Warmerdam
Sent: Wednesday, November 04, 2009 11:50 AM
To: Landon Blake
Cc: [hidden email]
Subject: Re: [MetaCRS] Standard (and simple) format for conversion
tests.

Landon Blake wrote:
> I will be helping Martin Davis on some testing and improvements to
> Proj4J. One of my tasks will be to test some of the improvements we are
> making to the coordinate conversion calculations. I think this testing
> is currently being done with Java unit tests. A while back on this list
> I remember we discussed a simple format for test data that could be
> provided to software tests. I think the goal would be to assemble a
> standard library of test data files that could be used by different
> coordinate conversion projects.
>
>  
>
> Is there still an interest in this?

Landon,

I am interested in such a thing existing.  In my Python script for
testing PROJ.4 (through OGRCoordinateTransformation) I have:

###############################################################################
# Table of transformations, inputs and expected results (with a threshold)
#
# Each entry in the list should have a tuple with:
#
# - src_srs: any form that SetFromUserInput() will take.
# - (src_x, src_y, src_z): location in src_srs.
# - src_error: threshold for error when src_x/y is transformed into dst_srs and
#              then back into src_srs.
# - dst_srs: destination srs.
# - (dst_x,dst_y,dst_z): point that src_x/y should transform to.
# - dst_error: acceptable error threshold for comparing to dst_x/y.
# - unit_name: the display name for this unit test.
# - options: eventually we will allow a list of special options here (like one
#   way transformation).  For now just put None.
# - min_proj_version: string with minimum proj version required or null if unknown

transform_list = [ \

     # Simple straight forward reprojection.
     ('+proj=utm +zone=11 +datum=WGS84', (398285.45, 2654587.59, 0.0), 0.02,
      'WGS84', (-118.0, 24.0, 0.0), 0.00001,
      'UTM_WGS84', None, None ),

     # Ensure that prime meridian changes are applied.
     ('EPSG:27391', (20000, 40000, 0.0), 0.02,
      'EPSG:4273', (6.397933,58.358709,0.000000), 0.00001,
      'NGO_Oslo_zone1_NGO', None, None ),

     # Verify that 26592 "pcs.override" is working well.
     ('EPSG:26591', (1550000, 10000, 0.0), 0.02,
      'EPSG:4265', (9.449316,0.090469,0.00), 0.00001,
      'MMRome1_MMGreenwich', None, None ),
..

I think one important thing is to provide an acceptable error threshold with
each test in addition to the expected output value.  I also think each test
should support a chunk of arbitrary text which could be used to explain
the purpose of the test (special issues being examined) and point off
to a ticket or other relevant document.
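
As a rough sketch of how a runner might apply a per-test threshold, the function below checks one forward case. The tuple layout is a simplified stand-in rather than the exact structure listed above, and the transform argument is a caller-supplied function, so no particular library API is assumed.

```python
def run_case(case, transform):
    """Run one forward test; return (passed, worst_error).

    case is (src_srs, src_point, dst_srs, expected_point, dst_error, name).
    transform is a caller-supplied function (src_srs, dst_srs, point) -> point.
    """
    src_srs, src_point, dst_srs, expected, dst_error, _name = case
    actual = transform(src_srs, dst_srs, src_point)
    # Largest absolute difference over all axes.
    worst = max(abs(a - e) for a, e in zip(actual, expected))
    return worst <= dst_error, worst

# Stand-in transform for demonstration only: shifts x by 1.0.
def fake_transform(src_srs, dst_srs, point):
    x, y, z = point
    return (x + 1.0, y, z)

case = ("SRC", (10.0, 20.0, 0.0), "DST", (11.0, 20.0, 0.0), 0.00001, "demo_shift")
passed, worst_error = run_case(case, fake_transform)
```

Returning the worst error alongside the pass/fail flag lets a library publish its actual result and leave the judgement to the reader.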

Actually one more thing is a name for the test, hopefully slightly
self-documenting.  I suppose if each test is a distinct file, we
could use meaningful filenames.

The other dilemma is how to define the coordinate systems.  I feel that
limiting things to EPSG defined coordinate systems is a problem, though of
course otherwise we have serious problems with defining the coordinate
system in an interoperable fashion.   So, perhaps starting with EPSG codes
is reasonable with an understanding that eventually some tests might need
to be done another way - perhaps OGC WKT.

If you wanted to roll out something preliminary I would be interested
in writing a Python script that would run the test against OGR/PROJ.4.

Best regards,
--
---------------------------------------+--------------------------------------
I set the clouds in motion - turn up   | Frank Warmerdam, [hidden email]
light and sound - activate the windows | http://pobox.com/~warmerdam
and watch the world go round - Rush    | Geospatial Programmer for Rent

Landon Blake

RE: Standard (and simple) format for conversion tests.

In reply to this post by Norm Olsen
Norm wrote: " If the definition has an EPSG equivalent, then that ID
should be used.  Thus, the test file would also verify that whatever
library is being tested can properly map the EPSG code to a definition
correctly in addition to the actual coordinate conversion.  This will be
of great value in and of itself.  (Name/ID/Definition mapping is a
project unto itself now: SpatialReference.org; is it not?)

Otherwise, maintenance of the test file will be a nightmare, and the
mapping of names/ID/definitions will be such a big problem the primary
purpose of the project will suffer badly.

I suspect that all libraries will support things for which there is no
EPSG number.  If a test application cannot map the "official" CRS
reference (the CRS reference chosen by the committer who added the test
to the database), it either skips the test or the tech lead for that
project adds another copy of the test (perhaps with the same test
number/name) with a CRS reference it can interpret.

If we have more than one file, maintenance will be a big problem.  As we
are all volunteers, that implies the maintenance _may_ not get done at
all."

I think just starting out with EPSG code as the primary identifier would
be better than our current situation, which is everyone cooking their
own bowl of stew.

Perhaps we do that, and then come up with a mapping system for those
libraries that want to support CRS without an EPSG code.

Norm wrote: " NOTE: A single .CSV file maintained in Subversion and
therefore under version control is my preferred manner for this
database.  Once I get the .CSV out of Subversion, I can use Excel,
Access, MySql, Python, Perl, vi, Notepad, whatever I want to make my
changes.  I simply convert back to .CSV before committing the
changes.  To me, this means one less contentious issue for us to agree
upon."

I think we might run into trouble trying to cram everything into a
single CSV file for all tests. Using single files makes things a little
more modular, and they can still be managed in a version control
repository.

I might even be able to whip up a little Java library and GUI to edit
the test data files.

However, if the guys want a monolithic CSV, I'll work on a monolithic
CSV.

Landon
Office Phone Number: (209) 946-0268
Cell Phone Number: (209) 992-0658
 
 

-----Original Message-----
From: Norm Olsen [mailto:[hidden email]]
Sent: Wednesday, November 04, 2009 1:13 PM
To: Landon Blake; Martin Davis; Frank Warmerdam (External)
Cc: [hidden email]
Subject: RE: [MetaCRS] Standard (and simple) format for conversion
tests.

If the definition has an EPSG equivalent, then that ID should be used.
Thus, the test file would also verify that whatever library is being
tested can properly map the EPSG code to a definition correctly in
addition to the actual coordinate conversion.  This will be of great
value in and of itself.  (Name/ID/Definition mapping is a project unto
itself now: SpatialReference.org; is it not?)

Otherwise, maintenance of the test file will be a nightmare, and the
mapping of names/ID/definitions will be such a big problem the primary
purpose of the project will suffer badly.

I suspect that all libraries will support things for which there is no
EPSG number.  If a test application cannot map the "official" CRS
reference (the CRS reference chosen by the committer who added the test
to the database), it either skips the test or the tech lead for that
project adds another copy of the test (perhaps with the same test
number/name) with a CRS reference it can interpret.

If we have more than one file, maintenance will be a big problem.  As we
are all volunteers, that implies the maintenance _may_ not get done at
all.

NOTE: A single .CSV file maintained in Subversion and therefore under
version control is my preferred manner for this database.  Once I get
the .CSV out of Subversion, I can use Excel, Access, MySql, Python,
Perl, vi, Notepad, whatever I want to make my changes.  I simply
convert back to .CSV before committing the changes.  To me, this
means one less contentious issue for us to agree upon.

Norm

-----Original Message-----
From: [hidden email]
[mailto:[hidden email]] On Behalf Of Landon Blake
Sent: Wednesday, November 04, 2009 1:52 PM
To: Martin Davis; Frank Warmerdam (External)
Cc: [hidden email]
Subject: RE: [MetaCRS] Standard (and simple) format for conversion
tests.

Martin wrote: "On further reflection, perhaps the idea of having an
authority prefix string could also be accomplished by simply having
separate test files for each authority."

This sounds reasonable. It is really just a matter of duplicating some
simple text files, after all.

Landon
Office Phone Number: (209) 946-0268
Cell Phone Number: (209) 992-0658

 
 

-----Original Message-----
From: [hidden email]
[mailto:[hidden email]] On Behalf Of Martin Davis
Sent: Wednesday, November 04, 2009 12:42 PM
To: Frank Warmerdam
Cc: [hidden email]
Subject: Re: [MetaCRS] Standard (and simple) format for conversion
tests.

Sorry, this should have gone to the list.

Agreed, it would be nice to keep the number of CRS languages to a
minimum.

Perhaps the mapping could be done separately from the actual unit test, to
keep the unit test file format reasonably simple.  This is essentially
what CSMap does - it has a separate CSV file mapping between a whole set
of CRS systems.  I imagine every lib will end up with its own custom set
of mappings.
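
A per-library mapping of this kind could be as small as a lookup table kept outside the test file. The sketch below is an assumption about how one library might do it; the LIBRARY_MAPPING name and its single entry are illustrative only.

```python
# Hypothetical per-library mapping from test-file CRS IDs to native definitions.
LIBRARY_MAPPING = {
    "EPSG:32611": "+proj=utm +zone=11 +datum=WGS84",
}

def resolve(crs_id, mapping=LIBRARY_MAPPING):
    """Return the library's own definition for crs_id, or None to skip the test."""
    return mapping.get(crs_id)
```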

On further reflection, perhaps the idea of having an authority prefix
string could also be accomplished by simply having separate test files
for each authority.  (Similar to how PROJ4 represents its CS registry,
with separate files for epsg, esri, etc).  That might make it a bit
easier for different projects to use only the unit tests they are
capable of ingesting.  Also keeps the file sizes more manageable, and
allows a (minor) simplification to the file format by eliminating the
prefix string.

Martin


Frank Warmerdam wrote:
> Martin Davis wrote:
>> That looks pretty good to me, Frank.
>> It would be nice to have this data in an easy-to-parse
>> format.  CSMap uses CSV, which certainly meets this criterion (but
>> maybe fails the readability test).  The format you give is a bit more
>> work, but maybe not impossible to deal with.  And of course there's
>> always XML....  Keeping to a single-table structure would be highly
>> preferable, I think.
>
> Martin,
>
> I wasn't proposing my format which is really just a Python structure.
> I think XML might be nice in that it could be formatted to be more
> readable than a csv with lots of free text.  Ultimately the entries
> will likely be hand created.
>
>> As for CS definition, I'm beginning to appreciate what a can of worms
>> that is!  My temptation would be to be highly pragmatic about this.
>> I think a minimum is to have the notion of an authority and a code
>> (eg "epsg:12345" or "esri:3333").  This could be extended to allow
>> descriptions as well as code (eg "proj4:+proj=utm +zone=11" or
>> "Oracle-WKT:-----").  I think this encapsulates enough information
>> about the "language" being used to describe the CS to allow most libs
>> to grok the CSes they know about.
>
> This is a reasonable approach, though ideally there would not be many
> CRS languages used as it limits the universality of the test.
>
> I could imagine a way of providing a CRS *override* for a particular
> package.  So for instance if EPSG:3785 is not properly supported by
> PROJ.4 it would be nice to be able to specify the PROJ.4 string that
> it ought to use as an override.  To me this level of flexibility
> suggests XML.
>
> Hmm, I see this wasn't on the list.  Any particular reason?
> Feel free to forward my response on list.
>
> Best regards,

--
Martin Davis
Senior Technical Architect
Refractions Research, Inc.
(250) 383-3022

Martin Desruisseaux

Re: Standard (and simple) format for conversion tests.

In reply to this post by Norm Olsen
Hello all

I would like to support the project of a common test database as well. I could
contribute by porting the Geotk scripts if wanted. I have a few minor comments:

1) Maybe the tolerance should be specified for each axis, rather than one
    value applying to all axes. This is more useful when the target CRS is a
    geographic one, since the tolerance on longitude can become greater as we
    approach the pole. In the extreme case (a point at the North Pole or South
    Pole), the longitude is meaningless and should have a tolerance of +/- 180°.

2) Maybe the axis order should be "as the authority said" (useful for testing
    libraries to be used with recent WMS/WCS versions), rather than forced to
    "X Y Z" order. As a help for libraries that do not handle axis ordering,
    we could add a field telling what the order is for the current record.

3) EPSG does not provide an explicit test suite to my knowledge, but they provide
    some examples of coordinate conversions that can be used.
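
A per-axis tolerance check, as suggested in point 1, might look like the sketch below. The function name is hypothetical, the example assumes (longitude, latitude) order in degrees, and longitude wrap-around at +/- 180 is deliberately not handled.

```python
def within_axis_tolerances(actual, expected, tolerances):
    """True if every axis differs by no more than its own tolerance."""
    return all(
        abs(a - e) <= tol
        for a, e, tol in zip(actual, expected, tolerances)
    )

# At a pole the longitude tolerance can be opened up to 180 degrees,
# while the latitude tolerance stays tight.
polar_ok = within_axis_tolerances(
    actual=(37.0, 90.0), expected=(-120.0, 90.0), tolerances=(180.0, 1e-9)
)
```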

        Martin
Martin Davis

Re: Standard (and simple) format for conversion tests.

In reply to this post by Norm Olsen
Whew! Quite a shopping list, Norm.  But it all makes good sense.

I guess I agree about the CSV format.  It certainly works well for the
CSMap tests - I found it fairly easy to parse and use the test data
there.  One thing about CSV is that it doesn't make provision for
comments.  The '#' commenting convention in the TEST.DAT file looked
fairly useful. But it would be easy to adopt this convention and strip
out comments if required.  (I'll insert a plug here for JEQL - it is
handy for reading/transforming/subsetting CSV files)
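
Stripping '#' comments before handing the lines to a CSV parser takes only a few lines. The sketch below assumes comments occupy whole lines, and the sample record is made up. A quoted field that spans lines and happens to start a line with '#' would be mangled, so that case would need to be ruled out by the format.

```python
import csv
import io

def read_rows(text):
    """Yield CSV rows, skipping blank lines and lines starting with '#'."""
    lines = (
        line for line in io.StringIO(text)
        if line.strip() and not line.lstrip().startswith("#")
    )
    yield from csv.reader(lines)

SAMPLE = "# source: made-up example\nEPSG:4326,EPSG:32611,-118.0,24.0\n\n"
rows = list(read_rows(SAMPLE))
```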

The "test type" indicator is a very good point.  This is another thing
that it might be nice to physically segment the unit test files on, to
make it easy to only run test types of interest/capability for a given
lib.  But it's still good to have this as an explicit field in the test
description, since this allows the logical model of the tests to be
independent of their physical file organization.  (This also applies to
the authority/namespace identifier - and yes, I am reversing my position
from my previous email!)

For the environment indicator, how about a string with a delimited set
of keywords/tags?  This is likely to be somewhat freeform and
lib-specific, isn't it?  Or is there a clear idea of what would go here?
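
One way the delimited-tags idea could work is sketched below. The '|' delimiter, the tag names, and the should_skip helper are all assumptions for illustration; nothing here has been agreed on the list.

```python
def should_skip(required_tags, available_tags, delimiter="|"):
    """Return True if the test needs something this environment lacks.

    required_tags is the raw field from the test record; available_tags is a
    collection of lowercase tag names describing the local environment.
    """
    required = {t.strip().lower() for t in required_tags.split(delimiter) if t.strip()}
    return not required <= set(available_tags)

available = {"ntv2_canada"}
```

An empty field yields an empty set of requirements, so such a test is never skipped.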

I assume the disclaimer would take the form of a README file associated
with the test archive?

Now, how to move forward on this?  Perhaps:
- define a prototype format (on a wiki page)
- create a SVN location for this archive
- create some sample tests in the test format (perhaps by re-modelling
the CSMap tests?  Or using extracts from the test suites from other libs)
- and then all the lib teams can get to work and start creating test runners!

Martin


--
Martin Davis
Senior Technical Architect
Refractions Research, Inc.
(250) 383-3022

Martin Desruisseaux

Re: Standard (and simple) format for conversion tests.

In reply to this post by Martin Desruisseaux
Landon Blake wrote:
> Martin D wrote: " Maybe the tolerance should be specified for each axis, rather than one value applying to all axes. This is more useful when the target CRS is a geographic one, since the tolerance on longitude can become greater as we approach the pole. In the extreme case (a point at the North Pole or South Pole), the longitude is meaningless and should have a tolerance of +/- 180"
>
> Now that I think about it, why have the tolerances specified in the test data files at all? Let the programmer writing the tests that will use the files set his own tolerance values and determine which tolerance is most important to him/her.

Suggesting a tolerance can make the job easier for the tester (one less thing to
think about), especially since the tolerance depends on the unit of measurement,
the proximity of a pole in the case of a geographic CRS, whether or not the
operation involves a datum shift (to phrase that in ISO 19111 terms: whether the
operation is a "conversion" or a "transformation"), etc. However, I'm fine with
either approach (including or excluding them from the test file).
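
To illustrate why a longitude tolerance has to grow near a pole, here is a minimal sketch. It is not part of any proposed format: the function name, the spherical Earth approximation and the radius value are all assumptions made for the example.

```python
import math

# Assumed mean spherical radius in metres; an approximation chosen for this sketch.
EARTH_RADIUS_M = 6371000.0


def longitude_tolerance_deg(linear_tol_m, latitude_deg):
    """Convert a ground distance tolerance into a longitude tolerance in degrees."""
    # The radius of the parallel shrinks with cos(latitude).
    parallel_radius = EARTH_RADIUS_M * math.cos(math.radians(latitude_deg))
    if parallel_radius <= 0.0:
        return 180.0
    # Cap at 180 degrees: beyond that the longitude carries no information.
    return min(math.degrees(linear_tol_m / parallel_radius), 180.0)
```

With this approximation the same one metre tolerance allows twice as much longitude error at 60 degrees latitude as at the equator, and the result saturates at +/- 180 at the pole itself.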


> Martin D wrote: "Maybe the axis order should be "as the authority said" (useful for testing libraries to be used with recent WMS/WCS versions), rather than forced to "X Y Z" order. As a help for libraries that do not handle axis ordering, we could add a field telling what the order is for the current record."
>
> Good point. So maybe we do something like this:
>
> X:60321125.25 Y:2335688.21 Z:12.20

If we define "X" and "Y" as "first and second axis in a right-handed system",
I'm fine. We cannot say "X=Easting and Y=Northing" (except informally) because
that doesn't work at the poles (e.g. Polar Stereographic projections), while a
right-handed system is well defined everywhere.

        Martin

Landon Blake

RE: Standard (and simple) format for conversion tests.

Martin D wrote: "Suggesting a tolerance can make the job easier for the tester (one less thing to think about), especially since the tolerance depends on the unit of measurement, the proximity of a pole in the case of a geographic CRS, whether or not the operation involves a datum shift (to phrase that in ISO 19111 terms: whether the operation is a "conversion" or a "transformation"), etc. However, I'm fine with either approach (including or excluding them from the test file)."

OK. A suggested tolerance is probably a good idea. We then need to pick which one of the following to use, or list them all:

X Tolerance
Y Tolerance
Z Tolerance
2D Distance Tolerance
3D Distance Tolerance
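
As a rough sketch of how a test runner might evaluate any of the five candidates above, the following computes all of them and checks only the ones a test supplies. The function names and dictionary keys are made up for illustration; nothing here is an agreed format.

```python
import math


def deviations(expected, actual):
    """Return per-axis and distance deviations between two (x, y, z) tuples."""
    dx = abs(actual[0] - expected[0])
    dy = abs(actual[1] - expected[1])
    dz = abs(actual[2] - expected[2])
    return {
        "x": dx,
        "y": dy,
        "z": dz,
        "dist_2d": math.hypot(dx, dy),
        "dist_3d": math.sqrt(dx * dx + dy * dy + dz * dz),
    }


def within(expected, actual, tolerances):
    """Check only the tolerances the caller supplied, ignoring the rest."""
    found = deviations(expected, actual)
    return all(found[name] <= limit for name, limit in tolerances.items())
```

Listing them all in the file costs little with this approach, since a runner can simply ignore the entries it does not care about.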

Martin D wrote: "If we define "X" and "Y" as "first and second axis in a right-handed system", I'm fine. We cannot say "X=Easting and Y=Northing" (except informally) because that doesn't work at the poles (e.g. Polar Stereographic projections), while a right-handed system is well defined everywhere."

OK. We can specify the meaning of X, Y, and Z in the file format spec. I should have clarified my use of these terms for our discussion. In my use "x" meant easting in a grid system or longitude in a geographic system. "Y" meant northing in a grid system or latitude in a geographic system. Z meant height or elevation. Of course, I am a surveyor, so I like northing, easting, elevation. :]

If X, Y and Z are too "grid specific" we could just require that the test name each ordinate value. So instead of something like:

X:6522188.12 Y:2558223.12 Z:23.255

we could write:

Latitude:37-15-02.356 Longitude:-121-52-45.233 Ellipsoid_Height:120.23

Or:

Northing: 2558233.12 Easting:6522188.12 Elevation:23.255

This frees the data file from having to worry about enforcing a certain ordinate order. We can just label the ordinate values. That would allow us to expand the number of ordinates used to specify coordinate values in the future. Down the road we could then do something like this:

Northing: 2558233.12 Easting:6522188.12 Elevation:23.255 Epoch_Date:1991.35
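
Labelled ordinates like these are easy to read back. Here is a minimal sketch of a parser, assuming each ordinate is written as a label, a colon, optional spaces and a value with no embedded whitespace. The function name is hypothetical, and values are kept as text so that forms such as 37-15-02.356 can be interpreted by the caller.

```python
import re

# A label, a colon, optional spaces, then the value up to the next whitespace.
_ORDINATE = re.compile(r"(\w+):\s*(\S+)")


def parse_ordinates(line):
    """Return the labelled ordinates of one line as a dict of strings."""
    # Dicts keep insertion order, so the order written in the file is preserved.
    return {label: value for label, value in _ORDINATE.findall(line)}
```

Because the result is keyed by label, adding a new ordinate such as an epoch date would not break a reader that only looks up the labels it knows.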

Landon
Office Phone Number: (209) 946-0268
Cell Phone Number: (209) 992-0658
 
 

-----Original Message-----
From: Martin Desruisseaux [mailto:[hidden email]]
Sent: Wednesday, November 04, 2009 1:44 PM
To: Landon Blake
Cc: [hidden email]
Subject: Re: [MetaCRS] Standard (and simple) format for conversion tests.

Frank Warmerdam

Re: Standard (and simple) format for conversion tests.

In reply to this post by Martin Davis
Martin Davis wrote:
>
> Now, how to move forward on this?  Perhaps:
> - define a prototype format (on a wiki page)

Martin,

I'd be pleased to see a wiki page set up for this.  Currently we are
using the general OSGeo mediawiki for general MetaCRS work, so I suppose
that is where it would live linked off:

    http://wiki.osgeo.org/wiki/MetaCRS

At some point we might make this a "proper" subproject of MetaCRS,
assign a lead and set up a Trac instance so that we can easily file
tickets against the test suite.

> - create a SVN location for this archive

I have created http://svn.osgeo.org/metacrs/testsuite under which we
can do a variety of experiments till things settle down.

> - create some sample tests in the test format (perhaps by re-modelling
> the CSMap tests?  Or using extracts from the test suites from other libs)

Reasonable

> - and then all the lib teams can get to work and start creating
> testrunners!

Yes, sounds good.

Regarding other points raised:
  * I'm ok with .csv format.
  * I'd like to use recognisable textual names for classes of test,
    and requirements (like ntv2 files) rather than bitfields!
  * I agree with MartinDe that test values likely ought to be in the
    axis orientation defined for the coordinate system though I can see
    this being a headache.

Adding a few fields in the future should not be terribly hard so we don't
necessarily need to come out of the gate with a perfect solution.  A bit
of iteration is fine with me as long as we don't lose the work done.
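
As a rough illustration of textual requirement names in a .csv file, a test runner could skip tests whose requirements it cannot meet. Everything in this sketch is hypothetical: the column names, the sample rows and the function name are invented for the example and are not a proposal for the actual layout.

```python
import csv
import io

# Made-up sample; the column names and rows are illustrative only.
SAMPLE = """name,class,requires,src_crs,dst_crs
utm_forward,projection,,EPSG:4326,EPSG:32610
nad27_shift,datum_shift,ntv2,EPSG:4267,EPSG:4269
"""


def runnable_tests(csv_text, available):
    """Yield names of tests whose textual requirements are all available."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Requirements are space-separated words; an empty field means none.
        needed = set(row["requires"].split())
        if needed <= set(available):
            yield row["name"]
```

A library without grid shift support would call runnable_tests(SAMPLE, []) and run only the first test, while one that has the files would pass ["ntv2"] and run both. New columns can be added later without changing this kind of reader.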

Best regards,
--
---------------------------------------+--------------------------------------
I set the clouds in motion - turn up   | Frank Warmerdam, [hidden email]
light and sound - activate the windows | http://pobox.com/~warmerdam
and watch the world go round - Rush    | Geospatial Programmer for Rent

Martin Desruisseaux

Re: Standard (and simple) format for conversion tests.

In reply to this post by Landon Blake
Landon Blake a écrit :
> If X, Y and Z are too "grid specific" we could just require that the test name each ordinate value. So instead of something like:
>
> X:6522188.12 Y:2558223.12 Z:23.255
>
> Latitude:37-15-02.356 Longitude:-121-52-45.233 Ellipsoid_Height:120.23

I'm neutral on that. I'm completely fine with "X" and "Y" if their meanings are
defined. "Easting" / "Northing" is intuitive, but we need a bit more for some
CRS, hence my "right-handed system" proposal.

        Martin

Martin Davis

Tolerance values in standard test format

In reply to this post by Martin Desruisseaux
Is the notion of "tolerance" actually conflating two different requirements?
 
1. Capture the accuracy of the test data as supplied from some external
source.
2. Indicate the expected accuracy of a given implementation of a
projection operation.

#1 needs to be specified externally, obviously.  Possibly it could be
indicated by the supplied precision of the expected value?

#2 would seem to be library dependent.

The definition of tolerance seems to be getting quite complex.  As Frank
says, it would be nice to move ahead with creating test artifacts even
if the tolerance issue is not completely settled.
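
On the idea for #1 of reading accuracy from the supplied precision, here is one possible sketch. It assumes the expected value was rounded to the digits shown, so the implied tolerance is half a unit in the last written decimal place. That rule and the function name are assumptions, not something agreed here.

```python
from decimal import Decimal


def implied_tolerance(value_text):
    """Return half a unit in the last written decimal place of value_text."""
    # The exponent records how many decimal places were written, e.g. -2 for "1.25".
    exponent = Decimal(value_text).as_tuple().exponent
    return Decimal(5).scaleb(exponent - 1)
```

Under this rule an expected value written as 2558233.12 implies a tolerance of 0.005, and 23.255 implies 0.0005. The value has to be read as text for this to work, since converting to a float first loses the trailing zeros.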

--
Martin Davis
Senior Technical Architect
Refractions Research, Inc.
(250) 383-3022

Landon Blake

RE: Tolerance values in standard test format

I think number 1 will be subjective and very difficult to quantify.
Different "authorities" may have slightly different formulas used in
conversion, and different software packages will implement these
differently.

How do I know how good Trimble Business Center conversions are compared
to AutoCAD Map or OGR2OGR?

I think we stick to clearly identifying the source of the expected
coordinate values in the test data file and let the user make a judgment
call. As an alternative, we could test various software packages and
provide a comparison of coordinate conversions, but I think this should
be done as supplemental information, and not as part of the test file.

I think you are correct about number 2. This should be library
dependent. However, if Martin Desruisseaux thinks a tolerance is important, I say
we include a suggested total 3D distance (between actual and expected
coordinates) tolerance.

Landon
Office Phone Number: (209) 946-0268
Cell Phone Number: (209) 992-0658
 
 

-----Original Message-----
From: [hidden email]
[mailto:[hidden email]] On Behalf Of Martin Davis
Sent: Wednesday, November 04, 2009 2:35 PM
To: [hidden email]
Subject: [MetaCRS] Tolerance values in standard test format
