|
|
|
Nikos Alexandris
|
Dear grass-users and developers,
I was hesitant to send this on-line since I am a lousy scripter and by no means a programmer. I posted this off-list to some grass-devs. I apologise to them for this action. I need some advice with respect to python scripting (within grass) and if the code is working, naturally, I would like to share the script (...maybe my first useful script that I can share) on the grass-wiki. Ultimately, I wish to see in the future a (native if possible) grass program (written by a real programmer) that will implement the "Pareto Boundary", useful for the accuracy assessment of low resolution thematic maps. I have filed a trac ticket: http://trac.osgeo.org/grass/ticket/804 About: I wrote some code (for me it was/is a lot of code) in order to get omission and commission errors out of a series of low resolution dichotomic (classification derived from MODIS imagery) maps based on a high(-er) resolution reference map (Landsat7) and plot then (using R with a custom function) the "famous" Pareto Boundary [1] to compare and judge which classification map is best (and successively which classification method from the ones used performs better). Question: How is merging/ calling the several scripts in/ from within just one script done best? What strategy is best? I really have trouble to get this done. What is problematic: (a) Well, due to my ignorance and inexperience I have had some hard time to decide which of the stuff can be hardcoded and which should not. I have several hardcoded work-arounds since I don't know (yet) how to deal with them properly. (b) The part where vector points are counted takes TOO much. I stole the idea described in grass-wiki that uses v.what.vect and, if I am not wrong, the bottleneck is v.distance (as already discussed in the list some times in the past). (c) I am sure there are more problems... but I can't see them :-) Working code: I have 8 python scripts + 2 R which I have tested with small samples from the spearfish60 dataset [*]. I have no idea how I should proceed: - make it one single file (it's a bit hard since I still learn python stuff); - keep the separate and join them as modules in one module-script; - source variables, (grass) options and flags, etc. from a "pareto_options_and_flags.py" module in which variables and the rest can be set and sourced later to the module that will do the job Attachement in grass-trac [2] contains: ### I took quite some time to comment (almost) everything so the code is self-explained. Hopefully it really is like that - not sure since I am no programmer. ### - the 8 scripts [in 2] (step - by - step to extract the omission/commission errors and export them as csv); - R code to plot the boundary and the Oe,Ce of classification(s) [also in 2] (but I doubt my code is efficient/ smart). Dr. Nikos Koutsias helped me understand the Pareto Boundary. Whether I really did or not is, of course, my responsibility but I wouldn't have done anything without his help. Best regards, Nikos ------------------------------------------------------------- [*] Test using spearfish60 data: ------------------------------------------------------------- # testing with spearfish60 data grass64 /geo/grassdb/spearfish60/user1/ # set region to g.region s=4925000 e=593500 n=4927000 w=590000 res=30 -p # use sqlite db-backend db.connect driver=sqlite database=/geo/grassdb/spearfish60/PERMANENT/sqlite.db db.connect -p ### prepare input files # high resolution reference raster map will be g.copy rast=landcover.30m,landcover_ref # high redolution reference Class of Interest (to be assessed) will be r.mapcalc "landcover_refcoi = if(landcover.30m == 51 || landcover.30m == 71 || landcover.30m == 81 || landcover.30m == 92, 2, null())" # low resolution classification map 1 (to be assessed for accuracy) will be g.region res=100 -pa r.mapcalc "pareto_classification_rangeland_1 = if (vegcover == 2, 2, null())" # prepare a 2nd classification map g.copy rast=pareto_classification_rangeland_1,pareto_classification_rangeland_2 # edit/ change 2nd map d.rast.edit pareto_classification_rangeland_2 out=pareto_classification_rangeland_2_edited g.remove pareto_classification_rangeland_2 g.rename rast=pareto_classification_rangeland_2_edited,pareto_classification_rangeland_2 ### extract omission and commission errors for pareto-optimal maps and classifications #### --- optionally SKIP steps and use the "ONE-LINER" below --- #### # # step 1 #python pareto_1_vectorise_rasters.py reference_raster=landcover_ref reference_coi_rasters=landcover_refcoi classification_rasters=pareto_classification_rangeland_1,pareto_classification_rangeland_2 --o # step 2 #python pareto_2_create_lowres_vector_grid.py highres=30 lowres=100 --o && python # step 3 #pareto_3_count_pixels_within_gridcells.py --v && python # step 4 #pareto_4_calculate_coi_percentages.py --v && python # step 5 #pareto_5_populate_thresholds_and_classifications.py lowres=100 --v # step 6 #python pareto_6_populate_pareto_errors.py --v # step 7 #python pareto_7_populate_classification_errors.py --v # step 8 #python pareto_8_export_csv___spearfish60.py --o # #### --- optionally SKIP steps and use the "ONE-LINER" below --- #### # ONE-LINER (copy-paste carefully) python pareto_1_vectorise_rasters.py reference_raster=landcover_ref reference_coi_rasters=landcover_refcoi classification_rasters=pareto_classification_rangeland_1,pareto_classification_rangeland_2 --o --q && python pareto_2_create_lowres_vector_grid.py highres=30 lowres=100 --o --q && python pareto_3_count_pixels_within_gridcells.py --q && python pareto_4_calculate_coi_percentages.py --q && python pareto_5_populate_thresholds_and_classifications.py lowres=100 --q && python pareto_6_populate_pareto_errors.py --q && python pareto_7_populate_classification_errors.py --q && python pareto_8_export_csv___spearfish60.py --q # to remove pareto related vector files later execute "g.mremove vect=pareto* -f" ### use epoxrted csv files to plot the Pareto Boundary (using the R scripts within R) ------------------------------------------------------------- P.S. In my case I used the scripts for burned area maps. But this "boundary" could be useful for all kinds of dichotomic maps and theoretically even maps with multiple classes. Probably there are better/ smarter ways to implement all this. This was the first contact with python :-). All of it is related directly with my questions about python scripting posted in grass-user last month. --- [1] Analysis of the conflict between omission and commission in low spatial resolution dichotomic thematic products: The Pareto Boundary Luigi Boschetti, Stephane P. Flasse, Pietro A. Brivio; published in Remote Sensing of Environment 91 (2004) 280 – 292 My notes from the paper: • A set of Pareto-optimal classified images, with the properties defined in the previous paragraph, could be obtained starting from the high-resolution reference map (our ‘true’ data). • Each optimal classification will have a pair of error rates (Oe,Ce), that uniquely identify a point in the omission/commission Oe/Ce={(Oe,Ce)aR Â ROea[0,1], Cea[0,1]} bi-dimensional space (henceforth noted Oe/Ce). • The line joining all these points is the Pareto Boundary related to the specific high-resolution reference map and to a specific low-resolution pixel size. • As all the points belonging to the Pareto Boundary represent the error level of efficient classification, by the definition of efficiency given in the previous paragraph, it follows that the Pareto Boundary divides the Oe/Ce in two regions. • Any possible low-resolution thematic products can be associated only to the points in the region above or, at the most, on the Pareto Boundary. • The point (0,0), with neither omission nor commission error, will unfortunately lie in the unreachable region! • The terms ‘efficient’ or ‘optimal’ will be used as synonyms and will always mean efficient (optimal) under Pareto’s definition. [2] Python scripts and R code: http://trac.osgeo.org/grass/attachment/ticket/804/pareto_boundary_grass_python_R_scripts.tar.gz python scritps: - pareto_1_vectorise_rasters.py - pareto_2_create_lowres_vector_grid.py - pareto_3_count_pixels_within_gridcells.py - pareto_4_calculate_coi_percentages.py - pareto_5_populate_thresholds_and_classifications.py - pareto_6_populate_pareto_errors.py - pareto_7_populate_classification_errors.py - pareto_8_export_csv__spearfish60.py The steps implemented in the scripts are (with respect to my burned area maps): 1 Prepare low resolution grids 2 Count burned (or Class of Interest) pixels in each low resolution grid-cell 3 Convert cells to points ( gridcell -> centroids -> point) 4 Calculate percentages of burned (or Class of Interest) pixels for each low resolution grid-cell 5 Add and populate threshold columns (txx) based on burned pixels percentage 6 Add classification results (Update from classification maps) 7 Add and update columns with Oe, Ce (Omission error, Commission error) 8 Export Oe,Ce pairs as CSV R code [The 1st code is supposed to be ready to become a function. Currently all to be used "manually"!] - pareto_boundary.R - plot_pareto_boundary_plus_domination_regions.R _______________________________________________ grass-user mailing list [hidden email] http://lists.osgeo.org/mailman/listinfo/grass-user |
||||||||||||||||
|
Glynn Clements
|
Νίκος Αλεξανδρής wrote:
> How is merging/ calling the several scripts in/ from within just one > script done best? What strategy is best? I really have trouble to get > this done. It's hard to say without understanding likely workflows. However, if you want to make most of the code into Python modules, one suggestion is to have the body of the script (the code in the "if __name__ == '__main__':" block) read the various fields from the "options" and "flags" variables and pass them to the functions as parameters, rather than having the functions read global variables (Python doesn't actually have global variables; "global" variables are local to a module). BTW, this is wrong: flags = "-o",\ It happens to work at present, but the correct method to pass the "--o" flag is to use "overwrite = True" (analogous to "quiet = True" for the "--q" flag and "verbose = True" for the "--v" flag). If you want to enable these flags throughout the script, you can use: os.environ['GRASS_OVERWRITE'] = '1' # --o os.environ['GRASS_VERBOSE'] = '0' # --q os.environ['GRASS_VERBOSE'] = '3' # --v In general, it's best to leave these settings to the user. If the user runs the script with --o, --q, and/or --v, the corresponding environment variables will be set by grass.parser(). If the user has those variables set in their environment, the values will be inherited by the script. If you want to examine the settings use the overwrite() and/or verbosity() functions in grass.script. Also, for writing log messages, string.Template is preferable when there are multiple substitutions. E.g. rather than: print """\n + Vector grid of size %d rows x %d ( %s cells) columns created"""\ % (rows, cols, cells) using: print string.Template( """\n + Vector grid of size ${rows} rows x ${cols} (${cells} cells) columns created""" ).substitute(rows = rows, cols = cols, cells = cells) makes it easier to correlate the fields with the values. It also makes it easier to localise the message, as the fields can be re-ordered if needed. If you do a lot of this, consider a utility function, e.g.: def log(msg, **kwargs): grass.message(string.Template(msg).substitute(**kwargs)) I'm wondering whether grass.message() etc should be extended in this manner. -- Glynn Clements <[hidden email]> _______________________________________________ grass-user mailing list [hidden email] http://lists.osgeo.org/mailman/listinfo/grass-user |
||||||||||||||||
|
Nikos Alexandris
|
Νίκος Αλεξανδρής wrote:
> > How is merging/ calling the several scripts in/ from within just one > > script done best? What strategy is best? I really have trouble to get > > this done. On Sun, 2009-11-08 at 08:19 +0000, Glynn Clements wrote: > It's hard to say without understanding likely workflows. > > However, if you want to make most of the code into Python modules, one > suggestion is to have the body of the script (the code in the > "if __name__ == '__main__':" block) read the various fields from the > "options" and "flags" variables and pass them to the functions as > parameters, rather than having the functions read global variables > (Python doesn't actually have global variables; "global" variables are > local to a module). OK. > BTW, this is wrong: > > flags = "-o",\ > > It happens to work at present, but the correct method to pass the > "--o" flag is to use "overwrite = True" (analogous to "quiet = True" > for the "--q" flag and "verbose = True" for the "--v" flag). OK, thanks! Maybe this can/ should be added in the wiki(?). > If you want to enable these flags throughout the script, you can use: > > os.environ['GRASS_OVERWRITE'] = '1' # --o > os.environ['GRASS_VERBOSE'] = '0' # --q > os.environ['GRASS_VERBOSE'] = '3' # --v > > In general, it's best to leave these settings to the user. Sometimes it is required to use them and prevent the user from _not_ using e.g. the "--o" flag (I think... ?). > If the user > runs the script with --o, --q, and/or --v, the corresponding > environment variables will be set by grass.parser(). If the user has > those variables set in their environment, the values will be inherited > by the script. > > If you want to examine the settings use the overwrite() and/or > verbosity() functions in grass.script. # note-to-self: # check functions overwrite(), verbosity() in grass.script > Also, for writing log messages, string.Template is preferable when > there are multiple substitutions. E.g. rather than: > > print """\n + Vector grid of size %d rows x %d ( %s cells) columns created"""\ > % (rows, cols, cells) > > using: > print string.Template( > """\n + Vector grid of size ${rows} rows x ${cols} (${cells} cells) columns created""" > ).substitute(rows = rows, cols = cols, cells = cells) > > makes it easier to correlate the fields with the values. It also makes > it easier to localise the message, as the fields can be re-ordered if > needed. That is (very) true. Cool! > If you do a lot of this, consider a utility function, e.g.: > > def log(msg, **kwargs): > grass.message(string.Template(msg).substitute(**kwargs)) This is supposed to reduce typing effort, right? > I'm wondering whether grass.message() etc should be extended in this > manner. Thank you for your time Glynn, Nikos _______________________________________________ grass-user mailing list [hidden email] http://lists.osgeo.org/mailman/listinfo/grass-user |
||||||||||||||||
|
Glynn Clements
|
Νίκος Αλεξανδρής wrote: > > If you want to enable these flags throughout the script, you can use: > > > > os.environ['GRASS_OVERWRITE'] = '1' # --o > > os.environ['GRASS_VERBOSE'] = '0' # --q > > os.environ['GRASS_VERBOSE'] = '3' # --v > > > > In general, it's best to leave these settings to the user. > > Sometimes it is required to use them and prevent the user from _not_ > using e.g. the "--o" flag (I think... ?). The presence or absence of --o should definitely be honoured. Ideally, a script should try to use unique names for any temporary maps it creates, so that it doesn't inadvertently overwrite the user's maps (using "tmp" and the PID in the mapname is sufficient, e.g. mapname.tmp.<pid>; if the user gives their own maps such names, tough). Using overwrite=True to overwrite temporary maps created within the script is okay, but the final output maps should honour the --o and $GRASS_OVERWRITE settings, i.e. don't complain about overwriting maps if --o is used, and don't overwrite maps if it wasn't. The script shouldn't need to handle this explicitly. If the --o flag was given, GRASS_OVERWRITE will have been set to "1" by grass.parser(), and this will automatically be inherited by any child processes. The only thing that the script needs to do is to *not* explicitly use overwrite=True on any commands which create the final output maps, but allow their overwrite behaviour be controlled by $GRASS_OVERWRITE. > > If the user > > runs the script with --o, --q, and/or --v, the corresponding > > environment variables will be set by grass.parser(). If the user has > > those variables set in their environment, the values will be inherited > > by the script. > > > > If you want to examine the settings use the overwrite() and/or > > verbosity() functions in grass.script. > > # note-to-self: > # check functions overwrite(), verbosity() in grass.script It's seldom necessary to check these explicitly. The settings will be inherited by child processes, and if you use grass.message(), grass.verbose(), etc for messages, whether or not the messages are displayed depends upon the $GRASS_VERSBOSE setting. > > If you do a lot of this, consider a utility function, e.g.: > > > > def log(msg, **kwargs): > > grass.message(string.Template(msg).substitute(**kwargs)) > > This is supposed to reduce typing effort, right? Yes, although it may also make the code easier to read and modify. -- Glynn Clements <[hidden email]> _______________________________________________ grass-user mailing list [hidden email] http://lists.osgeo.org/mailman/listinfo/grass-user |
||||||||||||||||
| Free Embeddable Forum Powered by Nabble | Help |