|
|
|
Soeren Gebbert-2
|
Was: Re: The tomcat shut down when encounter some error grass commond
2009/9/24 Glynn Clements <[hidden email]>: > > Soeren Gebbert wrote: > >> > It isn't just that it's a lot of work to change it, but that it >> > potentially causes problems for code which uses the libraries (e.g. >> > simply making the variables non-"static" pollutes the global >> > namespace). >> >> Sorry, i did not quite understand what "pollute the global namespace" >> exactly means, >> well i am not a native speaker and not a C expert, so can you please >> provide an example? :) > > If two libraries both export a symbol named e.g. "name", you'll get a > conflict if you use both libraries from the same program. Depending > upon the platform, you may get an error, or you may end up with the > variables being merged into a single variable. > > This isn't a problem if the variable isn't exported (i.e. declared > with the "static" qualifier). But you have can't use "static" if the > variable needs to be accessible in more than one file. Thanks for the explanation, now i understand. > >> > That could be gotten around by moving all of the static data for each >> > library into a single "state" variable. That in itself isn't an issue, >> > but you have to remain vigilant for developers adding new static >> > variables ad-hoc. >> >> An other approach might be the implementation of reset functions in >> every source file which uses >> static variables. The reset function can be called for a fine grain >> reset approach or all together >> in a single reset function which collects all distributed reset functions. >> >> i.e: >> New reset functions like: >> >> G_reset_geodesic_equation in lib/gis/geodesic.c >> G_reset_window in lib/gis/get_window.c >> G_reset_cell_area_calculations in lib/gis/area.c >> >> And so on ... > > That will get around the namespace issue. > > However, it doesn't really help with multithreading. For that, you Indeed. > would want to store all of the static data in a library in a state > structure, and give each thread a pointer to a separate state > structure (e.g. pthread_setspecific). IMHO this approach will result in a re-design and re-write of most of the grass libraries and an update of more than 300 modules ... . Looks like a 2 1/2 years full time job for one developer ... any sponsors available? :) I think implementing the reset functions is the first step to use grass in long running applications. Within those application it must be assured that all the grass library functions are called from within the same thread, using a producer-consumer pattern. Reading and writing of raster and vector maps is performed in serial by the producer thread using a queue for all requests from consumer threads. The producer thread will call the reset functions after each processed request. A consumer thread will always request full raster and vector maps. :/ So the processing of the data read into the memory will be independent from the producer thread. Read and write calls are queued and the consumer threads need to wait for the producer thread. Well, that sounds quite complex. Glynn, if i remember correctly you described a similar approach for a multi-threaded raster library some time ago. Are there any plans to implement such an design in the grass7 raster lib? I have no experience with such an approach, but i would like to realize it in a Java application ... someday. :) > >> It would also be great to open raster maps without setting the global >> window variable >> but providing a pointer of a region to the opening function? >> Glynn do you think this is possible without an overhead of work on >> other functions? > > The main problem here is that the mapping between the region grid and > the raster grid is recalculated whenever you change the region, so > that all maps are read and written at a single resolution. That isn't > a particularly common operation, but it is used (e.g. r.resamp.*) and > needs to be supported in some way. I see. > > This is one area which could do with an overhaul, but it's a > significant architectural change rather than modifying a few > functions. Thanks for your estimation. I will have a look at the raster io code to evaluate the effort and which function must be changed or new implemented without breaking the compatibility. Best regards Soeren > > -- > Glynn Clements <[hidden email]> > _______________________________________________ grass-dev mailing list [hidden email] http://lists.osgeo.org/mailman/listinfo/grass-dev |
||||||||||||||||
|
Glynn Clements
|
Soeren Gebbert wrote: > >> An other approach might be the implementation of reset functions in > >> every source file which uses > >> static variables. The reset function can be called for a fine grain > >> reset approach or all together > >> in a single reset function which collects all distributed reset functions. > >> > >> i.e: > >> New reset functions like: > >> > >> G_reset_geodesic_equation in lib/gis/geodesic.c > >> G_reset_window in lib/gis/get_window.c > >> G_reset_cell_area_calculations in lib/gis/area.c > >> > >> And so on ... > > > > That will get around the namespace issue. > > > > However, it doesn't really help with multithreading. For that, you > > Indeed. > > > would want to store all of the static data in a library in a state > > structure, and give each thread a pointer to a separate state > > structure (e.g. pthread_setspecific). > > IMHO this approach will result in a re-design and re-write of most of > the grass libraries > and an update of more than 300 modules ... . > Looks like a 2 1/2 years full time job for one developer ... any > sponsors available? :) It will require refactoring the libraries, but it doesn't require any interface changes. I'm not suggesting making all of the functions take a pointer to the state as a parameter, just making it thread-local. > I think implementing the reset functions is the first step to use > grass in long running applications. > Within those application it must be assured that all the grass library > functions are called from > within the same thread, using a producer-consumer pattern. It shouldn't be necessary to be quite that restrictive. 7.0 already includes changes which make it practical to use multiple threads. You can't e.g. read or write a single map from multiple threads, but reading and writing different maps in different threads should work, as should functions which only query the state (first-use initialisations are protected by a mutex, so so you don't need to worry about concurrent read operations both trying to initialise the state). However, the error handling is probably a bigger issue. Pushing error handling onto the modules isn't an acceptable solution. Simply allowing the fatal error handler to longjmp() out then resume using the GRASS libraries would be non-trivial, as you would have to repair any inconsistencies in the library state. Allowing G_fatal_error() to return is enough work that it can probably be ruled out. Apart from changing every single call (I count 520 references in lib/*), almost every public function would need two versions: one which returns an error code and one which treats errors as fatal (i.e. only returns upon success). > Glynn, if i remember correctly you described a similar approach for a > multi-threaded raster library > some time ago. Are there any plans to implement such an design in the > grass7 raster lib? > I have no experience with such an approach, but i would like to > realize it in a Java application ... someday. :) Built-in multi-threading is currently "blue sky", i.e. I know what the main issues are but I don't have any plan to solve them. The main issue for concurrent reading is that the raster library caches the current row, so that upscaling doesn't read and decode each row multiple times. That's problematic if you want multiple threads reading the same map. -- Glynn Clements <[hidden email]> _______________________________________________ grass-dev mailing list [hidden email] http://lists.osgeo.org/mailman/listinfo/grass-dev |
||||||||||||||||
|
hamish-2
|
Glynn wrote:
> The main issue for concurrent reading is that the raster > library caches the current row, so that upscaling doesn't > read and decode each row multiple times. That's problematic > if you want multiple threads reading the same map. to me multi-threaded multi-map read does not seem so exciting. typically both maps are in the same GISDBASE, so on the part of the same disk drive, which can only physically read one thing at a time, and asking the drive to jump around non-sequentially to read two maps at once would just slow the overall read down. I've got nothing against making the libs thread-safe, but personally I'm much more intersted in preping the processor-bound modules for multi-threading as I see more gains there vs a slight speedup in I/O-bound libs. but then I'm not trying to keep the libs running long term :) and don't know how the vector cleaning tasks fit into this. Hamish _______________________________________________ grass-dev mailing list [hidden email] http://lists.osgeo.org/mailman/listinfo/grass-dev |
||||||||||||||||
|
Glynn Clements
|
Hamish wrote: > > The main issue for concurrent reading is that the raster > > library caches the current row, so that upscaling doesn't > > read and decode each row multiple times. That's problematic > > if you want multiple threads reading the same map. > > to me multi-threaded multi-map read does not seem so exciting. > typically both maps are in the same GISDBASE, so on the part of > the same disk drive, which can only physically read one thing > at a time, and asking the drive to jump around non-sequentially > to read two maps at once would just slow the overall read down. That assumes that the data isn't buffered in RAM. Often, the file will have been created by the previous command, an will stil be in RAM, so read() is just copying RAM. Also, the decoding and rescaling overhead may mean that the process isn't actually I/O bound. And if you have more than one input map, or you're working on a multi-user system, the I/O will typically be non-sequential anyhow. -- Glynn Clements <[hidden email]> _______________________________________________ grass-dev mailing list [hidden email] http://lists.osgeo.org/mailman/listinfo/grass-dev |
||||||||||||||||
|
Soeren Gebbert-2
|
In reply to this post
by Glynn Clements
Hello Glynn,
2009/9/26 Glynn Clements <[hidden email]>: > > Soeren Gebbert wrote: > >> >> An other approach might be the implementation of reset functions in >> >> every source file which uses >> >> static variables. The reset function can be called for a fine grain >> >> reset approach or all together >> >> in a single reset function which collects all distributed reset functions. >> >> >> >> i.e: >> >> New reset functions like: >> >> >> >> G_reset_geodesic_equation in lib/gis/geodesic.c >> >> G_reset_window in lib/gis/get_window.c >> >> G_reset_cell_area_calculations in lib/gis/area.c >> >> >> >> And so on ... >> > >> > That will get around the namespace issue. >> > >> > However, it doesn't really help with multithreading. For that, you >> >> Indeed. >> >> > would want to store all of the static data in a library in a state >> > structure, and give each thread a pointer to a separate state >> > structure (e.g. pthread_setspecific). >> >> IMHO this approach will result in a re-design and re-write of most of >> the grass libraries >> and an update of more than 300 modules ... . >> Looks like a 2 1/2 years full time job for one developer ... any >> sponsors available? :) > > It will require refactoring the libraries, but it doesn't require any > interface changes. > > I'm not suggesting making all of the functions take a pointer to the > state as a parameter, just making it thread-local. Ok. To my shame i have to admit that i never heard of the thread-local mechanism before. After a quick look at wikipedia i understand the principal and it sounds great! This will speed up things a lot. I guess we need to use the pthread version of thread-local to support other compiler than gcc and windows too? > >> I think implementing the reset functions is the first step to use >> grass in long running applications. >> Within those application it must be assured that all the grass library >> functions are called from >> within the same thread, using a producer-consumer pattern. > > It shouldn't be necessary to be quite that restrictive. 7.0 already > includes changes which make it practical to use multiple threads. You > can't e.g. read or write a single map from multiple threads, but > reading and writing different maps in different threads should work, > as should functions which only query the state (first-use > initialisations are protected by a mutex, so so you don't need to > worry about concurrent read operations both trying to initialise the > state). Great news, i will test this with some java code soon. I have therefor ported the vtkGRASSBridge to grass7. > > However, the error handling is probably a bigger issue. Pushing error > handling onto the modules isn't an acceptable solution. Indeed. This was the next issue i would like to talk about. > Simply allowing the fatal error handler to longjmp() out then resume > using the GRASS libraries would be non-trivial, as you would have to > repair any inconsistencies in the library state. Is there an alternative to longjmp() and setjmp()? It seems to be quite complex, the man page warns about the usage. And i never used it before. > Allowing G_fatal_error() to return is enough work that it can probably > be ruled out. Apart from changing every single call (I count 520 > references in lib/*), almost every public function would need two > versions: one which returns an error code and one which treats errors > as fatal (i.e. only returns upon success). 520 calls are indeed a lot. The raster and gis libraries all together have 70 calls and the vector and db libraries have 190 calls. Glynn, if you can point me to a concrete implementation concept, i would like to start to patch the gis, raster, vector and db libraries in grass7. Maybe we can use signals to set an error variable in the resume error function? I focus on those libraries, because i have currently applications in mind which implements raster and vector algorithms in highly multi threaded environments (servlet or EJB container) and high performance visualization applications (VTK/Paraview). I believe the future of grass can be the backbone of many WPS server, if we can get the core libs thread safe and ready for long running processes. >> Glynn, if i remember correctly you described a similar approach for a >> multi-threaded raster library >> some time ago. Are there any plans to implement such an design in the >> grass7 raster lib? >> I have no experience with such an approach, but i would like to >> realize it in a Java application ... someday. :) > > Built-in multi-threading is currently "blue sky", i.e. I know what the > main issues are but I don't have any plan to solve them. > > The main issue for concurrent reading is that the raster library > caches the current row, so that upscaling doesn't read and decode each > row multiple times. That's problematic if you want multiple threads > reading the same map. Reading single raster maps in different threads is just great. Everything else is like icing on the cake. Best regards Soeren > > -- > Glynn Clements <[hidden email]> > _______________________________________________ grass-dev mailing list [hidden email] http://lists.osgeo.org/mailman/listinfo/grass-dev |
||||||||||||||||
|
Soeren Gebbert-2
|
In reply to this post
by hamish-2
Hello Hamish,
2009/9/26 Hamish <[hidden email]>: > Glynn wrote: >> The main issue for concurrent reading is that the raster >> library caches the current row, so that upscaling doesn't >> read and decode each row multiple times. That's problematic >> if you want multiple threads reading the same map. > > to me multi-threaded multi-map read does not seem so exciting. > typically both maps are in the same GISDBASE, so on the part of > the same disk drive, which can only physically read one thing > at a time, and asking the drive to jump around non-sequentially > to read two maps at once would just slow the overall read down. > I've got nothing against making the libs thread-safe, but > personally I'm much more intersted in preping the processor-bound > modules for multi-threading as I see more gains there vs a > slight speedup in I/O-bound libs. I think that some algorithms are only parallelizable if the I/O runs in parallel too. Especially in case that the algorithms are based on row processing. Otherwise you need to read huge parts of the raster maps in RAM to benefit from multithreading. The gpde and gmath libraries are already partly multi threaded with OpenMP. Especially the blas level 2 and 3 functions and the krylov-subspace linear equation solver (cg, pcg and bicgstab). But you will only have a benefit from multi threading in case the data you like to process is large enough and the compiler provides excellent OpenMP support (intel!). If the data is not large enough the time the threads needs to spawn may take longer then processing the data (in case you are not using a thread pool). Most of the time you need to rewrite the algorithm to enable good speedup on multiple threads, thats the experience i have. My hope is that grass will benefit from the parallel VTK code using the vtkGRASSBridge. The VTK library ships many multithreaded and MPI parallelized image and vector processing algorithms. AFAIK they support the processing of tiles on a cluster. Best regards Soeren > > but then I'm not trying to keep the libs running long term :) > and don't know how the vector cleaning tasks fit into this. > > > Hamish > > > > > grass-dev mailing list [hidden email] http://lists.osgeo.org/mailman/listinfo/grass-dev |
||||||||||||||||
|
Glynn Clements
|
In reply to this post
by Soeren Gebbert-2
Soeren Gebbert wrote: > > I'm not suggesting making all of the functions take a pointer to the > > state as a parameter, just making it thread-local. > > Ok. > To my shame i have to admit that i never heard of the thread-local > mechanism before. > After a quick look at wikipedia i understand the principal and it sounds great! > This will speed up things a lot. > I guess we need to use the pthread version of thread-local to support > other compiler than gcc and windows too? You would need to conditionalise it. The usual mechanism is like that used for errno. In a single-threaded implementation, it's just a variable. In a multi-threaded implementation, it's a macro which expands to (*errno_location()), where errno_location() retrieves the address using pthread_getspecific(). > > However, the error handling is probably a bigger issue. Pushing error > > handling onto the modules isn't an acceptable solution. > > Indeed. This was the next issue i would like to talk about. > > > Simply allowing the fatal error handler to longjmp() out then resume > > using the GRASS libraries would be non-trivial, as you would have to > > repair any inconsistencies in the library state. > > Is there an alternative to longjmp() and setjmp()? Not really. > It seems to be quite complex, the man page warns about the usage. > And i never used it before. longjmp() is conceptually similar to raising an exception in C++, while setjmp() is equivalent to establishing a try/catch block. The details are quite simple if you understand how C is implemented in terms of machine code. setjmp() essentially saves the most important CPU registers (including the program counter, stack pointer, and frame pointer), while longjmp() restores them. So setjmp() records the current execution state while longjmp() restores it (similar to save/load in a game). Most of the complexities and warnings relate to potential interactions with optimisation. Primarily, local variables in the function which calls setjmp() aren't guaranteed to be restored to the correct value by longjmp(). gcc warns you if this might occur. Using the "volatile" qualifier can help here. The other caveat is that you can't "wrap" setjmp(). The saved state ceases to become valid once you leave the function which called setjmp(), so you can only call longjmp() from within a "descendent" of the function which calls setjmp(). In terms of using it to recover from a fatal error, the usage would be something along the lines of: jmp_buf save; int my_handler(const char *msg, int fatal) { print_error(msg, fatal); longjmp(save, 1); return 0; /* can't happen; longjmp() doesn't return */ } void main_loop(void) { volatile int done; G_set_error_routine(my_handler); for (done = 0; !done; ) { if (setjmp(save) != 0) continue; /* fatal error happened */ done = do_next_action(); } G_unset_error_routine(); } A common idiom is to call setjmp() in the top-level loop, at the beginning of each "action", and have the fatal error handler call longjmp(). If an error occurs during the execution of an action, the longjmp() will jump back out to the main loop which can then process the next action. An example can be found in lib/driver/main.c (in 6.x), where setjmp() and longjmp() are used to to recover from SIGPIPE, so that the monitor doesn't terminate if the client terminates prematurely. > > Allowing G_fatal_error() to return is enough work that it can probably > > be ruled out. Apart from changing every single call (I count 520 > > references in lib/*), almost every public function would need two > > versions: one which returns an error code and one which treats errors > > as fatal (i.e. only returns upon success). > > 520 calls are indeed a lot. The raster and gis libraries all together > have 70 calls and > the vector and db libraries have 190 calls. > Glynn, if you can point me to a concrete implementation concept, i > would like to start to patch the gis, raster, vector and db libraries > in grass7. > Maybe we can use signals to set an error variable in the resume error function? The least invasive approach is to perform clean-up before calling G_fatal_error(), so that subsequent operations don't crash GRASS, and rely upon the application registering an error handler which longjmp()s out. G_fatal_error() can't be made to return, as that would break all of the modules which use it. And library functions which don't return on error can't be changed to return. You *could* replace existing functions with a wrapper around a version which returns on error. The original function would be modified to return a status indication upon error, and the wrapper would just call the modified version and call G_fatal_error() in the event of an error. Functions which want to handle the error themselves would call the lower-level function. The main problems here are: 1. The sheer number of such functions. 2. The functions may rely upon other functions which currently call G_fatal_error(). So you would have to make similar changes to the underlying functions, then modify the calling function to allow for the fact that these functions can fail. 3. Reporting errors where the error message includes data from local variables. One option here would be to give the underlying function a "fatal" parameter, and add a G_error() function which takes an extra parameter indicating whether to terminate. All things considered, making it safe to longjmp() out of the fatal error handler is would be a lot less work. > > The main issue for concurrent reading is that the raster library > > caches the current row, so that upscaling doesn't read and decode each > > row multiple times. That's problematic if you want multiple threads > > reading the same map. > > Reading single raster maps in different threads is just great. Everything else > is like icing on the cake. BTW, you can read the same map from multiple threads provided that you open it once for each thread. r.mapcalc only opens each map once, but it uses a mutex to prevent concurrent access. -- Glynn Clements <[hidden email]> _______________________________________________ grass-dev mailing list [hidden email] http://lists.osgeo.org/mailman/listinfo/grass-dev |
||||||||||||||||
|
Soeren Gebbert-2
|
Hi Glynn,
thanks a lot for your response. After reading some documentation and asking "silly" questions to my poor informatics colleagues, i understand the concept of thread local and the setjmp()/longjmp() approach a bit better. I would suggest to add longjmp() to G_fatal_error(). It should be set at runtime by an application if longjmp() should be chosen or not. So G_fatal_error() will either call longjmp() or exit(). The setjmp() code goes into the application which calls the grass library functions, except if nested setjmp()/longjmp() calls are needed in grass to clean up data, or to close open file descriptors. The linux threaded errno definition scared me, so i have chosen a different approach. We define thread local support and two extern variables in gis.h to choose at runtime if G_fatal_error() will call exit() or longjmp() and to add thread local support. Example which works for me in my test code: /*Thread local and setjmp() exception support*/ #include <setjmp.h> #ifdef WIN32 #define Thread __declspec( thread ) #else #define Thread __thread #endif extern Thread jmp_buf G_stack_buffer; /*to save the most important CPU register for each thread*/ extern int G_long_run; /*Set to 1 to choose the setjmp() version of G_fatal_error()*/ The G_long_run variable will be initialized in gisinit.c and so the G_stack_buff: int G_long_run; Thread jmp_buf G_stack_buffer; ... void G__gisinit(const char *version, const char *pgm) { const char *mapset; if (initialized) return; G_long_run = 0; ... The application has to set G_long_run right after calling G_gisinit() from a single thread (i.e: a singleton). Now we need to patch error.c to use longjmp() or exit(): void G_fatal_error(const char *msg, ...) { va_list ap; va_start(ap, msg); vfprint_error(ERR, msg, ap); va_end(ap); if(G_long_run == 1) longjmp(G_stack_buffer, 1); else exit(EXIT_FAILURE); } The C++ application code may look like this: extern "C" { #include <grass/gis.h> } ... int G_long_run; Thread jmp_buf G_stack_buffer; vtkGRASSInit::vtkGRASSInit() { G_gisinit("vtkGRASSBridge"); // Set the long run variable to provide long run support in grass libraries G_long_run = 1; } ... /*Open a vector map*/ ... if(!setjmp(G_stack_buffer)) { if (1 > Vect_open_new(&this->map, name, with_z)) { fprintf(stderr, "class: %s line: %i Unable to open vector map <%s>.", this->GetClassName(), __LINE__, name); return false; } } else { fprintf(stderr, "class: %s line: %i Unable to open vector map <%s>.", this->GetClassName(), __LINE__, name);; return false; } ... That's all. Is this approach ok or to simple or just naive? :) If this is ok, i would like to test this approach to identify possible nested setjmp()/longjmp() calls in libgis, libraster and libvector. Additionally i will try to make most of the static variable thread local. Best regards Soeren 2009/9/27 Glynn Clements <[hidden email]>: > > Soeren Gebbert wrote: > >> > I'm not suggesting making all of the functions take a pointer to the >> > state as a parameter, just making it thread-local. >> >> Ok. >> To my shame i have to admit that i never heard of the thread-local >> mechanism before. >> After a quick look at wikipedia i understand the principal and it sounds great! >> This will speed up things a lot. >> I guess we need to use the pthread version of thread-local to support >> other compiler than gcc and windows too? > > You would need to conditionalise it. The usual mechanism is like that > used for errno. In a single-threaded implementation, it's just a > variable. In a multi-threaded implementation, it's a macro which > expands to (*errno_location()), where errno_location() retrieves the > address using pthread_getspecific(). > >> > However, the error handling is probably a bigger issue. Pushing error >> > handling onto the modules isn't an acceptable solution. >> >> Indeed. This was the next issue i would like to talk about. >> >> > Simply allowing the fatal error handler to longjmp() out then resume >> > using the GRASS libraries would be non-trivial, as you would have to >> > repair any inconsistencies in the library state. >> >> Is there an alternative to longjmp() and setjmp()? > > Not really. > >> It seems to be quite complex, the man page warns about the usage. >> And i never used it before. > > longjmp() is conceptually similar to raising an exception in C++, > while setjmp() is equivalent to establishing a try/catch block. > > The details are quite simple if you understand how C is implemented in > terms of machine code. setjmp() essentially saves the most important > CPU registers (including the program counter, stack pointer, and frame > pointer), while longjmp() restores them. So setjmp() records the > current execution state while longjmp() restores it (similar to > save/load in a game). > > Most of the complexities and warnings relate to potential interactions > with optimisation. Primarily, local variables in the function which > calls setjmp() aren't guaranteed to be restored to the correct value > by longjmp(). gcc warns you if this might occur. Using the "volatile" > qualifier can help here. > > The other caveat is that you can't "wrap" setjmp(). The saved state > ceases to become valid once you leave the function which called > setjmp(), so you can only call longjmp() from within a "descendent" of > the function which calls setjmp(). > > In terms of using it to recover from a fatal error, the usage would be > something along the lines of: > > jmp_buf save; > > int my_handler(const char *msg, int fatal) { > print_error(msg, fatal); > longjmp(save, 1); > return 0; /* can't happen; longjmp() doesn't return */ > } > > void main_loop(void) { > volatile int done; > G_set_error_routine(my_handler); > for (done = 0; !done; ) { > if (setjmp(save) != 0) > continue; /* fatal error happened */ > done = do_next_action(); > } > G_unset_error_routine(); > } > > A common idiom is to call setjmp() in the top-level loop, at the > beginning of each "action", and have the fatal error handler call > longjmp(). If an error occurs during the execution of an action, the > longjmp() will jump back out to the main loop which can then process > the next action. > > An example can be found in lib/driver/main.c (in 6.x), where setjmp() > and longjmp() are used to to recover from SIGPIPE, so that the monitor > doesn't terminate if the client terminates prematurely. > >> > Allowing G_fatal_error() to return is enough work that it can probably >> > be ruled out. Apart from changing every single call (I count 520 >> > references in lib/*), almost every public function would need two >> > versions: one which returns an error code and one which treats errors >> > as fatal (i.e. only returns upon success). >> >> 520 calls are indeed a lot. The raster and gis libraries all together >> have 70 calls and >> the vector and db libraries have 190 calls. >> Glynn, if you can point me to a concrete implementation concept, i >> would like to start to patch the gis, raster, vector and db libraries >> in grass7. >> Maybe we can use signals to set an error variable in the resume error function? > > The least invasive approach is to perform clean-up before calling > G_fatal_error(), so that subsequent operations don't crash GRASS, and > rely upon the application registering an error handler which > longjmp()s out. > > G_fatal_error() can't be made to return, as that would break all of > the modules which use it. And library functions which don't return on > error can't be changed to return. > > You *could* replace existing functions with a wrapper around a version > which returns on error. The original function would be modified to > return a status indication upon error, and the wrapper would just call > the modified version and call G_fatal_error() in the event of an > error. Functions which want to handle the error themselves would call > the lower-level function. > > The main problems here are: > > 1. The sheer number of such functions. > > 2. The functions may rely upon other functions which currently call > G_fatal_error(). So you would have to make similar changes to the > underlying functions, then modify the calling function to allow for > the fact that these functions can fail. > > 3. Reporting errors where the error message includes data from local > variables. One option here would be to give the underlying function a > "fatal" parameter, and add a G_error() function which takes an extra > parameter indicating whether to terminate. > > All things considered, making it safe to longjmp() out of the fatal > error handler is would be a lot less work. > >> > The main issue for concurrent reading is that the raster library >> > caches the current row, so that upscaling doesn't read and decode each >> > row multiple times. That's problematic if you want multiple threads >> > reading the same map. >> >> Reading single raster maps in different threads is just great. Everything else >> is like icing on the cake. > > BTW, you can read the same map from multiple threads provided that you > open it once for each thread. > > r.mapcalc only opens each map once, but it uses a mutex to prevent > concurrent access. > > -- > Glynn Clements <[hidden email]> > grass-dev mailing list [hidden email] http://lists.osgeo.org/mailman/listinfo/grass-dev |
||||||||||||||||
|
Glynn Clements
|
Soeren Gebbert wrote: > After reading some documentation and asking "silly" questions to > my poor informatics colleagues, i understand the concept of thread local and > the setjmp()/longjmp() approach a bit better. > > I would suggest to add longjmp() to G_fatal_error(). > It should be set at runtime by an application if longjmp() should be > chosen or not. > So G_fatal_error() will either call longjmp() or exit(). There's no need to modify G_fatal_error(). The application can install an error handler with G_set_error_routine(), and the error handler can use longjmp() to avoid returning. > The linux threaded errno definition scared me, so i have chosen a > different approach. > We define thread local support and two extern variables in gis.h to > choose at runtime if > G_fatal_error() will call exit() or longjmp() and to add thread local support. > > Example which works for me in my test code: > > /*Thread local and setjmp() exception support*/ > #include <setjmp.h> > #ifdef WIN32 > #define Thread __declspec( thread ) > #else > #define Thread __thread > #endif The __thread qualifier is a gcc extension. > extern Thread jmp_buf G_stack_buffer; /*to save the most important > CPU register for each thread*/ In order to use a GRASS library from multiple threads, the entire library state needs to be thread-specific. That would include the current error handler. Also: do you actually need to resume a thread in which an error occurs? If not, you can just make the error handler terminate the current thread. That will prevent the error handler from returning and thus prevent exit() being called. > Is this approach ok or to simple or just naive? :) It's too invasive. The longjmp() should go into an application-defined error handler, rather than the GRASS libraries. The only changes to the GRASS libraries regarding error handling should be to ensure that it is safe to continue using them after the application recovers from a fatal error. Supporting thread-local state requires some changes to the GRASS libraries, i.e. moving most[1] static variables for a library into a state structure, and referencing all such variables through a pointer. [1] Some variables will contain immutable data, e.g. tables which are read from files. These don't need to be initialised for each thread, although concurrent initialisation has to be protected against. > If this is ok, i would like to test this approach to identify possible > nested setjmp()/longjmp() > calls in libgis, libraster and libvector. I'd prefer to avoid putting any setjmp()s into the GRASS libraries. I'd rather see low-level functions split into an "internal" version which returns an error status, and a "public" version which calls G_fatal_error(), and have intermediate functions use the internal version if they need to catch errors. Ultimately, the GRASS libraries exist for the benefit of GRASS modules. Minor changes to facilitate other uses (e.g. persistent applications) are reasonable, but I'm opposed to substantial architectural changes, particularly if it makes the coding process more complex. -- Glynn Clements <[hidden email]> _______________________________________________ grass-dev mailing list [hidden email] http://lists.osgeo.org/mailman/listinfo/grass-dev |
||||||||||||||||
|
Soeren Gebbert-2
|
Hello,
>> Example which works for me in my test code: >> >> /*Thread local and setjmp() exception support*/ >> #include <setjmp.h> >> #ifdef WIN32 >> #define Thread __declspec( thread ) >> #else >> #define Thread __thread >> #endif > > The __thread qualifier is a gcc extension. Yes, wikipedia says it works for: Sun Studio C/C++, IBM XL C/C++, GNU C and Intel C/C++ (Linux systems) Using the pthreads implementation will be a better solution? > >> extern Thread jmp_buf G_stack_buffer; /*to save the most important >> CPU register for each thread*/ > > In order to use a GRASS library from multiple threads, the entire > library state needs to be thread-specific. That would include the > current error handler. > > Also: do you actually need to resume a thread in which an error > occurs? If not, you can just make the error handler terminate the > current thread. That will prevent the error handler from returning and > thus prevent exit() being called. I am not sure about this. If the grass libs are called from within an EJB container, then there will be no way to terminate the current thread. But it should work from within a servlet container or a RCP container. > >> Is this approach ok or to simple or just naive? :) > > It's too invasive. The longjmp() should go into an application-defined > error handler, rather than the GRASS libraries. Ok. The error handler looks like that for now: int vgb_error_handler(const char *msg, int fatal) { if (fatal == 0) { fprintf(stderr, "%s\n", msg); return 1; } else if (fatal == 1) { fprintf(stderr, "WARNING: %s\n", msg); longjmp(vgb_stack_buffer, 1); } else if (fatal == 2) { fprintf(stderr, "ERROR: %s\n", msg); longjmp(vgb_stack_buffer, 2); } return 1; } vtkGRASSInit::vtkGRASSInit() { G_gisinit("vtkGRASSInit"); // Set the error routine G_set_error_routine(vgb_error_handler); } > > The only changes to the GRASS libraries regarding error handling > should be to ensure that it is safe to continue using them after the > application recovers from a fatal error. > > Supporting thread-local state requires some changes to the GRASS > libraries, i.e. moving most[1] static variables for a library into a > state structure, and referencing all such variables through a pointer. > I'd prefer to avoid putting any setjmp()s into the GRASS libraries. > I'd rather see low-level functions split into an "internal" version > which returns an error status, and a "public" version which calls > G_fatal_error(), and have intermediate functions use the internal > version if they need to catch errors. Ok, i agree. I will implement the setjmp()/longjmp() only within my application. > > Ultimately, the GRASS libraries exist for the benefit of GRASS > modules. Minor changes to facilitate other uses (e.g. persistent > applications) are reasonable, but I'm opposed to substantial > architectural changes, particularly if it makes the coding process > more complex. Very much agreed. Thanks Soeren > > -- > Glynn Clements <[hidden email]> > _______________________________________________ grass-dev mailing list [hidden email] http://lists.osgeo.org/mailman/listinfo/grass-dev |
||||||||||||||||
|
Glynn Clements
|
Soeren Gebbert wrote: > >> Example which works for me in my test code: > >> > >> /*Thread local and setjmp() exception support*/ > >> #include <setjmp.h> > >> #ifdef WIN32 > >> #define Thread __declspec( thread ) > >> #else > >> #define Thread __thread > >> #endif > > > > The __thread qualifier is a gcc extension. > > Yes, wikipedia says it works for: > Sun Studio C/C++, IBM XL C/C++, GNU C and Intel C/C++ (Linux systems) > > Using the pthreads implementation will be a better solution? Any thread-local state needs to be conditionalised upon the actual threading mechanism used, with a fall-back to a single instance (i.e. a simple variable). > >> Is this approach ok or to simple or just naive? :) > > > > It's too invasive. The longjmp() should go into an application-defined > > error handler, rather than the GRASS libraries. > > Ok. The error handler looks like that for now: [snip] Yep; that's the general idea. -- Glynn Clements <[hidden email]> _______________________________________________ grass-dev mailing list [hidden email] http://lists.osgeo.org/mailman/listinfo/grass-dev |
||||||||||||||||
| Free Embeddable Forum Powered by Nabble | Help |