PATCH: Raster stability fixes, ticket #462

42 messages Options
Embed this post
Permalink
1 2 3
Jason Birch

PATCH: Raster stability fixes, ticket #462

Reply Threaded More More options
Print post
Permalink
Hi,

Haris has been working on the raster stability problem against MapGuide 2.0.2.  I was having the same problems, and was able to duplicate on 2.1.  I have worked with Haris to create patches against trunk that address a couple of the problems and increase stability immensely.

Could someone please review the patches attached to:

http://trac.osgeo.org/mapguide/ticket/462

and commit them to trunk if appropriate?

My feeling is that the connection management stuff may need some additional work, since limiting the connection pool to 1 for single-threaded providers doesn't appear to have the desired effect.  However, getting both of these patches into 2.1 would at least improve stability for Raster users.

Jason_______________________________________________
mapguide-internals mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/mapguide-internals
Walt Welton-Lair

RE: PATCH: Raster stability fixes, ticket #462

Reply Threaded More More options
Print post
Permalink
The memory patch (mapguide_raster_unalloc.5.patch) looks good, and I'll submit that.

The stability patch (mapguide_raster_stability.patch) looks reasonable, but I'd like another 1 or 2 people to also look at it.  It's significant since it limits MapGuide to processing one raster stylization request at a time, but I guess that's better than having a dead lock.  Anyone know where the thread-unsafe code actually is?  Is it in the provider itself?  Can the mutex be moved closer to the actual thread-unsafe code?

Walt

-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On Behalf Of Jason Birch
Sent: Friday, March 27, 2009 3:10 AM
To: [hidden email]
Subject: [mapguide-internals] PATCH: Raster stability fixes, ticket #462

Hi,

Haris has been working on the raster stability problem against MapGuide 2.0.2.  I was having the same problems, and was able to duplicate on 2.1.  I have worked with Haris to create patches against trunk that address a couple of the problems and increase stability immensely.

Could someone please review the patches attached to:

http://trac.osgeo.org/mapguide/ticket/462

and commit them to trunk if appropriate?

My feeling is that the connection management stuff may need some additional work, since limiting the connection pool to 1 for single-threaded providers doesn't appear to have the desired effect.  However, getting both of these patches into 2.1 would at least improve stability for Raster users.

Jason_______________________________________________
mapguide-internals mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/mapguide-internals
_______________________________________________
mapguide-internals mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/mapguide-internals
Traian Stanev

RE: PATCH: Raster stability fixes, ticket #462

Reply Threaded More More options
Print post
Permalink

It does look like a sledgehammer approach -- what if some day there is a raster provider that works ok and doesn't need the locks? If anything, the lock should be in the FDO provider.

Traian


-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On Behalf Of Walt Welton-Lair
Sent: Friday, March 27, 2009 11:11 AM
To: MapGuide Internals Mail List
Subject: [mapguide-internals] RE: PATCH: Raster stability fixes, ticket #462

The memory patch (mapguide_raster_unalloc.5.patch) looks good, and I'll submit that.

The stability patch (mapguide_raster_stability.patch) looks reasonable, but I'd like another 1 or 2 people to also look at it.  It's significant since it limits MapGuide to processing one raster stylization request at a time, but I guess that's better than having a dead lock.  Anyone know where the thread-unsafe code actually is?  Is it in the provider itself?  Can the mutex be moved closer to the actual thread-unsafe code?

Walt

-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On Behalf Of Jason Birch
Sent: Friday, March 27, 2009 3:10 AM
To: [hidden email]
Subject: [mapguide-internals] PATCH: Raster stability fixes, ticket #462

Hi,

Haris has been working on the raster stability problem against MapGuide 2.0.2.  I was having the same problems, and was able to duplicate on 2.1.  I have worked with Haris to create patches against trunk that address a couple of the problems and increase stability immensely.

Could someone please review the patches attached to:

http://trac.osgeo.org/mapguide/ticket/462

and commit them to trunk if appropriate?

My feeling is that the connection management stuff may need some additional work, since limiting the connection pool to 1 for single-threaded providers doesn't appear to have the desired effect.  However, getting both of these patches into 2.1 would at least improve stability for Raster users.

Jason_______________________________________________
mapguide-internals mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/mapguide-internals
_______________________________________________
mapguide-internals mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/mapguide-internals
_______________________________________________
mapguide-internals mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/mapguide-internals
Bruce Dechant

RE: PATCH: Raster stability fixes, ticket #462

Reply Threaded More More options
Print post
Permalink
I totally agree - the locking needs to be in the provider.

Thanks,
Bruce


-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On Behalf Of Traian Stanev
Sent: Friday, March 27, 2009 9:40 AM
To: MapGuide Internals Mail List
Subject: [mapguide-internals] RE: PATCH: Raster stability fixes, ticket #462


It does look like a sledgehammer approach -- what if some day there is a raster provider that works ok and doesn't need the locks? If anything, the lock should be in the FDO provider.

Traian


-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On Behalf Of Walt Welton-Lair
Sent: Friday, March 27, 2009 11:11 AM
To: MapGuide Internals Mail List
Subject: [mapguide-internals] RE: PATCH: Raster stability fixes, ticket #462

The memory patch (mapguide_raster_unalloc.5.patch) looks good, and I'll submit that.

The stability patch (mapguide_raster_stability.patch) looks reasonable, but I'd like another 1 or 2 people to also look at it.  It's significant since it limits MapGuide to processing one raster stylization request at a time, but I guess that's better than having a dead lock.  Anyone know where the thread-unsafe code actually is?  Is it in the provider itself?  Can the mutex be moved closer to the actual thread-unsafe code?

Walt

-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On Behalf Of Jason Birch
Sent: Friday, March 27, 2009 3:10 AM
To: [hidden email]
Subject: [mapguide-internals] PATCH: Raster stability fixes, ticket #462

Hi,

Haris has been working on the raster stability problem against MapGuide 2.0.2.  I was having the same problems, and was able to duplicate on 2.1.  I have worked with Haris to create patches against trunk that address a couple of the problems and increase stability immensely.

Could someone please review the patches attached to:

http://trac.osgeo.org/mapguide/ticket/462

and commit them to trunk if appropriate?

My feeling is that the connection management stuff may need some additional work, since limiting the connection pool to 1 for single-threaded providers doesn't appear to have the desired effect.  However, getting both of these patches into 2.1 would at least improve stability for Raster users.

Jason_______________________________________________
mapguide-internals mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/mapguide-internals
_______________________________________________
mapguide-internals mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/mapguide-internals
_______________________________________________
mapguide-internals mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/mapguide-internals
_______________________________________________
mapguide-internals mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/mapguide-internals
Steve Dang

RE: PATCH: Raster stability fixes, ticket #462

Reply Threaded More More options
Print post
Permalink
In reply to this post by Walt Welton-Lair
For the mapguide_raster_unalloc.5.patch, "pPalBuf" should also be de-allocated when an exception gets thrown (e.g. where "pRasterData" and "reader" get de-allocated).
For the mapguide_raster_stability.patch, I have the same concern as Walt does. This could affect performance significantly.

Steve.

-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On Behalf Of Walt Welton-Lair
Sent: Friday, March 27, 2009 9:11 AM
To: MapGuide Internals Mail List
Subject: [mapguide-internals] RE: PATCH: Raster stability fixes, ticket #462

The memory patch (mapguide_raster_unalloc.5.patch) looks good, and I'll submit that.

The stability patch (mapguide_raster_stability.patch) looks reasonable, but I'd like another 1 or 2 people to also look at it.  It's significant since it limits MapGuide to processing one raster stylization request at a time, but I guess that's better than having a dead lock.  Anyone know where the thread-unsafe code actually is?  Is it in the provider itself?  Can the mutex be moved closer to the actual thread-unsafe code?

Walt

-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On Behalf Of Jason Birch
Sent: Friday, March 27, 2009 3:10 AM
To: [hidden email]
Subject: [mapguide-internals] PATCH: Raster stability fixes, ticket #462

Hi,

Haris has been working on the raster stability problem against MapGuide 2.0.2.  I was having the same problems, and was able to duplicate on 2.1.  I have worked with Haris to create patches against trunk that address a couple of the problems and increase stability immensely.

Could someone please review the patches attached to:

http://trac.osgeo.org/mapguide/ticket/462

and commit them to trunk if appropriate?

My feeling is that the connection management stuff may need some additional work, since limiting the connection pool to 1 for single-threaded providers doesn't appear to have the desired effect.  However, getting both of these patches into 2.1 would at least improve stability for Raster users.

Jason_______________________________________________
mapguide-internals mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/mapguide-internals
_______________________________________________
mapguide-internals mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/mapguide-internals
_______________________________________________
mapguide-internals mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/mapguide-internals
Jason Birch

RE: PATCH: Raster stability fixes, ticket #462

Reply Threaded More More options
Print post
Permalink
In reply to this post by Bruce Dechant
I think that this patch should be implemented for two reasons.

First, I believe that this is no more of a sledgehammer approach than the existing guard.

Second, getting coordination between a skilled MapGuide developers that can work though the threading issues and the GDAL Provider developer has been attempted in the past and failed because nobody stepped up to the plate from the MapGuide community.

Checking this in as a temporary stability measure allows us to have a reasonable (if slow) raster story for the 2.1 release.

Of course, if there are developers from both the FDO and MapGuide communities that are able to work together to properly resolve this issue before the FDO 3.4 release (it's at RC3) and the MapGuide 2.1 release, then that would be ideal.

Jason

-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On Behalf Of Bruce Dechant
Sent: March-27-09 8:55 AM
To: MapGuide Internals Mail List
Subject: [mapguide-internals] RE: PATCH: Raster stability fixes, ticket #462

I totally agree - the locking needs to be in the provider.

Thanks,
Bruce


-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On Behalf Of Traian Stanev
Sent: Friday, March 27, 2009 9:40 AM
To: MapGuide Internals Mail List
Subject: [mapguide-internals] RE: PATCH: Raster stability fixes, ticket #462


It does look like a sledgehammer approach -- what if some day there is a raster provider that works ok and doesn't need the locks? If anything, the lock should be in the FDO provider.

Traian


-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On Behalf Of Walt Welton-Lair
Sent: Friday, March 27, 2009 11:11 AM
To: MapGuide Internals Mail List
Subject: [mapguide-internals] RE: PATCH: Raster stability fixes, ticket #462

The memory patch (mapguide_raster_unalloc.5.patch) looks good, and I'll submit that.

The stability patch (mapguide_raster_stability.patch) looks reasonable, but I'd like another 1 or 2 people to also look at it.  It's significant since it limits MapGuide to processing one raster stylization request at a time, but I guess that's better than having a dead lock.  Anyone know where the thread-unsafe code actually is?  Is it in the provider itself?  Can the mutex be moved closer to the actual thread-unsafe code?

Walt

-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On Behalf Of Jason Birch
Sent: Friday, March 27, 2009 3:10 AM
To: [hidden email]
Subject: [mapguide-internals] PATCH: Raster stability fixes, ticket #462

Hi,

Haris has been working on the raster stability problem against MapGuide 2.0.2.  I was having the same problems, and was able to duplicate on 2.1.  I have worked with Haris to create patches against trunk that address a couple of the problems and increase stability immensely.

Could someone please review the patches attached to:

http://trac.osgeo.org/mapguide/ticket/462

and commit them to trunk if appropriate?

My feeling is that the connection management stuff may need some additional work, since limiting the connection pool to 1 for single-threaded providers doesn't appear to have the desired effect.  However, getting both of these patches into 2.1 would at least improve stability for Raster users.

Jason_______________________________________________
mapguide-internals mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/mapguide-internals
_______________________________________________
mapguide-internals mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/mapguide-internals
_______________________________________________
mapguide-internals mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/mapguide-internals
_______________________________________________
mapguide-internals mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/mapguide-internals
_______________________________________________
mapguide-internals mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/mapguide-internals
Trevor Wekel-3

RE: PATCH: Raster stability fixes, ticket #462

Reply Threaded More More options
Print post
Permalink
Just to add a little more detail, the guard in question will also impact WMS consumption in MapGuide.  Querying and rendering of WMS layers from external data sources will become completely single threaded.

Moving the guard forces most of the stylization/rendering work to be single threaded.  I suspect these operations are fairly lengthy.  From what I recall, the Fdo ReadNext() call is also done within StylizeGridLayer.  If ReadNext() is "slow" for WMS and GDAL then there will be an even greater performance impact to multi-user performance when the guard is moved.

The current guard only protects setup and execution of the MgFeatureService::SelectFeatures call.  From an Fdo perspective, I believe SelectFeatures is fairly fast so guarding it does not significantly affect multi-user performance.

In my opinion, moving the guard in MapGuide will impose a heavy price performance wise.  The best option is to add guards to the GDAL Raster provider.  However, Jason is bang on with his analysis of the likelihood of any collaboration happening short term.  One possible solution would be to put a guard around all of the public Fdo APIs in the GDAL Provider.  That would not require much collaboration and would be "drop in" testable.

Thanks,
Trevor


-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On Behalf Of Jason Birch
Sent: Friday, March 27, 2009 11:07 AM
To: 'MapGuide Internals Mail List'; 'FDO Internals Mail List'
Subject: [mapguide-internals] RE: PATCH: Raster stability fixes, ticket #462

I think that this patch should be implemented for two reasons.

First, I believe that this is no more of a sledgehammer approach than the existing guard.

Second, getting coordination between a skilled MapGuide developers that can work though the threading issues and the GDAL Provider developer has been attempted in the past and failed because nobody stepped up to the plate from the MapGuide community.

Checking this in as a temporary stability measure allows us to have a reasonable (if slow) raster story for the 2.1 release.

Of course, if there are developers from both the FDO and MapGuide communities that are able to work together to properly resolve this issue before the FDO 3.4 release (it's at RC3) and the MapGuide 2.1 release, then that would be ideal.

Jason

-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On Behalf Of Bruce Dechant
Sent: March-27-09 8:55 AM
To: MapGuide Internals Mail List
Subject: [mapguide-internals] RE: PATCH: Raster stability fixes, ticket #462

I totally agree - the locking needs to be in the provider.

Thanks,
Bruce


-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On Behalf Of Traian Stanev
Sent: Friday, March 27, 2009 9:40 AM
To: MapGuide Internals Mail List
Subject: [mapguide-internals] RE: PATCH: Raster stability fixes, ticket #462


It does look like a sledgehammer approach -- what if some day there is a raster provider that works ok and doesn't need the locks? If anything, the lock should be in the FDO provider.

Traian


-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On Behalf Of Walt Welton-Lair
Sent: Friday, March 27, 2009 11:11 AM
To: MapGuide Internals Mail List
Subject: [mapguide-internals] RE: PATCH: Raster stability fixes, ticket #462

The memory patch (mapguide_raster_unalloc.5.patch) looks good, and I'll submit that.

The stability patch (mapguide_raster_stability.patch) looks reasonable, but I'd like another 1 or 2 people to also look at it.  It's significant since it limits MapGuide to processing one raster stylization request at a time, but I guess that's better than having a dead lock.  Anyone know where the thread-unsafe code actually is?  Is it in the provider itself?  Can the mutex be moved closer to the actual thread-unsafe code?

Walt

-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On Behalf Of Jason Birch
Sent: Friday, March 27, 2009 3:10 AM
To: [hidden email]
Subject: [mapguide-internals] PATCH: Raster stability fixes, ticket #462

Hi,

Haris has been working on the raster stability problem against MapGuide 2.0.2.  I was having the same problems, and was able to duplicate on 2.1.  I have worked with Haris to create patches against trunk that address a couple of the problems and increase stability immensely.

Could someone please review the patches attached to:

http://trac.osgeo.org/mapguide/ticket/462

and commit them to trunk if appropriate?

My feeling is that the connection management stuff may need some additional work, since limiting the connection pool to 1 for single-threaded providers doesn't appear to have the desired effect.  However, getting both of these patches into 2.1 would at least improve stability for Raster users.

Jason_______________________________________________
mapguide-internals mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/mapguide-internals
_______________________________________________
mapguide-internals mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/mapguide-internals
_______________________________________________
mapguide-internals mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/mapguide-internals
_______________________________________________
mapguide-internals mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/mapguide-internals
_______________________________________________
mapguide-internals mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/mapguide-internals
_______________________________________________
mapguide-internals mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/mapguide-internals
Trevor Wekel-3

RE: PATCH: Raster stability fixes, ticket #462

Reply Threaded More More options
Print post
Permalink
In reply to this post by Jason Birch
Resending.  My original post bounced on FDOInternals.

-----Original Message-----
From: Trevor Wekel
Sent: Friday, March 27, 2009 11:45 AM
To: 'MapGuide Internals Mail List'; 'FDO Internals Mail List'
Subject: RE: PATCH: Raster stability fixes, ticket #462

Just to add a little more detail, the guard in question will also impact WMS consumption in MapGuide.  Querying and rendering of WMS layers from external data sources will become completely single threaded.

Moving the guard forces most of the stylization/rendering work to be single threaded.  I suspect these operations are fairly lengthy.  From what I recall, the Fdo ReadNext() call is also done within StylizeGridLayer.  If ReadNext() is "slow" for WMS and GDAL then there will be an even greater performance impact to multi-user performance when the guard is moved.

The current guard only protects setup and execution of the MgFeatureService::SelectFeatures call.  From an Fdo perspective, I believe SelectFeatures is fairly fast so guarding it does not significantly affect multi-user performance.

In my opinion, moving the guard in MapGuide will impose a heavy price performance wise.  The best option is to add guards to the GDAL Raster provider.  However, Jason is bang on with his analysis of the likelihood of any collaboration happening short term.  One possible solution would be to put a guard around all of the public Fdo APIs in the GDAL Provider.  That would not require much collaboration and would be "drop in" testable.

Thanks,
Trevor


-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On Behalf Of Jason Birch
Sent: Friday, March 27, 2009 11:07 AM
To: 'MapGuide Internals Mail List'; 'FDO Internals Mail List'
Subject: [mapguide-internals] RE: PATCH: Raster stability fixes, ticket #462

I think that this patch should be implemented for two reasons.

First, I believe that this is no more of a sledgehammer approach than the existing guard.

Second, getting coordination between a skilled MapGuide developers that can work though the threading issues and the GDAL Provider developer has been attempted in the past and failed because nobody stepped up to the plate from the MapGuide community.

Checking this in as a temporary stability measure allows us to have a reasonable (if slow) raster story for the 2.1 release.

Of course, if there are developers from both the FDO and MapGuide communities that are able to work together to properly resolve this issue before the FDO 3.4 release (it's at RC3) and the MapGuide 2.1 release, then that would be ideal.

Jason

-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On Behalf Of Bruce Dechant
Sent: March-27-09 8:55 AM
To: MapGuide Internals Mail List
Subject: [mapguide-internals] RE: PATCH: Raster stability fixes, ticket #462

I totally agree - the locking needs to be in the provider.

Thanks,
Bruce


-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On Behalf Of Traian Stanev
Sent: Friday, March 27, 2009 9:40 AM
To: MapGuide Internals Mail List
Subject: [mapguide-internals] RE: PATCH: Raster stability fixes, ticket #462


It does look like a sledgehammer approach -- what if some day there is a raster provider that works ok and doesn't need the locks? If anything, the lock should be in the FDO provider.

Traian


-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On Behalf Of Walt Welton-Lair
Sent: Friday, March 27, 2009 11:11 AM
To: MapGuide Internals Mail List
Subject: [mapguide-internals] RE: PATCH: Raster stability fixes, ticket #462

The memory patch (mapguide_raster_unalloc.5.patch) looks good, and I'll submit that.

The stability patch (mapguide_raster_stability.patch) looks reasonable, but I'd like another 1 or 2 people to also look at it.  It's significant since it limits MapGuide to processing one raster stylization request at a time, but I guess that's better than having a dead lock.  Anyone know where the thread-unsafe code actually is?  Is it in the provider itself?  Can the mutex be moved closer to the actual thread-unsafe code?

Walt

-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On Behalf Of Jason Birch
Sent: Friday, March 27, 2009 3:10 AM
To: [hidden email]
Subject: [mapguide-internals] PATCH: Raster stability fixes, ticket #462

Hi,

Haris has been working on the raster stability problem against MapGuide 2.0.2.  I was having the same problems, and was able to duplicate on 2.1.  I have worked with Haris to create patches against trunk that address a couple of the problems and increase stability immensely.

Could someone please review the patches attached to:

http://trac.osgeo.org/mapguide/ticket/462

and commit them to trunk if appropriate?

My feeling is that the connection management stuff may need some additional work, since limiting the connection pool to 1 for single-threaded providers doesn't appear to have the desired effect.  However, getting both of these patches into 2.1 would at least improve stability for Raster users.

Jason_______________________________________________
mapguide-internals mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/mapguide-internals
_______________________________________________
mapguide-internals mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/mapguide-internals
_______________________________________________
mapguide-internals mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/mapguide-internals
_______________________________________________
mapguide-internals mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/mapguide-internals
_______________________________________________
mapguide-internals mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/mapguide-internals
_______________________________________________
mapguide-internals mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/mapguide-internals
Jason Birch

RE: PATCH: Raster stability fixes, ticket #462

Reply Threaded More More options
Print post
Permalink
Is there any way at that point in the code to identify the provider being used for the Raster operation, and only apply the guard if it's the GDAL provider?

I believe that there was a fair amount of effort put into guarding all of the FDO APIs in the GDAL provider already, but perhaps some got missed.  I'll be interested to hear back from Haris on his investigations into the connection manager to see if this is a combination of factors, not just the GDAL provider.

BTW, yes, having the guard in that location is somewhat slow with multiple layers/users. But it also means that my MapGuide server doesn't crash after a minute of operation once I get a few concurrent sessions going.  These patches mean that it takes at least ten minutes (I'm seeing the problem of exhausting the client connections too )...

In an ideal world, I'd be happier if there was a bare bones, multithreaded, non-configurable, seat of your pants GDAL provider that supported VRTs.  However, just being able to publicly launch the City's new web maps after a year and a half, countless hours of my time, and over $80k in external development costs would be nice :)  I've definitely had cause to regret jumping on this bandwagon early.

Jason

-----Original Message-----
From: Trevor Wekel
Sent: Friday, March 27, 2009 11:45 AM
To: 'MapGuide Internals Mail List'; 'FDO Internals Mail List'
Subject: RE: PATCH: Raster stability fixes, ticket #462

Just to add a little more detail, the guard in question will also impact WMS consumption in MapGuide.  Querying and rendering of WMS layers from external data sources will become completely single threaded.

Moving the guard forces most of the stylization/rendering work to be single threaded.  I suspect these operations are fairly lengthy.  From what I recall, the Fdo ReadNext() call is also done within StylizeGridLayer.  If ReadNext() is "slow" for WMS and GDAL then there will be an even greater performance impact to multi-user performance when the guard is moved.

The current guard only protects setup and execution of the MgFeatureService::SelectFeatures call.  From an Fdo perspective, I believe SelectFeatures is fairly fast so guarding it does not significantly affect multi-user performance.

In my opinion, moving the guard in MapGuide will impose a heavy price performance wise.  The best option is to add guards to the GDAL Raster provider.  However, Jason is bang on with his analysis of the likelihood of any collaboration happening short term.  One possible solution would be to put a guard around all of the public Fdo APIs in the GDAL Provider.  That would not require much collaboration and would be "drop in" testable.

Thanks,
Trevor


-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On Behalf Of Jason Birch
Sent: Friday, March 27, 2009 11:07 AM
To: 'MapGuide Internals Mail List'; 'FDO Internals Mail List'
Subject: [mapguide-internals] RE: PATCH: Raster stability fixes, ticket #462

I think that this patch should be implemented for two reasons.

First, I believe that this is no more of a sledgehammer approach than the existing guard.

Second, getting coordination between a skilled MapGuide developers that can work though the threading issues and the GDAL Provider developer has been attempted in the past and failed because nobody stepped up to the plate from the MapGuide community.

Checking this in as a temporary stability measure allows us to have a reasonable (if slow) raster story for the 2.1 release.

Of course, if there are developers from both the FDO and MapGuide communities that are able to work together to properly resolve this issue before the FDO 3.4 release (it's at RC3) and the MapGuide 2.1 release, then that would be ideal.

Jason

-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On Behalf Of Bruce Dechant
Sent: March-27-09 8:55 AM
To: MapGuide Internals Mail List
Subject: [mapguide-internals] RE: PATCH: Raster stability fixes, ticket #462

I totally agree - the locking needs to be in the provider.

Thanks,
Bruce


-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On Behalf Of Traian Stanev
Sent: Friday, March 27, 2009 9:40 AM
To: MapGuide Internals Mail List
Subject: [mapguide-internals] RE: PATCH: Raster stability fixes, ticket #462


It does look like a sledgehammer approach -- what if some day there is a raster provider that works ok and doesn't need the locks? If anything, the lock should be in the FDO provider.

Traian


-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On Behalf Of Walt Welton-Lair
Sent: Friday, March 27, 2009 11:11 AM
To: MapGuide Internals Mail List
Subject: [mapguide-internals] RE: PATCH: Raster stability fixes, ticket #462

The memory patch (mapguide_raster_unalloc.5.patch) looks good, and I'll submit that.

The stability patch (mapguide_raster_stability.patch) looks reasonable, but I'd like another 1 or 2 people to also look at it.  It's significant since it limits MapGuide to processing one raster stylization request at a time, but I guess that's better than having a dead lock.  Anyone know where the thread-unsafe code actually is?  Is it in the provider itself?  Can the mutex be moved closer to the actual thread-unsafe code?

Walt

-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On Behalf Of Jason Birch
Sent: Friday, March 27, 2009 3:10 AM
To: [hidden email]
Subject: [mapguide-internals] PATCH: Raster stability fixes, ticket #462

Hi,

Haris has been working on the raster stability problem against MapGuide 2.0.2.  I was having the same problems, and was able to duplicate on 2.1.  I have worked with Haris to create patches against trunk that address a couple of the problems and increase stability immensely.

Could someone please review the patches attached to:

http://trac.osgeo.org/mapguide/ticket/462

and commit them to trunk if appropriate?

My feeling is that the connection management stuff may need some additional work, since limiting the connection pool to 1 for single-threaded providers doesn't appear to have the desired effect.  However, getting both of these patches into 2.1 would at least improve stability for Raster users.

Jason
_______________________________________________
mapguide-internals mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/mapguide-internals
Trevor Wekel-3

RE: PATCH: Raster stability fixes, ticket #462

Reply Threaded More More options
Print post
Permalink
Hmm...  It's a bit hacky but you could call GetResourceContent on the feature source and look for OSGeo.GDAL in the xml document that comes back.

Don't jump ship!  Who is going to keep us in line if you leave?? ;)

-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On Behalf Of Jason Birch
Sent: Friday, March 27, 2009 12:35 PM
To: 'FDO Internals Mail List'; MapGuide Internals Mail List
Subject: [fdo-internals] RE: PATCH: Raster stability fixes, ticket #462

Is there any way at that point in the code to identify the provider being used for the Raster operation, and only apply the guard if it's the GDAL provider?

I believe that there was a fair amount of effort put into guarding all of the FDO APIs in the GDAL provider already, but perhaps some got missed.  I'll be interested to hear back from Haris on his investigations into the connection manager to see if this is a combination of factors, not just the GDAL provider.

BTW, yes, having the guard in that location is somewhat slow with multiple layers/users. But it also means that my MapGuide server doesn't crash after a minute of operation once I get a few concurrent sessions going.  These patches mean that it takes at least ten minutes (I'm seeing the problem of exhausting the client connections too )...

In an ideal world, I'd be happier if there was a bare bones, multithreaded, non-configurable, seat of your pants GDAL provider that supported VRTs.  However, just being able to publicly launch the City's new web maps after a year and a half, countless hours of my time, and over $80k in external development costs would be nice :)  I've definitely had cause to regret jumping on this bandwagon early.

Jason

-----Original Message-----
From: Trevor Wekel
Sent: Friday, March 27, 2009 11:45 AM
To: 'MapGuide Internals Mail List'; 'FDO Internals Mail List'
Subject: RE: PATCH: Raster stability fixes, ticket #462

Just to add a little more detail, the guard in question will also impact WMS consumption in MapGuide.  Querying and rendering of WMS layers from external data sources will become completely single threaded.

Moving the guard forces most of the stylization/rendering work to be single threaded.  I suspect these operations are fairly lengthy.  From what I recall, the Fdo ReadNext() call is also done within StylizeGridLayer.  If ReadNext() is "slow" for WMS and GDAL then there will be an even greater performance impact to multi-user performance when the guard is moved.

The current guard only protects setup and execution of the MgFeatureService::SelectFeatures call.  From an Fdo perspective, I believe SelectFeatures is fairly fast so guarding it does not significantly affect multi-user performance.

In my opinion, moving the guard in MapGuide will impose a heavy price performance wise.  The best option is to add guards to the GDAL Raster provider.  However, Jason is bang on with his analysis of the likelihood of any collaboration happening short term.  One possible solution would be to put a guard around all of the public Fdo APIs in the GDAL Provider.  That would not require much collaboration and would be "drop in" testable.

Thanks,
Trevor


-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On Behalf Of Jason Birch
Sent: Friday, March 27, 2009 11:07 AM
To: 'MapGuide Internals Mail List'; 'FDO Internals Mail List'
Subject: [mapguide-internals] RE: PATCH: Raster stability fixes, ticket #462

I think that this patch should be implemented for two reasons.

First, I believe that this is no more of a sledgehammer approach than the existing guard.

Second, getting coordination between a skilled MapGuide developers that can work though the threading issues and the GDAL Provider developer has been attempted in the past and failed because nobody stepped up to the plate from the MapGuide community.

Checking this in as a temporary stability measure allows us to have a reasonable (if slow) raster story for the 2.1 release.

Of course, if there are developers from both the FDO and MapGuide communities that are able to work together to properly resolve this issue before the FDO 3.4 release (it's at RC3) and the MapGuide 2.1 release, then that would be ideal.

Jason

-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On Behalf Of Bruce Dechant
Sent: March-27-09 8:55 AM
To: MapGuide Internals Mail List
Subject: [mapguide-internals] RE: PATCH: Raster stability fixes, ticket #462

I totally agree - the locking needs to be in the provider.

Thanks,
Bruce


-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On Behalf Of Traian Stanev
Sent: Friday, March 27, 2009 9:40 AM
To: MapGuide Internals Mail List
Subject: [mapguide-internals] RE: PATCH: Raster stability fixes, ticket #462


It does look like a sledgehammer approach -- what if some day there is a raster provider that works ok and doesn't need the locks? If anything, the lock should be in the FDO provider.

Traian


-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On Behalf Of Walt Welton-Lair
Sent: Friday, March 27, 2009 11:11 AM
To: MapGuide Internals Mail List
Subject: [mapguide-internals] RE: PATCH: Raster stability fixes, ticket #462

The memory patch (mapguide_raster_unalloc.5.patch) looks good, and I'll submit that.

The stability patch (mapguide_raster_stability.patch) looks reasonable, but I'd like another 1 or 2 people to also look at it.  It's significant since it limits MapGuide to processing one raster stylization request at a time, but I guess that's better than having a dead lock.  Anyone know where the thread-unsafe code actually is?  Is it in the provider itself?  Can the mutex be moved closer to the actual thread-unsafe code?

Walt

-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On Behalf Of Jason Birch
Sent: Friday, March 27, 2009 3:10 AM
To: [hidden email]
Subject: [mapguide-internals] PATCH: Raster stability fixes, ticket #462

Hi,

Haris has been working on the raster stability problem against MapGuide 2.0.2.  I was having the same problems, and was able to duplicate on 2.1.  I have worked with Haris to create patches against trunk that address a couple of the problems and increase stability immensely.

Could someone please review the patches attached to:

http://trac.osgeo.org/mapguide/ticket/462

and commit them to trunk if appropriate?

My feeling is that the connection management stuff may need some additional work, since limiting the connection pool to 1 for single-threaded providers doesn't appear to have the desired effect.  However, getting both of these patches into 2.1 would at least improve stability for Raster users.

Jason
_______________________________________________
fdo-internals mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/fdo-internals
_______________________________________________
mapguide-internals mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/mapguide-internals
Jason Birch

RE: PATCH: Raster stability fixes, ticket #462

Reply Threaded More More options
Print post
Permalink
Heh.

At this point I'm probably too invested to jump ship, but I definitely get frustrated at times, usually right around project status meetings with my supervisor or when dealing with members of the public I promised a cross-browser map to months ago.

I think somewhere else in the code there's an equally hacky method used to compare test the feature class name (Image vs Raster) to differentiate WMS from Raster providers.  That wouldn't help if the ADSK raster provider is multithreaded though, or if there are other raster providers we don't "know" about.

Jason

-----Original Message-----
From: Trevor Wekel
Sent: March-27-09 11:49 AM
To: FDO Internals Mail List; MapGuide Internals Mail List
Subject: [fdo-internals] RE: PATCH: Raster stability fixes, ticket #462

Hmm...  It's a bit hacky but you could call GetResourceContent on the feature source and look for OSGeo.GDAL in the xml document that comes back.

Don't jump ship!  Who is going to keep us in line if you leave?? ;)
_______________________________________________
mapguide-internals mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/mapguide-internals
Jason Birch

RE: PATCH: Raster stability fixes, ticket #462

Reply Threaded More More options
Print post
Permalink
In reply to this post by Steve Dang
Hi Steve,

Would you be willing to make the change required to the first patch to ensure that the pPalBuf is de-allocated correctly?  I could make a guess, but I'm not competent enough with c++ to be sure of getting it right.

Jason

-----Original Message-----
From: Steve Dang
Sent: March-27-09 9:01 AM
To: MapGuide Internals Mail List
Subject: [mapguide-internals] RE: PATCH: Raster stability fixes, ticket #462

For the mapguide_raster_unalloc.5.patch, "pPalBuf" should also be de-allocated when an exception gets thrown (e.g. where "pRasterData" and "reader" get de-allocated).
_______________________________________________
mapguide-internals mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/mapguide-internals
Walt Welton-Lair

RE: RE: PATCH: Raster stability fixes, ticket #462

Reply Threaded More More options
Print post
Permalink
I'll take care of this (with Steve's help).

-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On Behalf Of Jason Birch
Sent: Friday, March 27, 2009 4:03 PM
To: 'MapGuide Internals Mail List'
Subject: [mapguide-internals] RE: PATCH: Raster stability fixes, ticket #462

Hi Steve,

Would you be willing to make the change required to the first patch to ensure that the pPalBuf is de-allocated correctly?  I could make a guess, but I'm not competent enough with c++ to be sure of getting it right.

Jason

-----Original Message-----
From: Steve Dang
Sent: March-27-09 9:01 AM
To: MapGuide Internals Mail List
Subject: [mapguide-internals] RE: PATCH: Raster stability fixes, ticket #462

For the mapguide_raster_unalloc.5.patch, "pPalBuf" should also be de-allocated when an exception gets thrown (e.g. where "pRasterData" and "reader" get de-allocated).
_______________________________________________
mapguide-internals mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/mapguide-internals
_______________________________________________
mapguide-internals mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/mapguide-internals
Haris Kurtagic

RE: RE: PATCH: Raster stability fixes, ticket #462

Reply Threaded More More options
Print post
Permalink
In reply to this post by Bruce Dechant
Regarding MG raster problems ( or any other provider ) the main reason
is connection manager.
If you would set connection pool size 1 , for sdf provider MG will not
work properly.

The patch Jason submitted, is workaround for issue. It will not cause
GDAL to work slower, because provider itself is basically
single-threaded now.
That was done trough work we were doing years or so ago. I was looking
at provider back then and realized that making gdal calls guarded will
make it more stable. Frank then submitted patch with guard on every
single gdal library call.

I will try to explain what is happing now:
1. Map with more than one gdal layer
2. There are two calls in server code execute query and stylize
3. One thread will open connection in execute query and went to stylize
procedure
4. Another thread will come into execute query and because it could be
another layer it will try to open new connection
5. Connection manager will check pool size ( 1 in gdal case ) and
because provider didn't inc. reference count to connection (not  way to
solve, but anyhow provider needs to increment ) it will delete previous
connection and create new one
6. Access violation in first thread while stylization because of deleted
connection
   

In shorter version things to change:
1. FDO GDAL provider is not increasing reference count on FDO connection
when used in other classes like georasterband , which is later used in
query (wrong)
2. connection manager is managing connection pool by using fdo
connection reference count ( wrong )
3. it looks like working for other providers, because connection pool
size is 200 and other providers are incrementing connection pointer in
select command for example, but if there would be provider with select
command with no reference count to connection, it will crash too
4. Server code is making assumption that pool size for single threaded
is 1 (not correct), we need to make code change to allow for making
calls to provider single threaded

Haris

-----Original Message-----
From: [hidden email]
[mailto:[hidden email]] On Behalf Of Bruce
Dechant
Sent: Friday, March 27, 2009 4:55 PM
To: MapGuide Internals Mail List
Subject: [mapguide-internals] RE: PATCH: Raster stability fixes, ticket
#462

I totally agree - the locking needs to be in the provider.

Thanks,
Bruce


-----Original Message-----
From: [hidden email]
[mailto:[hidden email]] On Behalf Of Traian
Stanev
Sent: Friday, March 27, 2009 9:40 AM
To: MapGuide Internals Mail List
Subject: [mapguide-internals] RE: PATCH: Raster stability fixes, ticket
#462


It does look like a sledgehammer approach -- what if some day there is a
raster provider that works ok and doesn't need the locks? If anything,
the lock should be in the FDO provider.

Traian


-----Original Message-----
From: [hidden email]
[mailto:[hidden email]] On Behalf Of Walt
Welton-Lair
Sent: Friday, March 27, 2009 11:11 AM
To: MapGuide Internals Mail List
Subject: [mapguide-internals] RE: PATCH: Raster stability fixes, ticket
#462

The memory patch (mapguide_raster_unalloc.5.patch) looks good, and I'll
submit that.

The stability patch (mapguide_raster_stability.patch) looks reasonable,
but I'd like another 1 or 2 people to also look at it.  It's significant
since it limits MapGuide to processing one raster stylization request at
a time, but I guess that's better than having a dead lock.  Anyone know
where the thread-unsafe code actually is?  Is it in the provider itself?
Can the mutex be moved closer to the actual thread-unsafe code?

Walt

-----Original Message-----
From: [hidden email]
[mailto:[hidden email]] On Behalf Of Jason
Birch
Sent: Friday, March 27, 2009 3:10 AM
To: [hidden email]
Subject: [mapguide-internals] PATCH: Raster stability fixes, ticket #462

Hi,

Haris has been working on the raster stability problem against MapGuide
2.0.2.  I was having the same problems, and was able to duplicate on
2.1.  I have worked with Haris to create patches against trunk that
address a couple of the problems and increase stability immensely.

Could someone please review the patches attached to:

http://trac.osgeo.org/mapguide/ticket/462

and commit them to trunk if appropriate?

My feeling is that the connection management stuff may need some
additional work, since limiting the connection pool to 1 for
single-threaded providers doesn't appear to have the desired effect.
However, getting both of these patches into 2.1 would at least improve
stability for Raster users.

Jason_______________________________________________
mapguide-internals mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/mapguide-internals
_______________________________________________
mapguide-internals mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/mapguide-internals
_______________________________________________
mapguide-internals mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/mapguide-internals
_______________________________________________
mapguide-internals mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/mapguide-internals
_______________________________________________
mapguide-internals mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/mapguide-internals
Traian Stanev

RE: RE: PATCH: Raster stability fixes, ticket #462

Reply Threaded More More options
Print post
Permalink

Hi Haris,

So if I understand this right, if we disable connection pooling for the GDAL provider, so that every request opens a new FDO connection, which lives for the lifetime of the request, there would be no problem?

I'm not very familiar with the connection pooling stuff, so it's a layman's question.

Traian

________________________________________
From: [hidden email] [[hidden email]] On Behalf Of Haris Kurtagic [[hidden email]]
Sent: Saturday, March 28, 2009 4:42 AM
To: MapGuide Internals Mail List
Subject: RE: [mapguide-internals] RE: PATCH:  Raster stability fixes,   ticket #462

Regarding MG raster problems ( or any other provider ) the main reason
is connection manager.
If you would set connection pool size 1 , for sdf provider MG will not
work properly.

The patch Jason submitted, is workaround for issue. It will not cause
GDAL to work slower, because provider itself is basically
single-threaded now.
That was done trough work we were doing years or so ago. I was looking
at provider back then and realized that making gdal calls guarded will
make it more stable. Frank then submitted patch with guard on every
single gdal library call.

I will try to explain what is happing now:
1. Map with more than one gdal layer
2. There are two calls in server code execute query and stylize
3. One thread will open connection in execute query and went to stylize
procedure
4. Another thread will come into execute query and because it could be
another layer it will try to open new connection
5. Connection manager will check pool size ( 1 in gdal case ) and
because provider didn't inc. reference count to connection (not  way to
solve, but anyhow provider needs to increment ) it will delete previous
connection and create new one
6. Access violation in first thread while stylization because of deleted
connection


In shorter version things to change:
1. FDO GDAL provider is not increasing reference count on FDO connection
when used in other classes like georasterband , which is later used in
query (wrong)
2. connection manager is managing connection pool by using fdo
connection reference count ( wrong )
3. it looks like working for other providers, because connection pool
size is 200 and other providers are incrementing connection pointer in
select command for example, but if there would be provider with select
command with no reference count to connection, it will crash too
4. Server code is making assumption that pool size for single threaded
is 1 (not correct), we need to make code change to allow for making
calls to provider single threaded

Haris

-----Original Message-----
From: [hidden email]
[mailto:[hidden email]] On Behalf Of Bruce
Dechant
Sent: Friday, March 27, 2009 4:55 PM
To: MapGuide Internals Mail List
Subject: [mapguide-internals] RE: PATCH: Raster stability fixes, ticket
#462

I totally agree - the locking needs to be in the provider.

Thanks,
Bruce


-----Original Message-----
From: [hidden email]
[mailto:[hidden email]] On Behalf Of Traian
Stanev
Sent: Friday, March 27, 2009 9:40 AM
To: MapGuide Internals Mail List
Subject: [mapguide-internals] RE: PATCH: Raster stability fixes, ticket
#462


It does look like a sledgehammer approach -- what if some day there is a
raster provider that works ok and doesn't need the locks? If anything,
the lock should be in the FDO provider.

Traian


-----Original Message-----
From: [hidden email]
[mailto:[hidden email]] On Behalf Of Walt
Welton-Lair
Sent: Friday, March 27, 2009 11:11 AM
To: MapGuide Internals Mail List
Subject: [mapguide-internals] RE: PATCH: Raster stability fixes, ticket
#462

The memory patch (mapguide_raster_unalloc.5.patch) looks good, and I'll
submit that.

The stability patch (mapguide_raster_stability.patch) looks reasonable,
but I'd like another 1 or 2 people to also look at it.  It's significant
since it limits MapGuide to processing one raster stylization request at
a time, but I guess that's better than having a dead lock.  Anyone know
where the thread-unsafe code actually is?  Is it in the provider itself?
Can the mutex be moved closer to the actual thread-unsafe code?

Walt

-----Original Message-----
From: [hidden email]
[mailto:[hidden email]] On Behalf Of Jason
Birch
Sent: Friday, March 27, 2009 3:10 AM
To: [hidden email]
Subject: [mapguide-internals] PATCH: Raster stability fixes, ticket #462

Hi,

Haris has been working on the raster stability problem against MapGuide
2.0.2.  I was having the same problems, and was able to duplicate on
2.1.  I have worked with Haris to create patches against trunk that
address a couple of the problems and increase stability immensely.

Could someone please review the patches attached to:

http://trac.osgeo.org/mapguide/ticket/462

and commit them to trunk if appropriate?

My feeling is that the connection management stuff may need some
additional work, since limiting the connection pool to 1 for
single-threaded providers doesn't appear to have the desired effect.
However, getting both of these patches into 2.1 would at least improve
stability for Raster users.

Jason_______________________________________________
mapguide-internals mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/mapguide-internals
_______________________________________________
mapguide-internals mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/mapguide-internals
_______________________________________________
mapguide-internals mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/mapguide-internals
_______________________________________________
mapguide-internals mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/mapguide-internals
_______________________________________________
mapguide-internals mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/mapguide-internals_______________________________________________
mapguide-internals mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/mapguide-internals
Traian Stanev

RE: RE: PATCH: Raster stability fixes, ticket #462

Reply Threaded More More options
Print post
Permalink
In reply to this post by Haris Kurtagic

For what it's worth, patching the provider to addref the parent connection from the feature reader fixes the problem. This is on top of Friday's trunk MapGuide and FDO code, without any other extra fixes. I do think it is a bug not to addref the connection from the reader, when it in fact depends on it to be alive, in order to get stuff from it at a later time.

I am pretty sure the addref fixes the problem, because the test I am using is fairly brutal and produces the crash/deadlock in less than 5 seconds every time I run it with unpatched GDAL provider. With the change, I ran it multiple times to completion.

I am unhappy with the fix though, because it leaves CPU usage by mgserver at ~5%, even though I am doing something like 16 simultaneous requests... So then I decided to remove the global mutex from inside the provider -- that didn't make a difference in CPU usage, and the test still passed without problems. So that mutex can go. Then I messed with the connection pooling settings in serverconfig.ini -- those didn't make any difference. I tried them all for the GDAL provider and the test still passed with 5% CPU usage. I don't think the process is blocked for I/O because the TIFF I am using is not huge and I also generated an overlay pyramid (which I highly recommend to anyone by the way, it makes a huge performance difference for TIFFs).




Traian


> -----Original Message-----
> From: [hidden email] [mailto:mapguide-
> [hidden email]] On Behalf Of Haris Kurtagic
> Sent: Saturday, March 28, 2009 4:43 AM
> To: MapGuide Internals Mail List
> Subject: RE: [mapguide-internals] RE: PATCH: Raster stability fixes,
> ticket #462
>
> Regarding MG raster problems ( or any other provider ) the main reason
> is connection manager.
> If you would set connection pool size 1 , for sdf provider MG will not
> work properly.
>
> The patch Jason submitted, is workaround for issue. It will not cause
> GDAL to work slower, because provider itself is basically
> single-threaded now.
> That was done trough work we were doing years or so ago. I was looking
> at provider back then and realized that making gdal calls guarded will
> make it more stable. Frank then submitted patch with guard on every
> single gdal library call.
>
> I will try to explain what is happing now:
> 1. Map with more than one gdal layer
> 2. There are two calls in server code execute query and stylize
> 3. One thread will open connection in execute query and went to stylize
> procedure
> 4. Another thread will come into execute query and because it could be
> another layer it will try to open new connection
> 5. Connection manager will check pool size ( 1 in gdal case ) and
> because provider didn't inc. reference count to connection (not  way to
> solve, but anyhow provider needs to increment ) it will delete previous
> connection and create new one
> 6. Access violation in first thread while stylization because of
> deleted
> connection
>
>
> In shorter version things to change:
> 1. FDO GDAL provider is not increasing reference count on FDO
> connection
> when used in other classes like georasterband , which is later used in
> query (wrong)
> 2. connection manager is managing connection pool by using fdo
> connection reference count ( wrong )
> 3. it looks like working for other providers, because connection pool
> size is 200 and other providers are incrementing connection pointer in
> select command for example, but if there would be provider with select
> command with no reference count to connection, it will crash too
> 4. Server code is making assumption that pool size for single threaded
> is 1 (not correct), we need to make code change to allow for making
> calls to provider single threaded
>
> Haris
>
> -----Original Message-----
> From: [hidden email]
> [mailto:[hidden email]] On Behalf Of Bruce
> Dechant
> Sent: Friday, March 27, 2009 4:55 PM
> To: MapGuide Internals Mail List
> Subject: [mapguide-internals] RE: PATCH: Raster stability fixes, ticket
> #462
>
> I totally agree - the locking needs to be in the provider.
>
> Thanks,
> Bruce
>
>
> -----Original Message-----
> From: [hidden email]
> [mailto:[hidden email]] On Behalf Of Traian
> Stanev
> Sent: Friday, March 27, 2009 9:40 AM
> To: MapGuide Internals Mail List
> Subject: [mapguide-internals] RE: PATCH: Raster stability fixes, ticket
> #462
>
>
> It does look like a sledgehammer approach -- what if some day there is
> a
> raster provider that works ok and doesn't need the locks? If anything,
> the lock should be in the FDO provider.
>
> Traian
>
>
> -----Original Message-----
> From: [hidden email]
> [mailto:[hidden email]] On Behalf Of Walt
> Welton-Lair
> Sent: Friday, March 27, 2009 11:11 AM
> To: MapGuide Internals Mail List
> Subject: [mapguide-internals] RE: PATCH: Raster stability fixes, ticket
> #462
>
> The memory patch (mapguide_raster_unalloc.5.patch) looks good, and I'll
> submit that.
>
> The stability patch (mapguide_raster_stability.patch) looks reasonable,
> but I'd like another 1 or 2 people to also look at it.  It's
> significant
> since it limits MapGuide to processing one raster stylization request
> at
> a time, but I guess that's better than having a dead lock.  Anyone know
> where the thread-unsafe code actually is?  Is it in the provider itself?
> Can the mutex be moved closer to the actual thread-unsafe code?
>
> Walt
>
> -----Original Message-----
> From: [hidden email]
> [mailto:[hidden email]] On Behalf Of Jason
> Birch
> Sent: Friday, March 27, 2009 3:10 AM
> To: [hidden email]
> Subject: [mapguide-internals] PATCH: Raster stability fixes, ticket
> #462
>
> Hi,
>
> Haris has been working on the raster stability problem against MapGuide
> 2.0.2.  I was having the same problems, and was able to duplicate on
> 2.1.  I have worked with Haris to create patches against trunk that
> address a couple of the problems and increase stability immensely.
>
> Could someone please review the patches attached to:
>
> http://trac.osgeo.org/mapguide/ticket/462
>
> and commit them to trunk if appropriate?
>
> My feeling is that the connection management stuff may need some
> additional work, since limiting the connection pool to 1 for
> single-threaded providers doesn't appear to have the desired effect.
> However, getting both of these patches into 2.1 would at least improve
> stability for Raster users.
>
> Jason_______________________________________________
> mapguide-internals mailing list
> [hidden email]
> http://lists.osgeo.org/mailman/listinfo/mapguide-internals
> _______________________________________________
> mapguide-internals mailing list
> [hidden email]
> http://lists.osgeo.org/mailman/listinfo/mapguide-internals
> _______________________________________________
> mapguide-internals mailing list
> [hidden email]
> http://lists.osgeo.org/mailman/listinfo/mapguide-internals
> _______________________________________________
> mapguide-internals mailing list
> [hidden email]
> http://lists.osgeo.org/mailman/listinfo/mapguide-internals
> _______________________________________________
> mapguide-internals mailing list
> [hidden email]
> http://lists.osgeo.org/mailman/listinfo/mapguide-internals

_______________________________________________
mapguide-internals mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/mapguide-internals
Traian Stanev

RE: RE: PATCH: Raster stability fixes, ticket #462

Reply Threaded More More options
Print post
Permalink
I attached the patch for the GDAL provider to ticket 462.

Traian

> -----Original Message-----
> From: [hidden email] [mailto:mapguide-
> [hidden email]] On Behalf Of Traian Stanev
> Sent: Sunday, March 29, 2009 1:22 AM
> To: MapGuide Internals Mail List
> Cc: FDO Internals Mail List
> Subject: RE: [mapguide-internals] RE: PATCH: Raster stability fixes,
> ticket #462
>
>
> For what it's worth, patching the provider to addref the parent
> connection from the feature reader fixes the problem. This is on top of
> Friday's trunk MapGuide and FDO code, without any other extra fixes. I
> do think it is a bug not to addref the connection from the reader, when
> it in fact depends on it to be alive, in order to get stuff from it at
> a later time.
>
> I am pretty sure the addref fixes the problem, because the test I am
> using is fairly brutal and produces the crash/deadlock in less than 5
> seconds every time I run it with unpatched GDAL provider. With the
> change, I ran it multiple times to completion.
>
> I am unhappy with the fix though, because it leaves CPU usage by
> mgserver at ~5%, even though I am doing something like 16 simultaneous
> requests... So then I decided to remove the global mutex from inside
> the provider -- that didn't make a difference in CPU usage, and the
> test still passed without problems. So that mutex can go. Then I messed
> with the connection pooling settings in serverconfig.ini -- those
> didn't make any difference. I tried them all for the GDAL provider and
> the test still passed with 5% CPU usage. I don't think the process is
> blocked for I/O because the TIFF I am using is not huge and I also
> generated an overlay pyramid (which I highly recommend to anyone by the
> way, it makes a huge performance difference for TIFFs).
>
>
>
>
> Traian
>
>
> > -----Original Message-----
> > From: [hidden email] [mailto:mapguide-
> > [hidden email]] On Behalf Of Haris Kurtagic
> > Sent: Saturday, March 28, 2009 4:43 AM
> > To: MapGuide Internals Mail List
> > Subject: RE: [mapguide-internals] RE: PATCH: Raster stability fixes,
> > ticket #462
> >
> > Regarding MG raster problems ( or any other provider ) the main
> reason
> > is connection manager.
> > If you would set connection pool size 1 , for sdf provider MG will
> not
> > work properly.
> >
> > The patch Jason submitted, is workaround for issue. It will not cause
> > GDAL to work slower, because provider itself is basically
> > single-threaded now.
> > That was done trough work we were doing years or so ago. I was
> looking
> > at provider back then and realized that making gdal calls guarded
> will
> > make it more stable. Frank then submitted patch with guard on every
> > single gdal library call.
> >
> > I will try to explain what is happing now:
> > 1. Map with more than one gdal layer
> > 2. There are two calls in server code execute query and stylize 3.
> One
> > thread will open connection in execute query and went to stylize
> > procedure 4. Another thread will come into execute query and because
> > it could be another layer it will try to open new connection 5.
> > Connection manager will check pool size ( 1 in gdal case ) and
> because
> > provider didn't inc. reference count to connection (not  way to solve,
> > but anyhow provider needs to increment ) it will delete previous
> > connection and create new one 6. Access violation in first thread
> > while stylization because of deleted connection
> >
> >
> > In shorter version things to change:
> > 1. FDO GDAL provider is not increasing reference count on FDO
> > connection when used in other classes like georasterband , which is
> > later used in query (wrong) 2. connection manager is managing
> > connection pool by using fdo connection reference count ( wrong ) 3.
> > it looks like working for other providers, because connection pool
> > size is 200 and other providers are incrementing connection pointer
> in
> > select command for example, but if there would be provider with
> select
> > command with no reference count to connection, it will crash too 4.
> > Server code is making assumption that pool size for single threaded
> is
> > 1 (not correct), we need to make code change to allow for making
> calls
> > to provider single threaded
> >
> > Haris
> >
> > -----Original Message-----
> > From: [hidden email]
> > [mailto:[hidden email]] On Behalf Of
> Bruce
> > Dechant
> > Sent: Friday, March 27, 2009 4:55 PM
> > To: MapGuide Internals Mail List
> > Subject: [mapguide-internals] RE: PATCH: Raster stability fixes,
> > ticket
> > #462
> >
> > I totally agree - the locking needs to be in the provider.
> >
> > Thanks,
> > Bruce
> >
> >
> > -----Original Message-----
> > From: [hidden email]
> > [mailto:[hidden email]] On Behalf Of
> > Traian Stanev
> > Sent: Friday, March 27, 2009 9:40 AM
> > To: MapGuide Internals Mail List
> > Subject: [mapguide-internals] RE: PATCH: Raster stability fixes,
> > ticket
> > #462
> >
> >
> > It does look like a sledgehammer approach -- what if some day there
> is
> > a raster provider that works ok and doesn't need the locks? If
> > anything, the lock should be in the FDO provider.
> >
> > Traian
> >
> >
> > -----Original Message-----
> > From: [hidden email]
> > [mailto:[hidden email]] On Behalf Of Walt
> > Welton-Lair
> > Sent: Friday, March 27, 2009 11:11 AM
> > To: MapGuide Internals Mail List
> > Subject: [mapguide-internals] RE: PATCH: Raster stability fixes,
> > ticket
> > #462
> >
> > The memory patch (mapguide_raster_unalloc.5.patch) looks good, and
> > I'll submit that.
> >
> > The stability patch (mapguide_raster_stability.patch) looks
> > reasonable, but I'd like another 1 or 2 people to also look at it.
> > It's significant since it limits MapGuide to processing one raster
> > stylization request at a time, but I guess that's better than having
> a
> > dead lock.  Anyone know where the thread-unsafe code actually is?  Is
> > it in the provider itself?
> > Can the mutex be moved closer to the actual thread-unsafe code?
> >
> > Walt
> >
> > -----Original Message-----
> > From: [hidden email]
> > [mailto:[hidden email]] On Behalf Of
> Jason
> > Birch
> > Sent: Friday, March 27, 2009 3:10 AM
> > To: [hidden email]
> > Subject: [mapguide-internals] PATCH: Raster stability fixes, ticket
> > #462
> >
> > Hi,
> >
> > Haris has been working on the raster stability problem against
> > MapGuide 2.0.2.  I was having the same problems, and was able to
> > duplicate on 2.1.  I have worked with Haris to create patches against
> > trunk that address a couple of the problems and increase stability
> immensely.
> >
> > Could someone please review the patches attached to:
> >
> > http://trac.osgeo.org/mapguide/ticket/462
> >
> > and commit them to trunk if appropriate?
> >
> > My feeling is that the connection management stuff may need some
> > additional work, since limiting the connection pool to 1 for
> > single-threaded providers doesn't appear to have the desired effect.
> > However, getting both of these patches into 2.1 would at least
> improve
> > stability for Raster users.
> >
> > Jason_______________________________________________
> > mapguide-internals mailing list
> > [hidden email]
> > http://lists.osgeo.org/mailman/listinfo/mapguide-internals
> > _______________________________________________
> > mapguide-internals mailing list
> > [hidden email]
> > http://lists.osgeo.org/mailman/listinfo/mapguide-internals
> > _______________________________________________
> > mapguide-internals mailing list
> > [hidden email]
> > http://lists.osgeo.org/mailman/listinfo/mapguide-internals
> > _______________________________________________
> > mapguide-internals mailing list
> > [hidden email]
> > http://lists.osgeo.org/mailman/listinfo/mapguide-internals
> > _______________________________________________
> > mapguide-internals mailing list
> > [hidden email]
> > http://lists.osgeo.org/mailman/listinfo/mapguide-internals
_______________________________________________
mapguide-internals mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/mapguide-internals
Traian Stanev

RE: RE: PATCH: Raster stability fixes, ticket #462

Reply Threaded More More options
Print post
Permalink
In reply to this post by Traian Stanev

OK, if I now enable connection pooling for GDAL and *completely remove* the global raster lock from MappingUtil.cpp, I get CPU at 100% and the test still passes! And it's fassssst... YMMV.

Traian


> -----Original Message-----
> From: [hidden email] [mailto:mapguide-
> [hidden email]] On Behalf Of Traian Stanev
> Sent: Sunday, March 29, 2009 1:22 AM
> To: MapGuide Internals Mail List
> Cc: FDO Internals Mail List
> Subject: RE: [mapguide-internals] RE: PATCH: Raster stability fixes,
> ticket #462
>
>
> For what it's worth, patching the provider to addref the parent
> connection from the feature reader fixes the problem. This is on top of
> Friday's trunk MapGuide and FDO code, without any other extra fixes. I
> do think it is a bug not to addref the connection from the reader, when
> it in fact depends on it to be alive, in order to get stuff from it at
> a later time.
>
> I am pretty sure the addref fixes the problem, because the test I am
> using is fairly brutal and produces the crash/deadlock in less than 5
> seconds every time I run it with unpatched GDAL provider. With the
> change, I ran it multiple times to completion.
>
> I am unhappy with the fix though, because it leaves CPU usage by
> mgserver at ~5%, even though I am doing something like 16 simultaneous
> requests... So then I decided to remove the global mutex from inside
> the provider -- that didn't make a difference in CPU usage, and the
> test still passed without problems. So that mutex can go. Then I messed
> with the connection pooling settings in serverconfig.ini -- those
> didn't make any difference. I tried them all for the GDAL provider and
> the test still passed with 5% CPU usage. I don't think the process is
> blocked for I/O because the TIFF I am using is not huge and I also
> generated an overlay pyramid (which I highly recommend to anyone by the
> way, it makes a huge performance difference for TIFFs).
>
>
>
>
> Traian
>
>
> > -----Original Message-----
> > From: [hidden email] [mailto:mapguide-
> > [hidden email]] On Behalf Of Haris Kurtagic
> > Sent: Saturday, March 28, 2009 4:43 AM
> > To: MapGuide Internals Mail List
> > Subject: RE: [mapguide-internals] RE: PATCH: Raster stability fixes,
> > ticket #462
> >
> > Regarding MG raster problems ( or any other provider ) the main
> reason
> > is connection manager.
> > If you would set connection pool size 1 , for sdf provider MG will
> not
> > work properly.
> >
> > The patch Jason submitted, is workaround for issue. It will not cause
> > GDAL to work slower, because provider itself is basically
> > single-threaded now.
> > That was done trough work we were doing years or so ago. I was
> looking
> > at provider back then and realized that making gdal calls guarded
> will
> > make it more stable. Frank then submitted patch with guard on every
> > single gdal library call.
> >
> > I will try to explain what is happing now:
> > 1. Map with more than one gdal layer
> > 2. There are two calls in server code execute query and stylize 3.
> One
> > thread will open connection in execute query and went to stylize
> > procedure 4. Another thread will come into execute query and because
> > it could be another layer it will try to open new connection 5.
> > Connection manager will check pool size ( 1 in gdal case ) and
> because
> > provider didn't inc. reference count to connection (not  way to solve,
> > but anyhow provider needs to increment ) it will delete previous
> > connection and create new one 6. Access violation in first thread
> > while stylization because of deleted connection
> >
> >
> > In shorter version things to change:
> > 1. FDO GDAL provider is not increasing reference count on FDO
> > connection when used in other classes like georasterband , which is
> > later used in query (wrong) 2. connection manager is managing
> > connection pool by using fdo connection reference count ( wrong ) 3.
> > it looks like working for other providers, because connection pool
> > size is 200 and other providers are incrementing connection pointer
> in
> > select command for example, but if there would be provider with
> select
> > command with no reference count to connection, it will crash too 4.
> > Server code is making assumption that pool size for single threaded
> is
> > 1 (not correct), we need to make code change to allow for making
> calls
> > to provider single threaded
> >
> > Haris
> >
> > -----Original Message-----
> > From: [hidden email]
> > [mailto:[hidden email]] On Behalf Of
> Bruce
> > Dechant
> > Sent: Friday, March 27, 2009 4:55 PM
> > To: MapGuide Internals Mail List
> > Subject: [mapguide-internals] RE: PATCH: Raster stability fixes,
> > ticket
> > #462
> >
> > I totally agree - the locking needs to be in the provider.
> >
> > Thanks,
> > Bruce
> >
> >
> > -----Original Message-----
> > From: [hidden email]
> > [mailto:[hidden email]] On Behalf Of
> > Traian Stanev
> > Sent: Friday, March 27, 2009 9:40 AM
> > To: MapGuide Internals Mail List
> > Subject: [mapguide-internals] RE: PATCH: Raster stability fixes,
> > ticket
> > #462
> >
> >
> > It does look like a sledgehammer approach -- what if some day there
> is
> > a raster provider that works ok and doesn't need the locks? If
> > anything, the lock should be in the FDO provider.
> >
> > Traian
> >
> >
> > -----Original Message-----
> > From: [hidden email]
> > [mailto:[hidden email]] On Behalf Of Walt
> > Welton-Lair
> > Sent: Friday, March 27, 2009 11:11 AM
> > To: MapGuide Internals Mail List
> > Subject: [mapguide-internals] RE: PATCH: Raster stability fixes,
> > ticket
> > #462
> >
> > The memory patch (mapguide_raster_unalloc.5.patch) looks good, and
> > I'll submit that.
> >
> > The stability patch (mapguide_raster_stability.patch) looks
> > reasonable, but I'd like another 1 or 2 people to also look at it.
> > It's significant since it limits MapGuide to processing one raster
> > stylization request at a time, but I guess that's better than having
> a
> > dead lock.  Anyone know where the thread-unsafe code actually is?  Is
> > it in the provider itself?
> > Can the mutex be moved closer to the actual thread-unsafe code?
> >
> > Walt
> >
> > -----Original Message-----
> > From: [hidden email]
> > [mailto:[hidden email]] On Behalf Of
> Jason
> > Birch
> > Sent: Friday, March 27, 2009 3:10 AM
> > To: [hidden email]
> > Subject: [mapguide-internals] PATCH: Raster stability fixes, ticket
> > #462
> >
> > Hi,
> >
> > Haris has been working on the raster stability problem against
> > MapGuide 2.0.2.  I was having the same problems, and was able to
> > duplicate on 2.1.  I have worked with Haris to create patches against
> > trunk that address a couple of the problems and increase stability
> immensely.
> >
> > Could someone please review the patches attached to:
> >
> > http://trac.osgeo.org/mapguide/ticket/462
> >
> > and commit them to trunk if appropriate?
> >
> > My feeling is that the connection management stuff may need some
> > additional work, since limiting the connection pool to 1 for
> > single-threaded providers doesn't appear to have the desired effect.
> > However, getting both of these patches into 2.1 would at least
> improve
> > stability for Raster users.
> >
> > Jason_______________________________________________
> > mapguide-internals mailing list
> > [hidden email]
> > http://lists.osgeo.org/mailman/listinfo/mapguide-internals
> > _______________________________________________
> > mapguide-internals mailing list
> > [hidden email]
> > http://lists.osgeo.org/mailman/listinfo/mapguide-internals
> > _______________________________________________
> > mapguide-internals mailing list
> > [hidden email]
> > http://lists.osgeo.org/mailman/listinfo/mapguide-internals
> > _______________________________________________
> > mapguide-internals mailing list
> > [hidden email]
> > http://lists.osgeo.org/mailman/listinfo/mapguide-internals
> > _______________________________________________
> > mapguide-internals mailing list
> > [hidden email]
> > http://lists.osgeo.org/mailman/listinfo/mapguide-internals
_______________________________________________
mapguide-internals mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/mapguide-internals
Haris Kurtagic

RE: RE: PATCH: Raster stability fixes, ticket#462

Reply Threaded More More options
Print post
Permalink
In reply to this post by Traian Stanev
Hi Traian,
by just adding ref count to FDO GDA provider you did went into lock,
with single connection to provider.
That is where connection manager is wrong, it was constantly waiting for
connection (it is not problem of guard in stylization).
By enabling connection pooling it is still single thread because
provider itself is single threaded.
With only ref count in FDO GDAL , MG will not work properly.

I think we need to slow down a bit and find a solution for all issues we
have there. It is a connected problem and we need to find solution for
few related problems.

- Connection manager not using connection ref count for deciding
used/unused connections
- Really allow executing of single threaded providers
- I have feeling that because some problems previously were tried to
solve too quickly we ended up with gdal blocked all the way, perhaps
that is not needed.

Haris

-----Original Message-----
From: [hidden email]
[mailto:[hidden email]] On Behalf Of Traian
Stanev
Sent: Sunday, March 29, 2009 7:22 AM
To: MapGuide Internals Mail List
Cc: FDO Internals Mail List
Subject: RE: [mapguide-internals] RE: PATCH: Raster stability fixes,
ticket#462


For what it's worth, patching the provider to addref the parent
connection from the feature reader fixes the problem. This is on top of
Friday's trunk MapGuide and FDO code, without any other extra fixes. I
do think it is a bug not to addref the connection from the reader, when
it in fact depends on it to be alive, in order to get stuff from it at a
later time.

I am pretty sure the addref fixes the problem, because the test I am
using is fairly brutal and produces the crash/deadlock in less than 5
seconds every time I run it with unpatched GDAL provider. With the
change, I ran it multiple times to completion.

I am unhappy with the fix though, because it leaves CPU usage by
mgserver at ~5%, even though I am doing something like 16 simultaneous
requests... So then I decided to remove the global mutex from inside the
provider -- that didn't make a difference in CPU usage, and the test
still passed without problems. So that mutex can go. Then I messed with
the connection pooling settings in serverconfig.ini -- those didn't make
any difference. I tried them all for the GDAL provider and the test
still passed with 5% CPU usage. I don't think the process is blocked for
I/O because the TIFF I am using is not huge and I also generated an
overlay pyramid (which I highly recommend to anyone by the way, it makes
a huge performance difference for TIFFs).




Traian


> -----Original Message-----
> From: [hidden email] [mailto:mapguide-
> [hidden email]] On Behalf Of Haris Kurtagic
> Sent: Saturday, March 28, 2009 4:43 AM
> To: MapGuide Internals Mail List
> Subject: RE: [mapguide-internals] RE: PATCH: Raster stability fixes,
> ticket #462
>
> Regarding MG raster problems ( or any other provider ) the main reason

> is connection manager.
> If you would set connection pool size 1 , for sdf provider MG will not

> work properly.
>
> The patch Jason submitted, is workaround for issue. It will not cause
> GDAL to work slower, because provider itself is basically
> single-threaded now.
> That was done trough work we were doing years or so ago. I was looking

> at provider back then and realized that making gdal calls guarded will

> make it more stable. Frank then submitted patch with guard on every
> single gdal library call.
>
> I will try to explain what is happing now:
> 1. Map with more than one gdal layer
> 2. There are two calls in server code execute query and stylize 3. One

> thread will open connection in execute query and went to stylize
> procedure 4. Another thread will come into execute query and because
> it could be another layer it will try to open new connection 5.
> Connection manager will check pool size ( 1 in gdal case ) and because

> provider didn't inc. reference count to connection (not  way to solve,

> but anyhow provider needs to increment ) it will delete previous
> connection and create new one 6. Access violation in first thread
> while stylization because of deleted connection
>
>
> In shorter version things to change:
> 1. FDO GDAL provider is not increasing reference count on FDO
> connection when used in other classes like georasterband , which is
> later used in query (wrong) 2. connection manager is managing
> connection pool by using fdo connection reference count ( wrong ) 3.
> it looks like working for other providers, because connection pool
> size is 200 and other providers are incrementing connection pointer in

> select command for example, but if there would be provider with select

> command with no reference count to connection, it will crash too 4.
> Server code is making assumption that pool size for single threaded is

> 1 (not correct), we need to make code change to allow for making calls

> to provider single threaded
>
> Haris
>
> -----Original Message-----
> From: [hidden email]
> [mailto:[hidden email]] On Behalf Of Bruce

> Dechant
> Sent: Friday, March 27, 2009 4:55 PM
> To: MapGuide Internals Mail List
> Subject: [mapguide-internals] RE: PATCH: Raster stability fixes,
> ticket
> #462
>
> I totally agree - the locking needs to be in the provider.
>
> Thanks,
> Bruce
>
>
> -----Original Message-----
> From: [hidden email]
> [mailto:[hidden email]] On Behalf Of
> Traian Stanev
> Sent: Friday, March 27, 2009 9:40 AM
> To: MapGuide Internals Mail List
> Subject: [mapguide-internals] RE: PATCH: Raster stability fixes,
> ticket
> #462
>
>
> It does look like a sledgehammer approach -- what if some day there is

> a raster provider that works ok and doesn't need the locks? If
> anything, the lock should be in the FDO provider.
>
> Traian
>
>
> -----Original Message-----
> From: [hidden email]
> [mailto:[hidden email]] On Behalf Of Walt
> Welton-Lair
> Sent: Friday, March 27, 2009 11:11 AM
> To: MapGuide Internals Mail List
> Subject: [mapguide-internals] RE: PATCH: Raster stability fixes,
> ticket
> #462
>
> The memory patch (mapguide_raster_unalloc.5.patch) looks good, and
> I'll submit that.
>
> The stability patch (mapguide_raster_stability.patch) looks
> reasonable, but I'd like another 1 or 2 people to also look at it.  
> It's significant since it limits MapGuide to processing one raster
> stylization request at a time, but I guess that's better than having a

> dead lock.  Anyone know where the thread-unsafe code actually is?  Is
> it in the provider itself?
> Can the mutex be moved closer to the actual thread-unsafe code?
>
> Walt
>
> -----Original Message-----
> From: [hidden email]
> [mailto:[hidden email]] On Behalf Of Jason

> Birch
> Sent: Friday, March 27, 2009 3:10 AM
> To: [hidden email]
> Subject: [mapguide-internals] PATCH: Raster stability fixes, ticket
> #462
>
> Hi,
>
> Haris has been working on the raster stability problem against
> MapGuide 2.0.2.  I was having the same problems, and was able to
> duplicate on 2.1.  I have worked with Haris to create patches against
> trunk that address a couple of the problems and increase stability
immensely.

>
> Could someone please review the patches attached to:
>
> http://trac.osgeo.org/mapguide/ticket/462
>
> and commit them to trunk if appropriate?
>
> My feeling is that the connection management stuff may need some
> additional work, since limiting the connection pool to 1 for
> single-threaded providers doesn't appear to have the desired effect.
> However, getting both of these patches into 2.1 would at least improve

> stability for Raster users.
>
> Jason_______________________________________________
> mapguide-internals mailing list
> [hidden email]
> http://lists.osgeo.org/mailman/listinfo/mapguide-internals
> _______________________________________________
> mapguide-internals mailing list
> [hidden email]
> http://lists.osgeo.org/mailman/listinfo/mapguide-internals
> _______________________________________________
> mapguide-internals mailing list
> [hidden email]
> http://lists.osgeo.org/mailman/listinfo/mapguide-internals
> _______________________________________________
> mapguide-internals mailing list
> [hidden email]
> http://lists.osgeo.org/mailman/listinfo/mapguide-internals
> _______________________________________________
> mapguide-internals mailing list
> [hidden email]
> http://lists.osgeo.org/mailman/listinfo/mapguide-internals
_______________________________________________
mapguide-internals mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/mapguide-internals
Traian Stanev

RE: RE: PATCH: Raster stability fixes, ticket#462

Reply Threaded More More options
Print post
Permalink

Hi Haris,

I agree that the connection manager is wrong. However, I used the same settings for GDAL as we use for other providers, and which work ok with multiple connections. So fixing the connection manager will (probably) have no effect on what I tried.

Why would the provider still be single threaded if I allow multiple GDAL FDO connections, and remove the global lock from the provider and the global raster lock from the MG code? I checked the documentation for GDAL and in theory it does allow multiple threads to read from the same file at once, so GDAL itself is not single-threaded either.

It seems to me that fixing what is really a refcounting problem in the provider by adding more locks is not the best fix.

Traian

________________________________________
From: [hidden email] [[hidden email]] On Behalf Of Haris Kurtagic [[hidden email]]
Sent: Sunday, March 29, 2009 4:54 AM
To: MapGuide Internals Mail List
Subject: RE: [mapguide-internals] RE: PATCH:  Raster stability fixes,   ticket#462

Hi Traian,
by just adding ref count to FDO GDA provider you did went into lock,
with single connection to provider.
That is where connection manager is wrong, it was constantly waiting for
connection (it is not problem of guard in stylization).
By enabling connection pooling it is still single thread because
provider itself is single threaded.
With only ref count in FDO GDAL , MG will not work properly.

I think we need to slow down a bit and find a solution for all issues we
have there. It is a connected problem and we need to find solution for
few related problems.

- Connection manager not using connection ref count for deciding
used/unused connections
- Really allow executing of single threaded providers
- I have feeling that because some problems previously were tried to
solve too quickly we ended up with gdal blocked all the way, perhaps
that is not needed.

Haris

-----Original Message-----
From: [hidden email]
[mailto:[hidden email]] On Behalf Of Traian
Stanev
Sent: Sunday, March 29, 2009 7:22 AM
To: MapGuide Internals Mail List
Cc: FDO Internals Mail List
Subject: RE: [mapguide-internals] RE: PATCH: Raster stability fixes,
ticket#462


For what it's worth, patching the provider to addref the parent
connection from the feature reader fixes the problem. This is on top of
Friday's trunk MapGuide and FDO code, without any other extra fixes. I
do think it is a bug not to addref the connection from the reader, when
it in fact depends on it to be alive, in order to get stuff from it at a
later time.

I am pretty sure the addref fixes the problem, because the test I am
using is fairly brutal and produces the crash/deadlock in less than 5
seconds every time I run it with unpatched GDAL provider. With the
change, I ran it multiple times to completion.

I am unhappy with the fix though, because it leaves CPU usage by
mgserver at ~5%, even though I am doing something like 16 simultaneous
requests... So then I decided to remove the global mutex from inside the
provider -- that didn't make a difference in CPU usage, and the test
still passed without problems. So that mutex can go. Then I messed with
the connection pooling settings in serverconfig.ini -- those didn't make
any difference. I tried them all for the GDAL provider and the test
still passed with 5% CPU usage. I don't think the process is blocked for
I/O because the TIFF I am using is not huge and I also generated an
overlay pyramid (which I highly recommend to anyone by the way, it makes
a huge performance difference for TIFFs).




Traian


> -----Original Message-----
> From: [hidden email] [mailto:mapguide-
> [hidden email]] On Behalf Of Haris Kurtagic
> Sent: Saturday, March 28, 2009 4:43 AM
> To: MapGuide Internals Mail List
> Subject: RE: [mapguide-internals] RE: PATCH: Raster stability fixes,
> ticket #462
>
> Regarding MG raster problems ( or any other provider ) the main reason

> is connection manager.
> If you would set connection pool size 1 , for sdf provider MG will not

> work properly.
>
> The patch Jason submitted, is workaround for issue. It will not cause
> GDAL to work slower, because provider itself is basically
> single-threaded now.
> That was done trough work we were doing years or so ago. I was looking

> at provider back then and realized that making gdal calls guarded will

> make it more stable. Frank then submitted patch with guard on every
> single gdal library call.
>
> I will try to explain what is happing now:
> 1. Map with more than one gdal layer
> 2. There are two calls in server code execute query and stylize 3. One

> thread will open connection in execute query and went to stylize
> procedure 4. Another thread will come into execute query and because
> it could be another layer it will try to open new connection 5.
> Connection manager will check pool size ( 1 in gdal case ) and because

> provider didn't inc. reference count to connection (not  way to solve,

> but anyhow provider needs to increment ) it will delete previous
> connection and create new one 6. Access violation in first thread
> while stylization because of deleted connection
>
>
> In shorter version things to change:
> 1. FDO GDAL provider is not increasing reference count on FDO
> connection when used in other classes like georasterband , which is
> later used in query (wrong) 2. connection manager is managing
> connection pool by using fdo connection reference count ( wrong ) 3.
> it looks like working for other providers, because connection pool
> size is 200 and other providers are incrementing connection pointer in

> select command for example, but if there would be provider with select

> command with no reference count to connection, it will crash too 4.
> Server code is making assumption that pool size for single threaded is

> 1 (not correct), we need to make code change to allow for making calls

> to provider single threaded
>
> Haris
>
> -----Original Message-----
> From: [hidden email]
> [mailto:[hidden email]] On Behalf Of Bruce

> Dechant
> Sent: Friday, March 27, 2009 4:55 PM
> To: MapGuide Internals Mail List
> Subject: [mapguide-internals] RE: PATCH: Raster stability fixes,
> ticket
> #462
>
> I totally agree - the locking needs to be in the provider.
>
> Thanks,
> Bruce
>
>
> -----Original Message-----
> From: [hidden email]
> [mailto:[hidden email]] On Behalf Of
> Traian Stanev
> Sent: Friday, March 27, 2009 9:40 AM
> To: MapGuide Internals Mail List
> Subject: [mapguide-internals] RE: PATCH: Raster stability fixes,
> ticket
> #462
>
>
> It does look like a sledgehammer approach -- what if some day there is

> a raster provider that works ok and doesn't need the locks? If
> anything, the lock should be in the FDO provider.
>
> Traian
>
>
> -----Original Message-----
> From: [hidden email]
> [mailto:[hidden email]] On Behalf Of Walt
> Welton-Lair
> Sent: Friday, March 27, 2009 11:11 AM
> To: MapGuide Internals Mail List
> Subject: [mapguide-internals] RE: PATCH: Raster stability fixes,
> ticket
> #462
>
> The memory patch (mapguide_raster_unalloc.5.patch) looks good, and
> I'll submit that.
>
> The stability patch (mapguide_raster_stability.patch) looks
> reasonable, but I'd like another 1 or 2 people to also look at it.
> It's significant since it limits MapGuide to processing one raster
> stylization request at a time, but I guess that's better than having a

> dead lock.  Anyone know where the thread-unsafe code actually is?  Is
> it in the provider itself?
> Can the mutex be moved closer to the actual thread-unsafe code?
>
> Walt
>
> -----Original Message-----
> From: [hidden email]
> [mailto:[hidden email]] On Behalf Of Jason

> Birch
> Sent: Friday, March 27, 2009 3:10 AM
> To: [hidden email]
> Subject: [mapguide-internals] PATCH: Raster stability fixes, ticket
> #462
>
> Hi,
>
> Haris has been working on the raster stability problem against
> MapGuide 2.0.2.  I was having the same problems, and was able to
> duplicate on 2.1.  I have worked with Haris to create patches against
> trunk that address a couple of the problems and increase stability
immensely.

>
> Could someone please review the patches attached to:
>
> http://trac.osgeo.org/mapguide/ticket/462
>
> and commit them to trunk if appropriate?
>
> My feeling is that the connection management stuff may need some
> additional work, since limiting the connection pool to 1 for
> single-threaded providers doesn't appear to have the desired effect.
> However, getting both of these patches into 2.1 would at least improve

> stability for Raster users.
>
> Jason_______________________________________________
> mapguide-internals mailing list
> [hidden email]
> http://lists.osgeo.org/mailman/listinfo/mapguide-internals
> _______________________________________________
> mapguide-internals mailing list
> [hidden email]
> http://lists.osgeo.org/mailman/listinfo/mapguide-internals
> _______________________________________________
> mapguide-internals mailing list
> [hidden email]
> http://lists.osgeo.org/mailman/listinfo/mapguide-internals
> _______________________________________________
> mapguide-internals mailing list
> [hidden email]
> http://lists.osgeo.org/mailman/listinfo/mapguide-internals
> _______________________________________________
> mapguide-internals mailing list
> [hidden email]
> http://lists.osgeo.org/mailman/listinfo/mapguide-internals
_______________________________________________
mapguide-internals mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/mapguide-internals
_______________________________________________
mapguide-internals mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/mapguide-internals
1 2 3