Possible BUG in ReadyToRunQueue.getNextNonSuspendedBPIT ? [was:: OnAlarm on Pick activity stops working]

Fred van Engen

Possible BUG in ReadyToRunQueue.getNextNonSuspendedBPIT ? [was:: OnAlarm on Pick activity stops working]

Hi again,

[Resending this message to give it its own mail thread]

I noticed that the stalled OnAlarm handler did get executed as soon as any
unrelated message was processed. I also had suspended instances in BPELSE
at the time.

So I looked at the BPELSE sources to see what could cause this problem.

Could this be a bug in ReadyToRunQueue.getNextNonSuspendedBPIT ?

ReadyToRunQueue.getNextScheduledTime is used to determine how long to
wait before calling BPELProcessManagerImpl.process() again if no
messages are received from the NMR.

http://fisheye5.atlassian.com/browse/open-jbi-components/ojc-core/bpelse/bpelcore/src/com/sun/jbi/engine/bpel/core/bpel/engine/impl/ReadyToRunQueue.java?r=1.33#l273 


If some instance with a timeout is suspended, it will likely end up
being the one that has expired earliest (mMostRecentlyExpiringBPIT) and
getNextScheduledTime will use getNextNonSuspendedBPIT to get a
non-suspended alternative. But getNextNonSuspendedBPIT will only return
a BPIT that has expired already.

http://fisheye5.atlassian.com/browse/open-jbi-components/ojc-core/bpelse/bpelcore/src/com/sun/jbi/engine/bpel/core/bpel/engine/impl/ReadyToRunQueue.java?r=1.33#l280 


If no running instance is expired, getNextScheduledTime will return zero
and BPELSEInOutThread.run() will call DeliveryChannel.accept() without
timeout. So it will wait indefinitely until a message arrives.

http://fisheye5.atlassian.com/browse/open-jbi-components/ojc-core/bpelse/bpeljbiadapter/src/com/sun/jbi/engine/bpel/BPELSEInOutThread.java?r=1.41#l152 
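
To make the effect concrete, here is a minimal sketch of such a wait loop. It is not the BPELSE source: a java.util.concurrent.BlockingQueue stands in for the DeliveryChannel, and the names AcceptSketch and acceptNext are made up for illustration. The only thing it assumes is what is described above, namely that a scheduled time of zero is treated as "no timer pending".

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Simplified stand-in for the accept logic; not the real BPELSEInOutThread.
class AcceptSketch {

    // Returns the next message, or null if the timeout elapsed first.
    // nextScheduledTime == 0 is taken to mean "no timer pending" (assumption).
    static String acceptNext(BlockingQueue<String> channel, long nextScheduledTime, long now) {
        try {
            if (nextScheduledTime == 0) {
                // Like accept() without a timeout: blocks until a message arrives.
                return channel.take();
            }
            long waitMillis = Math.max(0, nextScheduledTime - now);
            // Like accept with a timeout: returns null so that timers can be processed.
            return channel.poll(waitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
    }

    public static void main(String[] args) {
        BlockingQueue<String> channel = new LinkedBlockingQueue<>();
        long now = System.currentTimeMillis();

        // An alarm is due 50 ms from now: the call gives up after the timeout.
        System.out.println(acceptNext(channel, now + 50, now));

        // With nextScheduledTime == 0 and an empty channel the call would block
        // forever, so a message is queued first to keep this example terminating.
        channel.add("unrelated message");
        System.out.println(acceptNext(channel, 0, now));
    }
}
```

In the second call nothing but an incoming message can wake the thread, which matches the observation that a pending OnAlarm only fires once some unrelated message is processed.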


If getNextNonSuspendedBPIT returned the earliest BPIT to expire (as
its name suggests), BPELSE would end up calling DeliveryChannel.accept
with a timeout and OnAlarm would be executed as expected.
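
The difference between the two behaviours can be sketched with a simplified model. This is not the ReadyToRunQueue code: the class TimerQueueSketch, the Waiting type and both method names are hypothetical, and the "expired only" variant reflects my reading of the source rather than a confirmed description of it.

```java
import java.util.Arrays;
import java.util.List;

// Simplified model of the timer queue; not the real ReadyToRunQueue.
class TimerQueueSketch {

    // Hypothetical stand-in for a waiting instance (a BPIT in the engine).
    static final class Waiting {
        final String id;
        final long expiresAt;
        final boolean suspended;

        Waiting(String id, long expiresAt, boolean suspended) {
            this.id = id;
            this.expiresAt = expiresAt;
            this.suspended = suspended;
        }
    }

    // Behaviour as I read it: only non-suspended entries that have
    // already expired qualify. 0 means "nothing to wait for".
    static long nextScheduledTimeExpiredOnly(List<Waiting> queue, long now) {
        return queue.stream()
                .filter(w -> !w.suspended && w.expiresAt <= now)
                .mapToLong(w -> w.expiresAt)
                .min()
                .orElse(0L);
    }

    // Suggested behaviour: the earliest non-suspended entry, expired or not.
    static long nextScheduledTimeEarliest(List<Waiting> queue) {
        return queue.stream()
                .filter(w -> !w.suspended)
                .mapToLong(w -> w.expiresAt)
                .min()
                .orElse(0L);
    }

    public static void main(String[] args) {
        long now = 2000;
        List<Waiting> queue = Arrays.asList(
                new Waiting("suspended-instance", 1000, true),  // expired, but suspended
                new Waiting("running-instance", 5000, false));  // alarm still in the future

        System.out.println(nextScheduledTimeExpiredOnly(queue, now)); // prints 0
        System.out.println(nextScheduledTimeEarliest(queue));         // prints 5000
    }
}
```

With the first variant the running instance's alarm at 5000 is never taken into account while it lies in the future, so the caller sees zero and waits without a timeout. The second variant yields 5000, which the caller can turn into a timeout.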

Correct?


Regards,

Fred.


Fred van Engen wrote:

> Hi again,
>
> Fred van Engen wrote:
>> Hi,
>>
>> Maybe this is related to the problem posted by [hidden email]
>> who had a problem with OnAlarm on a scope.
>>
>> We have a problem with OnAlarm in a Pick activity. The code looks
>> like this:
>>
> ...
>> Basically an external partner decides what to poll for at another
>> partner. The polling interval is variable, based on what is being
>> polled. Because changes by the client process must be possible at all
>> times, we use a Pick activity with some OnMessage handlers and an
>> OnAlarm handler.
>>
>> The problem is that sometimes the OnAlarm handler won't be executed
>> anymore. After restarting BPELSE the OnAlarm handler will be called
>> again. Persistence is enabled.
>>
>> I surrounded the Pick activity with a Scope because it seemed to
>> help. Unfortunately the problem reappeared during a test.
>>
> Monitoring is also enabled. It shows that the PollTime variable is set
> correctly to the 1 or 15 minute interval and also shows that the pick
> activity lasts longer than that without executing the OnAlarm activity.
>
> When building I get a warning that the duration string has an
> incorrect type in the assign to PollTime, but the assignment works
> (though the OnAlarm stops working sometimes) and I can't think of a
> way to cast it to an xsd:duration.
>> Any hints or work-arounds?
>>
> Thanks for any help.
>
> Fred.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [hidden email]
> For additional commands, e-mail: [hidden email]
>

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Murali Pottlapelli

Re: Possible BUG in ReadyToRunQueue.getNextNonSuspendedBPIT ? [was:: OnAlarm on Pick activity stops working]

Hi Fred,
Regarding the issue, which of these are you saying?
  • OnAlarm is not working.
  • OnAlarm is not working on a suspended instance after it is resumed.
  • Both.
Regarding the code, are you pointing to a regression, or to a bug in getNextNonSuspendedBPIT()?

 
Regards
Murali




Fred van Engen

Re: Possible BUG in ReadyToRunQueue.getNextNonSuspendedBPIT ? [was:: OnAlarm on Pick activity stops working]

Hello Murali,

Thanks for replying.

OnAlarms on all other instances will stop firing after a while when
there is a suspended instance. The problem is not in a suspended
instance after it is resumed.

Because there was initially no response to this problem, I took a look
at the sources and came to suspect a problem in
ReadyToRunQueue.getNextNonSuspendedBPIT. That suspicion prompted me to
look for suspended instances, which I hadn't considered before.

Since then we have used the monitor API to periodically check for suspended
BPEL instances and resume them. I haven't seen the problem since. Before
this workaround it was very common.
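
For anyone wanting to try the same workaround, a periodic check could look roughly like the sketch below. The InstanceMonitor interface is a hypothetical stand-in and not the actual BPELSE monitor API, whose class and method names are not shown in this thread; the class name ResumeWatchdog and the check period are likewise only illustrative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Sketch of a watchdog that resumes suspended instances at a fixed interval.
class ResumeWatchdog {

    // Hypothetical facade over the monitoring API; the real API differs.
    interface InstanceMonitor {
        List<String> findSuspendedInstanceIds();
        void resume(String instanceId);
    }

    // Resumes every instance currently reported as suspended; returns the count.
    static int resumeAll(InstanceMonitor monitor) {
        int resumed = 0;
        // Copy the list first in case the monitor returns a live view.
        for (String id : new ArrayList<>(monitor.findSuspendedInstanceIds())) {
            monitor.resume(id);
            resumed++;
        }
        return resumed;
    }

    // Runs resumeAll every periodSeconds until the returned scheduler is shut down.
    static ScheduledExecutorService start(InstanceMonitor monitor, long periodSeconds) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleWithFixedDelay(() -> {
            try {
                resumeAll(monitor);
            } catch (RuntimeException e) {
                // Swallow the error so that one failed check does not cancel later runs.
                System.err.println("resume check failed: " + e);
            }
        }, 0, periodSeconds, TimeUnit.SECONDS);
        return scheduler;
    }

    public static void main(String[] args) {
        // In-memory fake standing in for a real monitor connection.
        final List<String> suspended =
                new ArrayList<>(Arrays.asList("instance-1", "instance-2"));
        InstanceMonitor fake = new InstanceMonitor() {
            public List<String> findSuspendedInstanceIds() { return suspended; }
            public void resume(String instanceId) { suspended.remove(instanceId); }
        };

        System.out.println("resumed: " + resumeAll(fake));
        System.out.println("still suspended: " + suspended);
    }
}
```

Note that this only hides the symptom: whether blindly resuming instances is acceptable depends on why they were suspended in the first place.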

In GF v2 UR2 we also had issues with all activity stopping until a new
message was sent into BPELSE. I don't know whether we had any suspended
instances at that time so those issues may have been unrelated. It's
probably not a regression but a bug that's been there for some time.


Regards,

Fred.




