Fixed

Operation List for Outgoing Pending Check Operation repeats indefinitely when export run profile returns a "completed-export-errors" status

Bob Bradley 8 years ago • updated by anonymous 3 years ago

When an export run profile executed for the FIM MA resulted in a "completed-export-errors" return status, the Outgoing Pending Check Operation continued to report the change as pending, resulting in the operation list firing indefinitely. The "success statuses" for the configured FIM Agent correctly include this status value explicitly, since the failure is for individual export failure(s) and not the batch as a whole.

When an export failure occurs, an indicator is set for the connector space object in the FIMSynchronizationService database, and this should be used as a "circuit breaker" to prevent infinite looping (until the problem is resolved, which may take days/weeks/months ...). However, care must be taken to ensure that once the indicator is cleared, the export is allowed to fire once more ... see https://unifysolutions.jira.com/browse/EB-203 for a detailed explanation of this problem witnessed with EvB 2.0.3 (the fix for which may have inadvertently led to this problem).

Hi Matt, can you please evaluate this issue and determine what our possible solutions are.

Assigned to Bob for further comment

Bob,

I was under the impression from EB-203 that this was the desired result of changing the outgoing pending check - that exports should fire until complete. I am unsure as to what conditions tell FIM that a previously failing export is now "good to go" short of repeating an attempt at the failed export, as the reasons for this could be very broad (provisioning logic created an invalid anchor, a minor attribute is of the wrong format, the target system has become temporarily unavailable for a few of the exports etc.). Otherwise, it would have to be triggered manually or have a full sync execute, or something to that effect.

Matt - I was worried all along about you interpreting the requirement for EB-203 that way, which is why I tried to go to great lengths to explain the scenario. Have another look at EB-203 ... in particular the post "added a comment - 08/Mar/11 10:46 AM", where I say "What gets left in the database is a NULL in the mms_connectorspace.is_export_error attribute (i.e. this BIT type attribute is nullable, and not always 0 or 1).". I had presumed from your subsequent remarks that you had catered for this scenario ... i.e. you can't ignore is_export_error altogether either. I believe the right SQL clause should be "AND ISNULL(CS.IS_EXPORT_ERROR,0) = 0 ". Can you please confirm this Matt?

Matt to check on current SQL logic - please paste query in full here.

Since the above scenario was the one we designed to, the query was reduced to:

SELECT TOP 1 CS.ma_id
FROM mms_connectorspace CS WITH (NOLOCK)
INNER JOIN mms_partition SB WITH (NOLOCK) ON CS.partition_id = SB.partition_id
WHERE CS.ma_id=@managementAgentId
AND CS.current_export_batch_number > SB.last_successful_export_batch_number

Spoke with Bob over the phone, will observe behaviour when the above check is added

Matt - I thought I'd set up the following script to demonstrate what I was trying to say on the phone. When you run it yourself you will see there is only 1 value returned for the '<>' style statement, and that the correct result we want is returned by EITHER the 'isnull' or 'or' tests. The 'isnull' option was the one I proposed because it is more efficient ... not sure what you were doing before, but this is what I would like to see used, and should give the desired result (i.e. solve both EB-203 and EB-294).

declare @tbl table (
test varchar(10) NOT NULL,
value bit NULL
)

insert into @tbl
Select '0', 0
Union
Select '1', 1
Union
Select 'Null', null

-- [group] is bracketed because GROUP is a reserved word in T-SQL
select 'control' as [group], * from @tbl
union
select 'isnull', * from @tbl
where ISNULL(value,0) = 0
union
select 'or', * from @tbl
where value is null
or value = 0
union
select '<>', * from @tbl
where value <> 1
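
For anyone without a SQL Server instance to hand, the same comparison can be sketched in Python with the standard-library sqlite3 module. This is only an approximation of the script above: SQLite stands in for SQL Server, COALESCE(value, 0) is assumed to be an acceptable substitute for ISNULL(value, 0), and the table name tbl is made up for the example. Both engines treat a comparison against NULL as "unknown", which is why the '<>' test drops the NULL row.

```python
import sqlite3

# SQLite stands in for SQL Server here (an assumption of this sketch);
# COALESCE(value, 0) plays the role of T-SQL's ISNULL(value, 0).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (test TEXT NOT NULL, value INTEGER NULL)")
conn.executemany(
    "INSERT INTO tbl (test, value) VALUES (?, ?)",
    [("0", 0), ("1", 1), ("Null", None)],
)

# The three candidate predicates from the script above.
PREDICATES = {
    "isnull": "COALESCE(value, 0) = 0",
    "or": "value IS NULL OR value = 0",
    "<>": "value <> 1",
}


def matching_rows(predicate):
    """Return the sorted 'test' labels of the rows that satisfy the predicate."""
    sql = f"SELECT test FROM tbl WHERE {predicate} ORDER BY test"
    return [row[0] for row in conn.execute(sql)]


results = {name: matching_rows(pred) for name, pred in PREDICATES.items()}
for name, rows in results.items():
    print(f"{name:>6}: {rows}")
# isnull: ['0', 'Null']
#     or: ['0', 'Null']
#     <>: ['0']
```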

Bob,

I really want to make sure we get this right. I've added your suggested check, and I've observed the following behaviour:

  • If the entire batch fails due to system unavailability or something to that effect, Event Broker will continue to retry
  • If there is one item in the batch and it fails due to service unavailability, it will not be retried until FIM flags the item as exported-change-not-reimported on a successful confirming import sync
  • Obviously, once the export has been successful, the check does not continue to fire

What I really need is a clear use case that would prove that this check is now working for the behaviour you expect in EB-203. If you could let me know a scenario that would easily and accurately replicate the issue as you expect, that would be a big help. Specifically, what is an operation that would "clear" the export error field and leave it as a null value?

Matthew - it just so happens that I got JUST THIS SITUATION just now @ DEEWR with EvB 2.3, and I traced the SQL and it is showing as this:

SELECT MA.ma_name
FROM mms_management_agent MA WITH (NOLOCK)
where MA.MA_ID in (select CS.ma_id
from mms_connectorspace CS WITH (NOLOCK, INDEX(IX_mms_connectorspacepartitexportcurrenunappldepthis_expobjectpobjec))
INNER JOIN mms_partition SB WITH (NOLOCK) ON CS.partition_id = SB.partition_id
where CS.CURRENT_EXPORT_BATCH_NUMBER > SB.last_successful_export_batch_number
and CS.IS_EXPORT_ERROR <> 1)

Now this is definitely wrong because after the error condition is resolved, IS_EXPORT_ERROR is set to NULL, which means from THAT POINT ONWARDS this CS object never gets detected as a pending export if it's on its own!!! The correction should definitely be "and ISNULL(CS.IS_EXPORT_ERROR,0) = 0".

As for your use case definition, that sounds about right except for #2. The only problem with this scenario seems to be (as in my case just now) that even after a full import/full sync following the failed export, the is_export_error flag (value 1) does not get cleared ... this seems to happen only on a successful export. The issue with the present EvB 2.3 code is that after a successful export the value is set to NULL (not 0, which appears to be the default for a new CS object), never gets reset to 0 from that point onwards, and hence the object never gets exported unless there's another CS object waiting to go where is_export_error does = 0.

Make sense?

So it sounds like the use case is that a new connector which fails on its first export is never exported again - is that correct?

And yes, the script I was checking before now includes your mentioned check. Together the script is now:

SELECT TOP 1 CS.ma_id
FROM mms_connectorspace CS WITH (NOLOCK)
INNER JOIN mms_partition SB WITH (NOLOCK) ON CS.partition_id = SB.partition_id
WHERE CS.ma_id=@managementAgentId
AND CS.current_export_batch_number > SB.last_successful_export_batch_number
AND ISNULL(CS.IS_EXPORT_ERROR,0) = 0

No Matt - the use case is slightly different ... if a connector EVER fails on export, it is NOT to be exported again until either (a) another connector for the same MA is also pending export, or (b) a successful export occurs (either manually or via EvB), in which case FIM will reset IS_EXPORT_ERROR to NULL. Therefore the desired behaviour is achieved by ensuring that the pending changes detection process finds at LEAST one connector pending export where IS_EXPORT_ERROR is either null or 0.
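
To make the difference between the traced query and the corrected one concrete, here is a rough, self-contained sketch in Python using sqlite3. The table and column names come from the queries quoted in this thread; everything else is an assumption for illustration only: integer ids instead of the real key types, the object_id column, the sample rows, SQLite in place of SQL Server (so no NOLOCK hints, LIMIT 1 instead of TOP 1, and COALESCE instead of ISNULL). It is not the Event Broker implementation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Heavily simplified stand-ins for the FIM tables (assumed shapes, not the real schema).
conn.executescript("""
CREATE TABLE mms_partition (
    partition_id INTEGER PRIMARY KEY,
    last_successful_export_batch_number INTEGER NOT NULL
);
CREATE TABLE mms_connectorspace (
    object_id INTEGER PRIMARY KEY,
    ma_id INTEGER NOT NULL,
    partition_id INTEGER NOT NULL,
    current_export_batch_number INTEGER NOT NULL,
    is_export_error INTEGER NULL
);
""")
conn.executemany("INSERT INTO mms_partition VALUES (?, ?)", [(1, 5), (2, 5), (3, 5)])
conn.executemany(
    "INSERT INTO mms_connectorspace VALUES (?, ?, ?, ?, ?)",
    [
        (10, 1, 1, 6, None),  # MA 1: flag reset to NULL by an earlier successful export
        (20, 2, 2, 6, 1),     # MA 2: only pending object is still in export error
        (30, 3, 3, 6, 1),     # MA 3: one failed object ...
        (31, 3, 3, 6, 0),     # ... plus a fresh object that has not failed
    ],
)

QUERY = """
SELECT 1
FROM mms_connectorspace CS
INNER JOIN mms_partition SB ON CS.partition_id = SB.partition_id
WHERE CS.ma_id = ?
  AND CS.current_export_batch_number > SB.last_successful_export_batch_number
  AND {flag_test}
LIMIT 1
"""


def has_pending_export(ma_id, flag_test):
    """True if at least one pending connector for the MA passes the flag test."""
    row = conn.execute(QUERY.format(flag_test=flag_test), (ma_id,)).fetchone()
    return row is not None


old = {ma: has_pending_export(ma, "CS.is_export_error <> 1") for ma in (1, 2, 3)}
new = {ma: has_pending_export(ma, "COALESCE(CS.is_export_error, 0) = 0") for ma in (1, 2, 3)}
print("old:", old)  # old: {1: False, 2: False, 3: True}
print("new:", new)  # new: {1: True, 2: False, 3: True}
```

Under these assumptions the only difference is MA 1: the "<> 1" test never sees the connector whose flag is NULL, while the NULL-safe test does. MA 2 stays quiet in both cases, which is the circuit-breaker behaviour asked for.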

Oh, and I wish I knew what wiki markup gives that fancy formatting above ...

Ok Bob, I have tested this issue according to what you've said, and found the script does as you say. Please confirm this is correct:

  • Identity Broker (target system) turned off
  • A pending export is created for Identity Broker
  • Event Broker detects the pending export and attempts to fire a change
  • An export attempt is made, fails, and does not repeat
  • A new pending export is introduced
  • Event Broker detects the new pending export and attempts another export, and the export for both objects fails
  • Identity Broker is turned back on
  • Another item for export is created
  • Event Broker detects the new item and successfully exports the items
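
The steps above can be walked through with a small Python model. This is a toy sketch only: the Connector class, check_fires and run_export are hypothetical names invented here, the "pending" flag is a stand-in for the batch-number comparison in the real query, and the flag transitions (1 on failure, NULL after a successful export) are taken from the descriptions in this thread rather than from FIM documentation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Connector:
    name: str
    pending: bool = True                   # an export is waiting for this object
    is_export_error: Optional[int] = 0     # 0, 1 or None, mirroring the nullable BIT


def check_fires(connectors: List[Connector]) -> bool:
    """Pending check: at least one pending connector whose flag is NULL or 0."""
    return any(c.pending and (c.is_export_error or 0) == 0 for c in connectors)


def run_export(connectors: List[Connector], target_online: bool) -> None:
    """Attempt to export every pending connector, as an export run would."""
    for c in connectors:
        if not c.pending:
            continue
        if target_online:
            c.pending = False
            c.is_export_error = None       # reset to NULL, as reported in this thread
        else:
            c.is_export_error = 1          # individual export failure


fired = []
cs = [Connector("A")]                      # target off, first pending export
fired.append(check_fires(cs))              # True: export is attempted
run_export(cs, target_online=False)
fired.append(check_fires(cs))              # False: failed object alone does not retry
cs.append(Connector("B"))                  # a new pending export is introduced
fired.append(check_fires(cs))              # True: new object triggers another export
run_export(cs, target_online=False)        # both objects fail
fired.append(check_fires(cs))              # False
cs.append(Connector("C"))                  # target back on, another item created
fired.append(check_fires(cs))              # True
run_export(cs, target_online=True)         # all three export successfully
fired.append(check_fires(cs))              # False: nothing left pending
print(fired)  # [True, False, True, False, True, False]
```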

So a sync and a new connector will cause exports to fire in your use case.

Great, thanks Bob. Script has been updated. Please confirm resolution in the next release.

Have not seen this problem since the fix was available.