Answered

Improved FIM Agent handling of no-start-ma-already-running run profile return status

Bob Bradley 13 years ago updated by anonymous 8 years ago 6

A problem with Event Broker 3.0.* that had to be resolved for the DEEWR deployment was how to handle the (special) run profile return status of no-start-ma-already-running, so that FIM MA delta import/delta sync run profiles invoked while the MA is busy are retried rather than skipped. The solution implemented followed the advice offered by Matt (refer https://unifysolutions.jira.com/browse/EB-381?focusedCommentId=17596&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17596): a secondary FIM agent used exclusively by the FIM MA Operations invoked externally by FIM Portal workflow. The new agent was identical to the existing FIM agent except that the no-start-ma-already-running return status was excluded from its success status list. As a result, any externally invoked operation list that returned this status would raise an exception in the Event Broker logs, enabling these operations to be marked as "Queue Missed" so that Event Broker would retry once the MA was free to run.

The implementation of the above solution has several unwanted side effects (for example, editing or creating a run profile Operation for ANY MA forces you to re-specify which of the 2 agents to use). It also complicates the deployment of Event Broker to meet what will be a standard requirement of EVERY Event Broker 3.* deployment involving the FIM MA (i.e. Portal).

A more standard/transparent approach is requested which could avoid the need to define a secondary FIM agent, and thereby streamline the configuration in a repeatable "standard FIM MA configuration" for other FIM sites. My thoughts around how this might be achieved involve the following:
1. Extend the FIM Agent schema with an additional "Retry Statuses" property, defaulting to no-start-ma-already-running. That status could also remain in the existing "Success Statuses", with "Retry Statuses" allowing it special handling (see below);
2. Enhance Operation handling to use the new "Retry Statuses" property so that the event is logged as a warning rather than an exception (in one day's processing @ DEEWR there were 114 exceptions logged, of which 99 were cases of no-start-ma-already-running), while still observing the "Queue Missed" property if set for that MA.
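
To make the two points above concrete, here is a rough Python sketch of the kind of handling I have in mind. It is purely illustrative and not Event Broker code: the names FimAgentConfig, classify and handle_run_result are hypothetical, "stopped-connectivity" is only used as an example of a failing status, and how genuine failures are treated is deliberately left out.

```python
import logging
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet

log = logging.getLogger("eventbroker.sketch")  # hypothetical logger name


class Outcome(Enum):
    SUCCESS = "success"
    RETRY = "retry"
    FAILURE = "failure"


@dataclass(frozen=True)
class FimAgentConfig:
    """Hypothetical agent settings; 'retry_statuses' is the proposed new property."""
    success_statuses: FrozenSet[str] = frozenset(
        {"success", "no-start-ma-already-running"}
    )
    retry_statuses: FrozenSet[str] = frozenset({"no-start-ma-already-running"})


def classify(status: str, agent: FimAgentConfig) -> Outcome:
    """Map a run profile return status to an outcome.

    Retry statuses are checked first, so a status listed in both sets
    still receives the special (retry) handling.
    """
    if status in agent.retry_statuses:
        return Outcome.RETRY
    if status in agent.success_statuses:
        return Outcome.SUCCESS
    return Outcome.FAILURE


def handle_run_result(status: str, agent: FimAgentConfig, queue_missed: bool) -> bool:
    """Return True if the operation should be queued to run again."""
    outcome = classify(status, agent)
    if outcome is Outcome.RETRY:
        # Logged as a warning rather than an exception.
        log.warning("Run profile returned %s; MA busy", status)
        # Only re-queue when "Queue Missed" is set for this MA.
        return queue_missed
    if outcome is Outcome.FAILURE:
        log.error("Run profile failed with status %s", status)
    # Failure handling beyond logging is out of scope for this sketch.
    return False
```

With defaults like these a single FIM agent would cover both the scheduled and the Portal-invoked operations, so the secondary agent would not be needed.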


Event Broker.Agents.html
Event Broker.Groups.html
Event Broker.Logging.html
Event Broker.Operations.html
Event Broker.Policy.html
Event Broker.Roles.html

I meant to list, as one of the "unwanted side effects", the excessive logging of no-start-ma-already-running results (which are really warnings, not errors) as exceptions in the Event Broker logs. Another way of looking at this issue is to consider the implications of NOT providing an alternative to the above configuration for DEEWR ... i.e. propagating the above configuration throughout ALL FIM Portal Clients would not be desirable due to the added complications it introduces (i.e. I believe we would all want something much "cleaner"). If time prevents such a feature being implemented in the next release, we are going to have to put together some sort of "How To" configuration article explaining how the above needs to be implemented ...

I agree that since this may be of note to all our FIM Portal sites, this is certainly something we should give thought to in a future version.

Attached is the complete Event Broker config documentation HTML generated (from my xslts) for DEEWR ... this shows the use of multiple FIM agents. Note I appear to have excluded the "in-progress" status from the secondary FIM agent as well as "no-start-ma-already-running", although I don't know why, as from memory I've not seen the "in-progress" status myself.

See EB300:Usage Considerations for current behaviour. This will need to be updated following any changes to this interface.

Bob, just marking this issue as resolved as we're doing some minor cleanup. This behaviour has been addressed by the correction of the Queue Missed setting in v3.0.3, and the target environment has been addressed (see EB-487 and DEEWR-57). I have also just updated EB300:Usage Considerations with an advisory to use an exclusion group to prevent import and export operations on the same management agent (but in separate operation lists) from blocking each other.

Can you please close this issue if the above is appropriate?

Closing despite not having a chance to test with v3.1. The use of exclusion groups needs to become "best practice" and I don't think we've heard the last of this quite yet. Still getting occasional locking errors @CSODBB on export to AD - although I'm using 3.0 there.