Start service from specified step

Ross Currie 13 years ago, updated by anonymous 8 years ago

When starting the Unify Event Broker service, it sometimes fails at a certain step in the startup schedule.

When you're dealing with 45-minute imports, it can be frustrating to have to wait while the ones that have already completed are re-run.

To handle this situation the following will need to be achieved:

  • Allow for operation lists to be run by operations. (EB-516)
  • Allow for any operation in an operation list to be run manually.
  • Define a global start-up operation list, which has operations that reference the operation lists to run on startup.


1) You get to define the order of start-up operation lists.
2) You get to skip the completed startup operation lists by starting from the one that failed.
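The requested behaviour can be sketched roughly as follows. This is a minimal Python illustration only, not Event Broker's actual API: `OperationList`, `StartupFailed`, `run_startup` and the `start_from` parameter are all hypothetical names invented for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class OperationList:
    """Hypothetical stand-in for an operation list referenced by the startup list."""
    name: str
    run: Callable[[], None]  # expected to raise an exception on failure


class StartupFailed(Exception):
    """Raised when a startup step fails; records which step to resume from."""

    def __init__(self, step_name: str, cause: Exception):
        super().__init__(f"startup failed at {step_name!r}: {cause}")
        self.step_name = step_name


def run_startup(steps: List[OperationList],
                start_from: Optional[str] = None) -> List[str]:
    """Run the startup operation lists in their defined order.

    With start_from=None (the default) every list is run. Otherwise the
    lists before the named one are skipped, on the assumption that they
    already completed in an earlier attempt.
    Returns the names of the lists that completed in this call.
    """
    names = [step.name for step in steps]
    # names.index raises ValueError if start_from is not a known step.
    first = 0 if start_from is None else names.index(start_from)

    completed = []
    for step in steps[first:]:
        try:
            step.run()
        except Exception as exc:
            # Stop at the first failure and report where to resume.
            raise StartupFailed(step.name, exc) from exc
        completed.append(step.name)
    return completed
```

If a run raises `StartupFailed`, the operator would fix the underlying problem and call `run_startup(steps, start_from=exc.step_name)` to pick up at the failed list rather than repeating the ones before it.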

This one is interesting... will definitely require some thinking. Personally, I think that startup operations should have their "On Failure" attribute set to "Continue" so that a baseline will still continue even if a failure occurs. The failed operation could then be run manually. This should be made easier in EB3 with the ability to see failed operation lists/operations.

I don't really like the idea of starting from a particular operation list step as it introduces another potential place for the solution to be in an inconsistent state. This happens enough with connector spaces, and having potential for operation lists to get stuck in a state like this is something I'm not too comfortable with. Open to comment.

Actually, there are some very good reasons why you would not want to set your startup list to "Continue". For example, if you're building a metaverse and ILM/FIM is only looking at OUs in AD where it is supremely authoritative, and there is a flow rule that says "if the HR value is not active or present, disable the user", then it's pretty important that the HR import occurs first; otherwise all of your users will be disabled...
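To make that risk concrete, here is a small Python simulation of the scenario. It is a sketch under stated assumptions, not ILM/FIM or Event Broker code: `OnFailure`, `run_list` and `simulate` are hypothetical names, and the flow rule is reduced to a single dictionary lookup.

```python
from enum import Enum


class OnFailure(Enum):
    STOP = "Stop"
    CONTINUE = "Continue"


def run_list(operations, on_failure):
    """Run (name, callable) pairs in order.

    Returns (completed, failed) name lists. With OnFailure.STOP the list
    halts at the first failing operation; with CONTINUE it carries on.
    """
    completed, failed = [], []
    for name, operation in operations:
        try:
            operation()
            completed.append(name)
        except Exception:
            failed.append(name)
            if on_failure is OnFailure.STOP:
                break
    return completed, failed


def simulate(on_failure):
    """Simulate a startup list whose HR import fails before the sync runs."""
    users = {"alice": "enabled", "bob": "enabled"}
    hr_status = {}  # would only be populated by a successful HR import

    def hr_import():
        raise ConnectionError("HR source unreachable")

    def sync():
        # Simplified flow rule: no active HR value means disable the user.
        for user in users:
            if hr_status.get(user) != "active":
                users[user] = "disabled"

    completed, failed = run_list(
        [("HR import", hr_import), ("Sync", sync)], on_failure
    )
    return users, completed, failed
```

With `OnFailure.CONTINUE` the sync still runs against the empty HR data and disables every user; with `OnFailure.STOP` the sync never runs and the accounts are left alone.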

And there are some very good reasons why you'd want a "continue from step":

Let's say, for example, that you've just configured a new solution and you're deploying it to dev. You've imported all the MAs into ILM/FIM and you're on step 8 of 12 of your startup list when it fails due to a connectivity issue on the MA. Say your Oracle TNS names are configured to use 3 servers and the local ops team has failed to open the firewall to 1 of the 3. So when you import the MAs, the round-robin selection of server works fine, but when the import executes it fails because it's trying to use the 3rd server.

Now it's 1am and this has to be deployed by 9am... but the 8 full import/syncs already took 3 hours and you don't want to restart them again, because it'd be nice to get SOME sleep. If you can't start from a certain step, you either have to 1) restart from scratch and hope it works in time; 2) run each import manually; or 3) write a VB script to run the remaining ones in sequence. Of course, these last two kind of defeat the purpose of using Event Broker in the first place.

Thank you for expanding on your initial request, Ross. I'll add this to the list of things we need to discuss.

Given those use cases, it sounds like something worthwhile to include. Something we do need to be careful of is to ensure operation lists cannot get into an inconsistent state (e.g. perhaps only startup operation lists should have this functionality). There may also be cases where you do want to run the entire startup operation list again, so this should also include the ability to ignore the "startup from step" option.

Yeah, the feature was for a separate option (default is do all), and only for the startup schedule.

Requires operations that can run operation lists.

Having a look at this issue from a few years ago...

Most users are now pretty well-versed in using a management agent-oriented solution model. Doing this means that operation lists are very rarely of substantial enough size for these scenarios to arise, and very few people even use the startup operation list idea any more. It is also extremely easy to add new operation lists on the fly to pick up where something failed, to set up adequate retry settings to prevent this from happening, or to take alternate courses of action in the event of a failure.

Closed as Won't Fix.