0
Fixed

Identity Broker Service fails to start (timeout) when removing connectors/adapters

Ross Currie 13 years ago updated by anonymous 8 years ago 10

Thought I'd added this one already, but can't see it....

Essentially what happens here is that when you start the Identity Broker service, it has a database timeout. I don't have the exact log message, but I can reproduce and will do so once I finish working in the environment I'm in (next week). Removing all connectors/adapters from the config has no impact. Clearing the database allows the service to restart.

My understanding is that this error is caused by Identity Broker's clean-up process that occurs during startup... for example, if a connector has been removed, Identity Broker deletes all entities in that connector... etc.

The issue comes in where you have large volumes of data. The time taken to run the entity deletion command exceeds Identity Broker's database timeout.

This can cause problems in development environments where you are chopping and changing connectors/adapters quite regularly. In production it's not as big an issue, though it can have some impact.

Two work-arounds:
1) Leave all connectors/adapters in when testing new ones.. just extend timings
2) Manually delete entities for those connectors/adapters you're removing from the Entity table before

The description for Event ID 0 from source The following error in Identity Broker occurred during start of the server: Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding. cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.

If the event originated on another computer, the display information had to be saved with the event.

The following information was included with the event:

Error occurred in module: Identity Broker

The following error occurred:
System.Data.SqlClient.SqlException: Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding.
at System.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection)
at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj)
at System.Data.SqlClient.TdsParserStateObject.ReadSniError(TdsParserStateObject stateObj, UInt32 error)
at System.Data.SqlClient.TdsParserStateObject.ReadSni(DbAsyncResult asyncResult, TdsParserStateObject stateObj)
at System.Data.SqlClient.TdsParserStateObject.ReadNetworkPacket()
at System.Data.SqlClient.TdsParserStateObject.ReadBuffer()
at System.Data.SqlClient.TdsParserStateObject.ReadByte()
at System.Data.SqlClient.TdsParser.Run(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj)
at System.Data.SqlClient.SqlCommand.FinishExecuteReader(SqlDataReader ds, RunBehavior runBehavior, String resetOptionsString)
at System.Data.SqlClient.SqlCommand.RunExecuteReaderTds(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, Boolean async)
at System.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, String method, DbAsyncResult result)
at System.Data.SqlClient.SqlCommand.InternalExecuteNonQuery(DbAsyncResult result, String methodName, Boolean sendToPipe)
at System.Data.SqlClient.SqlCommand.ExecuteNonQuery()
at System.Data.Linq.SqlClient.SqlProvider.Execute(Expression query, QueryInfo queryInfo, IObjectReaderFactory factory, Object[] parentArgs, Object[] userArgs, ICompiledSubQuery[] subQueries, Object lastResult)
at System.Data.Linq.SqlClient.SqlProvider.ExecuteAll(Expression query, QueryInfo[] queryInfos, IObjectReaderFactory factory, Object[] userArguments, ICompiledSubQuery[] subQueries)
at System.Data.Linq.SqlClient.SqlProvider.System.Data.Linq.Provider.IProvider.Execute(Expression query)
at System.Data.Linq.ChangeDirector.StandardChangeDirector.DynamicDelete(TrackedObject item)
at System.Data.Linq.ChangeDirector.StandardChangeDirector.Delete(TrackedObject item)
at System.Data.Linq.ChangeProcessor.SubmitChanges(ConflictMode failureMode)
at System.Data.Linq.DataContext.SubmitChanges(ConflictMode failureMode)
at System.Data.Linq.DataContext.SubmitChanges()
at Unify.Data.LinqContextConversionBase`4.DeleteItems(HashSet`1 deletedItems, TContext sourceContext, SqlConnection connection)
at Unify.Data.LinqContextConversionBase`4.SubmitChanges()
at Unify.Framework.IdentityBrokerEngine.SetUpEntityRepositoryPartitions(IEnumerable`1 connectorConfigurationCollection, IEnumerable`1 adapters)
at Unify.Framework.IdentityBrokerEngine.Start()
at Unify.Framework.UnifyEngine.Start()


withAllMinusIncludesOnIndex.sqlplan
withAllSelect.sqlplan
withoutEntityValueIncludesIndex.sqlplan

Assigned to Tony.

See how this currently works. It may be worth just executing a parameterised sql command that deletes where partition = id

Could the index being maintained be taking up time? Look at dropping the clustered index before the query and recreate after the delete.

Alternatively, could the delete be using a an index when the index may not be the best idea? See http://msdn.microsoft.com/en-us/library/ms187373.aspx for instructions on setting INDEX(0?

To determine the cause of lengthy deletions, cascade deletes were turned on/off

{manually deleting from bottom-up}

and the clustered/non-clustered indexes were removed.

  REMOVE CLUSTERED REMOVE NON-CLUSTERED
using CASCADE 15minutes 3 minutes
using NO-CASCADE 16minutes 30 seconds 6 minutes

From these results CASCADE should be left in-tact, and the root cause of the problem appears to be one-to-many of the non-clustered indexes.

The following combinations of indices were tested to find the root cause of the problem:

The test involved determining the total execution time of removing a particular set of indices, and then deleting a particular partition from the IDB database (with 300,000 entities) and then re-adding the indices at the end of the query.
INDEXES REMOVED EXECUTION TIME
IX_Entity_ObjectClass_PartitionId
IX_Entity_PartitionId
IX_EntityValue_PartitionId
28minutes+
IX_Entity_ObjectClass_PartitionId
IX_Entity_PartitionId
IX_EntityValue_PartitionId
dta_index_EntityValue_7_181575685_K13_K12_K2_1_3_4_5_6_8_9_11
10minutes 34seconds
dta_index_EntityValue_7_181575685_K13_K12_K2_1_3_4_5_6_8_9_11 8minutes 39seconds
dta_index_EntityValue_7_181575685_K13_K12_K2_1_3_4_5_6_8_9_11
IX_EntityValue_IntValue
IX_EntityValue_StringValue
6minutes 24seconds

The offending indexes appear to be those on the EntityValue value type columns, and furthermore simply removing indexes en masse does not appear to improve performance, as can be seen by the second test.

This behaviour would appear to reflect that described here; specifically:

"Performance gains are achieved in select operations because the query optimizer can locate all the required column data within the index; the table or clustered index is not accessed. However, having too many included columns may increase the time that is required to perform insert, update, or delete operations to the underlying table or indexed view because of increased index maintenance."

If this is the case, then these indices may be dramatically affecting the performance of Insert, Update and Delete operation in IdentityBroker. I am currently testing the implications of removing/re-instantiating these indices in IDB-150.

Finally, the performance of the final test (~6.5 minutes) does not reflect the best times achieved in the tests mentioned previously -> There may still be a non-clustered index that is affecting performance.

The following is a subset of the overall tests to be performed in conjunction with IDB-150.

Test 0: (Base)

Description: 300,000 Deletes only, with no more Partitions remaining.

Index State Execution time
Indices maintained ~28 minutes 'Cancelled at this time to continue testing'
Indices removed (IX_EntityValue_IntValue, IX_EntityValue_StringValue, and _dta...) 6minutes 24 seconds
_dta... Index removed 9 minutes 17 seconds

Test 1:

Description: 300,000 Deletes, with 300,000 entities in existence against a separate partition.

Index State Execution time
Indices maintained 25 minutes 6 seconds
Indices removed (IX_EntityValue_IntValue, IX_EntityValue_StringValue, and _dta...) 11 minutes 25 seconds
_dta... Index removed 16 minutes 53 seconds

Test 2:

Description: 300,000 Deletes, with 600,000 entities in existence against a separate partition.

Index State Execution time
Indices maintained 31 minutes 51 seconds
Indices removed (IX_EntityValue_IntValue, IX_EntityValue_StringValue, and _dta...) 51 minutes 14 seconds 22minutes 47 seconds
_dta... Index removed 25 minutes 28 seconds

Note: The incorrect indices were removed in Test 2.

Expected outcome: Currently unknown
Preferable outcome: Linear decrease in performance, minor and predictable decline.
Potential edge-case requirements:

  • If an exponential decrease in performance is encountered, then the respective test will need to be re-run with the indices left in-tact to compare the rates of decline.

From the data that's been collected so far, we can make the following declarations:

  1. That cascading deletes are not to blame
  2. That some of the non-clustered indexes are, but not all.
  3. That the dta_index_EntityValue_7_181575685_K13_K12_K2_1_3_4_5_6_8_9_11, IX_EntityValue_IntValue and IX_EntityValue_StringValue are some of the non-clustered indexes that directly affect the performance of deletions (and in potentia Updates and Inserts).
  4. That from the execution times of a default IdentityBroker database and one with the previously mentioned offending indices removed, it would appear that the optimal configuration with regards to Deletions (and again in potentia Updates and Inserts) is the latter.

The following tests were run to determine the effects of removing the dta_index_EntityValue_7_181575685_K13_K12_K2_1_3_4_5_6_8_9_11 on the performance of SELECT statements. Early tests were run against simple SELECT statements, which showed that removing this index was actually beneficial to the overall performance of the SELECT statement, however these statements did not reflect those being executed by IdentityBroker and as such were not particular applicable.

A slightly updated copy of a query run by IdentityBroker - as seen below - exhibited the following times:

SELECT [t0].[EntityId], [t0].[PartitionId], [t0].[ObjectClassId], [t0].[DN], [t6].[EntityValueId], [t6].[EntityId] AS [EntityId2], [t6].[ValueType], [t6].[BigIntValue], [t6].[BitValue], [t6].[FloatValue], [t6].[ImageValue], [t6].[IntValue], [t6].[StringValue], [t6].[TextValue], [t6].[UniqueIdentifierValue], [t6].[PartitionId] AS [PartitionId2], [t6].[CollectionKeyId], [t6].[ValueOrder], [t6].[DecimalValue], [t7].[Caption], [t7].[CollectionKeyId] AS [CollectionKeyId2], (
    SELECT COUNT(*)
    FROM [dbo].[EntityValue] AS [t8]
    INNER JOIN [dbo].[CollectionKey] AS [t9] ON [t9].[CollectionKeyId] = [t8].[CollectionKeyId]
    WHERE ([t8].[EntityId] = [t0].[EntityId]) AND ([t8].[PartitionId] = [t0].[PartitionId])
    ) AS [value], [t2].[test], [t2].[ObjectClassId] AS [ObjectClassId2], [t2].[ClassName], [t2].[PartitionId] AS [PartitionId3]
FROM [dbo].[Entity] AS [t0]
LEFT OUTER JOIN (
    SELECT 1 AS [test], [t1].[ObjectClassId], [t1].[ClassName], [t1].[PartitionId]
    FROM [dbo].[ObjectClass] AS [t1]
    ) AS [t2] ON [t2].[ObjectClassId] = [t0].[ObjectClassId]
CROSS JOIN ([dbo].[EntityValue] AS [t3]
    INNER JOIN [dbo].[CollectionKey] AS [t4] ON [t4].[CollectionKeyId] = [t3].[CollectionKeyId])
LEFT OUTER JOIN ([dbo].[EntityValue] AS [t6]
    INNER JOIN [dbo].[CollectionKey] AS [t7] ON [t7].[CollectionKeyId] = [t6].[CollectionKeyId]) ON ([t6].[EntityId] = [t0].[EntityId]) AND ([t6].[PartitionId] = [t0].[PartitionId])
WHERE ([t4].[Caption] = 'GUID') AND ([t0].[PartitionId] = 'C99043B6-D673-4CCB-ADCB-267F7C5674EE') AND ([t3].[EntityId] = 
[t0].[EntityId]) AND ([t3].[PartitionId] = [t0].[PartitionId]) 
ORDER BY [t0].[EntityId], [t0].[PartitionId], [t2].[ObjectClassId], [t3].[EntityValueId], [t4].[CollectionKeyId], [t6].[EntityValueId], [t7].[CollectionKeyId]
Index State Execution Time
Indices maintained 5 minutes 29 seconds
Index removed (_dta...) 5 minutes 52 seconds
Indices maintained, with the _dta... INCLUDES portion removed 7 minutes 43 seconds

Attached sql execution plans of index select tests.

Removal of the listed indexes and addition of the indices introduced by the execution of the listed queries significantly improves the performance of both deletes and selects simultaneously:

Indices to keep:

  • Entity.IX_EntityKey
  • Entity.PK_Entity
  • EntityValue.IX_EntityValueKey
  • EntityValue.PK_EntityValue

To Add:

CREATE NONCLUSTERED INDEX [_dta_index_Entity_7_149575571__K3_K2_4_5] ON [dbo].[Entity] 
(
	[PartitionId] ASC,
	[EntityId] ASC
)
INCLUDE ( [ObjectClassId],
[DN]) WITH (SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF) ON [PRIMARY]

CREATE NONCLUSTERED INDEX [_dta_index_EntityValue_7_181575685__K13_K3_14] ON [dbo].[EntityValue] 
(
	[PartitionId] ASC,
	[EntityId] ASC
)
INCLUDE ( [CollectionKeyId]) WITH (SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF) ON [PRIMARY]

CREATE NONCLUSTERED INDEX [_dta_index_EntityValue_7_181575685__K14_K13_K3_K2_4_5_6_7_9_10_12_15_16] ON [dbo].[EntityValue] 
(
	[CollectionKeyId] ASC,
	[PartitionId] ASC,
	[EntityId] ASC,
	[EntityValueId] ASC
)
INCLUDE ( [ValueType],
[BigIntValue],
[BitValue],
[FloatValue],
[IntValue],
[StringValue],
[UniqueIdentifierValue],
[ValueOrder],
[DecimalValue]) WITH (SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF) ON [PRIMARY]

These changes resulted in the following:

Description: 300,000 Delete, with 300,000 entities in existence against a separate partition.

Index State Execution Time
Updated indexes 12minutes 46seconds
  • ~1 minute 30 seconds longer than removing the indices altogether
  • A gain of approximately 12minutes 15seconds - as opposed to the indices being maintained in the current state of the IdentityBroker database.

Description: Select statement executed during IdentityBroker full-import

Index State Execution Time
Updated indexes 3minutes 6seconds
  • As opposed to 5minutes 29seconds encountered with the previous index configuration - A gain of approximately 2minutes and 30seconds

If these benefits persist through further testing, it would appear that this is the ideal index configuration for the IdentityBroker database.

Reassigned for confirmation of completion