Fixed

Memory leak during connector operation

Matthew Clark 7 years ago • updated by anonymous 3 years ago

The Identity Broker service requires profiling. After leaving three connectors running (one of which produces 3000 schema validation errors on each run, with a run every 30 seconds), the console is up to 635MB within minutes, and still climbing.

No adapters are running. The configuration is connecting to a Chris21 DET form (3000 entities, all failing validation), a Chris21 DDS form (588 entities, all succeeding), and a CSV connector (failing due to a missing file).

Affected Versions:
Fixed by Version:

This is also happening on Test 1, with two connector-level errors occurring (rather than entity-level), meaning significantly fewer errors are occurring.

Initial observations show Gen 2 is not too large, but a lot of unused space is allocated to the process - we may be seeing fragmentation issues due to larger binary objects being stored in memory. Will leave it running a bit longer on Test 1 and see what other information can be uncovered.

Raising priority to blocker, as the memory increase is too rapid: three running connectors are sufficient to push memory use to 1GB in a matter of minutes.

Looking at this with Tony, we've found and fixed several issues around the logging, including:

  • Large object heap fragmentation caused by excessive numbers of streams and readers/writers
  • The log holder being recreated each logging call
  • Some streams not being disposed of correctly
  • Asynchronous log writers not writing at all, causing a large number of log entries to sit on a queue with no further use, and hence be promoted to Gen 2

Early testing shows large log files still require a significant amount of memory to open, but the amount is somewhat proportional to the log file size and does not continue to increase. Tony is adding a job to populate the log holder on the first request, and to clean it up after a certain period of time (~5 minutes). This will ensure subsequent log operations over a period of time will still be quite fast, and that memory is not being unnecessarily held by logs if they are not being looked at.
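The log holder job described above, populate on first request and release after a period of inactivity, could look something like this (a minimal sketch with hypothetical names; the cleanup here is checked lazily on access rather than by a background job, for brevity):

```python
import time

class LogHolder:
    """Caches a loaded log on first request and evicts it after a
    time-to-live, so repeated reads over a short period stay fast
    while idle logs do not hold memory unnecessarily."""

    def __init__(self, loader, ttl_seconds=300):  # ~5 minutes, as above
        self._loader = loader        # callable that (re)loads the log
        self._ttl = ttl_seconds
        self._value = None
        self._loaded_at = None

    def get(self):
        now = time.monotonic()
        if self._value is None or now - self._loaded_at > self._ttl:
            self._value = self._loader()  # populate on first request
            self._loaded_at = now
        return self._value
```

Memory cost is then proportional to the logs actually being viewed, and drops back once a log goes unread for longer than the TTL.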

Assigning to Tony for the addition of the log holder job. Confirming current work locally and on Test 1.

I have checked in some unit tests.

Can you please just monitor this to ensure that it hasn't regressed?

Thanks.

Left 3 instances open overnight. Memory usage stable at 60MB, 100MB, and 170MB (busier environment). Issue closed.

Reopening to attempt to fix JIRA bug