Platform healing is the set of processes your Platform runs to reconcile inconsistencies between Platform metadata and the physical artifacts on Platform servers. As a large distributed system, the Platform continually handles demands from multiple users, scales application workloads, dynamically routes requests, and manages shared resources during Platform workflows. As processes interact during these workflows, inconsistencies can arise that impede normal Platform behavior.
Platform healing, enabled by default at Platform installation, performs recurring sweeps of Platform metadata and servers to detect inconsistencies such as invalid workloads, outdated routing requests, or abandoned transactions. Each server performs its sweeps at the interval set by the Hosting.PlatformHealing.SweepTimeMinutes Platform Registry setting. When a sweep finds an inconsistency, the Platform removes or updates the incorrect metadata or application artifacts so that it can function correctly.
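The detection step of a sweep can be pictured as a comparison between what the metadata says should exist and what is actually present on a server. The following is a minimal sketch of that idea only, not the Platform's implementation: the `Inconsistency` type, the `detect_inconsistencies` function, and the two `kind` labels are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class Inconsistency:
    # Hypothetical labels: "missing_artifact" or "orphaned_artifact".
    kind: str
    workload: str


def detect_inconsistencies(expected: Set[str], actual: Set[str]) -> List[Inconsistency]:
    """Compare workloads recorded in metadata with workloads found on a server.

    `expected` stands in for Platform metadata and `actual` for the physical
    artifacts on one server. Both are assumptions made for this sketch.
    """
    # Recorded in metadata but not present on the server.
    missing = [Inconsistency("missing_artifact", w) for w in sorted(expected - actual)]
    # Present on the server but unknown to metadata.
    orphaned = [Inconsistency("orphaned_artifact", w) for w in sorted(actual - expected)]
    return missing + orphaned
```

For example, calling `detect_inconsistencies({"app-a", "app-b"}, {"app-b", "app-c"})` reports `app-a` as missing and `app-c` as orphaned. A healing step would then remove or update whichever side is incorrect.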
A Platform healing sweep checks all application components, except application databases, for inconsistencies.
The Platform carries out monitoring and healing activities on its own, and generates logs so that Platform Operators can track the healing actions it takes and spot potential problems. You may need to adjust the Global Log Level or create Log Overrides in order to see the logs generated by Platform healing.
When an inconsistency is found, the Platform generates a Warning-level log describing the problem so that Platform Operators know about the issue. If the Platform successfully heals the inconsistency, it creates logs at the Info level. If the Platform has a problem healing an inconsistency, it creates a log at the Warning level, and further action from Platform Operators may be required.
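The relationship between these log levels and the configured log level can be sketched as a simple lookup. This is an illustration under stated assumptions: the `HealingEvent` names and the `visible_events` helper are hypothetical, and Python's standard `logging` levels stand in for the Platform's own levels, which are assumed here to order Info below Warning.

```python
import logging
from enum import Enum
from typing import List


class HealingEvent(Enum):
    # Hypothetical event names for the three cases described above.
    INCONSISTENCY_FOUND = "inconsistency found"
    HEALED = "healed"
    HEALING_FAILED = "healing failed"


# Levels as described in the text: found -> Warning, healed -> Info,
# failed -> Warning.
LEVEL_FOR_EVENT = {
    HealingEvent.INCONSISTENCY_FOUND: logging.WARNING,
    HealingEvent.HEALED: logging.INFO,
    HealingEvent.HEALING_FAILED: logging.WARNING,
}


def visible_events(configured_level: int) -> List[HealingEvent]:
    """Return the healing events whose logs pass the configured level."""
    return [e for e in HealingEvent if LEVEL_FOR_EVENT[e] >= configured_level]
```

Under these assumptions, a log level of Warning shows detected inconsistencies and failed healing attempts but hides successful heals, which only appear once the level is lowered to Info.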
In addition to the logs noted above, the Platform generates a log at the Fatal level if a potentially abandoned transaction occurs. When the Platform manages larger transactions, such as application promotions or demotions, some processes of a transaction may be left unfinished or may not be marked as finished. Any transaction that has taken longer than the time set in the Hosting.PlatformHealing.LongestExpectedTransaction Platform Registry setting is logged as a potentially abandoned transaction. Platform Operators may need to take further action to resolve the issue or complete the operation on the Platform.
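The abandoned-transaction rule amounts to comparing a transaction's elapsed time against the configured limit. The sketch below illustrates that comparison only. The function name and parameters are hypothetical, and because the unit of Hosting.PlatformHealing.LongestExpectedTransaction is not stated here, the limit is passed as a `timedelta` rather than a raw number.

```python
from datetime import datetime, timedelta


def is_potentially_abandoned(
    started_at: datetime,
    now: datetime,
    longest_expected: timedelta,
) -> bool:
    """Return True when a transaction has run longer than the expected limit.

    `longest_expected` stands in for the value of the
    LongestExpectedTransaction setting (unit assumed by the caller).
    """
    return (now - started_at) > longest_expected
```

A transaction flagged this way is only potentially abandoned: it may still be running, so the Fatal-level log is a prompt for an operator to investigate rather than proof of failure.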
As of Platform version 6.5.3, Platform Operators can set timeout limits for healing sweeps and for the corrective actions the Platform takes to resolve inconsistencies. The Hosting.PlatformHealing.DetectionTimeoutMinutes Platform Registry setting defines how long a healing sweep will attempt to detect missing or corrupted workloads before timing out. It is set to 5 minutes at installation.
Similarly, the Hosting.PlatformHealing.HealingTimeoutMinutes Platform Registry setting determines how long the process that heals inconsistencies will run before timing out. This is set to 5 minutes at installation.
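One way to picture a minute-based timeout such as these is a deadline that is checked between units of work. The sketch below is an assumption about the general technique, not a description of how the Platform enforces its timeouts. The `run_with_deadline` function and its parameters are hypothetical, and the check is cooperative: a step that has already started is allowed to finish.

```python
import time
from typing import Any, Callable, Iterable, List, Tuple


def run_with_deadline(
    steps: Iterable[Callable[[], Any]],
    timeout_minutes: float,
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[List[Any], bool]:
    """Run steps in order until they finish or the deadline passes.

    Returns (results of completed steps, whether the run timed out).
    `timeout_minutes` plays the role of a setting such as
    DetectionTimeoutMinutes or HealingTimeoutMinutes.
    """
    deadline = clock() + timeout_minutes * 60
    completed: List[Any] = []
    for step in steps:
        # Check the deadline before starting each step.
        if clock() >= deadline:
            return completed, True
        completed.append(step())
    return completed, False
```

With the installation default of 5 minutes, `run_with_deadline(steps, 5)` would stop starting new steps once 300 seconds have elapsed and report that it timed out, leaving any remaining work for a later run.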
As needed, Platform Operators may disable Platform healing by setting the value of the Hosting.PlatformHealing.Enabled registry setting to “False”.
You can disable Platform healing on a per-server basis for Linux servers. If Platform healing is enabled on your Platform but you need it disabled on select Linux servers, contact your support representative for help configuring this capability.