Soft Delete Cleanup Job

Why is there a concept of soft delete and a cleanup job that triggers the logging in the DB?

The soft delete mechanism allows you to share information about deleted records in the DB with external systems, e.g. via the deleted data providers in the Export. However, keeping all soft deleted objects in the DB forever would lead to a DB space explosion and severe performance degradation. Therefore, a job can be configured to clean up soft deleted records from the DB after X days. This is a good thing to have for an enterprise application if you have a use case for it. If not, you may also want to switch soft delete off completely and save some space in your DB.

Change the configuration in “plugin_customization.ini” and restart the service:

# Defines the default deletion mode in case entities support both, soft and hard delete
# Supported values are SOFT and HARD
com.heiler.ppm.std.server/default.deletion-mode = HARD
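
Conversely, if you want to keep soft delete as the default behavior, the counterpart setting (using the other supported value from the comment above) would be:

com.heiler.ppm.std.server/default.deletion-mode = SOFT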

Note: If the recovery logging level of your DB is set to FULL, every change is stored in the transaction log and can be rolled back in case of errors. Otherwise, you can only roll back using DB backups.

General information

The soft delete cleanup job is a maintenance job which performs a hard delete (physical delete) of entities which have already been soft deleted. The hard delete cleanup is only triggered for expired entities and only if the life time is not unlimited. With versions higher than 8.0.01, the life time is set to one year by default for all entities. The expiration date is calculated using the current date as the basis. So if today is February 5th and you have a life time of 1 month, the calculated expiration date will be January 5th, which means that all entities deleted prior to this date will be removed permanently. The default life time property defines the life time for all entity types. It can be overwritten per root and sub entity type.

Configuration

You can define the cleanup life time in the plugin_customization.ini. Search for cleanup.lifeTime.default.

Example default life times for all entities:

cleanup.lifeTime.default=1d

cleanup.lifeTime.default=1y
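
Based on the examples used throughout this document, the value appears to be a number followed by a unit suffix (d for days, m for months, y for years) or the keyword unlimited. A default life time of six months, for instance, would presumably look like this:

cleanup.lifeTime.default=6m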

If you want to define different life times for individual root and sub entities, you need to uncomment the property cleanup.lifeTime.custom.file and refer to a custom file defining the soft delete properties. The custom file has to be located in the folder ~/server/configuration/HPM. The cleanup.lifeTime.custom.file property referring to the custom file could look like this, for example:

cleanup.lifeTime.custom.file = softDeleteLifeTime.properties

In your customized soft delete life time properties file you can define different life times for the entities. The most specific definition is used: if you define a life time for a root entity, you can overwrite it for a sub entity. However, this is only valid if the life time of the sub entity is shorter than that of its parent (see the contrasting example after the valid configuration below).

A valid configuration:

cleanup.lifeTime.default = unlimited

ArticleType.lifeTime = 1y

ArticleLangType.lifeTime = 1m

StructureGroupLangType.lifeTime = 6d
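
For contrast, a hypothetical configuration like the following would not be a valid override for the sub entity, because its life time (1y) is longer than that of its parent root entity (1m):

ArticleType.lifeTime = 1m

ArticleLangType.lifeTime = 1y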

Schedule

To schedule the job repetitively, uncomment the property cleanup.job.repeatPattern and define the pattern. As of version 8.0.01.00, this job is scheduled to run every day at 24:00. To disable the scheduling, uncomment the property and leave it empty; the job will then not be executed automatically but can still be started manually in the process overview (select "System processes > Remove deleted objects", select the process in the "Current and past processes" view and call "Execute immediately" from the context menu).

Example for running the job daily at 24:00:

cleanup.job.repeatPattern = 0 00 00 * * ? *

To disable scheduling:

cleanup.job.repeatPattern =

For more information, refer to the Quartz Enterprise Job Scheduler documentation on Cron Expressions.
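
As a further illustration of the Quartz cron syntax (this particular pattern is not taken from the product documentation; it simply follows the Quartz field order of seconds, minutes, hours, day of month, month, day of week and optional year), a weekly run every Sunday at 02:00 could look like this:

cleanup.job.repeatPattern = 0 0 2 ? * SUN *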

Be cautious with mass deletes in such a setup

With recovery level FULL, the transaction log keeps growing if your DBA does not back it up regularly, potentially without limit, or rather until the disk runs out of space. A one-time mass deletion of records in Product 360 under recovery level FULL may cause exactly this problem once the cleanup job kicks in.

Recovery level SIMPLE, in contrast, truncates the inactive part of the log automatically (at checkpoints), freeing up log space again.

How do other customers handle such a situation?

So far, we have not seen many customers deleting that many records in such a short period of time. However, we have seen four variations of best practices in this scenario:

  1. They give the DB as much space as it needs, even if only temporarily…

  2. They have the cleanup job executed frequently in general (typically deleting everything older than 4 weeks) to keep the overall number of soft deleted records as low as possible over time

  3. In case of a one-time mass deletion, customers create a DB backup and then set the logging level to SIMPLE for the time needed by the transaction

  4. They back up the transaction log prior to this action to start with an empty log and keep disk space consumption as limited as possible

These are scenarios where it has already been identified that the mass deletion could otherwise create a problem. But what happens if you did not know about this and are now stuck with a transaction log that runs full every time the cleanup job gets executed?

Possible Workarounds

Set the number of days a record should be kept in the DB relatively high so that the number of records to be cleaned up per job execution stays as small as possible. This needs some further monitoring over time, but it is the safest option to reduce the number of soft deleted records in a controlled and granular way.
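
As a sketch of such a conservative setup (the two-year value is only an assumption for illustration; choose whatever fits your data volume and disk space):

cleanup.lifeTime.default = 2y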

The key question is whether you have a lot of soft deleted records in the queue (e.g. because a cleanup job has only recently been configured for the first time in your setup) or whether the space problem is caused by deleted records in the queue that share the same day in their deletion time stamp. If the latter is the case, you can still try to work with the bulk size property, which defines the number of records processed by the job per transaction:

Add this line to the plugin_customization.ini of the server to limit the number of items per transaction to 1,000 (or whatever value suits your needs; the default is 100,000). If you have multiple servers, it is better to use the control center to adjust the file and deploy it to all of them.

com.heiler.ppm.std.server/entity-manager.delete.bulk-size = 1000
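
In general, a smaller bulk size keeps each individual transaction, and with it the active portion of the transaction log, smaller, at the cost of more commits and a somewhat longer overall run time of the cleanup job; a larger bulk size does the opposite.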