File listeners in file ingestion and replication tasks
You can use a file listener as a source in file ingestion and replication tasks and to schedule file monitors.
In file ingestion and replication tasks with the following source types, you can schedule the task to run when it receives notifications from a file listener:
•Local folder
•Advanced FTP V2
•Advanced SFTP V2
•Advanced FTPS V2
•Amazon S3 V2
•Microsoft Azure Data Lake Store Gen2
•Microsoft Azure Data Lake Store V3
•Microsoft Azure Blob Storage V3
•Microsoft Fabric OneLake
•Google Cloud Storage V2
•Hadoop Distributed File System (HDFS) V2
Note: For more information on configuring a file ingestion task, see the Data Ingestion and Replication help.
File event reliability
When you use a file listener as a source in a file ingestion and replication task, the file listener creates a file event, based on the file listener configuration, when new files arrive, when existing files are updated, or when files are deleted. The file events are passed to the file ingestion and replication task. This section explains how file events are handled reliably between a file listener and a file ingestion and replication task.
The file listener handles the events based on the following conditions:
•If the Secure Agent isn't running or a temporary network disruption prevents file events from reaching the file ingestion and replication task, the file listener queues the events for each file and includes them in the notification of the next file ingestion and replication job. A file ingestion and replication task thus receives a notification about each file at least once, which ensures at-least-once delivery between the file listener and the file ingestion and replication task.
Note: File events that aren't processed remain in the queue for seven days.
•If multiple events occur, the file listener notifies the file ingestion and replication task with only the last event for each file.
•File events that are in the file listener queue reach the file ingestion and replication task by one of the following methods:
- When a file ingestion and replication job completes, the Data Ingestion and Replication service makes a pull request to the file listener to check for any queued events. If it finds any events, the service triggers a new ingestion job to process them. The pull request doesn't trigger the processing of files that are already assigned to another concurrent job run by the same ingestion task, so only one ingestion job processes a file at any time.
- If any events aren't picked up by the proactive pull request, for example, if the Secure Agent isn't running when the Data Ingestion and Replication service makes the request, the file listener queues the last event for each file and includes it in the notification of the next file ingestion and replication job.
- You can also run the file ingestion and replication task manually to pull the failed events.
•When file event processing fails, the file ingestion and replication task retries the failed events. Failed events are retried once automatically and again during subsequent file listener notifications.
•The file ingestion and replication task doesn't automatically reprocess file events that are in success or duplicate status.
You need to manually identify files that aren't successfully transferred to the target because of an error, for example, by using the transfer logs. To resolve the problem, either move the files or modify them manually so that the file listener picks them up again. For example, if the last modified time of a file changes, the file listener identifies the file as updated even if the contents haven't changed.
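The queuing behavior described above, at-least-once delivery with only the last event kept per file, can be sketched as follows. This is purely an illustration: the class and method names are hypothetical and don't correspond to any product API.

```python
from dataclasses import dataclass, field

@dataclass
class FileEvent:
    path: str
    event_type: str  # "arrive", "update", or "delete"

@dataclass
class ListenerQueue:
    """Illustrative sketch: keeps only the last event per file until it
    is acknowledged by a successful ingestion job (at-least-once)."""
    pending: dict = field(default_factory=dict)  # path -> last FileEvent

    def record(self, event: FileEvent) -> None:
        # A newer event for the same file replaces the older one, so the
        # task is notified with only the last event for each file.
        self.pending[event.path] = event

    def notify(self) -> list:
        # Events stay queued until acknowledged, so a notification that
        # doesn't reach the task is simply re-sent with the next job.
        return list(self.pending.values())

    def acknowledge(self, paths) -> None:
        # Only successfully processed files leave the queue.
        for path in paths:
            self.pending.pop(path, None)

queue = ListenerQueue()
queue.record(FileEvent("a.csv", "arrive"))
queue.record(FileEvent("a.csv", "update"))  # replaces the arrival event
queue.record(FileEvent("b.csv", "arrive"))
events = queue.notify()       # one event per file, the last one recorded
queue.acknowledge(["a.csv"])  # b.csv stays queued for the next job
```

Because acknowledgment happens only after successful processing, a file can be notified more than once, which is why the delivery guarantee is at-least-once rather than exactly-once.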
Example
A file listener is a source in a file ingestion and replication task with 15 file events to transfer to a target. The batch size is five. When the file ingestion and replication task is triggered and completes, the file events are in the following status:
•Five events in the first batch (file 1 to 5): success
•Five events in the second batch (file 6 to 10): failed
•Five events in the third batch (file 11 to 15): unprocessed
The file ingestion and replication task automatically retries the five failed events and processes the five unprocessed events once. When the file ingestion and replication task is complete, the file events are in the following status:
•Five events in the first batch (file 6 to 10): success
•Five events in the second batch (file 11 to 15): failed
The file ingestion and replication task automatically retries the five failed events once. When the file ingestion and replication task is complete, the five events in the second batch (file 11 to 15) fail.
You can manually run the file ingestion and replication task to pull the five pending events. If you don't run the file ingestion and replication task manually, the file listener includes the failed events in the notification of the next file ingestion and replication job.
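The retry flow in this example can be simulated with a short sketch. This is only an illustration: the `run_task` function, its `batch_succeeds` predicate, and the assumption that batches after a failed batch stay unprocessed are hypothetical, chosen to reproduce the statuses in the example above.

```python
def run_task(events, batch_size, batch_succeeds):
    """Hypothetical sketch of one task run that processes events in
    batches. `batch_succeeds` is a stand-in predicate deciding whether
    a batch transfers; it is not a real API. Assumption: batches after
    a failed batch are left unprocessed."""
    succeeded, failed, unprocessed = [], [], []
    stopped = False
    for i in range(0, len(events), batch_size):
        batch = events[i:i + batch_size]
        if stopped:
            unprocessed.extend(batch)
        elif batch_succeeds(batch):
            succeeded.extend(batch)
        else:
            failed.extend(batch)
            stopped = True
    return succeeded, failed, unprocessed

files = list(range(1, 16))  # 15 file events, batch size 5

# First run: batch one succeeds, batch two fails, batch three is unprocessed.
ok1, fail1, un1 = run_task(files, 5, lambda b: b[0] == 1)

# Automatic retry: the failed and unprocessed events are processed once.
ok2, fail2, un2 = run_task(fail1 + un1, 5, lambda b: b[0] == 6)

# Second automatic retry: the remaining five events fail again.
ok3, fail3, un3 = run_task(fail2, 5, lambda b: False)
```

After the three runs, files 11 to 15 remain failed; as described above, the file listener would include them in the notification of the next job, or you could run the task manually.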
File listener job resiliency
When you run a file listener, a corresponding job is created.
The file listener job handles the Secure Agent availability based on the following conditions:
•If a new version of Data Ingestion and Replication is released, the file listener stops and restarts on the new version.
•When a Secure Agent is unavailable and is back up within 20 minutes, the file listener resumes running on the same Secure Agent after it restarts.
•If a Secure Agent is unavailable for 20 minutes, the file listener restarts on another running Secure Agent available in the Secure Agent group. This behavior applies if the file listener hasn't reached its end time.
•When a Secure Agent is unavailable for more than 20 minutes and no other Secure Agent in the Secure Agent group is available, the file listener remains in an unresponsive state. After 200 minutes, its status changes to Stopped.
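The timing rules above can be summarized as a small decision sketch. The function name, arguments, and return strings are hypothetical; only the 20-minute and 200-minute thresholds come from the behavior described here.

```python
def listener_state(downtime_minutes, other_agent_available):
    # Under 20 minutes of downtime, the listener resumes on the same
    # Secure Agent once that agent restarts.
    if downtime_minutes < 20:
        return "resumes on same agent"
    # At 20 minutes, the listener fails over to another running Secure
    # Agent in the group, if one is available.
    if other_agent_available:
        return "restarts on another agent"
    # With no other agent available, the listener stays unresponsive
    # until the 200-minute mark, then its status changes to Stopped.
    if downtime_minutes <= 200:
        return "unresponsive"
    return "stopped"
```

For example, a 30-minute outage in a single-agent group leaves the listener unresponsive, while the same outage in a multi-agent group triggers a failover.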