@markap14 markap14 commented Feb 9, 2026

…source Claim can be truncated if it is large. Whenever the FlowFile Repository is checkpointed, truncate any large Resource Claims when possible and necessary, so that a small FlowFile in a given Resource Claim does not prevent a large Content Claim from being cleaned up.

Summary

NIFI-00000

Tracking

Please complete the following tracking steps prior to pull request creation.

Issue Tracking

Pull Request Tracking

  • Pull Request title starts with Apache NiFi Jira issue number, such as NIFI-00000
  • Pull Request commit message starts with Apache NiFi Jira issue number, such as NIFI-00000
  • Pull request contains commits signed with a registered key indicating Verified status

Pull Request Formatting

  • Pull Request based on current revision of the main branch
  • Pull Request refers to a feature branch with one commit containing changes

Verification

Please indicate the verification steps performed prior to pull request creation.

Build

  • Build completed using ./mvnw clean install -P contrib-check
    • JDK 21
    • JDK 25

Licensing

  • New dependencies are compatible with the Apache License 2.0 according to the License Policy
  • New dependencies are documented in applicable LICENSE and NOTICE files

Documentation

  • Documentation formatting appears as expected in rendered files

        return false;
    }

    private void truncate(final ContentClaim claim) {

The truncate method doesn't verify that the claimant count is still 0 before truncating. If a clone operation increments the claimant count while the truncation task is in flight, we could truncate content that is still referenced. Is that a concern?

I'm wondering whether the following race condition is possible:

  1. TruncateClaims.truncateClaims() checks claim.isTruncationCandidate() and sees true
  2. A clone operation calls incrementClaimantCount(), which sets truncationCandidate = false and increments the claimant count
  3. TruncateClaims.truncate() proceeds to truncate the file anyway, corrupting the data for the newly cloned FlowFile

Or is this scenario ruled out for a reason that I missed?
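
To make the three steps above concrete, here is a minimal sketch of the check-then-act pattern in question and of a truncate that re-checks its precondition. It is not the NiFi implementation: SketchClaim, truncateIfStillUnreferenced and the length field are hypothetical names invented for illustration, and the interleaving is replayed on a single thread so the outcome is deterministic.

```java
// Hypothetical stand-in for a claim; this is NOT the NiFi ContentClaim API.
final class SketchClaim {
    private int claimantCount = 0;
    private boolean truncationCandidate = true;
    private long length;

    SketchClaim(final long length) {
        this.length = length;
    }

    synchronized boolean isTruncationCandidate() {
        return truncationCandidate;
    }

    // Mirrors step 2: a clone clears the candidate flag and adds a claimant.
    synchronized void incrementClaimantCount() {
        truncationCandidate = false;
        claimantCount++;
    }

    // Re-checks the precondition in the same critical section as the truncation,
    // so a claimant added after an earlier check cannot be missed.
    synchronized boolean truncateIfStillUnreferenced(final long newLength) {
        if (!truncationCandidate || claimantCount > 0) {
            return false;
        }
        length = Math.min(length, newLength);
        return true;
    }

    synchronized long getLength() {
        return length;
    }
}

final class TruncateRaceSketch {
    public static void main(final String[] args) {
        final SketchClaim claim = new SketchClaim(1_000_000L);

        final boolean sawCandidate = claim.isTruncationCandidate(); // step 1: check
        claim.incrementClaimantCount();                             // step 2: clone sneaks in
        final boolean truncated = sawCandidate
                && claim.truncateIfStillUnreferenced(0L);           // step 3: guarded act

        System.out.println("truncated=" + truncated + ", length=" + claim.getLength());
    }
}
```

With the guard in place, the stale result from step 1 is harmless because step 3 refuses to truncate. Whether such a guard is needed at all depends on whether step 2 can actually happen at that point.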

Contributor Author

Thanks for reviewing @pvillard31!
In short, no, that should not be possible. We only ever queue a ContentClaim for truncation when the FlowFile Repository is synced to disk (typically on checkpoint, but also on every commit if the fsync property in nifi.properties is set to true) and the Content Claim has truncationCandidate = true. At that point the FlowFile Repository owns the Content Claim, no Processor has access to it, and the Repository has determined that there are no longer any references to it. In other words, we only queue the Content Claim for truncation if there is exactly one referencing FlowFile and that FlowFile is now being removed. So there is no concern about the claimant count going back up.
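
The queuing rule described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: CheckpointSketch, ReleasedClaim, remainingReferences and onRepositorySynced are hypothetical names, and the sketch assumes the repository already knows how many references remain for each claim released by a removed FlowFile.

```java
import java.util.ArrayDeque;
import java.util.List;
import java.util.Queue;

// Hypothetical sketch of the rule; names do not correspond to NiFi classes.
final class CheckpointSketch {

    // A claim released by a FlowFile that is being removed, as seen by the repository.
    record ReleasedClaim(String id, boolean truncationCandidate, int remainingReferences) { }

    private final Queue<ReleasedClaim> truncationQueue = new ArrayDeque<>();

    // Assumed to be invoked only after the repository has been synced to disk,
    // when no Processor holds the released claims any longer.
    void onRepositorySynced(final List<ReleasedClaim> released) {
        for (final ReleasedClaim claim : released) {
            // Queue only if the claim is still flagged as a candidate and the
            // FlowFile being removed was its last reference.
            if (claim.truncationCandidate() && claim.remainingReferences() == 0) {
                truncationQueue.add(claim);
            }
        }
    }

    int queuedCount() {
        return truncationQueue.size();
    }
}
```

In this sketch a claim that was cloned earlier (flag cleared) or that still has another reference is never queued, which is the property the explanation relies on.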

@pvillard31
Contributor

As a side note, the integration test failure was caused by another commit; it has since been fixed, so rebasing on main should resolve it.
