fix: Make reproducible zips in exports #3491
Merged
…es we write to be of the record itself.
Contributor
Just to confirm - does this mean that the modified date on the OSV final record is the last-modified of the Datastore entry (when we update it) or does it mean the modified date of the original record (when the data source updates it), as with this issue: #3451?
Contributor
Author
Modified is the last-modified time of the Datastore entry.
hogo6002 approved these changes on May 26, 2025
4 tasks
SanskaarUndale21 added a commit to SanskaarUndale21/osv.dev that referenced this pull request on May 24, 2026
Before this change, the exporter uploaded every output file on every run regardless of whether the content had changed. Since all.zip and other outputs are now reproducible (google#3491), unchanged files would accumulate redundant object generations in the bucket, making it harder for downstream consumers to detect real updates.

The writer now calls ReadObjectAttrs before each GCS write and computes the CRC32C of the outgoing data using the Castagnoli polynomial (the same algorithm GCS uses for its stored checksums). If the checksums match, the upload is skipped and an info log is emitted. New objects (ErrNotFound) and any transient attr-read errors fall through to the normal upload path so the exporter remains correct under all conditions.

Tests verify the three cases: same content is skipped, changed content is uploaded, and brand-new objects are always created.

Fixes google#3513
Set the modification date of the OSV record files written to disk to the modification time of the record itself.
Partially resolves: #3365