Archive

The archive feature bundles all collected data into a single compressed archive file. This is useful for creating portable snapshots of your infrastructure data.

When enabled, all step outputs are written to the archive, and the archive is then written to the configured sink. This means you can combine archiving with any sink (filesystem, S3, etc.).

ArchiveSpec configures bundling output into an archive.

  • format - Required (string)

    Format is the archive format. Currently only "tar" is supported.

    Must be one of: tar.
  • compression - Optional (string)

    Compression is the compression algorithm. Defaults to gzip.

    Must be one of: gzip, zstd, none.
  • name - Optional Template (string)

    Name is the archive base name; it defaults to the job name. The appropriate file extension (e.g., ".tar.gz") is automatically appended.

| Compression | Extension | Description |
| --- | --- | --- |
| gzip | .tar.gz | Good compression ratio, widely supported (default) |
| zstd | .tar.zst | Better compression ratio and speed |
| none | .tar | No compression, fastest |
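The rows above correspond to standard tar streams. As a minimal illustration (not part of the tool itself, and using hypothetical step-output file names), this Python sketch builds the same kind of `.tar.gz` bundle that "format: tar" with "compression: gzip" produces:

```python
import tarfile
import tempfile
from pathlib import Path

# Hypothetical step outputs; the real tool writes each step's output
# into the archive before handing it to the configured sink.
outdir = Path(tempfile.mkdtemp())
(outdir / "vpcs.json").write_text('{"ids": []}')
(outdir / "instances.json").write_text('{"ids": []}')

# "format: tar" + "compression: gzip" corresponds to a .tar.gz stream.
archive = outdir / "my-job.tar.gz"
with tarfile.open(archive, "w:gz") as tar:
    for name in ("vpcs.json", "instances.json"):
        tar.add(outdir / name, arcname=name)

# Reading the archive back lists the bundled step outputs.
with tarfile.open(archive, "r:gz") as tar:
    print(sorted(tar.getnames()))  # ['instances.json', 'vpcs.json']
```

Swapping `"w:gz"` for `"w"` gives the uncompressed `.tar` variant from the table; zstd is not in the Python standard library's `tarfile` modes on older versions, so it is omitted here.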

The name field supports these variables:

| Variable | Description | Example |
| --- | --- | --- |
| $JOB_NAME | The job's metadata.name | my-job |
| $JOB_DATE_ISO8601 | UTC time in ISO8601 basic format | 20240115T143052Z |
| $JOB_DATE_RFC3339 | UTC time in RFC3339 format | 2024-01-15T14:30:52Z |
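The substitution behind the table can be sketched as plain string replacement. This Python sketch is illustrative only (the tool's own templating may differ); the `expand_name` helper is hypothetical:

```python
from datetime import datetime, timezone

def expand_name(template: str, job_name: str, now: datetime) -> str:
    """Illustrative expansion of the name template variables."""
    values = {
        "$JOB_NAME": job_name,
        "$JOB_DATE_ISO8601": now.strftime("%Y%m%dT%H%M%SZ"),
        "$JOB_DATE_RFC3339": now.strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
    # Replace longer names first so a variable that is a prefix of
    # another cannot clobber it.
    for var in sorted(values, key=len, reverse=True):
        template = template.replace(var, values[var])
    return template

ts = datetime(2024, 1, 15, 14, 30, 52, tzinfo=timezone.utc)
print(expand_name("$JOB_NAME-$JOB_DATE_ISO8601", "my-job", ts))
# my-job-20240115T143052Z
```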

The appropriate file extension is automatically appended based on the format and compression.

```yaml
output:
  archive:
    format: tar
    compression: gzip
  sink:
    filesystem:
      path: ./output
```

Creates ./output/my-job.tar.gz containing all step outputs.

```yaml
output:
  archive:
    format: tar
    compression: zstd
    name: $JOB_NAME-$JOB_DATE_ISO8601
  sink:
    filesystem:
      path: ./backups
```

Creates ./backups/my-job-20240115T143052Z.tar.zst.

```yaml
output:
  archive:
    format: tar
    compression: gzip
    name: $JOB_NAME-$JOB_DATE_ISO8601
  sink:
    s3:
      bucket: my-backups
      region: us-east-1
      prefix: infracollect/
```

Uploads infracollect/my-job-20240115T143052Z.tar.gz to S3.

```yaml
kind: CollectJob
apiVersion: v1
metadata:
  name: infrastructure-snapshot
collectors:
  - id: aws
    terraform:
      provider: hashicorp/aws
      version: "5.0.0"
      args:
        region: us-east-1
steps:
  - id: vpcs
    collector: aws
    terraform:
      data_source: aws_vpcs
  - id: instances
    collector: aws
    terraform:
      data_source: aws_instances
output:
  encoding:
    json:
      indent: " "
  archive:
    format: tar
    compression: gzip
    name: $JOB_NAME-$JOB_DATE_ISO8601
  sink:
    s3:
      bucket: infrastructure-snapshots
      region: us-east-1
      prefix: daily/
```