Part 7/11:
Dealing with Small Files and Version Management
To avoid the storage bloat caused by many tiny files, Redbus runs a compaction process that merges small files into larger ones, which makes retrieval more efficient. Schema versions are tracked and managed across buckets to prevent mismatches between data written under different schemas.
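The grouping step of such a compaction pass can be sketched as follows. This is a minimal illustration, not Redbus's actual implementation: the function name plan_compaction, the small_threshold and target_size parameters, and the simple first-fit batching are all assumptions made for the example.

```python
from typing import Dict, List


def plan_compaction(
    sizes: Dict[str, int], small_threshold: int, target_size: int
) -> List[List[str]]:
    """Group small files into batches to be merged into one larger file each.

    sizes maps a file name to its size in bytes (hypothetical input; in
    practice this could come from an object-store listing).
    """
    # Only files below the threshold are candidates; sort for a stable plan.
    small = sorted(name for name, size in sizes.items() if size < small_threshold)

    batches: List[List[str]] = []
    current: List[str] = []
    current_bytes = 0
    for name in small:
        size = sizes[name]
        # Close the current batch when adding this file would exceed the target.
        if current and current_bytes + size > target_size:
            batches.append(current)
            current, current_bytes = [], 0
        current.append(name)
        current_bytes += size
    if current:
        batches.append(current)

    # A batch with a single file gains nothing from being rewritten.
    return [batch for batch in batches if len(batch) > 1]
```

Each returned batch lists the files that would be read, concatenated into one output file, and then deleted. Files at or above small_threshold are left untouched.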
Data Accessibility: Efficient Extraction and Visualization
The company built a serverless, Lenses-powered explorer for querying the data:
Users specify their query parameters (e.g., location, event time).
The system spins up a Fargate job to filter and extract relevant data.
Extracted data is stored as an SQLite file in S3, accessible via a direct link for download and inspection.
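The filter-and-extract step described above might look roughly like the sketch below. It assumes events are dictionaries with hypothetical event_id, location, and event_time fields, and it covers only the filtering and the SQLite write; launching the Fargate job and uploading the file to S3 are left out.

```python
import sqlite3
from typing import Iterable, Mapping


def write_extract(
    conn: sqlite3.Connection,
    events: Iterable[Mapping[str, object]],
    location: str,
    start_ts: int,
    end_ts: int,
) -> int:
    """Filter events by location and time window, then store them in SQLite.

    The window is half-open: start_ts <= event_time < end_ts.
    Returns the number of rows written.
    """
    rows = [
        (e["event_id"], e["location"], e["event_time"])
        for e in events
        if e["location"] == location and start_ts <= e["event_time"] < end_ts
    ]
    # Using the connection as a context manager commits on success.
    with conn:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS events ("
            "event_id TEXT PRIMARY KEY, location TEXT, event_time INTEGER)"
        )
        conn.executemany("INSERT OR REPLACE INTO events VALUES (?, ?, ?)", rows)
    return len(rows)
```

Opening the connection with sqlite3.connect("extract.sqlite") would produce a single file on disk, which is the kind of artifact that could then be uploaded and shared through a download link.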
This tool allows teams to: