In this post, we provide a serverless solution to cost-optimize the storage of contact-center call recordings. The solution automates the scheduling, storage-tiering, and resampling of call-recording files, resulting in immediate cost savings. The solution is an asynchronous architecture built using AWS Step Functions, Amazon Simple Queue Service (Amazon SQS), and AWS Lambda.
Amazon Connect provides an omnichannel cloud contact center with the ability to maintain call recordings for compliance and for gaining actionable insights using Contact Lens for Amazon Connect and AWS Contact Center Intelligence Partners. The storage required for call recordings can quickly increase as customers meet compliance retention requirements, often spanning six or more years. This can lead to hundreds of terabytes in long-term storage.
Solution overview
When an agent completes a customer call, Amazon Connect sends the call recording to an Amazon Simple Storage Service (Amazon S3) bucket with a date and contact ID prefix. The file is stored in the .WAV format, encoded as pcm_s16le at 8000 Hz with two channels, for a bitrate of 256 kb/s. The call-recording files are approximately 2 MB/minute and are optimized for high-quality processing, such as machine learning analysis (see Figure 1).

Figure 1. Asynchronous architecture for batch resampling of call-recording files on Amazon S3
When a call recording is sent to Amazon S3, downstream post-processing is often performed to generate analytics reports for agents and quality auditors. The downstream processing can include services that provide transcriptions, quality-of-service metrics, and sentiment analysis to create reports and trigger actionable events.
While this processing is often completed within minutes, the downstream applications may require processing retries. Because audio resampling reduces the quality of the audio files, it is essential to delay resampling until after processing is completed. Processed call recordings are infrequently accessed in the days after a call is completed, with only a small percentage accessed by agents and call quality auditors, so call recordings can benefit from resampling and transitioning to long-term Amazon S3 storage tiers.
In Figure 2, multiple AWS services work together to provide an end-to-end cost-optimization solution for your contact center call recordings.

Figure 2. AWS Step Functions orchestrates the batch resampling of call recordings
An Amazon EventBridge schedule rule triggers the Step Functions state machine to perform the batch resampling process for all call recordings from the previous 7 days.
In the first Step Functions task, a Lambda function iterates the S3 bucket using the ListObjectsV2 API, obtaining the call recordings (1000 objects per iteration) with the date prefix from 7 days ago.
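The listing step can be sketched as follows. This is a minimal illustration and not the repository's code: the `base` prefix, the `YYYY/MM/DD` layout, and the function names are assumptions, and `s3_client` is expected to be a boto3 S3 client (for example `boto3.client("s3")`).

```python
from datetime import date, timedelta


def date_prefix(today: date, days_ago: int = 7,
                base: str = "connect/example-instance/CallRecordings") -> str:
    """Build the key prefix for recordings made `days_ago` days before `today`.

    The base path and the YYYY/MM/DD layout are assumptions for illustration.
    """
    target = today - timedelta(days=days_ago)
    return f"{base}/{target:%Y/%m/%d}/"


def list_recordings(s3_client, bucket: str, prefix: str) -> list:
    """Collect every object key under `prefix`, up to 1000 keys per request."""
    keys = []
    kwargs = {"Bucket": bucket, "Prefix": prefix, "MaxKeys": 1000}
    while True:
        page = s3_client.list_objects_v2(**kwargs)
        keys.extend(obj["Key"] for obj in page.get("Contents", []))
        if not page.get("IsTruncated"):
            return keys
        # Ask for the next page, starting where this one stopped.
        kwargs["ContinuationToken"] = page["NextContinuationToken"]
```

Passing the client in as an argument keeps the pagination logic easy to exercise with a stub before pointing it at a real bucket.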
The next task invokes a Lambda function that inserts the call recording objects into the Amazon SQS queue. The audio-conversion Lambda function receives the Amazon SQS queue events through the Lambda event source mapping integration. Each concurrent Lambda invocation downloads a stored call recording from Amazon S3, resamples the .WAV file with ffmpeg, and tags the S3 object with a “transformed=True” tag.
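As an illustration of the conversion step, the sketch below builds an ffmpeg command that downmixes a recording to one channel and encodes it as 8-bit A-law at 8000 Hz. The function names and file paths are hypothetical, and the exact ffmpeg options used by the solution may differ. It assumes an `ffmpeg` binary is available on the PATH (in Lambda this would typically come from a layer).

```python
import subprocess


def build_ffmpeg_args(src: str, dst: str) -> list:
    """Return an ffmpeg command line that resamples `src` into `dst`."""
    return [
        "ffmpeg",
        "-y",                # overwrite the output file if it already exists
        "-i", src,           # input recording
        "-ac", "1",          # downmix to a single (mono) channel
        "-ar", "8000",       # keep the 8000 Hz sample rate
        "-c:a", "pcm_alaw",  # encode as 8-bit A-law PCM
        dst,
    ]


def resample(src: str, dst: str) -> None:
    """Run ffmpeg and raise CalledProcessError if the conversion fails."""
    subprocess.run(build_ffmpeg_args(src, dst), check=True)
```

Raising on a non-zero exit code lets a failed conversion surface as a Lambda error, so the message can be retried or routed to the dead-letter queue.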
Finally, the conversion function uploads the resampled file to Amazon S3, overwriting the original call recording with the resampled recording using a cost-optimized storage class, such as S3 Glacier Instant Retrieval. S3 Glacier Instant Retrieval provides the lowest-cost storage for long-lived data that is rarely accessed and requires milliseconds retrieval, such as contact-center call-recording playback. By default, Amazon Connect stores call recordings with S3 Versioning enabled, maintaining the original file as a version. You can use lifecycle policies to delete object versions from a version-enabled bucket and permanently remove the original version, which minimizes the storage consumed by the original call recording.
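The upload and the version cleanup might look like the sketch below. The helper names, the rule ID, and the one-day retention are assumptions for illustration. The dictionaries are shaped for the boto3 calls `put_object` and `put_bucket_lifecycle_configuration`. Note that a rule with an empty prefix applies to every object in the bucket and permanently deletes noncurrent versions, so scope it carefully.

```python
def upload_kwargs(bucket: str, key: str, body: bytes) -> dict:
    """Arguments for s3.put_object that overwrite the recording in place."""
    return {
        "Bucket": bucket,
        "Key": key,                      # same key, so the original becomes a noncurrent version
        "Body": body,
        "StorageClass": "GLACIER_IR",    # S3 Glacier Instant Retrieval
        "Tagging": "transformed=True",   # tag as quoted in the text above
    }


def noncurrent_expiration_rule(days: int = 1) -> dict:
    """Lifecycle configuration that deletes noncurrent versions after `days` days."""
    return {
        "Rules": [
            {
                "ID": "expire-original-recordings",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": ""},            # whole bucket; narrow this in practice
                "NoncurrentVersionExpiration": {"NoncurrentDays": days},
            }
        ]
    }


# Usage with a boto3 S3 client (not executed here):
#   s3.put_object(**upload_kwargs("my-bucket", "path/to/recording.wav", data))
#   s3.put_bucket_lifecycle_configuration(
#       Bucket="my-bucket",
#       LifecycleConfiguration=noncurrent_expiration_rule(days=1),
#   )
```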
This solution captures failures within the Step Functions workflow with logging and a dead-letter queue, such as when an error occurs while resampling a recording file. A Step Functions task monitors the Amazon SQS queue using the Step Functions AWS SDK integration with Amazon SQS and ends the workflow when the queue is emptied. Table 1 shows the default and resampled formats.
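In the solution, the queue check runs as a Step Functions task through the AWS SDK integration. The same logic expressed in Python, as a sketch with a hypothetical function name, reads the approximate message counts from the queue attributes. `sqs_client` is assumed to be a boto3 SQS client.

```python
def queue_is_drained(sqs_client, queue_url: str) -> bool:
    """Return True when no messages are waiting or currently being processed."""
    response = sqs_client.get_queue_attributes(
        QueueUrl=queue_url,
        AttributeNames=[
            "ApproximateNumberOfMessages",            # visible, waiting for a consumer
            "ApproximateNumberOfMessagesNotVisible",  # in flight, being processed
        ],
    )
    attributes = response["Attributes"]  # SQS returns the counts as strings
    waiting = int(attributes["ApproximateNumberOfMessages"])
    in_flight = int(attributes["ApproximateNumberOfMessagesNotVisible"])
    return waiting == 0 and in_flight == 0
```

Because these counts are approximate, a workflow would typically wait and check more than once before treating the queue as empty.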

Figure 3. Detailed AWS Step Functions state machine diagram
Resampling
Table 1. Default and resampled call recording audio formats
Audio sampling formats | File size/minute | Notes |
---|---|---|
Bitrate 256 kb/s, pcm_s16le, 8000 Hz, 2 channels | 2 MB | The default for Amazon Connect call recordings. Sampled for audio quality and call analytics processing. |
Bitrate 64 kb/s, pcm_alaw, 8000 Hz, 1 channel | 0.5 MB | Resampled to mono channel 8 bit. This resampling is not reversible and should only be performed after all call analytics processing has been completed. |
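The file sizes in Table 1 follow from the audio parameters. As a quick check, the sketch below derives the bitrate and the per-minute size for uncompressed PCM, ignoring the small WAV header. It uses decimal units (1 MB = 1,000,000 bytes), and the function names are ours.

```python
def bitrate_kbps(sample_rate_hz: int, bits_per_sample: int, channels: int) -> float:
    """Bitrate of uncompressed PCM audio in kilobits per second."""
    return sample_rate_hz * bits_per_sample * channels / 1000


def mb_per_minute(sample_rate_hz: int, bits_per_sample: int, channels: int) -> float:
    """Approximate file size of one minute of audio in megabytes."""
    bytes_per_second = sample_rate_hz * bits_per_sample * channels / 8
    return bytes_per_second * 60 / 1_000_000


default = mb_per_minute(8000, 16, 2)    # pcm_s16le, stereo
resampled = mb_per_minute(8000, 8, 1)   # pcm_alaw, mono
print(bitrate_kbps(8000, 16, 2), default)    # 256.0 kb/s, about 1.92 MB/minute
print(bitrate_kbps(8000, 8, 1), resampled)   # 64.0 kb/s, about 0.48 MB/minute
```

Halving both the sample width and the channel count reduces the file to a quarter of its original size, which matches the rounded 2 MB and 0.5 MB figures in the table.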
Cost analysis
For pricing information for the primary services used in the solution, see the pricing pages for AWS Step Functions, AWS Lambda, Amazon SQS, and Amazon S3.
The costs incurred by the solution are based on usage and are AWS Free Tier eligible. After the AWS Free Tier allowance is consumed, usage costs are approximately $0.11 per 1000 minutes of call recordings. S3 Standard starts at $0.023 per GB/month, and S3 Glacier Instant Retrieval is $0.004 per GB/month, with $0.003 per GB of data retrieval. Over a 6-year compliance retention term, the schedule-based resampling and storage tiering results in significant cost savings.
In the 6-year example detailed in Table 2, the S3 Standard storage costs would be approximately $356,664 for 3 million call-recording minutes/month. The audio resampling and S3 Glacier Instant Retrieval tiering reduces the 6-year cost to approximately $41,838.
Table 2. Multi-year cost savings scenario (3 million minutes/month) in USD
Year | Total minutes (3 million/month) | Total storage (TB) | Cost of storage, S3 Standard (USD) | Cost of running the resampling (USD) | Cost of resampling solution with S3 Glacier Instant Retrieval (USD) |
---|---|---|---|---|---|
1 | 36,000,000 | 72 | 10,764 | 3,960 | 4,813 |
2 | 72,000,000 | 108 | 30,636 | 3,960 | 5,677 |
3 | 108,000,000 | 144 | 50,508 | 3,960 | 6,541 |
4 | 144,000,000 | 180 | 70,380 | 3,960 | 7,405 |
5 | 180,000,000 | 216 | 90,252 | 3,960 | 8,269 |
6 | 216,000,000 | 252 | 110,124 | 3,960 | 9,133 |
Total | 1,008,000,000 | 972 | 356,664 | 23,760 | 41,838 |
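To make the first row of Table 2 easier to follow, the sketch below recomputes two of the year 1 figures from the prices quoted above. It rests on assumptions we are adding for illustration: recordings accumulate evenly at 3 million minutes per month, each minute is 2 MB, 1 GB is 1,000 MB, and each month is billed on the storage accumulated by the end of that month.

```python
from decimal import Decimal

MINUTES_PER_MONTH = 3_000_000
MB_PER_MINUTE = 2                              # default recording format
S3_STANDARD_PER_GB_MONTH = Decimal("0.023")    # USD
RESAMPLING_PER_1000_MINUTES = Decimal("0.11")  # USD


def yearly_resampling_cost() -> Decimal:
    """Cost of running the resampling for 12 months of recordings."""
    return Decimal(MINUTES_PER_MONTH * 12) / 1000 * RESAMPLING_PER_1000_MINUTES


def first_year_standard_cost() -> Decimal:
    """S3 Standard storage cost for year 1 as recordings accumulate monthly."""
    gb_added_per_month = Decimal(MINUTES_PER_MONTH * MB_PER_MINUTE) / 1000
    total = Decimal(0)
    for month in range(1, 13):
        stored_gb = gb_added_per_month * month  # everything recorded so far
        total += stored_gb * S3_STANDARD_PER_GB_MONTH
    return total


print(yearly_resampling_cost())    # 3960 USD
print(first_year_standard_cost())  # 10764 USD
```

Under these assumptions, the results line up with the year 1 values for the cost of running the resampling and the cost of S3 Standard storage in Table 2.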
To explore the solution's costs for yourself, use AWS Cost Explorer or choose Bill Details on the AWS Billing Dashboard to see your month-to-date spend by service.
Deploying the solution
The code and documentation for this solution are available by cloning the git repository, and the solution can be deployed with the AWS Cloud Development Kit (AWS CDK).
Bash
# clone repository
git clone https://github.com/aws-samples/amazon-connect-call-recording-cost-optimizer.git
# navigate to the project directory
cd amazon-connect-call-recording-cost-optimizer
Modify the cdk.context.json file with your environment's configuration settings, such as the bucket_name. Next, install the AWS CDK dependencies and deploy the solution:
# ensure you are in the root directory of the repository
./cdk-deploy.sh
Once deployed, you can test the resampling solution by waiting for the EventBridge schedule rule to execute based on the num_days_age setting that is applied. You can also manually run the AWS Step Functions state machine with a specified date, for example {"specific_date":"01/01/2022"}.
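A manual run can also be started programmatically. The following is a sketch under assumptions: the function name is ours, the state machine ARN must be taken from your own deployment, and `sfn_client` is expected to be a boto3 Step Functions client (`boto3.client("stepfunctions")`). The date string is passed through unchanged in the format shown above.

```python
import json


def start_resampling(sfn_client, state_machine_arn: str, specific_date: str) -> dict:
    """Start one execution of the state machine for the given date."""
    return sfn_client.start_execution(
        stateMachineArn=state_machine_arn,
        input=json.dumps({"specific_date": specific_date}),
    )


# Usage (not executed here); the ARN is a placeholder for your deployed state machine:
#   import boto3
#   sfn = boto3.client("stepfunctions")
#   start_resampling(sfn, "<your-state-machine-arn>", "01/01/2022")
```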
The AWS CDK deployment creates the following resources:
- AWS Step Functions state machine
- AWS Lambda functions
- Amazon SQS queues
- Amazon EventBridge rule
The solution automates the transition to a storage tier, such as S3 Glacier Instant Retrieval. In addition, Amazon S3 Lifecycle rules can be set manually to transition the call recordings after resampling to other Amazon S3 storage classes.
Cleanup
When you are finished experimenting with this solution, clean up your resources by running the command:
cdk destroy
This command deletes the AWS CDK-deployed resources. However, the S3 bucket containing your call recordings and the CloudWatch log groups are retained.
Conclusion
This call recording resampling solution offers an automated, cost-optimized, and scalable architecture to reduce long-term compliance call recording archival costs.