At high volumes, Kinesis is one of the more expensive AWS services. The example we'll look at in this post is a 1GB/s stream with 1 million 1kb records per second and 3-day retention, whose sticker price is over $800k per year (provisioned mode) or $4.2 million per year (on-demand mode). (This is in us-west-2, though you can expect similar prices elsewhere. Also see the AWS official pricing and note that your organization may have different discounts in place.) There is also the "On-demand Advantage" mode, around 60% cheaper than regular on-demand but still considerably more expensive than provisioned mode.

Breaking down these numbers a bit, we see that for on-demand mode, bandwidth costs dominate ($0.08/GB for data ingested and $0.04/GB to read), amounting to over $3.6 million annually. You're also hit with an extra charge when storing data for more than 24 hours. For data between 24 hours and 7 days old, storage is roughly 4x the price of S3. After 7 days, the storage cost matches S3 Standard's highest per-GB tier: $0.023/GB per month.
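
As a sanity check, here's the back-of-the-envelope arithmetic (assuming every byte written is read exactly once by a single consumer):

```python
# Annual on-demand bandwidth cost for a 1GB/s stream, using the us-west-2
# list prices quoted above. Assumes each byte is written once and read once.
SECONDS_PER_YEAR = 86_400 * 365
gb_per_year = 1 * SECONDS_PER_YEAR              # 1GB/s -> ~31.5M GB/year

ingest = gb_per_year * 0.08                     # $0.08/GB ingested
read = gb_per_year * 0.04                       # $0.04/GB read
print(f"${ingest:,.0f} ingest + ${read:,.0f} read = ${ingest + read:,.0f}/yr")
# -> $2,522,880 ingest + $1,261,440 read = $3,784,320/yr
```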

If your org is looking to save money, you probably shouldn't be using either of the on-demand modes for Kinesis at scale. You're better off using provisioned mode and adjusting the number of shards as needed; these changes take effect in minutes. This does add some operational complexity, since you need to monitor your application to know when to scale up or down, but it's not rocket science, and even at this (relatively low) volume it's possible to save millions of dollars per year.

For provisioned mode, the annual cost exceeding $800k is due to the per-shard charge, the higher storage cost beyond 24 hours, and "PUT payload unit" usage, explained below. Here's the breakdown:

  • Each shard can handle 1000 records/second for writes, 2000 records/second for reads, and at most 1MiB/s of write throughput. To ingest a million records per second, we need 1000 shards, and it's advisable to leave some buffer (say 20%), so let's call it 1200 shards, costing over $157k per year ($10.95/mo per shard x 1200 x 12).
  • Each record written to the stream is rounded up to the nearest 25kb increment, called a "PUT payload unit", and you're charged $0.014 per million units. This is not much per unit, but since it's rounded up, even a 1kb record costs the same as a 25kb record. Our 1 million records/second stream costs about $440k/year in PUT payload unit charges.
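
The same arithmetic as a quick script, using the list prices above:

```python
# Annual provisioned-mode costs for 1 million 1kb records per second.
SECONDS_PER_YEAR = 86_400 * 365

shards = 1200                                   # 1000 needed + ~20% buffer
print(f"shards:    ${shards * 10.95 * 12:,.0f}/yr")             # -> $157,680/yr

# Every 1kb record rounds up to one 25kb PUT payload unit.
records_per_year = 1_000_000 * SECONDS_PER_YEAR
print(f"PUT units: ${records_per_year / 1e6 * 0.014:,.0f}/yr")  # -> $441,504/yr
```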

To achieve additional cost savings with provisioned mode, you can create aggregate records, which are lists of records, and compress them before shipping to Kinesis. If you hit the sweet spot of aggregate record size just shy of 25kb, you're paying little in PUT payload unit costs and also reducing the number of records per second as far as Kinesis is concerned. The Kinesis Producer Library (KPL) and Kinesis Client Library (KCL) are both designed for this, with the KPL producing aggregate records and the KCL flattening them back to individual records on consumption.
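
To make the idea concrete, here's a minimal sketch of aggregation and de-aggregation. This is not the KPL's actual protobuf-based wire format, just simple length-prefixed framing plus compression:

```python
import gzip
import struct

def aggregate(records: list[bytes]) -> bytes:
    """Pack many logical records into one physical record."""
    framed = b"".join(struct.pack(">I", len(r)) + r for r in records)
    return gzip.compress(framed)

def deaggregate(blob: bytes) -> list[bytes]:
    """Flatten an aggregate record back into its logical records."""
    framed = gzip.decompress(blob)
    records, i = [], 0
    while i < len(framed):
        (n,) = struct.unpack_from(">I", framed, i)
        records.append(framed[i + 4 : i + 4 + n])
        i += 4 + n
    return records
```

With 1kb records, you can pack roughly 25 per aggregate before compression, and more after, landing each physical record just under the 25kb boundary.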

This can result in savings on the number of shards needed (1000 aggregate records might be 10k logical records) and those PUT payload units (since the physical records can be closer to a 25kb increment), but comes with a few caveats:

  • It can't always reduce the number of shards: shards are still limited to 1MiB/s ingest and 2MiB/s read capacity (and PutRecords batches are also limited to 10MiB). For instance, in our example, we hit 1MiB/s by writing 1000 1kb records, no matter how they are grouped. You'd still need 1000 shards to ingest a million records per second, so those shard costs are unchanged.
  • Achieving effective batching into aggregate records can be complicated in real workloads and will often require a custom ingestion service. For example, our 1 million records/second stream might be due to 1 million users each publishing 1 event per second to the stream. We don't want each publisher to linger for, say, 1000 seconds to collect a batch of 1000 events.

On this second point, one can coalesce and aggregate concurrent requests from multiple users via a custom scalable ingest service, written using the KPL or from scratch using the base Kinesis HTTP API. A custom ingest service is a good idea, but it is something you have to build and operate, and the number of nodes needed to power such a service can be significant, even approaching the number of shards. For instance, if each ingest service instance can process 10MB/s (10k records x 1kb), you still need 100 instances of it to reach 1 million records per second, in addition to still paying the monthly per-shard costs.
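
Here's a minimal sketch of one node of such an ingest service, reusing aggregate() from above. The stream name, thresholds, and partition key strategy are illustrative, and retries, error handling, and backpressure are omitted:

```python
import queue
import threading
import time
import uuid

import boto3

MAX_BATCH_BYTES = 25_000    # aim just under one 25kb PUT payload unit
MAX_LINGER_SECS = 0.05      # bound how long any one event can wait

pending: "queue.Queue[bytes]" = queue.Queue()
kinesis = boto3.client("kinesis")

def flush_loop(stream_name: str) -> None:
    """Drain events from many concurrent request handlers into aggregates."""
    while True:
        batch, size = [], 0
        deadline = time.monotonic() + MAX_LINGER_SECS
        while size < MAX_BATCH_BYTES and time.monotonic() < deadline:
            try:
                record = pending.get(timeout=MAX_LINGER_SECS / 10)
            except queue.Empty:
                continue
            batch.append(record)
            size += len(record)
        if batch:
            kinesis.put_record(
                StreamName=stream_name,
                Data=aggregate(batch),          # from the sketch above
                PartitionKey=uuid.uuid4().hex,  # spread load across shards
            )

threading.Thread(target=flush_loop, args=("events",), daemon=True).start()
# Request handlers then simply call: pending.put(event_bytes)
```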

Another minor downside of using aggregate records: if any consumers need to read historical data from the stream, AWS can no longer directly tell you which "physical" record contains the logical record you're interested in, though this often isn't a big deal for streaming applications which don't need any sort of random access to past records.

Summary

Putting this all together, here are our recommendations:

  • Use provisioned mode. If you don't want to deal with the headache of monitoring and dynamically adjusting shard counts, you can over-provision and still come out well ahead on cost.
  • Use a custom ingest service that creates aggregate records.

How much savings can you expect with these approaches? Well, in our running example, which cost ~ $800k per year with provisioned mode, $440k was due to PUT payload units. You can shrink these costs dramatically with a scalable ingest service that creates 25kb aggregate records (it's $0.014 per 25GB if you hit the 25kb boundaries exactly, so less than $18k/yr for our 1GB/s stream), though you do have to pay for the nodes powering the ingest service.
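
The post-aggregation PUT payload unit arithmetic, assuming perfectly packed 25kb physical records:

```python
# PUT payload unit cost when every physical record fills one 25kb unit.
SECONDS_PER_YEAR = 86_400 * 365
kb_per_year = 1_000_000 * SECONDS_PER_YEAR     # 1GB/s = 1 million kb/s
units = kb_per_year / 25                       # one PUT unit per 25kb
print(f"${units / 1e6 * 0.014:,.0f}/yr")       # -> $17,660/yr, down from ~$440k
```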

The number of shards needed would still be 1200 in our example, and there isn't much you can do about the fact that Kinesis storage is more expensive than S3, especially if you factor in that S3 supports use of a Lifecycle Policy or Intelligent Tiering to move data to colder (cheaper) storage. Stream processing in particular mostly needs access to recent data, so a lifecycle policy that moves older data to colder storage can save significantly. Infrequent Access is about half the price of S3 Standard but charges a small amount for bandwidth on read, often a good tradeoff for historical access to data streams.
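
As an illustration, an archive bucket might carry a lifecycle rule like the following. This is a sketch: the bucket name, prefix, and retention periods are hypothetical, and note that S3 requires objects to be at least 30 days old before a Standard-IA transition:

```python
import boto3

s3 = boto3.client("s3")
s3.put_bucket_lifecycle_configuration(
    Bucket="my-stream-archive",          # hypothetical bucket
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "age-out-stream-data",
                "Filter": {"Prefix": "events/"},
                "Status": "Enabled",
                # Move to Standard-IA at S3's 30-day minimum, then expire.
                "Transitions": [{"Days": 30, "StorageClass": "STANDARD_IA"}],
                "Expiration": {"Days": 365},
            }
        ]
    },
)
```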

With some additional operational complexity, you can create a process that moves older Kinesis data into S3 (perhaps using AWS Firehose), but this costs money to move and means data can no longer be accessed by a normal Kinesis consumer.

Kinesis over S3

If the above techniques aren't enough or don't sound appealing, there's another option: a Kinesis-compatible API that uses S3 for storage and runs on your infrastructure. Yes, this is a product we make. It's a good idea and has several advantages for this use case:

  • S3 has no bandwidth charges in-region; it's a fixed per-request cost to read or write an object. This avoids both the on-demand Kinesis bandwidth costs and the PUT payload unit costs of provisioned mode.
  • S3's cost per GB-month is excellent, with a variety of options including intelligent tiering. You win on storage costs versus AWS Kinesis and can potentially eliminate the separate process to send data from Kinesis to S3 for longer-term storage.
  • Because shards write large pages of records, per-shard capacity can be much higher. You're no longer limited to 1MiB/s and 1000 records/second. This can reduce the number of shards needed, sometimes by a factor of 10 or more.
  • Since data goes directly to S3, individual records can be any size, and batches of records can be any size (no longer limited to 500 elements or 10MiB).
  • Larger pages can achieve better compression vs trying to compress individual records or smaller aggregate records.
  • Storing streams directly in S3 also means you can replicate them across regions just by flipping a switch to enable S3's cross-region replication.

For the running example of a 1 million records per second stream with 3-day retention, the $800k/yr (or $4.2 million in on-demand mode) becomes ~ $140k/yr, even without accounting for better compression.

Since the implementation is API-compatible with existing Kinesis client libraries, no code changes are necessary.

If you're interested in learning more or would like to see a whitepaper on the design, sign up below.
