AWS Storage Options – SQS and Redshift
SQS
Amazon SQS is a temporary data repository for messages. It is a reliable, highly scalable, hosted message queuing service that temporarily stores and delivers short text-based data messages (up to 256 KB).
It supports a virtually unlimited number of queues and offers unordered, at-least-once delivery of messages.
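Because delivery is at-least-once, a consumer can occasionally see the same message twice, so processing is usually made idempotent. The following is a minimal sketch of that idea, not SQS's own mechanism. It assumes each message is a dict with `MessageId` and `Body` keys (the shape returned by an SQS receive call) and that an in-memory set is good enough for tracking; `process_once` and `seen_ids` are hypothetical names, and a real system would likely keep the seen IDs in a durable store.

```python
from typing import Callable, Iterable, Set


def process_once(
    messages: Iterable[dict],
    handler: Callable[[str], None],
    seen_ids: Set[str],
) -> int:
    """Run handler on each message body, skipping duplicate deliveries.

    Returns the number of messages actually handled.
    """
    handled = 0
    for message in messages:
        message_id = message["MessageId"]
        if message_id in seen_ids:
            # Duplicate delivery of a message we already processed: ignore it.
            continue
        handler(message["Body"])
        seen_ids.add(message_id)
        handled += 1
    return handled
```

Passing the same batch twice results in the handler being called only on the first pass.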
It is ideal for any situation where multiple components of an application must communicate and coordinate their work in a loosely coupled fashion, especially producer-consumer scenarios.
It can be used to coordinate multi-step processing pipelines, where each message is associated with a task that must be processed.
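One stage of such a pipeline might look like the sketch below: it reads task messages from an input queue, applies a processing step, forwards the result to the next queue, and only then deletes the input message. The `sqs` argument is assumed to behave like `boto3.client("sqs")`; `run_stage_once`, the queue URLs, and `step` are hypothetical names chosen for illustration.

```python
from typing import Callable


def run_stage_once(
    sqs,
    in_queue_url: str,
    out_queue_url: str,
    step: Callable[[str], str],
) -> int:
    """Pull one batch of tasks, apply `step`, and forward results downstream.

    Returns the number of tasks processed in this call.
    """
    response = sqs.receive_message(QueueUrl=in_queue_url, MaxNumberOfMessages=10)
    # The "Messages" key is absent when nothing was received.
    messages = response.get("Messages", [])
    for message in messages:
        result = step(message["Body"])
        # Forward the result before acknowledging the input, so a crash
        # here leads to a redelivery rather than a lost task.
        sqs.send_message(QueueUrl=out_queue_url, MessageBody=result)
        sqs.delete_message(
            QueueUrl=in_queue_url,
            ReceiptHandle=message["ReceiptHandle"],
        )
    return len(messages)
```

Because each stage only talks to queues, stages can be deployed and scaled independently of one another.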
It lets the number of worker instances scale up or down, and the processing power of each worker instance scale up or down, to suit the total workload, without any application changes.
Anti-Patterns
Binary or large messages: SQS accepts text messages up to 256 KB in size. For binary or larger messages, store the payload in Amazon S3 or Amazon RDS and, if the application requires it, use SQS to carry a pointer to the stored object.
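For the binary or large message case, the pointer approach can be sketched as follows. This is an illustration under assumptions, not a prescribed implementation: `sqs` and `s3` are assumed to behave like `boto3.client("sqs")` and `boto3.client("s3")`, the JSON envelope and the `payloads/` key layout are invented for the example, and the size limit is the 256 KB figure quoted at the top of this section.

```python
import json
import uuid

MAX_SQS_BYTES = 256 * 1024  # message size limit quoted in this section


def send_with_pointer(sqs, s3, queue_url: str, bucket: str, payload: bytes) -> str:
    """Send small text payloads inline; store anything else in S3 and send a pointer.

    Returns "inline" or "pointer" to indicate which path was taken.
    """
    try:
        text = payload.decode("utf-8")
    except UnicodeDecodeError:
        text = None  # binary data cannot go into a text message body

    if text is not None:
        body = json.dumps({"type": "inline", "data": text})
        if len(body.encode("utf-8")) <= MAX_SQS_BYTES:
            sqs.send_message(QueueUrl=queue_url, MessageBody=body)
            return "inline"

    # Too large or not text: store the payload and enqueue only its location.
    key = f"payloads/{uuid.uuid4()}"  # hypothetical key layout
    s3.put_object(Bucket=bucket, Key=key, Body=payload)
    pointer = json.dumps({"type": "pointer", "bucket": bucket, "key": key})
    sqs.send_message(QueueUrl=queue_url, MessageBody=pointer)
    return "pointer"
```

The consumer reads the envelope's `type` field and fetches the object from S3 only when it receives a pointer.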
Long-term storage: SQS can store messages for a maximum of 14 days. If an application requires storage for longer than 14 days, Amazon S3 or another storage option should be considered.
High-speed message queuing or very short tasks: If the application requires very fast message send and receive responses from a single producer or consumer, Amazon DynamoDB or a message-queuing system hosted on Amazon EC2 may be more appropriate.
Performance
SQS is a distributed queuing system optimized for horizontal scalability, not for single-threaded sending and receiving speed.
A single client can send or receive Amazon SQS messages at a rate of about 5 to 50 messages per second. Higher receive performance can be achieved by requesting multiple messages (up to 10) in a single call.
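Requesting several messages per call can be sketched as below. The `sqs` argument is assumed to behave like `boto3.client("sqs")`; `drain_in_batches`, `handler`, and `max_batches` are hypothetical names. For brevity the sketch ignores any partial failures reported by the batch delete call.

```python
from typing import Callable


def drain_in_batches(
    sqs,
    queue_url: str,
    handler: Callable[[str], None],
    max_batches: int = 100,
) -> int:
    """Receive up to 10 messages per call and acknowledge each batch in one call.

    Stops at the first empty receive. Returns the number of messages handled.
    """
    total = 0
    for _ in range(max_batches):
        response = sqs.receive_message(
            QueueUrl=queue_url,
            MaxNumberOfMessages=10,  # per-call maximum
            WaitTimeSeconds=1,       # brief long poll to reduce empty responses
        )
        messages = response.get("Messages", [])
        if not messages:
            break
        for message in messages:
            handler(message["Body"])
        # One delete call for the whole batch instead of one per message.
        sqs.delete_message_batch(
            QueueUrl=queue_url,
            Entries=[
                {"Id": str(i), "ReceiptHandle": m["ReceiptHandle"]}
                for i, m in enumerate(messages)
            ],
        )
        total += len(messages)
    return total
```

Running several such workers in parallel is the horizontal-scaling route to higher aggregate throughput.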
Messages in SQS are durable but temporary.
All messages are stored redundantly across multiple servers or data centers.
The message retention time can be configured on a per-queue basis, from a minimum of one minute to a maximum of 14 days.
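Setting the retention time for a queue can be sketched as follows, assuming `sqs` behaves like `boto3.client("sqs")`, where the retention period is the `MessageRetentionPeriod` queue attribute expressed in seconds. The helper name `set_retention` is hypothetical; the range check mirrors the one-minute and 14-day bounds stated above.

```python
MIN_RETENTION_SECONDS = 60                  # one minute
MAX_RETENTION_SECONDS = 14 * 24 * 60 * 60   # 14 days


def set_retention(sqs, queue_url: str, seconds: int) -> None:
    """Set how long a queue keeps unconsumed messages, validating the range first."""
    if not MIN_RETENTION_SECONDS <= seconds <= MAX_RETENTION_SECONDS:
        raise ValueError(
            f"retention must be between {MIN_RETENTION_SECONDS} "
            f"and {MAX_RETENTION_SECONDS} seconds, got {seconds}"
        )
    # Queue attribute values are passed as strings.
    sqs.set_queue_attributes(
        QueueUrl=queue_url,
        Attributes={"MessageRetentionPeriod": str(seconds)},
    )
```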
Cost Model
Pricing is determined by the number of requests and the amount of data transferred in and out (priced per GB per month).
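As a rough illustration of how the two components combine, the sketch below adds a per-request charge to a per-GB transfer charge. The function name and all rates are placeholders supplied by the caller; they are not AWS prices, which vary by region and change over time.

```python
def estimate_monthly_cost(
    requests: int,
    gb_transferred: float,
    price_per_million_requests: float,
    price_per_gb: float,
) -> float:
    """Estimate a monthly bill from request count and data transfer volume.

    Both prices are caller-supplied placeholders, not published rates.
    """
    request_cost = requests / 1_000_000 * price_per_million_requests
    transfer_cost = gb_transferred * price_per_gb
    return request_cost + transfer_cost
```

Batching several messages into one request lowers the request count, and therefore the first term.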
SQS is highly flexible and extremely scalable.
It is designed to let a virtually unlimited number of computers read and write a virtually unlimited number of messages at any time.
It supports a virtually unlimited number of queues and messages per user.
Amazon Redshift
Amazon Redshift is a fast, fully managed, petabyte-scale data warehouse service that makes it easy and cost-effective to analyze all your data using existing business intelligence tools.
It is optimized for datasets ranging from a few hundred gigabytes up to a petabyte or more.
It manages all aspects of setting up, operating, and scaling a data warehouse. This includes automating ongoing administrative tasks like patching and backups.
Redshift is ideal for analyzing large datasets with existing business intelligence tools.
Common use cases include:
Analyze global sales data for multiple products
Store historical stock trade data
Analyze clicks and impressions of ad campaigns
Aggregate gaming data
Analyze social trends
Measure clinical quality, operational efficiency, and financial performance in the health care industry
Anti-Patterns
OLTP workloads: Redshift is a column-oriented database and is better suited to data warehousing and analytics. If the application involves online transaction processing, Amazon RDS is a better option.
Blob data: Amazon S3 is a better option for blob storage. The related metadata can be kept in another store such as RDS or DynamoDB.
Performance
Amazon Redshift offers very high query performance for datasets r