AWS Data analysis flow
Collect / Ingest Data
Ingestion needs an event pipeline.
Event Pipeline Approach #1
API Gateway invokes a Lambda function for every API call, and the Lambda uploads the payload to an S3 bucket.
Pros: Simple
Cons: Gets costly as throughput increases, because every event incurs a Lambda invocation and an S3 request.
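A minimal sketch of the Lambda side of Approach #1, assuming an API Gateway proxy integration (the request body arrives as a string in `event["body"]`). The bucket name `my-event-bucket`, the `events/` key prefix, and the `s3_client` parameter are hypothetical and exist only for illustration; the only AWS call used is boto3's `put_object`.

```python
import json
import uuid
from datetime import datetime, timezone


def build_object_key(now, event_id, prefix="events"):
    """Return a date-partitioned key such as events/2024/01/31/<id>.json."""
    return f"{prefix}/{now:%Y/%m/%d}/{event_id}.json"


def handler(event, context, s3_client=None, bucket="my-event-bucket"):
    """Lambda entry point: write one S3 object per API call.

    `bucket` and `s3_client` are illustrative parameters, not part of the
    Lambda contract; Lambda itself only passes `event` and `context`.
    """
    if s3_client is None:
        import boto3  # imported lazily so the function can be exercised without AWS
        s3_client = boto3.client("s3")

    # Assumption: the body is plain text (not base64-encoded by API Gateway).
    body = event.get("body") or "{}"
    key = build_object_key(datetime.now(timezone.utc), uuid.uuid4().hex)

    # One PUT request per event: this is the per-call cost noted above.
    s3_client.put_object(
        Bucket=bucket,
        Key=key,
        Body=body.encode("utf-8"),
        ContentType="application/json",
    )
    return {"statusCode": 200, "body": json.dumps({"stored": key})}
```

Because each request produces one Lambda invocation and one S3 PUT, the request count on both services grows linearly with traffic.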
Event Pipeline Approach #2
API Gateway calls Kinesis Firehose directly, with no Lambda in between.
Pros: Avoids the cost of a Lambda invocation and an S3 request for every event.
Cons:
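To illustrate Approach #2: with a direct integration, API Gateway has to reshape the incoming body into the JSON that Firehose's `PutRecord` action expects, where `Record.Data` is a base64-encoded blob. The sketch below builds that request shape in Python so the transformation is easy to see. The stream name `my-event-stream` is hypothetical, and appending a newline to each record is an assumption (a common way to keep delivered records line-delimited), not something the notes above prescribe.

```python
import base64
import json


def build_put_record_request(stream_name, request_body):
    """Build the JSON body for a Firehose PutRecord call.

    The newline suffix is an assumption: it keeps records separable once
    Firehose concatenates them into a single delivered object.
    """
    data = (request_body.rstrip("\n") + "\n").encode("utf-8")
    return {
        "DeliveryStreamName": stream_name,
        "Record": {"Data": base64.b64encode(data).decode("ascii")},
    }


if __name__ == "__main__":
    # "my-event-stream" is a hypothetical delivery stream name.
    request = build_put_record_request("my-event-stream", '{"user": 42}')
    print(json.dumps(request, indent=2))
```

In a real setup this mapping would live in the API Gateway integration request rather than in application code; the Python version only shows the target shape.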
Store
S3
RDS
Process / Analyze
EMR (Spark)
Redshift
Visualize / Report
To get more insights