AWS Data Analysis Flow

Data Analysis Flow

  1. Collect / Ingest Data

    • Needs an event pipeline

    • Event Pipeline Approach #1

      API Gateway invokes a Lambda function for every API call, and the function uploads the payload to an S3 bucket.

      • Pros: Simple

      • Cons: Cost grows with throughput, because every API call incurs a Lambda invocation and an S3 request.

    • Event Pipeline Approach #2

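      Invoke Kinesis Firehose directly from API Gateway, with no Lambda in between.

      • Pros: Avoids the cost of a Lambda invocation and an S3 request per API call, since Firehose buffers records and delivers them to S3 in batches.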

      • Cons:

  2. Store

    • S3

    • RDS

  3. Process / Analyze

    • EMR (Spark), Redshift

  4. Visualize / Report
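
The two event pipeline approaches from step 1 can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation. It assumes API Gateway uses Lambda proxy integration (so the request body arrives as a string in event["body"]) and that events are JSON. The names make_handler and firehose_put_record_request, the "events/" key prefix, and the bucket and stream names are hypothetical. The S3 client is passed in so that the handler can be exercised without AWS credentials. In a deployment it would be boto3.client("s3").

```python
import base64
import json
import uuid
from datetime import datetime, timezone


def make_handler(s3_client, bucket):
    """Approach #1: build a Lambda handler that stores each API payload in S3.

    s3_client is expected to expose put_object(Bucket=..., Key=..., Body=...),
    as the boto3 S3 client does.
    """

    def handler(event, context):
        # Assumption: Lambda proxy integration, so the body is a plain string.
        body = event.get("body") or "{}"
        now = datetime.now(timezone.utc)
        # Hypothetical key layout: partition by date, one object per event.
        key = f"events/{now:%Y/%m/%d}/{uuid.uuid4()}.json"
        # One S3 PUT per API call: this is the per-request cost noted above.
        s3_client.put_object(Bucket=bucket, Key=key, Body=body.encode("utf-8"))
        return {"statusCode": 200, "body": json.dumps({"stored": key})}

    return handler


def firehose_put_record_request(stream_name, payload):
    """Approach #2: the request body of a Firehose PutRecord call.

    With a direct integration, API Gateway itself would build this JSON
    (for example in a mapping template), so no Lambda code runs per event.
    Record.Data must be base64-encoded.
    """
    # A trailing newline keeps records separable once Firehose batches them.
    line = json.dumps(payload) + "\n"
    return {
        "DeliveryStreamName": stream_name,
        "Record": {"Data": base64.b64encode(line.encode("utf-8")).decode("ascii")},
    }
```

In a real Lambda deployment, the handler for Approach #1 would be created once at module load, for example handler = make_handler(boto3.client("s3"), "my-event-bucket"), where the bucket name is a placeholder. For Approach #2 the function above only shows the shape of the request; the per-event work is done by API Gateway and Firehose.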

To get more insights
