This project implements the backend for a shopping cart microservice, designed to process purchase transactions securely, resiliently, and at scale. The architecture uses AWS services to ensure reliable data storage and asynchronous processing, with monitoring and alerts in case of failures. The solution includes a layered Data Lake for data analytics, with quality and governance policies in place.
To see the above diagram in more detail, open it directly in Excalidraw by importing the image or the `.excalidraw` file.
- Client Request
  - The client sends an HTTP request with the purchase details (`buyer_id`, `product_id`, `number_of_installments`, `total_amount`, `purchase_date`) to the shopping cart API.
- Shopping Cart API
  - The API, built with FastAPI, runs on an AWS Lambda function. The code is versioned on GitHub and deployed through GitHub Actions (CI/CD).
  - The API processes the request and publishes the purchase data to an Amazon SNS topic for asynchronous processing (see the FastAPI sketch after this list).
  - The message is then delivered to an Amazon SQS queue subscribed to that SNS topic.
- Processing with AWS Lambda
  - Another AWS Lambda function, also versioned on GitHub, consumes messages from the SQS queue, processes each transaction, and stores the data in Amazon DynamoDB for fast, scalable storage (see the consumer sketch after this list).
- Data Persistence in DynamoDB
  - Amazon DynamoDB stores the transaction data, ensuring scalability and low latency for read and write operations.
- Fallback with Dead Letter Queue (DLQ)
  - If a message fails processing, SQS moves it to a dead-letter queue, ensuring no transaction is lost (a redrive-policy sketch follows this list).
- Monitoring and Alerts
  - Amazon CloudWatch monitors the Lambda, SQS, and DynamoDB operations. For critical errors, SNS (Simple Notification Service) sends alerts to notify stakeholders.
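Below are a few sketches of how these pieces could fit together. First, a minimal version of the API endpoint publishing to SNS, assuming a hypothetical `/purchase` route and Pydantic v2; only the `SNS_TARGET_ARN` variable comes from the project's `.env`:

```python
import os

import boto3
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
sns = boto3.client("sns")


class Purchase(BaseModel):
    buyer_id: str
    product_id: str
    number_of_installments: int
    total_amount: float
    purchase_date: str


@app.post("/purchase")  # route name is an assumption, not the project's actual route
def create_purchase(purchase: Purchase):
    # Publish the purchase to the SNS topic configured in .env (SNS_TARGET_ARN).
    sns.publish(
        TopicArn=os.environ["SNS_TARGET_ARN"],
        Message=purchase.model_dump_json(),
    )
    return {"status": "accepted"}
```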
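Next, a minimal sketch of the consuming Lambda, assuming the queue receives the standard SNS envelope (no raw message delivery) and a hypothetical `purchases` table name:

```python
import json
import os
from decimal import Decimal

import boto3

# Table name is an assumption; adjust to the real resource.
table = boto3.resource("dynamodb").Table(os.environ.get("DYNAMODB_TABLE", "purchases"))


def handler(event, context):
    """Consume purchase messages delivered by SQS (subscribed to the SNS topic)."""
    for record in event["Records"]:
        body = json.loads(record["body"])
        # Without raw message delivery, SQS receives the SNS envelope;
        # the purchase payload lives in its "Message" field.
        payload = json.loads(body["Message"]) if "Message" in body else body
        # DynamoDB rejects Python floats, so re-parse numbers as Decimal.
        item = json.loads(json.dumps(payload), parse_float=Decimal)
        table.put_item(Item=item)
```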
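The DLQ can be wired to the main queue with a redrive policy, as in this sketch; the queue names and `maxReceiveCount` are illustrative assumptions:

```python
import json

import boto3

sqs = boto3.client("sqs")

# Queue names are assumptions for illustration.
dlq_url = sqs.create_queue(QueueName="purchases-dlq")["QueueUrl"]
dlq_arn = sqs.get_queue_attributes(
    QueueUrl=dlq_url, AttributeNames=["QueueArn"]
)["Attributes"]["QueueArn"]

# After 3 failed receives, SQS moves the message to the DLQ instead of dropping it.
sqs.create_queue(
    QueueName="purchases",
    Attributes={
        "RedrivePolicy": json.dumps(
            {"deadLetterTargetArn": dlq_arn, "maxReceiveCount": "3"}
        )
    },
)
```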
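And a sketch of a CloudWatch alarm on Lambda errors that notifies an SNS topic; the function name and topic ARN are placeholders:

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

# Function name and topic ARN are placeholders for illustration.
cloudwatch.put_metric_alarm(
    AlarmName="purchase-consumer-errors",
    Namespace="AWS/Lambda",
    MetricName="Errors",
    Dimensions=[{"Name": "FunctionName", "Value": "purchase-consumer"}],
    Statistic="Sum",
    Period=300,
    EvaluationPeriods=1,
    Threshold=1,
    ComparisonOperator="GreaterThanOrEqualToThreshold",
    AlarmActions=["arn:aws:sns:us-east-2:000000000000:critical-alerts"],
)
```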
To implement the Medallion Architecture (Bronze, Silver, and Gold layers), Apache Airflow is used to orchestrate the data journey through these layers, ensuring data quality and scalability for analytics.
- Bronze Layer:
  - A scheduled DAG runs periodically, extracting new records from DynamoDB and storing the raw data (JSON files) in an S3 bucket, partitioned by `year/month/day/hour/`, using the `DynamoDBToS3Operator()` (see the DAG sketch after this list).
- Silver Layer:
  - Performs transformations (such as data cleaning and enrichment) using Spark on Amazon EMR clusters, orchestrated by a DAG using the EMR Operators (see the EMR sketch after this list).
  - The datasets are written to S3 buckets, partitioned, in Parquet file format.
  - A quality gate checks data quality using libraries like Soda or Great Expectations before data moves to the next layer (see the quality-gate sketch after this list).
  - The Silver layer also uses a Glue Crawler to catalog the metadata in the Glue Data Catalog (or Amundsen).
- Gold Layer:
  - Performs aggregations, also using Spark on Amazon EMR clusters, orchestrated by a DAG using the EMR Operators.
  - The aggregated datasets are written to S3 buckets, partitioned, in Parquet file format, ready for analysis.
  - An additional quality gate is applied after dataset generation in this layer, ensuring that only high-quality data is available for analytics.
  - A Glue Crawler again catalogs the metadata in the Glue Data Catalog.
- Data Contract: Formalizes data quality and structure expectations across layers, ensuring consistency and governance between the Data Producers and Data Consumers in the company.
  - Reference:
    - https://www.datamesh-manager.com/
    - https://datacontract.com/
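The sketches below illustrate the orchestration described above. First, a minimal Bronze DAG, assuming hypothetical table and bucket names and that `s3_key_prefix` is templated in the installed `amazon` provider version:

```python
import pendulum
from airflow import DAG
from airflow.providers.amazon.aws.transfers.dynamodb_to_s3 import (
    DynamoDBToS3Operator,
)

with DAG(
    dag_id="bronze_purchases",
    schedule="@hourly",
    start_date=pendulum.datetime(2024, 1, 1, tz="UTC"),
    catchup=False,
) as dag:
    # Table and bucket names are assumptions for illustration.
    DynamoDBToS3Operator(
        task_id="dynamodb_to_s3",
        dynamodb_table_name="purchases",
        s3_bucket_name="datalake-bronze",
        # Partition the raw JSON by year/month/day/hour, as described above.
        s3_key_prefix="purchases/{{ logical_date.strftime('%Y/%m/%d/%H') }}/",
    )
```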
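Next, a sketch of how a Silver (or, analogously, Gold) DAG could submit a Spark step to a running EMR cluster using the EMR Operators; the cluster id, script path, and bucket names are placeholders:

```python
import pendulum
from airflow import DAG
from airflow.providers.amazon.aws.operators.emr import EmrAddStepsOperator
from airflow.providers.amazon.aws.sensors.emr import EmrStepSensor

# Cluster id, script path, and bucket names are placeholders.
SPARK_STEPS = [
    {
        "Name": "silver_transform",
        "ActionOnFailure": "CONTINUE",
        "HadoopJarStep": {
            "Jar": "command-runner.jar",
            "Args": [
                "spark-submit",
                "s3://datalake-scripts/silver_transform.py",
                "--source", "s3://datalake-bronze/purchases/",
                "--target", "s3://datalake-silver/purchases/",
            ],
        },
    }
]

with DAG(
    dag_id="silver_purchases",
    schedule="@daily",
    start_date=pendulum.datetime(2024, 1, 1, tz="UTC"),
    catchup=False,
) as dag:
    add_step = EmrAddStepsOperator(
        task_id="add_step",
        job_flow_id="j-XXXXXXXXXXXXX",
        steps=SPARK_STEPS,
    )
    # Wait for the submitted step (first id returned via XCom) to finish.
    watch_step = EmrStepSensor(
        task_id="watch_step",
        job_flow_id="j-XXXXXXXXXXXXX",
        step_id="{{ task_instance.xcom_pull(task_ids='add_step', key='return_value')[0] }}",
    )
    add_step >> watch_step
```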
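Finally, a sketch of a quality gate using Soda Core's programmatic API (Great Expectations would play the same role); the data source and YAML file names are assumptions:

```python
from soda.scan import Scan

# Data source and YAML file names are assumptions for illustration.
scan = Scan()
scan.set_data_source_name("silver")
scan.add_configuration_yaml_file("soda/configuration.yml")
scan.add_sodacl_yaml_file("soda/silver_checks.yml")
scan.execute()
# Fail the pipeline (blocking promotion to the next layer) if any check fails.
scan.assert_no_checks_fail()
```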
- Python with FastAPI for implementing the shopping cart API.
- Amazon SNS and SQS for the asynchronous message queue.
- AWS Lambda to process messages and store data.
- DynamoDB for NoSQL storage of transaction data.
- AWS CloudWatch and SNS for monitoring and alerts.
- Apache Airflow for data pipeline orchestration.
- S3 Buckets for storage.
- Apache Spark for distributed data processing.
- AWS Glue (Crawler and Catalog) for metadata management.
- Quality Gates libraries for data validation in the Silver and Gold pipelines.
- Data Contracts for consistency and governance between the Data Producers and Data Consumers in the company.
- Medallion architecture (Bronze, Silver, and Gold layers).
- Python ^3.12.2
- Poetry ^1.8.4
- Run:

  ```bash
  poetry install
  ```

- Configure environment variables for authentication and access permissions to AWS.

- Configure the `.env` file at the project root with:

  ```bash
  SNS_TARGET_ARN="arn:aws:sns:us-east-2:00000000000:MyTopic"
  ```

- Run the API:

  ```bash
  task run
  ```

- Run the tests:

  ```bash
  task test
  ```
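With the API running locally, a purchase can be submitted as in this sketch; the route and port are assumptions, so adjust them to the actual endpoint:

```python
import requests

# Route and port are assumptions; adjust to the deployed endpoint.
response = requests.post(
    "http://localhost:8000/purchase",
    json={
        "buyer_id": "123",
        "product_id": "456",
        "number_of_installments": 3,
        "total_amount": 299.90,
        "purchase_date": "2024-11-01T10:30:00Z",
    },
)
print(response.status_code, response.json())
```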