BUILD_MANIFEST/storytelling.mdx

SEC_STATUS: CLEAR

[ CASE_STUDY // MEMORY_LOG_042 ]

SHA_HASH: 8f3d2a9a

COMPILED: 2026.06.04

AUTHOR: DEV_ARCHITECT

//Project Context

When I started this project, the goal was not simply to run workloads on Kubernetes. It was to build a production-style platform that mirrored how mature engineering teams operate.

Most beginner Kubernetes deployments rely heavily on manual operations:

Engineers run kubectl apply manually
Secrets are stored in YAML or CI variables
Cloud credentials are injected as long-lived static keys

These approaches work for small demos, but reliability and security degrade as environments grow.

My responsibility as a Platform Engineer was to design a secure, reproducible deployment platform that could:

Provision infrastructure automatically
Deploy applications declaratively
Manage secrets without exposing credentials
Recover entirely from source control

At first glance, the problem appeared to be “just deploying an app on EKS.”

After deeper analysis, the real challenge was establishing a secure control plane around deployments, identity, and secrets.

//Understanding the Bottleneck

The architecture consisted of three primary layers:

Infrastructure layer (VPC, EKS, IAM)
GitOps control layer (ArgoCD)
Application layer (Kubernetes workloads)

Initial investigation showed that traditional deployment pipelines fail when secrets and infrastructure lifecycle are tightly coupled.

The key issue was this:

Kubernetes workloads needed runtime secrets from AWS, but exposing AWS credentials inside pods created a major security risk.

This ruled out conventional solutions such as injecting AWS access keys via Kubernetes Secrets: the keys would be long-lived and usable by anyone able to read the Secret.

//Deep Observability

To understand system behavior and deployment state, I introduced:

ArgoCD sync/health dashboards
Kubernetes event inspection
Terraform output/state validation

These tools exposed several important behaviors:

Infrastructure state and application state drift independently
Sync success does not guarantee workload health
Secret delivery must occur before workload startup

This investigation revealed that deployment orchestration required dependency-aware sequencing rather than simple resource creation.
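Dependency-aware sequencing can be illustrated with a topological sort. The following is a minimal sketch, not the platform's actual orchestration: the component names and the dependency map are hypothetical, chosen to mirror the layers described above.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each key depends on every name in its set.
DEPENDENCIES = {
    "vpc": set(),
    "eks_cluster": {"vpc"},
    "iam_oidc_provider": {"eks_cluster"},
    "external_secrets_operator": {"iam_oidc_provider"},
    "app_secrets": {"external_secrets_operator"},
    "app_workload": {"app_secrets"},
}


def rollout_order(dependencies):
    """Return an order in which every component follows its dependencies."""
    return list(TopologicalSorter(dependencies).static_order())


order = rollout_order(DEPENDENCIES)
print(" -> ".join(order))
```

The point of the sketch is that the secret must exist before the workload starts, which a flat "create everything" step cannot guarantee.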

//Architectural Changes

The main architectural improvements included:

1. GitOps-Based Delivery

I implemented:

ArgoCD app-of-apps architecture
Environment overlays using Kustomize
Self-healing reconciliation

Instead of engineers manually deploying workloads, Git became the single source of truth.

This improved:

Reliability
Auditability
Scalability

Every deployment became a Git commit.
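To make the app-of-apps idea concrete, here is a hedged sketch that generates one child Argo CD Application per environment overlay. The repository URL, overlay paths, application names, and the `main` revision are assumptions for illustration, not the project's real values.

```python
import json


def child_application(name, repo_url, path, namespace, revision="main"):
    """Build an Argo CD Application manifest as a plain dict."""
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Application",
        "metadata": {"name": name, "namespace": "argocd"},
        "spec": {
            "project": "default",
            "source": {
                "repoURL": repo_url,
                "path": path,  # directory containing a kustomization.yaml
                "targetRevision": revision,
            },
            "destination": {
                "server": "https://kubernetes.default.svc",
                "namespace": namespace,
            },
            # Automated sync with self-heal reverts manual changes in the cluster.
            "syncPolicy": {"automated": {"prune": True, "selfHeal": True}},
        },
    }


# Hypothetical repository and overlay layout.
REPO = "https://example.com/platform/gitops.git"
apps = [
    child_application(f"demo-app-{env}", REPO, f"overlays/{env}", f"demo-{env}")
    for env in ("dev", "prod")
]
print(json.dumps(apps[0], indent=2))
```

A parent Application would point at the directory holding these child manifests, so adding an environment is a commit rather than a manual deploy.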

2. Keyless Secret Management with IRSA

Previously:

Pods required static AWS credentials to access Secrets Manager.

New design:

I enabled the cluster's IAM OIDC identity provider and configured IAM Roles for Service Accounts (IRSA), allowing the External Secrets Operator to assume a dedicated IAM role using its projected service account token, with no static keys stored in the cluster.

This reduced:

Credential leakage risk
Secret sprawl across environments
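The core of IRSA is the role's trust policy, which restricts who may assume the role to a single Kubernetes service account. The sketch below builds such a policy; the account ID, OIDC issuer, namespace, and service account name are placeholders, not values from this project.

```python
import json


def irsa_trust_policy(account_id, oidc_provider, namespace, service_account):
    """Build an IAM trust policy that only one service account can satisfy.

    oidc_provider is the issuer host and path without the https:// prefix.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{oidc_provider}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        f"{oidc_provider}:aud": "sts.amazonaws.com",
                        f"{oidc_provider}:sub": (
                            f"system:serviceaccount:{namespace}:{service_account}"
                        ),
                    }
                },
            }
        ],
    }


# Placeholder values for illustration only.
policy = irsa_trust_policy(
    account_id="111122223333",
    oidc_provider="oidc.eks.us-east-1.amazonaws.com/id/EXAMPLE",
    namespace="external-secrets",
    service_account="external-secrets",
)
print(json.dumps(policy, indent=2))
```

The matching service account carries an `eks.amazonaws.com/role-arn` annotation naming this role. Because the `sub` condition pins the namespace and service account, other pods in the cluster cannot assume it.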

The application never receives AWS keys directly.

Instead, External Secrets Operator fetches secrets and materializes them as Kubernetes Secrets at runtime.
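As an illustration of that flow, the sketch below builds an ExternalSecret manifest that maps keys of one Secrets Manager entry into a Kubernetes Secret. The store name, remote key, property names, and refresh interval are hypothetical, and the `apiVersion` is an assumption that depends on the installed operator release.

```python
import json


def external_secret(name, namespace, store_name, remote_key, keys,
                    api_version="external-secrets.io/v1beta1"):
    """Build an ExternalSecret manifest as a plain dict."""
    return {
        "apiVersion": api_version,  # assumption: adjust to the installed release
        "kind": "ExternalSecret",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "refreshInterval": "1h",  # how often the operator re-reads the source
            "secretStoreRef": {"name": store_name, "kind": "ClusterSecretStore"},
            "target": {"name": name},  # name of the Kubernetes Secret to create
            "data": [
                {"secretKey": key, "remoteRef": {"key": remote_key, "property": key}}
                for key in keys
            ],
        },
    }


# Hypothetical names for illustration only.
manifest = external_secret(
    name="app-db-credentials",
    namespace="demo-prod",
    store_name="aws-secrets-manager",
    remote_key="prod/app/db",
    keys=["username", "password"],
)
print(json.dumps(manifest, indent=2))
```

The workload only references the resulting Kubernetes Secret, so its own manifests contain no AWS-specific configuration.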

//Non-Linear System Behavior

A critical insight was that the platform is not governed by one linear pipeline: a different control mechanism dominates each phase.

During infrastructure provisioning:

Terraform dependencies dominate execution order

During application deployment:

ArgoCD sync waves dominate resource ordering

During runtime:

Kubernetes reconciliation dominates failure recovery

Because of this, a single linear deployment pipeline could not express the ordering and recovery behavior of all three phases.
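Sync-wave ordering can be sketched as grouping manifests by the `argocd.argoproj.io/sync-wave` annotation, with lower waves applied first and unannotated resources treated as wave 0. This is a simplified model under those assumptions (Argo CD also orders by phase and resource kind), and the three manifests are hypothetical.

```python
from itertools import groupby

WAVE_ANNOTATION = "argocd.argoproj.io/sync-wave"


def wave_of(manifest):
    """Read the sync wave from a manifest; resources without one are wave 0."""
    annotations = manifest.get("metadata", {}).get("annotations", {})
    return int(annotations.get(WAVE_ANNOTATION, "0"))


def group_by_wave(manifests):
    """Return (wave, [resource names]) pairs in the order waves are applied."""
    ordered = sorted(manifests, key=wave_of)
    return [
        (wave, [m["metadata"]["name"] for m in group])
        for wave, group in groupby(ordered, key=wave_of)
    ]


# Hypothetical manifests: the secret is delivered before the workload starts.
manifests = [
    {"kind": "Deployment", "metadata": {"name": "app"}},
    {"kind": "Job", "metadata": {"name": "smoke-test",
                                 "annotations": {WAVE_ANNOTATION: "1"}}},
    {"kind": "ExternalSecret", "metadata": {"name": "app-db-credentials",
                                            "annotations": {WAVE_ANNOTATION: "-1"}}},
]
waves = group_by_wave(manifests)
for wave, names in waves:
    print(wave, names)
```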

One important discovery:

Provisioning infrastructure and deploying workloads are fundamentally different control loops and should remain decoupled.

Terraform manages cloud resources.

ArgoCD manages cluster desired state.

Mixing both into a single workflow increases operational complexity.

//Lifecycle Testing / Validation

I validated the platform through repeated infrastructure lifecycle testing.

The tests simulated:

Fresh cluster provisioning
Secret rotation events
Drift and accidental resource deletion

During validation, I monitored:

Pod readiness
Sync health
Secret propagation
Deployment recovery
Resource drift

Key findings:

ArgoCD successfully restored deleted resources
Secret rotation propagated without manual intervention
Cluster rebuilds remained deterministic

These findings confirmed the division of responsibilities:

Terraform for infrastructure lifecycle
ArgoCD for workload lifecycle
External Secrets for runtime secret synchronization
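The drift and deletion tests exercise a reconciliation loop: compare the desired state in Git with the live state and compute corrective actions. The following is a deliberately simplified model of that idea, not Argo CD's actual diffing logic; the resource names and specs are made up for the example.

```python
def plan_reconcile(desired, live):
    """List the actions needed to make live state match desired state."""
    actions = []
    for name, spec in sorted(desired.items()):
        if name not in live:
            actions.append(("create", name))  # deleted or never applied
        elif live[name] != spec:
            actions.append(("update", name))  # drifted from Git
    for name in sorted(live.keys() - desired.keys()):
        actions.append(("prune", name))  # exists in cluster but not in Git
    return actions


# Hypothetical states: one drifted resource, one deleted, one stale.
desired = {
    "deployment/app": {"replicas": 3},
    "service/app": {"port": 80},
}
live = {
    "deployment/app": {"replicas": 5},
    "configmap/stale": {},
}
plan = plan_reconcile(desired, live)
print(plan)
```

Running the same comparison on a state that already matches yields an empty plan, which is the idempotence that makes repeated reconciliation safe.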

//CI/CD / Delivery Optimization

Infrastructure improvements were only part of the solution.

The delivery pipeline had inefficiencies:

Manual deployment steps
Weak change traceability
Operational drift

I optimized the delivery workflow using:

GitOps reconciliation
Declarative environment overlays
Automatic drift correction

This reduced:

Deployment time
Manual intervention
Operational overhead

//Evidence / Validation

The improvement was validated using:

ArgoCD sync state
Kubernetes health checks
Terraform outputs
Disaster recovery rebuild tests

The goal was to ensure the results reflected observed platform behavior rather than the architecture as designed on paper.
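Because sync success does not guarantee workload health, a validation check needs to look at both fields of an Argo CD Application's status. The sketch below assumes the Application has already been fetched as JSON (for example with `kubectl get application <name> -n argocd -o json`) and parsed into a dict; the sample statuses are invented for the example.

```python
def is_converged(app):
    """Return True only when the Application is both Synced and Healthy."""
    status = app.get("status", {})
    synced = status.get("sync", {}).get("status") == "Synced"
    healthy = status.get("health", {}).get("status") == "Healthy"
    return synced and healthy


# Invented samples: manifests applied, but pods are still failing.
synced_but_degraded = {
    "status": {"sync": {"status": "Synced"}, "health": {"status": "Degraded"}}
}
fully_converged = {
    "status": {"sync": {"status": "Synced"}, "health": {"status": "Healthy"}}
}

print(is_converged(synced_but_degraded))
print(is_converged(fully_converged))
```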

//Final Outcome

After implementation:

Infrastructure became fully reproducible
Deployments became declarative and self-healing
Secret delivery became keyless and secure

The platform can now handle complete environment provisioning, deployment, recovery, and secret management with minimal manual intervention.

//Key Results

Built a production-style GitOps workflow on AWS EKS
Eliminated static AWS credentials using IRSA
Achieved full infrastructure reproducibility using Terraform
Enabled secure secret synchronization via External Secrets Operator