Project Context
When I started this project, the goal was not simply to run workloads on Kubernetes. It was to build a production-style platform that mirrored how mature engineering teams operate.
Most beginner Kubernetes deployments rely heavily on manual operations:
- Engineers run kubectl apply manually
- Secrets are stored in YAML or CI variables
- Cloud credentials are injected as long-lived static keys
These approaches work for small demos, but reliability and security degrade as environments grow.
My responsibility as a Platform Engineer was to design a secure, reproducible deployment platform that could:
- Provision infrastructure automatically
- Deploy applications declaratively
- Manage secrets without exposing credentials
- Recover entirely from source control
At first glance, the problem appeared to be “just deploying an app on EKS.”
After deeper analysis, the real challenge was establishing a secure control plane around deployments, identity, and secrets.
Understanding the Bottleneck
The architecture consisted of three primary layers:
- Infrastructure layer (VPC, EKS, IAM)
- GitOps control layer (ArgoCD)
- Application layer (Kubernetes workloads)
Initial investigation showed that traditional deployment pipelines fail when secrets and infrastructure lifecycle are tightly coupled.
The key issue was this:
Kubernetes workloads needed runtime secrets from AWS, but exposing AWS credentials inside pods created a major security risk.
This ruled out conventional solutions such as injecting AWS access keys via Kubernetes Secrets.
Deep Observability
To understand system behavior and deployment state, I introduced:
- ArgoCD sync/health dashboards
- Kubernetes event inspection
- Terraform output/state validation
These tools exposed several important behaviors:
- Infrastructure state and application state drift independently
- Sync success does not guarantee workload health
- Secret delivery must occur before workload startup
This investigation revealed that deployment orchestration required dependency-aware sequencing rather than simple resource creation.
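Dependency-aware sequencing can be illustrated with a topological sort. The sketch below is only an illustration: the component names and the dependency map are assumptions made up for this example, not the project's actual resource graph.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical dependency map: each key depends on the components in its set.
DEPENDENCIES = {
    "vpc": set(),
    "eks-cluster": {"vpc"},
    "oidc-provider": {"eks-cluster"},
    "irsa-role": {"oidc-provider"},
    "argocd": {"eks-cluster"},
    "external-secrets-operator": {"argocd", "irsa-role"},
    "app-secrets": {"external-secrets-operator"},
    "app-workload": {"app-secrets"},
}


def rollout_order(dependencies):
    """Return one valid order in which every component follows its dependencies."""
    return list(TopologicalSorter(dependencies).static_order())


order = rollout_order(DEPENDENCIES)
print(" -> ".join(order))
```

More than one valid order exists, so the printed sequence may vary. The property that matters is that secrets always precede the workload that consumes them.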
Architectural Changes
The main architectural improvements included:
1. GitOps-Based Delivery
I implemented:
- ArgoCD app-of-apps architecture
- Environment overlays using Kustomize
- Self-healing reconciliation
Instead of engineers manually deploying workloads, Git became the single source of truth.
This improved:
- Reliability
- Auditability
- Scalability
Every deployment became a Git commit.
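The idea behind self-healing reconciliation can be sketched as a comparison between the desired state in Git and the live state in the cluster. This is a simplified model, not ArgoCD's implementation: the function name, the resource names, and the flat dictionary representation are all assumptions for illustration.

```python
def plan_reconcile(desired, live):
    """Compare desired state (from Git) with live state (from the cluster).

    Both arguments map a resource name to its spec. Returns the actions a
    reconciler would take to make the live state match Git.
    """
    actions = []
    for name, spec in sorted(desired.items()):
        if name not in live:
            actions.append(("create", name))   # deleted or never applied
        elif live[name] != spec:
            actions.append(("update", name))   # drifted from Git
    for name in sorted(live):
        if name not in desired:
            actions.append(("prune", name))    # no longer tracked in Git
    return actions


# Hypothetical states: one drifted resource, one missing, one orphaned.
desired = {"deployment/web": {"replicas": 3}, "service/web": {"port": 80}}
live = {"deployment/web": {"replicas": 5}, "configmap/old": {"k": "v"}}

for action, name in plan_reconcile(desired, live):
    print(action, name)
```

In ArgoCD itself, automatic pruning and self-healing are opt-in settings on an application's automated sync policy, so the "prune" and "update" steps above only happen without human action when those options are enabled.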
2. Keyless Secret Management with IRSA
Previously:
Pods required static AWS credentials to access Secrets Manager.
New design:
I enabled EKS OIDC and configured IAM Roles for Service Accounts (IRSA), allowing the External Secrets Operator to assume a dedicated IAM role using projected service account tokens.
This reduced:
- Credential leakage risk
- Secret sprawl across environments
The application never receives AWS keys directly.
Instead, External Secrets Operator fetches secrets and materializes them as Kubernetes Secrets at runtime.
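The core of IRSA is an IAM trust policy that lets only one Kubernetes service account assume the role through the cluster's OIDC provider. The sketch below builds such a policy as a Python dictionary. The account ID, issuer, namespace, and service account name are placeholders chosen for illustration, not values from this project.

```python
import json


def irsa_trust_policy(account_id, oidc_issuer, namespace, service_account):
    """Build an IAM trust policy that lets one service account assume a role.

    oidc_issuer is the cluster's OIDC issuer URL without the https:// prefix.
    """
    provider_arn = f"arn:aws:iam::{account_id}:oidc-provider/{oidc_issuer}"
    subject = f"system:serviceaccount:{namespace}:{service_account}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Federated": provider_arn},
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        # Restrict the role to exactly one service account.
                        f"{oidc_issuer}:sub": subject,
                        f"{oidc_issuer}:aud": "sts.amazonaws.com",
                    }
                },
            }
        ],
    }


policy = irsa_trust_policy(
    account_id="111122223333",  # placeholder account ID
    oidc_issuer="oidc.eks.us-east-1.amazonaws.com/id/EXAMPLE",  # placeholder
    namespace="external-secrets",       # assumed namespace
    service_account="external-secrets",  # assumed service account name
)
print(json.dumps(policy, indent=2))
```

The service account is then linked to the role through the eks.amazonaws.com/role-arn annotation. Scoping the sub condition to a single namespace and service account is what keeps other pods in the cluster from assuming the same role.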
Non-Linear System Behavior
A critical insight was that the platform does not behave as one linear pipeline: a different control loop governs each phase.
During infrastructure provisioning:
- Terraform dependencies dominate execution order
During application deployment:
- ArgoCD sync waves dominate resource ordering
During runtime:
- Kubernetes reconciliation dominates failure recovery
Because of this, simple deployment pipelines were ineffective.
One important discovery:
Provisioning infrastructure and deploying workloads are fundamentally different control loops and should remain decoupled.
Terraform manages cloud resources.
ArgoCD manages cluster desired state.
Mixing both into a single workflow increases operational complexity.
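Sync-wave ordering on the ArgoCD side can be sketched as grouping manifests by the argocd.argoproj.io/sync-wave annotation. The manifests below are hypothetical and the grouping function is a simplified model of the ordering, not ArgoCD's code.

```python
from collections import defaultdict

SYNC_WAVE = "argocd.argoproj.io/sync-wave"


def group_by_wave(manifests):
    """Group manifests by sync wave, lowest wave first.

    Resources without the annotation fall into wave 0.
    """
    waves = defaultdict(list)
    for manifest in manifests:
        metadata = manifest["metadata"]
        annotations = metadata.get("annotations") or {}
        wave = int(annotations.get(SYNC_WAVE, "0"))
        waves[wave].append(f'{manifest["kind"]}/{metadata["name"]}')
    return [(wave, waves[wave]) for wave in sorted(waves)]


# Hypothetical manifests: the secret is placed in an earlier wave than the
# workload that reads it, and a smoke test runs last.
manifests = [
    {"kind": "Deployment", "metadata": {"name": "web"}},
    {"kind": "ExternalSecret",
     "metadata": {"name": "web-db", "annotations": {SYNC_WAVE: "-1"}}},
    {"kind": "Job",
     "metadata": {"name": "smoke-test", "annotations": {SYNC_WAVE: "1"}}},
]

for wave, resources in group_by_wave(manifests):
    print(wave, resources)
```

Placing secret resources in an earlier wave than the workloads that consume them is one way to express the "secrets before startup" requirement declaratively, while Terraform's dependency graph stays responsible for the cloud resources underneath.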
Load Testing / Validation
I validated the platform through repeated infrastructure lifecycle testing.
The tests simulated:
- Fresh cluster provisioning
- Secret rotation events
- Drift and accidental resource deletion
During validation, I monitored:
- Pod readiness
- Sync health
- Secret propagation
- Deployment recovery
- Resource drift
Key findings:
- ArgoCD successfully restored deleted resources
- Secret rotation propagated without manual intervention
- Cluster rebuilds remained deterministic
These results supported the following division of responsibilities:
- Terraform for infrastructure lifecycle
- ArgoCD for workload lifecycle
- External Secrets for runtime secret synchronization
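A secret-propagation check of the kind described above can be sketched as a comparison between the source value and the decoded Kubernetes Secret. The helper names, the secret name, and the key are assumptions for illustration. A real test would read the Secret from the cluster and the source value from Secrets Manager.

```python
import base64


def secret_matches_source(k8s_secret, key, source_value):
    """Return True when the Kubernetes Secret holds the current source value.

    Kubernetes stores Secret values base64-encoded under the "data" field.
    """
    encoded = k8s_secret.get("data", {}).get(key)
    if encoded is None:
        return False
    return base64.b64decode(encoded).decode("utf-8") == source_value


def make_secret(value):
    """Build a hypothetical Secret manifest holding one password value."""
    return {
        "kind": "Secret",
        "metadata": {"name": "app-db"},  # hypothetical secret name
        "data": {"password": base64.b64encode(value.encode()).decode()},
    }


before_sync = make_secret("old-password")
after_sync = make_secret("new-password")

# After rotation the source holds "new-password"; the check only passes
# once the operator has refreshed the Kubernetes Secret.
print(secret_matches_source(before_sync, "password", "new-password"))
print(secret_matches_source(after_sync, "password", "new-password"))
```

How quickly the second state is reached depends on the refresh interval configured on the ExternalSecret, so a rotation test should poll for a bounded time rather than assert immediately.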
CI/CD / Delivery Optimization
Infrastructure improvements were only part of the solution.
The delivery pipeline had inefficiencies:
- Manual deployment steps
- Weak change traceability
- Operational drift
I optimized the delivery workflow using:
- GitOps reconciliation
- Declarative environment overlays
- Automatic drift correction
This reduced:
- Deployment time
- Manual intervention
- Operational overhead
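The overlay idea can be illustrated with a recursive merge of an environment-specific patch onto a shared base. This is a deliberately simplified model: Kustomize's strategic merge patches handle lists and other cases that this sketch does not, and the field names and values are hypothetical.

```python
import copy


def apply_overlay(base, overlay):
    """Recursively merge overlay into a copy of base; overlay values win."""
    merged = copy.deepcopy(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = apply_overlay(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical base manifest shared by all environments.
base = {
    "kind": "Deployment",
    "metadata": {"name": "web", "labels": {"app": "web"}},
    "spec": {"replicas": 1},
}

# Hypothetical production overlay: only the differences are declared.
prod_overlay = {
    "metadata": {"labels": {"env": "prod"}},
    "spec": {"replicas": 3},
}

prod = apply_overlay(base, prod_overlay)
print(prod)
```

Because each environment declares only its differences from the base, a change to the base reaches every environment through Git, which is where the improved traceability comes from.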
Evidence / Validation
The improvement was validated using:
- ArgoCD sync state
- Kubernetes health checks
- Terraform outputs
- Disaster recovery rebuild tests
The goal was to ensure that the results reflected observed platform behavior rather than theoretical architecture.
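Because sync success does not guarantee workload health, a validation gate needs to check both. The sketch below assumes the status has already been fetched as a dictionary shaped like the status block of an ArgoCD Application. The function name and the sample data are hypothetical.

```python
def is_deployment_verified(app_status):
    """Return True only when the application is both Synced and Healthy."""
    sync = app_status.get("sync", {}).get("status")
    health = app_status.get("health", {}).get("status")
    return sync == "Synced" and health == "Healthy"


# Hypothetical statuses: manifests were applied, but pods are failing.
synced_but_degraded = {
    "sync": {"status": "Synced"},
    "health": {"status": "Degraded"},
}
synced_and_healthy = {
    "sync": {"status": "Synced"},
    "health": {"status": "Healthy"},
}

print(is_deployment_verified(synced_but_degraded))
print(is_deployment_verified(synced_and_healthy))
```

Treating a missing or unknown status as a failure, as this sketch does, keeps a rebuild test from passing simply because the status could not be read.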
Final Outcome
After implementation:
- Infrastructure became fully reproducible
- Deployments became declarative and self-healing
- Secret delivery became keyless and secure
The platform can now reliably handle complete environment provisioning, deployment, recovery, and secret management with minimal manual intervention.
Key Results
- Built production-style GitOps workflow on AWS EKS
- Eliminated static AWS credentials using IRSA
- Achieved full infrastructure reproducibility using Terraform
- Enabled secure secret synchronization via External Secrets Operator
