On Monday, Oct. 20, 2025, Amazon Web Services (AWS) experienced a large-scale outage that knocked out large portions of the internet. Below is a technical overview of what happened and what it means for organizations.
Timeline & core data
- The outage reportedly began around 07:11 UTC (08:11 Dutch time) on Oct. 20.
- AWS indicated that the outage originated in its US-EAST-1 (Northern Virginia) region.
- The immediate cause: a failure in Domain Name System (DNS) resolution for a key API endpoint of the Amazon DynamoDB database service. In other words, the API servers could not be found because the internet's 'phonebook' (DNS) failed.
- That DNS failure made specific AWS services unreachable, leading to increased error rates and delays in multiple other AWS components.
- During the morning/afternoon, AWS indicated that the core issues were “fully resolved” but that there were still systems with backlogs or residual processing, such as AWS Lambda.
- The outage affected hundreds to thousands of other services and applications worldwide that depend on AWS infrastructure.
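To make the failure mode concrete: DNS resolution is the step every client performs before it can make an API call, so when it fails, a service becomes unreachable even if its servers are healthy. A minimal Python sketch (the demo lookups use placeholder names; only the `dynamodb.us-east-1.amazonaws.com` hostname mentioned in the docstring is the real regional endpoint):

```python
import socket

def resolve(hostname: str) -> list[str]:
    """Return the IP addresses a hostname resolves to.

    This is the lookup that failed for the regional DynamoDB API
    endpoint (dynamodb.us-east-1.amazonaws.com) during the outage.
    """
    try:
        infos = socket.getaddrinfo(hostname, 443, proto=socket.IPPROTO_TCP)
        return sorted({info[4][0] for info in infos})
    except socket.gaierror as exc:
        # What clients experienced: not "server error" but "server not found".
        raise RuntimeError(f"DNS resolution failed for {hostname}") from exc

addresses = resolve("localhost")        # a healthy lookup returns addresses
try:
    resolve("does-not-exist.invalid")   # the broken case: no DNS answer
except RuntimeError as err:
    print(err)
```

Note that the application code and the servers behind the endpoint can be completely healthy; the failure happens before any connection is even attempted.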
Why this is relevant to our customers
As a customer of Analyst ICT, it is important to understand how such an outage can impact your organization, even if your infrastructure does not run directly on AWS. Analyst ICT does not build infrastructure on AWS, but many software vendors do.
Dependencies in the chain
Many modern applications, websites, back-end systems and cloud services are built on (or use) AWS components: storage (S3), compute (EC2, Lambda), databases (DynamoDB, RDS), networking, DNS, and so on. When one component fails (here, the DNS for DynamoDB), a chain reaction follows:
- Your own systems running directly on AWS can fail or become very slow.
- Third-party services (externally delivered apps, SaaS products, integrations) may fail because they depend on AWS components.
- You notice not only loss of functionality, but also delays, errors, and inaccessibility.
- Residual effects: buffers, queues, and backlogs (e.g., of messages or events) take time to clear.
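The chain reaction above can be sketched in a few lines: a front-end service calls a dependency, and when that dependency is unreachable it either fails outright or, if designed for it, serves a degraded response. The function and backend names below are illustrative, not part of any real AWS API:

```python
def fetch_profile(user_id, backend):
    """Call a downstream dependency; degrade gracefully when it fails."""
    try:
        return backend(user_id)
    except TimeoutError:
        # Chain reaction: the dependency is unreachable, so we serve a
        # degraded (cached or placeholder) response instead of an error.
        return {"user_id": user_id, "profile": None, "degraded": True}

def healthy_backend(user_id):
    # Stand-in for a working dependency (e.g., a database lookup).
    return {"user_id": user_id, "profile": {"name": "demo"}, "degraded": False}

def failing_backend(user_id):
    # Stand-in for a dependency caught in the outage.
    raise TimeoutError("dependency unreachable")

print(fetch_profile(1, healthy_backend))
print(fetch_profile(1, failing_backend))
```

Services without such a fallback path simply propagate the failure upward, which is exactly how a single DNS problem becomes visible in thousands of unrelated applications.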
Infrastructure-level risks
- Concentration risk: one large cloud provider (such as AWS) underpins much of the internet's infrastructure, so downtime is felt by many organizations at once.
- Chain reactions: a technical error (such as DNS) in the foundation ripples into the layers above it.
- SLAs and recovery: even if AWS says "recovered," residual issues can linger; working through them is often slower than the core fix.
What can organizations do?
Given what has happened, it is wise to consider how you (or Analyst ICT on your behalf) have set up your infrastructure and dependencies. Here are some points of interest:
- Mapping dependencies
- What systems are running on AWS services? Are they critical to your business?
- What external services or SaaS products do you use that run on AWS?
- Are there single points of failure through one cloud provider or region?
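A dependency map does not have to be elaborate to be useful. As a minimal sketch, the inventory below (hypothetical services and locations, not a real customer mapping) flags where multiple services share a single provider or region, i.e., a concentration risk:

```python
def concentration_risks(dependencies):
    """Given a service -> provider/region mapping, return the locations
    that underpin more than one service (potential single points of failure)."""
    by_location = {}
    for service, location in dependencies.items():
        by_location.setdefault(location, []).append(service)
    return {loc: svcs for loc, svcs in by_location.items() if len(svcs) > 1}

deps = {  # illustrative inventory, not a prescribed setup
    "webshop": "aws:us-east-1",
    "crm-saas": "aws:us-east-1",
    "mailserver": "on-prem",
}
print(concentration_risks(deps))
```

Even a spreadsheet-level version of this exercise, done before an outage, tells you in minutes which customers and systems to warn first.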
- Resilience and fallback strategies
- Consider a multi-region or multi-cloud approach: if US-EAST-1 fails, do you have alternatives?
- For critical services: timely backups, redundancy, and fail-over mechanisms.
- Monitor external dependencies: know when a third-party service (hosted on AWS) runs into problems.
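A fail-over mechanism can be as simple as trying regional endpoints in order and returning the first success. The sketch below simulates the outage scenario; the region names are real AWS region identifiers, but the handlers are stand-ins, not a prescribed architecture:

```python
def call_with_failover(request, endpoints):
    """Try each (name, handler) endpoint in order; return the first success."""
    errors = []
    for name, handler in endpoints:
        try:
            return name, handler(request)
        except Exception as exc:
            errors.append((name, str(exc)))  # record and try the next region
    raise RuntimeError(f"all regions failed: {errors}")

def us_east_1(request):
    # Simulating the Oct. 20 scenario: the primary region is unreachable.
    raise ConnectionError("DNS resolution failed")

def eu_west_1(request):
    return {"status": 200, "body": f"handled {request}"}

region, response = call_with_failover(
    "GET /orders", [("us-east-1", us_east_1), ("eu-west-1", eu_west_1)]
)
print(region, response)
```

In practice the hard part is not the loop but everything around it: replicated data, consistent state, and DNS or load-balancer configuration that actually routes traffic to the alternative region.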
- Communication & recovery plan
- During outages, prompt communication is important - internally and externally (customers).
- Recovery plan: not just fixes, but aftercare (such as clearing backlogs, emptying queues).
- Customer service ready: provide clarity, answer questions.
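The aftercare step of clearing backlogs and emptying queues is often the slowest part of recovery, as AWS itself saw with Lambda. A simplified sketch, assuming a retry budget and a dead-letter store for items that keep failing (all names here are illustrative):

```python
from collections import deque

def drain_backlog(queue, process, max_retries=3):
    """Work through a backlog after recovery, re-queueing transient failures.

    Returns (processed results, items moved to a dead-letter store).
    """
    processed, dead_letter = [], []
    retries = {}
    while queue:
        item = queue.popleft()
        try:
            processed.append(process(item))
        except Exception:
            retries[item] = retries.get(item, 0) + 1
            if retries[item] <= max_retries:
                queue.append(item)       # transient failure: try again later
            else:
                dead_letter.append(item)  # give up; park for manual review
    return processed, dead_letter

backlog = deque(["evt-1", "evt-2", "evt-3"])
done, dead = drain_backlog(backlog, lambda evt: evt.upper())
print(done, dead)
```

The retry budget matters: without it, one permanently broken item keeps the queue spinning forever, and the backlog never empties.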
- Evaluation and lessons learned
- Analyze what exactly the impact was on your organization due to this outage.
- Ask questions such as: "What went wrong?", "How quickly did we notice?", "Which systems had problems?", "How quickly was recovery visible?"
- Use the outage as input for architecture and process improvements.
The outage at AWS on Oct. 20, 2025 is a harsh reminder that even the largest cloud provider is vulnerable. For organizations - including Analyst ICT's clients - this means above all: be prepared. The question is not whether it will happen again, but how you react when it does. Structure, insight into dependencies, and good plans make the difference.
Would you like us to walk through your infrastructure and dependency map together to see where you are vulnerable? We are happy to help.




