DNS name resolution failures
Incident Report for CodeShip
Postmortem

On July 10, 2017, between 19:13 UTC and 19:42 UTC, we detected problems with resolving codeship.com domain names such as www.codeship.com and app.codeship.com. The problem occurred at our DNS provider immediately following an update we made to our DNS records. As with other components of our infrastructure, we store our DNS records as code in a Github repository and use CI/CD to review, test and deploy the record changes to our provider via API as described in this blog post: https://blog.codeship.com/dnsimple-dns-history-continuous-deployment/ . Although we have been running the described system for over three years without failure, the deployment of our latest zone update caused an inconsistency on DNS servers deployed across the providers’ network— some servers resolved the names correctly while others had no records at all for our domain (!!). The Support team at the provider corrected the problem quickly by forcing a refresh allowing all DNS servers to start resolving correctly. Upon discussing the code that we are using for our deploy (found here: https://github.com/codeship/dns_deploy), they have provided some initial suggestions on changes we can make to ensure consistency while we are awaiting response from them on the root cause of DNS server records inconsistency.

Posted Jul 11, 2017 - 19:39 UTC

Resolved
This incident has been resolved.
Posted Jul 10, 2017 - 20:49 UTC
Monitoring
The issue was resolved by our DNS provider. We are working with them to identify root cause.
Posted Jul 10, 2017 - 19:42 UTC
Investigating
We are investigating DNS name resolution failures for codeship.com domains, i.e. app.codeship.com
Posted Jul 10, 2017 - 19:13 UTC