GitHub vs Compliance: Why You Need Backups for ISO 27001 and SOC 2

Laurent Lemaire

Co-founder, SimpleBackups

August 15, 2024

GitHub is an essential tool for developers and businesses worldwide, offering a platform where code can be stored, shared, and collaborated on easily.

However, a common misconception persists: the belief that GitHub’s built-in redundancy is all you need to safeguard your code.

This myth can lead to a false sense of security.

In this article, we’ll debunk the myth of GitHub’s built-in redundancy, explain what GitHub’s terms of service actually say about data protection, and explore how ISO 27001 and SOC 2 compliance requirements relate to code stored on such services.

Understanding GitHub's Built-in Redundancy

GitHub does employ a sophisticated system of redundancy and backups to ensure the availability and integrity of their platform.

Their infrastructure is designed to protect against hardware failures and data corruption, and to keep their services highly available.
However, it’s important to understand what this means and, more critically, what it does not mean.

🔁 Platform Resilience: GitHub’s redundancy is primarily focused on keeping the platform itself operational. This includes replicating data across multiple servers and data centers to prevent downtime in the event of hardware failure. However, this redundancy is not designed with individual user needs in mind; it’s about ensuring GitHub’s service continuity, not about backing up your specific data for your specific recovery needs.

⛈️ Disaster Recovery: GitHub’s internal backups are intended for disaster recovery at the platform level. In other words, these backups are meant to restore the entire GitHub service in the event of a catastrophic failure, not to recover individual user data or repositories on a case-by-case basis.

❌ No User Access to Backups: One of the most significant limitations is that GitHub users do not have direct access to the platform’s internal backups. If you accidentally delete a repository, GitHub’s internal backups are not there for you to access and restore that data. This is a critical distinction—GitHub’s backups are for their operational recovery, not for user-level data recovery.

What GitHub's Terms of Service Say

To truly understand the limitations of relying solely on GitHub’s built-in redundancy, it’s essential to look at GitHub’s own terms of service. Here’s what GitHub outlines regarding data protection and user responsibility:

  1. Responsibility for Data: GitHub’s terms clearly state that users are responsible for maintaining their own backups of their content. While GitHub strives to provide reliable service, they do not guarantee the preservation of data stored on their platform. This means that if your data is lost, corrupted, or deleted, GitHub is not liable for its recovery.
  2. No Guarantee of Data Availability: The terms of service also make it clear that GitHub does not guarantee the availability or recoverability of your data. They explicitly recommend that users maintain their own independent backups to protect against data loss.
  3. Limitation of Liability: GitHub limits its liability concerning data loss, placing the onus on the user to protect their data. This is a standard practice in the industry, but it underscores the fact that relying on GitHub’s built-in redundancy is not a substitute for a dedicated backup strategy.

These points are crucial for understanding why trusting GitHub’s internal redundancy is not enough. The platform is not responsible for ensuring that your individual data is backed up or recoverable; that responsibility lies with you.
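
As an illustration of what taking on that responsibility can look like, here is a minimal sketch of an independent backup built on `git clone --mirror`, which copies every branch and tag of a repository. The helper names (`mirror_command`, `backup_destination`, `backup_repo`), the folder layout, and the example URL are assumptions made for this sketch, not part of any GitHub or SimpleBackups API. A mirror covers Git data only; issues, pull requests, and other platform metadata would need a separate export.

```python
import subprocess
from datetime import datetime, timezone
from pathlib import Path


def mirror_command(repo_url: str, dest: Path) -> list[str]:
    """Build the git command that creates a full mirror (all refs) of a repository."""
    return ["git", "clone", "--mirror", repo_url, str(dest)]


def backup_destination(backup_root: Path, repo_url: str, now: datetime) -> Path:
    """Derive a timestamped folder, e.g. backups/example-repo/20240815T120000Z.git."""
    # Hypothetical naming scheme: last URL segment without the ".git" suffix.
    name = repo_url.rstrip("/").rsplit("/", 1)[-1].removesuffix(".git")
    return backup_root / name / (now.strftime("%Y%m%dT%H%M%SZ") + ".git")


def backup_repo(repo_url: str, backup_root: Path) -> Path:
    """Clone a mirror of repo_url into storage that you control."""
    dest = backup_destination(backup_root, repo_url, datetime.now(timezone.utc))
    dest.parent.mkdir(parents=True, exist_ok=True)
    # check=True raises CalledProcessError if the clone fails, so a failed
    # backup is never silently treated as a success.
    subprocess.run(mirror_command(repo_url, dest), check=True)
    return dest
```

Calling `backup_repo("https://github.com/example-org/example-repo.git", Path("backups"))` on a schedule would keep dated copies outside GitHub, provided `git` is installed and the process has credentials for the repository.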

Your GitHub Data in the Context of Compliance

With that in mind, let's now look at what the two major compliance standards expect you to do with your GitHub data.

ISO 27001 Compliance and Code Protection

ISO 27001 is an international standard for information security management, and it’s increasingly adopted by organizations looking to demonstrate their commitment to data protection. When it comes to code stored on services like GitHub, ISO 27001 has specific implications.

  1. Asset Management: Under ISO 27001, your code is considered an asset that needs to be identified, classified, and protected. This means that your organization must have policies and procedures in place to ensure the security of your code, including backups.
  2. Data Integrity and Availability: ISO 27001 requires organizations to ensure the integrity and availability of their data. This involves implementing controls that protect against data corruption and unauthorized access, and that ensure data can be recovered in the event of loss. While GitHub’s redundancy might cover some aspects of availability at a platform level, it does not guarantee the integrity or availability of your specific codebase.
  3. Backup and Recovery: The standard explicitly requires that backups be made and maintained to ensure that data can be recovered. This means that relying solely on GitHub’s built-in redundancy is not enough to meet ISO 27001 requirements. You need an independent backup solution that allows you to restore your code in case of accidental deletion, corruption, or other forms of data loss.
  4. Regular Audits and Testing: ISO 27001 requires regular audits and testing of your security measures, including backups. This means you must regularly test your ability to recover your code from backups to ensure that your processes are effective and meet the standard’s requirements.
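
The recovery testing described in point 4 can be automated. The sketch below assumes the backup is a local Git mirror (a bare repository) and runs two checks: `git fsck --full` for object integrity, then a trial `git clone` into a scratch directory to prove the backup can actually be restored. `RestoreTestResult` and `run_restore_test` are hypothetical names, and what counts as sufficient evidence for an auditor is something to confirm with your own auditor.

```python
import subprocess
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Sequence


@dataclass
class RestoreTestResult:
    mirror: str
    integrity_ok: bool
    restore_ok: bool

    @property
    def passed(self) -> bool:
        return self.integrity_ok and self.restore_ok


def _succeeds(cmd: Sequence[str]) -> bool:
    """Run a command and report whether it exited with status 0."""
    return subprocess.run(cmd, capture_output=True).returncode == 0


def run_restore_test(
    mirror: Path,
    succeeds: Callable[[Sequence[str]], bool] = _succeeds,
) -> RestoreTestResult:
    # Step 1: check object integrity inside the backup itself.
    integrity_ok = succeeds(["git", "-C", str(mirror), "fsck", "--full"])
    # Step 2: restore into a scratch directory, as a real recovery would.
    with tempfile.TemporaryDirectory() as scratch:
        restored = Path(scratch) / "restored"
        restore_ok = succeeds(["git", "clone", str(mirror), str(restored)])
    return RestoreTestResult(str(mirror), integrity_ok, restore_ok)
```

The `succeeds` parameter is injectable so the logic can be exercised without touching a real repository; in normal use, `run_restore_test(Path("backups/example-repo/20240815T120000Z.git")).passed` gives a single pass/fail value to record.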

In summary, ISO 27001 compliance demands a proactive approach to data protection that goes beyond relying on GitHub’s built-in redundancy. An independent backup solution is necessary to ensure that you meet the standard’s requirements for data integrity, availability, and recoverability.

SOC 2 Compliance and Code Protection

System and Organization Controls (SOC) reports, formerly known as Service Organization Control reports, and specifically SOC 2, are designed to show that service providers manage data securely to protect the privacy and interests of their clients. SOC 2 compliance is particularly relevant for SaaS providers and organizations that handle sensitive information. Here’s how it relates to code stored on GitHub:

  1. Security and Availability Principles: SOC 2 is based on five trust service principles (now formally called the Trust Services Criteria): security, availability, processing integrity, confidentiality, and privacy. When it comes to code stored on GitHub, the principles of security and availability are particularly relevant. SOC 2 requires that your organization implement robust security measures to protect against unauthorized access and ensure that your code is available and recoverable.
  2. Data Protection Controls: To achieve SOC 2 compliance, your organization must demonstrate that you have implemented adequate controls to protect your data, including code repositories. This includes ensuring that your code is backed up and that these backups can be accessed and restored when needed.
  3. Independent Backups: Just like ISO 27001, SOC 2 compliance requires independent backups. GitHub’s internal redundancy does not satisfy this requirement because it does not provide you with control over your backups or guarantee the ability to recover specific data. A third-party backup solution is necessary to meet SOC 2 standards.
  4. Incident Response and Recovery: SOC 2 compliance also involves having a robust incident response and recovery plan. This means that if your code is compromised or lost, you need to be able to quickly restore it from backups. Relying on GitHub alone leaves a gap in this plan, as their built-in redundancy does not support quick and reliable recovery at the user level.
  5. Audit Trail and Documentation: SOC 2 requires that you maintain detailed records of your data protection practices, including backups. This documentation must show that your backups are regularly tested and that you can restore your code if needed. An independent backup solution typically provides these features, whereas GitHub’s built-in systems do not.
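
The record keeping in point 5 can be as simple as an append-only log with one entry per backup. The sketch below writes a JSON line containing a timestamp, the archive's size and SHA-256 checksum, and whether a restore test was run. The field names and the `record_backup_event` helper are assumptions chosen for illustration, not a format that SOC 2 prescribes.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Checksum a file in 1 MiB chunks so large archives do not fill memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_backup_event(
    log_file: Path, archive: Path, repo: str, restore_tested: bool
) -> dict:
    """Append one audit entry describing a completed backup."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "repository": repo,
        "archive": archive.name,
        "size_bytes": archive.stat().st_size,
        "sha256": sha256_of(archive),
        "restore_tested": restore_tested,
    }
    # Append mode keeps earlier entries intact, one JSON object per line.
    with log_file.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")
    return entry
```

Storing the checksum alongside each entry lets you later show that the archive you restore is the same one that was backed up.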

Conclusion

Both ISO 27001 and SOC 2 compliance require organizations to take active measures to protect their code and ensure that it is available and recoverable.

If there is one thing to remember, it is that compliance standards require proper external backups and a restore procedure. And frankly, you don't have to be certified, or even seeking compliance with these standards, to understand that for any tech company this is critical, even though it is often misunderstood.

Relying solely on GitHub’s built-in redundancy is not sufficient to meet these compliance standards. While GitHub does offer a resilient platform, their redundancy measures are not designed to meet the specific needs of individual users or to comply with stringent data protection requirements.

To achieve compliance with ISO 27001 and SOC 2, your organization needs an independent backup solution that gives you control over your data, allows for granular recovery, and ensures that you can meet all relevant standards for data protection. By taking these steps, you not only protect your code but also ensure that your organization is compliant with the highest standards of information security and data protection.


