Nothing is foolproof when it comes to guaranteeing uptime. We could quote a crazy uptime percentage guarantee here, but instead, we will simply let the daily uptime stats speak for themselves. Want to see the current uptime status? Head over to our dedicated uptime page. If you have immediate questions or concerns about our uptime, get in touch.
Rustici Software has implemented a two-pronged approach to infrastructure and application security. We utilize the SANS 20 Critical Controls (20 CC) framework as our set of guiding principles for infrastructure and network security management. The 20 Critical Controls encompass all aspects of network management, covering topics ranging from encryption to protecting against social engineering attacks.
On the application security side, we have implemented the Open Web Application Security Project (OWASP) framework. In addition to using OWASP to guide our development and coding practices, we regularly penetration-test our application in a staging environment to ensure that new code changes do not introduce vulnerabilities.
Backups and Disaster Recovery
All the backups in the world are useless if you can’t use them to recover data and infrastructure. We’ve designed our backup and disaster recovery strategy around two questions:
- If there’s a disaster, how fast can we recover?
- Have we done a test restore this week? If not, do one now.
To this end, we’ve taken the following measures:
- Our Web and Database tiers are split between two physical AWS availability zones. Both availability zones are provisioned such that they can handle a full system load, so that if one availability zone fails, the other can handle all system traffic without issue.
- SCORM Cloud is served primarily from the AWS US-East-1 region in N. Virginia. We have designed our configuration management systems so that we can stand up SCORM Cloud in the US-West-2 region in Oregon within 2 hours of a disaster that takes the entirety of the US-East-1 region offline.
- We maintain backups of our MySQL databases in AWS regions on both the East and West Coasts.
- We maintain active read-replica databases in the US-West-2 region. In the event of a disaster that takes out multiple US-East-1 availability zones, we still have up-to-the-minute data in US-West-2.
- We maintain content in version-controlled S3 buckets with a 60-day expiration period.
- We synchronize our production S3 buckets to backup buckets with US-West-2 endpoints.
- Every time we deploy a new version of SCORM Cloud, we’re also running a test of our DR systems. Web instances are built from scratch. API and interface tests are run and validated. Databases are restored from snapshots. CloudFormation scripts rebuild infrastructure.
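The "have we done a test restore this week?" question above lends itself to an automated check. The following is a minimal sketch of that idea, not a description of our actual tooling: the function name, the seven-day window, and the way the last restore time is supplied are all assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical policy: a test restore must have succeeded within the last 7 days.
MAX_RESTORE_AGE = timedelta(days=7)


def restore_test_is_due(last_successful_restore, now=None):
    """Return True when no test restore has succeeded within MAX_RESTORE_AGE."""
    if now is None:
        now = datetime.now(timezone.utc)
    if last_successful_restore is None:
        # A backup that has never been restored counts as overdue.
        return True
    return now - last_successful_restore > MAX_RESTORE_AGE
```

A scheduled job could call `restore_test_is_due` with the timestamp of the last successful restore and raise an alert whenever it returns `True`.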
SCORM Cloud utilizes a Content Distribution Network (CDN), Amazon CloudFront. This offers drastic performance improvements for learners launching course content, especially for learners not located in North America. Learn more about SCORM Cloud CDN here.
On the infrastructure side, we utilize Amazon Web Services for every aspect of SCORM Cloud. We lean on AWS’ expertise in data center and infrastructure management so that we can concentrate on what we do best – developing great software.
- All of our infrastructure is segmented using AWS’ Virtual Private Cloud (VPC) features. We use VPCs to keep our data and web servers carefully segregated, and to further separate our development and staging environments from production.
- We use Amazon S3 for content storage. Access to course content is granted on a per-course basis to end-users by SCORM Cloud during the launch process for a limited time. In other words, course content is not publicly exposed.
- We use Amazon RDS for MySQL database services. Our RDS instances are in a multi-availability zone configuration, so the loss of a single geographic location or database server maintenance won’t impact SCORM Cloud’s availability. We also replicate all MySQL databases to RDS instances in the AWS US-West-2 region in Oregon, and maintain database backups in both US-East-1 and US-West-2 regions.
- Our web servers are located in multiple physical availability zones in the US-East-1 region.
- We make extensive use of AWS’ CloudTrail and CloudWatch features to monitor our systems’ availability and to audit access. We have real-time alerts and change management configured that give us visibility into all changes to the environment.
- We use AWS’ IAM identity management features to facilitate and audit access to all AWS resources.
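To illustrate the per-course, time-limited access described in the list above, here is a minimal sketch of an expiring, signed access grant. It is a generic illustration of the technique rather than SCORM Cloud's implementation (services such as S3 offer presigned URLs for this purpose); the key, function names, field names, and default lifetime are all hypothetical.

```python
import hashlib
import hmac
import time

# Hypothetical key for illustration only; a real key would come from a secrets store.
SECRET_KEY = b"example-only-secret"


def sign_course_access(course_id, learner_id, expires_at, key=SECRET_KEY):
    """Compute an HMAC-SHA256 signature over the grant's fields."""
    message = f"{course_id}:{learner_id}:{expires_at}".encode()
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def grant_access(course_id, learner_id, ttl_seconds=300, now=None):
    """Issue a grant for one course that expires ttl_seconds from now."""
    now = int(time.time()) if now is None else now
    expires_at = now + ttl_seconds
    return {
        "course_id": course_id,
        "learner_id": learner_id,
        "expires_at": expires_at,
        "signature": sign_course_access(course_id, learner_id, expires_at),
    }


def is_access_valid(grant, course_id, now=None, key=SECRET_KEY):
    """Accept a grant only for its own course, before expiry, with an intact signature."""
    now = int(time.time()) if now is None else now
    expected = sign_course_access(
        grant["course_id"], grant["learner_id"], grant["expires_at"], key
    )
    return (
        grant["course_id"] == course_id
        and now < grant["expires_at"]
        and hmac.compare_digest(expected, grant["signature"])
    )
```

Because the expiry time is part of the signed message, a client cannot extend a grant by editing `expires_at`, and a grant issued for one course does not validate for another.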
For a complete overview of Amazon Web Services’ security practices and regulatory compliance information, please see AWS’ own security and compliance documentation.
Monitoring and Availability
We’ve built extensive in-house monitoring for SCORM Cloud using Prometheus. Using Prometheus, we have visibility into server statistics like CPU, disk, and network usage; application metrics like JVM memory usage, JDBC response times, HTTP request rates, response times, and status codes; and database metrics like query rates and response times, hot indices, lock contention, and expensive queries. We use these to resolve problems (hopefully) before customers ever notice them.
As a backup, we also use AWS CloudWatch to monitor system-level metrics.
We also have a tertiary integration we built with Pingdom that alerts us to system-wide outages via phone calls, Slack messages, texts, emails, and carrier pigeons (well, not that last item, but if we thought it would help, we would do it). Want to see real-time status? Head over here to our uptime page.
We have an on-call rotation implemented using OpsGenie. Prometheus (via Alertmanager), CloudWatch, and Pingdom can and do page a real developer if things go wrong. We strive to keep our status page up-to-date with the latest information in case of an incident.
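As a rough illustration of how a metric such as HTTP status codes can turn into a page, the sketch below computes a server-error ratio and compares it with a threshold. In practice this kind of rule would be expressed in the monitoring system itself; the function names and the 5% threshold here are assumptions chosen for the example, not our production settings.

```python
def error_ratio(status_counts):
    """Fraction of requests that returned a 5xx status.

    status_counts maps an HTTP status code to the number of responses seen.
    """
    total = sum(status_counts.values())
    if total == 0:
        return 0.0  # no traffic means no errors to report
    errors = sum(
        count for status, count in status_counts.items() if 500 <= status <= 599
    )
    return errors / total


def should_page(status_counts, threshold=0.05):
    """Return True when the 5xx ratio exceeds the (hypothetical) paging threshold."""
    return error_ratio(status_counts) > threshold
```

For example, `should_page({200: 90, 500: 10})` returns `True` because 10% of the responses were server errors.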
We built out our SIEM system atop the ELK (Elasticsearch, Logstash, and Kibana) Stack. All of our production instances forward their log data to Logstash, which parses the input, tags it according to rules, and then passes it along to our Elasticsearch cluster for indexing. We have built extensive dashboards in Kibana that give us visibility into the log data. Among other things, we use the ELK stack for monitoring remote administrative access, for per-instance change-control, and for monitoring application-level authentication events.
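The parse-and-tag step can be pictured with a small sketch. This is plain Python rather than Logstash configuration, and the rule names, patterns, and event fields are hypothetical examples, not our production rules.

```python
import re

# Hypothetical tagging rules: (tag, pattern). Named groups become event fields.
TAG_RULES = [
    ("ssh_login", re.compile(r"sshd\[\d+\]: Accepted \w+ for (?P<user>\S+)")),
    ("auth_failure", re.compile(r"authentication failure|Failed password", re.IGNORECASE)),
    ("sudo", re.compile(r"sudo: ")),
]


def parse_and_tag(line, host):
    """Turn a raw log line into an event dict tagged by every matching rule."""
    event = {"host": host, "message": line, "tags": []}
    for tag, pattern in TAG_RULES:
        match = pattern.search(line)
        if match:
            event["tags"].append(tag)
            event.update(match.groupdict())  # copy captured fields, if any
    return event
```

An indexer can then filter on `tags` to build views such as "all remote administrative logins" without re-parsing the raw message.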
We tightly restrict access to our production databases and content stores: only a core group of developers and dev/ops folks have access to the databases and S3 buckets where the content resides.
Remote access to all production resources occurs via a restricted network here at Rustici Software. All access to our production resources occurs over an IPsec VPN connection.
Administrative access to production systems is extensively logged and audited. Database credentials are acquired via our Vault service, which maintains an audit log of both administrators and systems accessing our database. We use ThreatStack along with our SIEM ELK stack to maintain audit logs of all user activity on our application servers.
We make extensive use of AWS’ CloudTrail system to log API and console access to AWS resources. AWS access is controlled: each person that is able to log in is granted an explicit set of rights necessary to do what they need to do, and no more. We do not use AWS root keys for access at any time – all access is managed via AWS IAM services. Each administrator accesses the system using a unique set of credentials, keypairs, or API keys.
During our build process, we scan our application for CVEs listed in the NIST National Vulnerability Database (NVD). If any CVEs are found, the build fails, so we can’t deploy a vulnerable version of the application. This forces us to fix in-application vulnerabilities on the spot.
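A build gate of this kind can be sketched in a few lines. The example below assumes the scan works by matching dependency versions against a vulnerability feed, which is an assumption made for illustration; the library names and the CVE identifier are placeholders, not real data, and the function names are hypothetical.

```python
import sys


def find_vulnerable(dependencies, known_cves):
    """Return {"name==version": [cve ids]} for every dependency with known CVEs.

    dependencies maps a package name to its version; known_cves maps a
    (name, version) tuple to a list of CVE identifiers.
    """
    findings = {}
    for name, version in sorted(dependencies.items()):
        cves = known_cves.get((name, version), [])
        if cves:
            findings[f"{name}=={version}"] = list(cves)
    return findings


def gate_build(dependencies, known_cves):
    """Report findings on stderr and return a process exit code (0 = pass, 1 = fail)."""
    findings = find_vulnerable(dependencies, known_cves)
    for dependency, cves in findings.items():
        print(f"VULNERABLE {dependency}: {', '.join(cves)}", file=sys.stderr)
    return 1 if findings else 0
```

A build script would pass the result of `gate_build` to `sys.exit`, so that any finding stops the pipeline before a deployable artifact is produced.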
Our application servers run ThreatStack, a host-based intrusion detection system that records all commands (and syscalls) made on those servers. Suspicious commands (like “wget” or “curl”) immediately trigger an alert. ThreatStack also analyzes our AWS CloudTrail logs for alarming changes and alerts on those (IAM policy changes and security group changes being the most important). ThreatStack also analyzes operating system packages and reports on vulnerabilities daily, so that we know if we have a system with vulnerable software on it.
Our AWS account has GuardDuty configured, a network-based intrusion detection system offered by AWS. GuardDuty analyzes AWS VPC flow logs, DNS lookup requests, and more to catch suspicious behavior. Some examples: an IAM user logged in from a network that is unusual, or an instance is communicating with a remote host on an unusual port, or an instance is communicating with a cryptocurrency mining network, and so on.
SCORM Cloud utilizes a Blue/Green production infrastructure, which is to say that at any given time our production stack is running on one of two environments. When it is time to update the production environment, we build a fresh set of Amazon Machine Images (AMIs) and deploy them to whichever stack is inactive. We build our AMIs using a pipeline consisting of Ansible, Packer, Jenkins, and StackStorm. This build pipeline allows us to represent all aspects of the systems’ configuration as code, and removes the potential for human error from the build and deployment process. The entire production stack can be rebuilt from scratch, tested, and deployed in under two hours.
By using immutable master AMIs in this fashion, we gain a major security advantage: even if an instance were to be compromised, we can redeploy the system with absolute assurance that the compromise is no longer viable, because the compromised instance is immediately destroyed and replaced with an instance built from trusted sources.
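The Blue/Green flow can be summarized in a short model. This is a simplified sketch of the general pattern, not our deployment code: the class, the `release` method, the smoke-test callback, and the image identifiers are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class BlueGreen:
    """Tracks which of two stacks serves traffic and which image each one runs."""

    active: str = "blue"
    images: dict = field(default_factory=lambda: {"blue": None, "green": None})

    @property
    def inactive(self):
        return "green" if self.active == "blue" else "blue"

    def release(self, image_id, smoke_test):
        """Deploy image_id to the idle stack and switch traffic only if tests pass."""
        target = self.inactive
        self.images[target] = image_id  # fresh image goes to the idle stack
        if not smoke_test(target):
            return False  # the active stack keeps serving, untouched
        self.active = target  # cut traffic over to the newly built stack
        return True
```

Because a release never modifies the stack that is serving traffic, a failed build or failed test leaves production exactly as it was.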
Protecting your privacy is essential to us. Rustici Software will not give, sell, rent or loan any personal information to any third party, unless:
- It is necessary to share information in order to investigate, prevent, or take action regarding illegal activities, suspected fraud, situations involving potential threats to the physical safety of any person, violations of Terms of Service of the specific application, or as otherwise required by law.
We are Privacy Shield certified. Click here to view our certification.
Terms of Service
Our Terms of Service for SCORM Cloud are available here.
When it hits the fan
We’ve built SCORM Cloud from the ground up to be fault tolerant with no single point of failure (props to Amazon for making that easy). Sometimes, though, the unexpected happens. If you want to keep up with our progress during an outage, you can follow us on Twitter @SCORMCloud.
Questions about Cloud's infrastructure?
Reach out to us to learn more about SCORM Cloud.