Sunday, July 28, 2019

Thoughts About Cloud Storage

There is no cloud, only a bunch of computers you don't own, run by people you don't know.  Anonymous
My people have no tradition of proofreading.  —Ken White

 
TL;DR:  Cloud storage might be suitable for storing backups provided one can afford the storage space and bandwidth needed.  It is not suitable for storing the only copy of anything.  Data stored with a cloud service must be encrypted using strong encryption to protect it from disclosure.  Cloud resources must never be set up as an "always on" mapped drive.

Cloud Storage and How it is Used

Cloud computing, or cloud storage, isn't really just a bunch of computers you don't own.  It isn't just "on the Internet," either.  It's a lot of computers and some very clever software that, together, have six important characteristics:
  1. Self service:  When you establish a "cloud" account, there's no human intervention at the other end.  That's convenient because there's no waiting to set up an account, add storage, etc.  It's also crucial to keeping the cost down.
  2. Excellent network access:  A cloud provider might serve millions of subscribers and must provide enough speed and responsiveness that the customer's connection, not the provider's, is the bottleneck.
  3. Elastic scalability:  People can make new accounts, or decide to add hundreds of gigabytes to their storage allocation, and the infrastructure must deal with that.  (Note, however, that paying for 100 GB of storage doesn't mean 100 GB is immediately allocated to you; that doesn't happen until you use it.)
  4. Resource pooling: The necessary scalability is achieved by sharing massive resources among many subscribers.  For the big cloud providers, "many" means millions or tens of millions.  The principle of multi-tenancy means your data will share disk space and CPU cycles with that of many others.  It's up to that clever software to keep things separate.
  5. Redundancy:  The cloud provider will keep multiple copies of customers' data on different servers; failure of a single server, or even of several, will not compromise the data.  The really big cloud storage providers keep redundant copies across multiple data centers.
  6. Measured service:  This implements the principle of paying for what one uses.  Google will provide 15 GB free; beyond that, there's a charge.  For cloud storage, generally what's measured is storage used.  Other cloud services might also measure CPU seconds, transfer bandwidth used, or other resources.
 With all of that, cloud storage might seem to be the perfect answer to limited storage and disk failures for consumers.  Not so fast.  We need to consider the way we use cloud storage, the properties of a secure system, and the causes, probabilities, and consequences of failure.

There are two ways one could use cloud storage: as primary storage and as backup storage.  When cloud storage is used for primary storage, the only copies of data are those "in the cloud." Failure of the cloud storage means irretrievably lost data.  If cloud storage is used for backup, the operational copy of data is stored elsewhere, usually on local drives.  Both the local storage and the cloud storage would have to fail to cause loss of data.
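As a rough illustration of why backup use is safer than primary use, the sketch below multiplies failure probabilities.  The annual failure probabilities are made-up numbers, not measurements, and the calculation assumes the two failures are independent of each other.  Anything that can damage both copies at once would break that assumption.

```python
# Hypothetical annual failure probabilities, chosen only for illustration.
P_LOCAL_FAILS = 0.05   # assumed chance the local drive is lost in a year
P_CLOUD_FAILS = 0.01   # assumed chance the cloud copy is lost in a year


def loss_probability(*copy_failure_probs: float) -> float:
    """Chance that every copy fails, assuming the failures are independent."""
    result = 1.0
    for p in copy_failure_probs:
        result *= p
    return result


# Cloud as primary storage: one copy, so one failure loses the data.
primary_only = loss_probability(P_CLOUD_FAILS)

# Cloud as backup: both the local drive and the cloud copy must fail.
with_backup = loss_probability(P_LOCAL_FAILS, P_CLOUD_FAILS)

print(f"cloud as only copy:      {primary_only:.4f}")
print(f"local plus cloud backup: {with_backup:.4f}")
```

With these assumed inputs, the chance of losing data drops from one in a hundred to five in ten thousand.  The exact figures matter less than the fact that the probabilities multiply.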

Cloud storage can also be used for file sharing.  Shared files are still either primary or backup, depending on whether another copy exists.

Security and Threats

The security of a system can be measured by three properties:
  1. Confidentiality is the condition that data have not been revealed to unauthorized people.
  2. Integrity means data have not been altered or destroyed.
  3. Availability means data can be used by authorized people when needed and with suitable response time.
To analyze the security of any system, we need to analyze the threats to the confidentiality, integrity, and availability of its data.  Broadly, those threats are disclosure, alteration, and denial.



I rate the risk of disclosure as high.  All major cloud storage providers scan uploaded files for contraband, specifically for child pornography.  Dropbox, and possibly others, scan shared files for material protected by copyright.  Even if you are absolutely certain you have no electronic contraband, a false positive could lead to law enforcement action.  Resource pooling and multi-tenancy mean one subscriber's data could be accessible to others in the event of a software error.  Accounts that are poorly protected, e.g. by weak passwords, could make data accessible to malicious outsiders.  Finally, a configuration error by the subscriber could share data not intended to be shared; this is probably the most likely cause of disclosure.

The risk of alteration is low; the nature of cloud storage protects the integrity of data.  An exception might be a configuration or software error that erroneously makes data shared and writable by others, or a malicious attack on a poorly protected account.
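Even with a low risk, alteration can be detected by recording a cryptographic digest of a file before upload and comparing it after download.  The following is a minimal sketch using Python's standard `hashlib`; the function names are my own, and nothing here is specific to any cloud provider.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 digest of a file as a hex string."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        # Read in chunks so large backup files need not fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unaltered(path: Path, recorded_digest: str) -> bool:
    """True if the file still matches the digest recorded before upload."""
    return sha256_of(path) == recorded_digest
```

The recorded digests have to be kept somewhere other than the cloud account itself, or an attacker who can alter the files can alter the digests too.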

The risk of denial is medium.  Although redundancy and good network access mean that data will likely be available from the cloud provider, access also requires that the customer's network be functioning.  Failure of the cloud provider's business could make data unavailable.  That need not be a financial failure; the provider Megaupload was shuttered by United States law enforcement authorities and the stored data became permanently inaccessible.  Some cloud providers assert the right to remove files that violate their terms of service.  Finally, if a cloud drive is "mapped," that is, set up to be viewed by the customer's operating system as a local resource, malicious software known as ransomware could render the contents inaccessible by encrypting the data.

Using Cloud Storage Effectively

Disclosure, alteration, or denial could result in irrecoverable loss of data if cloud storage is used as primary storage.  Cloud storage must never be used for primary storage.

If cloud storage is used for backup, the consequences of alteration or denial are less severe; one is without backup until the situation is corrected.  However, denial caused by ransomware could make both primary storage and backup inaccessible.

For backup data, the consequences of disclosure are severe.  Even if disclosure does not lead to investigation by law enforcement, a backup contains everything in primary storage, so all of it will be disclosed.  That could include financial user IDs, account numbers, and passwords; medical information; and other confidential data.  The risks of malicious software and disclosure lead to two conclusions:
  1. To protect it from malicious software, cloud storage used for backup must never be "mapped" as a disk drive accessible to the operating system.
  2. Backup data on cloud storage must be protected by strong encryption to guard against inadvertent disclosure and scanning by the cloud provider.

Other Considerations

Encryption:  The only safe encryption is that for which you generated and hold the encryption key.  If the cloud provider holds the encryption key, you are trusting them not to unlock your data.  A strong encryption algorithm is needed; I recommend AES with a 128-bit key.  Suggestion: keep copies of the crypto key on two separate USB drives stored in different buildings; do not keep a copy on the system being backed up.
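A minimal sketch of generating a 128-bit key that only you hold, using Python's standard `secrets` module.  The function names and the USB mount points in the comments are hypothetical.  The sketch stops at key generation; the encryption itself would be done by whatever backup or encryption tool you choose.

```python
import secrets
from pathlib import Path
from typing import Iterable

KEY_BITS = 128  # key size recommended in the text


def generate_key(bits: int = KEY_BITS) -> bytes:
    """Return a random key from the operating system's secure generator."""
    return secrets.token_bytes(bits // 8)


def save_key_copies(key: bytes, destinations: Iterable[Path]) -> None:
    """Write the key, hex-encoded, to each destination file."""
    for dest in destinations:
        dest.write_text(key.hex() + "\n")


# Example use, with hypothetical mount points for two USB drives:
#   key = generate_key()
#   save_key_copies(key, [Path("/media/usb1/backup.key"),
#                         Path("/media/usb2/backup.key")])
```

Writing the key only to removable drives keeps it off the system being backed up, as suggested above.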

Storage size and cost: A 500 GB laptop drive will need at least 2 TB of backup space to do progressive backups.  That would be $50-75 if paid annually.

Bandwidth:  A 500 GB drive that's 60% full will take nearly a week to upload at DSL speeds and nearly three days at 10 Mb/s.  A 15 GB progressive backup will take nearly 25 hours to upload at DSL speeds and almost four hours even with a 10 Mb/s connection.  To use cloud storage effectively for backups, you'll likely need a 50 Mb/s or faster Internet connection.
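The arithmetic behind estimates like these can be sketched as follows.  The sketch assumes decimal units (1 GB = 8,000 megabits), a link that sustains its full rated upload speed, and no protocol overhead, so real uploads will be somewhat slower.  A DSL uplink rate is not stated above, so only the 10 and 50 megabit-per-second cases are computed; supply your own rate for other connections.

```python
def upload_hours(size_gb: float, rate_mbps: float) -> float:
    """Hours to upload size_gb gigabytes at rate_mbps megabits per second.

    Assumes decimal units (1 GB = 8000 megabits) and no protocol overhead.
    """
    megabits = size_gb * 8000
    seconds = megabits / rate_mbps
    return seconds / 3600


full_drive_gb = 500 * 0.60   # a 500 GB drive that is 60% full
incremental_gb = 15          # the 15 GB progressive backup from the text

for rate in (10, 50):
    print(f"{rate} Mb/s: full backup {upload_hours(full_drive_gb, rate):.1f} h, "
          f"progressive backup {upload_hours(incremental_gb, rate):.1f} h")
```

Under these assumptions the 300 GB initial upload takes about 67 hours at 10 megabits per second and about 13 hours at 50.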

Account security: Use a strong password to protect your cloud account.  Choose a provider that offers two-factor authentication.  If possible, use a physical token like a YubiKey or an app that generates one-time passcodes; passcodes sent by text message are not secure because of SIM-swapping attacks.
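To show why app-generated passcodes don't depend on the phone network, here is a minimal sketch of a time-based one-time passcode in the style of RFC 6238, using only Python's standard library.  The code is computed from a shared secret and the current time, so nothing travels by text message.  The function name and defaults are my own choices, and the secret in the comment is the RFC's published test value, not a real credential; a real authenticator app adds secure storage for the secret.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def totp(secret: bytes, at_time: Optional[float] = None,
         step: int = 30, digits: int = 6) -> str:
    """Time-based one-time passcode in the style of RFC 6238 (HMAC-SHA-1)."""
    if at_time is None:
        at_time = time.time()
    # Number of whole time steps since the Unix epoch, as an 8-byte counter.
    counter = int(at_time // step)
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low 4 bits of the last byte select an offset.
    offset = mac[-1] & 0x0F
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


# Example with the RFC's test secret (not a real credential):
#   totp(b"12345678901234567890")
```

Both the app and the provider run the same calculation, so an attacker who hijacks the phone number still lacks the secret.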



Copyright © 2019 by Bob Brown
Last update: 2021-02-22

Thoughts About Cloud Storage by Bob Brown is licensed under a Creative Commons Attribution-ShareAlike 4.0 Unported License
