The Storage Bible

The term data storage refers to the various methods and technologies used to capture and retain digital information. Data storage can be either physical or virtual, and it can take many different forms depending on the needs of the user.

In this article, I want to explore the many different types of storage options available to you and your organisation. These include:

  • Hard disk drives (HDDs) and solid-state drives (SSDs): These are physical storage devices that use spinning disks or memory chips, respectively, to store data.
  • Cloud storage: This refers to the use of remote servers to store data, which can be accessed over the internet.
  • Optical storage: This includes CD, DVD, and Blu-ray discs, which store data using laser technology.
  • Tape storage: This involves the use of magnetic tapes to store data.
  • USB drives and memory cards: These are portable storage devices that use flash memory to store data.

In business, data storage is an important aspect of modern-day computing as it allows organisations to store, organise, and analyse vast amounts of data to make informed decisions. By keeping track of customer information, sales records, and financial data, businesses can gain insights into their operations and identify opportunities for improvement.

Data storage is also crucial for collaboration and communication within your business, as it allows employees to access and share information with each other in real time.

In addition, data storage is essential for compliance with regulations and protecting sensitive business information from unauthorised access.

Data storage also plays a critical role in the reliability and performance of computing systems, as it allows us to access and process data quickly and efficiently.

The Importance of Data Storage

Data storage is essential for the operation of modern businesses and organisations, as well as for individuals. It allows for the creation, organisation, and retention of digital information, which is increasingly important in today's digital world.

Some of the key reasons for the importance of data storage include:

  • Data storage enables the creation and organisation of digital information: Without data storage, it would be difficult to create and organise digital information in a way that is easily accessible and usable.
  • Data storage allows for the retention of information: Data storage allows businesses and individuals to retain important information and records over time, which is crucial for regulatory compliance and long-term planning.
  • Data storage enables the sharing of information: Data storage makes it possible to share information with others, whether it be within a single organisation or with external parties.
  • Data storage enables data backup and disaster recovery: Data storage allows for the creation of backups of important information, which can be used to recover from data loss due to hardware failure, natural disasters, or other unforeseen events.

How Does Digital Data Storage Work?

Digital data storage refers to the use of digital media to store and retain digital information. There are many ways in which digital data storage can work, depending on the type of storage being used. Below are some examples of digital data storage:

  • Hard disk drives (HDDs) and solid-state drives (SSDs): These types of storage devices use spinning disks or memory chips, respectively, to store data, and typically connect over interfaces such as SAS or SATA. When you save a file to an HDD or SSD, the device records the data on its disks or chips as magnetic or electronic signals. When you want to access the stored data, the device retrieves it and sends it to your computer or other device.
  • NVMe: NVMe (Non-Volatile Memory Express) is a storage access and transport protocol designed for flash and next-generation solid-state drives (SSDs). It generally delivers higher throughput and faster response times than older interfaces such as SATA and SAS across enterprise workloads.
  • SSD Flash Drive Arrays: Using only flash memory, these solid-state storage systems offer swift data transfer and a smaller physical size than a disk array. The upfront cost tends to be higher, but there's great potential to pay a lower cost over time.
  • Hybrid Flash Arrays: These storage devices include both flash memory drives and hard disk drives for balanced performance. Hybrid flash arrays offer a low startup cost, reasonable performance for the price and fast access to frequently used data. All-flash arrays offer lower latency and faster performance than hybrid flash but typically cost more.
  • Cloud storage: Cloud storage refers to the use of remote servers to store data, which can be accessed over the internet. When you save a file to the cloud, the data is transmitted over the internet to a remote server where it is stored. When you want to access the data, you can do so from any device with an internet connection by logging into your cloud storage account.
  • USB drives and memory cards: These portable storage devices use flash memory to store data. When you save a file to a USB drive or memory card, the data is written to the device's memory chips using an electrical current. When you want to access the data, the device retrieves the data from the memory chips and sends it to your computer or other device.

In general, digital data storage works by using digital media to store and retain digital information in a way that it can be accessed and retrieved later. Digital data storage can be either physical, as in the case of HDDs and SSDs, or it can be virtual, as in the case of cloud storage.

Direct-attached storage (DAS): Advantages and disadvantages

Direct-attached storage (DAS) is a type of data storage that is connected directly to a single computer or device. DAS is a simple and convenient way to add storage to a computer or device, and it is often used in personal computers and small businesses.

Some of the advantages of DAS include:

  • Convenience: DAS is easy to set up and use, as it simply requires connecting a storage device (e.g., a hard disk drive or solid-state drive) to a computer or device using a cable.
  • Cost-effectiveness: DAS can be a cost-effective option for adding storage to a single computer or device, as it does not require the purchase of additional hardware or infrastructure.
  • Performance: DAS can offer good performance for certain types of workloads, as the data is accessed directly from the storage device without the need for network communication.

Some of the disadvantages of DAS include:

  • Limited scalability: DAS is typically limited to the storage capacity of the single device to which it is attached, making it difficult to add more storage capacity as needed.
  • Limited accessibility: DAS is typically only accessible from the single computer or device to which it is attached, making it difficult for other users or devices to access the data.
  • Limited redundancy: DAS does not typically offer any redundancy or data protection, so if the storage device fails or the data is lost, it may not be possible to recover it.

In general, DAS is a simple and cost-effective way to add storage to a single computer or device, but it may not be suitable for larger or more complex environments that require more scalability, accessibility, or redundancy.

Network-based storage: NAS and SAN

Network-based storage refers to storage systems that are connected to a network and can be accessed by multiple users. There are two main types of network-based storage: network-attached storage (NAS) and storage area networks (SANs).

Network-attached storage (NAS) is a type of storage system that is connected to a network and can be accessed by multiple users. NAS devices are typically used in small to medium-sized businesses and organisations to provide centralised storage that can be accessed by multiple users. Data can be easily shared among connected machines, and permission levels can be set to control access.

NAS devices are relatively easy to set up and manage, and they can be used for a wide range of purposes, including file sharing, data backup, and media streaming.

Storage area networks (SANs) are more complex and expensive than NAS devices, and they are typically used in large enterprises to provide centralised storage for critical applications. SANs are typically built using specialised hardware and software and use high-speed networking technologies to provide fast and reliable access to storage.

Both NAS and SAN systems can be used to store a wide range of data, including files, databases, and virtual machines. The best type of network-based storage for a particular use case will depend on the specific needs of the user, including factors such as capacity, performance, reliability, and cost.

Cloud storage

Cloud storage is a type of data storage in which data is stored on remote servers that can be accessed over the internet. Cloud storage is a convenient and flexible way to store and access data, as it allows users to access their data from any device with an internet connection.

There are many different types of cloud storage services available, including public cloud storage, private cloud storage, and hybrid cloud storage. Public cloud storage refers to cloud storage services that are provided by companies such as Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform. These services allow users to store their data on servers that are owned and operated by the cloud provider.

Private cloud storage refers to cloud storage that is set up and managed by an organisation on its own servers or on servers that are leased from a third-party provider. Private cloud storage is typically more expensive and requires more in-house technical expertise to set up and maintain than public cloud storage, but it can offer higher levels of security and control.

Hybrid cloud storage combines the benefits of both public cloud storage and private cloud storage, allowing organisations to store some data on-premises in a private cloud, while storing other data in a public cloud.

Cloud storage is a popular option for many businesses and organisations because it allows them to store and access large amounts of data without having to invest in expensive hardware and infrastructure.

It is also a convenient option for individuals who want to store and access their personal data from multiple devices.

Hybrid cloud storage

Hybrid cloud storage is a type of data storage that combines the benefits of both public cloud storage and private cloud storage. With hybrid cloud storage, organisations can store some data on-premises in a private cloud, while storing other data in a public cloud.

One of the main benefits of hybrid cloud storage is that it allows organisations to store data in the most appropriate location based on their needs. For example, an organisation might store sensitive data on-premises in a private cloud for security reasons, while storing less sensitive data in a public cloud for convenience and cost-effectiveness.
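
The placement decision described above can be sketched as a simple rule. This is a minimal illustration only: the `DataSet` class, the `sensitive` flag and the `choose_location` function are hypothetical names invented for this example, and a real policy would weigh more factors (cost, latency, regulation).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataSet:
    """Hypothetical description of a data set to be placed."""
    name: str
    sensitive: bool


def choose_location(dataset: DataSet) -> str:
    """Return 'private' for sensitive data and 'public' for everything else."""
    return "private" if dataset.sensitive else "public"


datasets = [
    DataSet("payroll", sensitive=True),
    DataSet("marketing-assets", sensitive=False),
]

# Map each data set name to the cloud it should live in.
placement = {d.name: choose_location(d) for d in datasets}
print(placement)
```

Keeping the rule in one small function makes it easy to review and to extend later, for example with a data-residency check.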

Hybrid cloud storage also allows organisations to take advantage of the scalability and flexibility of public cloud storage, while maintaining control over their data and infrastructure. It can also help organisations to incorporate data more easily from multiple sources, such as data from on-premises systems, data from other cloud environments, and data from the internet of things (IoT) devices.

Overall, hybrid cloud storage offers organisations a way to take advantage of the benefits of both public and private cloud storage, and it is an increasingly popular option for organisations looking to store and manage their data in the cloud.

SSD and flash storage

Solid-state drives (SSDs) and flash storage are both types of digital storage that use memory chips to store data. There are some differences between the two technologies, however.

An SSD is a type of physical storage device that uses memory chips to store data.

SSDs are typically used as a replacement for traditional hard disk drives (HDDs) because they are faster, more durable, and consume less power. SSDs are available in a range of capacities and can be used in a variety of devices, including laptops, desktops, servers, and storage arrays.

Flash storage refers to the use of flash memory to store data. Flash memory is a type of non-volatile memory that retains data even when the power is turned off. Flash storage can be used in a variety of applications, including USB drives, memory cards, and SSDs.

In general, SSDs and flash storage are both fast and reliable ways to store data, and they are often used in applications where speed and durability are important. The specific benefits of each technology will depend on the specific application and the needs of the user.

Backup storage

Backup storage is a type of data storage that is used to create copies of important data in case the original data is lost or becomes unavailable. Backup storage is typically used to protect against data loss due to hardware failure, natural disasters, cyber-attacks, or other unforeseen events.

Backup storage is a critical aspect of modern data management as it allows organisations to protect their data and ensure that it is available in the event of a disaster or other disruption.

By creating and maintaining backups of your data, you will minimise the risk of data loss and reduce downtime, which can have significant financial and operational impacts. In addition, backup storage is essential for compliance with regulations and industry standards that require organisations to keep copies of their data for a certain period of time.

There are many different types of backup storage, including:

  • Local backup storage: This refers to the use of storage devices that are connected directly to a computer or network, such as hard disk drives (HDDs), solid-state drives (SSDs), and USB drives. Local backup storage is convenient because it is easy to set up and use, but it is vulnerable to the same risks as the original data (e.g., if the computer or device fails, the backup data may also be lost).
  • Cloud backup storage: This refers to the use of cloud storage services to create backups of data. Cloud backup storage is convenient because it allows users to access their data from any device with an internet connection, and it is typically more secure than local backup storage because the data is stored on remote servers that are less vulnerable to physical threats.
  • Tape backup storage: This refers to the use of magnetic tapes to create backups of data. Tape backup storage is often used for large-scale data backup and archival purposes because it has a high capacity and a long shelf life, but it is slower and less convenient than other types of backup storage.

The best type of backup storage for a particular use case will depend on the specific needs of the user, including factors such as capacity, speed, durability, and cost. It is important for organisations and individuals to regularly create backups of their important data to ensure that it is protected in the event of data loss.

The best practice for back-up data storage is the 3-2-1 rule, which states that you should have three copies of your data on at least two different media types, with one of these copies kept offsite or in the cloud.
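
As a rough sketch, the 3-2-1 rule can be expressed as a check over an inventory of copies. The `DataCopy` class and `satisfies_3_2_1` function are hypothetical names made up for this illustration, and the sketch assumes the production copy counts as one of the three.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataCopy:
    """One copy of the data (hypothetical inventory record)."""
    media: str     # e.g. "disk", "tape", "cloud"
    offsite: bool  # True if stored away from the primary site


def satisfies_3_2_1(copies: list) -> bool:
    """Check: at least 3 copies, on at least 2 media types, at least 1 offsite."""
    enough_copies = len(copies) >= 3
    enough_media = len({c.media for c in copies}) >= 2
    has_offsite = any(c.offsite for c in copies)
    return enough_copies and enough_media and has_offsite


inventory = [
    DataCopy("disk", offsite=False),   # production data
    DataCopy("disk", offsite=False),   # local backup on a NAS
    DataCopy("cloud", offsite=True),   # cloud backup
]
print(satisfies_3_2_1(inventory))
```

Three copies on local disk alone would fail the check, because they share one media type and none is offsite.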

Back-up solutions can be expensive, so before you commit to one, explore the different options available to you and make sure that the features align with your requirements.

At Towers Associates, we have subject matter experts that can assist you in making sure your data is backed up securely and in a solution that is the best fit for your organisation.

Having full back-ups will be your lifeline in the event of a disaster, and we know how important it is to protect your data against loss, theft and corruption.

The Future of Data Storage

The data storage solutions highlighted above represent the current array of solutions, but the world of data storage is continuously evolving. The latest innovations in network storage can provide forward-thinking and comprehensive solutions for businesses that need to store a large volume of sensitive information.

If your business has more complex data storage needs, you may want to consider one of these more advanced storage options. Let’s take a look at three of these emerging data storage technologies.

Software-Defined Storage

Traditional data storage requires hardware and proprietary software to run it. When you need to scale your storage, you'll find yourself scrambling for more hardware.

In contrast, software-defined storage (SDS) decouples the software that manages storage from the hardware where the data is physically stored. Separating storage software from its hardware allows you to expand your storage capacity on any industry-standard server or x86 system, so you don't have to keep buying proprietary hardware every time you need more storage, and you don't have to use storage devices from the same vendor.

By abstracting the software layer, you can put your data wherever you need it, with the flexibility to expand capacity as you see fit, or scale down if needed. SDS offers additional benefits like automated management, cost efficiency, and the ability to join many different data sources to build a storage infrastructure.

Storage Virtualisation

Storage virtualisation refers to the storage capacity that is accumulated from multiple physical devices and then made available for reallocation in a virtualised environment. It is the pooling of physical storage from multiple devices into what appears to be a single storage device managed from a central console. Relying on software to identify available storage capacity, the technology then aggregates that capacity as a pool of storage that can be used in a virtual environment by virtual machines.

Unlike SDS, which separates the software layer from the hardware to build a storage infrastructure, storage virtualisation simply pools storage resources so that it appears to users like a single, standard read or write to a physical drive. It hides the complexity of the storage system, which allows users and administrators to perform tasks such as backup, archiving and recovery in an easier, less time-consuming manner. Storage virtualisation can also help you increase storage capacity without the need to buy new storage devices.
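
The pooling idea can be sketched in a few lines. This is a toy model, not how any particular product works: `StoragePool`, the device names and the first-fit allocation are assumptions made for the illustration.

```python
class StoragePool:
    """Toy pool that presents several devices as one block of capacity."""

    def __init__(self, device_capacities_gb: dict):
        # Free capacity per physical device, in GB.
        self.free_gb = dict(device_capacities_gb)
        # Virtual volume name -> {device: GB taken from that device}.
        self.volumes = {}

    def total_free_gb(self) -> int:
        return sum(self.free_gb.values())

    def create_volume(self, name: str, size_gb: int) -> dict:
        """Carve a virtual volume out of the pool, spanning devices if needed."""
        if size_gb > self.total_free_gb():
            raise ValueError("not enough free capacity in the pool")
        extents = {}
        remaining = size_gb
        for device, free in self.free_gb.items():
            if remaining == 0:
                break
            take = min(free, remaining)
            if take:
                extents[device] = take
                self.free_gb[device] -= take
                remaining -= take
        self.volumes[name] = extents
        return extents


pool = StoragePool({"array-a": 500, "array-b": 300, "array-c": 200})
volume = pool.create_volume("vm-datastore", 700)
print(volume, pool.total_free_gb())
```

The caller asks for 700 GB and receives one volume; the fact that it spans two physical devices stays hidden behind the pool, which is the point of virtualisation.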

Hyperconverged Storage

Hyperconverged storage (HCS) is the next step up from storage virtualisation and SDS. HCS utilises the cloud to combine the functions of computing, virtualisation and storage as a physical unit that can be managed as a single system.

This is a type of software-defined storage because each node has a software layer running virtualisation software identical to all other nodes in the cluster. This software virtualises the resources in the individual node and shares them with the other nodes, allowing storage and other resources to be used as a single storage or compute pool.

HCS Benefits

Virtualisation

This virtualisation aspect is an advantage of hyperconverged storage, because it makes it possible to use commercial off-the-shelf hardware to make up the individual nodes. This means a hyperconverged device can be cheaper to build if you do it yourself or can result in a less expensive monthly or annual cost if you use a vendor.

User Experience

To users, virtual storage appears like a standard read or write to a physical drive. It hides the complexity of the storage system, which allows users and administrators to perform tasks such as backup, archiving and recovery in an easier, less time-consuming manner.

Increased Storage Capacity

Storage virtualisation can also help you increase storage capacity without the need to buy new storage devices.

Improved Efficiency

The combining of the storage functions into a single entity makes the transfer of data fast and efficient. In the past, to transfer data stored on one device to another, you would need to download data from one point to another and then bring the endpoint device online.

With virtual disk storage techniques like storage virtualisation and HCS, you can specify the logical unit number of the drive and specify that data should now go to a new drive.

Other Emerging Data Storage Trends

The future of data storage seems to be heading away from traditional tiered units in favour of combined services that give organisations more control over their data and eliminate the need for a large IT staff, as many functions can be handled remotely.

Cloud storage that users can reach from different devices is another growing segment, and it shows promise to become even faster and more efficient.

Flash storage, including the flash chips inside SSDs, continues to be developed as a storage option on which you can rely.

Artificial Intelligence (AI) is also becoming more prevalent in newer types of data storage to handle repetitious tasks, such as managing backup schedules and setting custom recovery points for specific data sets.

RAID

RAID is a technology that is used to increase the performance and/or reliability of data storage. The abbreviation stands for either Redundant Array of Independent Disks or Redundant Array of Inexpensive Disks, the latter being the older and less-used name.

A RAID system consists of two or more drives working in parallel. These can be hard disks, but there is a trend to also use the technology for solid-state drives (SSDs). There are different RAID levels, each optimised for a specific situation. These are not standardised by an industry group or standardisation committee. This explains why companies sometimes come up with their own unique numbers and implementations. This article covers the following RAID levels:

  • RAID 0 – striping
  • RAID 1 – mirroring
  • RAID 5 – striping with parity
  • RAID 6 – striping with double parity
  • RAID 10 – combining mirroring and striping

The software that performs the RAID functionality and controls the drives can either be located on a separate controller card (a hardware RAID controller) or it can simply be a driver. Some versions of Windows, such as Windows Server 2012, as well as Mac OS X, include software RAID functionality. Hardware RAID controllers cost more than pure software, but they also offer better performance, especially with RAID 5 and 6.

RAID systems can be used with several interfaces, including SATA, SCSI, IDE, or FC (Fibre Channel). There are systems that use SATA disks internally but have a FireWire or SCSI interface for the host system.

Sometimes disks in a storage system are defined as JBOD, which stands for Just a Bunch Of Disks. This means that those disks do not use a specific RAID level and act as stand-alone disks. This is often done for drives that contain swap files or spooling data.

Below is an overview of the most popular RAID levels:

RAID level 0 – Striping

In a RAID 0 system data are split up into blocks that get written across all the drives in the array. By using multiple disks (at least 2) at the same time, this offers superior I/O performance. This performance can be enhanced further by using multiple controllers, ideally one controller per disk.

Advantages of RAID 0

  • RAID 0 offers great performance, both in read and write operations. There is no overhead caused by parity controls.
  • All storage capacity is used, there is no overhead.
  • The technology is easy to implement.
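
To make striping concrete, here is a minimal sketch that splits a byte string into blocks and deals them out round-robin to a list of in-memory "drives". The function names and the tiny block size are chosen for illustration; real controllers work on much larger stripes and on actual devices.

```python
def stripe(data: bytes, drive_count: int, block_size: int) -> list:
    """Split data into blocks and distribute them round-robin across drives."""
    if drive_count < 2:
        raise ValueError("RAID 0 needs at least 2 drives")
    drives = [[] for _ in range(drive_count)]
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    for index, block in enumerate(blocks):
        drives[index % drive_count].append(block)
    return drives


def unstripe(drives: list) -> bytes:
    """Reassemble the original data by reading the blocks back in order."""
    total_blocks = sum(len(drive) for drive in drives)
    count = len(drives)
    ordered = [drives[i % count][i // count] for i in range(total_blocks)]
    return b"".join(ordered)


drives = stripe(b"ABCDEFGHIJ", drive_count=2, block_size=2)
print(drives)            # [[b'AB', b'EF', b'IJ'], [b'CD', b'GH']]
print(unstripe(drives))  # b'ABCDEFGHIJ'
```

Each drive holds only every other block, so reads and writes can proceed on both drives at once, and the data cannot be reassembled if either drive is missing.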

Disadvantages of RAID 0

  • RAID 0 is not fault tolerant. If one drive fails, all data in the RAID 0 array are lost. It should not be used for mission-critical systems.

Ideal use

RAID 0 is ideal for non-critical storage of data that have to be read/written at a high speed, such as on an image retouching or video editing station.

If you want to use RAID 0 purely to combine the storage capacity of two drives in a single volume, consider mounting one drive in the folder path of the other drive. This is supported in Linux, OS X as well as Windows and has the advantage that a single drive failure has no impact on the data of the second disk or SSD drive.

RAID level 1 – Mirroring

Data are stored twice by writing them to both the data drive (or set of data drives) and a mirror drive (or set of drives). If a drive fails, the controller uses either the data drive or the mirror drive for data recovery and continuous operation. You need at least 2 drives for a RAID 1 array.

Advantages of RAID 1

  • RAID 1 offers excellent read speed and a write-speed that is comparable to that of a single drive.
  • In case a drive fails, data do not have to be rebuilt; they only need to be copied to the replacement drive.
  • RAID 1 is a very simple technology.

Disadvantages of RAID 1

The main disadvantage is that the effective storage capacity is only half of the total drive capacity because all data get written twice.

Software RAID 1 solutions do not always allow a hot swap of a failed drive. That means the failed drive can only be replaced after powering down the computer it is attached to. For servers that are used simultaneously by many people, this may not be acceptable. Such systems typically use hardware controllers that do support hot swapping.

Ideal use

RAID 1 is ideal for mission-critical storage, for instance for accounting systems. It is also suitable for small servers in which only two data drives will be used.

RAID level 5 – Striping with parity 

RAID 5 is the most common secure RAID level. It requires at least 3 drives but can work with up to 16.

Data blocks are striped across the drives, and for each stripe a parity checksum of the block data is written to one drive. The parity data are not written to a fixed drive; they are spread across all drives. Using the parity data, the computer can recalculate the data of one of the other data blocks, should those data no longer be available.

That means a RAID 5 array can withstand a single drive failure without losing data or access to data.
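
The parity calculation is commonly implemented as a byte-wise XOR, and a short sketch shows why a single lost block can be recovered. The block contents below are made up for the example, and the sketch covers one stripe only; it does not model how parity rotates across drives.

```python
from functools import reduce


def xor_blocks(blocks: list) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe: three data blocks plus the parity block computed from them.
data_blocks = [b"ACCT", b"DATA", b"2024"]
parity = xor_blocks(data_blocks)

# Simulate losing the drive that held the second block, then rebuild it
# from the surviving data blocks and the parity block.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt)  # b'DATA'
```

XOR-ing the survivors with the parity cancels out everything except the missing block. With two blocks missing from the same stripe there is not enough information left, which is why a second failure during a rebuild is fatal.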

Although RAID 5 can be achieved in software, a hardware controller is recommended. Often extra cache memory is used on these controllers to improve the write performance.

Advantages of RAID 5

Read data transactions are very fast while write data transactions are somewhat slower (due to the parity that has to be calculated).

If a drive fails, you still have access to all data, even while the failed drive is being replaced and the storage controller rebuilds the data on the new drive.

Disadvantages of RAID 5

Drive failures have an effect on throughput, although this is still acceptable.

This is complex technology. If one of the disks in an array using 4TB disks fails and is replaced, restoring the data (the rebuild time) may take a day or longer, depending on the load on the array and the speed of the controller. If another disk goes bad during that time, data are lost forever.

Ideal use

RAID 5 is a good all-round system that combines efficient storage with excellent security and decent performance. It is ideal for file and application servers that have a limited number of data drives.

RAID level 6 – Striping with double parity

RAID 6 is like RAID 5, but the parity data are written to two drives. That means it requires at least 4 drives and can withstand 2 drives dying simultaneously. The chances that two drives break down at the same moment are of course very small. However, if a drive in a RAID 5 system dies and is replaced by a new drive, it takes hours or even more than a day to rebuild the swapped drive. If another drive dies during that time, you still lose all your data. With RAID 6, the RAID array will even survive that second failure.

Advantages of RAID 6

  • Like with RAID 5, read data transactions are very fast.
  • If two drives fail, you still have access to all data, even while the failed drives are being replaced. So RAID 6 is more secure than RAID 5.

Disadvantages of RAID 6

Write data transactions are slower than RAID 5 due to the additional parity data that must be calculated. In one report I read, the write performance was 20% lower.

Drive failures influence throughput, although this is still acceptable.

This is complex technology. Rebuilding an array in which one drive failed can take a long time.

Ideal use

RAID 6 is a good all-round system that combines efficient storage with excellent security and decent performance. It is preferable over RAID 5 in file and application servers that use many large drives for data storage.

RAID level 10 – combining RAID 1 & RAID 0

It is possible to combine the advantages (and disadvantages) of RAID 0 and RAID 1 in one single system. This is a nested or hybrid RAID configuration. It provides security by mirroring all data on secondary drives while using striping across each set of drives to speed up data transfers.

Advantages of RAID 10

If something goes wrong with one of the disks in a RAID 10 configuration, the rebuild time is very fast since all that is needed is copying all the data from the surviving mirror to a new drive. This can take as little as 30 minutes for drives of 1 TB.
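
Mirror rebuild time is roughly drive capacity divided by sustained copy speed. The sketch below is back-of-the-envelope arithmetic only: it assumes decimal units (1 TB = 1,000,000 MB) and an idle array, and the 150 MB/s figure is an illustrative assumption, not a measurement.

```python
def rebuild_hours(capacity_tb: float, rate_mb_per_s: float) -> float:
    """Hours needed to copy a full drive at a sustained rate."""
    capacity_mb = capacity_tb * 1_000_000  # decimal units
    return capacity_mb / rate_mb_per_s / 3600


def required_rate_mb_per_s(capacity_tb: float, minutes: float) -> float:
    """Sustained rate needed to copy a full drive within the given time."""
    return capacity_tb * 1_000_000 / (minutes * 60)


# Rate implied by copying 1 TB in 30 minutes.
print(round(required_rate_mb_per_s(1, 30)))  # 556

# Time for the same 1 TB at an assumed 150 MB/s.
print(round(rebuild_hours(1, 150), 2))       # 1.85
```

A 30-minute rebuild of 1 TB therefore needs about 556 MB/s sustained, so actual rebuild times depend heavily on the drive type and on how busy the array is.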

Disadvantages of RAID 10

Half of the storage capacity goes to mirroring, so compared to large RAID 5 or RAID 6 arrays, this is an expensive way to have redundancy.
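
The capacity trade-off between the levels covered in this article can be compared with a small calculator. It is a simplified sketch that assumes identical drives and ignores formatting overhead; `usable_capacity_tb` and `MIN_DRIVES` are names invented for the example, and RAID 1 is modelled as half the total capacity, as described above.

```python
# Minimum number of drives per RAID level.
MIN_DRIVES = {0: 2, 1: 2, 5: 3, 6: 4, 10: 4}


def usable_capacity_tb(level: int, drive_count: int, drive_tb: float) -> float:
    """Usable capacity of an array of identical drives (illustrative only)."""
    if level not in MIN_DRIVES:
        raise ValueError(f"unsupported RAID level: {level}")
    if drive_count < MIN_DRIVES[level]:
        raise ValueError(f"RAID {level} needs at least {MIN_DRIVES[level]} drives")
    if level in (1, 10) and drive_count % 2:
        raise ValueError("mirrored layouts are modelled with an even drive count")
    if level == 0:
        data_drives = drive_count       # no redundancy: all capacity is usable
    elif level in (1, 10):
        data_drives = drive_count / 2   # every block is stored twice
    elif level == 5:
        data_drives = drive_count - 1   # one drive's worth of parity
    else:
        data_drives = drive_count - 2   # RAID 6: two drives' worth of parity
    return data_drives * drive_tb


# Compare four 4 TB drives under each layout.
for level in (0, 5, 6, 10):
    print(f"RAID {level}: {usable_capacity_tb(level, 4, 4.0)} TB usable")
```

With only four drives, RAID 6 and RAID 10 yield the same usable capacity; the cost gap of RAID 10 opens up as the array grows, because parity overhead stays fixed at one or two drives while mirroring always consumes half.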

What about RAID levels 2, 3, 4 and 7?

These levels do exist but are not that common (RAID 4, for example, is essentially like RAID 5 but with the parity data always written to the same drive, and RAID 3 does the same with byte-level rather than block-level striping). This is just a simple introduction to RAID systems.

RAID is no substitute for back-ups!

All RAID levels except RAID 0 offer protection from a single drive failure. A RAID 6 system even survives 2 disks dying simultaneously. For complete security, you still need to back up the data stored on a RAID system.

That back-up will come in handy if all drives fail simultaneously because of a power spike.

It is a safeguard when the storage system gets stolen.

Back-ups can be kept off-site at a different location. This can come in handy if a natural disaster or fire destroys your workplace.

The most important reason to back up multiple generations of data is user error. If someone accidentally deletes some important data and this goes unnoticed for several hours, days, or weeks, a good set of back-ups ensures you can still retrieve those files.

  • Work with us

    If you’re ready to get started or your project is already underway, we’d like to know more.