Storage Devices and Linux ch 8.1

Does not provide fault tolerance

A failure of one disk in the set means all data is lost

The file system is not managed by the kernel, but rather opened by the kernel

Because of this we can easily port or move the file system between machines, or make it available to resources such as containers or virtual machines.

One of the more common external storage methods is Network Attached Storage or NAS

As the name implies, the storage is connected to the network and is accessed using network protocols. Your system may access the data via the data LAN or have specific network cards that are specifically connected to a storage network. Most often, NAS uses shares similar to an MS Windows or a Linux share and provides access to users. Another method is the Storage Area Network or SAN. A SAN provides all the connectivity, storage, and control to systems. Generally, a SAN provides external storage to servers, including diskless servers. There's a direct connection, usually fiber optic, from the server to the storage array.

Although the mount command is used by users occasionally, most file systems are mounted automatically at boot or mounted when they are added to the system

As with other operating systems, Linux can mount both locally connected and remotely connected devices

Linux builds software RAIDs with the multiple device administration tool, or mdadm

Let's say that we have five hard drives. Each drive will automatically be assigned a drive name, beginning with sda. To create a software RAID, Linux builds a new disk device that begins with md, which is short for multiple device. Then it gets assigned a device number, usually starting with zero. Now, if we want to use this RAID as a boot device for Linux, we need to set aside partitions on one of the drives to be used as the /boot directory.

In this lesson, we're going to look at how Linux builds and manages RAID

Linux typically creates software RAIDs, which are managed by the Linux kernel once the boot loader has started the system. Let's take a closer look at how this works.

When adding storage to a local system, there may be a requirement to manage large data stores

Many systems have the ability to manage multiple storage devices, and Linux provides a method for this management called the Logical Volume Manager, or LVM. Suppose you had a system with four physical storage devices. Via Linux, you create a partition on each device that spans the entire device, which becomes the physical volume, or PV. Via LVM management tools, you create a volume group, or VG, by combining the space available on the physical volumes into a pool of storage space. With the VG in place, you can now create logical volumes, or LVs, and format them with a Linux file system for general use. You simply configure an LV with the amount of space you wish to consume from the space available on the VG, define the Linux mount point, and format the LV with Btrfs, ext4, or any other available Linux file system. Once formatted, the LV is ready to use.
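The LVM steps described here can be sketched as a list of commands. This is a minimal sketch that only builds and prints the commands rather than running them. The partition names, the volume group name vg_data, the logical volume name lv_data, the 100G size, and the /mnt/data mount point are all hypothetical values chosen for illustration.

```python
import shlex

def lvm_plan(partitions, vg, lv, size, mount_point, fstype="ext4"):
    """Return the commands for the PV -> VG -> LV -> file system -> mount steps."""
    lv_path = f"/dev/{vg}/{lv}"
    return [
        ["pvcreate", *partitions],                       # mark partitions as physical volumes
        ["vgcreate", vg, *partitions],                   # pool the PVs into a volume group
        ["lvcreate", "--size", size, "--name", lv, vg],  # carve a logical volume out of the VG
        [f"mkfs.{fstype}", lv_path],                     # format the LV
        ["mount", lv_path, mount_point],                 # attach it to the file system tree
    ]

# Hypothetical devices: one whole-disk partition on each of four drives.
partitions = ["/dev/sdb1", "/dev/sdc1", "/dev/sdd1", "/dev/sde1"]
plan = lvm_plan(partitions, "vg_data", "lv_data", "100G", "/mnt/data")
for command in plan:
    print(shlex.join(command))
```

Printing the plan first makes it easy to review each step before running anything as a privileged user.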

The system switches immediately from the failed disk to a functioning disk

Mirroring: Provides fault tolerance for a single disk failure

Depending on whom you talk to, you may get a different definition of RAID

Most will say RAID is an acronym for Redundant Array of Independent Disks. There are several different array types designated by a number. Each number designates a different type of array that performs a different function. All RAID numbers provide a model for storage that's handled a bit differently than a single storage device could. Some add capacity, and others add redundancy. The following is a partial list of common RAID levels:

So far, we've discussed internal storage

Now, we'll discuss external storage. Simply put, external storage is managed storage available using networking and network protocols. It often has a higher capacity than is found in most internal systems and is available to devices connected to an internal network. The storage may be in the same room, in the same building, or halfway around the world.

This section helps you prepare for the following certification exam objectives: Exam

Objective TestOut Linux Pro 2

RAID 0 is striping

The array's storage capacity is the sum of all storage devices in the array. This means there is no additional cost or storage loss in using this RAID level. Data is written and read across all drives in the array, making it the fastest of the RAID levels. The problem is that there's no redundancy, so if a single drive fails, the entire array fails. Backups are very important.

RAID 1 is mirroring. This is a costly level since the array's storage capacity is halved. It requires two drives, but only provides the capacity of a single drive. This level provides redundancy. Data is written to both drives simultaneously, causing a small performance delay. This delay is often imperceptible, and reads often see a performance increase. RAID 1 often provides fault tolerance to a server's boot disk. Should one of the two drives fail, the other takes over without skipping a beat.
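The difference between striping and mirroring can be illustrated with a toy model. This is a simplified sketch, not how a RAID driver is implemented: the disks are plain Python lists, and the 2-byte stripe unit is an arbitrary value chosen so the layout is easy to read.

```python
def stripe(data: bytes, disks: int, unit: int) -> list[list[bytes]]:
    """RAID 0 style: deal fixed-size units round-robin across the disks."""
    layout = [[] for _ in range(disks)]
    for offset in range(0, len(data), unit):
        disk = (offset // unit) % disks
        layout[disk].append(data[offset:offset + unit])
    return layout

def mirror(data: bytes, disks: int = 2) -> list[bytes]:
    """RAID 1 style: every disk holds a full copy of the data."""
    return [data for _ in range(disks)]

striped = stripe(b"ABCDEFGHIJKL", disks=3, unit=2)
mirrored = mirror(b"ABCDEFGHIJKL")

print(striped)   # each disk holds only part of the data
print(mirrored)  # each disk holds all of the data
```

In the striped layout, losing any one list loses part of every large file, which is why RAID 0 has no fault tolerance. In the mirrored layout, either copy alone is enough to recover the data.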

There's another method for connecting servers to a SAN

iSCSI provides another connectivity option using a standard Ethernet infrastructure. iSCSI sends commands to the SAN using Ethernet for transport rather than a direct fiber optic connection. The iSCSI initiator sends SCSI commands to the iSCSI target, and the target provides the requested data. Often a separate storage Ethernet network is created for iSCSI communications using higher-speed Ethernet devices such as 10 Gbps or faster. Linux can operate as an iSCSI target and iSCSI initiator. Depending on your distribution, you may have to add the iSCSI components.

Lastly, the fuse.ko kernel module is used to access the FUSE system, but doesn't contain the actual file system

Bugs in the VFS won't affect the kernel, which provides a stable system environment.
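One way to check whether the kernel side of FUSE is available is to look at the file system types the kernel has registered. This is a minimal sketch that assumes the usual /proc/filesystems layout, where each line holds an optional "nodev" flag followed by a type name, and that FUSE appears there as "fuse". The sample text is made up for illustration.

```python
def fuse_registered(proc_filesystems: str) -> bool:
    """Return True if a file system type named 'fuse' is listed."""
    for line in proc_filesystems.splitlines():
        fields = line.split()
        # The type name is the last field; 'nodev' may precede it.
        if fields and fields[-1] == "fuse":
            return True
    return False

# Hypothetical sample in the /proc/filesystems format.
SAMPLE = "nodev\tsysfs\n\text4\nnodev\tfuse\n\tfuseblk\n"

print(fuse_registered(SAMPLE))
```

On a live system, the same function could be given the contents of /proc/filesystems instead of the sample text.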

Depending on the configuration, a RAID array can improve performance, provide fault tolerance, or both

The following table describes common RAID levels

1 Manage storage devices Create and manage disk partitions CompTIA Linux+ XK0-005

1

Hard drives are large unallocated disks used for storing data

Alone they are not very useful. When using file systems to organize and save our data, the computer saves our files, folders, and configurations. Also, users can move and save files. File systems are very useful but come with a few issues.

An optical device that has a data DVD inserted by a user will typically be found under /run/media/<username>

These storage types provide the ability to write and read data

Another type of data storage is optical. Optical storage can be thought of as write-once-read-many, or WORM. While there are media available that provide for erasing and reusing optical storage, WORM media are much more common. DVD and Blu-ray movies are examples of optical media. Optical drives were once the primary method for transporting data. Now, thumb drives or USB drives are more popular since they're much less expensive, very easy to use, and are available in much higher capacities than optical.

FUSE gives users more control over file operations

It provides a way for non-privileged users to create and mount file systems, restrict access, and give services or programs full permissions to entire directories or files.

Copyright © 2023 TestOut Corp

Copyright © CompTIA, Inc. All rights reserved.

Object storage - The newest method for storing data, object storage makes data available to clients in their original form, usually accessed in the form of a URL

FUSE

The Filesystem in USErspace (FUSE) project was built as a way for regular, non-privileged users to create file systems without affecting the kernel

In this lesson, we're going to look at creating a virtual file system, or VFS, in the user space

File systems built in the user space are known as FUSE. Let's quickly review the importance of file systems.

Suppose you have three 8 TB drives in your system

For striping, we combine the space of all the drives, giving us a 24 TB capacity. We have to add parity, which takes away from the total capacity. Each drive's capacity is reduced by a fraction equal to 1 divided by the number of drives in the array. In this example, we have three drives, so we must reduce the capacity by 1 divided by 3, or one-third, which is 8 TB. Another way to measure the lost capacity is to remove the capacity of a single drive from the array. Either calculation gives the total available capacity of the RAID 5 array: 16 TB.
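The two ways of calculating RAID 5 capacity can be checked against each other. This is a small sketch using the three 8 TB drives from the example. The function name is made up for illustration.

```python
from fractions import Fraction

def raid5_usable_tb(drives: int, size_tb: int) -> Fraction:
    """Usable capacity: raw capacity minus the 1/n share used for parity."""
    if drives < 3:
        raise ValueError("RAID 5 needs at least three drives")
    raw = drives * size_tb
    return raw - raw * Fraction(1, drives)

by_fraction = raid5_usable_tb(3, 8)   # 24 TB minus one-third of 24 TB
by_subtraction = 3 * 8 - 8            # 24 TB minus one whole drive

print(by_fraction, by_subtraction)
```

Both values come out to 16 TB, because one-third of three drives is exactly one drive.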

Requires a minimum of two disks

Has no overhead because all disk space is available for storing data

So let's look at our first hard disk device, sda

It'll be split into three partitions. Each one will receive the device name, sda, followed by which partition number it is. First, we need to set aside 1 megabyte to be a BIOS grub spacer, and we need to create a partition to house the /boot directory. The remaining space will be used as a RAID component.

RAID 1 (mirroring): A mirrored volume stores data to two (or more) duplicate disks simultaneously

If one disk fails, data is present on another disk

Internal Storage

Inside most computer systems, there's some kind of internal storage with at least the system's operating system installed and configured. The first type is the magnetic or rotational drive. Magnetic hard drives were the first type of mass storage available for microcomputers and have remained in use ever since. While they're still used, they're being replaced with flash-based solid state drives, or SSDs. SSDs are more common today than magnetic drives due to their speed for storing and retrieving data. They're more expensive than magnetic storage and don't offer the same capacity. However, they're still a better choice for most PCs and notebooks. Solid state storage also comes in a different form factor known as M.2, which often uses the newer Non-Volatile Memory Express, or NVMe, interface.

There are two basic categories for storage: internal and external.

Internal storage is inside the computer case. External means the storage is elsewhere, usually accessed via a network connection. Most systems have some sort of internal storage that meets local data requirements, such as an operating system, applications, and local data. External storage may contain common applications and shared data. Internal storage includes magnetic, optical, and solid state devices. External storage may consist of the same types of storage devices. However, it's managed separately and usually has a much higher capacity than local storage, often in the hundreds of terabytes. This storage is accessed via networked devices such as a SAN or NAS.

There are two primary network file systems that are used in Linux: Network File System (NFS) - NFS is a protocol used by servers and clients to share storage on a network

It comes from the Unix world and has been in use since 1984

One of the types of storage that is used but won't be covered in detail is Fibre Channel

It is used in high-speed storage environments such as Storage Area Networks (SANs), and the fcstat command is used to gather information about Fibre Channel configurations

Has overhead

Overhead is 1 / n where n is the number of disks

RAID 5 (striping with distributed parity): A RAID 5 volume combines disk striping across multiple disks with parity for data redundancy

Parity information is stored on each disk

Because data is written twice, half of the disk space is used to store the second copy of the data

RAID 1 is the most expensive fault-tolerant system

There are other RAID levels that combine the ones already discussed

RAID 1+0 or RAID 10 is a stripe of mirrors, and RAID 0+1 or RAID 01 is a mirror of stripes.

One of the most popular RAID levels is RAID 5

RAID 5 is striping with parity. Data is striped, just like RAID 0, across all of the drives in the array. The difference between RAID 0 and RAID 5 is parity. This means each drive reserves a portion of its capacity to store information about the other drives. Parity reduces overall capacity but adds redundancy. If a single drive in the array fails, the surviving drives use their parity to reconstruct its data.
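How parity lets surviving drives stand in for a failed one can be shown with a small example. This sketch assumes XOR parity, which is the usual way RAID 5 parity is computed. The text above does not name the method, and the two-byte blocks are made-up data.

```python
from functools import reduce

def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Two data blocks from one stripe, plus the parity block computed from them.
d0 = b"\x0f\xf0"
d1 = b"\x33\x55"
parity = xor_blocks([d0, d1])

# Simulate losing the drive that held d1: rebuild it from what survives.
rebuilt = xor_blocks([d0, parity])

print(rebuilt == d1)
```

The same idea extends to more drives: XOR-ing all surviving blocks in a stripe, data and parity alike, yields the missing block.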

Linux uses the multipath daemon called multipathd to manage the behavior of the data being written to the storage array(s) in such a way as to provide redundancy in the case of a failure along one path

RAID on Linux

Redundant Array of Independent Disks (RAID), also called Redundant Array of Inexpensive Disks, is a disk subsystem that combines multiple physical disks into a single logical storage unit

There are a few steps needed to establish connectivity to an iSCSI target

Remember, we're using Ethernet, so we have to define the device we're connecting to. The first step is ensuring you have the iSCSI initiator tools for your distribution. One example is shown here. After the tools are installed, consult the tools' manual for the correct usage to connect to the iSCSI target. The method shown here is for a specific distribution. Your method may differ. Once you have the tools, you need to find your initiator's name. Next, we need to find the iSCSI Qualified Name, or IQN, of the iSCSI target. We need to know its IP address and send it a query to find its name. Here we use the iSCSI administration tool to send a discovery request of the sendtargets type to the IP address listed. With the IQN, we can now connect to the target. We then look in the system message log to find the connected iSCSI device name. Now, we have to format the device. Once the device is formatted, it can be mounted to the local filesystem.
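The discovery and login steps can be sketched as follows. This assumes the open-iscsi iscsiadm tool, which the lesson does not name explicitly, and it assumes discovery output in the form "portal,group IQN". The IP address and IQN are made-up example values, and the commands are only built, not run.

```python
def discovery_cmd(ip: str) -> list[str]:
    """Command to ask a portal which targets it offers (sendtargets discovery)."""
    return ["iscsiadm", "--mode", "discovery", "--type", "sendtargets", "--portal", ip]

def parse_sendtargets(output: str) -> list[tuple[str, str]]:
    """Extract (portal address, IQN) pairs from discovery output."""
    targets = []
    for line in output.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or unexpected lines
        portal, iqn = parts
        targets.append((portal.split(",")[0], iqn))  # drop the portal group tag
    return targets

def login_cmd(address: str, iqn: str) -> list[str]:
    """Command to connect to a discovered target."""
    return ["iscsiadm", "--mode", "node", "--targetname", iqn,
            "--portal", address, "--login"]

# Hypothetical discovery output.
SAMPLE = "192.0.2.10:3260,1 iqn.2004-04.com.example:storage.disk1\n"

targets = parse_sendtargets(SAMPLE)
for address, iqn in targets:
    print(" ".join(login_cmd(address, iqn)))
```

After a successful login, the new disk shows up as an ordinary block device that can be formatted and mounted like a local drive.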

Does not increase performance

Requires a minimum of two disks

The RAID levels available to you are defined by the RAID controller in your system

Several vendors manufacture RAID controllers, and some are proprietary. Consult your RAID controller's implementation guide to find out which RAID levels are supported by your system. Additionally, Linux LVM provides the ability to utilize software to create RAID levels, such as mirroring or striping with parity.

Should a single drive fail, the others will utilize their parity to keep the array running

Should this happen, the array will be in a degraded state until the failed drive is replaced and the RAID rebuilds the drive.

So, to access a NAS device on my Linux system after the device has been mounted, I just use the cd command to change into the mount directory, and the files are visible there even though they are physically located on another system, perhaps even many miles away

Since this lesson is not about all of the storage types or the various ways of mounting or configuring the storage, we'll keep our focus on two of the most commonly used types of storage: FUSE and RAID

The primary use for FUSE is for creating virtual file systems for applications

Specifically, sandboxed applications such as AppImages use FUSE to create a restricted file system that is disconnected from the kernel

Summarize Linux fundamentals

Storage concepts: file storage, block storage, object storage, partition type, FUSE, and RAID (striping, mirroring, parity). Configure and manage storage using appropriate tools: storage area network (SAN) / network-attached storage (NAS), multipathd, and network filesystems (Network File System (NFS), Server Message Block/Common Internet File System (CIFS)).

Block storage

Storage used by Linux to store traditional data in blocks or chunks of space (also called a block device)

RAID 0 (striping): A stripe set breaks data into units and stores the units across a series of disks by reading and writing to all disks simultaneously

Striping: Provides an increase in performance

Server Message Block (SMB)/Common Internet File System (CIFS) - These protocols describe how to share storage across the network, much like NFS

The core protocols are used by Microsoft Windows for storage sharing in a Windows environment, which Linux can participate in with some limitations

RAID 6 is similar to RAID 5 as it's striping with parity

The difference is that RAID 6 uses double parity and can withstand a loss of 2 drives from the array. The net capacity of the RAID 6 array is the total capacity minus the capacity of 2 drives.
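The capacity rules for the levels discussed so far can be compared side by side. This is a rough sketch that assumes all drives are the same size. The minimum of four drives for RAID 6 is a commonly cited figure rather than something stated in this lesson.

```python
def usable_capacity_tb(level: int, drives: int, size_tb: float) -> float:
    """Usable capacity for an array of equal-sized drives."""
    minimum = {0: 2, 1: 2, 5: 3, 6: 4}
    if level not in minimum:
        raise ValueError(f"unsupported RAID level: {level}")
    if drives < minimum[level]:
        raise ValueError(f"RAID {level} needs at least {minimum[level]} drives")
    if level == 0:
        return drives * size_tb        # striping: no capacity lost
    if level == 1:
        return size_tb                 # mirroring: every drive holds the same copy
    if level == 5:
        return (drives - 1) * size_tb  # single parity: one drive's worth lost
    return (drives - 2) * size_tb      # double parity: two drives' worth lost

for level in (0, 5, 6):
    print(f"RAID {level} with four 8 TB drives:", usable_capacity_tb(level, 4, 8), "TB")
print("RAID 1 with two 8 TB drives:", usable_capacity_tb(1, 2, 8), "TB")
```

The trade-off is visible in the numbers: each step up in fault tolerance costs another drive's worth of capacity.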

FUSE stands for file system in user space

The idea is that we set aside a portion of the file system in user space to create a virtual file system, or VFS. Once this portion of the file system is set aside, the FUSE kernel module, named fuse.ko, is loaded into the kernel.

Globally Unique Identifier (GUID) Partition Table (GPT)

The successor to MBR partition tables, it provides much more storage capability and partition flexibility

Master Boot Record (MBR)

The traditional partition type used for storage devices
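The difference in storage capability between MBR and GPT comes down to how many bits each uses to address sectors. This sketch assumes 512-byte sectors, with 32-bit sector addresses for MBR and 64-bit addresses for GPT. These figures are general background and are not taken from the lesson text.

```python
SECTOR_BYTES = 512  # assumed sector size

def max_addressable_bytes(address_bits: int, sector_bytes: int = SECTOR_BYTES) -> int:
    """Largest disk size reachable with the given sector address width."""
    return (2 ** address_bits) * sector_bytes

mbr_limit = max_addressable_bytes(32)
gpt_limit = max_addressable_bytes(64)

print("MBR:", mbr_limit // 2**40, "TiB")
print("GPT:", gpt_limit // 2**70, "ZiB")
```

Under these assumptions MBR tops out at 2 TiB, which is why larger drives need GPT.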

A USB storage device, such as a thumb drive, will also be found under /run/media/<username> when inserted by a user

There is a mount command that is used to do two primary things: list all file systems that are currently mounted and allow a privileged user to mount a file system on a storage device somewhere in the root file system tree
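The list of mounted file systems can also be read programmatically. This is a minimal sketch that parses text in the /proc/mounts format, where each line holds the device, mount point, file system type, and options. The sample entries, including the nas01 host name, are made up for illustration.

```python
from typing import NamedTuple

class Mount(NamedTuple):
    device: str
    mount_point: str
    fstype: str
    options: tuple

def parse_mounts(text: str) -> list[Mount]:
    """Parse lines of 'device mount_point fstype options dump pass'."""
    mounts = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mount_point, fstype, options = fields[:4]
        mounts.append(Mount(device, mount_point, fstype, tuple(options.split(","))))
    return mounts

# Hypothetical sample: a RAID root, a boot partition, and an NFS share.
SAMPLE = """/dev/md0 / ext4 rw,relatime 0 0
/dev/sda2 /boot ext4 rw,relatime 0 0
nas01:/export/data /mnt/nas nfs4 rw,vers=4.2 0 0
"""

mounts = parse_mounts(SAMPLE)
for m in mounts:
    print(f"{m.device} on {m.mount_point} type {m.fstype}")
```

On a live system, passing the contents of /proc/mounts to parse_mounts would list local and network mounts together, since both appear in the same tree.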

Storage Types

There are several different ways in which data is stored on a Linux system

These are described by the manner in which the data is organized on the devices: File storage - This method is used by services such as NFS and SMB/CIFS for storage of files over the network, although locally attached storage devices also can use this storage type

In order to use FUSE, you'll need 3 elements installed on your Linux System

These elements typically need to be installed by an Administrator.

First, typically, only users with administrative access are allowed to make changes to the file system and protected portions of the hard drive

They are also the only users that can mount and unmount different hard drives or partitions.

Partition 3, which is /dev/sda3 on Disk 1, and the remaining 4 disks—sdb, sdc, sdd, and sde—are the ones we'll use

They'll be marked as components of the new RAID, which is md0, and be used to create the new device, /dev/md0.
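The array described here could be created with a single mdadm command. This sketch only builds and prints that command. The choice of RAID level 5 is an assumption, since the walkthrough above names the component devices but not the level.

```python
import shlex

def mdadm_create_cmd(md_device: str, level: int, components: list[str]) -> list[str]:
    """Build an 'mdadm --create' command for the given component devices."""
    return [
        "mdadm", "--create", md_device,
        f"--level={level}",
        f"--raid-devices={len(components)}",
        *components,
    ]

# The third partition of the first disk plus the four remaining whole disks.
components = ["/dev/sda3", "/dev/sdb", "/dev/sdc", "/dev/sdd", "/dev/sde"]
cmd = mdadm_create_cmd("/dev/md0", 5, components)

print(shlex.join(cmd))
```

Running the printed command requires administrative privileges and overwrites the listed devices, so it is worth double-checking the device names first.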

Last, file systems live in the kernel space for operating systems, meaning the OS is responsible for managing the file system

This can make debugging file system issues difficult. And while debugging, we have a greater chance of crashing the machine.

On Linux systems, all mounted storage devices are attached to the same file system somewhere below the / location

This idea of a single tree, instead of a number of trees as found under Windows such as C:\, D:\, etc., comes from the Unix world

The mdadm utility can be used to create most of the RAID types you'll need

This includes RAID 0, which splits the data across two or more hard drives, and RAID 1, which copies the data from one drive to another. There's also RAID 5, which stripes data across three or more drives while providing redundancy with parity. And RAID 10 combines RAID 1 and RAID 0 by striping data across mirrored sets of drives.

Block storage - This is the oldest and most common type of storage, where data is placed in fixed length blocks of data

This is commonly used for hosting the operating system, applications and databases, and local data storage

Second, a user space library to interact with the FUSE VFS

This is typically one of the libfuse packages.

For example, the /home directory is where a standard user keeps their personal files

This standard also includes definitions of where types of storage are generally located

The layout of the files and folders on Linux is, depending on the distribution, determined loosely or tightly by the Filesystem Hierarchy Standard (FHS)

This standard defines where files and folders are stored, based on their function

The process of attaching storage devices in Linux is called mounting

Thus, directly attached storage that is used as the root of the file system is found in the / directory, which is called the root directory

Here, we have a sample of how a new RAID 5 will look on an Ubuntu server

You can see the md0 device that was created and how each disk is marked as a component of the software RAID. You can also see our BIOS grub spacer, the /boot directory, and the partition that'll be used as a component of the RAID.

Multipathing Storage

One of the common ways to create redundancy for storage is multipathing

Using multiple physical connections between a server and a storage array, such as a Storage Area Network (SAN) or Network Attached Storage (NAS), data can still be written to the target storage device when one of the paths becomes unavailable, for example because of a hardware failure

Network File Systems

In addition to storage attached directly to a server, storage devices can be located on another host on the network that shares its storage space with other network hosts

Using network file systems, data can be written to the remote location as if attached locally, at least from the user's perspective

The remotely connected devices are usually some type of Storage Area Network device (SAN) or Network Attached Storage (NAS) device

Using networking protocols, these storage devices, usually managed by other systems, provide storage to the local system through the mount point in the root tree

That's all for this lesson

We learned about creating software RAIDs in Linux. We reviewed the mdadm utility and looked at how to configure a boot disk within a RAID. We also briefly reviewed common RAID types.

In this lesson, we talked about the file system in user space or FUSE

We looked at the requirements in order to run FUSE and the purpose of creating FUSE virtual file systems.

This lesson covers the following topics: Linux storage concepts, FUSE, and RAID on Linux

Linux Storage Concepts

Linux is an operating system with roots in many historical computing environments

When discussing storage on Linux, we need to understand the ancestry of some of the concepts in order to make sense of how they are implemented on Linux

These isolated file systems leave kernel access to the FUSE kernel module, keeping applications from compromising system security even if they have vulnerabilities in them

While this approach is not always effective, it does provide another layer of security

Second, file systems can be large and complex to navigate

With larger hard drives available, file systems continue to grow.

