High Performance File System (HPFS)


HPFS, or High Performance File System, is a file system created specifically for the OS/2 operating system to overcome the limitations of the FAT file system. HPFS was designed to get around several limitations of MS-DOS at the time, among them its eight-character file name restriction, and to handle large files of up to 2 GB across multiple hard disks. HPFS uses a centrally located root directory and B-tree lookup to speed access. It can coexist with the MS-DOS FAT file system or run independently.

Benefits:

  • Contiguous storage of extended attributes (without the EA DATA.SF file used by FAT)
  • Resistance to file fragmentation
  • Small cluster size
  • Support for larger file storage devices (up to 512 GB)
  • Faster disk operation

Drawbacks:

  • Requires more system memory
  • HPFS partitions are not visible to MS-DOS, which can be inconvenient if you need to boot from a floppy disk
  • Native DOS needs a special utility (Partition Magic from PowerQuest) to access an HPFS partition

There are a lot of high-performance file systems out there. Today we will compare only two popular ones, Lustre and GPFS, but let's start with the difference between a parallel and a distributed file system.
Distributed file system (DFS):
File system data is usable at the same time from different clients. A DFS often stores an entire object (file) on a single storage node. Storage usually resides with the application or computer system. Examples: NFS, CIFS.
Parallel file system (PFS):
A distributed file system with parallel data paths from clients to disks. It distributes the data of a single object across multiple storage nodes. Storage is usually separate from the compute system. A PFS is normally found on enterprise shared storage, where high performance and lower management cost are important. Examples: Lustre, GPFS.
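
The difference in data placement can be sketched in a few lines of Python. This is only an illustration under assumed names: `place_whole_file`, `place_striped`, the node labels, and the 1 MiB stripe size are all hypothetical and are not taken from NFS, Lustre, or GPFS.

```python
from collections import defaultdict

MIB = 1024 * 1024


def place_whole_file(file_size, nodes, file_id=0):
    """DFS-style placement (assumed model): the entire file lives on one node."""
    node = nodes[file_id % len(nodes)]
    return {node: file_size}


def place_striped(file_size, nodes, stripe_size=MIB):
    """PFS-style placement (assumed model): fixed-size stripes are dealt
    round-robin across all storage nodes."""
    placement = defaultdict(int)
    offset = 0
    index = 0
    while offset < file_size:
        chunk = min(stripe_size, file_size - offset)
        placement[nodes[index % len(nodes)]] += chunk
        offset += chunk
        index += 1
    return dict(placement)


nodes = ["node0", "node1", "node2", "node3"]  # hypothetical storage nodes
size = 10 * MIB                               # a 10 MiB file

whole = place_whole_file(size, nodes)
striped = place_striped(size, nodes)
print("whole-file:", whole)
print("striped:   ", striped)
```

With whole-file placement a single node serves every byte of the file, while the striped layout spreads the same bytes over all four nodes, so a client can read from several nodes at once.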



IBM GPFS
The General Parallel File System (GPFS) from IBM has been out now for a few years.
“GPFS is a high-performance, shared disk, clustered file system for AIX and Linux. Originally designed for technical high performance computing (HPC), it has since expanded into environments which require performance, fault tolerance and high capacity such as relational databases, CRM, Web 2.0 and media applications, engineering, financial applications and data archiving.
“GPFS is built on a SAN model where all the servers see all the storage. Data is striped across all the disks in each file system, which allows the bandwidth of each disk to be used for service of a single file or to produce aggregate performance for multiple files. This performance can be delivered to all the nodes that make up the cluster.”
Lustre

Lustre is an open-source distributed parallel filesystem, licensed under the GNU General Public License and developed and maintained by Sun Microsystems Inc. Thanks to its extremely scalable architecture, Lustre deployments are popular in scientific supercomputing, as well as in the oil and gas, manufacturing, rich media, and finance sectors. Lustre presents a POSIX interface to its clients, with parallel access to shared file objects.

It is an object-based filesystem composed of three components: Metadata Servers (MDSs), Object Storage Servers (OSSs), and clients.
Metadata Servers (MDSs) provide metadata services; the MDC (metadata client) is a client of those services. One MDS per filesystem manages one metadata target (MDT). Each MDT stores file metadata, such as filenames and access permissions.
Object Storage Servers (OSSs) expose block devices and serve data; the OSC (object storage client) is a client of those services. Each OSS manages one or more object storage targets (OSTs).
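
The split between metadata and data can be pictured with a toy model. The sketch below is not Lustre code and does not use any Lustre API: `ToyMDT`, `ToyOST`, `write_file`, `read_file`, the 4-byte stripe size, and the one-object-per-chunk layout are all simplifying assumptions made for illustration.

```python
class ToyOST:
    """Hypothetical object storage target: holds data objects keyed by id."""

    def __init__(self):
        self.objects = {}


class ToyMDT:
    """Hypothetical metadata target: maps a path to its mode and layout.
    It never stores file contents."""

    def __init__(self):
        self.entries = {}


def write_file(mdt, osts, path, data, mode=0o644, stripe_size=4):
    """Store data chunks on the OSTs and record only the layout on the MDT."""
    layout = []
    for start in range(0, len(data), stripe_size):
        chunk_no = start // stripe_size
        ost_index = chunk_no % len(osts)       # round-robin over the OSTs
        object_id = f"{path}#{chunk_no}"
        osts[ost_index].objects[object_id] = data[start:start + stripe_size]
        layout.append((ost_index, object_id))
    mdt.entries[path] = {"mode": mode, "layout": layout}


def read_file(mdt, osts, path):
    """Ask the metadata side for the layout, then fetch data from the OSTs."""
    entry = mdt.entries[path]                  # metadata lookup
    return b"".join(osts[i].objects[oid] for i, oid in entry["layout"])


mdt = ToyMDT()
osts = [ToyOST(), ToyOST()]
write_file(mdt, osts, "/demo.txt", b"hello world")
print(read_file(mdt, osts, "/demo.txt"))
```

The point of the model is the two-step access path: the client consults the metadata target once to learn where the objects live, and the file contents then come from the storage targets.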
Lustre is a POSIX-compliant filesystem. POSIX defines the application programming interface (API), along with command line shells and utility interfaces, for software compatibility with variants of UNIX and other operating systems.
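
In practice, POSIX compliance means an application can use the ordinary file calls without knowing which filesystem sits underneath. The sketch below uses Python's `os` module, which wraps the POSIX `open`, `write`, `fstat`, `pread`, and `close` calls on Unix-like systems. It runs against a temporary directory; pointing it at a directory on a Lustre mount is an assumption that is not tested here, and `posix_roundtrip` and `example.dat` are hypothetical names.

```python
import os
import tempfile


def posix_roundtrip(directory):
    """Write a small file with POSIX-style calls, then read its last 7 bytes."""
    path = os.path.join(directory, "example.dat")
    fd = os.open(path, os.O_CREAT | os.O_RDWR | os.O_TRUNC, 0o644)
    try:
        os.write(fd, b"parallel file systems")
        size = os.fstat(fd).st_size          # file size as reported by fstat
        tail = os.pread(fd, 7, size - 7)     # positional read, no seek needed
    finally:
        os.close(fd)
    os.unlink(path)                          # clean up the example file
    return size, tail


with tempfile.TemporaryDirectory() as tmp:
    print(posix_roundtrip(tmp))
```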
