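A distributed file system is a file system whose physical storage resources are not necessarily attached directly to the local node, but are connected to it over a computer network.
Its advantages are centralized access, simpler operations, disaster recovery for data, and better file access performance.
MFS (MooseFS) is a semi-distributed file system developed in Poland. It provides RAID-like redundancy, which lowers storage cost while remaining comparable to dedicated storage systems, and it can be expanded online.
MFS is a fault-tolerant network distributed file system: it spreads data across multiple servers while presenting users with a single unified resource.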
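
An MFS deployment consists of four components:

- Metadata server (Master): manages the file system and maintains the metadata for the whole cluster;
- Metadata log server (Metalogger): backs up the Master's change log files, named changelog_ml.*.mfs. If the Master's data is lost or damaged, the files can be fetched from the Metalogger and used for recovery;
- Data storage server (Chunk Server): the servers that actually store the data. Files are split into chunks when stored, and the chunks are replicated between Chunk Servers. The more Chunk Servers there are, the larger the usable capacity, the higher the reliability, and the better the performance;
- Client: mounts the MFS file system much like an NFS share and uses it in the same way.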
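
Reading data works as follows:

- The client sends a read request to the metadata server;
- The metadata server tells the client where the data is stored (the ChunkServer IP address and the chunk ID);
- The client asks that ChunkServer to send the data;
- The ChunkServer sends the data to the client.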
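
Writing data works as follows:

- The client sends a write request to the metadata server;
- The metadata server talks to the ChunkServers and creates new chunks on only some of them; once the chunks are created, those ChunkServers report success back to the metadata server;
- The metadata server tells the client which chunks on which ChunkServer it may write to;
- The client writes the data to the designated ChunkServer;
- That ChunkServer synchronizes the data with the other ChunkServers and, once synchronization succeeds, tells the client that the write succeeded;
- The client tells the metadata server that the write is complete.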

| Host | Operating system | IP address |
|---|---|---|
| Master Server | CentOS 7.3 x86_64 | 192.168.96.22 |
| Metalogger | CentOS 7.3 x86_64 | 192.168.96.11 |
| Chunk1 | CentOS 7.3 x86_64 | 192.168.96.12 |
| Chunk2 | CentOS 7.3 x86_64 | 192.168.96.13 |
| Chunk3 | CentOS 7.3 x86_64 | 192.168.96.14 |
| Client | CentOS 7.3 x86_64 | 192.168.96.15 |
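
All of the hosts listed above need Internet access, because the packages are installed from the MooseFS repository.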
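
On the Master Server, put SELinux into permissive mode, stop the firewall, add the MooseFS repository, and install the packages:

setenforce 0
systemctl stop firewalld
curl "https://ppa.moosefs.com/RPM-GPG-KEY-MooseFS" > /etc/pki/rpm-gpg/RPM-GPG-KEY-MooseFS
curl "http://ppa.moosefs.com/MooseFS-3-el7.repo" > /etc/yum.repos.d/MooseFS.repo
yum update
yum -y install moosefs-master moosefs-cgi moosefs-cgiserv moosefs-cli

Confirm that the configuration files (mfsexports.cfg, mfsmaster.cfg, and so on) were generated under /etc/mfs.
The following configuration files keep their default values and need no changes: mfsmaster.cfg, mfsexports.cfg, mfstopology.cfg.

Start the master and check that the process is running:

mfsmaster start
ps -ef | grep mfs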
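
The steps above stop firewalld entirely. A minimal sketch of an alternative that keeps the firewall running and opens only the MooseFS ports is shown here. The function names are hypothetical. Ports 9419, 9420 and 9425 are the ones mentioned in this article; 9421 (client to master) and 9422 (client to chunk server) are assumed to be left at their usual defaults.

```shell
# Default MooseFS TCP ports per role. 9419, 9420 and 9425 appear in this
# article; 9421 and 9422 are assumed defaults, adjust them if your
# configuration differs.
mfs_ports() {
    case "$1" in
        master)      echo "9419 9420 9421 9425" ;;
        chunkserver) echo "9422" ;;
        *)           echo "unknown role: $1" >&2; return 1 ;;
    esac
}

# Open the ports for one role in firewalld, then reload the rules.
# Must be run as root on a host where firewalld is active.
open_mfs_ports() {
    local ports p
    ports=$(mfs_ports "$1") || return 1
    for p in $ports; do
        firewall-cmd --permanent --add-port="${p}/tcp" || return 1
    done
    firewall-cmd --reload
}
```

With this approach you would run `open_mfs_ports master` on the Master Server and `open_mfs_ports chunkserver` on each Chunk Server instead of `systemctl stop firewalld`.
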
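
On the Metalogger host, repeat the preparation and install the metalogger package:

setenforce 0
systemctl stop firewalld
curl "https://ppa.moosefs.com/RPM-GPG-KEY-MooseFS" > /etc/pki/rpm-gpg/RPM-GPG-KEY-MooseFS
curl "http://ppa.moosefs.com/MooseFS-3-el7.repo" > /etc/yum.repos.d/MooseFS.repo
yum update
yum -y install moosefs-metalogger

Edit the metalogger configuration so that MASTER_HOST points at the Master:

vim /etc/mfs/mfsmetalogger.cfg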
###############################################
# RUNTIME OPTIONS                             #
###############################################

# user to run daemon as (default is mfs)
# WORKING_USER = mfs

# group to run daemon as (optional - if empty then default user group will be used)
# WORKING_GROUP = mfs

# name of process to place in syslog messages (default is mfsmetalogger)
# SYSLOG_IDENT = mfsmetalogger

# whether to perform mlockall() to avoid swapping out mfsmetalogger process (default is 0, i.e. no)
# LOCK_MEMORY = 0

# Linux only: limit malloc arenas to given value - prevents server from using huge amount of virtual memory (default is 4)
# LIMIT_GLIBC_MALLOC_ARENAS = 4

# Linux only: disable out of memory killer (default is 1)
# DISABLE_OOM_KILLER = 1

# nice level to run daemon with (default is -19; note: process must be started as root to increase priority, if setting of priority fails, process retains the nice level it started with)
# NICE_LEVEL = -19

# set default umask for group and others (user has always 0, default is 027 - block write for group and block all for others)
# FILE_UMASK = 027

# where to store daemon lock file (default is /var/lib/mfs)
# DATA_PATH = /var/lib/mfs

# number of metadata change log files (default is 50)
# BACK_LOGS = 50

# number of previous metadata files to be kept (default is 3)
# BACK_META_KEEP_PREVIOUS = 3

# metadata download frequency in hours (default is 24, should be at least BACK_LOGS/2)
# META_DOWNLOAD_FREQ = 24

###############################################
# MASTER CONNECTION OPTIONS                   #
###############################################

# delay in seconds before next try to reconnect to master if not connected (default is 5)
# MASTER_RECONNECTION_DELAY = 5

# local address to use for connecting with master (default is *, i.e. default local address)
# BIND_HOST = *

# MooseFS master host, IP is allowed only in single-master installations (default is mfsmaster)
# Changed to the Master's IP address
MASTER_HOST = 192.168.96.22

# MooseFS master supervisor port (default is 9419)
# MASTER_PORT = 9419

# timeout in seconds for master connections (default is 10)
# MASTER_TIMEOUT = 10

Start the metalogger and check that the process is running:

mfsmetalogger start
ps -ef | grep mfs
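
The only change made in the file above is the MASTER_HOST line. When several hosts have to be configured, that edit can be scripted rather than done in vim. The sketch below assumes the shipped file contains a line of the form `# MASTER_HOST = mfsmaster` (or an already uncommented MASTER_HOST line); `set_master_host` is a hypothetical helper name.

```shell
# Set MASTER_HOST in a MooseFS config file. Replaces either the commented-out
# default ("# MASTER_HOST = ...") or an earlier uncommented value, and leaves
# every other line untouched.
set_master_host() {
    local cfg=$1 ip=$2
    local work="${cfg}.tmp.$$"
    sed -E "s|^#?[[:space:]]*MASTER_HOST[[:space:]]*=.*|MASTER_HOST = ${ip}|" "$cfg" > "$work" || return 1
    # Write back through the original file so its owner and mode are kept.
    cat "$work" > "$cfg" && rm -f "$work"
}
```

For example, `set_master_host /etc/mfs/mfsmetalogger.cfg 192.168.96.22` on the Metalogger; the same call should work for /etc/mfs/mfschunkserver.cfg on the Chunk Servers, which uses the same option name.
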
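
On each Chunk Server (Chunk1, Chunk2, Chunk3), repeat the preparation and install the chunkserver package:

setenforce 0
systemctl stop firewalld
curl "https://ppa.moosefs.com/RPM-GPG-KEY-MooseFS" > /etc/pki/rpm-gpg/RPM-GPG-KEY-MooseFS
curl "http://ppa.moosefs.com/MooseFS-3-el7.repo" > /etc/yum.repos.d/MooseFS.repo
yum update
yum -y install moosefs-chunkserver

Edit the chunkserver configuration so that MASTER_HOST points at the Master (only the relevant section of the file is shown):

vim /etc/mfs/mfschunkserver.cfg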
###############################################
# MASTER CONNECTION OPTIONS                   #
###############################################

# labels string (default is empty - no labels)
# LABELS =

# local address to use for master connections (default is *, i.e. default local address)
# BIND_HOST = *

# MooseFS master host, IP is allowed only in single-master installations (default is mfsmaster)
# Changed to the Master's IP address
MASTER_HOST = 192.168.96.22

# MooseFS master command port (default is 9420)
# MASTER_PORT = 9420

# timeout in seconds for master connections. Value >0 forces given timeout, but when value is 0 then CS asks master for timeout (default is 0 - ask master)
# MASTER_TIMEOUT = 0

# delay in seconds before next try to reconnect to master if not connected (default is 5)
# MASTER_RECONNECTION_DELAY = 5

# authentication string (used only when master requires authorization)
# AUTH_CODE = mfspassword

Next, list the storage directory in mfshdd.cfg:

vim /etc/mfs/mfshdd.cfg
# This file keeps definitions of mounting points (paths) of hard drives to use with chunk server.
# A path may begin with extra characters which switch additional options:
#  - '*' means that this hard drive is 'marked for removal' and all data will be replicated to other hard drives (usually on other chunkservers)
#  - '<' means that all data from this hard drive should be moved to other hard drives
#  - '>' means that all data from other hard drives should be moved to this hard drive
#  - '~' means that significant change of total blocks count will not mark this drive as damaged
# If there are both '<' and '>' drives then data will be moved only between these drives
# It is possible to specify optional space limit (after each mounting point), there are two ways of doing that:
#  - set space to be left unused on a hard drive (this overrides the default setting from mfschunkserver.cfg)
#  - limit space to be used on a hard drive
# Space limit definition: [0-9]*(.[0-9]*)?([kMGTPE]|[KMGTPE]i)?B?, add minus in front for the first option.
#
# Examples:
#
# use hard drive '/mnt/hd1' with default options:
#/mnt/hd1
#
# use hard drive '/mnt/hd2', but replicate all data from it:
#*/mnt/hd2
#
# use hard drive '/mnt/hd3', but try to leave 5GiB on it:
#/mnt/hd3 -5GiB
#
# use hard drive '/mnt/hd4', but use only 1.5TiB on it:
#/mnt/hd4 1.5TiB
#
# use hard drive '/mnt/hd5', but fill it up using data from other drives
#>/mnt/hd5
#
# use hard drive '/mnt/hd6', but move all data to other hard drives
#</mnt/hd6
#
# use hard drive '/mnt/hd7', but ignore significant change of hard drive total size (e.g. compressed file systems)
#~/mnt/hd7
# Directory (partition) provided to MFS
/data

Note: /data is the storage handed over to MFS. It is best to mount a dedicated partition or disk at this directory.

Create the directory, give it to the mfs user, start the chunk server, and check that the process is running:

mkdir /data
chown -R mfs:mfs /data
mfschunkserver start
ps -ef | grep mfs
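
Because the same /data entry has to be added on all three Chunk Servers, the mfshdd.cfg edit can also be scripted. The following is a small sketch, not part of the original procedure; `add_hdd_path` is a hypothetical helper that appends the path only if that exact line is not already present, so it is safe to run more than once.

```shell
# Append a storage path to mfshdd.cfg unless that exact line already exists.
# Commented examples such as "#/mnt/hd1" do not count as a match, because
# grep -x compares whole lines.
add_hdd_path() {
    local cfg=$1 path=$2
    grep -qxF -- "$path" "$cfg" 2>/dev/null || printf '%s\n' "$path" >> "$cfg"
}
```

On a Chunk Server this could be used as `mkdir -p /data && chown -R mfs:mfs /data && add_hdd_path /etc/mfs/mfshdd.cfg /data` before starting mfschunkserver.
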


On the Client, repeat the preparation, install the client package, load the FUSE module, and mount the file system from the Master:

setenforce 0
systemctl stop firewalld
curl "https://ppa.moosefs.com/RPM-GPG-KEY-MooseFS" > /etc/pki/rpm-gpg/RPM-GPG-KEY-MooseFS
curl "http://ppa.moosefs.com/MooseFS-3-el7.repo" > /etc/yum.repos.d/MooseFS.repo
yum update
yum -y install moosefs-client
mkdir -p /mfs/data
modprobe fuse
mfsmount /mfs/data -H 192.168.96.22
df -h
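
`df -h` shows that the mount exists, but not that data actually round-trips through the cluster. A quick write-and-read-back check is one way to confirm that; the sketch below is an illustration only, and `mfs_smoke_test` is a hypothetical name. It works on any writable directory, so nothing in it is specific to MooseFS.

```shell
# Write a small file under the given directory, read it back, and compare.
# Returns 0 when the contents match; the test file is removed afterwards.
mfs_smoke_test() {
    local dir=$1
    local f="${dir}/.mfs_smoke_test.$$"
    local payload="moosefs smoke test $(date +%s)"
    local readback
    printf '%s\n' "$payload" > "$f" || return 1
    readback=$(cat "$f") || { rm -f "$f"; return 1; }
    rm -f "$f"
    [ "$readback" = "$payload" ]
}
```

On the client, something like `mountpoint -q /mfs/data && mfs_smoke_test /mfs/data && echo "MFS mount OK"` first checks that /mfs/data is a mount point and then runs the check.
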
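
The yum installation above already includes mfscgiserv, a web server written in Python that listens on port 9425. Start it on the Master Server with the mfscgiserv command; a browser can then be used to monitor all client mounts, the Chunk Servers, the Master Server, and the operations performed by clients.
From a client, open http://192.168.96.22:9425 in a browser to reach the monitoring page.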

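The goal is the number of copies kept of a file. Once a goal has been set, it can be confirmed with the mfsgetgoal command and changed with mfssetgoal.
The actual number of copies can be verified with the mfscheckfile and mfsfileinfo commands.
A summary of an entire directory tree can be displayed with mfsdirinfo, an enhanced equivalent of "du -s".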
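
The commands above could be combined as in the following sketch. The wrapper name `set_and_show_goal` and the example path are hypothetical, and the 1 to 9 range check reflects the goal values mfssetgoal accepts to my knowledge, so treat it as an assumption. With the three Chunk Servers in this setup, a goal above 3 cannot actually be satisfied.

```shell
# Set a goal recursively on a directory and print what MooseFS reports.
# The argument check runs before any MooseFS tool is called.
set_and_show_goal() {
    local goal=$1 path=$2
    case "$goal" in
        [1-9]) ;;
        *) echo "goal must be a single digit from 1 to 9" >&2; return 2 ;;
    esac
    mfssetgoal -r "$goal" "$path" || return 1
    mfsgetgoal "$path"      # configured goal
    mfsdirinfo "$path"      # du -s style summary of the tree
}
```

For example, `set_and_show_goal 3 /mfs/data/important` on the client. For a single file, `mfscheckfile <file>` and `mfsfileinfo <file>` then show how many copies of each chunk really exist.
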
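The most important maintenance task is looking after the metadata server, and its most important directory is /var/lib/mfs/. Every change in MFS (storing, modifying, and updating data) is recorded in files in this directory, so keeping this directory safe is what keeps the whole MFS file system safe and reliable.
The data under /var/lib/mfs/ has two parts: the metadata server's change logs, with names like changelog.*.mfs, and the metadata file metadata.mfs, which is named metadata.mfs.back while mfsmaster is running. As long as these two parts are safe, a new metadata server can be deployed from the backed-up metadata files even if the original one suffers a fatal failure.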
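
One simple way to protect this directory is a periodic archive copied to another host. The sketch below is an illustration rather than the author's procedure: `backup_mfs_meta` and the /backup/mfs destination are hypothetical, and archiving while mfsmaster is running is assumed to be acceptable for a lab setup (tar may warn if a changelog is written during the copy).

```shell
# Archive a MooseFS metadata directory into a timestamped tarball and print
# the path of the archive that was created.
backup_mfs_meta() {
    local src=${1:-/var/lib/mfs}
    local dest=${2:-/backup/mfs}    # hypothetical backup location
    local out
    mkdir -p "$dest" || return 1
    out="${dest}/mfs-meta-$(date +%Y%m%d-%H%M%S).tar.gz"
    tar -czf "$out" -C "$src" . || return 1
    echo "$out"
}
```

Running `backup_mfs_meta` from cron on the Master and copying the result elsewhere gives a second copy alongside the Metalogger. To rebuild, the archive would be unpacked into /var/lib/mfs on the new Master; to my knowledge `mfsmaster -a` then starts the master with automatic metadata recovery from the change logs, but check the mfsmaster man page for your version.
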

Original article: http://blog.51cto.com/10316297/2151829