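How to Set Up HDFS, the Hadoop Distributed File System
This article walks through setting up HDFS, the Hadoop Distributed File System, on a single machine in pseudo-distributed mode, and is shared here as a practical reference.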
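HDFS compared with traditional file systems:
1. It supports very large files.
2. An HDFS block is independent of the blocks of any particular disk, which makes it easy to replicate blocks across machines for fault tolerance.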
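To make the block abstraction concrete, here is a minimal shell sketch of how many blocks a file occupies. It assumes a 128 MiB block size (the Hadoop 2.x default for dfs.blocksize) and a hypothetical 1 GiB file; hdfs_block_count is an illustrative helper, not a Hadoop command.

```shell
#!/bin/sh
# Number of HDFS blocks needed for a file: ceiling of (file size / block size).
# Args: file size in bytes, block size in bytes.
hdfs_block_count() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

BLOCK_SIZE=$((128 * 1024 * 1024))   # assumption: Hadoop 2.x default dfs.blocksize
FILE_SIZE=$((1024 * 1024 * 1024))   # hypothetical 1 GiB file

echo "blocks: $(hdfs_block_count "$FILE_SIZE" "$BLOCK_SIZE")"   # prints "blocks: 8"
```

A file smaller than one block still counts as one block here, although in HDFS it only consumes as much disk space as its actual size.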
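Hadoop node types (master/worker):
Master node (NameNode): manages the file system tree and every file and directory in it. If the NameNode goes down, the whole file system goes down with it.
Worker node (DataNode): stores the actual data and periodically sends the NameNode the list of blocks it holds.
HDFS guards against NameNode failure in two ways: a hot standby on a second machine and periodic backups.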
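Pseudo-distributed deployment:
Hadoop's control scripts (such as start-dfs.sh) reach each node over SSH, so SSH must be configured with an empty passphrase.
SSH is only the transport here. Depending on security requirements, a different mechanism could be used instead, and it could even be rewritten on top of Java sockets.
Configure SSH: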
t@ubuntu:~$ ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
Generating public/private rsa key pair.
Your identification has been saved in /home/t/.ssh/id_rsa.
Your public key has been saved in /home/t/.ssh/id_rsa.pub.
The key fingerprint is:
5c:f9:27:86:a5:88:97:1b:07:fe:3c:95:90:a8:e8:8f t@ubuntu
The key's randomart image is:
[randomart image omitted]
t@ubuntu:~$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
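Running the cat ... >> command more than once appends the same key repeatedly. The following is a minimal sketch of an idempotent variant; authorize_key is a hypothetical helper name, and the permission tightening assumes sshd runs with its usual strict permission checks.

```shell
#!/bin/sh
# Append a public key to an authorized_keys file only if it is not already
# listed, then restrict permissions on the file and its directory.
# Args: public key file, authorized_keys file.
authorize_key() {
    pub_file=$1
    auth_file=$2
    mkdir -p "$(dirname "$auth_file")"
    touch "$auth_file"
    # -x: match whole lines, -F: treat the key as a fixed string
    grep -qxF "$(cat "$pub_file")" "$auth_file" || cat "$pub_file" >> "$auth_file"
    chmod 700 "$(dirname "$auth_file")"
    chmod 600 "$auth_file"
}

# Usage on a real machine (not executed here):
#   [ -f ~/.ssh/id_rsa ] || ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
#   authorize_key ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys
#   ssh localhost true    # should return without asking for a password
```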
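Configuration files
core-site.xml:
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost</value>
  </property>
</configuration>
hdfs-site.xml:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
mapred-site.xml:
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:8021</value>
  </property>
</configuration>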
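The property names above come from Hadoop 1.x. Hadoop 2.x still accepts fs.default.name but reports it as deprecated in favor of fs.defaultFS. The sketch below writes the two HDFS-related files with the newer name. It is only an illustration: write_site_xml and CONF_DIR are hypothetical names, and by default the files go to a temporary directory rather than a real installation. mapred.job.tracker is left out because it configures the MRv1 JobTracker, which YARN replaces in Hadoop 2.x.

```shell
#!/bin/sh
# Write a Hadoop *-site.xml file containing a single property.
# Args: output file, property name, property value.
write_site_xml() {
    cat > "$1" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>$2</name>
    <value>$3</value>
  </property>
</configuration>
EOF
}

# CONF_DIR is an assumption; set it to .../etc/hadoop for a real install.
CONF_DIR=${CONF_DIR:-$(mktemp -d)}
write_site_xml "$CONF_DIR/core-site.xml" fs.defaultFS    hdfs://localhost
write_site_xml "$CONF_DIR/hdfs-site.xml" dfs.replication 1
echo "wrote configuration to $CONF_DIR"
```

A replication factor of 1 only makes sense on a single-node setup like this one, where there is just one DataNode to hold each block.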
Note: recent Hadoop releases (2.6.2 is used here) no longer have a conf directory. The configuration files live directly in
$HADOOP_INSTALL/hadoop-2.6.2/etc/hadoop/
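Format the HDFS file system:
t@ubuntu:~/hadoop/hadoop-2.6.2/etc/hadoop$ hadoop namenode -format
Hadoop 2.x still accepts this form but reports it as deprecated in favor of hdfs namenode -format.
Following the steps in the Definitive Guide book as written produces an error; JAVA_HOME has to be set in hadoop-env.sh first.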
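One way to apply that fix is sketched below. The stock hadoop-env.sh in Hadoop 2.x typically contains export JAVA_HOME=${JAVA_HOME}, which may resolve to nothing when the daemons are launched over SSH. set_java_home is a hypothetical helper, and the JDK path in the usage comment is an assumption that must be adjusted to the local machine.

```shell
#!/bin/sh
# Set JAVA_HOME in a hadoop-env.sh-style file: drop any existing
# "export JAVA_HOME=" lines, then append the new definition.
# Args: path to the env file, JDK installation directory.
set_java_home() {
    env_file=$1
    jdk_dir=$2
    tmp=$(mktemp)
    if [ -f "$env_file" ]; then
        # grep exits non-zero when nothing is left; that is not an error here
        grep -v '^export JAVA_HOME=' "$env_file" > "$tmp" || true
    fi
    echo "export JAVA_HOME=$jdk_dir" >> "$tmp"
    # Overwrite in place so the original file keeps its permissions
    cat "$tmp" > "$env_file"
    rm -f "$tmp"
}

# Usage (hypothetical JDK path, not executed here):
#   set_java_home "$HADOOP_INSTALL/hadoop-2.6.2/etc/hadoop/hadoop-env.sh" \
#       /usr/lib/jvm/java-7-openjdk-amd64
```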
Start the HDFS daemons:
t@ubuntu:~$ start-dfs.sh
View the NameNode web UI at http://ip:50070/ (substitute the host's address for ip).
Stop the HDFS daemons:
t@ubuntu:~$ stop-dfs.sh
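Besides the web UI, the JDK's jps tool gives a quick way to see whether the daemons are running after start-dfs.sh. In pseudo-distributed mode the script normally starts a NameNode, a DataNode and a SecondaryNameNode. The sketch below reads jps output and prints whichever of those is missing; missing_daemons is a hypothetical helper, not part of Hadoop.

```shell
#!/bin/sh
# Read `jps` output ("<pid> <ClassName>" per line) on stdin and print the
# name of each expected HDFS daemon that does not appear in it.
missing_daemons() {
    jps_out=$(cat)
    for d in NameNode DataNode SecondaryNameNode; do
        # Compare the second column exactly so that "NameNode" does not
        # match "SecondaryNameNode".
        echo "$jps_out" | awk -v d="$d" '$2 == d { found = 1 } END { exit !found }' \
            || echo "$d"
    done
}

# Usage on a machine with a JDK and a started cluster (not executed here):
#   jps | missing_daemons
```

Empty output means all three daemons were found.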
Run a Hadoop program that prints a file stored in HDFS:
t@ubuntu:~/hadoop/ex$ hadoop jar hadoop-urlCat.jar hdfs://localhost/testHadoop.txt URLCat output1
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/t/hadoop/hadoop-2.6.2/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/t/hadoop/ex/hadoop-examples.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
haddop测试文件
The final line is the content of testHadoop.txt ("haddop test file"). The SLF4J lines are warnings about duplicate logging bindings on the classpath and do not affect the result.
Basic Hadoop file commands:
t@ubuntu:~/hadoop/ex$ hadoop fs
Usage: hadoop fs [generic options]
[-appendToFile... ]
[-cat [-ignoreCrc] ...]
[-checksum ...]
[-chgrp [-R] GROUP PATH...]
[-chmod [-R] PATH...]
[-chown [-R] [OWNER][:[GROUP]] PATH...]
[-copyFromLocal [-f] [-p] [-l] ... ]
[-copyToLocal [-p] [-ignoreCrc] [-crc] ... ]
[-count [-q] [-h] ...]
[-cp [-f] [-p | -p[topax]] ... ]
[-createSnapshot [ ]]
[-deleteSnapshot ]
[-df [-h] [ ...]]
[-du [-s] [-h] ...]
[-expunge]
[-get [-p] [-ignoreCrc] [-crc] ... ]
[-getfacl [-R] ]
[-getfattr [-R] {-n name | -d} [-e en] ]
[-getmerge [-nl] ]
[-help [cmd ...]]
[-ls [-d] [-h] [-R] [ ...]]
[-mkdir [-p] ...]
[-moveFromLocal ... ]
[-moveToLocal ]
[-mv ... ]
[-put [-f] [-p] [-l] ... ]
[-renameSnapshot ]
[-rm [-f] [-r|-R] [-skipTrash] ...]
[-rmdir [--ignore-fail-on-non-empty] ...]
[-setfacl [-R] [{-b|-k} {-m|-x } ]|[--set ]]
[-setfattr {-n name [-v value] | -x name} ]
[-setrep [-R] [-w] ...]
[-stat [format] ...]
[-tail [-f] ]
[-test -[defsz] ]
[-text [-ignoreCrc] ...]
[-touchz ...]
[-usage [cmd ...]]
Generic options supported are
-conf specify an application configuration file
-D use value for given property
-fs specify a namenode
-jt specify a ResourceManager
-files specify comma separated files to be copied to the map reduce cluster
-libjars specify comma separated jar files to include in the classpath.
-archives specify comma separated archives to be unarchived on the compute machines.
The general command line syntax is
bin/hadoop command [genericOptions] [commandOptions]
t@ubuntu:~/hadoop/ex$ hadoop fs -ls /
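A few of these subcommands cover most day-to-day use. The sketch below chains -mkdir, -put, -ls and -cat into a simple round trip. hdfs_roundtrip, the RUN_HDFS_DEMO switch, hello.txt and the /tmp/demo HDFS path are all hypothetical names chosen for illustration, and the demo only runs when a hadoop command is on the PATH and the switch is set.

```shell
#!/bin/sh
# Upload a local file to an HDFS directory, list the directory, and print
# the uploaded file back. Stops at the first command that fails.
# Args: local file, HDFS directory.
hdfs_roundtrip() {
    local_file=$1
    hdfs_dir=$2
    name=$(basename "$local_file")
    hadoop fs -mkdir -p "$hdfs_dir" &&
    hadoop fs -put -f "$local_file" "$hdfs_dir/" &&
    hadoop fs -ls "$hdfs_dir" &&
    hadoop fs -cat "$hdfs_dir/$name"
}

# Run only against a live cluster, and only when explicitly requested.
if command -v hadoop >/dev/null 2>&1 && [ -n "${RUN_HDFS_DEMO:-}" ]; then
    echo "hello hdfs" > ./hello.txt
    hdfs_roundtrip ./hello.txt /tmp/demo
fi
```

The -f flag on -put overwrites an existing file of the same name, which keeps the sketch repeatable.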