hdfs dfs -ls returns "no such file or directory"

Once you have HDFS set up, the daemons started (start-dfs.sh and start-yarn.sh), and you try to list the current directory, you may get the error "no such file or directory". One possible cause is that the current user has no home directory under /user (and /user itself may not exist yet either). First check whether the following command works.

hdfs dfs -ls /

If you get neither an error nor any output, you do not have any directories or files in HDFS yet. Try this command.
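A minimal sketch of creating the missing home directory, assuming HDFS home directories live under /user (the usual default) and that the current user is allowed to create directories there. The helper name hdfs_home_dir is hypothetical; the hdfs calls run only when the hdfs command is on the PATH.

```shell
# Hypothetical helper: print the HDFS home directory of a user
# (defaults to the current local user).
hdfs_home_dir() {
  printf '/user/%s\n' "${1:-$(id -un)}"
}

# Run the HDFS commands only when the hdfs CLI is available.
if command -v hdfs >/dev/null 2>&1; then
  # -p also creates /user itself if it does not exist yet.
  hdfs dfs -mkdir -p "$(hdfs_home_dir)"
  # Without a path, -ls lists the user's home directory.
  hdfs dfs -ls
fi
```

With the home directory in place, a plain hdfs dfs -ls should return an empty listing instead of the error.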


Getting Started with Hadoop: Installation

In this post I will show how to install Hadoop 2.9. If you go looking for information about Hadoop, watch out for versions. In my case, the first search result for information about HDFS (Hadoop Distributed File System) pointed to version 1, and it took me a while to notice.

Installation is described on the project website. The guide for the current stable version is here. I tried that procedure, but without much success, and ended up with the error "Incorrect configuration: namenode address dfs.namenode.servicerpc-address or dfs.namenode.rpc-address is not configured.". However, I found the site data-flair.training, which has a very detailed, step-by-step Hadoop installation guide (see here). It explains everything from installing Java 8 and configuring ssh through to starting Hadoop. I have nothing to criticize about the tutorial. Here I will show only the final steps.
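A frequently reported cause of the "namenode address ... is not configured" error is that fs.defaultFS in core-site.xml is missing or does not point to an hdfs:// address. This is an assumption, not something verified for the setup described here. The sketch below only inspects the effective value with hdfs getconf. The helper name is_hdfs_uri and the example address hdfs://localhost:9000 are illustrative assumptions.

```shell
# Hypothetical helper: succeed only if the value looks like an hdfs:// URI.
is_hdfs_uri() {
  case "$1" in
    hdfs://?*) return 0 ;;
    *) return 1 ;;
  esac
}

# Query the effective configuration only when the hdfs CLI is available.
if command -v hdfs >/dev/null 2>&1; then
  default_fs="$(hdfs getconf -confKey fs.defaultFS)"
  if is_hdfs_uri "$default_fs"; then
    echo "fs.defaultFS looks fine: $default_fs"
  else
    # hdfs://localhost:9000 is only an example value.
    echo "fs.defaultFS is '$default_fs'; expected something like hdfs://localhost:9000" >&2
  fi
fi
```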

Once all the configuration files are set up, the NameNode has to be formatted (hdfs namenode -format).

vitfo@vitfo-VirtualBox:~/hadoop/hadoop-2.9.0$ hdfs namenode -format
18/01/26 13:18:17 INFO namenode.NameNode: STARTUP_MSG: 
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = vitfo-VirtualBox/127.0.1.1
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 2.9.0
STARTUP_MSG:   classpath = /home/vitfo/hadoop/hadoop-2.9.0/etc/hadoop:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/gson-2.2.4.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/jackson-xc-1.9.13.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/hamcrest-core-1.3.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/commons-math3-3.1.1.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/log4j-1.2.17.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/stax2-api-3.1.4.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/curator-client-2.7.1.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/junit-4.11.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/mockito-all-1.8.5.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/httpclient-4.5.2.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/commons-configuration-1.6.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/jetty-sslengine-6.1.26.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/stax-api-1.0-2.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/jackson-mapper-asl-1.9.13.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/commons-cli-1.2.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/jets3t-0.9.0.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/curator-recipes-2.7.1.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/jersey-server-1.9.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/apacheds-i18n-2.0.0-M15.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/jersey-json-1.9.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/servlet-api-2.5.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/jersey-core-1.9.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/httpcore-4.4.4.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/commons-lang3-3.4.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/commons
-logging-1.1.3.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/commons-beanutils-core-1.8.0.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/activation-1.1.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/jetty-util-6.1.26.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/curator-framework-2.7.1.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/jaxb-api-2.2.2.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/jettison-1.1.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/api-asn1-api-1.0.0-M20.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/hadoop-annotations-2.9.0.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/commons-collections-3.2.2.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/avro-1.7.7.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/api-util-1.0.0-M20.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/woodstox-core-5.0.3.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/zookeeper-3.4.6.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/java-xmlbuilder-0.4.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/netty-3.6.2.Final.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/jackson-core-asl-1.9.13.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/protobuf-java-2.5.0.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/commons-io-2.4.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/asm-3.2.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/hadoop-auth-2.9.0.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/nimbus-jose-jwt-3.9.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/commons-codec-1.4.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/jaxb-impl-2.2.3-1.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/commons-compress-1.4.1.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/apacheds-kerberos-codec-2.0.0
-M15.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/commons-net-3.1.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/commons-beanutils-1.7.0.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/jetty-6.1.26.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/paranamer-2.3.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/xz-1.0.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/jcip-annotations-1.0.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/snappy-java-1.0.5.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/guava-11.0.2.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/jsp-api-2.1.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/commons-digester-1.8.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/commons-lang-2.6.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/slf4j-api-1.7.25.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/jsch-0.1.54.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/jsr305-3.0.0.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/json-smart-1.1.1.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/xmlenc-0.52.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/jackson-jaxrs-1.9.13.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/lib/htrace-core4-4.1.0-incubating.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/hadoop-common-2.9.0.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/hadoop-nfs-2.9.0.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/common/hadoop-common-2.9.0-tests.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/hdfs:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/hdfs/lib/jackson-core-2.7.8.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/hdfs/lib/log4j-1.2.17.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/hdfs/lib/okhttp-2.4.0.jar:/home/vitfo/hadoop/hadoop-2.9.0/
share/hadoop/hdfs/lib/xercesImpl-2.9.1.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/hdfs/lib/jackson-mapper-asl-1.9.13.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/hdfs/lib/commons-cli-1.2.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/hdfs/lib/jersey-server-1.9.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/hdfs/lib/servlet-api-2.5.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/hdfs/lib/jersey-core-1.9.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/hdfs/lib/commons-logging-1.1.3.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/hdfs/lib/jetty-util-6.1.26.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/hdfs/lib/commons-daemon-1.0.13.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/hdfs/lib/xml-apis-1.3.04.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/hdfs/lib/okio-1.4.0.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/hdfs/lib/jackson-databind-2.7.8.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/hdfs/lib/leveldbjni-all-1.8.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/hdfs/lib/netty-3.6.2.Final.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/hdfs/lib/jackson-core-asl-1.9.13.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/hdfs/lib/protobuf-java-2.5.0.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/hdfs/lib/jackson-annotations-2.7.8.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/hdfs/lib/commons-io-2.4.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/hdfs/lib/asm-3.2.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/hdfs/lib/netty-all-4.0.23.Final.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/hdfs/lib/commons-codec-1.4.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/hdfs/lib/jetty-6.1.26.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/hdfs/lib/hadoop-hdfs-client-2.9.0.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/hdfs/lib/guava-11.0.2.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/hdfs/lib/commons-lang-2.6.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/hdfs/lib/jsr305-3.0.0.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/hdfs/l
ib/xmlenc-0.52.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/hdfs/lib/htrace-core4-4.1.0-incubating.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/hdfs/hadoop-hdfs-native-client-2.9.0-tests.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/hdfs/hadoop-hdfs-nfs-2.9.0.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/hdfs/hadoop-hdfs-client-2.9.0-tests.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/hdfs/hadoop-hdfs-native-client-2.9.0.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/hdfs/hadoop-hdfs-2.9.0-tests.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/hdfs/hadoop-hdfs-client-2.9.0.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/hdfs/hadoop-hdfs-2.9.0.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/gson-2.2.4.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/aopalliance-1.0.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/fst-2.50.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/jackson-xc-1.9.13.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/guice-servlet-3.0.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/commons-math3-3.1.1.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/log4j-1.2.17.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/jersey-client-1.9.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/HikariCP-java7-2.4.12.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/stax2-api-3.1.4.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/curator-client-2.7.1.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/httpclient-4.5.2.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/commons-configuration-1.6.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/jetty-sslengine-6.1.26.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/stax-api-1.0-2.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/jackson-mapper-asl-1.9.13.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/comm
ons-cli-1.2.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/jets3t-0.9.0.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/curator-recipes-2.7.1.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/jersey-server-1.9.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/commons-math-2.2.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/guice-3.0.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/apacheds-i18n-2.0.0-M15.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/jersey-json-1.9.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/servlet-api-2.5.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/jersey-core-1.9.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/httpcore-4.4.4.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/commons-lang3-3.4.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/commons-logging-1.1.3.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/commons-beanutils-core-1.8.0.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/activation-1.1.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/ehcache-3.3.1.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/jetty-util-6.1.26.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/curator-framework-2.7.1.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/jaxb-api-2.2.2.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/mssql-jdbc-6.2.1.jre7.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/jettison-1.1.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/curator-test-2.7.1.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/api-asn1-api-1.0.0-M20.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/json-io-2.5.1.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/javax.inject-1.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/commons-collections-3.2.2.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/java-util-1.9.0.jar:/home/
vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/metrics-core-3.0.1.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/avro-1.7.7.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/api-util-1.0.0-M20.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/leveldbjni-all-1.8.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/woodstox-core-5.0.3.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/jersey-guice-1.9.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/zookeeper-3.4.6.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/java-xmlbuilder-0.4.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/netty-3.6.2.Final.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/jackson-core-asl-1.9.13.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/protobuf-java-2.5.0.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/commons-io-2.4.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/asm-3.2.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/geronimo-jcache_1.0_spec-1.0-alpha-1.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/nimbus-jose-jwt-3.9.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/commons-codec-1.4.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/jaxb-impl-2.2.3-1.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/commons-compress-1.4.1.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/apacheds-kerberos-codec-2.0.0-M15.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/commons-net-3.1.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/commons-beanutils-1.7.0.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/jetty-6.1.26.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/paranamer-2.3.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/zookeeper-3.4.6-tests.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/xz-1.0.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/jcip-annotations-1.0.jar:/home
/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/snappy-java-1.0.5.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/guava-11.0.2.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/jsp-api-2.1.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/commons-digester-1.8.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/commons-lang-2.6.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/jsch-0.1.54.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/jsr305-3.0.0.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/javassist-3.18.1-GA.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/json-smart-1.1.1.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/xmlenc-0.52.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/jackson-jaxrs-1.9.13.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/lib/htrace-core4-4.1.0-incubating.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/hadoop-yarn-server-timeline-pluginstorage-2.9.0.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/hadoop-yarn-server-sharedcachemanager-2.9.0.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/hadoop-yarn-api-2.9.0.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/hadoop-yarn-server-router-2.9.0.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.9.0.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/hadoop-yarn-client-2.9.0.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/hadoop-yarn-registry-2.9.0.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/hadoop-yarn-applications-unmanaged-am-launcher-2.9.0.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/hadoop-yarn-server-common-2.9.0.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/hadoop-yarn-server-nodemanager-2.9.0.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/hadoop-yarn-server-web-proxy-2.9.0.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/hadoop-yarn-server-resourcemanager-2.9.0.jar:/home/vitfo
/hadoop/hadoop-2.9.0/share/hadoop/yarn/hadoop-yarn-common-2.9.0.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/hadoop-yarn-server-tests-2.9.0.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/yarn/hadoop-yarn-server-applicationhistoryservice-2.9.0.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/mapreduce/lib/aopalliance-1.0.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/mapreduce/lib/hamcrest-core-1.3.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/mapreduce/lib/guice-servlet-3.0.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/mapreduce/lib/log4j-1.2.17.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/mapreduce/lib/junit-4.11.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/mapreduce/lib/jackson-mapper-asl-1.9.13.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/mapreduce/lib/jersey-server-1.9.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/mapreduce/lib/guice-3.0.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/mapreduce/lib/jersey-core-1.9.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/mapreduce/lib/javax.inject-1.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/mapreduce/lib/hadoop-annotations-2.9.0.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/mapreduce/lib/avro-1.7.7.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/mapreduce/lib/leveldbjni-all-1.8.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/mapreduce/lib/jersey-guice-1.9.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/mapreduce/lib/netty-3.6.2.Final.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/mapreduce/lib/jackson-core-asl-1.9.13.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/mapreduce/lib/protobuf-java-2.5.0.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/mapreduce/lib/commons-io-2.4.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/mapreduce/lib/asm-3.2.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/mapreduce/lib/commons-compress-1.4.1.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/mapreduce/lib/paranamer-2.3.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/mapreduce/lib/xz-1.
0.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/mapreduce/lib/snappy-java-1.0.5.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/mapreduce/hadoop-mapreduce-client-app-2.9.0.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.9.0-tests.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/mapreduce/hadoop-mapreduce-client-shuffle-2.9.0.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/mapreduce/hadoop-mapreduce-client-hs-2.9.0.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.9.0.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/mapreduce/hadoop-mapreduce-client-hs-plugins-2.9.0.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/mapreduce/hadoop-mapreduce-client-common-2.9.0.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.9.0.jar:/home/vitfo/hadoop/hadoop-2.9.0/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.9.0.jar:/contrib/capacity-scheduler/*.jar
STARTUP_MSG:   build = https://git-wip-us.apache.org/repos/asf/hadoop.git -r 756ebc8394e473ac25feac05fa493f6d612e6c50; compiled by 'arsuresh' on 2017-11-13T23:15Z
STARTUP_MSG:   java = 1.8.0_151
************************************************************/
18/01/26 13:18:17 INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
18/01/26 13:18:17 INFO namenode.NameNode: createNameNode [-format]
Formatting using clusterid: CID-58d73ffb-2447-4142-ab4e-6fe6769518dd
18/01/26 13:18:18 INFO namenode.FSEditLog: Edit logging is async:true
18/01/26 13:18:18 INFO namenode.FSNamesystem: KeyProvider: null
18/01/26 13:18:18 INFO namenode.FSNamesystem: fsLock is fair: true
18/01/26 13:18:18 INFO namenode.FSNamesystem: Detailed lock hold time metrics enabled: false
18/01/26 13:18:18 INFO namenode.FSNamesystem: fsOwner             = vitfo (auth:SIMPLE)
18/01/26 13:18:18 INFO namenode.FSNamesystem: supergroup          = supergroup
18/01/26 13:18:18 INFO namenode.FSNamesystem: isPermissionEnabled = true
18/01/26 13:18:18 INFO namenode.FSNamesystem: HA Enabled: false
18/01/26 13:18:18 INFO common.Util: dfs.datanode.fileio.profiling.sampling.percentage set to 0. Disabling file IO profiling
18/01/26 13:18:18 INFO blockmanagement.DatanodeManager: dfs.block.invalidate.limit: configured=1000, counted=60, effected=1000
18/01/26 13:18:18 INFO blockmanagement.DatanodeManager: dfs.namenode.datanode.registration.ip-hostname-check=true
18/01/26 13:18:18 INFO blockmanagement.BlockManager: dfs.namenode.startup.delay.block.deletion.sec is set to 000:00:00:00.000
18/01/26 13:18:18 INFO blockmanagement.BlockManager: The block deletion will start around 2018 Jan 26 13:18:18
18/01/26 13:18:18 INFO util.GSet: Computing capacity for map BlocksMap
18/01/26 13:18:18 INFO util.GSet: VM type       = 64-bit
18/01/26 13:18:18 INFO util.GSet: 2.0% max memory 966.7 MB = 19.3 MB
18/01/26 13:18:18 INFO util.GSet: capacity      = 2^21 = 2097152 entries
18/01/26 13:18:18 INFO blockmanagement.BlockManager: dfs.block.access.token.enable=false
18/01/26 13:18:18 WARN conf.Configuration: No unit for dfs.namenode.safemode.extension(30000) assuming MILLISECONDS
18/01/26 13:18:18 INFO blockmanagement.BlockManagerSafeMode: dfs.namenode.safemode.threshold-pct = 0.9990000128746033
18/01/26 13:18:18 INFO blockmanagement.BlockManagerSafeMode: dfs.namenode.safemode.min.datanodes = 0
18/01/26 13:18:18 INFO blockmanagement.BlockManagerSafeMode: dfs.namenode.safemode.extension = 30000
18/01/26 13:18:18 INFO blockmanagement.BlockManager: defaultReplication         = 1
18/01/26 13:18:18 INFO blockmanagement.BlockManager: maxReplication             = 512
18/01/26 13:18:18 INFO blockmanagement.BlockManager: minReplication             = 1
18/01/26 13:18:18 INFO blockmanagement.BlockManager: maxReplicationStreams      = 2
18/01/26 13:18:18 INFO blockmanagement.BlockManager: replicationRecheckInterval = 3000
18/01/26 13:18:18 INFO blockmanagement.BlockManager: encryptDataTransfer        = false
18/01/26 13:18:18 INFO blockmanagement.BlockManager: maxNumBlocksToLog          = 1000
18/01/26 13:18:18 INFO namenode.FSNamesystem: Append Enabled: true
18/01/26 13:18:19 INFO util.GSet: Computing capacity for map INodeMap
18/01/26 13:18:19 INFO util.GSet: VM type       = 64-bit
18/01/26 13:18:19 INFO util.GSet: 1.0% max memory 966.7 MB = 9.7 MB
18/01/26 13:18:19 INFO util.GSet: capacity      = 2^20 = 1048576 entries
18/01/26 13:18:19 INFO namenode.FSDirectory: ACLs enabled? false
18/01/26 13:18:19 INFO namenode.FSDirectory: XAttrs enabled? true
18/01/26 13:18:19 INFO namenode.NameNode: Caching file names occurring more than 10 times
18/01/26 13:18:19 INFO snapshot.SnapshotManager: Loaded config captureOpenFiles: falseskipCaptureAccessTimeOnlyChange: false
18/01/26 13:18:19 INFO util.GSet: Computing capacity for map cachedBlocks
18/01/26 13:18:19 INFO util.GSet: VM type       = 64-bit
18/01/26 13:18:19 INFO util.GSet: 0.25% max memory 966.7 MB = 2.4 MB
18/01/26 13:18:19 INFO util.GSet: capacity      = 2^18 = 262144 entries
18/01/26 13:18:19 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.window.num.buckets = 10
18/01/26 13:18:19 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.num.users = 10
18/01/26 13:18:19 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.windows.minutes = 1,5,25
18/01/26 13:18:19 INFO namenode.FSNamesystem: Retry cache on namenode is enabled
18/01/26 13:18:19 INFO namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis
18/01/26 13:18:19 INFO util.GSet: Computing capacity for map NameNodeRetryCache
18/01/26 13:18:19 INFO util.GSet: VM type       = 64-bit
18/01/26 13:18:19 INFO util.GSet: 0.029999999329447746% max memory 966.7 MB = 297.0 KB
18/01/26 13:18:19 INFO util.GSet: capacity      = 2^15 = 32768 entries
18/01/26 13:18:19 INFO namenode.FSImage: Allocated new BlockPoolId: BP-284323615-127.0.1.1-1516969099775
18/01/26 13:18:19 INFO common.Storage: Storage directory /home/vitfo/hadoop/hdata/dfs/name has been successfully formatted.
18/01/26 13:18:19 INFO namenode.FSImageFormatProtobuf: Saving image file /home/vitfo/hadoop/hdata/dfs/name/current/fsimage.ckpt_0000000000000000000 using no compression
18/01/26 13:18:20 INFO namenode.FSImageFormatProtobuf: Image file /home/vitfo/hadoop/hdata/dfs/name/current/fsimage.ckpt_0000000000000000000 of size 322 bytes saved in 0 seconds.
18/01/26 13:18:20 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
18/01/26 13:18:20 INFO namenode.NameNode: SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at vitfo-VirtualBox/127.0.1.1
************************************************************/
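Formatting initializes the NameNode metadata, so running it again on a directory that is already in use discards the existing file system metadata. A cautious sketch that formats only when the storage directory has not been formatted yet, assuming that a formatted directory contains current/VERSION. The helper name namenode_is_formatted is hypothetical, and the default path mirrors the one in the log above and has to match your dfs.namenode.name.dir.

```shell
# Hypothetical helper: a formatted NameNode directory contains current/VERSION.
namenode_is_formatted() {
  [ -f "$1/current/VERSION" ]
}

# Default mirrors the storage directory from the log above; override as needed.
NAME_DIR="${NAME_DIR:-$HOME/hadoop/hdata/dfs/name}"

if namenode_is_formatted "$NAME_DIR"; then
  echo "NameNode directory already formatted, skipping: $NAME_DIR"
elif command -v hdfs >/dev/null 2>&1; then
  hdfs namenode -format
fi
```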

Unleashing 🙂 the HDFS and YARN daemons.

vitfo@vitfo-VirtualBox:~/hadoop/hadoop-2.9.0$ start-dfs.sh 
Starting namenodes on [localhost]
localhost: starting namenode, logging to /home/vitfo/hadoop/hadoop-2.9.0/logs/hadoop-vitfo-namenode-vitfo-VirtualBox.out
localhost: starting datanode, logging to /home/vitfo/hadoop/hadoop-2.9.0/logs/hadoop-vitfo-datanode-vitfo-VirtualBox.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /home/vitfo/hadoop/hadoop-2.9.0/logs/hadoop-vitfo-secondarynamenode-vitfo-VirtualBox.out

vitfo@vitfo-VirtualBox:~/hadoop/hadoop-2.9.0$ start-yarn.sh 
starting yarn daemons
starting resourcemanager, logging to /home/vitfo/hadoop/hadoop-2.9.0/logs/yarn-vitfo-resourcemanager-vitfo-VirtualBox.out
localhost: starting nodemanager, logging to /home/vitfo/hadoop/hadoop-2.9.0/logs/yarn-vitfo-nodemanager-vitfo-VirtualBox.out

Checking the services. jps is the Java Virtual Machine Process Status Tool. Hadoop is written in Java, so its daemons run as Java processes, and jps lists the current user's Java processes. In pseudo-distributed mode (the setup used above, where everything runs on a single machine), each daemon runs in its own JVM and jps shows them all. The -l switch gives a long listing with the full name of each main class.

vitfo@vitfo-VirtualBox:~/hadoop/hadoop-2.9.0$ jps
3056 DataNode
3412 ResourceManager
3525 NodeManager
3256 SecondaryNameNode
3834 Jps
2941 NameNode
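The same check can be scripted. The sketch below reads jps output and prints the names of expected daemons that are missing. The list of five daemons is taken from the listing above and applies to this pseudo-distributed setup only. The helper name missing_daemons is hypothetical.

```shell
# Hypothetical helper: read `jps` output on stdin, print each missing daemon.
missing_daemons() {
  jps_output="$(cat)"
  for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
    # Compare the whole second field so that NameNode does not
    # accidentally match SecondaryNameNode.
    if ! printf '%s\n' "$jps_output" |
        awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
      printf '%s\n' "$daemon"
    fi
  done
}

# Run the check only when jps is available.
if command -v jps >/dev/null 2>&1; then
  missing="$(jps | missing_daemons)"
  if [ -n "$missing" ]; then
    echo "Not running: $missing" >&2
  fi
fi
```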

Hadoop also provides a web interface with information about HDFS at localhost:50070.

[Image: hadoop-web-overview]

[Image: hadoop-web-datanodes]

Stopping YARN and DFS

vitfo@vitfo-VirtualBox:~/hadoop/hadoop-2.9.0$ stop-yarn.sh 
stopping yarn daemons
stopping resourcemanager
localhost: stopping nodemanager
localhost: nodemanager did not stop gracefully after 5 seconds: killing with kill -9
no proxyserver to stop

vitfo@vitfo-VirtualBox:~/hadoop/hadoop-2.9.0$ stop-dfs.sh 
Stopping namenodes on [localhost]
localhost: stopping namenode
localhost: stopping datanode
Stopping secondary namenodes [0.0.0.0]
0.0.0.0: stopping secondarynamenode

Getting Started with Hadoop: What Is HDFS

The Hadoop Distributed File System (HDFS) is a distributed file system designed to run on commodity hardware. Because it is distributed (it runs on a cluster) and is meant for commonly available hardware, great emphasis is placed on coping with the failures that can occur (network failures, hardware failures). It treats hardware failure as the norm rather than the exception :-). HDFS is written in Java.

Applications that use HDFS are expected to work with large amounts of data. A typical file in HDFS is gigabytes to terabytes in size, so HDFS is geared toward supporting large files. Files in HDFS follow a write-once-read-many access model: a file, once created, is not expected to change (the exceptions are append and truncate), but it will be read many times. Another assumption is that bringing the computation closer to the data is far more efficient than moving the data to the application that performs the computation (moving computation is cheaper than moving data).


Hadoop, getmerge and the .crc file

The getmerge command (hdfs dfs -getmerge source-directory target-file) in Hadoop takes all files in the given directory and merges them into a single local file. It is similar to the get command (hdfs dfs -get source-file target-file), which copies a single file. Unlike get, however, it also creates a hidden file named .target-file-name.crc. The .crc file contains a checksum.
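A small sketch of how the hidden checksum file can be located after a merge, assuming it is written next to the target file under the name described above. The helper name crc_sidecar_path, the HDFS directory and the target file name are hypothetical.

```shell
# Hypothetical helper: path of the hidden .crc file for a local target file.
crc_sidecar_path() {
  printf '%s/.%s.crc\n' "$(dirname "$1")" "$(basename "$1")"
}

SRC_DIR="/user/$(id -un)/output"   # hypothetical HDFS directory
TARGET="./merged.txt"              # hypothetical local target file

# Run the merge only when the hdfs CLI is available.
if command -v hdfs >/dev/null 2>&1; then
  hdfs dfs -getmerge "$SRC_DIR" "$TARGET"
  crc="$(crc_sidecar_path "$TARGET")"
  # Show the checksum file only if it was actually created.
  if [ -f "$crc" ]; then
    ls -l "$crc"
  fi
fi
```

Because the name starts with a dot, a plain ls in the target directory does not show the file; ls -a does.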

This behavior is reported in issue ticket HADOOP-12643. What caught my attention about the ticket is that it was created on 15 December 2015 at 18:23, a comment was added two minutes later the same day, and then nothing. The ticket is still in the OPEN state (for more than two years now).

[Image: hadoop-getmerge.crc-secret-file]