Configuring HDFS and MapReduce on Hadoop 2.x to Run the WordCount Program

2024-06-14 06:18

This article describes how to configure HDFS and MapReduce (on YARN) in Hadoop 2.x and run the bundled WordCount example on a small four-node cluster.

Host    HDFS                            MapReduce (YARN)
node1   NameNode                        ResourceManager
node2   SecondaryNameNode & DataNode    NodeManager
node3   DataNode                        NodeManager
node4   DataNode                        NodeManager

1. Configure hadoop-env.sh

export JAVA_HOME=/csh/link/jdk

2. Configure core-site.xml

<property>
    <name>fs.defaultFS</name>
    <value>hdfs://node1:9000</value>
</property>
<property>
    <name>hadoop.tmp.dir</name>
    <value>/csh/hadoop/hadoop2.7.2/tmp</value>
</property>
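Every file below follows the same <property>/<name>/<value> layout inside a <configuration> root. As a quick local sanity check (an illustrative sketch, not part of the Hadoop setup itself), such a fragment can be parsed with Python's standard library; the hostname and paths are the values from this article:

```python
import xml.etree.ElementTree as ET

# The core-site.xml fragment from above, wrapped in its <configuration> root.
fragment = """
<configuration>
  <property><name>fs.defaultFS</name><value>hdfs://node1:9000</value></property>
  <property><name>hadoop.tmp.dir</name><value>/csh/hadoop/hadoop2.7.2/tmp</value></property>
</configuration>
"""

# Build a {name: value} dict from the <property> elements.
conf = {p.findtext("name"): p.findtext("value")
        for p in ET.fromstring(fragment).iter("property")}

print(conf["fs.defaultFS"])  # → hdfs://node1:9000, the NameNode RPC address
```

Catching a typo in a property name this way is cheaper than debugging a cluster that silently falls back to defaults.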

3. Configure hdfs-site.xml

<property>
    <name>dfs.namenode.http-address</name>
    <value>node1:50070</value>
</property>
<property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>node2:50090</value>
</property>
<property>
    <name>dfs.namenode.name.dir</name>
    <value>/csh/hadoop/hadoop2.7.2/name</value>
</property>
<property>
    <name>dfs.datanode.data.dir</name>
    <value>/csh/hadoop/hadoop2.7.2/data</value>
</property>
<property>
    <name>dfs.replication</name>
    <value>3</value>
</property>

4. Configure mapred-site.xml

<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>

5. Configure yarn-site.xml

<property>
    <name>yarn.resourcemanager.hostname</name>
    <value>node1</value>
</property>
<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>
<property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>

6. Configure masters (the SecondaryNameNode host)

node2

7. Configure slaves (the DataNode/NodeManager hosts, one per line)

node2
node3
node4

8. Start Hadoop

bin/hdfs namenode -format
sbin/start-dfs.sh
sbin/start-yarn.sh
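After these commands, running `jps` on each host should show the daemons from the role table at the top of this article. As a reference checklist (a sketch derived from that table, not output from a real cluster), the expected processes per host are:

```python
# Expected Java daemons per host after start-dfs.sh and start-yarn.sh,
# derived from the role table at the top of this article.
HDFS_ROLES = {
    "node1": ["NameNode"],
    "node2": ["SecondaryNameNode", "DataNode"],
    "node3": ["DataNode"],
    "node4": ["DataNode"],
}
YARN_ROLES = {
    "node1": ["ResourceManager"],
    "node2": ["NodeManager"],
    "node3": ["NodeManager"],
    "node4": ["NodeManager"],
}

for host in sorted(HDFS_ROLES):
    # Compare this line against the output of `jps` on that host.
    print(host, ", ".join(HDFS_ROLES[host] + YARN_ROLES[host]))
```

Note that the three DataNodes match dfs.replication=3, so every block of wc.txt will end up on all three of them.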

9. Run the WordCount program

# create the input file wc.txt
echo "I love Java I love Hadoop I love BigData Good Good Study, Day Day Up" > wc.txt
# create the input directory in HDFS
hdfs dfs -mkdir -p /input/wordcount/
# upload wc.txt to HDFS
hdfs dfs -put wc.txt /input/wordcount
# run the WordCount example
hadoop jar /csh/software/hadoop-2.7.2/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar wordcount /input/wordcount/ /output/wordcount/

10. Results

[root@node1 sbin]# hadoop jar /csh/software/hadoop-2.7.2/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar wordcount /input/wordcount/ /output/wordcount/
16/03/24 19:26:48 INFO client.RMProxy: Connecting to ResourceManager at node1/192.161.11:8032
16/03/24 19:26:56 INFO input.FileInputFormat: Total input paths to process : 1
16/03/24 19:26:56 INFO mapreduce.JobSubmitter: number of splits:1
16/03/24 19:26:57 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1458872237175_0001
16/03/24 19:26:59 INFO impl.YarnClientImpl: Submitted application application_1458872237175_0001
16/03/24 19:27:00 INFO mapreduce.Job: The url to track the job: http://node1:8088/proxy/application_1458872237175_0001/
16/03/24 19:27:00 INFO mapreduce.Job: Running job: job_1458872237175_0001
16/03/24 19:28:13 INFO mapreduce.Job: Job job_1458872237175_0001 running in uber mode : false
16/03/24 19:28:13 INFO mapreduce.Job:  map 0% reduce 0%
16/03/24 19:30:07 INFO mapreduce.Job:  map 100% reduce 0%
16/03/24 19:31:13 INFO mapreduce.Job:  map 100% reduce 33%
16/03/24 19:31:16 INFO mapreduce.Job:  map 100% reduce 100%
16/03/24 19:31:23 INFO mapreduce.Job: Job job_1458872237175_0001 completed successfully
16/03/24 19:31:24 INFO mapreduce.Job: Counters: 49
	File System Counters
		FILE: Number of bytes read=106
		FILE: Number of bytes written=235387
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
		HDFS: Number of bytes read=174
		HDFS: Number of bytes written=64
		HDFS: Number of read operations=6
		HDFS: Number of large read operations=0
		HDFS: Number of write operations=2
	Job Counters
		Launched map tasks=1
		Launched reduce tasks=1
		Data-local map tasks=1
		Total time spent by all maps in occupied slots (ms)=116501
		Total time spent by all reduces in occupied slots (ms)=53945
		Total time spent by all map tasks (ms)=116501
		Total time spent by all reduce tasks (ms)=53945
		Total vcore-milliseconds taken by all map tasks=116501
		Total vcore-milliseconds taken by all reduce tasks=53945
		Total megabyte-milliseconds taken by all map tasks=119297024
		Total megabyte-milliseconds taken by all reduce tasks=55239680
	Map-Reduce Framework
		Map input records=4
		Map output records=15
		Map output bytes=129
		Map output materialized bytes=106
		Input split bytes=105
		Combine input records=15
		Combine output records=9
		Reduce input groups=9
		Reduce shuffle bytes=106
		Reduce input records=9
		Reduce output records=9
		Spilled Records=18
		Shuffled Maps =1
		Failed Shuffles=0
		Merged Map outputs=1
		GC time elapsed (ms)=1468
		CPU time spent (ms)=6780
		Physical memory (bytes) snapshot=230531072
		Virtual memory (bytes) snapshot=4152713216
		Total committed heap usage (bytes)=134795264
	Shuffle Errors
		BAD_ID=0
		CONNECTION=0
		IO_ERROR=0
		WRONG_LENGTH=0
		WRONG_MAP=0
		WRONG_REDUCE=0
	File Input Format Counters
		Bytes Read=69
	File Output Format Counters
		Bytes Written=64
[root@node1 sbin]# hdfs dfs -cat /output/wordcount/*
BigData 1
Day 2
Good    2
Hadoop  1
I   3
Java    1
Study,  1
Up  1
love    3
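These counts can be sanity-checked without the cluster. A minimal pure-Python equivalent of the example's map/combine/reduce pipeline (an illustrative sketch, not the actual Hadoop example code) produces the same numbers for the sample line:

```python
from collections import Counter

# The sample line written to wc.txt above.
line = "I love Java I love Hadoop I love BigData Good Good Study, Day Day Up"

# Map phase: split on whitespace into (word, 1) pairs; the Counter plays
# the role of the combiner/reducer, summing the counts per word.
counts = Counter(line.split())

# Like WordCount's output, keys sort uppercase-first (byte order).
for word in sorted(counts):
    print(word, counts[word])
```

The totals also line up with the job counters above: 15 words in all (Map output records=15) and 9 distinct words (Reduce output records=9). WordCount splits on whitespace only, which is why "Study," keeps its comma.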

Original post on my personal blog:
Configuring HDFS and MapReduce on Hadoop 2.x to Run the WordCount Program
