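How do you successfully submit a MapReduce job with the YARN client?

To submit a MapReduce job with the YARN client, use the following command:

$ yarn jar your_mapreduce_application.jar [mainClass] [args...]

Here your_mapreduce_application.jar is the JAR file of your MapReduce application, [mainClass] is the fully qualified name of the main class (needed only when the JAR manifest does not declare one), and [args...] are the arguments passed to the application.

For example, if your application is packaged as MyMapReduceApp.jar and its main class is com.example.MyMapReduceApp, you can submit it with:

$ yarn jar MyMapReduceApp.jar com.example.MyMapReduceApp inputPath outputPath

In this command, inputPath is the path of the input data and outputPath is the path for the results. Replace both with values that match your environment.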

Submitting MapReduce Jobs with the YARN Client

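Introduction

Starting with Hadoop 2.x, scheduling and resource management for MapReduce jobs are handled by YARN (Yet Another Resource Negotiator). YARN is a general-purpose resource management framework that supports several computing models, MapReduce among them. This article describes how to submit a MapReduce job with the YARN client.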

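Prerequisites

1. Hadoop cluster: make sure the Hadoop cluster is installed and configured correctly.

2. Java environment: make sure Java is installed and configured.

3. Hadoop command-line tools: make sure the hadoop and yarn commands are available in your terminal.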

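Steps

1. Write the MapReduce program

First, write a MapReduce program. Java is supported natively, and other languages such as Python can be used through Hadoop Streaming. The example here is a simple WordCount written in Java.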

import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class WordCount {
    // Mapper: emits (word, 1) for every whitespace-separated token in an input line.
    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        @Override
        public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    // Reducer (also used as the combiner): sums the counts emitted for each word.
    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private IntWritable result = new IntWritable();

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Apply generic options such as -D key=value to conf; what remains are the job's own arguments.
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        if (otherArgs.length != 2) {
            System.err.println("Usage: WordCount [generic options] <in> <out>");
            System.exit(2);
        }
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
        // Block until the job finishes and report success or failure through the exit code.
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
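To see what the job computes without a cluster, the following sketch reproduces the mapper and reducer logic in plain Java. It is only an illustration under simplifying assumptions: everything runs in one process, the shuffle is modeled as a sorted in-memory map, and the class and method names (WordCountLocalSketch, map, reduce, run) are made up for this example and are not part of Hadoop.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

public class WordCountLocalSketch {
    // Map phase: emit one (word, 1) pair per whitespace-separated token.
    public static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        StringTokenizer itr = new StringTokenizer(line);
        while (itr.hasMoreTokens()) {
            pairs.add(new AbstractMap.SimpleEntry<>(itr.nextToken(), 1));
        }
        return pairs;
    }

    // Shuffle and reduce phases: group the pairs by word and sum the values.
    public static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            counts.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return counts;
    }

    // Runs the map phase over every line, then reduces all emitted pairs.
    public static Map<String, Integer> run(List<String> lines) {
        List<Map.Entry<String, Integer>> all = new ArrayList<>();
        for (String line : lines) {
            all.addAll(map(line));
        }
        return reduce(all);
    }

    public static void main(String[] args) {
        // prints {hello=2, mapreduce=1, yarn=1}
        System.out.println(run(Arrays.asList("hello yarn", "hello mapreduce")));
    }
}
```

On a real cluster the same two functions run in parallel across many tasks, and the framework performs the grouping between them.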

2. Compile and package the program

Save the code above as WordCount.java, then compile and package it with the following commands:

$ javac -classpath "$(hadoop classpath)" -d . WordCount.java
$ jar -cvf wordcount.jar *.class

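3. Stage the JAR and the input data

yarn jar reads the JAR from the local filesystem of the machine where the command runs, so the JAR itself does not have to be uploaded to HDFS. Copy it to a convenient local directory, and make sure the input data is in HDFS:

$ cp wordcount.jar /usr/local/hadoop/share/hadoop/mapreduce/
$ hadoop fs -mkdir -p /input
$ hadoop fs -put input.txt /input

Here input.txt is a placeholder for your own data file.

4. Submit the job with the YARN client

Submit the MapReduce job through the YARN client:

$ yarn jar /usr/local/hadoop/share/hadoop/mapreduce/wordcount.jar WordCount /input /output

/usr/local/hadoop/share/hadoop/mapreduce/wordcount.jar is the local path of the JAR.

WordCount is the name of the main class.

/input is the HDFS path of the input data.

/output is the HDFS path for the results. It must not exist before the job starts, otherwise the job fails when it checks its output location.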

Related questions and answers

Question 1: How do I check the running status of a MapReduce job?

Use the YARN ResourceManager Web UI. By default it is served at http://<ResourceManager-Hostname>:8088, and the page lists the running applications together with their details.
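The same information can also be read programmatically, since the ResourceManager exposes a REST endpoint under /ws/v1/cluster/apps on the Web UI port. The sketch below is one way to query it with the HTTP client that ships with Java 11 and later. The class name RunningAppsSketch, the helper names and the localhost default are assumptions made for illustration, and the sketch assumes a cluster without authentication on the Web UI.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class RunningAppsSketch {
    // Builds the REST URL that lists applications in the given state, e.g. RUNNING.
    public static URI appsUri(String rmHost, int port, String state) {
        return URI.create("http://" + rmHost + ":" + port + "/ws/v1/cluster/apps?states=" + state);
    }

    // Sends a GET request and returns the raw JSON body as a string.
    public static String fetch(URI uri) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(uri)
                .header("Accept", "application/json")
                .GET()
                .build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        // The first argument is the ResourceManager host; localhost is only a fallback for local testing.
        String host = args.length > 0 ? args[0] : "localhost";
        System.out.println(fetch(appsUri(host, 8088, "RUNNING")));
    }
}
```

The response is a JSON document; parsing it is left out to keep the sketch free of third-party dependencies.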

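Question 2: How do I set the number of tasks and the resource requirements of a MapReduce job?

The mapreduce.job.maps and mapreduce.job.reduces parameters specify the number of map and reduce tasks. Note that mapreduce.job.maps is only a hint, because the actual number of map tasks is determined by the input splits. Per-task memory is controlled separately, through mapreduce.map.memory.mb and mapreduce.reduce.memory.mb.

$ yarn jar /usr/local/hadoop/share/hadoop/mapreduce/wordcount.jar \
    WordCount \
    -D mapreduce.job.maps=10 \
    -D mapreduce.job.reduces=5 \
    /input \
    /output

This requests 10 map tasks and sets the number of reduce tasks to 5. The -D options take effect only if the driver parses generic options, for example through ToolRunner or GenericOptionsParser; otherwise they reach the program as ordinary arguments.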
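The argument order matters: generic -D options go after the main class and before the application's own arguments. The following sketch assembles such a command line in Java to make the order explicit. YarnJarCommand and build are hypothetical names for this example, and the commented-out launch step assumes the yarn executable is on the PATH.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class YarnJarCommand {
    // Assembles: yarn jar <jar> <mainClass> [-D key=value ...] <appArgs...>
    public static List<String> build(String jar, String mainClass,
                                     Map<String, String> props, String... appArgs) {
        List<String> cmd = new ArrayList<>();
        cmd.add("yarn");
        cmd.add("jar");
        cmd.add(jar);
        cmd.add(mainClass);
        // Generic options come before the application's own arguments.
        for (Map.Entry<String, String> e : props.entrySet()) {
            cmd.add("-D");
            cmd.add(e.getKey() + "=" + e.getValue());
        }
        for (String arg : appArgs) {
            cmd.add(arg);
        }
        return cmd;
    }

    public static void main(String[] args) throws Exception {
        // LinkedHashMap keeps the options in insertion order.
        Map<String, String> props = new LinkedHashMap<>();
        props.put("mapreduce.job.reduces", "5");
        List<String> cmd = build("wordcount.jar", "WordCount", props, "/input", "/output");
        // prints yarn jar wordcount.jar WordCount -D mapreduce.job.reduces=5 /input /output
        System.out.println(String.join(" ", cmd));
        // To actually submit the job, uncomment the next line:
        // int exitCode = new ProcessBuilder(cmd).inheritIO().start().waitFor();
    }
}
```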

Original article by 未希. If you repost it, please credit the source: https://www.kdun.com/ask/1139705.html
