Hi,

I have this MapReduce example [1], and I want to print
information to stdout and to a log file. It seems that nothing
is being logged. How can I make my class print these
messages?

I have also configured yarn-site.xml [2] to retain logs.
Although the logs are retained in the /app-logs dir, the
userlogs dir is deleted at the end of the job execution. How
can I stop MapReduce from deleting the files in the userlogs
dir? I am using YARN.

Thanks,

[1] WordCount example with just the map part.

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapreduce.Mapper;

public class MyWordCount {

    public static class MyMap extends Mapper {
        Log log = LogFactory.getLog(MyWordCount.class);
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(LongWritable key, Text value, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            System.out.println("HERRE");
            log.info("HERRRRRE");
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                output.collect(word, one);
            }
        }

        public void run(Context context) throws IOException, InterruptedException {
            setup(context);
            try {
                while (context.nextKeyValue()) {
                    System.out.println("Key: " + context.getCurrentKey() + " Value: " + context.getCurrentValue());
                    map(context.getCurrentKey(), context.getCurrentValue(), context);
                }
            } finally {
                cleanup(context);
            }
        }

        public void cleanup(Mapper.Context context) {}
    }
}
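
A possible cause worth checking (an assumption, not confirmed on this cluster): the map method in [1] takes the old-API arguments (OutputCollector, Reporter), while run() calls map(key, value, context). In Java that makes the four-argument method an overload rather than an override, so run() would reach the inherited default map and the two "HERRE" lines would never execute. The sketch below shows the effect with hypothetical stand-in classes (BaseMapper, OverloadingMapper, OverridingMapper); none of them are Hadoop classes, and StringBuilder merely stands in for a context object.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a framework mapper base class; NOT Hadoop's Mapper.
class BaseMapper {
    final List<String> calls = new ArrayList<>();

    // Default three-argument map, analogous to an inherited default implementation.
    protected void map(String key, String value, StringBuilder context) {
        calls.add("base");
    }

    // Analogous to a framework run loop: it always calls the three-argument map.
    public void run(String key, String value, StringBuilder context) {
        map(key, value, context);
    }
}

// Four-argument map: a different signature, so this is an overload and run() never calls it.
class OverloadingMapper extends BaseMapper {
    public void map(String key, String value, StringBuilder output, Object reporter) {
        calls.add("overload");
    }
}

// Same signature as the base method: a true override, which run() does call.
class OverridingMapper extends BaseMapper {
    @Override
    protected void map(String key, String value, StringBuilder context) {
        calls.add("override");
    }
}

public class OverloadDemo {
    public static void main(String[] args) {
        BaseMapper overloading = new OverloadingMapper();
        overloading.run("k", "v", new StringBuilder());
        System.out.println(overloading.calls); // prints [base]

        BaseMapper overriding = new OverridingMapper();
        overriding.run("k", "v", new StringBuilder());
        System.out.println(overriding.calls); // prints [override]
    }
}
```

Adding @Override to the map method in [1] would make the compiler report the mismatch if this is indeed what is happening.
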
[2] yarn-site.xml
<!-- job history -->
<property> <name>yarn.log-aggregation-enable</name> <value>true</value> </property>
<property> <name>yarn.nodemanager.log.retain-seconds</name> <value>900000</value> </property>
<property> <name>yarn.nodemanager.remote-app-log-dir</name> <value>/app-logs</value> </property>
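
A note on the settings above, offered as a sketch rather than a confirmed fix: yarn.nodemanager.log.retain-seconds is documented as applying only when log aggregation is disabled, so with aggregation enabled it would not be expected to keep the userlogs dir. One property that may be relevant is yarn.nodemanager.delete.debug-delay-sec, which delays the NodeManager's deletion of a finished application's local files and log directory. The value 600 below is an arbitrary illustration, not a recommendation.

```xml
<!-- Assumption: delaying NodeManager cleanup is acceptable on this cluster. -->
<!-- 600 seconds is an illustrative value only. -->
<property> <name>yarn.nodemanager.delete.debug-delay-sec</name> <value>600</value> </property>
```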