Finding DMExpress Hadoop Logs on YARN (MRv2)

Summary

When DMExpress runs within Hadoop, its execution metadata (status messages and statistics) is written to the Hadoop stderr logs. This log output is useful for confirming that DMExpress was invoked, reviewing any warnings or errors it issued, and checking statistics for the executed job.

The logs can be viewed individually using either the JobHistoryServer (JHS) web interface or the ResourceManager (RM) web interface. They can also be gathered using the attached script, which requires the JHS to be running.

The instructions provided here assume that DMExpress is being used with Hadoop MapReduce version 2 (YARN). They apply to all methods of invoking DMExpress within Hadoop, including streaming.

Resolution

When a Hadoop job invokes DMExpress, the DMExpress execution metadata does not appear on the terminal. Instead, it is captured in the Hadoop stderr logs.

Hadoop job log files are stored in a standard location and made available over HTTP. You can access them through the JHS web interface, through the RM web interface, or with the attached script.
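Because the logs are served over HTTP, it can help to first list the jobs the JHS knows about. The following is a minimal sketch, not the method used by the attached script. It assumes the JHS REST endpoint /ws/v1/history/mapreduce/jobs is reachable on the default web port 19888; the host name and the helper names jhs_jobs_url and list_job_ids are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def jhs_jobs_url(host, port=19888, **filters):
    """Build the JHS REST URL that lists finished MapReduce jobs.

    Assumption: the JHS web service listens on `port` (19888 is the usual
    default). `filters` are optional query parameters such as user or limit.
    """
    base = f"http://{host}:{port}/ws/v1/history/mapreduce/jobs"
    # Sort the filters so the resulting URL is deterministic.
    query = urlencode(sorted(filters.items()))
    return f"{base}?{query}" if query else base


def list_job_ids(host, port=19888, **filters):
    """Return the job IDs reported by the JHS (requires a running JHS)."""
    request = Request(
        jhs_jobs_url(host, port, **filters),
        headers={"Accept": "application/json"},
    )
    with urlopen(request, timeout=10) as response:
        payload = json.load(response)
    # The "jobs" element may be null when no jobs match the filters.
    jobs = (payload.get("jobs") or {}).get("job", [])
    return [job["id"] for job in jobs]
```

For example, list_job_ids("jhs.example.com", user="alice", limit=5) would return up to five job IDs for the hypothetical user alice; those IDs can then be used to locate the matching job in the JHS or RM web interface.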

Attachments

The attached script, getlogs.sh, can be used to gather the logs.

Additional Information

By default, the JHS retains the logs for one week, after which they are deleted. To change the retention period, set the JHS configuration parameter mapreduce.jobhistory.max-age-ms, which is expressed in milliseconds.
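Since the parameter takes milliseconds, a retention period given in days has to be converted first. The sketch below shows the arithmetic and prints a property entry in the usual Hadoop XML form. The helper names are hypothetical, and placing the entry in mapred-site.xml is an assumption to verify against your distribution.

```python
MS_PER_DAY = 24 * 60 * 60 * 1000  # milliseconds in one day


def retention_ms(days):
    """Convert a retention period in days to milliseconds."""
    return int(days * MS_PER_DAY)


def property_xml(days):
    """Render a Hadoop-style <property> entry for the retention setting.

    Assumption: the entry belongs in mapred-site.xml on the JHS host.
    """
    return (
        "<property>\n"
        "  <name>mapreduce.jobhistory.max-age-ms</name>\n"
        f"  <value>{retention_ms(days)}</value>\n"
        "</property>"
    )


if __name__ == "__main__":
    # Example: keep job history for two weeks instead of one.
    print(property_xml(14))
```

With this conversion, the one-week default corresponds to 604800000 milliseconds, and a two-week period to 1209600000.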

Once the logs for a job have been deleted, it is no longer possible to determine whether DMExpress was invoked for that job.

For instructions on finding the Hadoop logs for MRv1, see Finding DMExpress Hadoop Logs on MRv1.