How to troubleshoot issues with map-reduce jobs not appearing under 'Jobs' in the MapR MCS

Document created by jbubier Employee on Feb 13, 2016

Author: Jonathan Bubier

 

Original Publication Date: August 13, 2014

 

The Job Management feature (JM) in MapR provides an administrator with an advanced interface to monitor submitted map-reduce jobs.  The monitoring capabilities allow for more in-depth analysis than the stock Hadoop JobTracker UI provides.  Several components are needed for this to function, and in some cases an administrator may find that jobs are not visible despite enabling the JM feature.

 

First, a brief description of how this feature works and which components perform which functions on which nodes.  All job and task status information used by JM is stored in a MySQL database.  This MySQL database can run on a cluster node or outside the cluster; both are valid configurations for JM.  Job and task information is inserted into the MySQL database by the 'hoststats' process running on the active JobTracker host.  The 'hoststats' process connects to MySQL using the configuration in /opt/mapr/conf/db.conf, most importantly the URL, the username and password, and the schema.  When an administrator requests job information in the MCS, the webserver process retrieves it directly from MySQL using the configuration in /opt/mapr/conf/hibernate.cfg.xml.

 

To determine why jobs are not available within JM, first identify the scope of the problem.  Specifically, determine whether new jobs are not being inserted into the MySQL JM database (and are therefore not viewable in the MCS) or whether jobs are present in the database but cannot be retrieved by the MCS.  In the former case the primary focus is on the active JobTracker host, to ensure new job information gets to MySQL.  In the latter case the JobTracker host can be assumed to be functioning properly and the focus is on the webserver host(s).

 

Before continuing, please review the prerequisite configuration steps for the JM MySQL database described here: http://doc.mapr.com/display/MapR/Setting+up+the+MapR+Metrics+Database.  These steps are important to ensure the MySQL database is set up properly for use with JM.  Also note that, while it is not required, having the MySQL client available on all JobTracker nodes and all webserver nodes is very helpful in diagnosing connectivity issues with MySQL.

 

New Job information is not getting inserted into MySQL

 

1.  Verify Jobs data in MySQL

 

To determine whether new job information is getting inserted into MySQL, first connect to the MySQL host used for JM using a MySQL client.  Ex:

 

$ mysql -u <user> -h <hostname> -p

 

Once connected, switch to the database used by JM, as specified by 'db.schema' in /opt/mapr/conf/db.conf on the active JobTracker host.  Ex:

 

mysql> use metrics;

Database changed


Once switched to the correct database, perform a select on the 'JOB' table.  This table is used by JM to store the summary information about all submitted map-reduce jobs.  Ex:

 

mysql> select * from JOB limit 1;

 

+-----------------------+------------------------------------------------------------------+---------------+----------------+----------------+---------------+---------------+---------------------+---------------------+

| JOB_ID                | JOB_NAME                                                         | PARENT_JOB_ID | USER_SUBMITTED | TIME_SUBMITTED | TIME_STARTED  | TIME_FINISHED | CLUSTER_ID          | CREATED             |

+-----------------------+------------------------------------------------------------------+---------------+----------------+----------------+---------------+---------------+---------------------+---------------------+

| job_201403171330_0001 | Sleep job                                                        | NULL          | root           |  1395088297243 | 1395088297548 |          NULL | 7270004970574606037 | 2014-03-17 13:31:37 |

+-----------------------+------------------------------------------------------------------+---------------+----------------+----------------+---------------+---------------+---------------------+---------------------+

1 row in set (0.00 sec)

 

mysql>

 

If there is already data in the 'JOB' table but more recent records are not being inserted, confirm the latest entry in the table.  Ex:

 

 

mysql> select max(CREATED) from JOB;

 

+---------------------+

| max(CREATED)        |

+---------------------+

| 2014-08-08 17:48:37 |

+---------------------+

1 row in set (0.00 sec)

 

mysql>
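To judge whether the latest entry is stale, it can help to compare it with the current time.  The following is a minimal sketch, not a MapR-provided tool: the function name jm_inserts_stale and the one-hour threshold are assumptions made for illustration.  The query in the comment uses MySQL's UNIX_TIMESTAMP() function to return the latest CREATED value as epoch seconds.

```shell
# Hypothetical helper (not part of MapR): succeeds (exit 0) when the newest
# JOB row is older than the allowed age, or when no usable timestamp exists.
#   $1 = latest CREATED value in epoch seconds
#   $2 = current time in epoch seconds
#   $3 = maximum acceptable age in seconds
jm_inserts_stale() {
    case "$1" in
        ''|*[!0-9]*) return 0 ;;   # empty or NULL: the table has no rows
    esac
    [ $(( $2 - $1 )) -gt "$3" ]
}

# Example use against a live database (prompts for the password):
#   latest=$(mysql -N -B -u <user> -h <hostname> -p \
#            -e 'select unix_timestamp(max(CREATED)) from JOB' metrics)
#   if jm_inserts_stale "$latest" "$(date +%s)" 3600; then
#       echo "no JOB rows inserted in the last hour"
#   fi
```

A suitable threshold depends on how often jobs are submitted on the cluster; an idle cluster will also show an old timestamp.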

 

 

2.  Verify mapr-metrics is installed on all JobTracker hosts

 

Once it is confirmed that new job data is not getting into the 'JOB' table in the JM database, the JobTracker host will need to be inspected further.  First, verify that the mapr-metrics package is installed on all JobTracker hosts.  Ex:

 

$ rpm -qa | grep mapr-metrics    # RedHat/CentOS
$ dpkg -l | grep mapr-metrics    # Ubuntu

 

After confirming the package is installed, verify that symbolic links exist for libsoci_core.so.3.1 and libsoci_mysql.so.3.1 under /usr/lib64/.  Ex:

 

 

$ ls -la /usr/lib64/libsoci_*

 

lrwxrwxrwx 1 root root 35 May  6 11:56 /usr/lib64/libsoci_core.so.3.1 -> /opt/mapr/lib/libsoci_core.so.3.1.0

lrwxrwxrwx 1 root root 36 May  6 12:00 /usr/lib64/libsoci_mysql.so.3.1 -> /opt/mapr/lib/libsoci_mysql.so.3.1.0

 

Depending on your MapR version these symbolic links may point to libraries under /usr/lib64 rather than /opt/mapr/lib/.  Ex:

 

$ ls -la /usr/lib64/libsoci_*

lrwxrwxrwx 1 root root      26 Feb  4  2014 /usr/lib64/libsoci_core.so.3.1 -> libsoci_core.so.3.1.0.suse

 

-rwxr-xr-x 1 root root 1399689 Jan 22  2013 /usr/lib64/libsoci_core.so.3.1.0.suse

lrwxrwxrwx 1 root root      27 Feb  4  2014 /usr/lib64/libsoci_mysql.so.3.1 -> libsoci_mysql.so.3.1.0.suse

-rwxr-xr-x 1 root root  349976 Jan 22  2013 /usr/lib64/libsoci_mysql.so.3.1.0.suse

 

If these symbolic links are not present in your environment, first run 'ldconfig' to attempt to recreate them.
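The link check can also be scripted so it is easy to repeat on every JobTracker and webserver host.  This is a sketch only: the function name check_soci_links is hypothetical, and it tests just the two link names shown in the listings above.

```shell
# Hypothetical helper: verifies that both SOCI links exist in the given
# directory (default /usr/lib64) and resolve to an existing file.
check_soci_links() {
    dir=${1:-/usr/lib64}
    rc=0
    for lib in libsoci_core.so.3.1 libsoci_mysql.so.3.1; do
        if [ ! -L "$dir/$lib" ]; then
            echo "NO LINK  $dir/$lib"      # absent, or not a symbolic link
            rc=1
        elif [ ! -e "$dir/$lib" ]; then
            echo "DANGLING $dir/$lib"      # link exists but its target does not
            rc=1
        else
            echo "OK       $dir/$lib"
        fi
    done
    return $rc
}
```

Running check_soci_links with no argument inspects /usr/lib64; a non-zero exit status means at least one link is absent or points at a missing file.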

 

3.  Verify database credentials on the JobTracker hosts

Once the mapr-metrics package is confirmed and the required shared libraries are validated, confirm that the configuration in /opt/mapr/conf/db.conf can be used to connect to MySQL.  Additionally, check /opt/mapr/logs/hoststats.log and /opt/mapr/logs/hoststats.err on the active JobTracker host for any database connection errors, which may point to a misconfiguration.  First, review the current DB configuration:

 

$ cat /opt/mapr/conf/db.conf

db.url=hadoop-n1:3306

db.user=root

db.passwd=mapr

db.schema=metrics
...

 

Using the configuration values in db.conf, first verify the connection to MySQL from the active JobTracker host using the MySQL client.  Ex:

 

$ mysql -u root -h hadoop-n1 -p metrics

 

When prompted, enter the password specified by 'db.passwd' in db.conf.  If this returns an 'Access denied' or similar message from MySQL, inspect the credentials to ensure the defined user has the correct password and sufficient privileges to connect.  Once the credentials are corrected, use the MySQL client again to validate the connection.  If the credentials in /opt/mapr/conf/db.conf are changed, 'hoststats' will need to be restarted.  This can be done using the following maprcli command:

 

$ maprcli node services -nodes <hostname of active JT> -name hoststats -action restart

 

Once the configuration is validated on the active JobTracker host repeat the same steps on the standby JobTracker hosts.
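When repeating this check on several hosts, the MySQL client arguments can be derived from db.conf instead of being typed by hand.  The sketch below assumes the key=value layout and the host:port form of 'db.url' shown in the example above; the function names db_conf_get and db_conf_mysql_args are hypothetical.

```shell
# Hypothetical helper: prints the value of one key from a key=value file
# such as /opt/mapr/conf/db.conf.
#   $1 = file, $2 = key
db_conf_get() {
    awk -F= -v k="$2" '$1 == k { print substr($0, length(k) + 2); exit }' "$1"
}

# Prints mysql client arguments built from db.conf. The password is not
# included; the trailing -p makes the client prompt for it.
db_conf_mysql_args() {
    url=$(db_conf_get "$1" db.url)
    host=${url%%:*}
    port=${url##*:}
    if [ "$port" = "$url" ]; then
        port=3306                  # no port in db.url: assume the MySQL default
    fi
    echo "-u $(db_conf_get "$1" db.user) -h $host -P $port -p $(db_conf_get "$1" db.schema)"
}

# Example use (the substitution is left unquoted so it splits into arguments):
#   mysql $(db_conf_mysql_args /opt/mapr/conf/db.conf)
```

Because the connection is built from the same file 'hoststats' reads, a successful login here suggests the values in db.conf themselves are usable.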

 

 

4.  Verify the hoststats process on the active JobTracker is running with the correct options

 

Once the credentials for MySQL are validated, confirm that the 'hoststats' process is running with the correct command line options.  As mentioned above, 'hoststats' is responsible for inserting the job and task data into MySQL.  If the process is either not running or not using the correct options, job data will not get into MySQL.  The running 'hoststats' process should look similar to the following:

 

 

$ ps -ef | grep hoststats

 

mapr     32048     1  0 Jun12 ?        00:52:05 /opt/mapr/server/hoststats 5660 /opt/mapr/logs/TaskTracker.stats -S 1

 

Note the '-S 1' option at the end of the command line arguments.  If this option is not present, verify that the following settings in /opt/mapr/conf/warden.conf have the values shown:

 

 

rpc.drop=false

 

hs.rpcon=true

hs.port=1111

hs.host=localhost
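The comparison against warden.conf can be scripted as follows.  This is a sketch under the assumption that each setting appears on its own line exactly as listed above, with no surrounding whitespace; the function name check_warden_hoststats_opts is hypothetical.

```shell
# Hypothetical helper: checks that the four settings listed above appear
# verbatim in a warden.conf file (default /opt/mapr/conf/warden.conf).
check_warden_hoststats_opts() {
    conf=${1:-/opt/mapr/conf/warden.conf}
    rc=0
    for want in rpc.drop=false hs.rpcon=true hs.port=1111 hs.host=localhost; do
        # -x matches whole lines only, -F treats the setting as a fixed string
        if grep -qxF "$want" "$conf"; then
            echo "OK       $want"
        else
            echo "MISMATCH $want"
            rc=1
        fi
    done
    return $rc
}
```

A MISMATCH line means the setting is either absent or has a different value (or different spacing), so the file should then be inspected by hand.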

 

If 'hoststats' was started manually (i.e. by running '/etc/init.d/mapr-hoststats start') rather than by warden, it will likely have been started with incorrect options.  To resolve that condition, terminate the running 'hoststats' process and allow warden to restart it using the correct syntax.

 

Once the 'hoststats' process is verified on the active JobTracker host repeat the same verification steps on the standby JobTracker hosts.
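The command line check from this step can be automated by filtering the process listing.  The sketch below assumes the binary path /opt/mapr/server/hoststats shown in the example output; the function name hoststats_has_S1 is hypothetical.

```shell
# Hypothetical helper: reads process listing lines on stdin and succeeds
# only if a hoststats process appears with the '-S 1' option.
hoststats_has_S1() {
    # Matching on the full binary path also skips the 'grep hoststats' line
    # that can show up in ps output.
    grep '/opt/mapr/server/hoststats ' | grep -Eq -- ' -S 1( |$)'
}

# Example use:
#   ps -ef | hoststats_has_S1 || echo "hoststats is not running with -S 1"
```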

 

5.  Verify the hoststats binary has the correct shared library dependencies

 

Once the command line arguments for 'hoststats' are confirmed, verify that the 'hoststats' binary has the correct shared library dependencies.  In some cases, after applying a patch from MapR that contains a new hoststats binary, libsoci_core.so.3.1 will be missing from the shared library dependencies.  Ex:

 

 

$ ldd /opt/mapr/server/hoststats

 

linux-vdso.so.1 =>  (0x00007fffd2fff000)

libdl.so.2 => /lib64/libdl.so.2 (0x00000033a5c00000)

libpthread.so.0 => /lib64/libpthread.so.0 (0x00000033a6000000)

libMapRClient.so => /opt/mapr/lib/libMapRClient.so (0x00007f85051b2000)

libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x00000033abc00000)

libm.so.6 => /lib64/libm.so.6 (0x00000033a6800000)

libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00000033aac00000)

libc.so.6 => /lib64/libc.so.6 (0x00000033a5800000)

/lib64/ld-linux-x86-64.so.2 (0x00000033a5400000)

 

The missing dependency will prevent hoststats from connecting to MySQL to insert job and task data in the JM database.  This is a known issue and requires a new patch package with an updated hoststats binary.  If you observe this, contact MapR support at support@mapr.com with the above details to request a new patch.
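To check for this condition without reading the full listing, the 'ldd' output can be piped through a small filter.  This is a sketch: the function name ldd_has_soci and its exit codes are hypothetical, and it relies only on the library name from this article and on the 'not found' text that ldd prints for unresolved libraries.

```shell
# Hypothetical helper: reads 'ldd' output on stdin and reports whether the
# SOCI core library is listed as a dependency and resolved.
#   exit 0 = listed and resolved, 1 = not listed, 2 = listed but not found
ldd_has_soci() {
    out=$(cat)
    if ! printf '%s\n' "$out" | grep -q 'libsoci_core\.so'; then
        echo "libsoci_core is not a dependency of this binary"
        return 1
    fi
    if printf '%s\n' "$out" | grep 'libsoci' | grep -q 'not found'; then
        echo "a libsoci dependency is listed but cannot be resolved"
        return 2
    fi
    echo "libsoci dependencies are listed and resolved"
}

# Example use:
#   ldd /opt/mapr/server/hoststats | ldd_has_soci
```

Exit code 1 corresponds to the known issue described above; exit code 2 points back to the symbolic link check in step 2.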


Job Information cannot be retrieved from MySQL in the MCS

 

If job data is being inserted into MySQL but the information is not available in the MCS, there is likely an issue with the configuration on the webserver host.  As a prerequisite, confirm that the 'mapr-metrics' package is installed on all webserver nodes.  Review the information from step 2 above to verify that the package is installed and the correct symbolic links are created for the libsoci* libraries.

 

1.  Verify database credentials on the webserver hosts

 

Once the mapr-metrics package is confirmed and the required shared libraries are validated, confirm that the configuration in /opt/mapr/conf/hibernate.cfg.xml can be used to connect to MySQL.  Additionally, check /opt/mapr/logs/adminuiapp.log on the webserver host for any database connection errors, which may point to a misconfiguration.  First, review the current DB configuration:

 

$ cat /opt/mapr/conf/hibernate.cfg.xml
...

    <property name="connection.url">jdbc:mysql://hadoop-n1:3306/metrics</property>

 

    <property name="connection.username">root</property>

    <property name="connection.password">mapr</property>
...


Using the configuration values in hibernate.cfg.xml, verify from the webserver host the connection to MySQL using the MySQL client.  Ex:

 

$ mysql -u root -h hadoop-n1 -p metrics

 

When prompted, enter the password specified by 'connection.password' in hibernate.cfg.xml.  If this returns an 'Access denied' or similar message from MySQL, inspect the credentials to ensure the defined user has the correct password and sufficient privileges to connect.

 

Once the credentials are corrected, use the MySQL client again to validate the connection.  If the credentials in /opt/mapr/conf/hibernate.cfg.xml need to be changed, do so using the MCS in the left navigation pane under 'System Settings' > 'Metrics'.  After updating the configuration in the MCS, confirm the correct settings are in place in /opt/mapr/conf/hibernate.cfg.xml.
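To confirm what the webserver will actually use, the connection settings can be read back from hibernate.cfg.xml.  The sketch below assumes each property element sits on a single line, as in the excerpt above, and that 'connection.url' has the form jdbc:mysql://host:port/schema; the function names hibernate_prop and jdbc_url_parts are hypothetical.

```shell
# Hypothetical helper: prints the text of one <property name="..."> element
# from a Hibernate configuration file.
#   $1 = file, $2 = property name
hibernate_prop() {
    sed -n "s|.*<property name=\"$2\">\(.*\)</property>.*|\1|p" "$1" | head -n 1
}

# Splits a jdbc:mysql://host:port/schema URL into "host port schema".
jdbc_url_parts() {
    rest=${1#jdbc:mysql://}
    hostport=${rest%%/*}
    schema=${rest#*/}
    schema=${schema%%\?*}          # drop any ?key=value options
    port=${hostport##*:}
    if [ "$port" = "$hostport" ]; then
        port=3306                  # no port in the URL: assume the MySQL default
    fi
    echo "${hostport%%:*} $port $schema"
}

# Example use:
#   cfg=/opt/mapr/conf/hibernate.cfg.xml
#   set -- $(jdbc_url_parts "$(hibernate_prop "$cfg" connection.url)")
#   mysql -u "$(hibernate_prop "$cfg" connection.username)" -h "$1" -P "$2" -p "$3"
```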


If the database credentials have been validated, the necessary packages are installed, and the job information is still unavailable in the MCS, please contact MapR support at support@mapr.com to investigate further.  To expedite resolution, please provide /opt/mapr/logs/adminuiapp.log from the webserver host or a full MapR support-dump if possible.
