How to troubleshoot Hue FileBrowser issues using REST API calls to httpfs

Document created by Hao Zhu on Feb 17, 2016

Author: Hao Zhu

Original Publication Date: December 23, 2014

 

Goal:

Hue's FileBrowser operates on files and directories in MapR-FS by sending REST API calls to a host running httpfs. This article shows how to troubleshoot issues with the FileBrowser using the same set of REST API calls outside of Hue.

Solution:

1. Enable DEBUG logging for Hue.

CherryPy is a web framework that provides much of the core functionality of Hue. The CherryPy webserver logs to /opt/mapr/hue/hue-<version>/logs/runcpserver.log, where <version> is the installed Hue version. By default, DEBUG level logging is enabled. The steps below confirm the current setting and show how to restore it if it was changed.

 

To enable the DEBUG level logging, edit /opt/mapr/hue/hue-<version>/desktop/conf/log.conf:

 

[handler_logfile]
class=handlers.RotatingFileHandler
# Choices are DEBUG, INFO, WARNING, ERROR, CRITICAL
level=DEBUG
formatter=default
args=('%LOG_DIR%/%PROC_NAME%.log', 'a', 1000000, 3)
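To confirm the current setting without opening an editor, the level can be read directly from the file. The following is a minimal sketch: the function name hue_logfile_level is hypothetical, and it assumes the plain key=value layout shown above with no spaces inside the section header.

```shell
# Print the "level" value from the [handler_logfile] section of a log.conf file.
# Usage: hue_logfile_level "/path/to/log.conf"
hue_logfile_level() {
  awk -F= '
    # Track whether the current line is inside [handler_logfile].
    /^\[/ { in_section = ($0 == "[handler_logfile]") }
    # Inside that section, print the value of the first "level" key and stop.
    in_section && $1 ~ /^[ \t]*level[ \t]*$/ {
      gsub(/[ \t]/, "", $2); print $2; exit
    }
  ' "$1"
}
```

Called with the path to log.conf (quoted, with <version> replaced by the installed Hue version), it prints DEBUG when DEBUG logging is enabled for the log file handler.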

 

After making the above change, restart Hue using the following maprcli commands:

$ maprcli node services -name hue -action stop -nodes hostname 
$ maprcli node services -name hue -action start -nodes hostname

In the above maprcli commands, replace 'hostname' with the hostname of the node where Hue is running.

2. Verify that the Hue configuration points to the correct httpfs host and port.

Verify the httpfs configuration in /opt/mapr/hue/hue-<version>/desktop/conf/hue.ini, which is defined by the 'webhdfs_url' parameter in the "[[hdfs_clusters]]" section. For example:

# Use WebHdfs/HttpFs as the communication mechanism.
# Domain should be the NameNode or HttpFs host.
# Default port is 14000 for HttpFs.
webhdfs_url=http://mapr4-3:14000/webhdfs/v1

If this property needs to be changed, edit hue.ini and restart Hue for the change to take effect.

$ maprcli node services -name hue -action stop -nodes hostname 
$ maprcli node services -name hue -action start -nodes hostname
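Because hue.ini may also contain commented-out copies of this parameter, it can help to print only the active value. The sketch below is one way to do that; the function name hue_webhdfs_url is hypothetical, and it assumes the active line starts with optional whitespace followed by webhdfs_url.

```shell
# Print the first uncommented webhdfs_url value found in a hue.ini file.
# Lines that start with '#' never match because the pattern is anchored
# to optional leading whitespace followed directly by the key name.
# Usage: hue_webhdfs_url "/path/to/hue.ini"
hue_webhdfs_url() {
  sed -n 's/^[[:space:]]*webhdfs_url[[:space:]]*=[[:space:]]*//p' "$1" | head -n 1
}
```

The printed URL should name the host and port where httpfs is actually running, which the next step verifies.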


3. Verify the httpfs service is running.

Use the below maprcli command to identify where the httpfs service is running.

$ maprcli node list -columns service | grep httpfs

Go to the host where httpfs is running and confirm there is a TCP listener on the default httpfs port of 14000.

$ lsof -i:14000 
COMMAND    PID USER   FD   TYPE   DEVICE SIZE/OFF NODE NAME
java      7848 mapr  134u  IPv6 35688826      0t0  TCP *:scotty-ft (LISTEN)
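In the output above, lsof shows the port as the service name scotty-ft rather than 14000. As a hedged variant, the sketch below asks lsof for numeric output and listening sockets only. The function name check_httpfs_listener is hypothetical; depending on the environment, lsof may need to run as root or as the user that owns the httpfs process to see the socket.

```shell
# Report whether anything is listening on a TCP port (default 14000).
# -n and -P keep host addresses and ports numeric, so the NAME column
# shows *:14000 instead of a service name such as scotty-ft.
# -sTCP:LISTEN restricts the output to listening sockets.
check_httpfs_listener() {
  port="${1:-14000}"
  if lsof -nP -iTCP:"$port" -sTCP:LISTEN 2>/dev/null | grep -q LISTEN; then
    echo "listening on $port"
  else
    echo "no listener on $port"
    return 1
  fi
}
```

Run check_httpfs_listener on the httpfs host; pass a different port as the first argument if httpfs was configured away from the default.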

 

4. Monitor Hue webserver log to capture the REST API calls to httpfs.

As mentioned above, the Hue webserver logs to /opt/mapr/hue/hue-<version>/logs/runcpserver.log. With DEBUG logging enabled, this log file contains details about the REST API calls that the Hue webserver makes to httpfs when the Hue FileBrowser is accessed. For example, if we copy a file on MapR-FS from /tmp/mapr/Master.csv to /tmp/mapr/Master.csv.2 using the Hue FileBrowser, we can capture the calls below in runcpserver.log.
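Since runcpserver.log also contains unrelated DEBUG output, filtering for the WebHDFS request lines makes the calls easier to spot. This is a minimal sketch: the function name webhdfs_calls is hypothetical, and it assumes each call appears in the log as a method followed by a /webhdfs/v1 path, as in the examples in this article.

```shell
# Print only the WebHDFS request method and path from a Hue log file.
# Usage: webhdfs_calls "/path/to/runcpserver.log"
webhdfs_calls() {
  grep -Eo '(GET|PUT|POST|DELETE) /webhdfs/v1[^ ]*' "$1"
}
```

To watch calls as they happen while reproducing the problem in the FileBrowser, the same pattern can be applied to the output of tail -f on the log file.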

 

a. Get metadata of source file.
In /opt/mapr/hue/hue-<version>/logs/runcpserver.log, the following API call appears when browsing a directory path on MapR-FS in the Hue FileBrowser:

GET /webhdfs/v1/tmp/mapr/Master.csv?op=GETFILESTATUS&user.name=mapr&doas=mapr HTTP/1.1

Use the curl command below to manually check the result using the same REST API call. Replace the path '/tmp/mapr/Master.csv' with an existing file on MapR-FS. This helps determine whether the FileBrowser issue is specific to Hue or is a general issue with accessing MapR-FS through httpfs. Note that "mapr4-3" is the hostname of the httpfs server in this example and should be replaced with the hostname of your httpfs server.

$ curl "http://mapr4-3:14000/webhdfs/v1/tmp/mapr/Master.csv?op=GETFILESTATUS&user.name=mapr" 
{"FileStatus":{"pathSuffix":"","type":"FILE","length":6049426,"owner":"mapr","group":"mapr","permission":"755","accessTime":1419263547000,"modificationTime":1419263556835,"blockSize":268435456,"replication":3}}
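When comparing several files, it can be convenient to pull a single field out of the GETFILESTATUS response. The sketch below does this with sed; the function name filestatus_field is hypothetical, and it assumes the compact single-line JSON shown above. A JSON-aware tool such as jq is a more robust choice where it is installed.

```shell
# Read a GETFILESTATUS JSON response on stdin and print one field's value.
# Handles quoted values ("type":"FILE") and bare numbers ("length":6049426).
# Usage: curl -s "<url>" | filestatus_field type
filestatus_field() {
  sed -n 's/.*"'"$1"'":"\{0,1\}\([^",}]*\).*/\1/p'
}
```

For the response above, the type field yields FILE and the length field yields 6049426.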

b. Open and read source file.
From /opt/mapr/hue/hue-<version>/logs/runcpserver.log the following API call is seen when attempting to open a file on MapR-FS using the Hue FileBrowser:

GET /webhdfs/v1/tmp/mapr/Master.csv?length=67108864&op=OPEN&user.name=mapr&offset=0&doas=mapr HTTP/1.1

Use the curl command below to manually check the result using the same REST API call. Note that this curl command attempts to open and read the specified file and prints the result to stdout. As above, replace the path /tmp/mapr/Master.csv with an existing file on MapR-FS and replace 'mapr4-3' with the hostname of your httpfs server.

$ curl -X GET -L "http://mapr4-3:14000/webhdfs/v1/tmp/mapr/Master.csv?length=67108864&op=OPEN&user.name=mapr&offset=0&doas=mapr"
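Any other call captured from the log can be replayed the same way by prefixing its path with the httpfs address. The sketch below shows that conversion; the function name logged_call_to_url is hypothetical, and it assumes the input is a bare request line of the form "METHOD PATH HTTP/1.1" as shown in this article.

```shell
# Turn a logged request line into a full URL for curl.
# $1 = scheme, host and port of httpfs, e.g. http://mapr4-3:14000
# $2 = request line, e.g. "GET /webhdfs/v1/tmp?op=LISTSTATUS HTTP/1.1"
logged_call_to_url() {
  # The request path is the second whitespace-separated field.
  path=$(printf '%s\n' "$2" | awk '{print $2}')
  printf '%s%s\n' "$1" "$path"
}
```

The resulting URL should be passed to curl in double quotes so that the shell does not interpret the & characters in the query string.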

Refer to the WebHDFS REST API documentation for more details on the syntax of these calls.

 

Note that "&user.name=mapr" is added to the URL to avoid the error below:

HTTP Status 403 - Anonymous requests are disallowed
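A small helper can guard against accidentally sending an anonymous request when testing by hand. This is a sketch under stated assumptions: the function name with_user_name is hypothetical, and it only checks whether the string user.name= already appears in the URL.

```shell
# Append user.name=<user> to a WebHDFS URL unless it is already present.
# $1 = URL, $2 = user name to send
with_user_name() {
  case "$1" in
    *user.name=*) printf '%s\n' "$1" ;;                      # already set
    *\?*)         printf '%s&user.name=%s\n' "$1" "$2" ;;    # has a query string
    *)            printf '%s?user.name=%s\n' "$1" "$2" ;;    # no query string yet
  esac
}
```

To see only the HTTP status of a call, for example to tell a 403 from a 200, the result can be passed to curl with the options -s -o /dev/null -w '%{http_code}\n'.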

If the results from the above REST API calls are not as expected, the issue may lie with httpfs rather than with Hue. To investigate further, open a case with MapR Support through the Support Portal, including the problem details and the steps to reproduce the issue, or leave a comment below.
