Remote Debugging using JConsole, JMX and SSH Tunnels

This turned out not to be as easy as I’d have liked. Even with a bevy of useful resources available – lotsa people have run into these issues – it took a while to get the right combination. Let’s hope I can save you a bit of that pain …

JConsole / JMX Remoting via SSH Tunnels

I’ve recently hosted my Spring / Hibernate webapp in the cloud thanks to Amazon Web Services. A future post will mention the monitoring I’ve put in place, but for now it’s all about Tomcat. It keeps dying. Stop it, Tomcat. Stop dying!

Things are in a bad way, so I need to be able to debug remotely. I don’t want to open up my security group to general traffic, so using SSH tunnels is the best option. JConsole is a great tool for measuring current statistics and performance of your Java app, and relies on JMX remoting. It works great locally, or even within a trusted network, but once you’re behind a firewall with broad port-blocking, there are some significant issues. Several core Java forum topics cover this discussion.

Daniel Fuchs has written several articles which illustrate these issues and provide good solutions. He explains that JMX remoting needs two ports: one for the RMI registry, and one for the RMI connection objects which are stubs used for remoting all the critical data. If you’re using the default JVM agent, you’ll tend to use the following JVM System.property’s on the server:

-Djava.rmi.server.hostname
-Djava.rmi.server.useLocalHostname
-Dcom.sun.management.jmxremote
-Dcom.sun.management.jmxremote.port
-Dcom.sun.management.jmxremote.ssl
-Dcom.sun.management.jmxremote.ssl.need.client.auth
-Dcom.sun.management.jmxremote.authenticate
-Dcom.sun.management.jmxremote.access.file
-Dcom.sun.management.jmxremote.password.file

I’ll come back to these, but the one that’s important here is jmxremote.port. It allows you to specify the RMI registry port, the one that you’ll use to establish your remote connection via JConsole. However, the port for RMI export, which is used for all the critical data callbacks, is randomly chosen (on a session basis or JVM basis, not sure) and cannot be specified. And you can’t open a port for tunneling if you don’t know what it is.

You can see this issue if you crank up the debugging on JConsole. I was having issues getting the logging output so I took the double-barrel approach, using both the -debug argument and the custom java.util.logging descriptor, the contents of which I stole from here. Invoke it as follows:

% jconsole -debug -J"-Djava.util.logging.config.file=FILENAME"

The quotes are optional. Provide the logging descriptor filename. You can call out the JMX Service URL or the hostname:port combination at the end if you like. Eventually you’ll see debug output much like this:

FINER: [javax.management.remote.rmi.RMIConnector: jmxServiceURL=service:jmx:rmi:///jndi/rmi://localhost:PORT/jmxrmi] connecting...
FINER: [javax.management.remote.rmi.RMIConnector: jmxServiceURL=service:jmx:rmi:///jndi/rmi://localhost:PORT/jmxrmi] finding stub...
FINER: [javax.management.remote.rmi.RMIConnector: jmxServiceURL=service:jmx:rmi:///jndi/rmi://localhost:PORT/jmxrmi] connecting stub...
FINER: [javax.management.remote.rmi.RMIConnector: jmxServiceURL=service:jmx:rmi:///jndi/rmi://localhost:PORT/jmxrmi] getting connection...
FINER: [javax.management.remote.rmi.RMIConnector: jmxServiceURL=service:jmx:rmi:///jndi/rmi://localhost:PORT/jmxrmi] failed to connect: java.rmi.ConnectException: Connection refused to host: IP-ADDRESS; nested exception is:
java.net.ConnectException: Operation timed out

PORT will be the RMI registry port you’re tunneling into. IP-ADDRESS is special, we’ll get to that, and it’s important to note that it’s a ‘ConnectException’ occurring against that host.

This debugging information can show up rather late in the whole connection process, undoubtedly because it’s an ‘Operation timed out’ issue, so don’t be surprised if it takes a while. Fortunately, you can also see immediate verbose feedback when you set up your ssh tunnel connection (see below).

Addressing the Randomly-Generated RMI Export Port

The first problem I chose to resolve was the one relating to the random RMI export port issue. Daniel has provided a fine example of how to implement a custom ‘pre-main’ Agent which you can use to supplant the standard JVM one. There’s his quick’n’dirty version which doesn’t address security – which is where I started. And then there’s a more full-fledged version, which I modified to be configurable.

Most importantly, it builds its JMXConnectorServer with the following service URL:

service:jmx:rmi://HOSTNAME:RMI-EXPORT-PORT/jndi/rmi://HOSTNAME:RMI-REGISTRY-PORT/jmxrmi

Traditionally, you’ll see this service URL from the client perspective, where HOSTNAME:RMI-EXPORT-PORT is not defined and you just have ‘service:jmx:rmi:///jndi/rmi://…’. JConsole will build that sort of URL for you if you just provide HOSTNAME:RMI-REGISTRY-PORT (eg. hostname:port) when connecting.

By calling out the RMI-EXPORT-PORT in the agent’s service URL, you can affix it and tunnel to it. You can use the same port as the RMI registry; this only requires you to open one port for tunneling.
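To make the port-qualifying service URL concrete, here is a minimal sketch of a ‘pre-main’ agent in the spirit of the examples mentioned above. It is not Daniel’s code or my actual agent: the class name FixedPortJmxAgent is made up, the host and port are hard-coded to the example values used in this post, and there is no authentication or SSL at all.

```java
import java.lang.management.ManagementFactory;
import java.rmi.registry.LocateRegistry;
import java.util.HashMap;
import java.util.Map;
import javax.management.MBeanServer;
import javax.management.remote.JMXConnectorServer;
import javax.management.remote.JMXConnectorServerFactory;
import javax.management.remote.JMXServiceURL;

// Hypothetical agent class; the JAR manifest would need a matching
// "Premain-Class: FixedPortJmxAgent" entry for -javaagent to find it.
final class FixedPortJmxAgent {

    // Held in a static field so the connector server stays reachable.
    private static JMXConnectorServer server;

    // The first HOST:PORT pins the RMI export port; the second names the
    // RMI registry where the "jmxrmi" stub is published.
    static String serviceUrl(String host, int exportPort, int registryPort) {
        return "service:jmx:rmi://" + host + ":" + exportPort
                + "/jndi/rmi://" + host + ":" + registryPort + "/jmxrmi";
    }

    static JMXConnectorServer start(String host, int exportPort, int registryPort)
            throws Exception {
        // A custom agent has to host the RMI registry itself.
        LocateRegistry.createRegistry(registryPort);

        MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
        Map<String, Object> env = new HashMap<>(); // no auth / SSL in this sketch
        JMXServiceURL url = new JMXServiceURL(serviceUrl(host, exportPort, registryPort));

        JMXConnectorServer cs = JMXConnectorServerFactory.newJMXConnectorServer(url, env, mbs);
        cs.start();
        return cs;
    }

    // Invoked by the JVM before the application's main() when started
    // with -javaagent:PATH/TO/AGENT.jar
    public static void premain(String agentArgs) throws Exception {
        // ASSUMPTION: example values only; same port for registry and export.
        server = start("localhost", 9999, 9999);
    }
}
```

The only part that really matters is serviceUrl(): everything else is the standard javax.management.remote plumbing for starting a connector server on the platform MBeanServer.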

On your client / for JConsole, the HOSTNAME will probably be localhost, where you’ve opened your tunnel like so:

% ssh -N -v -L9999:REMOTE-HOST:9999 REMOTE-HOST

9999 is just an example port. REMOTE-HOST is the host you’re tunneling to. You can remove the -v argument, but it’s good to have that around so that you can see the network activity. You can also use the -l argument to specify the login username on the remote host. Note that you’re opening the same port locally as you’re hitting on the server, with no offset. You need the same port on both ends because the agent on the server advertises its RMI export port to the client, and the client then connects to that exact port number on its own side of the tunnel; that won’t work if you have an offset. So you might as well use the same port for both the RMI registry and RMI export, and just keep that one port available locally.

On the server in your agent, the HOSTNAME part of the service URL can either be InetAddress.getLocalHost().getHostName(), or an IP address (127.0.0.1), or in my case ‘localhost’ just worked fine.

The major reason to create the custom agent is to build the port-qualifying service URL. As usual, the example code takes a lot of shortcuts. So, I built myself a more re-usable agent – influenced by the standard JVM agent’s System.property’s – which allowed me to configure the same sorts of things as mentioned above:

hostname : to be used as the HOSTNAME value above
port.registry : to be used for RMI-REGISTRY-PORT
port.export : to be used for RMI-EXPORT-PORT, defaulting to the same as port.registry if not provided
ssl : true to enable SSL
authenticate : true to enable credentials / authentication
access.file : specifies the location of the user credentials file
password.file : specifies the location of the user password file
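The property list above could be read along these lines. This is only a sketch: the example.jmx.agent. namespace stands in for the one I redacted, the defaults (localhost, 9999) are assumptions for illustration, and ssl is left out because I never went down that route. The two jmx.remote.x.* keys are the environment entries that the JDK’s RMI connector server understands for file-based passwords and access levels.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical holder for the agent's settings.
final class AgentConfig {

    // ASSUMPTION: placeholder namespace, not the one my real agent uses.
    static final String NS = "example.jmx.agent.";

    final String hostname;
    final int registryPort;
    final int exportPort;
    final boolean authenticate;
    final String passwordFile;
    final String accessFile;

    AgentConfig(Properties p) {
        // ASSUMPTION: defaults chosen for illustration only.
        hostname = p.getProperty(NS + "hostname", "localhost");
        registryPort = Integer.parseInt(p.getProperty(NS + "port.registry", "9999"));
        // port.export falls back to the registry port when not provided.
        exportPort = Integer.parseInt(
                p.getProperty(NS + "port.export", String.valueOf(registryPort)));
        authenticate = Boolean.parseBoolean(p.getProperty(NS + "authenticate", "false"));
        passwordFile = p.getProperty(NS + "password.file");
        accessFile = p.getProperty(NS + "access.file");
    }

    // Environment map to hand to JMXConnectorServerFactory.newJMXConnectorServer().
    Map<String, Object> connectorEnv() {
        Map<String, Object> env = new HashMap<>();
        if (authenticate) {
            if (passwordFile != null) {
                env.put("jmx.remote.x.password.file", passwordFile);
            }
            if (accessFile != null) {
                env.put("jmx.remote.x.access.file", accessFile);
            }
        }
        return env;
    }
}
```

An agent would typically build this from System.getProperties() inside premain, so that everything is driven by -D arguments on the Tomcat command line.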

And so I was able to configure my agent service URL for localhost, using the same port for both RMI requirements, and using simple password-based auth. I did not go down the SSL route, though many of the posts from Daniel and others explain this as well. Do that once you’ve tackled the core problem.

Another great post relating to this issue mentions that Tomcat has a custom Listener for setting up a similar agent. The example was:

<Listener className="org.apache.catalina.mbeans.JMXAdaptorLifecycleListener" namingPort="RMI-REGISTRY-PORT" port="RMI-EXPORT-PORT" host="localhost"/>

I didn’t look any deeper into this to see whether it supports SSL and/or basic authentication. But it seems clear that this is not a Java agent, because you have to set those up via System.property’s. Here’s what I needed to add to Tomcat startup for my custom agent:

-D___.hostname=localhost
-D___.port.registry=RMI-REGISTRY-AND-EXPORT-PORT
-D___.authenticate=true
-D___.password.file=$CATALINA_HOME/conf/jmxremote.password
-D___.access.file=$CATALINA_HOME/conf/jmxremote.access
-javaagent:PATH/TO/AGENT.jar

I’ve redacted the ___ namespace I used for my agent, and the formal name of the ‘pre-main’-compatible JAR file that I built using the instructions that Daniel provided. Tomcat won’t start up properly until the agent is happy; after that, you’re golden.

So I got Tomcat running, started up an ssh tunnel, and invoked JConsole. And no matter what I did, I still got ‘ConnectException: Operation timed out’. I tried to connect via JConsole in all the following ways:

HOSTNAME:PORT
service:jmx:rmi:///jndi/rmi://HOSTNAME:PORT/jmxrmi
service:jmx:rmi://HOSTNAME:PORT/jndi/rmi://HOSTNAME:PORT/jmxrmi

All of these are valid URLs for connecting via JConsole. For a while there I wasn’t sure whether you could use the same port for both the RMI registry and export, so I could see that the JConsole log was different when I called out the RMI export info explicitly in the service URL. Still, it didn’t seem to help.
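If you’d rather reproduce the connection attempt without JConsole in the loop, the same URLs can be fed to the standard JMX client API. The sketch below is an illustration rather than anything I used at the time; the TunnelClient name and the localhost:9999 default are assumptions, and expand() simply mirrors the HOSTNAME:PORT shorthand described above.

```java
import java.util.HashMap;
import java.util.Map;
import javax.management.MBeanServerConnection;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

// Hypothetical command-line client for testing a tunneled JMX endpoint.
final class TunnelClient {

    // Turns HOSTNAME:PORT into the registry-only form of the service URL.
    static String expand(String hostPort) {
        return "service:jmx:rmi:///jndi/rmi://" + hostPort + "/jmxrmi";
    }

    // Connects, asks for the MBean count as a cheap round trip, and disconnects.
    static int countMBeans(String url, String user, String password) throws Exception {
        Map<String, Object> env = new HashMap<>();
        if (user != null) {
            // Credentials for password-file authentication on the server.
            env.put(JMXConnector.CREDENTIALS, new String[] { user, password });
        }
        try (JMXConnector connector = JMXConnectorFactory.connect(new JMXServiceURL(url), env)) {
            MBeanServerConnection connection = connector.getMBeanServerConnection();
            return connection.getMBeanCount();
        }
    }

    public static void main(String[] args) throws Exception {
        // ASSUMPTION: default matches the example tunnel port in this post.
        String target = args.length > 0 ? args[0] : "localhost:9999";
        String url = target.startsWith("service:") ? target : expand(target);
        String user = args.length > 2 ? args[1] : null;
        String password = args.length > 2 ? args[2] : null;
        System.out.println(url + " -> " + countMBeans(url, user, password) + " MBeans");
    }
}
```

A failure here surfaces as the same java.rmi.ConnectException that JConsole logs, which makes it a bit quicker to iterate on server-side settings.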

Then I started to realize that there were two separate issues going on, although they tended to blend together a lot in the posts I’d been reading.

Addressing the RMI Export Hostname

The short version is, even if you’ve set up your JMX service URL properly on the server – yes, even if you’ve set its HOSTNAME up to be ‘localhost’ – you’ll still need to tell JMX remoting which hostname the RMI export objects should use for callbacks. This requires you to provide the following System.property’s as well:

-Djava.rmi.server.hostname=localhost
-Djava.rmi.server.useLocalHostname=true

The useLocalHostname may not be relevant, but it doesn’t hurt. All this time I’d thought that, because I was configuring that information in the service URL, RMI would build the objects accordingly. But I was wrong … it doesn’t … you need to call that out separately.

What was not apparent to me – until I started to see the same articles pop up when I revised my search criteria – was the IP-ADDRESS in this exception dump:

FINER: [javax.management.remote.rmi.RMIConnector: jmxServiceURL=service:jmx:rmi:///jndi/rmi://localhost:PORT/jmxrmi] failed to connect: java.rmi.ConnectException: Connection refused to host: IP-ADDRESS; nested exception is:
java.net.ConnectException: Operation timed out

It was the IP address of my instance in the cloud. The callbacks were being made back to my VM, but they needed to be made to ‘localhost’ so that they could go through the tunnel that I’d opened. The ‘Operation timed out’ was due to the port being closed, which is the whole reason you’re using ssh tunnels in the first place. Once the RMI exported objects know to use ‘localhost’, that addresses the problem. And magically, JConsole will connect and show you all the data in the world about your server.

So you must provide those System.property’s above regardless of what other configuration you’ve provided in your custom JMX agent.
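One way to check which host the exported objects will call back to, without waiting for JConsole to time out, is to pull the ‘jmxrmi’ stub out of the RMI registry through the tunnel and print it. This is a sketch under a stated assumption: the stub’s toString() format is a JDK implementation detail, and the ‘endpoint:[HOST:PORT]’ fragment that endpointOf() looks for is what I’d expect from Sun/Oracle-style JDKs, not a documented contract. The StubEndpoint name and the defaults are made up for illustration.

```java
import java.rmi.Remote;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical diagnostic: shows what the server-side stub advertises.
final class StubEndpoint {

    // ASSUMPTION: the stub description contains "endpoint:[HOST:PORT]".
    private static final Pattern ENDPOINT = Pattern.compile("endpoint:\\[([^\\]]+)\\]");

    // Returns the text inside endpoint:[...], or null if the format differs.
    static String endpointOf(String stubDescription) {
        Matcher m = ENDPOINT.matcher(stubDescription);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) throws Exception {
        // ASSUMPTION: defaults match the example tunnel in this post.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 9999;

        // Talk to the RMI registry (through the tunnel) and fetch the JMX stub.
        Registry registry = LocateRegistry.getRegistry(host, port);
        Remote stub = registry.lookup("jmxrmi");

        System.out.println("stub: " + stub);
        System.out.println("advertised endpoint: " + endpointOf(stub.toString()));
    }
}
```

If the advertised endpoint shows the instance’s own IP address rather than localhost, the java.rmi.server.hostname setting hasn’t taken effect, and the client will try to bypass the tunnel.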

Additional Concerns

There were a number of other red herrings that I followed for a while, but they were called out as being potential issues, so I kept note.

If your server is running Linux, there are a couple of things you’ll want to check, to make sure that your /etc/hosts is self-referencing correctly, and that you’re not filtering out packets.
You will have troubles stopping Tomcat when it has been started with your custom JMX agent; you’ll have to kill the process. Apparently agents don’t release their thread-pools very nicely. Daniel provides an example of an agent with a thread-cleaning thread – which still has some limitations, and raises the philosophical question ‘who cleans the thread-cleaner’? He also provides an agent that can be remotely stopped – which is reasonably complex. I’ll save that one for a rainy day.
If you want to use SSL in your authentication chain, read up on Daniel’s other postings, and use the following System.property’s on both the server and when starting JConsole:
```
-Djavax.net.ssl.keyStore
-Djavax.net.ssl.keyStorePassword
-Djavax.net.ssl.trustStore
-Djavax.net.ssl.trustStorePassword
```
I have built some Ruby scripts which allow me to dynamically look up an AWS instance’s public DNS entry and then start up a Net::SSH process with port-forwarding tunnels. This works fine for HTTP and even for remote JVM debugging, but it did not work for JMX remoting. I’m not sure why, so you should stick with using ssh for setting up your tunnels.
I started out this exercise using Wireshark for packet sniffing. I’m using OS X, and I installed Wireshark – the successor to Ethereal – via MacPorts. It runs under X11, which you’ll need to install from either Apple’s site or your Optional Installs DVD. I couldn’t get any Interfaces (ethernet card, etc.) to show up, until I learned that I should:
```
% sudo wireshark
```
The app will warn you that this is unsafe, but it works. The Nominet team says that you can address this issue by providing:
```
% sudo chmod go+r /dev/bpf*
```
However that is volatile, and has to be done whenever your Mac starts up. More config involved, so I took the easy path.
If you’re using a script to start and stop Tomcat, you’ll need to somehow separate out the System.property’s that should be used on startup, and omit them when invoking shutdown. If you invoke shutdown with your debug and/or RMI ports specified, the shutdown will fail because those ports are already in use. I’m using the newest standard Tomcat RPM available for Fedora Core 8 – tomcat5-5.5.27 – and it’s uniquely nutty in terms of how it is deployed:
```
/etc/init.d/tomcat5
/usr/bin/dtomcat5
/etc/tomcat5/*.conf
```
That’s a very non-standard arrangement. The init.d script awks the *.conf files, and a whole array of other exciting things. I still haven’t gotten it to properly do an init.d restart due to how it blends the JAVA_OPTS handling. So that’s left as a case-specific exercise.
The whole reason I went down this path was to address a memory leak relating to Hibernate sessions which I blogged about a long time ago. The fix required me to invoke Tomcat with the following System.property’s:
```
-Djavax.rmi.CORBA.PortableRemoteObjectClass
-Djava.naming.factory.initial
```
The org.objectweb.carol JAR, which these settings were targeted at, is part of my webapp, so it’s available in its own Classloader. However, once I put the custom JMX agent in place, I got:
```
FATAL ERROR in native method: processing of -javaagent failed
Exception in thread "main" java.lang.reflect.InvocationTargetException
Caused by: java.lang.ClassNotFoundException: org.objectweb.carol.jndi.spi.MultiOrbInitialContextFactory
```
Attempting to create a symlink to the app-specific JAR in either common/lib, common/endorsed or shared/lib did not address the issue. I had to hack the JAR into the -classpath in order to get Tomcat to start. And yes, hack was the operative term (again).

In Summary

Frankly, all that discovery was enough for one day. And yes, it took me that long to find all of the corner-cases I was dealing with. I hope that if you find this article, it will make your path a bit easier. I know I’ll be glad that I blogged about the details the next time I bump into the issue!
