The bug is occurring again. We are upgrading to Spark 2.0.1, and this LinkageError happens in only 1 of 3 Spark jobs, even though all of them write to HBase.
The previous solution no longer works, so its apparent success was only a coincidence...
But I have found 2 working workarounds / solutions.
Workaround / Solution 1:
I noticed the following behavior of Spark:
- In the driver, Spark loaded the classes MapRPut, MapRCallBackQueue and PutConverter in the same classloader.
- In the executor, however, it loaded them across two classloaders, so MapRPut ended up in a different classloader than the other two.
I don't know the reason for that.
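For anyone who wants to check the same thing on their own cluster, here is a minimal sketch of how the defining classloader of each class could be reported. The helper class `LoaderReport` and its method names are made up for illustration; only the three MapR class names come from the error message. It would have to run inside the driver and inside an executor task (for example from within `foreachPartition`) to compare the two sides.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Spark or MapR.
final class LoaderReport {

    /** Returns a readable name for the loader that defined the class, or a note if it cannot be loaded. */
    static String definingLoader(String className) {
        try {
            ClassLoader context = Thread.currentThread().getContextClassLoader();
            // initialize=false: only locate the class, do not run its static initializers
            Class<?> c = Class.forName(className, false, context);
            ClassLoader cl = c.getClassLoader();
            // getClassLoader() returns null for classes defined by the bootstrap loader
            return cl == null
                    ? "bootstrap"
                    : cl.getClass().getName() + "@" + Integer.toHexString(System.identityHashCode(cl));
        } catch (ClassNotFoundException | LinkageError e) {
            return "not loadable (" + e.getClass().getSimpleName() + ")";
        }
    }

    /** Builds a className -> loader description map, keeping the input order. */
    static Map<String, String> report(String... classNames) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String name : classNames) {
            result.put(name, definingLoader(name));
        }
        return result;
    }

    public static void main(String[] args) {
        // Class names taken from the LinkageError message; without the MapR jars
        // on the classpath each entry is simply reported as not loadable.
        report("com.mapr.fs.jni.MapRPut",
               "com.mapr.fs.jni.MapRCallBackQueue",
               "com.mapr.fs.hbase.PutConverter")
            .forEach((name, loader) -> System.out.println(name + " -> " + loader));
    }
}
```

If the three lines show the same loader in the driver but two different loaders in the executor, that matches the behavior described above.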
As a workaround I tried to force loading of MapRPut in the same classloader as the other classes, and the bug disappeared.
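The general idea can be sketched roughly like this. It is only an illustration under assumptions, not the exact code from the job: `SameLoaderPreload` and `loadViaLoaderOf` are hypothetical names, and whether the class really ends up in the requested loader depends on how that loader delegates to its parents, which is why the sketch checks the result instead of assuming it.

```java
// Hypothetical helper, not part of Spark or MapR.
final class SameLoaderPreload {

    /**
     * Asks the loader that defined `anchor` to load `className` and reports whether
     * the class was really defined by that same loader. Because of parent delegation
     * a parent loader may define it instead, in which case this returns false.
     */
    static boolean loadViaLoaderOf(Class<?> anchor, String className) throws ClassNotFoundException {
        ClassLoader target = anchor.getClassLoader(); // null means the bootstrap loader
        Class<?> loaded = Class.forName(className, true, target);
        return loaded.getClassLoader() == target;
    }

    public static void main(String[] args) {
        try {
            // Class names taken from the LinkageError message; needs the MapR jars on the classpath.
            Class<?> anchor = Class.forName("com.mapr.fs.hbase.PutConverter");
            boolean same = loadViaLoaderOf(anchor, "com.mapr.fs.jni.MapRPut");
            System.out.println("MapRPut defined by the same loader as PutConverter: " + same);
        } catch (ClassNotFoundException e) {
            System.out.println("MapR classes not on the classpath: " + e.getMessage());
        }
    }
}
```

Such a call would have to run on the executor before the first write, since that is where the classes were split across loaders.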
Workaround / Solution 2:
Set the JVM option "mapr.library.flatclass", e.g. "--conf spark.executor.extraJavaOptions=-Dmapr.library.flatclass".
This solves the problem completely without other changes.
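To confirm that the option actually reaches the executor JVM, a small check like the following can help. It is a sketch: `FlatClassFlag` is a hypothetical helper, and it only tests whether the system property is present on the JVM side. How the MapR client library interprets the property is not something I can confirm.

```java
// Hypothetical helper, not part of Spark or MapR.
final class FlatClassFlag {

    static final String KEY = "mapr.library.flatclass";

    /** True if the JVM has the property set, with or without a value. */
    static boolean isSet() {
        // "-Dmapr.library.flatclass" without "=value" sets the property to an empty string,
        // so the check is for non-null rather than for a particular value.
        return System.getProperty(KEY) != null;
    }

    public static void main(String[] args) {
        System.out.println(KEY + " set: " + isSet());
    }
}
```

Calling `FlatClassFlag.isSet()` from inside a task shows the executor side. The driver JVM is configured separately through "spark.driver.extraJavaOptions"; in my case setting the executor option was enough.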
So now the question to the community and MapR support: what does this setting do, and can someone explain this behaviour?
I have never experienced this classloader problem with any other framework, so I assume it is MapR specific.
We had been using MapR 4.0.1 with Spark 1.3.1 (non-MapR version) for a long time.
Now we have updated the server to 5.1, while the client Maven artifacts are still on 4.1 (this works fine).
When I try to update the Maven artifacts to 5.1 as well, my Spark jobs sporadically fail when writing to the MapR tables via the HBase API.
Both normal writes to MapR-DB via "saveAsNewHadoopDataset" and incremental bulk inserts have this problem.
java.lang.LinkageError: loader constraint violation: when resolving method "com.mapr.fs.jni.MapRPut.<init>(Ljava/lang/Object;Lcom/mapr/fs/jni/MapRCallBackQueue;)V" the class loader (instance of org/apache/spark/util/MutableURLClassLoader) of the current class, com/mapr/fs/hbase/PutConverter, and the class loader (instance of sun/misc/Launcher$ExtClassLoader) for the method's defining class, com/mapr/fs/jni/MapRPut, have different Class objects for the type com/mapr/fs/jni/MapRCallBackQueue used in the signature
Here is the stacktrace:
The strange thing is that sometimes the jobs work and sometimes the LinkageError is thrown.
- we are using mapr-db
- we are not using the mapr spark packages
- we are using spark against a dedicated mesos cluster
Maybe someone can help me or give me a hint on how to solve this problem.
Any help is appreciated.