java.lang.NoClassDefFoundError: Could not initialize class org.tensorflow.Graph when using pretrained model #14016

Open
@hilmar05

Is there an existing issue for this?

  • I have searched the existing issues and did not find a match.

Who can help?

No response

What are you working on?

I am trying to get spark-nlp to work on Databricks using an example from the documentation.

Current Behavior

sentence_detector_dl download started this may take some time.
Approximate size to download 514.9 KB
[ / ]
An error occurred while calling z:com.johnsnowlabs.nlp.pretrained.PythonResourceDownloader.downloadModel.
: java.lang.NoClassDefFoundError: Could not initialize class org.tensorflow.Graph
	at com.johnsnowlabs.ml.tensorflow.TensorflowWrapper$.readGraph(TensorflowWrapper.scala:415)
	at com.johnsnowlabs.ml.tensorflow.TensorflowWrapper$.unpackWithoutBundle(TensorflowWrapper.scala:330)
	at com.johnsnowlabs.ml.tensorflow.TensorflowWrapper$.read(TensorflowWrapper.scala:484)
	at com.johnsnowlabs.ml.tensorflow.ReadTensorflowModel.readTensorflowModel(TensorflowSerializeModel.scala:154)
	at com.johnsnowlabs.ml.tensorflow.ReadTensorflowModel.readTensorflowModel$(TensorflowSerializeModel.scala:123)
	at com.johnsnowlabs.nlp.annotators.sentence_detector_dl.SentenceDetectorDLModel$.readTensorflowModel(SentenceDetectorDLModel.scala:648)
	at com.johnsnowlabs.nlp.annotators.sentence_detector_dl.ReadsSentenceDetectorDLGraph.readSentenceDetectorDLGraph(SentenceDetectorDLModel.scala:621)
	at com.johnsnowlabs.nlp.annotators.sentence_detector_dl.ReadsSentenceDetectorDLGraph.readSentenceDetectorDLGraph$(SentenceDetectorDLModel.scala:616)
	at com.johnsnowlabs.nlp.annotators.sentence_detector_dl.SentenceDetectorDLModel$.readSentenceDetectorDLGraph(SentenceDetectorDLModel.scala:648)
	at com.johnsnowlabs.nlp.annotators.sentence_detector_dl.ReadsSentenceDetectorDLGraph.$anonfun$$init$$1(SentenceDetectorDLModel.scala:625)
	at com.johnsnowlabs.nlp.annotators.sentence_detector_dl.ReadsSentenceDetectorDLGraph.$anonfun$$init$$1$adapted(SentenceDetectorDLModel.scala:625)
	at com.johnsnowlabs.nlp.ParamsAndFeaturesReadable.$anonfun$onRead$1(ParamsAndFeaturesReadable.scala:50)
	at com.johnsnowlabs.nlp.ParamsAndFeaturesReadable.$anonfun$onRead$1$adapted(ParamsAndFeaturesReadable.scala:49)
	at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
	at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
	at com.johnsnowlabs.nlp.ParamsAndFeaturesReadable.onRead(ParamsAndFeaturesReadable.scala:49)
	at com.johnsnowlabs.nlp.ParamsAndFeaturesReadable.$anonfun$read$1(ParamsAndFeaturesReadable.scala:61)
	at com.johnsnowlabs.nlp.ParamsAndFeaturesReadable.$anonfun$read$1$adapted(ParamsAndFeaturesReadable.scala:61)
	at com.johnsnowlabs.nlp.FeaturesReader.load(ParamsAndFeaturesReadable.scala:38)
	at com.johnsnowlabs.nlp.FeaturesReader.load(ParamsAndFeaturesReadable.scala:24)
	at com.johnsnowlabs.nlp.pretrained.ResourceDownloader$.downloadModel(ResourceDownloader.scala:518)
	at com.johnsnowlabs.nlp.pretrained.ResourceDownloader$.downloadModel(ResourceDownloader.scala:510)
	at com.johnsnowlabs.nlp.pretrained.PythonResourceDownloader$.downloadModel(ResourceDownloader.scala:709)
	at com.johnsnowlabs.nlp.pretrained.PythonResourceDownloader.downloadModel(ResourceDownloader.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:397)
	at py4j.Gateway.invoke(Gateway.java:306)
	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
	at py4j.commands.CallCommand.execute(CallCommand.java:79)
	at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:195)
	at py4j.ClientServerConnection.run(ClientServerConnection.java:115)
	at java.lang.Thread.run(Thread.java:750)

Expected Behavior

The example pipeline should download the pretrained models and run without any errors.

Steps To Reproduce

from sparknlp.base import DocumentAssembler
from sparknlp.annotator import SentenceDetectorDLModel, MarianTransformer
from pyspark.ml import Pipeline
document_assembler = DocumentAssembler().setInputCol("text").setOutputCol("document")

sentence_detector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx").setInputCols("document").setOutputCol("sentence")

marian_transformer = MarianTransformer.pretrained().setInputCols("sentence").setOutputCol("translation")

pipeline = Pipeline().setStages([document_assembler,  sentence_detector, marian_transformer])

data = spark.createDataFrame([["You can use Spark NLP to translate text. " + \
                               "This example pipeline translates English to French"]]).toDF("text")

# Create a pipeline model that can be reused across multiple data frames
model = pipeline.fit(data)

# You can use the model on any data frame that has a "text" column
result = model.transform(data)

display(result.select("text", "translation.result"))

Spark NLP version and Apache Spark

Spark NLP version: 5.1.2
Spark version: 3.4.1

Databricks Runtime Version: 13.3 LTS (includes Apache Spark 3.4.1, Scala 2.12)

Type of Spark Application

Python Application

Java Version

No response

Java Home Directory

No response

Setup and installation

I installed the libraries below directly on the cluster.

spark-nlp==5.1.2
com.johnsnowlabs.nlp:spark-nlp_2.12:5.1.2
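
As a sanity check on the two pins above, here is a minimal sketch that confirms the PyPI package and the Maven artifact are on the same Spark NLP version, and extracts the Scala suffix so it can be compared with the runtime (Scala 2.12 on this cluster). The helper name `parse_pins` is hypothetical and not part of Spark NLP. It only parses the strings and does not explain the error above.

```python
# Hypothetical helper (not part of Spark NLP): compares the two library pins.
def parse_pins(pypi_pin: str, maven_coord: str) -> dict:
    # "spark-nlp==5.1.2" -> package name and version
    name, _, py_version = pypi_pin.partition("==")
    # Maven coordinates have the form "group:artifact_scalaVersion:version"
    group, artifact, jar_version = maven_coord.split(":")
    base, _, scala_version = artifact.rpartition("_")
    return {
        "python_package": name,
        "python_version": py_version,
        "jar_group": group,
        "jar_artifact": base,
        "jar_version": jar_version,
        "scala_version": scala_version,
        "versions_match": py_version == jar_version,
    }


pins = parse_pins("spark-nlp==5.1.2", "com.johnsnowlabs.nlp:spark-nlp_2.12:5.1.2")
print(pins)
```

For the pins listed here, `versions_match` is true and the Scala suffix is `2.12`, so the two installed libraries are consistent with each other and with the Databricks runtime's Scala version.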

Operating System and Version

No response

Link to your project (if available)

No response

Additional Information

No response
