Replies: 10 comments
-
Hi @Antonio-Sorrentini. PS: your code example is pure Apache Spark for manually creating a DataFrame, not really Spark NLP.
-
Ok @maziyarpanahi, thanks.
-
Hi @Antonio-Sorrentini! Here is an example of a Java pipeline with Spark NLP. You can find the library in the Maven repo.
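A minimal sketch of such a Java pipeline, assuming a local SparkSession and using only the DocumentAssembler and Tokenizer annotators from the public Spark NLP API (column names and the sample sentence are illustrative):

```java
import java.util.Arrays;

import org.apache.spark.ml.Pipeline;
import org.apache.spark.ml.PipelineStage;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructType;

import com.johnsnowlabs.nlp.DocumentAssembler;
import com.johnsnowlabs.nlp.annotators.Tokenizer;

public class SparkNlpJavaPipeline {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("spark-nlp-java")
                .master("local[*]")
                .getOrCreate();

        // Plain Spark SQL: a one-column DataFrame holding the raw text
        StructType schema = new StructType().add("text", DataTypes.StringType);
        Dataset<Row> data = spark.createDataFrame(
                Arrays.asList(RowFactory.create("Google has announced a new TensorFlow release.")),
                schema);

        // Spark NLP: raw text -> document annotations
        DocumentAssembler documentAssembler = new DocumentAssembler();
        documentAssembler.setInputCol("text");
        documentAssembler.setOutputCol("document");

        // Spark NLP: document -> tokens
        Tokenizer tokenizer = new Tokenizer();
        tokenizer.setInputCols(new String[]{"document"});
        tokenizer.setOutputCol("token");

        // Assemble and run the pipeline like any other Spark ML pipeline
        Pipeline pipeline = new Pipeline()
                .setStages(new PipelineStage[]{documentAssembler, tokenizer});

        Dataset<Row> annotated = pipeline.fit(data).transform(data);
        annotated.select("token.result").show(false);
    }
}
```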
-
I am trying to use a pretrained pipeline and running into an issue.

```java
SparkSession spark = SparkSession
    .builder()
    .appName(appName)
    .master(master)
    .config("spark.jars.packages", "JohnSnowLabs:spark-nlp:2.2.1")
    .getOrCreate();

PretrainedPipeline pipeline = new PretrainedPipeline("explain_document_ml", "en",
    PretrainedPipeline$.MODULE$.apply$default$3());

List<StructField> fields = new ArrayList<StructField>();
StructField field1 = DataTypes.createStructField("id", DataTypes.IntegerType, true);
StructField field2 = DataTypes.createStructField("text", DataTypes.StringType, true);
fields.add(field1);
fields.add(field2);
StructType schema = DataTypes.createStructType(fields);

List<Row> rows = Lists.newArrayList(
    RowFactory.create(1, "Google has announced the release of a beta version of the popular TensorFlow machine learning library"),
    RowFactory.create(2, "The Paris metro will soon enter the 21st century, ditching single-use paper tickets for rechargeable electronic cards.")
);

Dataset<Row> testData = spark.createDataFrame(rows, schema);
Dataset<Row> annotation = pipeline.transform(testData);
```

I'm getting a NoClassDefFoundError that I can't quite figure out. It is thrown on the line creating the PretrainedPipeline object. I tried explicitly including the scala-library jar (version 2.12.x) in my project, but that didn't fix it. Any ideas?
-
In case anyone else comes across this: I did get my NoClassDefFoundError fixed. My problem was that I was using the spark-core_2.12 version, but Spark NLP is built with _2.11 (which I should have seen originally, since it's in the name of the library). Once I changed the Spark libraries to _2.11, it worked.
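Concretely, the fix is to keep the Scala binary suffix consistent across the Spark and Spark NLP artifacts. A hypothetical Maven fragment pairing them (versions are examples only, not taken from this thread):

```xml
<!-- Hypothetical example: the Scala suffix (_2.11 here) must match across artifacts -->
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.11</artifactId>
    <version>2.4.4</version>
</dependency>
<dependency>
    <groupId>com.johnsnowlabs.nlp</groupId>
    <artifactId>spark-nlp_2.11</artifactId>
    <version>2.2.1</version>
</dependency>
```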
-
You can use the Javadoc reference here: https://javadoc.io/doc/com.johnsnowlabs.nlp/spark-nlp-gpu_2.11/latest/index.html#package. That's not enough to work with the library completely, but it's a start.
-
@dkincaid It seems SparkNLP has now moved to Scala 2.12. Did you try again with that version? What did you do to change the Spark libraries to 2.11? I am asking because I am also getting a ClassNotFoundException.
-
Answering my own question: I was able to fix the ClassNotFoundException by explicitly adding the missing dependency to my POM (see the sketch below).
This is a bit strange, because the SparkNLP POM does specify the required dependency, so this should not be necessary; I would expect it to get pulled in as a transitive dependency. I presume this is because the scope of that dependency in the SparkNLP POM is provided.
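Given the follow-up about dependency scope, the addition would have been one of the Apache Spark artifacts. A hypothetical reconstruction (the artifact and version are illustrative and must match your Spark installation):

```xml
<!-- Hypothetical reconstruction: Spark has to be declared explicitly here
     because Spark NLP marks its Spark dependencies as provided -->
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-mllib_2.12</artifactId>
    <version>3.1.2</version>
</dependency>
```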
-
Spark NLP started supporting Scala 2.12 with the 3.x release. We now support both Scala _2.11 and _2.12: Apache Spark 2.3.x and 2.4.x (Scala 2.11), and Apache Spark 3.0.x and 3.1.x (Scala 2.12).
You are absolutely correct: since the Apache Spark dependencies are declared with provided scope, they are not pulled in transitively.
-
I published this repo featuring lemmatization with Spark NLP using the Java language. It is used in a web app deployed here.
-
On the home page of the website https://nlp.johnsnowlabs.com/ I read "Full Python, Scala, and Java support".
Unfortunately, it has been 3 days now that I have been trying to use Spark NLP in Java without any success. Even converting this single line

```scala
val testData = spark.createDataFrame(Seq((1, "Google ..."),(2, "The Paris ..."))).toDF("id", "text")
```

from Scala to Java defeated me: neither searching on Google for 3 days nor trying on my own got me to a working conversion.
Sorry to annoy you with this, but is there anyone out there who is really using Spark NLP with Java? How do you do it? Are there online resources available to learn how to do it?
Thanks anyway.
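For reference, a Java equivalent of that Scala line, following the same RowFactory/StructType approach shown earlier in the thread (the CreateTestData wrapper class is a hypothetical name used only to make the snippet compilable):

```java
import java.util.Arrays;
import java.util.List;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructType;

public class CreateTestData {
    public static Dataset<Row> createTestData(SparkSession spark) {
        // Two (id, text) rows, mirroring the Scala Seq of tuples
        List<Row> rows = Arrays.asList(
                RowFactory.create(1, "Google ..."),
                RowFactory.create(2, "The Paris ..."));

        // An explicit schema replaces the .toDF("id", "text") column naming
        StructType schema = new StructType()
                .add("id", DataTypes.IntegerType)
                .add("text", DataTypes.StringType);

        return spark.createDataFrame(rows, schema);
    }
}
```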