This post describes the SparkR error java.lang.ClassCastException: java.lang.String cannot be cast to org.apache.spark.unsafe.types.UTF8String, which occurs when reading a CSV file with the spark-csv package, and shows how to fix it.
Start the SparkR shell with the following command:
bin/sparkR --packages com.databricks:spark-csv_2.10:1.0.3
Then read in the CSV file:
flights <- read.df(sqlContext, "/sparktest/nycflights13.csv", "com.databricks.spark.csv", header="true")
head(flights)
This fails with the following error:
16/04/07 23:06:46 ERROR CsvRelation$: Exception while parsing line: 2013,1,1,914,-6,1244,4,"AA","N517AA",1589,"EWR","DFW",238,1372,9,14.
java.lang.ClassCastException: java.lang.String cannot be cast to org.apache.spark.unsafe.types.UTF8String
at org.apache.spark.sql.catalyst.expressions.BaseGenericInternalRow$class.getUTF8String(rows.scala:46)
at org.apache.spark.sql.catalyst.expressions.GenericMutableRow.getUTF8String(rows.scala:248)
at org.apache.spark.sql.catalyst.expressions.BoundReference.eval(BoundAttribute.scala:49)
at org.apache.spark.sql.catalyst.expressions.UnaryExpression.eval(Expression.scala:295)
at org.apache.spark.sql.catalyst.expressions.InterpretedMutableProjection.apply(Projection.scala:84)
at org.apache.spark.sql.catalyst.expressions.InterpretedMutableProjection.apply(Projection.scala:60)
at com.databricks.spark.csv.CsvRelation$$anonfun$com$databricks$spark$csv$CsvRelation$$parseCSV$1.apply(CsvRelation.scala:150)
at com.databricks.spark.csv.CsvRelation$$anonfun$com$databricks$spark$csv$CsvRelation$$parseCSV$1.apply(CsvRelation.scala:130)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:308)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
at scala.collection.AbstractIterator.to(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
Cause:
The wrong version of the spark-csv package was loaded: com.databricks:spark-csv_2.10:1.0.3. This version predates Spark's move to an internal UTF8String representation for string columns, so the plain java.lang.String values it puts into rows cannot be cast to org.apache.spark.unsafe.types.UTF8String by newer Spark releases, which is exactly the ClassCastException above.
Solution:
Restart the SparkR shell with a newer spark-csv version:
bin/sparkR --packages com.databricks:spark-csv_2.10:1.3.0
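
For reference, a minimal end-to-end session with the corrected package version might look as follows. The file path and header option are taken from the example above; the printSchema() call and the inferSchema option are illustrative additions (inferSchema is a standard spark-csv option, but the original session does not use it):

bin/sparkR --packages com.databricks:spark-csv_2.10:1.3.0

# Inside the SparkR shell (sqlContext is created automatically):
flights <- read.df(sqlContext, "/sparktest/nycflights13.csv",
                   source = "com.databricks.spark.csv",
                   header = "true",       # use the first row as column names
                   inferSchema = "true")  # infer numeric column types instead of all-string
head(flights)         # now returns the first rows instead of throwing
printSchema(flights)  # verify the inferred column types

If you prefer not to pass --packages on every launch, the same Maven coordinate can also be set once via the spark.jars.packages property in conf/spark-defaults.conf.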