CopyPastor

Detecting plagiarism made easy.

Score: 1.8390137553215027; Reported for: String similarity, Exact paragraph match Open both answers

Possible Plagiarism

Reposted on 2021-04-10
by Chen

Original Post

Original - Posted on 2021-04-10
by Chen



            
Present in both answers; Present only in the new answer; Present only in the old answer;

In short, there is a very basic concept in RDD called RDD operation: Transformation v.s. Action. See more details here: https://data-flair.training/blogs/spark-rdd-operations-transformations-actions/
The part `.toRdd.mapPartitions` in the code below
lazy val rdd: RDD[T] = { val objectType = exprEnc.deserializer.dataType rddQueryExecution.toRdd.mapPartitions { rows => rows.map(_.get(0, objectType).asInstanceOf[T]) } } is just adding a transformation from DataFrame's Tungsten storage format of RDD[InternalRow] into RDD[JavaObject], which will not be executed if you just want to do a `getNumPartitions` operation.
This added transformation will only be executed when you have follow-up actions on the converted RDD.
There is a very basic concept in RDD called RDD operation: Transformation v.s. Action. See more details here: https://data-flair.training/blogs/spark-rdd-operations-transformations-actions/
The part `.toRdd.mapPartitions` in the code below
lazy val rdd: RDD[T] = { val objectType = exprEnc.deserializer.dataType rddQueryExecution.toRdd.mapPartitions { rows => rows.map(_.get(0, objectType).asInstanceOf[T]) } } is just adding a transformation from DataFrame's Tungsten storage format of RDD[InternalRow] into RDD[JavaObject], which will not be executed if you just want to do a `getNumPartitions` operation.
This added transformation will only be executed when you have follow-up actions on the converted RDD.

        
Present in both answers; Present only in the new answer; Present only in the old answer;