Cover a range of common Spark tasks with these 10 code examples

Itexamtools.com
2 min read · Oct 21, 2023

Here are ten essential Apache Spark code examples in Scala to help you get started:

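1. Creating a SparkSession:

scala

import org.apache.spark.sql.SparkSession

// Build (or reuse) the session that serves as the entry point to Spark
val spark = SparkSession.builder()
  .appName("SparkExample")
  .master("local[*]") // local mode for testing; omit when launching with spark-submit
  .getOrCreate()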

2. Reading Data from a CSV File:

scala

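// Take column names from the first row and infer column types
val df = spark.read.option("header", "true").option("inferSchema", "true").csv("data.csv")
df.show()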
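
When column types matter, one option is to pass an explicit schema rather than letting Spark decide. The sketch below is only an illustration: the column names (name, age, gender, department) are assumptions borrowed from the later examples, the two sample rows are made up, and a temporary file stands in for data.csv so the snippet runs on its own.

```scala
import java.nio.file.Files
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

val spark = SparkSession.builder()
  .appName("CsvSchemaSketch")
  .master("local[*]") // assumption: local run for experimentation
  .getOrCreate()

// Made-up sample data written to a temporary file (stand-in for data.csv)
val csvPath = Files.createTempFile("people", ".csv")
Files.write(csvPath, "name,age,gender,department\nAna,34,F,Sales\nBo,19,M,IT\n".getBytes("UTF-8"))

// Hypothetical schema; adjust the names and types to match your file
val schema = StructType(Seq(
  StructField("name", StringType, nullable = true),
  StructField("age", IntegerType, nullable = true),
  StructField("gender", StringType, nullable = true),
  StructField("department", StringType, nullable = true)
))

val people = spark.read
  .option("header", "true") // skip the header row instead of reading it as data
  .schema(schema)           // use the declared types rather than inferring them
  .csv(csvPath.toString)

people.printSchema()
people.show()
```

Declaring the schema up front also avoids the extra pass over the file that schema inference needs.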

3. Writing Data to Parquet Format:

scala

df.write.parquet("data.parquet")
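
By default a write fails if the target path already exists, so scripts that are run repeatedly often set a save mode. Here is a minimal sketch of a round trip, assuming a small made-up DataFrame and a temporary output directory (both are illustrative, not part of the original example):

```scala
import java.nio.file.Files
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("ParquetRoundTripSketch")
  .master("local[*]") // assumption: local run for experimentation
  .getOrCreate()
import spark.implicits._

// Made-up sample rows
val sample = Seq(("Ana", 34), ("Bo", 19)).toDF("name", "age")

// Temporary location used only for this sketch
val outPath = Files.createTempDirectory("parquet-sketch").resolve("data.parquet").toString

// "overwrite" replaces existing output instead of failing
sample.write.mode("overwrite").parquet(outPath)

// Parquet stores the schema, so no options are needed when reading back
val roundTrip = spark.read.parquet(outPath)
roundTrip.printSchema()
```

Note that the path names a directory of part files, not a single file.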

4. Basic DataFrame Operations:

scala

// Transformations are lazy; show() triggers execution and prints the result
df.select("column1", "column2").show()
df.filter(df("age") > 21).show()
df.groupBy("gender").count().show()

5. Spark Streaming from a Socket:

scala

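import org.apache.spark.streaming._
val ssc = new StreamingContext(spark.sparkContext, Seconds(1)) // 1-second batches
val lines = ssc.socketTextStream("localhost", 9999)
lines.print()          // at least one output operation is required before start()
ssc.start()            // begin receiving data
ssc.awaitTermination() // block until the stream is stopped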

6. Spark MLlib for Machine Learning:

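scala

import org.apache.spark.ml.classification.LogisticRegression

// Configure the estimator; training happens when fit() is called on a DataFrame
val lr = new LogisticRegression()
  .setMaxIter(10)    // maximum number of optimizer iterations
  .setRegParam(0.01) // regularization strength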
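
The estimator does nothing until it is fitted. As a rough sketch of that next step, the snippet below trains on four made-up rows; the data is purely illustrative, and the "label" and "features" column names are the defaults that LogisticRegression looks for.

```scala
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("LogisticRegressionSketch")
  .master("local[*]") // assumption: local run for experimentation
  .getOrCreate()

// Made-up training data: (label, feature vector)
val training = spark.createDataFrame(Seq(
  (1.0, Vectors.dense(0.0, 1.1, 0.1)),
  (0.0, Vectors.dense(2.0, 1.0, -1.0)),
  (0.0, Vectors.dense(2.0, 1.3, 1.0)),
  (1.0, Vectors.dense(0.0, 1.2, -0.5))
)).toDF("label", "features")

val lr = new LogisticRegression()
  .setMaxIter(10)
  .setRegParam(0.01)

// fit() returns a LogisticRegressionModel that can score new rows
val model = lr.fit(training)
println(s"Coefficients: ${model.coefficients} Intercept: ${model.intercept}")

// transform() appends prediction columns to the input DataFrame
model.transform(training).select("features", "label", "prediction").show()
```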

7. Spark SQL for Structured Data:

scala

// Register the DataFrame as a temporary view so it can be queried with SQL
df.createOrReplaceTempView("people")
val sqlDF = spark.sql("SELECT name, age FROM people WHERE age > 20")
sqlDF.show()

8. Pair RDD Transformation (map and reduceByKey):

scala

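val rdd = spark.sparkContext.parallelize(Seq("A", "B", "A", "C", "B"))
val pairRDD = rdd.map(word => (word, 1)) // pair each word with a count of 1
val result = pairRDD.reduceByKey(_ + _)  // sum the counts per key
result.collect().foreach(println)        // prints (A,2), (B,2), (C,1) in no guaranteed order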
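
The same map and reduceByKey pattern extends to a word count over lines of text by adding a flatMap step. This is a small sketch with made-up input lines, not a production job:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("WordCountSketch")
  .master("local[*]") // assumption: local run for experimentation
  .getOrCreate()

// Made-up input: each element is one line of text
val textLines = spark.sparkContext.parallelize(Seq("a b a", "c b"))

val counts = textLines
  .flatMap(_.split("\\s+"))  // split each line into words
  .map(word => (word, 1))    // pair each word with a count of 1
  .reduceByKey(_ + _)        // sum the counts per word

// collect() brings the results to the driver; only do this for small outputs
val countsByWord = counts.collect().toMap
println(countsByWord)
```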

9. Working with DataFrames and SQL:

scala

val avgAge = df.groupBy("department").avg("age")
avgAge.show()
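
The aggregation above uses the DataFrame API; the same result can be expressed in SQL once the DataFrame is registered as a view. The sketch below shows both forms side by side, assuming a small made-up dataset with name, age, and department columns (the rows and the avg_age alias are illustrative):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.avg

val spark = SparkSession.builder()
  .appName("DataFrameVsSqlSketch")
  .master("local[*]") // assumption: local run for experimentation
  .getOrCreate()
import spark.implicits._

// Made-up sample rows
val staff = Seq(
  ("Ana", 34, "Sales"),
  ("Bo", 19, "IT"),
  ("Cy", 41, "Sales")
).toDF("name", "age", "department")

// DataFrame API version
val avgAgeDf = staff.groupBy("department").agg(avg("age").alias("avg_age"))

// SQL version of the same query
staff.createOrReplaceTempView("staff")
val avgAgeSql = spark.sql(
  "SELECT department, AVG(age) AS avg_age FROM staff GROUP BY department")

avgAgeDf.show()
avgAgeSql.show()
```

Both forms go through the same query optimizer, so the choice is mostly a matter of readability.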

10. Spark Streaming with Kafka:

scala

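import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010._

val ssc = new StreamingContext(spark.sparkContext, Seconds(1))
// kafka010 has no createStream; it reads from the brokers directly (localhost:9092 is an assumed address)
val kafkaParams = Map[String, Object]("bootstrap.servers" -> "localhost:9092", "group.id" -> "my-group",
  "key.deserializer" -> classOf[StringDeserializer], "value.deserializer" -> classOf[StringDeserializer])
val stream = KafkaUtils.createDirectStream[String, String](ssc, LocationStrategies.PreferConsistent,
  ConsumerStrategies.Subscribe[String, String](Seq("my-topic"), kafkaParams))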

Hope this helps.

These code examples cover a range of common Spark tasks, including creating SparkSessions, reading and writing data, Spark Streaming, machine learning with MLlib, Spark SQL, pair RDD transformations, and integrating with Apache Kafka. You can use these as a starting point for your Spark applications and adapt them to your specific needs.

Learn more
— — — — — — — — — —
https://itexamtools.com
https://itexamsusa.blogspot.com/
https://askcarlito.blogspot.com/
