Dynamic width file in Spark
1. How to handle Dynamic Width File in Spark
A dynamic-width file is a common kind of source from mainframe systems. The demonstration below is one efficient
way to handle a dynamic-width file using Scala, Spark RDDs, and DataFrames. Check this code and execute it in your REPL.
Source File
Schema of the File
Code to be Executed
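case class Subjectwisemarks(subject: String, marks: Int)
case class ScoreRecord(id: Int, fname: String, lname: String, numberofsubject: Int,
  subjectwisemarks: Seq[Subjectwisemarks])
// data is assumed to be an RDD[String] with one record per line of the source file.
// The first 24 characters are fixed-width; convert parses the variable-length tail.
val dataRDD = data.map { line =>
  val numberofsubject = line.substring(22, 24).toInt
  ScoreRecord(line.substring(0, 2).toInt, line.substring(2, 12).trim,
    line.substring(12, 22).trim, numberofsubject, convert(line, numberofsubject))
}
val df = dataRDD.toDF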
convert is a user-defined function that converts the series of subject-marks pairs into a List.
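The body of convert is not shown here, so the following is only a minimal sketch of how such a function might look. The layout is an assumption, not taken from the source file: the repeating section is assumed to start at offset 24 (right after the number-of-subjects field), and each entry is assumed to be a 10-character subject name followed by a 3-character marks field. The constants HeaderWidth, SubjectWidth and MarksWidth are hypothetical names introduced for illustration.

```scala
// Mirrors the case class defined in the code above.
case class Subjectwisemarks(subject: String, marks: Int)

// Hypothetical layout (an assumption, not from the source file):
// a 24-character fixed header, then repeated 10-char subject + 3-char marks entries.
val HeaderWidth = 24
val SubjectWidth = 10
val MarksWidth = 3

// Slice the variable-length tail of the line into numberOfSubjects entries.
def convert(line: String, numberOfSubjects: Int): List[Subjectwisemarks] = {
  val entryWidth = SubjectWidth + MarksWidth
  (0 until numberOfSubjects).toList.map { i =>
    val start = HeaderWidth + i * entryWidth
    val subject = line.substring(start, start + SubjectWidth).trim
    val marks = line.substring(start + SubjectWidth, start + entryWidth).trim.toInt
    Subjectwisemarks(subject, marks)
  }
}
```

Because the count field drives the loop, a record with two subjects and a record with ten subjects are parsed by the same function; only the line length differs. Adjust the three widths to match the actual schema of the file.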
Dataframe Schema
Registering as a Temp Table and Showing the Data
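The registration step might look like the sketch below. It assumes Spark 2.x or later, where createOrReplaceTempView is available (older releases used registerTempTable), and it builds a small DataFrame from made-up sample rows rather than from the real source file. The view name score matches the table name used by the query in the next section; the local session and the app name are assumptions for standalone experimentation.

```scala
import org.apache.spark.sql.SparkSession

case class Subjectwisemarks(subject: String, marks: Int)
case class ScoreRecord(id: Int, fname: String, lname: String, numberofsubject: Int,
  subjectwisemarks: Seq[Subjectwisemarks])

// A local session for experimentation; inside spark-shell, `spark` already exists.
val spark = SparkSession.builder().master("local[1]").appName("score-demo").getOrCreate()
import spark.implicits._

// Made-up sample row standing in for the parsed source file.
val df = Seq(
  ScoreRecord(1, "John", "Smith", 2,
    Seq(Subjectwisemarks("Maths", 90), Subjectwisemarks("Physics", 75)))
).toDF

// Register the DataFrame under the name the SQL query refers to, then display it.
df.createOrReplaceTempView("score")
spark.sql("SELECT * FROM score").show(false)
```

Passing false to show disables column truncation, which helps when the nested subjectwisemarks array is wide.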
2. Implementing an Analytical Query on the Temp Table
SELECT id, fname, lname,
       CAST(sum(subject_wise_marks.marks) / numberofsubject AS DOUBLE)
FROM score
LATERAL VIEW explode(subjectwisemarks) marks_table AS subject_wise_marks
GROUP BY id, fname, lname, numberofsubject;
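To make the arithmetic of the query concrete, here is a plain-Scala sketch of the same two steps, without Spark: the explode step produces one row per student/subject pair, and the grouping step sums the marks and divides by numberofsubject. The sample records are made up for illustration only.

```scala
case class Subjectwisemarks(subject: String, marks: Int)
case class ScoreRecord(id: Int, fname: String, lname: String, numberofsubject: Int,
  subjectwisemarks: Seq[Subjectwisemarks])

// Made-up sample records, for illustration only.
val records = Seq(
  ScoreRecord(1, "John", "Smith", 2,
    Seq(Subjectwisemarks("Maths", 90), Subjectwisemarks("Physics", 75))),
  ScoreRecord(2, "Jane", "Doe", 3,
    Seq(Subjectwisemarks("Maths", 80), Subjectwisemarks("Physics", 70), Subjectwisemarks("Biology", 60)))
)

// Step 1 (LATERAL VIEW explode): one row per record/subject pair.
val exploded = for {
  r <- records
  s <- r.subjectwisemarks
} yield (r.id, r.fname, r.lname, r.numberofsubject, s.marks)

// Step 2 (GROUP BY with sum / numberofsubject): average marks per student.
val averages = exploded
  .groupBy { case (id, fname, lname, n, _) => (id, fname, lname, n) }
  .map { case ((id, fname, lname, n), rows) =>
    (id, fname, lname, rows.map(_._5).sum.toDouble / n)
  }
  .toSeq
  .sortBy(_._1)
```

With these sample records, John's marks of 90 and 75 average to 82.5, and Jane's marks of 80, 70 and 60 average to 70.0.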
Result