[Week 11] Hadoop and Spark (4)
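🙂 Spark File Formats
- Unstructured: Text
- Semi-structured: JSON, XML, CSV
- Structured: Parquet, Avro, ORC, SequenceFile

🙂 Execution Plan
✔ Transformations and Actions
- Transformations
  - Narrow Dependencies: operations that run independently at the partition level
    - select, filter, map, etc.
  - Wide Dependencies: operations that require shuffling
    - groupBy, reduceByKey, partitionBy, repartition, etc.
    - coalesce also changes the number of partitions, but by default it merges existing partitions without a full shuffle, unlike repartition
- Actions
  - Read, Write, Show, Collect -> trigger a Job (this is when the code actually runs)
- Lazy Execution
  - 어떤..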
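
The difference between transformations and actions, and between narrow and wide dependencies, can be sketched with a small toy model in plain Python. This is not Spark's API: the class `LazyDataset` and its methods `group_by_key` and `explain` are hypothetical names invented for illustration, and the "shuffle" is just an in-memory regrouping by key. The sketch only shows the idea that transformations are recorded and nothing runs until an action such as `collect` is called.

```python
from collections import defaultdict


class LazyDataset:
    """Toy model (not Spark): partitioned data plus a recorded, not-yet-run plan."""

    def __init__(self, partitions, plan=()):
        self.partitions = partitions  # list of lists, one inner list per partition
        self.plan = tuple(plan)       # steps of the form (dependency, name, function)

    def _extend(self, dependency, name, fn):
        # Transformations only append a step to the plan; no data is touched here.
        return LazyDataset(self.partitions, self.plan + ((dependency, name, fn),))

    def map(self, fn):
        # Narrow: each output partition depends on exactly one input partition.
        return self._extend("narrow", "map",
                            lambda parts: [[fn(x) for x in p] for p in parts])

    def filter(self, pred):
        # Narrow: rows are kept or dropped within their own partition.
        return self._extend("narrow", "filter",
                            lambda parts: [[x for x in p if pred(x)] for p in parts])

    def group_by_key(self, num_partitions):
        # Wide: records with the same key must be moved into the same partition.
        def shuffle(parts):
            buckets = [defaultdict(list) for _ in range(num_partitions)]
            for part in parts:
                for key, value in part:
                    buckets[hash(key) % num_partitions][key].append(value)
            return [sorted(b.items()) for b in buckets]
        return self._extend("wide", "group_by_key", shuffle)

    def explain(self):
        # Describe the recorded plan without executing it.
        return [f"{name} ({dependency})" for dependency, name, _ in self.plan]

    def collect(self):
        # Action: only now is the recorded plan executed, step by step.
        parts = self.partitions
        for _, _, fn in self.plan:
            parts = fn(parts)
        return [x for part in parts for x in part]


calls = []  # records when the mapping function actually runs


def to_pair(n):
    calls.append(n)
    return (n % 2, n)  # key by parity: 0 for even, 1 for odd


data = LazyDataset([[1, 2, 3], [4, 5, 6]])
pipeline = data.filter(lambda n: n > 1).map(to_pair).group_by_key(num_partitions=2)

calls_before_action = len(calls)  # still 0: the plan has only been recorded
result = pipeline.collect()       # the action triggers execution of every step
```

Building `pipeline` does no work. `to_pair` is first called inside `collect()`, which is the toy equivalent of an action launching a Job. `pipeline.explain()` lists the two narrow steps followed by the wide one, the point where a real engine would have to shuffle data between partitions.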