Skip to content

sahilbhange/spark-slowly-changing-dimension

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

24 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

spark-slowly-changing-dimention

Spark implementation of slowly changing dimention

Implemented a slowly changing dimention type 2 using Scala Spark and Pyspark.

After every run, save the updated data to Hive table in ORC format with Snappy compression.

Hive table: Scala Application output Table - yelp_data_scala_sbhange.yelp_user_hist

Table Directory - /user/sahilbhange/scala/yelp_hist/

Pyspark Application output Table - yelp_data_spark_df_sbhange.yelp_user_hist

Table Directory - /user/sahilbhange/spark_df/

Releases

No releases published

Packages

No packages published