Google Dataproc Spark Scala Job for MNIST Handwritten Digit Recognition using Decision Trees.
Spark 2.2.0
Scala 2.11.12
Google Cloud Environment
spark-core_2.11
spark-mllib_2.11
spark-sql_2.11
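
A minimal build.sbt sketch matching the versions listed above. It assumes the project is built with sbt and that the "2.11" on the Spark artifacts refers to the Scala binary version, which `%%` appends automatically. The project name is hypothetical, and the "provided" scope assumes the Dataproc cluster image supplies a compatible Spark at runtime.

```scala
// build.sbt (sketch, not the original project's build file)
name := "mnist-decision-tree" // hypothetical project name
version := "0.1"
scalaVersion := "2.11.12"

// "provided" keeps Spark out of the jar; this assumes the cluster
// already ships a Spark version compatible with 2.2.0.
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core"  % "2.2.0" % "provided",
  "org.apache.spark" %% "spark-mllib" % "2.2.0" % "provided",
  "org.apache.spark" %% "spark-sql"   % "2.2.0" % "provided"
)
```

With a file like this, `sbt package` would typically produce the jar under target/scala-2.11/.
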
- Create a Scala project.
- Build the jar file.
- Create a cluster in Google Cloud Dataproc (I used 3 nodes).
- Upload the training data, test data, and jar file to a Google Cloud Storage bucket.
- Create a Spark job, providing the paths to the data files and the jar file.
- Submit the job.
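
For orientation, here is a rough sketch of what the main class in such a jar could look like, using the spark.ml DecisionTreeClassifier. This is an illustration under assumptions, not necessarily the code in this project: the object name `MnistDecisionTree` is hypothetical, the data files are assumed to be headerless CSV with the digit label in the first column and pixel values in the remaining columns, and the `maxDepth` and `maxBins` values are placeholders rather than tuned settings.

```scala
import org.apache.spark.ml.classification.DecisionTreeClassifier
import org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
import org.apache.spark.ml.feature.VectorAssembler
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.DoubleType

// Hypothetical object name; use whatever main class the jar actually exposes.
object MnistDecisionTree {

  // Assumed layout: headerless CSV, column 0 = digit label,
  // remaining columns = pixel values.
  def load(spark: SparkSession, path: String): DataFrame = {
    val raw = spark.read.option("inferSchema", "true").csv(path)
    // Pack all pixel columns into a single feature vector.
    val assembler = new VectorAssembler()
      .setInputCols(raw.columns.tail)
      .setOutputCol("features")
    assembler
      .transform(raw.withColumn("label", col(raw.columns.head).cast(DoubleType)))
      .select("label", "features")
  }

  def main(args: Array[String]): Unit = {
    require(args.length == 2,
      "usage: MnistDecisionTree <trainCsvPath> <testCsvPath>")
    val spark = SparkSession.builder().appName("MnistDecisionTree").getOrCreate()

    val train = load(spark, args(0))
    val test = load(spark, args(1))

    // maxDepth and maxBins are illustrative values, not tuned settings.
    val tree = new DecisionTreeClassifier()
      .setLabelCol("label")
      .setFeaturesCol("features")
      .setMaxDepth(10)
      .setMaxBins(32)

    val model = tree.fit(train)
    val predictions = model.transform(test)

    // Fraction of test rows whose predicted digit matches the label.
    val accuracy = new MulticlassClassificationEvaluator()
      .setLabelCol("label")
      .setPredictionCol("prediction")
      .setMetricName("accuracy")
      .evaluate(predictions)

    println(s"Test accuracy: $accuracy")
    spark.stop()
  }
}
```

Under these assumptions, the Dataproc job would name `MnistDecisionTree` as its main class and pass the two gs:// paths of the uploaded training and test files as arguments.
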