pyspark - How to configure the SparkContext (sc) in Spark?
from pyspark import SparkConf, SparkContext

myConf = (SparkConf()
          .setAppName("create_h5_pairwise")
          .set("spark.hadoop.validateOutputSpecs", False)
          .set("spark.akka.frameSize", 300)
          .set("spark.driver.maxResultSize", "8g")
          .set("spark.num.executors", 40)
          .set("spark.executor.memory", "20g")
          .set("spark.executor.cores", 3)
          .set("spark.driver.memory", "4g"))
sc = SparkContext(conf=myConf)
I have used this configuration of sc to read a small Hive table (about 500 rows). Now I want to change the configuration so it can read a table with more than 600 million rows. How should I configure the sc parameters? Can I use the same sc to read the huge table? When I count it, the job gets stuck in the following stage:
[Stage 11:> (0 + 2) / 4]
and there is no progress at all.
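For reference, this is roughly how the table is read and counted; a minimal sketch assuming Spark 1.x with a HiveContext, where some_db.some_huge_table is a placeholder for the actual table name:

from pyspark.sql import HiveContext

# Reuse the SparkContext created above; HiveContext provides access to Hive tables.
hc = HiveContext(sc)

# Placeholder table name -- substitute the real Hive table (~600 million rows).
df = hc.sql("SELECT * FROM some_db.some_huge_table")

# count() is the action that triggers the full scan; this is where the job appears to hang.
print(df.count())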