pyspark - How to configure the sc in Spark?


    from pyspark import SparkConf, SparkContext

    myconf = SparkConf().setAppName("create_h5_pairwise") \
        .set("spark.hadoop.validateOutputSpecs", False) \
        .set("spark.akka.frameSize", 300) \
        .set("spark.driver.maxResultSize", "8g") \
        .set("spark.num.executors", 40) \
        .set("spark.executor.memory", "20g") \
        .set("spark.executor.cores", 3) \
        .set("spark.driver.memory", "4g")
    # note: "spark.num.executors" is not a documented Spark property;
    # the standard one is "spark.executor.instances" (or --num-executors on YARN)

    sc = SparkContext(conf=myconf)
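For reference, with this sc a Hive table would typically be read through a HiveContext, along these lines (a minimal sketch for the Spark 1.x era this configuration implies; the table name my_db.small_table is a placeholder, not from the question):

    from pyspark.sql import HiveContext

    # HiveContext gives SQL access to Hive tables through the existing sc
    sqlContext = HiveContext(sc)

    # my_db.small_table is a hypothetical name standing in for the 500-row table
    small_df = sqlContext.sql("SELECT * FROM my_db.small_table")
    print(small_df.count())  # small enough that the default parallelism is fine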

I have used this sc configuration to read a small Hive table (500 rows). Now I want to read a table with more than 600 million rows. How should I configure the sc parameters? Can I use the same sc to read the huge table? When I count it, it gets stuck in the following stage:

[stage 11:>                                                         (0 + 2) / 4] 

and there is no progress at all.
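Note that the progress bar shows only 4 tasks in total ((0 + 2) / 4), which suggests the scan of the 600-million-row table produced very few partitions (likely a handful of large input splits), so parallelism may matter as much as the conf values. A starting point along these lines may help (a sketch under assumptions: the executor counts, the partition count of 400, and the table name my_db.big_table are illustrative values to adapt to your cluster, not tested settings):

    from pyspark import SparkConf, SparkContext
    from pyspark.sql import HiveContext

    bigconf = SparkConf().setAppName("count_big_table") \
        .set("spark.executor.instances", 40) \
        .set("spark.executor.memory", "20g") \
        .set("spark.executor.cores", 3) \
        .set("spark.driver.memory", "4g") \
        .set("spark.sql.shuffle.partitions", 400)  # more reduce-side tasks for big shuffles

    sc = SparkContext(conf=bigconf)
    sqlContext = HiveContext(sc)

    # my_db.big_table is a hypothetical name for the 600-million-row table
    big_df = sqlContext.sql("SELECT * FROM my_db.big_table")

    # repartition spreads all downstream work over many tasks; the initial
    # scan parallelism is still set by the table's underlying file splits
    big_df = big_df.repartition(400)
    print(big_df.count())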

