pyspark - How to configure the sc in Spark?


from pyspark import SparkConf, SparkContext

myconf = SparkConf().setAppName("create_h5_pairwise")\
    .set("spark.hadoop.validateOutputSpecs", False)\
    .set("spark.akka.frameSize", 300)\
    .set("spark.driver.maxResultSize", "8g")\
    .set("spark.num.executors", 40)\
    .set("spark.executor.memory", "20g")\
    .set("spark.executor.cores", 3)\
    .set("spark.driver.memory", "4g")

sc = SparkContext(conf=myconf)

I have used this configuration of sc to read a small Hive table (500 rows). Now I want to change the sc configuration to read a table with more than 600 million rows. How should I configure the sc parameters? Can I use the same sc to read the huge table? When I count it, the job gets stuck in the following stage:

[stage 11:>                                                         (0 + 2) / 4] 

and there is no progress at all.
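
For what it's worth, a SparkConf cannot be modified once its SparkContext has been created, so reading the larger table with heavier settings means stopping the existing context and building a new one. Below is a minimal sketch, assuming a YARN cluster and Spark 1.x (matching the spark.akka.frameSize setting above); the app name "read_huge_table", the table name "my_db.big_table", and the partition count of 400 are all placeholders:

from pyspark import SparkConf, SparkContext
from pyspark.sql import HiveContext

sc.stop()  # a live SparkContext cannot take new conf values

newconf = SparkConf().setAppName("read_huge_table")\
    .set("spark.executor.instances", 40)\
    .set("spark.executor.memory", "20g")\
    .set("spark.executor.cores", 3)\
    .set("spark.driver.memory", "8g")\
    .set("spark.driver.maxResultSize", "8g")

sc = SparkContext(conf=newconf)
sqlContext = HiveContext(sc)

# Spread the scan over many more tasks than the 4 shown in the stalled stage
df = sqlContext.sql("SELECT * FROM my_db.big_table").repartition(400)
print(df.count())

Note that the "(0 + 2) / 4" in the stalled stage means the whole scan was split into only 4 tasks, with 2 running, so most of the 40 executors sit idle while each task churns through hundreds of millions of rows. Raising the parallelism is usually the first thing to check, along with using spark.executor.instances rather than spark.num.executors, which is not a standard Spark property.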

