Apache Spark - Calling pipe() from a PairRDD and passing a Java object to it
I have a PairRDD, JavaPairRDD<String, Graph>, where Graph is a Java object, created using:

    PairFunction<Row, String, Graph> pairFunction = new PairFunction<Row, String, Graph>() {
        private static final long serialVersionUID = 1L;

        public Tuple2<String, Graph> call(Row row) throws Exception {
            Integer parameter = row.getAs("foo");
            String otherParameter = row.getAs("bar");
            Graph graph = new Graph(parameter, otherParameter);
            String key = someKeyGenerator();
            return new Tuple2<String, Graph>(key, graph);
        }
    };

Now I need to run an external program using myPairRDD.pipe("external.sh"), and I think Spark will pass the Graph object to external.sh via stdin.
I need to access graph.parameter and graph.otherParameter inside external.sh.
How can I manage this situation?
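Found it!
I just need to override the toString() method of the POJO (Graph) to expose the desired attributes, because pipe() writes each element to the script's stdin as one line of text, using the element's toString().
In my case:

    @Override
    public String toString() {
        return this.parameter + "," + this.otherParameter;
    }

Now the output is: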
(62,foo,bar)
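
The shape of that line comes from two toString() calls: the Tuple2 renders itself as "(key,value)", and the value part is whatever Graph.toString() returns. Below is a minimal sketch of that formatting in plain Java, without Spark on the classpath. The class name PipedLineSketch, the helpers pipedLine and parseLine, and the sample values are hypothetical and only for illustration; the Graph class is a stand-in for the POJO above, and parseLine assumes that no field contains a comma.

```java
public class PipedLineSketch {

    // Hypothetical stand-in for the Graph POJO from the post.
    static class Graph {
        private final Integer parameter;
        private final String otherParameter;

        Graph(Integer parameter, String otherParameter) {
            this.parameter = parameter;
            this.otherParameter = otherParameter;
        }

        @Override
        public String toString() {
            return this.parameter + "," + this.otherParameter;
        }
    }

    // Mimics how a Tuple2 renders itself: "(" + key + "," + value + ")".
    // The value part is produced by Graph.toString().
    static String pipedLine(String key, Graph graph) {
        return "(" + key + "," + graph + ")";
    }

    // Splits one piped line back into its fields.
    // Assumption: no field contains a comma.
    static String[] parseLine(String line) {
        String inner = line.substring(1, line.length() - 1); // drop "(" and ")"
        return inner.split(",", -1);
    }

    public static void main(String[] args) {
        String line = pipedLine("62", new Graph(7, "bar"));
        System.out.println(line);                               // (62,7,bar)
        System.out.println(String.join("|", parseLine(line)));  // 62|7|bar
    }
}
```

Inside external.sh the same parsing applies: strip the surrounding parentheses, then split on the comma. If a field can contain a comma, a different separator in toString() would probably be safer.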