Apache Spark - Calling pipe() from a PairRDD and passing a Java object to it
I have a pair RDD of type JavaPairRDD<String, Graph>, where Graph is a Java object created using:
PairFunction<Row, String, Graph> pairFunction = new PairFunction<Row, String, Graph>() {
    private static final long serialVersionUID = 1L;

    // Builds one (key, Graph) pair from each input Row.
    public Tuple2<String, Graph> call(Row row) throws Exception {
        Integer parameter = row.getAs("foo");
        String otherParameter = row.getAs("bar");
        Graph graph = new Graph(parameter, otherParameter);
        String key = someKeyGenerator();
        return new Tuple2<String, Graph>(key, graph);
    }
};
Now I need to run an external program using myPairRDD.pipe("external.sh"). I think Spark will pass the Graph object to external.sh via stdin.
I need to access graph.parameter and graph.otherParameter inside external.sh.
How can I manage this situation?
Found it! You just need to override the toString() method of the POJO (Graph) so that it exposes the attributes you need.
In my case:
@Override
public String toString() {
    return this.parameter + "," + this.otherParameter;
}
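pipe() writes each element's toString() as one line to the script's stdin, and a Tuple2 prints as (key,value), so now the output is: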
(62,foo,bar)
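
For anyone who wants to check the line format without starting a Spark job, here is a minimal plain-Java sketch. It is not Spark code: the Graph class is a stand-in for the POJO above, and toPipedLine and parsePipedLine are hypothetical helpers I made up that only mimic the "(key,value)" text form and show how the receiving side might split it back into fields. The key "k1" and the values 42 and "bar" are invented for illustration.

```java
public class PipeLineSketch {

    // Stand-in for the Graph POJO; field types follow the question's code.
    static final class Graph {
        private final Integer parameter;
        private final String otherParameter;

        Graph(Integer parameter, String otherParameter) {
            this.parameter = parameter;
            this.otherParameter = otherParameter;
        }

        // Same override as in the answer: expose the fields as comma-separated text.
        @Override
        public String toString() {
            return this.parameter + "," + this.otherParameter;
        }
    }

    // Hypothetical helper: mimics the "(key,value)" text form of a pair.
    static String toPipedLine(String key, Graph graph) {
        return "(" + key + "," + graph + ")";
    }

    // Hypothetical helper: drops the surrounding parentheses and splits on commas,
    // which is roughly what the external script has to do with each stdin line.
    static String[] parsePipedLine(String line) {
        String body = line.substring(1, line.length() - 1);
        return body.split(",", -1);
    }

    public static void main(String[] args) {
        String line = toPipedLine("k1", new Graph(42, "bar"));
        System.out.println(line);                                  // (k1,42,bar)
        System.out.println(String.join(" | ", parsePipedLine(line))); // k1 | 42 | bar
    }
}
```

Note that this simple format breaks if the key or any field can itself contain a comma; in that case a different separator or some escaping in toString() would be needed.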