python - How to use PySpark to extract a bytearray object from a Parquet table and save it to a file
I have a Parquet table that stores emails. The emails were written to the table using Java.
Each row corresponds to one email, and the columns correspond to the email's fields. One column stores the email attachments as an array of bytearrays, so each bytearray is one attachment.
I can read a bytearray using PySpark and write it to a binary file:
    newfile = open("filename.txt", "wb")
    newfile.write(newfilebytes)
where newfilebytes is the bytearray extracted using PySpark. The resulting binary file is not readable. Note that the bytearray can be any file type, e.g. PDF, image, etc.
Do you have any idea what went wrong? Thank you.
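One thing that may be worth ruling out: the bytes are written to "filename.txt", so a viewer may treat a PDF or an image as text even if the content is intact. The sketch below is only an illustration under assumptions, not a confirmed diagnosis. It assumes the attachments reach Python as bytearray values in an array column named "attachments" (a hypothetical name), picks a file extension from a few well-known leading byte signatures, and writes the bytes unchanged in binary mode. The helper names guess_extension, save_attachment and export_attachments are made up for this example.

```python
import os

# Leading "magic" bytes for a few common formats (a small illustrative subset).
MAGIC_EXTENSIONS = [
    (b"%PDF", ".pdf"),
    (b"\x89PNG\r\n\x1a\n", ".png"),
    (b"\xff\xd8\xff", ".jpg"),
    (b"GIF8", ".gif"),
    (b"PK\x03\x04", ".zip"),  # also the container used by docx/xlsx/pptx
]


def guess_extension(data):
    """Return an extension based on the leading bytes, or '.bin' if unknown."""
    head = bytes(data[:8])
    for magic, ext in MAGIC_EXTENSIONS:
        if head.startswith(magic):
            return ext
    return ".bin"


def save_attachment(data, out_dir, stem):
    """Write one attachment unchanged, in binary mode, and return its path."""
    path = os.path.join(out_dir, stem + guess_extension(data))
    with open(path, "wb") as f:
        f.write(bytes(data))
    return path


def export_attachments(df, out_dir, column="attachments"):
    """Driver-side loop over a PySpark DataFrame `df`.

    `column` is an assumed name for the array-of-binary column.
    """
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for row_idx, row in enumerate(df.select(column).toLocalIterator()):
        # A null array is treated as "no attachments".
        for att_idx, data in enumerate(row[column] or []):
            stem = f"email{row_idx}_att{att_idx}"
            paths.append(save_attachment(data, out_dir, stem))
    return paths
```

If a file written this way still does not open, printing the first few bytes (for example bytes(newfilebytes[:16])) can help narrow things down. A prefix of \xac\xed\x00\x05 is the header of a Java serialization stream, which would suggest the Java writer stored a serialized object rather than the raw file content; in that case the fix would belong on the writing side.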