python - Parsing XML into pandas dataframe with namespaces
I'm trying to parse an exported XML file into a pandas DataFrame so that I can work with and clean the text. The XML looks like this:
<ns:export xmlns:ns="http://www.canto.com/ns/export/1.0">
  <ns:layout tablename="assetrecords">
    <ns:fields>
      <ns:field uid="record name" type="0" valueinterpretation="0">
        <ns:name>record name</ns:name>
      </ns:field>
      <ns:field uid="keywords" type="0" valueinterpretation="0">
        <ns:name>keywords</ns:name>
      </ns:field>
      <ns:field uid="description" type="0" valueinterpretation="0">
        <ns:name>description</ns:name>
      </ns:field>
      <ns:field uid="year" type="2" valueinterpretation="0">
        <ns:name>year</ns:name>
      </ns:field>
      <ns:field uid="sted" type="0" valueinterpretation="0">
        <ns:name>sted</ns:name>
      </ns:field>
      <ns:field uid="number" type="0" valueinterpretation="0">
        <ns:name>number</ns:name>
      </ns:field>
      <ns:field uid="title" type="0" valueinterpretation="0">
        <ns:name>title</ns:name>
      </ns:field>
      <ns:field uid="topografisk nr" type="0" valueinterpretation="0">
        <ns:name>topografisk nr</ns:name>
      </ns:field>
    </ns:fields>
  </ns:layout>
  <ns:items>
    <ns:item catalogid="7" id="8087">
      <ns:fieldvalue uid="record name">f_6000.tif</ns:fieldvalue>
      <ns:fieldvalue uid="description">på bagsiden navne på personerne</ns:fieldvalue>
      <ns:fieldvalue uid="year">1901</ns:fieldvalue>
      <ns:fieldvalue uid="sted">københavn</ns:fieldvalue>
      <ns:fieldvalue uid="number">f_6000</ns:fieldvalue>
      <ns:fieldvalue uid="title">statens tegnelærerkursus</ns:fieldvalue>
    </ns:item>
    <ns:item catalogid="7" id="8086">
      <ns:fieldvalue uid="record name">f_6000-bagside.tif</ns:fieldvalue>
    </ns:item>
    <ns:item catalogid="7" id="1646">
      <ns:fieldvalue uid="record name">f_5303g.tif</ns:fieldvalue>
    </ns:item>
    <ns:item catalogid="7" id="4074">
      <ns:fieldvalue uid="record name">f_5288.tif</ns:fieldvalue>
      <ns:fieldvalue uid="description">sct. jacobi skole, varde - klassefoto</ns:fieldvalue>
      <ns:fieldvalue uid="year">1945</ns:fieldvalue>
      <ns:fieldvalue uid="number">5288</ns:fieldvalue>
      <ns:fieldvalue uid="title">sct. jacobi skole, varde</ns:fieldvalue>
    </ns:item>
    <ns:item catalogid="7" id="3267">
      <ns:fieldvalue uid="record name">f_5282.tif</ns:fieldvalue>
      <ns:fieldvalue uid="description">pr. charlottesgades skolen, kampenskole(oslo) barnekor og musikkorps på besøg</ns:fieldvalue>
      <ns:fieldvalue uid="year">1947</ns:fieldvalue>
      <ns:fieldvalue uid="number">5282</ns:fieldvalue>
      <ns:fieldvalue uid="title">pr. charlottesgades skolen, kampenskole(oslo)</ns:fieldvalue>
    </ns:item>
My goal is to create a pandas DataFrame with one row for each item id and one column for each fieldvalue uid.
Any pointers are appreciated.
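One possible way to get there, offered as a minimal sketch rather than a definitive answer: parse the file with the standard library's xml.etree.ElementTree, pass a prefix-to-URI mapping so the ns: prefix resolves, and collect one dict per item. The names NS, SAMPLE, parse_items and to_dataframe are made up for this illustration, and SAMPLE is a trimmed copy of the export with closing tags added so that it parses on its own. Items that lack a given fieldvalue simply have no key for it, which pandas turns into NaN.

```python
import xml.etree.ElementTree as ET

# Prefix -> URI mapping, copied from the xmlns:ns declaration in the export.
NS = {"ns": "http://www.canto.com/ns/export/1.0"}

# Trimmed sample for illustration; the closing tags are added here
# (assumption) so the snippet is well-formed on its own.
SAMPLE = """<ns:export xmlns:ns="http://www.canto.com/ns/export/1.0">
  <ns:items>
    <ns:item catalogid="7" id="8087">
      <ns:fieldvalue uid="record name">f_6000.tif</ns:fieldvalue>
      <ns:fieldvalue uid="year">1901</ns:fieldvalue>
    </ns:item>
    <ns:item catalogid="7" id="8086">
      <ns:fieldvalue uid="record name">f_6000-bagside.tif</ns:fieldvalue>
    </ns:item>
  </ns:items>
</ns:export>"""


def parse_items(xml_text):
    """Return one dict per ns:item, keyed by 'id' plus each fieldvalue uid."""
    root = ET.fromstring(xml_text)
    records = []
    for item in root.findall("ns:items/ns:item", NS):
        # Attributes such as id and uid are not namespaced, so plain .get() works.
        row = {"id": item.get("id")}
        for fv in item.findall("ns:fieldvalue", NS):
            row[fv.get("uid")] = fv.text
        records.append(row)
    return records


def to_dataframe(records):
    """Build a DataFrame indexed by item id; missing fields become NaN."""
    import pandas as pd  # imported lazily so the parsing step needs only the stdlib

    return pd.DataFrame(records).set_index("id")


records = parse_items(SAMPLE)
```

Calling to_dataframe(records) should then give one row per id and one column per uid. For a file on disk, ET.parse(path).getroot() can take the place of ET.fromstring. All values arrive as strings, so a column such as year would still need converting, for example with pd.to_numeric.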