python - How to set a specific column to an int type with pandas


I have a script that writes the CSV files from a folder into an Excel workbook:

    from pandas.io.excel import ExcelWriter
    import pandas
    import os

    path = 'data/'
    # Sort the files by the number in front of the extension (1.csv, 2.csv, ...)
    ordered_list = sorted(os.listdir(path), key=lambda x: int(x.split(".")[0]))

    with ExcelWriter('my_excel.xlsx') as ew:
        for csv_file in ordered_list:
            # One sheet per CSV file, named after the file without ".csv"
            pandas.read_csv(path + csv_file).to_excel(ew, index=False, sheet_name=csv_file[:-4], encoding='utf-8')

Now the problem is that some columns (let's say G:H) are in string format (e.g. '400 or '10), with a ' before the number. I think they come out as strings because the CSV conversion turns them into strings, but I need them as int. How can I make G:H int? I use Python 3. Thank you!

PS: this is a CSV sample:

    anpis,,,,,,,
    agentia judeteana pentru plati si inspectie sociala timis,,,,,,,
    ,,,,,,,
    macheta comparativa creditori - numai pentru beneficiile caror evidenta se tine si in contabilitate si in aplicatia safir,,,,,,,
    situatie analitica - nominal la 30.06.2017,,,,,,,
    1. alocatia de stat pentru copii,,,,,,,
    nr. benef,nume prenume,cnp,data constituirii,suma contabilitate,suma safir,differenta suma,explicatii daca exista diferente
    1,2,3,4,5,6,7=5-6,8
    1,cazacu mihai,133121140,aug 2016,84,84
    2,nicoara petru,143152638,"aug 2014, sept 2014",126,84
    3,cernea nicolae dan,143354723,dec 2015,84,84
    4,ludwig petru,144091376,nov 2014,42,42
    5,popa remus,1440915363,iun 2015,84,84
    6,bogdan marcel,144154726,"feb 2015, apr 2015, sept 2015, oct 2015, feb 2016",336,336
    7,hendre augustin,145054704,feb 2015,42,42
    8,cojoc vasile,147050307,"sept 2014, oct 2014",84,84
    9,radulescu victor,147352628,"sept 2014, oct 2014, nov 2014, dec 2014",168,168
    10,radau dumitru,148054764,"feb 2017, mar 2017",168,168
    11,covaciu petru,148054802,iun 2016,84,84
    12,bot ioan,14808634,"aug 2014, sept 2014, oct 2014, nov 2014",168,168

^^ And this part is the head:

    anpis,,,,,,,
    agentia judeteana pentru plati si inspectie sociala timis,,,,,,,
    ,,,,,,,
    macheta comparativa creditori - numai pentru beneficiile caror evidenta se tine si in contabilitate si in aplicatia safir,,,,,,,
    situatie analitica - nominal la 30.06.2017,,,,,,,
    1. alocatia de stat pentru copii,,,,,,,
    nr. benef,nume prenume,cnp,data constituirii,suma contabilitate,suma safir,differenta suma,explicatii daca exista diferente
    1,2,3,4,5,6,7=5-6,8

You can read each file twice: first the header with the nrows parameter, then the body with skiprows.

Then you need to write twice too.

The solution is a bit complicated because pandas parses this data wrongly: it does not support a MultiIndex header with 8 levels. If you read the file with no header at all, the header lines get joined with the body and the output is a mess.

    # On pandas >= 2.0, drop encoding='utf-8': to_excel no longer accepts it.
    with ExcelWriter('my_excel.xlsx') as ew:
        for csv_file in ordered_list:
            # First 8 lines are the heading block, the rest is the body
            df1 = pandas.read_csv(path + csv_file, nrows=8, header=None)
            df2 = pandas.read_csv(path + csv_file, skiprows=8, header=None)
            df1.to_excel(ew, index=False, sheet_name=csv_file[:-4], encoding='utf-8', header=False)
            # Write the body directly below the heading block on the same sheet
            row = len(df1.index)
            df2.to_excel(ew, index=False, sheet_name=csv_file[:-4], encoding='utf-8', startrow=row, startcol=0, header=False)
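To see what the two reads return before anything is written to Excel, here is a minimal sketch on an in-memory CSV. The toy data and the names raw and header_rows are made up for illustration; the real files have 8 heading lines.

```python
import io
import pandas as pd

# Hypothetical toy file: 2 heading lines followed by a 2-row body.
raw = (
    "report title,,\n"
    "nr,name,amount\n"
    "1,alice,84\n"
    "2,bob,42\n"
)

header_rows = 2  # assumed number of heading lines in this toy file

# First pass: read only the heading lines, without inferring column names.
head = pd.read_csv(io.StringIO(raw), nrows=header_rows, header=None)

# Second pass: skip the heading lines and read only the body.
body = pd.read_csv(io.StringIO(raw), skiprows=header_rows, header=None)

print(head)
print(body.dtypes)
```

Because the text lines of the heading are no longer mixed into the body, the numeric body columns can be parsed as integers.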

Use apply to remove the ' with strip, then cast to int with astype:

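    cols = ['G', 'H']  # replace with your actual column names

    # On pandas >= 2.0, drop encoding='utf-8': to_excel no longer accepts it.
    with ExcelWriter('my_excel.xlsx') as ew:
        for csv_file in ordered_list:
            df = pandas.read_csv(path + csv_file)
            # Strip the leading apostrophe from each value, then convert to int
            df[cols] = df[cols].astype(str).apply(lambda x: x.str.strip("'")).astype(int)
            print(df.head())
            df.to_excel(ew, index=False, sheet_name=csv_file[:-4], encoding='utf-8')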
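The strip-and-cast step can be tried on its own with a small frame. This is only a sketch: the column names G and H and the sample values are assumptions standing in for the real data.

```python
import pandas as pd

# Hypothetical frame with apostrophe-prefixed numbers stored as strings.
df = pd.DataFrame({
    "name": ["a", "b"],
    "G": ["'400", "'10"],
    "H": ["'84", "'42"],
})

cols = ["G", "H"]  # assumed names of the columns to convert

# Cast to str, strip the apostrophes column by column, then cast to int.
df[cols] = df[cols].astype(str).apply(lambda s: s.str.strip("'")).astype(int)

print(df.dtypes)
```

Note that astype(int) raises if any value in those columns is not a plain integer after stripping, for example a missing cell.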

Another solution is to use the converters parameter of read_csv with a custom function:

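    cols = ['G', 'H']  # replace with your actual column names

    def converter(x):
        # Remove the apostrophe and convert the remaining text to int
        return int(x.strip("'"))

    # Define a converter for each column
    converters = {x: converter for x in cols}

    # On pandas >= 2.0, drop encoding='utf-8': to_excel no longer accepts it.
    with ExcelWriter('my_excel.xlsx') as ew:
        for csv_file in ordered_list:
            df = pandas.read_csv(path + csv_file, converters=converters)
            print(df.head())
            df.to_excel(ew, index=False, sheet_name=csv_file[:-4], encoding='utf-8')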
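A quick way to check the converters approach without touching the files is to parse a small in-memory CSV. The data, the column names G and H, and the helper name to_int are all made up for this sketch.

```python
import io
import pandas as pd

# Hypothetical CSV text with apostrophe-prefixed numbers in columns G and H.
raw = "name,G,H\na,'400,'84\nb,'10,'42\n"

def to_int(value):
    # read_csv passes each cell to the converter as a string.
    return int(value.strip("'"))

# Apply the converter to the assumed columns while parsing.
df = pd.read_csv(io.StringIO(raw), converters={"G": to_int, "H": to_int})

print(df.dtypes)
```

As far as I know, an empty cell reaches the converter as an empty string, so this simple function would raise ValueError on missing values; handle that case in the function if your data has gaps.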
