python - Add an extra column to a pandas DataFrame that depends on another column
I made a pandas DataFrame from the iris dataset and want to add a column called specieid. This means iris-setosa gets an id of 0, iris-versicolor 1, and iris-virginica 2.
I tried this code:
    def create_specie_id():
        if iris["species"] == "iris-setosa":
            id = 0
        elif iris["species"] == "iris-versicolor":
            id = 1
        elif iris["species"] == "iris-virginica":
            id = 2
        return id

    iris = iris.assign(specieid = lambda x: create_specie_id())

    print (iris)
but received the following error:
    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    <ipython-input-58-2abd69ffef4b> in <module>()
         10     return id
         11
    ---> 12 iris = iris.assign(specieid = lambda x: create_specie_id())
         13
         14 print (iris)

    c:\users\masc\appdata\local\continuum\anaconda3\lib\site-packages\pandas\core\frame.py in assign(self, **kwargs)
       2495         results = {}
       2496         for k, v in kwargs.items():
    -> 2497             results[k] = com._apply_if_callable(v, data)
       2498
       2499         # ... and then assign

    c:\users\masc\appdata\local\continuum\anaconda3\lib\site-packages\pandas\core\common.py in _apply_if_callable(maybe_callable, obj, **kwargs)
        439     """
        440     if callable(maybe_callable):
    --> 441         return maybe_callable(obj, **kwargs)
        442     return maybe_callable
        443

    <ipython-input-58-2abd69ffef4b> in <lambda>(x)
         10     return id
         11
    ---> 12 iris = iris.assign(specieid = lambda x: create_specie_id())
         13
         14 print (iris)

    <ipython-input-58-2abd69ffef4b> in create_specie_id()
          2
          3 def create_specie_id():
    ----> 4     if iris["species"] == "iris-setosa":
          5         id = 0
          6     elif iris["species"] == "iris-versicolor":

    c:\users\masc\appdata\local\continuum\anaconda3\lib\site-packages\pandas\core\generic.py in __nonzero__(self)
        953         raise ValueError("The truth value of a {0} is ambiguous. "
        954                          "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
    --> 955                          .format(self.__class__.__name__))
        956
        957     __bool__ = __nonzero__

    ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
How can I create a column that contains the specie IDs?
You can use numpy.select:
    import numpy as np
    import pandas as pd

    iris = pd.DataFrame({'species': ['iris-setosa', 'iris-versicolor',
                                     'iris-virginica', 'another']})

    # One boolean mask per species
    m1 = iris["species"] == "iris-setosa"
    m2 = iris["species"] == "iris-versicolor"
    m3 = iris["species"] == "iris-virginica"

    # Rows matching no mask get the default value
    iris['id'] = np.select([m1, m2, m3], [0, 1, 2], default=-1)
    print (iris)

               species  id
    0      iris-setosa   0
    1  iris-versicolor   1
    2   iris-virginica   2
    3          another  -1
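The masks above work because comparing a Series with a string is element-wise and gives one boolean per row. That is also why the `if iris["species"] == ...` test in the question raised the ValueError: `if` needs a single True/False. A minimal sketch of this, assuming a small hand-made frame in place of the real iris data (the names `mask` and `error_message` are only for illustration):

```python
import pandas as pd

# Hypothetical three-row frame standing in for the iris data.
iris = pd.DataFrame(
    {"species": ["iris-setosa", "iris-versicolor", "iris-virginica"]}
)

# The comparison is element-wise: one True/False per row, not a single bool.
mask = iris["species"] == "iris-setosa"
print(mask.tolist())  # [True, False, False]

# Using the whole Series as an `if` condition is what raises the error.
try:
    if mask:
        pass
except ValueError as exc:
    error_message = str(exc)
    print(error_message)
```

So the function in the question fails before it ever returns an id; building masks and passing them to numpy.select avoids the `if` altogether.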
Another solution is to use map with a dict. Values that are not matched become NaN, so fillna and astype are added:
    d = {"iris-setosa": 0, "iris-versicolor": 1, "iris-virginica": 2}

    iris['id'] = iris['species'].map(d).fillna(-1).astype(int)
    print (iris)

               species  id
    0      iris-setosa   0
    1  iris-versicolor   1
    2   iris-virginica   2
    3          another  -1
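As a quick check that both approaches give the same ids, here is a self-contained sketch on the same four-row sample. The names `mapping`, `id_select` and `id_map` are made up for this comparison, and building the masks from the dict is just one convenient way to keep the two versions in sync:

```python
import numpy as np
import pandas as pd

iris = pd.DataFrame(
    {"species": ["iris-setosa", "iris-versicolor", "iris-virginica", "another"]}
)
mapping = {"iris-setosa": 0, "iris-versicolor": 1, "iris-virginica": 2}

# numpy.select: one boolean mask per known species, -1 for everything else.
conditions = [iris["species"] == name for name in mapping]
choices = list(mapping.values())
iris["id_select"] = np.select(conditions, choices, default=-1)

# map: unmatched species become NaN, which is filled with -1 and cast to int.
iris["id_map"] = iris["species"].map(mapping).fillna(-1).astype(int)

print(iris["id_select"].tolist())  # [0, 1, 2, -1]
print(iris["id_map"].tolist())     # [0, 1, 2, -1]
```

With only a plain label-to-number lookup, map is probably the shorter option; numpy.select is more useful when the conditions are not simple equality tests.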