python - Add an extra column to a pandas DataFrame that depends on another column


I made a pandas DataFrame from the iris dataset and want to add a column called specieid. That means iris-setosa gets an id of 0, iris-versicolor 1, and iris-virginica 2.

I tried this code:

    def create_specie_id():
        if iris["species"] == "iris-setosa":
            id = 0
        elif iris["species"] == "iris-versicolor":
            id = 1
        elif iris["species"] == "iris-virginica":
            id = 2
        return id

    iris = iris.assign(specieid = lambda x: create_specie_id())

    print (iris)

But I received the following error:

    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    <ipython-input-58-2abd69ffef4b> in <module>()
         10     return id
         11
    ---> 12 iris = iris.assign(specieid = lambda x: create_specie_id())
         13
         14 print (iris)

    c:\users\masc\appdata\local\continuum\anaconda3\lib\site-packages\pandas\core\frame.py in assign(self, **kwargs)
       2495         results = {}
       2496         for k, v in kwargs.items():
    -> 2497             results[k] = com._apply_if_callable(v, data)
       2498
       2499         # ... and then assign

    c:\users\masc\appdata\local\continuum\anaconda3\lib\site-packages\pandas\core\common.py in _apply_if_callable(maybe_callable, obj, **kwargs)
        439     """
        440     if callable(maybe_callable):
    --> 441         return maybe_callable(obj, **kwargs)
        442     return maybe_callable
        443

    <ipython-input-58-2abd69ffef4b> in <lambda>(x)
         10     return id
         11
    ---> 12 iris = iris.assign(specieid = lambda x: create_specie_id())
         13
         14 print (iris)

    <ipython-input-58-2abd69ffef4b> in create_specie_id()
          2
          3 def create_specie_id():
    ----> 4     if iris["species"] == "iris-setosa":
          5         id = 0
          6     elif iris["species"] == "iris-versicolor":

    c:\users\masc\appdata\local\continuum\anaconda3\lib\site-packages\pandas\core\generic.py in __nonzero__(self)
        953         raise ValueError("The truth value of a {0} is ambiguous. "
        954                          "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
    --> 955                          .format(self.__class__.__name__))
        956
        957     __bool__ = __nonzero__

    ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

How can I create a column that contains the specieid values?

You can use numpy.select:

    import numpy as np
    import pandas as pd

    iris = pd.DataFrame({'species': ['iris-setosa', 'iris-versicolor', 'iris-virginica', 'another']})

    # one boolean mask per species
    m1 = iris["species"] == "iris-setosa"
    m2 = iris["species"] == "iris-versicolor"
    m3 = iris["species"] == "iris-virginica"

    # take the id of the first mask that matches, or -1 if none does
    iris['id'] = np.select([m1, m2, m3], [0, 1, 2], default=-1)

    print (iris)

               species  id
    0      iris-setosa   0
    1  iris-versicolor   1
    2   iris-virginica   2
    3          another  -1

Another solution is to use map with a dict. Values that are not matched become NaN, so fillna and astype are added:

    d = {"iris-setosa": 0, "iris-versicolor": 1, "iris-virginica": 2}

    # unmatched species map to NaN, so replace them with -1 and cast back to int
    iris['id'] = iris['species'].map(d).fillna(-1).astype(int)

    print (iris)

               species  id
    0      iris-setosa   0
    1  iris-versicolor   1
    2   iris-virginica   2
    3          another  -1
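For completeness, here is a rough sketch of why the code in the question fails and how the function-based approach could be made to work. It is not part of the answer above: the sample frame, the species parameter, the -1 fallback for unknown species and the use of Series.apply are all assumptions made for illustration. Comparing a whole column with a string gives a boolean Series, and an if statement cannot reduce that to a single True or False, hence the ValueError. Passing one value at a time to the function avoids the problem.

```python
import pandas as pd

# Hypothetical sample data, mirroring the frame used in the answer.
iris = pd.DataFrame(
    {"species": ["iris-setosa", "iris-versicolor", "iris-virginica", "another"]}
)

# Comparing a whole column with a string yields a boolean Series, not one bool.
mask = iris["species"] == "iris-setosa"

# Using that Series in an `if` is what raises the ValueError from the question.
try:
    if mask:
        pass
except ValueError as exc:
    error_message = str(exc)


def create_specie_id(species):
    """Return the id for a single species string (-1 if unknown, an assumption)."""
    if species == "iris-setosa":
        return 0
    elif species == "iris-versicolor":
        return 1
    elif species == "iris-virginica":
        return 2
    return -1


# Series.apply calls the function once per value, so each `if` sees one string.
iris = iris.assign(specieid=lambda df: df["species"].apply(create_specie_id))
print(iris)
```

This calls a Python function once per row, so on large frames the numpy.select and map solutions are likely to be faster. It may still be handy when the rule for each row is too involved for a plain dict lookup.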
