Selecting rows in one dataframe based on data in another dataframe in Python Pandas
I have two dataframes created with pandas. The first one has co-occurrences of items happening in certain years:
       date item1 item2
    0  1975     a     b
    1  1976     b     c
    2  1977     b     a
    3  1977     a     b
    4  1978     c     d
    5  1979     e     f
    6  1980     a     f
The second one has the birthdates of the items:
    birthdate item
         1975    a
         1975    b
         1976    c
         1978    d
         1979    f
         1979    e
Now, I want to set an age variable, for example:
age = 2
and populate a third dataframe (or, alternatively, transform the first one) with a version of the first one that keeps only the rows of co-occurrences that happened when item1 was below the defined 'age'.
You can merge the two dataframes, similar to a join in SQL:
    import pandas

    # Co-occurrences: year, first item, second item
    data = [
        [1975, 'a', 'b'],
        [1976, 'b', 'c'],
        [1977, 'b', 'a'],
        [1977, 'a', 'b'],
        [1978, 'c', 'd'],
        [1979, 'e', 'f'],
        [1980, 'a', 'f'],
    ]

    # Birth year of each item
    birthdate = [
        [1975, 'a'],
        [1975, 'b'],
        [1976, 'c'],
        [1978, 'd'],
        [1979, 'f'],
        [1979, 'e'],
    ]

    df1 = pandas.DataFrame(data, columns=['date', 'item1', 'item2'])
    df2 = pandas.DataFrame(birthdate, columns=['birthdate', 'item'])

    # print(df1)
    # print(df2)

    # Attach the birthdate of item1 to every co-occurrence row
    newdf = pandas.merge(left=df1, right=df2, left_on='item1', right_on='item')

    print(newdf)

    # Example filter: keep rows where item1 was born after 1975
    print(newdf[newdf['birthdate'] > 1975])
Output (row order may differ depending on the pandas version):

       date item1 item2  birthdate item
    0  1975     a     b       1975    a
    1  1977     a     b       1975    a
    2  1980     a     f       1975    a
    3  1976     b     c       1975    b
    4  1977     b     a       1975    b
    5  1978     c     d       1976    c
    6  1979     e     f       1979    e

       date item1 item2  birthdate item
    5  1978     c     d       1976    c
    6  1979     e     f       1979    e
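The filter in the answer compares the birthdate to a fixed year. To use the `age` variable from the question instead, one option is to subtract the birthdate from the co-occurrence year after merging. The following is only a sketch: it assumes "below the defined age" means strictly less than `age`, and the `item1_age` column and the `young` name are made up for illustration.

```python
import pandas as pd

# Same sample data as in the question and answer.
df1 = pd.DataFrame(
    [[1975, 'a', 'b'], [1976, 'b', 'c'], [1977, 'b', 'a'], [1977, 'a', 'b'],
     [1978, 'c', 'd'], [1979, 'e', 'f'], [1980, 'a', 'f']],
    columns=['date', 'item1', 'item2'],
)
df2 = pd.DataFrame(
    [[1975, 'a'], [1975, 'b'], [1976, 'c'], [1978, 'd'], [1979, 'f'], [1979, 'e']],
    columns=['birthdate', 'item'],
)

age = 2  # assumption: keep rows where item1 is strictly younger than this

# Attach item1's birthdate to every co-occurrence row.
merged = df1.merge(df2, left_on='item1', right_on='item')

# Hypothetical helper column: age of item1 in the year of the co-occurrence.
merged['item1_age'] = merged['date'] - merged['birthdate']

# Keep only the "young" rows and the original three columns.
young = merged.loc[merged['item1_age'] < age, ['date', 'item1', 'item2']]

# Sort explicitly so the result does not depend on the merge's row order.
young = young.sort_values(['date', 'item1']).reset_index(drop=True)
print(young)
```

With the sample data, three rows remain: (1975, a, b), (1976, b, c) and (1979, e, f). If an item that is exactly `age` years old should also count, change `<` to `<=`.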