Python Pandas: Cleaning columns with multiple dates
I have a dataframe column that looks like this:

event date
1/3/2013
11/01/2011-10/01/2012
11/01/2011-10/01/2012
11/01/2011-10/01/2012
10/01/2012 - 02/18/2013
2/12/2013
01/18/2013-01/23/2013
11/01/2012-01/19/2013

Is there a way to separate the dates into two columns, like df['start date'] and df['end date'], where rows with a single date put it in the start date by default?
You can use Series.str.extract() with named groups here to do it in one fell swoop:
In [22]: df
Out[22]:
                event_date
0                 1/3/2013
1    11/01/2011-10/01/2012
2    11/01/2011-10/01/2012
3    11/01/2011-10/01/2012
4  10/01/2012 - 02/18/2013
5                2/12/2013
6    01/18/2013-01/23/2013
7    11/01/2012-01/19/2013

In [23]: df.event_date.str.extract(r'(?P<all>(?P<start>\d{1,2}/\d{1,2}/\d{4})\s*-?\s*(?P<end>\d{1,2}/\d{1,2}/\d{4})?)')
Out[23]:
                       all       start         end
0                 1/3/2013    1/3/2013         NaN
1    11/01/2011-10/01/2012  11/01/2011  10/01/2012
2    11/01/2011-10/01/2012  11/01/2011  10/01/2012
3    11/01/2011-10/01/2012  11/01/2011  10/01/2012
4  10/01/2012 - 02/18/2013  10/01/2012  02/18/2013
5                2/12/2013   2/12/2013         NaN
6    01/18/2013-01/23/2013  01/18/2013  01/23/2013
7    11/01/2012-01/19/2013  11/01/2012  01/19/2013
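To get from the extracted strings to the two columns the question asks for, something like the following might work. It is only a sketch: it rebuilds the sample frame from the question, drops the outer `all` group since only `start` and `end` are needed, and assumes every date is written month/day/year (the `format` string is my assumption, suggested by values such as 02/18/2013).

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "event_date": [
        "1/3/2013",
        "11/01/2011-10/01/2012",
        "11/01/2011-10/01/2012",
        "11/01/2011-10/01/2012",
        "10/01/2012 - 02/18/2013",
        "2/12/2013",
        "01/18/2013-01/23/2013",
        "11/01/2012-01/19/2013",
    ]
})

# Named groups become the column names of the extracted frame.
# The trailing "?" makes the end date optional, so single dates
# land in "start" and leave "end" missing.
pattern = r"(?P<start>\d{1,2}/\d{1,2}/\d{4})\s*-?\s*(?P<end>\d{1,2}/\d{1,2}/\d{4})?"
parts = df["event_date"].str.extract(pattern)

# Assumption: dates are month/day/year. Missing end dates become NaT.
df["start date"] = pd.to_datetime(parts["start"], format="%m/%d/%Y")
df["end date"] = pd.to_datetime(parts["end"], format="%m/%d/%Y")
```

Rows with a single date end up with that date in `start date` and `NaT` in `end date`, which matches the "start date by default" behaviour requested.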