Python Pandas: Cleaning columns with multiple dates
I have a dataframe column that looks like this:
event date
1/3/2013
11/01/2011-10/01/2012
11/01/2011-10/01/2012
11/01/2011-10/01/2012
10/01/2012 - 02/18/2013
2/12/2013
01/18/2013-01/23/2013
11/01/2012-01/19/2013
Is there a way to separate the dates into 2 columns like
df['start date'] df['end date']
where rows with a single date use it as the start date by default?
You can use Series.str.extract() here to do it in one fell swoop:
In [22]: df
Out[22]:
                event_date
0                 1/3/2013
1    11/01/2011-10/01/2012
2    11/01/2011-10/01/2012
3    11/01/2011-10/01/2012
4  10/01/2012 - 02/18/2013
5                2/12/2013
6    01/18/2013-01/23/2013
7    11/01/2012-01/19/2013

In [23]: df.event_date.str.extract(r'(?P<all>(?P<start>\d{1,2}/\d{1,2}/\d{4})\s*-?\s*(?P<end>\d{1,2}/\d{1,2}/\d{4})?)')
Out[23]:
                       all       start         end
0                 1/3/2013    1/3/2013         NaN
1    11/01/2011-10/01/2012  11/01/2011  10/01/2012
2    11/01/2011-10/01/2012  11/01/2011  10/01/2012
3    11/01/2011-10/01/2012  11/01/2011  10/01/2012
4  10/01/2012 - 02/18/2013  10/01/2012  02/18/2013
5                2/12/2013    2/12/2013         NaN
6    01/18/2013-01/23/2013  01/18/2013  01/23/2013
7    11/01/2012-01/19/2013  11/01/2012  01/19/2013
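As a follow-up, here is a minimal self-contained sketch of how the extracted pieces could be written back as the two columns the question asks for. It assumes the dates are month/day/year (which values like 02/18/2013 suggest), drops the outer "all" group since only start and end are needed, and uses a shortened sample of the data. The `parts` variable name is just for illustration.

```python
import pandas as pd

# Shortened sample of the column from the question.
df = pd.DataFrame({"event_date": [
    "1/3/2013",
    "11/01/2011-10/01/2012",
    "10/01/2012 - 02/18/2013",
    "2/12/2013",
]})

# Named groups become column names; the second date is optional.
pattern = (
    r"(?P<start>\d{1,2}/\d{1,2}/\d{4})"
    r"\s*-?\s*"
    r"(?P<end>\d{1,2}/\d{1,2}/\d{4})?"
)
parts = df["event_date"].str.extract(pattern, expand=True)

# Assumption: month/day/year ordering. Rows without an end date become NaT.
df["start date"] = pd.to_datetime(parts["start"], format="%m/%d/%Y")
df["end date"] = pd.to_datetime(parts["end"], format="%m/%d/%Y")

print(df)
```

Rows with a single date keep it in "start date" and get a missing value in "end date". If you would rather have the end date fall back to the start date for those rows, something like df["end date"].fillna(df["start date"]) should work.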