- Hands-On Exploratory Data Analysis with Python
- Suresh Kumar Mukhiya Usman Ahmed
- 2025-04-04 13:14:46
Refactoring timezones
Next, we want to convert the timestamps into our own timezone:
1. We can refactor timezones by using the method given here:
import pytz

def refactor_timezone(x):
    # Convert a timezone-aware timestamp to US/Eastern time
    est = pytz.timezone('US/Eastern')
    return x.astimezone(est)
Note that in the preceding code, I converted the timezone into the US/Eastern timezone. You can choose whatever timezone you like.
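As a minimal sketch of what this conversion does, the following example builds a small, hypothetical series of UTC timestamps (the values are made up for illustration) and converts it with the vectorized `dt.tz_convert` accessor in pandas. Assuming the column holds timezone-aware timestamps, this should give the same result as applying `refactor_timezone` row by row:

```python
import pandas as pd

# Hypothetical sample data: two timezone-aware UTC timestamps,
# one in winter (standard time) and one in summer (daylight saving time).
sample = pd.Series(pd.to_datetime(
    ['2019-01-15 12:00:00', '2019-07-15 12:00:00'], utc=True))

# Convert the whole column at once instead of calling a function per row.
eastern = sample.dt.tz_convert('US/Eastern')

print(eastern.dt.hour.tolist())  # [7, 8]
```

The two results differ by an hour because US/Eastern is five hours behind UTC in January and four hours behind in July.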
2. Now that our function is created, let's call it:
dfs['date'] = dfs['date'].apply(lambda x: refactor_timezone(x))
3. Next, we want to convert the day of the week variable into the name of the day, as in, Saturday, Sunday, and so on. We can do that as shown here (this assumes pandas has been imported as pd):
# weekday_name was removed in pandas 1.0; day_name() replaces it
dfs['dayofweek'] = dfs['date'].apply(lambda x: x.day_name())
dfs['dayofweek'] = pd.Categorical(dfs['dayofweek'], categories=[
    'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday',
    'Saturday', 'Sunday'], ordered=True)
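To illustrate why the day names are stored as an ordered categorical, here is a small sketch on a hypothetical three-row frame (the dates are invented for the example). With the ordering in place, sorting follows the calendar week rather than the alphabet:

```python
import pandas as pd

days = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday',
        'Saturday', 'Sunday']

# Hypothetical sample: a Sunday, a Wednesday, and a Friday.
sample = pd.DataFrame(
    {'date': pd.to_datetime(['2019-01-06', '2019-01-02', '2019-01-04'])})

# Day names as an ordered categorical, as in the step above.
sample['dayofweek'] = pd.Categorical(
    sample['date'].dt.day_name(), categories=days, ordered=True)

# Sorting respects the category order, not alphabetical order.
ordered_days = sample.sort_values('dayofweek')['dayofweek'].tolist()
print(ordered_days)  # ['Wednesday', 'Friday', 'Sunday']
```

An alphabetical sort of plain strings would have put Friday first, which is rarely what we want when plotting or grouping by weekday.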
4. Great! Next, we do the same process for the time of the day. See the snippet given here:
dfs['timeofday'] = dfs['date'].apply(lambda x: x.hour + x.minute/60 + x.second/3600)
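The time of day computed here is a decimal number of hours. As a quick check of the arithmetic, this sketch applies the same formula to two hypothetical timestamps using the `.dt` accessor, which avoids a Python-level lambda call per row:

```python
import pandas as pd

# Hypothetical sample timestamps: 14:30:00 and 09:15:00.
sample = pd.Series(pd.to_datetime(
    ['2019-01-06 14:30:00', '2019-01-06 09:15:00']))

# Same formula as above: hours + minutes/60 + seconds/3600.
timeofday = sample.dt.hour + sample.dt.minute / 60 + sample.dt.second / 3600

print(timeofday.tolist())  # [14.5, 9.25]
```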
5. Next, we refactor the hour, the year integer, and the year fraction, respectively. First, refactor the hour as shown here:
dfs['hour'] = dfs['date'].apply(lambda x: x.hour)
6. Refactor the year integer as shown here:
dfs['year_int'] = dfs['date'].apply(lambda x: x.year)
7. Lastly, refactor the year fraction as shown here:
dfs['year'] = dfs['date'].apply(lambda x: x.year + x.dayofyear/365.25)
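The three derived columns can also be sketched with the `.dt` accessor. The example below uses two hypothetical timestamps chosen to show how the year fraction behaves: 2 July 2019 is day 183 of the year, and 31 December 2020 is day 366 of a leap year:

```python
import pandas as pd

# Hypothetical sample timestamps.
sample = pd.Series(pd.to_datetime(
    ['2019-07-02 08:00:00', '2020-12-31 23:00:00']))

hour = sample.dt.hour                                      # hour of the day
year_int = sample.dt.year                                  # calendar year
year_frac = sample.dt.year + sample.dt.dayofyear / 365.25  # fractional year

print(hour.tolist())      # [8, 23]
print(year_int.tolist())  # [2019, 2020]
print(year_frac.tolist())
```

Because the divisor is fixed at 365.25, the fraction is an approximation: the first value lands near the middle of 2019, while the last day of a leap year comes out slightly above 2021. That is usually fine for plotting trends over several years.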
8. Having done that, we can set the date to index and we will no longer require the original date field. So, we can remove that:
dfs.index = dfs['date']
del dfs['date']
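As a side note, pandas offers `set_index`, which moves a column into the index and drops it from the columns in one call. The sketch below shows this on a hypothetical two-row frame (the `hour` values are made up); it is an alternative way to reach the same result, not the method used above:

```python
import pandas as pd

# Hypothetical sample frame with a date column and one derived column.
sample = pd.DataFrame({
    'date': pd.to_datetime(['2019-01-06 14:30:00', '2019-01-07 09:15:00']),
    'hour': [14, 9],
})

# Move 'date' into the index; by default the column is dropped.
sample = sample.set_index('date')

print(sample.columns.tolist())  # ['hour']
print(sample.index.name)        # date

# With a datetime index, rows can be selected by a date string.
one_day = sample.loc['2019-01-06']
print(len(one_day))  # 1
```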
Great! Good work so far. We have successfully executed our data transformation steps. If some of the steps were not clear, don't worry; we are going to deal with each of these phases in detail in upcoming chapters.