[TOC]
1. 时间转换
对于时间序列的转换,好像都是通过datetime作为中转来变换timestamp和字符串的,也就是:
timestamp ←→ datetime ←→ str
timestamp → datetime pd.datetime.fromtimestamp(1487312878)
str → datetime pd.to_datetime('2017-01-02')
import datetime
datetime 变量a → timestamp a.timestamp()
datetime 变量a → str a.strftime('%Y-%m-%d')
pandas应用:
timestamp → datetime Seriesobj.map(pd.datetime.fromtimestamp)
str → datetime Seriesobj.map(pd.to_datetime)
datetime 变量x → timestamp Seriesobj.map(lambda x.timestamp()).astype(np.int64)
datetime 变量x → str Seriesobj.map(lambda x.strftime('%Y-%m-%d %H-%m'))
2. 计算2个日期之间的时间差
2.1 自定义month_diff函数
Define a "month" as 1/12 year, then do this:
def month_diff(d1, d2):
"""Return the number of months between d1 and d2,
such that d2 + month_diff(d1, d2) == d1
"""
diff = (12 * d1.year + d1.month) - (12 * d2.year + d2.month)
return diff
assert diff_month(datetime(2010,10,1), datetime(2010,9,1)) == 1
assert diff_month(datetime(2010,10,1), datetime(2009,10,1)) == 12
assert diff_month(datetime(2010,10,1), datetime(2009,11,1)) == 11
assert diff_month(datetime(2010,10,1), datetime(2009,8,1)) == 14
date1 = datetime(2012, 2, 15)
date1.year # 2012, type为int
2.2 使用dateutile.relativedelta.relativedelta
from datetime import datetime
from dateutil import relativedelta
date1 = datetime.strptime(str('2011-08-15 12:00:00'), '%Y-%m-%d %H:%M:%S')
date2 = datetime.strptime(str('2012-02-15'), '%Y-%m-%d')
r = relativedelta.relativedelta(date2, date1)
print(r.months) # 5
date1 = datetime(2012, 2, 15)
date2 = datetime(2013, 12, 1)
r = relativedelta.relativedelta(date2, date1) # relativedelta(years=+1, months=+9, days=+16)
r.years * 12 + r.months # 21
有一个缺点是,如果时间差超过1年,则会将年的数字分配到r.years
中,如2011-01和2012-01的month差r.months
是0
3. datetime类型
In [1]: import datetime
In [2]: date1 = datetime.datetime(2010, 10, 4, 10, 10)
In [3]: date1
Out[3]: datetime.datetime(2010, 10, 4, 10, 10)
In [4]: date1.date()
Out[4]: datetime.date(2010, 10, 4)
In [6]: date1.day
Out[6]: 4
In [7]: date1.year
Out[7]: 2010
In [8]: type(date1.month)
Out[8]: int
4. 时间转换样例
A 目标
Start Date End Date Contract
0 1/1/17 6/1/17 1
1 7/1/17 12/1/17 2
转换为:
Month Contract
0 2017-01 1
1 2017-02 1
2 2017-03 1
3 2017-04 1
4 2017-05 1
5 2017-06 1
6 2017-07 2
7 2017-08 2
8 2017-09 2
9 2017-10 2
10 2017-11 2
11 2017-12 2
B 代码
import pandas as pd
import numpy as np
import re, io, os
import datetime
data = '''Start Date End Date Contract
1/1/17 6/1/17 1
7/1/17 12/1/17 2
'''
df = pd.read_csv(io.StringIO(data), sep='\t', dtype=str)
for col in ['Start Date', 'End Date']:
df[col] = df[col].map(lambda x: datetime.datetime.strptime(x, "%m/%d/%y"))
def convert_time(obj):
x = pd.period_range(obj['Start Date'], obj['End Date'], freq='M')
dfx = pd.DataFrame(x)
dfx['Contract'] = obj['Contract']
dfx.rename(columns={0:'Month'}, inplace=True)
return dfx
for i in range(df.shape[0]):
if i == 0:
dfn = convert_time(df.iloc[i])
else:
dfn = pd.concat([dfn, convert_time(df.iloc[i])], ignore_index=True)
dfn