This post records my trial of Backtrader, together with notes on empyrical, pyfolio and quantstats.
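I. Trying out Backtrader
I have started looking into backtrader. The first step is installation.
On versions: with the data source in mind, I am starting with the free akshare and will install tushare or something else later, so Python 3.8 or above is required. The main packages I ended up with are: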
akshare 1.9.81
backtrader 1.9.78.123
pandas 2.0.1
tushare 1.2.89
pyfolio 0.9.2
Any write-up that does not list its package versions is being irresponsible. Once everything is installed, run the akshare sample code (AkShare strategy example - AkShare Chinese documentation - developer docs - 文江博客):
from datetime import datetime

import backtrader as bt
import matplotlib.pyplot as plt
import akshare as ak
import pandas as pd

plt.rcParams["font.sans-serif"] = ["SimHei"]
plt.rcParams["axes.unicode_minus"] = False

stock_hfq_df = ak.stock_zh_a_daily(symbol="sh600000", adjust="hfq")  # fetch back-adjusted (hfq) data via AkShare
pd.set_option('display.max_columns', None)  # show all columns
stock_hfq_df.rename(columns={'date':'datetime'},inplace=True)
stock_hfq_df.set_index('datetime')


class MyStrategy(bt.Strategy):
    """Main strategy."""

    params = (("maperiod", 20),)  # strategy-wide parameters

    def __init__(self):
        """Initialisation."""
        self.data_close = self.datas[0].close  # price series to trade on
        # initialise the pending order, buy price and commission
        self.order = None
        self.buy_price = None
        self.buy_comm = None
        # add the simple moving average indicator
        self.sma = bt.indicators.SimpleMovingAverage(self.datas[0], period=self.params.maperiod)

    def next(self):
        """Called once per bar."""
        if self.order:  # an order is still pending
            return
        # check whether we hold a position
        if not self.position:  # no position
            if self.data_close[0] > self.sma[0]:  # buy condition: close breaks above the 20-day average
                self.order = self.buy(size=100)  # buy
        else:
            if self.data_close[0] < self.sma[0]:  # sell condition: close falls below the 20-day average
                self.order = self.sell(size=100)  # sell


cerebro = bt.Cerebro()  # initialise the backtest engine
start_date = datetime(2000, 1, 1)  # backtest start
end_date = datetime(2020, 4, 21)  # backtest end
data = bt.feeds.PandasData(dataname=stock_hfq_df, fromdate=start_date, todate=end_date)  # load the data
cerebro.adddata(data)  # feed the data into the engine
cerebro.addstrategy(MyStrategy)  # load the strategy into the engine
start_cash = 1000000
cerebro.broker.setcash(start_cash)  # set the initial capital to 1,000,000
cerebro.broker.setcommission(commission=0.002)  # set the commission to 0.2%
cerebro.run()  # run the backtest

port_value = cerebro.broker.getvalue()  # total funds after the backtest
pnl = port_value - start_cash  # profit and loss

print(f"Initial cash: {start_cash}\nBacktest period: {start_date.strftime('%Y%m%d')}:{end_date.strftime('%Y%m%d')}")
print(f"Total funds: {round(port_value, 2)}")
print(f"Net profit: {round(pnl, 2)}")

cerebro.plot(style='candlestick')  # plot
Note that two of these lines are my own additions:
stock_hfq_df.rename(columns={'date':'datetime'},inplace=True)
stock_hfq_df.set_index('datetime')
Run it and, unsurprisingly, it fails:
File "D:\ProgramData\Anaconda3\envs\backtrader38\lib\site-packages\backtrader\feeds\pandafeed.py", line 269, in _load
dt = tstamp.to_pydatetime()
AttributeError: 'int' object has no attribute 'to_pydatetime'
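After reading the backtrader source: backtrader requires the dates to be the DataFrame index. The set_index('datetime') call above returns a new DataFrame and its result was discarded, so the index was still the default integer index, which is why the feed saw an int. The daily quotes returned by akshare need to be prepared like this instead: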
stock_hfq_df['openinterest'] = 0
stock_hfq_df.set_index('date',inplace=True)
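The preparation step can be wrapped in a small helper. The following is only a sketch: the function name prepare_for_backtrader is hypothetical, the sample rows are made up, and it assumes the akshare result carries date, open, high, low, close and volume columns.

```python
import pandas as pd


def prepare_for_backtrader(raw: pd.DataFrame) -> pd.DataFrame:
    """Return a copy indexed by date with the columns a pandas feed looks for."""
    df = raw.copy()
    df["date"] = pd.to_datetime(df["date"])   # make sure these are real timestamps
    df = df.set_index("date").sort_index()    # the dates become the index
    df["openinterest"] = 0                    # stocks have no open interest
    return df[["open", "high", "low", "close", "volume", "openinterest"]]


# Synthetic rows standing in for the akshare result (values are made up).
sample = pd.DataFrame({
    "date": ["2020-01-03", "2020-01-02"],
    "open": [10.2, 10.0], "high": [10.5, 10.3],
    "low": [10.1, 9.9], "close": [10.4, 10.2],
    "volume": [1200, 1000],
})
prepared = prepare_for_backtrader(sample)
print(prepared.index)
```

Working on a copy keeps the raw download untouched, which is handy when the same DataFrame is reused for other tools later.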
II. Quantitative analysis with pyfolio
Running an analysis with pyfolio fails:
"pyfolio\plotting.py", line 648, in show_perf_stats,
AttributeError:'Series' object has no attribute 'iteritems'
Series.iteritems was removed in pandas 2.0. Patch the pyfolio source directly, changing line 648 to:
#for stat, value in perf_stats[column].iteritems():
for stat, value in perf_stats[column].items():
Run again; the next error:
File "...\pyfolio\timeseries.py", line 896, in get_max_drawdown_underwater
peak = underwater[:valley][underwater[:valley] == 0].index[-1]
In timeseries.py, change line 893 to:
# valley = np.argmin(underwater) # end of the period
valley = underwater.idxmin()
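The difference between the two calls is position versus label. As far as I can tell, the pyfolio code that follows slices by the valley's date, so it needs the label. A minimal sketch with a made-up drawdown series:

```python
import numpy as np
import pandas as pd

# Hypothetical underwater (drawdown) curve indexed by date.
underwater = pd.Series(
    [0.0, -0.02, -0.05, -0.01, 0.0],
    index=pd.date_range("2020-01-01", periods=5, freq="D"),
)

valley_pos = int(np.argmin(underwater.to_numpy()))  # integer position of the minimum
valley = underwater.idxmin()                        # index label (a Timestamp) of the minimum

# Label-based slice up to the valley, then the last date that was still at zero drawdown.
peak = underwater[:valley][underwater[:valley] == 0].index[-1]
print(valley_pos, valley, peak)
```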
Another error:
File "...\pyfolio\timeseries.py", line 1140, in summarize_paths
cone_bounds = pd.DataFrame(columns=pd.Float64Index([]))
AttributeError: module 'pandas' has no attribute 'Float64Index'
The cause is that the pandas version is too new (2.0.1), so downgrade it:
pip uninstall pandas
pip install pandas==1.5.3 -i https://pypi.tuna.tsinghua.edu.cn/simple
Run again: a pile of FutureWarnings and another error:
File "\pyfolio\round_trips.py", line 133, in _groupby_consecutive
grouped_price = (t.groupby(('block_dir',
KeyError: ('block_dir', 'block_time'),
The parentheses are to blame: the tuple is read as a single key rather than as two column names. Change line 133 of round_trips.py:
# grouped_price = (t.groupby(('block_dir',
#                             'block_time'))
#                   .apply(vwap))
grouped_price = (t.groupby(['block_dir', 'block_time'])
                  .apply(vwap))
grouped_price.name = 'price'
grouped_rest = t.groupby(['block_dir', 'block_time']).agg({
    'amount': 'sum',
    'symbol': 'first',
    'dt': 'first'})
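A small stand-alone illustration of grouping by a list of columns. The data and the vwap helper below are made up for the example and are not pyfolio's own code; block_dir is given string values here purely for readability.

```python
import pandas as pd

# Made-up transactions; the column names mirror the ones in the traceback.
t = pd.DataFrame({
    "block_dir": ["buy", "buy", "sell", "sell"],
    "block_time": [1, 1, 2, 2],
    "amount": [100, 200, -150, -150],
    "price": [10.0, 11.0, 12.0, 13.0],
})


def vwap(block: pd.DataFrame) -> float:
    """Volume-weighted average price of one block of transactions."""
    return (block["amount"] * block["price"]).sum() / block["amount"].sum()


# A list means "group by these two columns"; a tuple would be read as one key.
grouped_price = t.groupby(["block_dir", "block_time"])[["amount", "price"]].apply(vwap)
print(grouped_price)
```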
Run again; still an error:
File "...\pyfolio\round_trips.py", line 77, in agg_all_long_short
stats_all = (round_trips
pandas.errors.SpecificationError: nested renamer is not supported
Change line 77 of round_trips.py:
stats_all = (round_trips
             .assign(ones=1)
             .groupby('ones')[col]
             .agg(list(stats_dict.items()))
             .T
             .rename(columns={1.0: 'All trades'}))
stats_long_short = (round_trips
                    .groupby('long')[col]
                    .agg(list(stats_dict.items()))
                    .T
                    .rename(columns={False: 'Short trades',
                                     True: 'Long trades'}))
Next, the same kind of error:
File "...\pyfolio\round_trips.py", line 393, in gen_round_trip_stats
round_trips.groupby('symbol')['returns'].agg(RETURN_STATS).T
pandas.errors.SpecificationError: nested renamer is not supported
Change line 393:
stats['symbols'] = \
    round_trips.groupby('symbol')['returns'].agg(list(RETURN_STATS.items())).T
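Both patches use the same trick: a dict of name to function passed to agg on a single grouped column is rejected by recent pandas, while a list of (name, function) pairs is accepted. A sketch with made-up data, where RETURN_STATS is a small stand-in mapping and not pyfolio's actual dictionary:

```python
import pandas as pd

# Made-up round trips.
round_trips = pd.DataFrame({
    "symbol": ["A", "A", "B"],
    "returns": [0.01, -0.02, 0.03],
})
# Stand-in mapping of display name -> aggregation (hypothetical).
RETURN_STATS = {"Average return": "mean", "Largest return": "max"}

# list(dict.items()) turns the mapping into (name, func) pairs,
# and the names end up as the labels of the result.
stats = round_trips.groupby("symbol")["returns"].agg(list(RETURN_STATS.items())).T
print(stats)
```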
Once more:
File "...\pyfolio\plotting.py", line 1767, in plot_round_trip_lifetimes
ax.set_yticklabels([utils.format_asset(s) for s in sample])
ValueError: The number of FixedLocator locations (16), usually from a call to set_ticks, does not match the number of labels (3).
Tracing the error leads to line 871 of tears.py. I am not sure what this plot is for, so comment it out for now:
# plotting.plot_round_trip_lifetimes(trades, ax=ax_trade_lifetimes)
Next, an error from the utils file: date is both the index and a column, so renaming the column fixes it. Change line 351:
# txn_val['date'] = txn_val.index.date
# txn_val = txn_val.groupby('date').cumsum()
txn_val['pfdate'] = txn_val.index.date
txn_val = txn_val.groupby('pfdate').cumsum()
# txn_val.rename(columns={'pfdate':'date'},inplace=True)

# Calculate exposure, then take peak of exposure every day
txn_val['exposure'] = txn_val.abs().sum(axis=1)
from pandas.core import resample as rp
condition = (txn_val['exposure'] == txn_val.groupby(
    [rp.TimeGrouper('24H')])['exposure'].transform(max))
Run it again and it works. pyfolio is poorly supported in console mode; after switching to Jupyter notebook you can see several statistics tables and charts, which I won't go through one by one.
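III. Empyrical
Empyrical is fairly simple: pass in the daily rate of change of the strategy's equity for the stock and the daily rate of change of the benchmark (both can be computed with pct_change()).
from empyrical import (alpha, annual_return, annual_volatility, beta,
                       calmar_ratio, cum_returns, downside_risk, max_drawdown,
                       omega_ratio, sharpe_ratio, sortino_ratio, tail_ratio)


def show_result_empyrical(returns, factor_returns, risk_free=0.03):
    # print(type(returns))
    # print(type(factor_returns))
    # returns = np.array(returns)
    # factor_returns = np.array(factor_returns)
    print('\n\n<----Empyrical strategy evaluation---->')
    # total return: last value of the cumulative return series
    pan = cum_returns(returns).iloc[-1]
    print('Total return:', pan)
    print('Annualized return:', annual_return(returns))
    print('Non-systematic risk ALPHA:', alpha(returns, factor_returns, risk_free=risk_free))
    print('Systematic risk BETA:', beta(returns, factor_returns, risk_free=risk_free))
    # print('alpha_beta_aligned:',alpha_beta_aligned(returns))
    print('Max drawdown:', max_drawdown(returns))
    print('Sharpe ratio', sharpe_ratio(returns))
    print('Calmar ratio', calmar_ratio(returns))
    print('omega', omega_ratio(returns, risk_free))
    print('annual_volatility', annual_volatility(returns))
    print('downside_risk', downside_risk(returns))
    print('sortino_ratio', sortino_ratio(returns))
    print('tail_ratio', tail_ratio(returns))
    print('<----Empyrical evaluation End---->\n')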
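For reference, this is roughly how the daily return series can be derived from an equity curve with pct_change(). The equity numbers are made up, and the compounding check is done by hand rather than with empyrical.

```python
import pandas as pd

# Hypothetical daily account equity (made-up numbers).
equity = pd.Series(
    [1_000_000.0, 1_001_000.0, 999_500.0, 1_002_000.0],
    index=pd.date_range("2020-01-01", periods=4, freq="D"),
)

# Daily rate of change; the first row has no previous day, so drop it.
returns = equity.pct_change().dropna()

# Compounding the daily returns recovers the overall return on equity.
total_return = (1.0 + returns).prod() - 1.0
print(returns)
print(total_return)
```

The same pct_change() call applied to the benchmark's closing prices gives the second argument, factor_returns.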
IV. Quantstats
Quantstats can output a whole pile of evaluation metrics:
['avg_loss','avg_return','avg_win','best','cagr','calmar','common_sense_ratio','comp','compare','compsum','conditional_value_at_risk','consecutive_losses','consecutive_wins','cpc_index','cvar','drawdown_details','expected_return','expected_shortfall','exposure','gain_to_pain_ratio','geometric_mean','ghpr','greeks','implied_volatility','information_ratio','kelly_criterion','kurtosis','max_drawdown','monthly_returns','outlier_loss_ratio','outlier_win_ratio','outliers','payoff_ratio','profit_factor','profit_ratio','r2','r_squared','rar','recovery_factor','remove_outliers','risk_of_ruin','risk_return_ratio','rolling_greeks','ror','sharpe','skew','sortino','tail_ratio','to_drawdown_series','ulcer_index','ulcer_performance_index','upi','utils','value_at_risk','var','volatility','win_loss_ratio','win_rate','worst']
['daily_returns','distribution','drawdown','drawdowns_periods','earnings','histogram','log_returns','monthly_heatmap','returns','rolling_beta','rolling_sharpe','rolling_sortino','rolling_volatility','snapshot','yearly_returns']
A single command generates the complete report, which is very convenient:
qs.reports.html(days_returns,benchmark=bench_returns,output='stats.html',title='stock:'+stock_symbol+' bench:'+bench_symbol)
Computing omega raised the error 'numpy.float64' object has no attribute 'values'. Change quantstats\stats.py, line 469:
# numer = returns_less_thresh[returns_less_thresh > 0.0].sum().values[0]
numer = returns_less_thresh[returns_less_thresh > 0.0].sum().tolist()
# denom = -1.0 * \
#     returns_less_thresh[returns_less_thresh < 0.0].sum().values[0]
denom = -1.0 * \
    returns_less_thresh[returns_less_thresh < 0.0].sum().tolist()
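The two patched lines are the numerator and denominator of the omega ratio: gains above a threshold divided by losses below it. A simplified plain-Python sketch of that idea follows. It is not the quantstats implementation; in particular, threshold here is a per-period value, an assumption made for the example.

```python
def omega_ratio(returns, threshold=0.0):
    """Sum of returns above the threshold divided by the sum of shortfalls below it."""
    excess = [r - threshold for r in returns]
    gains = sum(x for x in excess if x > 0.0)
    losses = -sum(x for x in excess if x < 0.0)
    if losses == 0.0:
        return float("nan")  # no losing periods: the ratio is undefined
    return gains / losses


daily = [0.02, -0.01, 0.01, -0.02]  # made-up daily returns
print(omega_ratio(daily))
```

A value below 1 means the losses outweigh the gains, which matches the omega of about 0.97 reported for this losing backtest further down.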
Below, some commonly used metrics are computed:
feature_df = pd.DataFrame(index=days_returns.index)
feature_df['累积收益率'] = qs.stats.compsum(days_returns).values
feature_df['回撤'] = qs.stats.to_drawdown_series(days_returns)
print(f"Cumulative return (compsum): {feature_df['累积收益率'].iloc[-1]}")
print(f"Annual return (cagr): {qs.stats.cagr(days_returns)}")
print(f"Sharpe ratio (sharpe): {qs.stats.sharpe(days_returns)}")
print(f"Sortino (sortino): {qs.stats.sortino(days_returns)}")
print(f"omega(omega): {qs.stats.omega(days_returns)}")
print(f"Max drawdown (max_drawdown): {qs.stats.max_drawdown(days_returns)}")
print(f"Max drawdown (days) (max_drawdown_bar): {int(qs.stats.drawdown_details(feature_df['回撤'])['days'].max())}")
print(f"Annual volatility (volatility): {qs.stats.volatility(days_returns)}")
print(f"calmar(calmar): {qs.stats.calmar(days_returns)}")
print(f"Information ratio (information_ratio): {qs.stats.information_ratio(days_returns, bench_returns)}")
print(f"tail_ratio(tail_ratio): {qs.stats.tail_ratio(days_returns)}")
print(f"win_loss_ratio: {qs.stats.win_loss_ratio(days_returns)}")
print(f"win_rate: {qs.stats.win_rate(days_returns)}")
print(f"avg_loss: {qs.stats.avg_loss(days_returns)}")
print(f"avg_return: {qs.stats.avg_return(days_returns)}")
print(f"avg_win: {qs.stats.avg_win(days_returns)}")
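V. Some issues
1. Missing values: the trading-calendar approach
During a backtest an individual stock will always have suspension days, so compared with the benchmark (for example the SSE Composite Index) some of its trading dates are missing. How should this be handled? Two ideas: 1) Process the stock data, filling each date that is missing relative to the benchmark with the previous trading day's data; if several consecutive days are missing, the latest trading day's data is repeated for all of them. This may distort the indicators. 2) Set up a trading calendar and mark the stock's suspension days or missing dates as holidays. This works on a per-stock basis, so it does not suit multi-stock backtests.
Try the trading calendar first; write a function that builds the calendar:
#!/usr/bin/env python
# coding=utf-8
import backtrader as bt
import datetime
import pandas as pd


class ChineseCalendar(bt.TradingCalendar):
    params = dict(
        # known holidays
        holidays=[
            datetime.date(2016, 1, 1),
            datetime.date(2016, 1, 18),
            datetime.date(2016, 2, 15),
            datetime.date(2016, 3, 25),
            datetime.date(2016, 5, 30),
            datetime.date(2016, 7, 4),
            datetime.date(2016, 9, 5),
            datetime.date(2016, 11, 24),
            datetime.date(2016, 12, 26),
        ],
        earlydays=[
            (datetime.date(2016, 11, 25),
             datetime.time(9, 30), datetime.time(13, 0))
        ],
        open=datetime.time(9, 30),
        close=datetime.time(16, 0),
    )


def get_trade_calenar(stockDate, indexDate):
    stockCalendar = ChineseCalendar()
    si = indexDate.tolist().index(stockDate[0])  # position of the stock's first date within the benchmark dates
    ei = indexDate.tolist().index(stockDate[len(stockDate)-1])  # position of the stock's last date within the benchmark dates
    indexDate = indexDate[si:ei+1]
    indexSet = set(indexDate)
    stockSet = set(stockDate)
    holidaySet = set(stockCalendar.p.holidays)
    holidaySet = holidaySet.union(indexSet - stockSet)
    stockCalendar.p.holidays = sorted(list(holidaySet))
    return stockCalendar
When calling it, just pass in the stock's trading dates and the benchmark index's trading dates: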
stock_date = stock_hfq_df['date']
index_date = index_bech_df['date']
stockCalendar = get_trade_calenar(stock_date,index_date)
cerebro.addcalendar(stockCalendar)
It made no difference; the trades were still wrong.
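Independent of whether the calendar helps, the heart of the function above is a set difference: benchmark dates inside the stock's date range on which the stock has no bar. A plain-Python sketch with made-up dates (the function name missing_trading_days is hypothetical):

```python
import datetime


def missing_trading_days(stock_dates, index_dates):
    """Benchmark dates within the stock's date range on which the stock has no bar."""
    stock_set = set(stock_dates)
    first, last = min(stock_set), max(stock_set)
    return sorted(d for d in index_dates
                  if first <= d <= last and d not in stock_set)


index_dates = [datetime.date(2020, 1, d) for d in (2, 3, 6, 7, 8)]
stock_dates = [datetime.date(2020, 1, d) for d in (3, 6, 8)]  # suspended on the 7th
print(missing_trading_days(stock_dates, index_dates))
```

Printing this list for the real data is a quick way to check which dates are being treated as holidays before handing them to the calendar.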
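2. Missing values: checking inside next()
Having noticed that the two dates in the chart did not match, I check at the very start of the next function whether the stock's date and the benchmark's date agree, and return immediately if they do not. After that the trade records came out normal. This approach may still affect indicator calculations or cause other hidden problems.
# requires "import time" at the top of the module
def next(self):
    stock_time = int(time.mktime(self.datas[0].datetime.date(0).timetuple()))
    index_time = int(time.mktime(self.datetime.date().timetuple()))
    if not (stock_time == index_time):
        return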
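3. Missing values: align and trim
A third idea is to align the benchmark index data with the data of the stock being backtested. There are two ways to align: 1) pad the stock data with the suspension dates that exist in the benchmark, which raises several problems: what value to fill the gaps with, and the padded dates may trigger trades and bias the indicators; 2) delete from the benchmark data the dates on which the stock was suspended or missing. I recommend the second approach.
stock_date = stock_hfq_df['date']
index_bech_df.set_index('date',inplace=True)
start_date = stock_date[0].to_pydatetime() # first trading date of the stock
end_date = stock_date[len(stock_date) - 1].to_pydatetime() # last trading date of the stock
index_bech_df = index_bech_df[start_date:end_date]
# trim the benchmark data to the stock's dates
index_bech_df = pd.merge(stock_date,index_bech_df,on='date',how='left')
index_bech_df.set_index('date',inplace=True)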
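The trimming idea can be tried on toy data first. The sketch below uses made-up benchmark closes and a boolean mask instead of the merge above; both keep only the benchmark rows whose date also exists in the stock data.

```python
import pandas as pd

# Made-up benchmark closes on five trading days.
bench = pd.DataFrame(
    {"close": [3000.0, 3010.0, 3005.0, 3020.0, 3030.0]},
    index=pd.to_datetime(["2020-01-02", "2020-01-03", "2020-01-06",
                          "2020-01-07", "2020-01-08"]),
)
bench.index.name = "date"

# The stock traded on only three of them (suspended on the 7th).
stock_dates = pd.to_datetime(["2020-01-03", "2020-01-06", "2020-01-08"])

# Keep only the benchmark rows whose date also appears in the stock data.
bench_aligned = bench.loc[bench.index.isin(stock_dates)]
print(bench_aligned)
```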
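4. The UTC time issue
Timestamps in Python may be timezone-aware or timezone-naive, but both sides must be consistent when times are compared. I suggest converting everything to UTC. Note that tz_localize returns a new object, so assign the result:
df = df.tz_localize('UTC')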
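A short illustration with made-up series of what localizing does and why both sides need the same awareness before they are combined:

```python
import pandas as pd

naive = pd.Series([1.0, 2.0],
                  index=pd.date_range("2020-01-01", periods=2, freq="D"))
aware = naive.tz_localize("UTC")  # returns a new object; naive itself is unchanged

print(naive.index.tz)
print(aware.index.tz)

# A second series that is already UTC-aware lines up with the localized one.
other = pd.Series([10.0, 20.0],
                  index=pd.date_range("2020-01-01", periods=2, freq="D", tz="UTC"))
combined = aware + other
print(combined)
```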
5. Comparing the evaluation metrics from the three tools
<----Empyrical strategy evaluation---->
Total return: -0.00015290785679900054
Annualized return: -2.708025608932907e-05
Non-systematic risk ALPHA: 6.4675985691486915e-06
Systematic risk BETA: 0.0006593087987428167
Max drawdown: -0.0008138226348769249
Sharpe ratio -0.1259127744640561
Calmar ratio -0.03327537835492182
omega 0.9688804769454026
annual_volatility 0.0003224261664151693
downside_risk 0.0002107268263522359
sortino_ratio -0.19265498311678378
tail_ratio 1.0199623514327334
<----Empyrical evaluation End---->
<!---- Backtrader built-in metrics --->
Stock code: sh600000 Initial cash: 1000000
Backtest period: 20000101 - 20051231 Backtest days (Bars): 1423
Ending equity: 998465.84 Net profit: -1534.16
Return on investment ROI(%): -0.1534 Annualized return(%): -0.027
Return on investment ROI(%)B: -0.1535 Annualized return(%)B: -0.0272
Bench code: sh000300 Risk-free rate: 0.03
Return per year: [nan, nan, nan, -0.00017868778165341936, -0.00016061879786810618, -1.0376020991520463e-05]
Sharpe ratio Sharpe(X): nan
Annualized Sharpe ratio Sharpe(X): nan
Max drawdown length (max_drowdown_len): 1422
Max drawdown (ratio %) max_drowdown: 0.1739161259999266
Max drawdown (money) max_drowdown_money: 1739.161259999266
Variability-weighted return (VWR): nan
Annualized Calmar ratio (Calmar): OrderedDict([(datetime.date(2000, 12, 31), nan), (datetime.date(2001, 12, 31), nan), (datetime.date(2002, 12, 31), nan), (datetime.date(2003, 12, 31), nan), (datetime.date(2004, 12, 31), nan), (datetime.date(2005, 12, 31), nan)])
<!---- Backtrader built-in metrics end --->
<----Quantstats strategy evaluation---->
Cumulative return (compsum): -0.00015290785679900054
Annual return (cagr): -2.5521226758251636e-05 0.0000255
Sharpe ratio (sharpe): -0.10278825059212589
Sortino (sortino): -0.15724685399443492
omega(omega): 0.9688804769454026
Max drawdown (max_drawdown): -0.0008138226348770594
Max drawdown (days) (max_drawdown_bar): 988
Annual volatility (volatility): 0.00026312374143224344
calmar(calmar): -0.03135969149114047
Information ratio (information_ratio): 0.020937988611098746
tail_ratio(tail_ratio): 0.8151738286529508
win_loss_ratio: 1.1982289956426235
win_rate: 0.4470842332613391
avg_loss: -1.9170567627610057e-05
avg_return: -3.2985789166700927e-07
avg_win: 2.297072999433019e-05
<----Quantstats strategy evaluation end---->
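Comparing Backtrader, Empyrical and Quantstats, the statistics they produce differ to some degree, and some are wildly off, so they have to be used selectively. Overall, the Quantstats metrics look the most reliable (not verified).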
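One way to decide which number to trust is to recompute a metric by hand under explicitly stated conventions. The sketch below does this for an annualized Sharpe ratio. It assumes 252 periods per year, a sample standard deviation and a zero daily risk-free rate. These are illustrative choices, not a statement of what any of the three libraries does internally, and differences in exactly these conventions are a common source of mismatched results.

```python
import math
import statistics


def annualized_sharpe(daily_returns, periods_per_year=252, daily_risk_free=0.0):
    """Mean excess daily return over its sample std, scaled by sqrt(periods per year)."""
    excess = [r - daily_risk_free for r in daily_returns]
    spread = statistics.stdev(excess)  # sample standard deviation (n - 1 in the denominator)
    if spread == 0.0:
        return float("nan")            # constant returns: the ratio is undefined
    return statistics.mean(excess) / spread * math.sqrt(periods_per_year)


daily = [0.001, -0.002, 0.003, 0.0, -0.001]  # made-up daily returns
print(annualized_sharpe(daily))
```

Feeding the same daily return series into this function and into each library makes it easier to see which convention each reported figure follows.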