Cython program is slower than plain Python (10M options 3.5s vs 3.25s Black Scholes) - what am I missing?

This is my first Cython program. It prices European options on futures (Black Scholes without a dividend). It runs in 3.5 s on 10M options, versus 3.25 s for the straight NumPy Python code posted further down. Can anyone point out why my Cython code is slower? Is it because I used a loop instead of vectorizing the call? (I'm not sure how to do that in C, although the generated Cython code appears to vectorize it.) Can I use nogil and OpenMP around this loop, even though the variables are passed in from NumPy arrays? The simple examples in the Cython documentation don't compile correctly with cython.parallel prange on the loop: http://docs.cython.org/src/userguide/parallelism.html#module-cython.parallel. Feedback is greatly appreciated, and apologies for a somewhat open-ended question. Others can use this code freely as a starting point, since it already runs faster than the other C and Python implementations I've seen profiled online. Here it is:

Save the following as CyBlack.pyx and compile it (all inputs are float64 except Black_callput, which is int64: 1 for a call, -1 for a put). After compiling, import it with from CyBlack.CyBlack import CyBlack:

    from numpy cimport ndarray
    cimport numpy as np
    cimport cython

    cdef extern from "math.h":
        double exp(double)
        double sqrt(double)
        double log(double)
        double erf(double)

    cdef double std_norm_cdf(double x):
        return 0.5*(1+erf(x/sqrt(2.0)))

    @cython.boundscheck(False)
    cpdef CyBlack(ndarray[np.float64_t, ndim=1] BlackPnL,
                  ndarray[np.float64_t, ndim=1] Black_S,
                  ndarray[np.float64_t, ndim=1] Black_Texpiry,
                  ndarray[np.float64_t, ndim=1] Black_strike,
                  ndarray[np.float64_t, ndim=1] Black_volatility,
                  ndarray[np.float64_t, ndim=1] Black_IR,
                  ndarray[np.int64_t, ndim=1] Black_callput):
        cdef Py_ssize_t i
        cdef Py_ssize_t N = BlackPnL.shape[0]
        cdef double d1, d2

        for i in range(N):
            d1 = ((log(Black_S[i] / Black_strike[i]) + Black_Texpiry[i] * Black_volatility[i] **2 / 2)) / (Black_volatility[i] * sqrt(Black_Texpiry[i]))
            d2 = d1 - Black_volatility[i] * sqrt(Black_Texpiry[i])
            BlackPnL[i] = exp(-Black_IR[i] * Black_Texpiry[i]) * (
                Black_callput[i] * Black_S[i] * std_norm_cdf(Black_callput[i] * d1)
                - Black_callput[i] * Black_strike[i] * std_norm_cdf(Black_callput[i] * d2))

        return BlackPnL
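For checking the output of either version, the per-element formula in the loop above can be written as a scalar pure-Python function. This is only a reference sketch, not part of the original post: the name black_price and the sample inputs are hypothetical, and it uses math.erf from the standard library in place of the C erf.

```python
import math

def std_norm_cdf(x):
    """Standard normal CDF via the error function, mirroring the .pyx helper."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black_price(S, T, K, vol, r, callput):
    """Price one European option on a future (hypothetical helper name).

    callput is 1 for a call and -1 for a put, as in the question.
    """
    d1 = (math.log(S / K) + T * vol ** 2 / 2) / (vol * math.sqrt(T))
    d2 = d1 - vol * math.sqrt(T)
    return math.exp(-r * T) * (
        callput * S * std_norm_cdf(callput * d1)
        - callput * K * std_norm_cdf(callput * d2)
    )

# Hypothetical sample inputs, chosen only for illustration.
S, T, K, vol, r = 100.0, 0.5, 95.0, 0.2, 0.01
call = black_price(S, T, K, vol, r, 1)
put = black_price(S, T, K, vol, r, -1)

# Put-call parity for this formula: call - put == exp(-r*T) * (S - K).
parity_gap = abs((call - put) - math.exp(-r * T) * (S - K))
print(call, put, parity_gap)
```

Comparing a handful of array elements against this function (and checking put-call parity) is a cheap way to confirm that a faster variant still prices correctly.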

Here is the setup.py so others can build it by typing python setup.py build_ext --inplace. I built it with VS2015 for 64-bit Python 3.5 on Windows.

    from setuptools import setup
    from setuptools import Extension
    from Cython.Distutils import build_ext
    import numpy as np

    ext_modules = [Extension("CyBlack",
                             sources=["CyBlack.pyx"],
                             extra_compile_args=['/Ox', '/openmp', '/favor:INTEL64'],
                             language='c++')]

    setup(
        name='Generic model class',
        cmdclass={'build_ext': build_ext},
        include_dirs=[np.get_include()],
        ext_modules=ext_modules)

And here is my very fast NumPy-only Python code:

    import numpy as np
    from scipy.stats import norm

    d1 = ((np.log(Black_S / Black_strike) + Black_Texpiry * Black_volatility **2 / 2)) / (Black_volatility * np.sqrt(Black_Texpiry))
    d2 = d1 - Black_volatility * np.sqrt(Black_Texpiry)
    BlackPnL = np.exp(-Black_IR * Black_Texpiry) * (
        Black_callput * Black_S * norm.cdf(Black_callput * d1)
        - Black_callput * Black_strike * norm.cdf(Black_callput * d2))

Best answer

I added the following lines before the functions in your Cython code, and got a faster result from Cython than from Python 2.7:

    @cython.boundscheck(False)
    @cython.wraparound(False)
    @cython.cdivision(True)
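As background on these three directives: boundscheck(False) skips index range checks, wraparound(False) disables negative (from-the-end) indexing, and cdivision(True) makes division on C types follow C semantics and drops the zero-division check that Cython otherwise inserts. In the pricing loop the divisions are on doubles, so the removed zero check is the part that matters there. The integer case is where the two semantics visibly differ, which the following plain-Python sketch illustrates; the helper names c_style_div and c_style_mod are hypothetical and only emulate the C behavior.

```python
import math

def c_style_div(a, b):
    """Integer quotient truncated toward zero, as C computes a / b."""
    return math.trunc(a / b)

def c_style_mod(a, b):
    """Remainder taking the sign of the dividend, as C computes a % b."""
    return int(math.fmod(a, b))

# Python floors the quotient, so the remainder takes the sign of the divisor.
python_result = (-7 // 2, -7 % 2)

# C truncates the quotient, so the remainder takes the sign of the dividend.
c_result = (c_style_div(-7, 2), c_style_mod(-7, 2))

print(python_result, c_result)  # (-4, 1) (-3, -1)
```

With wraparound(False) and boundscheck(False) in effect, an out-of-range or negative index is no longer caught, so these directives are only safe when the loop indices are known to stay within the arrays, as they do in the range(N) loop here.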

My results for 10M points:

    %timeit PyBlack(BlackPnL, Black_S, Black_Texpiry, Black_strike, Black_volatility, Black_IR, Black_callput)
    1 loops, best of 3: 3.49 s per loop

and

    %timeit CyBlack(BlackPnL, Black_S, Black_Texpiry, Black_strike, Black_volatility, Black_IR, Black_callput)
    1 loops, best of 3: 2.12 s per loop

EDIT

CyBlack.pyx

    from numpy cimport ndarray
    cimport numpy as np
    cimport cython

    cdef extern from "math.h":
        double exp(double)
        double sqrt(double)
        double log(double)
        double fabs(double)

    cdef double a1 = 0.254829592
    cdef double a2 = -0.284496736
    cdef double a3 = 1.421413741
    cdef double a4 = -1.453152027
    cdef double a5 = 1.061405429
    cdef double p = 0.3275911

    @cython.boundscheck(False)
    @cython.wraparound(False)
    @cython.cdivision(True)
    cdef inline double erf(double x):
        cdef int sign = 1
        if (x < 0):
            sign = -1
        x = fabs(x)

        cdef double t = 1.0/(1.0 + p*x)
        cdef double y = 1.0 - (((((a5*t + a4)*t) + a3)*t + a2)*t + a1)*t*exp(-x*x)

        return sign*y

    @cython.boundscheck(False)
    @cython.wraparound(False)
    @cython.cdivision(True)
    cdef double std_norm_cdf(double x):
        return 0.5*(1+erf(x/sqrt(2.0)))

    @cython.boundscheck(False)
    @cython.wraparound(False)
    @cython.cdivision(True)
    cpdef CyBlack(ndarray[np.float64_t, ndim=1] BlackPnL,
                  ndarray[np.float64_t, ndim=1] Black_S,
                  ndarray[np.float64_t, ndim=1] Black_Texpiry,
                  ndarray[np.float64_t, ndim=1] Black_strike,
                  ndarray[np.float64_t, ndim=1] Black_volatility,
                  ndarray[np.float64_t, ndim=1] Black_IR,
                  ndarray[np.int64_t, ndim=1] Black_callput):
        cdef Py_ssize_t i
        cdef Py_ssize_t N = BlackPnL.shape[0]
        cdef double d1, d2

        for i in range(N):
            d1 = ((log(Black_S[i] / Black_strike[i]) + Black_Texpiry[i] * Black_volatility[i] **2 / 2)) / (Black_volatility[i] * sqrt(Black_Texpiry[i]))
            d2 = d1 - Black_volatility[i] * sqrt(Black_Texpiry[i])
            BlackPnL[i] = exp(-Black_IR[i] * Black_Texpiry[i]) * (
                Black_callput[i] * Black_S[i] * std_norm_cdf(Black_callput[i] * d1)
                - Black_callput[i] * Black_strike[i] * std_norm_cdf(Black_callput[i] * d2))

        return BlackPnL
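Note that this version no longer takes erf from math.h but defines its own polynomial erf. The constants appear to match the Abramowitz and Stegun approximation 7.1.26, whose documented absolute error is about 1.5e-7, so prices from this version can differ slightly from those computed with the library erf. The following pure-Python sketch (erf_approx is a hypothetical name, and the test grid is an arbitrary choice) measures that difference against math.erf.

```python
import math

# Same coefficients as in the CyBlack.pyx listing above.
A1, A2, A3, A4, A5 = 0.254829592, -0.284496736, 1.421413741, -1.453152027, 1.061405429
P = 0.3275911

def erf_approx(x):
    """Polynomial erf approximation, transcribed from the Cython version."""
    sign = -1.0 if x < 0 else 1.0
    x = abs(x)
    t = 1.0 / (1.0 + P * x)
    y = 1.0 - (((((A5 * t + A4) * t) + A3) * t + A2) * t + A1) * t * math.exp(-x * x)
    return sign * y

# Compare against the standard library on a grid over [-6, 6].
xs = [i / 1000.0 for i in range(-6000, 6001)]
max_err = max(abs(erf_approx(x) - math.erf(x)) for x in xs)
print(max_err)
```

Whether an error of this size is acceptable depends on the use case. It also means that the approximation itself may account for part of any timing difference relative to the original listing, separately from the compiler directives.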

setup.py

    try:
        from setuptools import setup
        from setuptools import Extension
    except ImportError:
        from distutils.core import setup
        from distutils.extension import Extension

    from Cython.Distutils import build_ext
    import numpy as np

    ext_modules = [Extension("CyBlack", ["CyBlack.pyx"])]

    setup(
        name='Generic model class',
        cmdclass={'build_ext': build_ext},
        include_dirs=[np.get_include()],
        ext_modules=ext_modules)
