This page shows performance tests and their results. For element-by-element pipelines, the performance is as good as that of ad-hoc crafted Python code written without the same flexibility. For chunking pipelines (see numpy_chunking), the performance is better than that of either ad-hoc crafted C code that processes the data element by element or !NumPy code that processes all the data in a single chunk.
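As a rough illustration of the difference between the two styles (a minimal sketch only, assuming the dagpype names used in the snippets below, csv_vals, corr, np.chunk_stream_vals and np.mean, are in scope, and that data.csv is a CSV file with columns named '0' and '1'):

# Element-by-element pipeline: individual (x, y) pairs flow through the stages.
c = csv_vals(open('data.csv', 'r'), ('0', '1')) | corr()

# Chunking pipeline: whole NumPy arrays flow through the stages instead.
m = np.chunk_stream_vals('data.csv', '0') | np.mean()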
The following shows the performance of three implementations finding the correlation between two CSV columns pruned for outliers:
The ad-hoc crafted CSV code using csv.reader is:
r = csv.reader(open(_f_name, 'r'))
fields = r.next()
ind0, ind1 = fields.index('0'), fields.index('1')
sx, sxx, sy, syy, sxy, n = 0, 0, 0, 0, 0, 0
try:
    while True:
        row = r.next()
        x, y = float(row[ind0]), float(row[ind1])
        if x < 0.5 and y < 0.5:
            sx += x
            sxx += x * x
            sy += y
            sxy += x * y
            syy += y * y
            n += 1
except StopIteration:
    c = (n * sxy - sx * sy) / math.sqrt(n * sxx - sx * sx) / math.sqrt(n * syy - sy * sy)
The ad-hoc crafted CSV code using csv.DictReader is:
def _dict_corr():
    r = csv.DictReader(open(_f_name, 'r'), ('0', '1'))
    sx, sxx, sy, syy, sxy, n = 0, 0, 0, 0, 0, 0
    for row in r:
        x, y = float(row['0']), float(row['1'])
        if x < 0.5 and y < 0.5:
            sx += x
            sxx += x * x
            sy += y
            sxy += x * y
            syy += y * y
            n += 1
    c = (n * sxy - sx * sy) / math.sqrt(n * sxx - sx * sx) / math.sqrt(n * syy - sy * sy)
The pipeline code is:
def _csv_pipes_corr():
    c = csv_vals(open(_f_name, 'r'), ('0', '1')) | \
        filt(pre = lambda (x, y) : x < 0.5 and y < 0.5) | \
        corr()
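For reference, all three variants compute the sample Pearson correlation coefficient; in terms of the running sums kept by the ad-hoc versions above, the quantity evaluated at the end of the stream is

c = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{\sqrt{n \sum x_i^2 - \left( \sum x_i \right)^2} \; \sqrt{n \sum y_i^2 - \left( \sum y_i \right)^2}}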
The following shows the performance of two implementations finding the mean of a CSV column using direct !NumPy and dagpype:
The !NumPy implementation processing all data in a single chunk is:
x = numpy.genfromtxt(_f_name, usecols = (0), delimiter = ',')
a = numpy.mean(x)
The pipeline implementation is:
c = np.chunk_stream_vals(_f_name, '0') | np.mean()
The following shows the performance of three implementations finding the correlation between two columns of binary data, using C code, direct !NumPy, and dagpype:
The C implementation is:
double c_corr(const char f_name[])
{
    FILE *const pf = fopen(f_name, "rb");
    assert(pf != NULL);
    double sx = 0, sxx = 0, sy = 0, syy = 0, sxy = 0;
    size_t n = 0;
    while(1)
    {
        double x, y;
        if(fread(&x, sizeof(double), 1, pf) != 1 || fread(&y, sizeof(double), 1, pf) != 1 || feof(pf))
        {
            fclose(pf);
            break;
        }
        sx += x;
        sxx += x * x;
        sy += y;
        sxy += x * y;
        syy += y * y;
        ++n;
    }
    // printf("C %ld values\n", n);
    return (n * sxy - sx * sy) / sqrt(n * sxx - sx * sx) / sqrt(n * syy - sy * sy);
}
The !NumPy implementation processing all data in a single chunk is:
s = open(_f_name, 'rb').read()
a = numpy.fromstring(s)
xy = a.reshape(a.shape[0] / 2, 2)
s = numpy.sum(xy, axis = 0)
sx = s[0]
sy = s[1]
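# xy.T dot xy is the 2x2 matrix of second-order sums:
# [[sum(x * x), sum(x * y)], [sum(x * y), sum(y * y)]]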
c = numpy.dot(xy.T, xy)
sxx = c[0, 0]
sxy = c[0, 1]
syy = c[1, 1]
n = xy.shape[0]
# print 'numpy core', n, 'values'
res = (n * sxy - sx * sy) / math.sqrt(n * sxx - sx * sx) / math.sqrt(n * syy - sy * sy)
The pipeline implementation is:
c = np.chunk_stream_bytes(_f_name, num_cols = 2) | np.corr()
The following shows the performance of three implementations finding the correlation between two columns of binary data, pruning pairs in which either value is 0.25 or more, using C code, direct !NumPy, and dagpype:
The C implementation is:
double c_corr_prune(const char f_name[])
{
    FILE *const pf = fopen(f_name, "rb");
    assert(pf != NULL);
    double sx = 0, sxx = 0, sy = 0, syy = 0, sxy = 0;
    size_t n = 0;
    while(1)
    {
        double x, y;
        if(fread(&x, sizeof(double), 1, pf) != 1 || fread(&y, sizeof(double), 1, pf) != 1 || feof(pf))
        {
            fclose(pf);
            break;
        }
        if(x >= 0.25 || y >= 0.25)
            continue;
        sx += x;
        sxx += x * x;
        sy += y;
        sxy += x * y;
        syy += y * y;
        ++n;
    }
    // printf("C %ld values\n", n);
    return (n * sxy - sx * sy) / sqrt(n * sxx - sx * sx) / sqrt(n * syy - sy * sy);
}
The !NumPy implementation processing all data in a single chunk is:
s = open(_f_name, 'rb').read()
a = numpy.fromstring(s)
xy = a.reshape(a.shape[0] / 2, 2)
xy = xy[numpy.logical_and(xy[:, 0] < 0.25, xy[:, 1] < 0.25), :]
s = numpy.sum(xy, axis = 0)
sx = s[0]
sy = s[1]
c = numpy.dot(xy.T, xy)
sxx = c[0, 0]
sxy = c[0, 1]
syy = c[1, 1]
n = xy.shape[0]
# print 'numpy core', n, 'values'
res = (n * sxy - sx * sy) / math.sqrt(n * sxx - sx * sx) / math.sqrt(n * syy - sy * sy)
The pipeline implementation is:
c = np.chunk_stream_bytes(_f_name, num_cols = 2) | \
    filt(lambda a : a[numpy.logical_and(a[:, 0] < 0.25, a[:, 1] < 0.25), :]) | \
    np.corr()
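The transform passed to filt above is applied to each !NumPy chunk flowing through the pipe; it keeps only the rows in which both columns are below 0.25. A minimal standalone illustration of the same operation, independent of dagpype:

import numpy

a = numpy.array([[0.1, 0.2], [0.3, 0.2], [0.1, 0.9]])
# Boolean row mask: True where both the first and the second column are below 0.25.
mask = numpy.logical_and(a[:, 0] < 0.25, a[:, 1] < 0.25)
pruned = a[mask, :]    # only the first row survives in this example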
The following shows the performance of three implementations finding the correlation between two columns of binary data, truncating values at 0.25, using C code, direct !NumPy, and dagpype:
The C implementation is:
double c_corr_trunc(const char f_name[])
{
    FILE *const pf = fopen(f_name, "rb");
    assert(pf != NULL);
    double sx = 0, sxx = 0, sy = 0, syy = 0, sxy = 0;
    size_t n = 0;
    while(1)
    {
        double x, y;
        if(fread(&x, sizeof(double), 1, pf) != 1 || fread(&y, sizeof(double), 1, pf) != 1 || feof(pf))
        {
            fclose(pf);
            break;
        }
        x = fmin(x, 0.25);
        y = fmin(y, 0.25);
        sx += x;
        sxx += x * x;
        sy += y;
        sxy += x * y;
        syy += y * y;
        ++n;
    }
    // printf("C %ld values\n", n);
    return (n * sxy - sx * sy) / sqrt(n * sxx - sx * sx) / sqrt(n * syy - sy * sy);
}
The !NumPy implementation processing all data in a single chunk is:
s = open(_f_name, 'rb').read()
a = numpy.fromstring(s)
xy = a.reshape(a.shape[0] / 2, 2)
xy = numpy.where(xy < 0.25, xy, 0.25)
s = numpy.sum(xy, axis = 0)
sx = s[0]
sy = s[1]
c = numpy.dot(xy.T, xy)
sxx = c[0, 0]
sxy = c[0, 1]
syy = c[1, 1]
n = xy.shape[0]
# print 'numpy core', n, 'values'
res = (n * sxy - sx * sy) / math.sqrt(n * sxx - sx * sx) / math.sqrt(n * syy - sy * sy)
The pipeline implementation is:
c = np.chunk_stream_bytes(_f_name, num_cols = 2) | \
    filt(lambda a : numpy.where(a < 0.25, a, 0.25)) | \
    np.corr()
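As with the pruning example, the transform given to filt here is plain !NumPy applied chunk by chunk; a minimal standalone illustration, independent of dagpype:

import numpy

a = numpy.array([[0.1, 0.9], [0.3, 0.2]])
# Entries below 0.25 pass through; the rest are replaced by 0.25.
b = numpy.where(a < 0.25, a, 0.25)
# numpy.minimum(a, 0.25) would be an equivalent formulation.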