使用numpy进行描述性统计分析 (numpy数据统计分析的客观题)

import numpy as np

numpy.random模块对Python内置的random进行了补充,增加了一些用于高效生成多种概率分布的样本值的函数。例如,你可以用normal来得到标准正态分布的4×4样本数组:

In [2]:

samples = np.random.normal(size=(4,4))
samples

Out[2]:

array([[-1.02570025,  0.21453741, -0.79162175, -0.51183586],
       [-0.23193273, -0.38605516, -0.57922755, -0.98164917],
       [ 0.25000492, -1.25107691, -1.50895315, -0.59762743],
       [-1.13635858, -1.06930107,  0.68740703, -0.55023257]])

而Python内置的random模块则只能一次生成一个样本值。如果需要产生大量的样本值,numpy.random快了不止一个数量级:

In [7]:

from random import normalvariate
N = 1000000
%timeit samples = [normalvariate(0,1) for _ in range(N)]
807 ms ± 27.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [8]:

%timeit np.random.normal(size=N)
29 ms ± 303 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

我们简单的来看看随机漫步的列子:

In [10]:

import random
position = 0
walk = [position]
steps = 1000
for i in range(steps):
    step = 1 if random.randint(0,1) else -1
    position += step
    walk.append(position)

In [14]:

import matplotlib.pyplot as plt
plt.plot(walk,label="Random walk with +1/-1 steps")
plt.show()

numpy数值计算上机总结,numpy数据统计分析的客观题