00. Background

Suppose you are the regional director of a logistics company (3PL) and are responsible for 22 warehouses.


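In each warehouse, the site manager has set a picking productivity target for operators; your goal is to find the right incentive policy so that 75% of operators reach that target.

Picking productivity is the number of cartons picked per paid hour.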
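As a small illustration of this definition, the sketch below computes picking productivity and the binary "target reached" flag that the rest of the analysis relies on. The function names and the numbers in the example shift are hypothetical and are not taken from the article's data.

```python
def picking_productivity(cartons_picked: float, paid_hours: float) -> float:
    """Return cartons picked per paid hour."""
    if paid_hours <= 0:
        raise ValueError("paid_hours must be positive")
    return cartons_picked / paid_hours


def reached_target(cartons_picked: float, paid_hours: float,
                   target_per_hour: float) -> int:
    """Return 1 if the productivity meets the target, else 0."""
    productivity = picking_productivity(cartons_picked, paid_hours)
    return int(productivity >= target_per_hour)


# Hypothetical shift: 1,200 cartons in 8 paid hours, target of 140 cartons/hour
print(picking_productivity(1200, 8))
print(reached_target(1200, 8, 140))
```

The 0/1 flag plays the role of the `Target` column used in the code later in the article.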

A short animated video accompanying this article explains the concept behind the solution.


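Objective: find the right incentive policy

Currently, productive operators (those who reach their daily productivity target) receive 5 euros per day on top of their daily wage of 64 euros (after tax).

However, this incentive policy, applied in 2 warehouses, has not been very effective: only 20% of operators reach the target.

Question

What is the minimum daily bonus needed for 75% of operators to reach the picking productivity target?

Experiment

Randomly select operators across your 22 warehouses and assign each a daily incentive amount between 1 and 20 euros.

Check whether each operator reached the target.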
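The experiment can be mimicked with synthetic data, which is handy for trying the analysis without the original file. This is only a sketch: the function name, the sample size, and the coefficients `beta0` and `beta1` are hypothetical choices, not values from the real experiment. The column names `Incentive` and `Target` match the ones used in the code later in the article.

```python
import numpy as np
import pandas as pd


def simulate_experiment(n_operators=2000, beta0=-3.0, beta1=0.3, seed=0):
    """Simulate a randomized incentive experiment (synthetic data only)."""
    rng = np.random.default_rng(seed)
    # Each operator is randomly assigned a daily incentive of 1 to 20 euros
    incentive = rng.integers(1, 21, size=n_operators)
    # Assumed logistic relationship between incentive and success probability
    prob = 1.0 / (1.0 + np.exp(-(beta0 + beta1 * incentive)))
    # 1 if the operator reached the target that day, 0 otherwise
    target = rng.binomial(1, prob)
    return pd.DataFrame({"Incentive": incentive, "Target": target})


df_sim = simulate_experiment()
print(df_sim.head())
print("share reaching target:", df_sim["Target"].mean())
```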

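01. Analysis

Use Python in place of Minitab to run a logistic regression and estimate the minimum bonus needed for 75% of operators to reach the productivity target.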


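We will implement a logistic regression in Python to estimate the effect of the daily productivity bonus on the picking productivity of warehouse operators.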

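02. Lean Six Sigma

Lean Six Sigma (LSS) combines lean production with Six Sigma management; its essence is the elimination of waste. By integrating the two, it aims to draw on the strengths of both approaches, offset the weaknesses of each used alone, and achieve better management results.

Lean Six Sigma improves processes step by step. It usually follows 5 steps (Define, Measure, Analyze, Improve, Control) to fix problems in an existing process whose causes are unknown.

Lean production: derived from the Toyota Production System of the 1960s and early 1970s.

Six Sigma: first applied successfully at Motorola in the mid-1980s.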

03. Complete code

1. Import packages

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import scipy.stats as stat
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False

2. Import data

df_incentive = pd.read_excel('df_incentive.xlsx')

print("{:,} records".format(len(df_incentive)))
df_incentive.head()


3. Box plot of the incentive distribution by target outcome

df_incentive.boxplot(figsize=(10,7), by=['Target'], column = ['Incentive'])
plt.xlabel('Target (1: reached, 0: not reached)')
plt.ylabel('Incentive (euros/day)')
plt.show()


4. Format the data

df_calc = pd.DataFrame(df_incentive.groupby(['Incentive'])['Target'].sum())
df_calc.columns = ['Target']

df_calc['No Target'] = df_incentive.groupby(['Incentive'])['Target'].count() - df_calc['Target']
df_calc['Total'] = df_incentive.groupby(['Incentive'])['Target'].count()
# Reset the index and sort by incentive amount
df_calc.reset_index(inplace = True)
df_calc.sort_values(['Incentive'], ascending = True, inplace = True)
df_calc.head()
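The same summary can be produced in a single pass, with an observed success rate per incentive level added as a sanity check before fitting any model. This is a sketch only: the helper name and the toy data are hypothetical, and the `Success rate` column is an addition that does not appear in the table above.

```python
import pandas as pd


def success_rate_by_incentive(df):
    """Count successes and totals per incentive level, then derive the rate."""
    summary = (
        df.groupby("Incentive")["Target"]
          .agg(Target="sum", Total="count")   # named aggregation on a Series
          .reset_index()
          .sort_values("Incentive")
    )
    summary["No Target"] = summary["Total"] - summary["Target"]
    summary["Success rate"] = summary["Target"] / summary["Total"]
    return summary


# Toy data for illustration only
df_toy = pd.DataFrame({
    "Incentive": [5, 5, 5, 5, 10, 10, 10, 10],
    "Target":    [0, 0, 0, 1, 1, 1, 0, 1],
})
print(success_rate_by_incentive(df_toy))
```

On the real data, the incentive level at which the observed rate first exceeds 0.75 gives a rough, model-free estimate to compare with the regression result.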


5. Train the model and predict

# Define X, y
X = df_incentive[['Incentive']]
y = df_incentive['Target']
# Train/test split
X_train,X_test,y_train,y_test = train_test_split(X,y,test_size=0.3,random_state=0)
# Instantiate the model
log_regression = LogisticRegression()
# Fit the model on the training set
log_regression.fit(X_train,y_train)
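The split above produces X_test and y_test, but the listing does not use them afterwards. One possible way to use them is sketched below: score the model on the held-out set and predict the probability of reaching the target for a few incentive levels. The data here are synthetic (the coefficients -3.0 and 0.3 are hypothetical), so the printed numbers say nothing about the real experiment.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for df_incentive (hypothetical coefficients)
rng = np.random.default_rng(0)
incentive = rng.integers(1, 21, size=2000)
prob = 1.0 / (1.0 + np.exp(-(-3.0 + 0.3 * incentive)))
df_demo = pd.DataFrame({"Incentive": incentive,
                        "Target": rng.binomial(1, prob)})

X = df_demo[["Incentive"]]
y = df_demo["Target"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

model = LogisticRegression().fit(X_train, y_train)

# Accuracy on operators the model has not seen during training
test_accuracy = model.score(X_test, y_test)
print(f"held-out accuracy: {test_accuracy:.3f}")

# Predicted probability of reaching the target at selected incentive levels
grid = pd.DataFrame({"Incentive": [5, 10, 15, 20]})
grid["P(target)"] = model.predict_proba(grid[["Incentive"]])[:, 1]
print(grid)
```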

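# Compute the p-value of the incentive coefficient (Wald test).
# Note: X has no constant column, so the information matrix below omits the
# intercept, and LogisticRegression applies an L2 penalty by default; treat
# the result as an approximation.
denom = (2.0*(1.0+np.cosh(log_regression.decision_function(X))))
denom = np.tile(denom,(X.shape[1],1)).T
# Fisher information matrix
F_ij = np.dot((X/denom).T,X)
# Inverse information matrix (Cramér-Rao bound)
Cramer_Rao = np.linalg.inv(F_ij)
sigma_estimates = np.sqrt(np.diagonal(Cramer_Rao))
# z-scores
z_scores = log_regression.coef_[0]/sigma_estimates
# Two-tailed p-values
p_values = [stat.norm.sf(abs(x))*2 for x in z_scores]

print("p-value: {}".format(p_values[0]))

p-value: 2.1327739857133364e-141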
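The information matrix above is built from the Incentive column alone. A variant that also accounts for the intercept is sketched below: it prepends a column of ones to the design matrix and weights each row by p(1 - p), which equals the 1 / (2 (1 + cosh z)) term used above. The helper name `wald_p_values` is hypothetical, the data are synthetic, and a large `C` is an assumption made here to weaken scikit-learn's default L2 penalty so the fit is close to the unpenalized estimate that the Wald test presumes.

```python
import numpy as np
import scipy.stats as stat
from sklearn.linear_model import LogisticRegression


def wald_p_values(model, X):
    """Two-sided Wald p-values for [intercept, coefficients] of a binary model."""
    X = np.asarray(X, dtype=float)
    design = np.column_stack([np.ones(len(X)), X])    # add the intercept column
    p = model.predict_proba(X)[:, 1]
    weights = p * (1.0 - p)                           # Bernoulli variance per row
    fisher = design.T @ (design * weights[:, None])   # X' W X
    std_err = np.sqrt(np.diag(np.linalg.inv(fisher)))
    params = np.concatenate([model.intercept_, model.coef_[0]])
    z_scores = params / std_err
    return 2.0 * stat.norm.sf(np.abs(z_scores))


# Synthetic stand-in for the experiment (hypothetical coefficients)
rng = np.random.default_rng(0)
incentive = rng.integers(1, 21, size=2000).reshape(-1, 1)
prob = 1.0 / (1.0 + np.exp(-(-3.0 + 0.3 * incentive[:, 0])))
reached = rng.binomial(1, prob)

# Large C: close to an unpenalized fit (an assumption of this sketch)
model = LogisticRegression(C=1e6, max_iter=1000).fit(incentive, reached)
p_vals = wald_p_values(model, incentive)
print("p-value (intercept):", p_vals[0])
print("p-value (incentive):", p_vals[1])
```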

6. Final plot

plt.figure(figsize=(12, 6))
ax = plt.gca()
sns.regplot(x='Incentive', y='Target', data=df_incentive, logistic=True, ax = ax)
plt.xlabel('Productivity incentive (euros/day)')
plt.ylabel('Probability of reaching the productivity target')
plt.show()
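Rather than reading the answer off the curve, the fitted model can be inverted. A logistic model gives p = 1 / (1 + exp(-(b0 + b1 * x))), so the incentive x at which the probability equals p is x = (ln(p / (1 - p)) - b0) / b1. The sketch below applies this to a model fitted on synthetic data; the helper name `minimum_incentive` and the data-generating coefficients are hypothetical, so the printed amount is not the article's result.

```python
import math

import numpy as np
from sklearn.linear_model import LogisticRegression


def minimum_incentive(model, target_probability=0.75):
    """Incentive at which the fitted probability equals target_probability."""
    if not 0.0 < target_probability < 1.0:
        raise ValueError("target_probability must be strictly between 0 and 1")
    b0 = float(model.intercept_[0])
    b1 = float(model.coef_[0][0])
    if b1 <= 0:
        raise ValueError("the incentive coefficient must be positive")
    logit = math.log(target_probability / (1.0 - target_probability))
    return (logit - b0) / b1


# Synthetic stand-in for the experiment (hypothetical coefficients)
rng = np.random.default_rng(0)
incentive = rng.integers(1, 21, size=2000)
prob = 1.0 / (1.0 + np.exp(-(-3.0 + 0.3 * incentive)))
reached = rng.binomial(1, prob)

model = LogisticRegression().fit(incentive.reshape(-1, 1), reached)
bonus = minimum_incentive(model)
print(f"minimum daily bonus for a 75% chance: {bonus:.2f} euros")
print(f"rounded up to whole euros: {math.ceil(bonus)}")
```

With the real data, the same function would be called on `log_regression`. Rounding up keeps the predicted probability at or above the threshold when bonuses are paid in whole euros.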
