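In this article, we will learn how to create dummy variables in Python using the Pandas get_dummies() method. Dummy variables (also called binary or indicator variables) are commonly used in statistical analysis as well as in simpler descriptive statistics. Dummy coding can be done automatically by statistical tools such as Python, R, or SPSS.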
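To make the idea concrete before turning to the real data, here is a minimal sketch on a small made-up table. The `toy` DataFrame and its values are assumptions for illustration only and are not taken from Salaries.csv. Each distinct value of the encoded column becomes its own indicator column.

```python
import pandas as pd

# Hypothetical toy data, NOT the Salaries.csv file used in this article.
toy = pd.DataFrame({
    "sex": ["Male", "Female", "Male"],
    "salary": [100, 120, 90],
})

# dtype=int requests 0/1 indicators; recent pandas versions return booleans by default.
dummies = pd.get_dummies(toy, columns=["sex"], dtype=int)
print(dummies)
```

The untouched column (`salary`) is kept, and the encoded column is replaced by one indicator column per category, named with the default `sex_` prefix.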
import pandas as pd
# Load the data, using the first column of the CSV file as the index
data_url = 'Salaries.csv'
df = pd.read_csv(data_url, index_col=0)
print(df.head())


# Dummy-code a single column; the result contains only the indicator columns
print(pd.get_dummies(df['sex']).head())

# Dummy-code 'sex' and keep the other columns of the dataframe
df_dummies = pd.get_dummies(df, columns=['sex'])
print(df_dummies.head())

# Custom prefix and separator: the new columns are named Gender.<value>
df_dummies = pd.get_dummies(df, prefix='Gender', prefix_sep='.', columns=['sex'])
print(df_dummies.head())

# Empty prefix and separator: the new columns are named after the category values alone
df_dummies = pd.get_dummies(df, prefix='', prefix_sep='', columns=['sex'])
print(df_dummies.head())

# The same approach applied to the 'rank' column
print(pd.get_dummies(df['rank']).head())

# Dummy-code 'rank' and keep the other columns of the dataframe
df_dummies = pd.get_dummies(df, columns=['rank'])
print(df_dummies.head())

# Custom prefix and separator: the new columns are named Rank.<value>
df_dummies = pd.get_dummies(df, prefix='Rank', prefix_sep='.', columns=['rank'])
print(df_dummies.head())


# Dummy-code two columns at once, naming the new columns after the category values alone
df_dummies = pd.get_dummies(df, prefix='', prefix_sep='', columns=['rank', 'sex'])
print(df_dummies.head())

# Dummy-code three columns at once, again without any prefix
df_dummies = pd.get_dummies(df, prefix='', prefix_sep='', columns=['rank', 'sex', 'discipline'])
print(df_dummies.head())
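When several columns are encoded with an empty prefix, it can become hard to tell which original column an indicator came from. One possible alternative, sketched here on made-up data (the `toy` DataFrame and the prefix names are assumptions, not part of the original article), is to pass a dict to `prefix` so that each column gets its own label.

```python
import pandas as pd

# Hypothetical toy data, NOT the Salaries.csv file used in this article.
toy = pd.DataFrame({
    "rank": ["Prof", "AsstProf", "Prof"],
    "sex": ["Male", "Female", "Male"],
    "salary": [100, 120, 90],
})

# A dict maps each encoded column to its own prefix (prefix names chosen for illustration).
encoded = pd.get_dummies(
    toy,
    columns=["rank", "sex"],
    prefix={"rank": "Rank", "sex": "Gender"},
    prefix_sep=".",
    dtype=int,  # 0/1 instead of the boolean default in recent pandas
)
print(encoded.columns.tolist())
```

This keeps the origin of every indicator column visible in its name, at the cost of longer column names than the prefix-free variant.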
Reposted from: https://www.jianshu.com/p/087803eccd31