Date: 2020-09-04

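A pandas DataFrame usually has an "index": a column that gives each row a label. It works much like the primary key of a database table. Pandas also supports MultiIndex, where the row index is a composite key built from several columns.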
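To make the primary-key analogy concrete, here is a minimal sketch. The column names (`sku`, `warehouse`, `qty`) and the data are hypothetical and invented for this illustration only. One difference from a database primary key is worth keeping in mind: pandas does not enforce uniqueness on an index, so `index.is_unique` is used here to check it.

```python
import pandas as pd

# Hypothetical sample data, made up for this illustration.
df = pd.DataFrame(
    {
        "sku": ["A1", "B2", "C3"],
        "warehouse": ["east", "east", "west"],
        "qty": [5, 7, 2],
    }
)

# Single-column index: each row is labelled by its "sku" value.
by_sku = df.set_index("sku")
qty_b2 = by_sku.loc["B2", "qty"]

# MultiIndex: the row label is the composite key (warehouse, sku).
by_both = df.set_index(["warehouse", "sku"])
qty_east_b2 = by_both.loc[("east", "B2"), "qty"]

print(by_sku.index.is_unique)      # uniqueness is checked, not enforced
print(by_both.index.nlevels)       # number of levels in the composite key
print(list(by_both.index.names))   # level names come from the source columns
```

With a MultiIndex, a full key is written as a tuple, one element per level.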

Creating an unindexed DataFrame from a CSV file

>>> import pandas, io
>>> data = io.StringIO('''Fruit,Color,Count,Price
... Apple,Red,3,$1.29
... Apple,Green,9,$0.99
... Pear,Red,25,$2.59
... Pear,Green,26,$2.79
... Lime,Green,99,$0.39
... ''')
>>> df_unindexed = pandas.read_csv(data)
>>> df_unindexed
   Fruit  Color  Count  Price
0  Apple    Red      3  $1.29
1  Apple  Green      9  $0.99
2   Pear    Red     25  $2.59
3   Pear  Green     26  $2.79
4   Lime  Green     99  $0.39
>>> df = df_unindexed.set_index(['Fruit', 'Color'])
>>> df
             Count  Price
Fruit Color
Apple Red        3  $1.29
      Green      9  $0.99
Pear  Red       25  $2.59
      Green     26  $2.79
Lime  Green     99  $0.39
>>> df.xs('Apple')
       Count  Price
Color
Red        3  $1.29
Green      9  $0.99
>>> df.xs('Red', level='Color')
       Count  Price
Fruit
Apple      3  $1.29
Pear      25  $2.59
>>> df.loc['Apple', :]
       Count  Price
Color
Red        3  $1.29
Green      9  $0.99
>>> df.loc[('Apple', 'Red'), :]
Count        3
Price    $1.29
Name: (Apple, Red), dtype: object
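The session above reads the CSV first and then calls `set_index`. As a sketch of an alternative, `read_csv` also accepts a list of column names in `index_col`, which should build the same MultiIndex while parsing. The variable names below (`csv_text`, `df_direct`, `df_two_step`, `df_flat`) are chosen for this illustration and do not appear in the original post.

```python
import io

import pandas as pd

csv_text = """Fruit,Color,Count,Price
Apple,Red,3,$1.29
Apple,Green,9,$0.99
Pear,Red,25,$2.59
Pear,Green,26,$2.79
Lime,Green,99,$0.39
"""

# index_col takes a list of column names, so the MultiIndex is built while parsing.
df_direct = pd.read_csv(io.StringIO(csv_text), index_col=["Fruit", "Color"])

# The two-step route used in the session above, for comparison.
df_two_step = pd.read_csv(io.StringIO(csv_text)).set_index(["Fruit", "Color"])
print(df_direct.equals(df_two_step))

# reset_index() turns the index levels back into ordinary columns.
df_flat = df_direct.reset_index()
print(list(df_flat.columns))
```

Going back and forth with `set_index` and `reset_index` is convenient when a column needs to be an index level for selection but an ordinary column for other operations.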

https://www.somebits.com/~nelson/pandas-multiindex-slice-demo.html

pandas.DataFrame.xs

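This method takes a key argument and selects the data at a particular level of a MultiIndex. It also works on a single-level index, where it accesses rows by index label, much like loc.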
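A small sketch of both claims follows. The frames below are hypothetical and trimmed down for illustration. The first part compares `xs` with `loc` on a single-level index; the second shows the `drop_level` parameter of `xs`, which controls whether the selected level stays in the result.

```python
import pandas as pd

# Hypothetical single-level example, not taken from the original post.
df = pd.DataFrame(
    {"num_legs": [4, 2], "num_wings": [0, 2]},
    index=["cat", "penguin"],
)

# On a single-level index, xs(key) and loc[key] return the same row as a Series.
row_xs = df.xs("cat")
row_loc = df.loc["cat"]
print(row_xs.equals(row_loc))

# On a MultiIndex, xs drops the selected level by default;
# drop_level=False keeps it in the result.
mi = pd.DataFrame(
    {"num_legs": [4, 4, 2]},
    index=pd.MultiIndex.from_tuples(
        [("mammal", "cat"), ("mammal", "dog"), ("bird", "penguin")],
        names=["class", "animal"],
    ),
)
dropped = mi.xs("mammal", level="class")
kept = mi.xs("mammal", level="class", drop_level=False)
print(dropped.index.nlevels, kept.index.nlevels)
```

One practical difference: `xs` is meant for reading data only, while `loc` can also be used for assignment.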

>>> import pandas as pd
>>> d = {'num_legs': [4, 4, 2, 2],
...      'num_wings': [0, 0, 2, 2],
...      'class': ['mammal', 'mammal', 'mammal', 'bird'],
...      'animal': ['cat', 'dog', 'bat', 'penguin'],
...      'locomotion': ['walks', 'walks', 'flies', 'walks']}
>>> df = pd.DataFrame(data=d)
>>> df
   num_legs  num_wings   class   animal locomotion
0         4          0  mammal      cat      walks
1         4          0  mammal      dog      walks
2         2          2  mammal      bat      flies
3         2          2    bird  penguin      walks
>>> df = df.set_index(['class', 'animal', 'locomotion'])
>>> df
                           num_legs  num_wings
class  animal  locomotion
mammal cat     walks              4          0
       dog     walks              4          0
       bat     flies              2          2
bird   penguin walks              2          2
>>> df.xs('mammal')
                   num_legs  num_wings
animal locomotion
cat    walks              4          0
dog    walks              4          0
bat    flies              2          2
>>> df.xs(('mammal', 'dog'))
sys:1: PerformanceWarning: indexing past lexsort depth may impact performance.
            num_legs  num_wings
locomotion
walks              4          0
>>> df.xs('cat', level=1)
                   num_legs  num_wings
class  locomotion
mammal walks              4          0
>>> df.xs(('bird', 'walks'), level=[0, 'locomotion'])
         num_legs  num_wings
animal
penguin         2          2
>>> df.xs('num_wings', axis=1)
class   animal   locomotion
mammal  cat      walks         0
        dog      walks         0
        bat      flies         2
bird    penguin  walks         2
Name: num_wings, dtype: int64
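The `PerformanceWarning` in the session above appears because the MultiIndex is not lexicographically sorted. A minimal sketch of the usual remedy, assuming the same data as above: call `sort_index()` once before doing partial-key lookups. The names `df_sorted` and `result` are introduced here for illustration.

```python
import pandas as pd

d = {'num_legs': [4, 4, 2, 2],
     'num_wings': [0, 0, 2, 2],
     'class': ['mammal', 'mammal', 'mammal', 'bird'],
     'animal': ['cat', 'dog', 'bat', 'penguin'],
     'locomotion': ['walks', 'walks', 'flies', 'walks']}
df = pd.DataFrame(data=d).set_index(['class', 'animal', 'locomotion'])

# Sorting the index lexicographically lets pandas use faster lookups
# for partial keys, which typically avoids the warning.
df_sorted = df.sort_index()
print(df_sorted.index.is_monotonic_increasing)

# Same selection as in the session above, now on the sorted frame.
result = df_sorted.xs(('mammal', 'dog'))
print(result)
```

Sorting changes the row order (here `bird` moves ahead of `mammal`), so it is best done before any code that depends on the original order.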

Reposted from: https://www.jianshu.com/p/1927233c4158

