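1. Understanding hierarchical indexing
The following example creates a Series whose index is built from two lists: the first list supplies the outer index level and the second list supplies the inner index level: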
>>> import pandas as pd
>>> import numpy as np
>>> obj = pd.Series(np.random.randn(12), index=[['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c', 'd', 'd', 'd'], [0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2]])
>>> obj
a  0   -0.201536
   1   -0.629058
   2    0.766716
b  0   -1.255831
   1   -0.483727
   2   -0.018653
c  0    0.788787
   1    1.010097
   2   -0.187258
d  0    1.242363
   1   -0.822011
   2   -0.085682
dtype: float64
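2. The MultiIndex object
Printing the type of the index of the Series above shows that it is a MultiIndex object. Its levels attribute lists the distinct labels in each of the two levels, and its codes attribute gives, for every position, the integer position within the corresponding level of the label found there, as shown below: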
>>> import pandas as pd
>>> import numpy as np
>>> obj = pd.Series(np.random.randn(12), index=[['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c', 'd', 'd', 'd'], [0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2]])
>>> obj
a  0    0.035946
   1   -0.867215
   2   -0.053355
b  0   -0.986616
   1    0.026071
   2   -0.048394
c  0    0.251274
   1    0.217790
   2    1.137674
d  0   -1.245178
   1    1.234972
   2   -0.035624
dtype: float64
>>>
>>> type(obj.index)
<class 'pandas.core.indexes.multi.MultiIndex'>
>>>
>>> obj.index
MultiIndex([('a', 0),
            ('a', 1),
            ('a', 2),
            ('b', 0),
            ('b', 1),
            ('b', 2),
            ('c', 0),
            ('c', 1),
            ('c', 2),
            ('d', 0),
            ('d', 1),
            ('d', 2)],
           )
>>> obj.index.levels
FrozenList([['a', 'b', 'c', 'd'], [0, 1, 2]])
>>>
>>> obj.index.codes
FrozenList([[0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3], [0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2]])
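To make the relationship between levels and codes concrete, the sketch below rebuilds the full label sequence of each level by looking up every code in the matching level. The variable names (outer, inner, rebuilt_outer, rebuilt_inner) are chosen here for illustration only. The index is built directly from the two lists, so no random data is needed.

```python
import pandas as pd

outer = ['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c', 'd', 'd', 'd']
inner = [0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2]
index = pd.MultiIndex.from_arrays([outer, inner])

# Each code is an integer position into the level with the same number,
# so taking the codes from the level reproduces the original labels.
rebuilt_outer = index.levels[0].take(index.codes[0])
rebuilt_inner = index.levels[1].take(index.codes[1])

print(list(rebuilt_outer))
print(list(rebuilt_inner))

# get_level_values() performs the same lookup in a single call.
print(index.get_level_values(0).equals(rebuilt_outer))
```

In everyday code get_level_values() is usually the more convenient choice; the manual lookup is shown only to illustrate how the two attributes fit together.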
The from_arrays() method is commonly used to convert a list of arrays into a MultiIndex object:
>>> arrays = [[1, 1, 2, 2], ['red', 'blue', 'red', 'blue']]
>>> pd.MultiIndex.from_arrays(arrays, names=('number', 'color'))
MultiIndex([(1,  'red'),
            (1, 'blue'),
            (2,  'red'),
            (2, 'blue')],
           names=['number', 'color'])
Other commonly used methods are shown in the figure below:
[Figure: other commonly used MultiIndex methods]
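Besides from_arrays(), pandas also provides the constructors from_tuples() and from_product(). The sketch below builds the same number/color index in all three ways. The variable names are illustrative, and the choice of these two constructors is an assumption about which alternatives are most useful here.

```python
import pandas as pd

names = ['number', 'color']

# One array per level.
from_arrays = pd.MultiIndex.from_arrays(
    [[1, 1, 2, 2], ['red', 'blue', 'red', 'blue']], names=names)

# One tuple per row of the index.
from_tuples = pd.MultiIndex.from_tuples(
    [(1, 'red'), (1, 'blue'), (2, 'red'), (2, 'blue')], names=names)

# Cartesian product of the labels of each level.
from_product = pd.MultiIndex.from_product(
    [[1, 2], ['red', 'blue']], names=names)

print(from_arrays.equals(from_tuples))
print(from_arrays.equals(from_product))
```

from_product() is convenient when every combination of labels should appear; from_tuples() fits data that already arrives as one tuple per row.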
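3. Selecting values
For an object with a multi-level index, passing a single key selects on the outer level and returns all of the inner-level entries under that label. Passing two keys treats the first as the outer-level label and the second as the inner-level label. For example: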
>>> import pandas as pd
>>> import numpy as np
>>> obj = pd.Series(np.random.randn(12), index=[['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c', 'd', 'd', 'd'], [0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2]])
>>> obj
a  0    0.550202
   1    0.328784
   2    1.422690
b  0   -1.333477
   1   -0.933809
   2   -0.326541
c  0    0.663686
   1    0.943393
   2    0.273106
d  0    1.354037
   1   -2.312847
   2   -2.343777
dtype: float64
>>>
>>> obj['b']
0   -1.333477
1   -0.933809
2   -0.326541
dtype: float64
>>>
>>> obj['b', 1]
-0.9338094811708413
>>>
>>> obj[:, 2]
a    1.422690
b   -0.326541
c    0.273106
d   -2.343777
dtype: float64
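The same three selections can also be written with .loc and xs(), which some readers find more explicit than plain square brackets. This is a minimal sketch: it uses np.arange(12.0) instead of random numbers so that the results are predictable, and the variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Values 0.0 .. 11.0 so that every selection is easy to verify by eye.
obj = pd.Series(
    np.arange(12.0),
    index=[list('aaabbbcccddd'), [0, 1, 2] * 4],
)

# Outer label only: returns the three inner entries under 'b'.
outer_b = obj.loc['b']

# Outer and inner label together: returns a single scalar.
single = obj.loc[('b', 1)]

# Inner label only: xs() selects on a chosen level and drops it.
inner_2 = obj.xs(2, level=1)

print(outer_b)
print(single)
print(inner_2)
```

Here xs(2, level=1) plays the role of obj[:, 2]: it keeps every outer label and picks the entry whose inner label is 2.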
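4. Swapping levels and sorting
The swaplevel() method swaps the outer and inner index levels; in the example below it is called on the Series itself. The sortlevel() method of a MultiIndex sorts by the outer level first and then by the inner level. The order is ascending by default, and setting the ascending parameter to False sorts in descending order. sortlevel() returns a tuple containing the sorted index and an array of integer positions that maps the sorted order back to the original one. For example: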
>>> import pandas as pd
>>> import numpy as np
>>> obj = pd.Series(np.random.randn(12), index=[['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c', 'd', 'd', 'd'], [0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2]])
>>> obj
a  0   -0.110215
   1    0.193075
   2   -1.101706
b  0   -1.325743
   1    0.528418
   2   -0.127081
c  0   -0.733822
   1    1.665262
   2    0.127073
d  0    1.262022
   1   -1.170518
   2    0.966334
dtype: float64
>>>
>>> obj.swaplevel()
0  a   -0.110215
1  a    0.193075
2  a   -1.101706
0  b   -1.325743
1  b    0.528418
2  b   -0.127081
0  c   -0.733822
1  c    1.665262
2  c    0.127073
0  d    1.262022
1  d   -1.170518
2  d    0.966334
dtype: float64
>>>
>>> obj.swaplevel().index.sortlevel()
(MultiIndex([(0, 'a'),
            (0, 'b'),
            (0, 'c'),
            (0, 'd'),
            (1, 'a'),
            (1, 'b'),
            (1, 'c'),
            (1, 'd'),
            (2, 'a'),
            (2, 'b'),
            (2, 'c'),
            (2, 'd')],
           ), array([ 0,  3,  6,  9,  1,  4,  7, 10,  2,  5,  8, 11], dtype=int32))
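Because sortlevel() sorts only the index, the data still has to be reordered separately. The sketch below shows one way to do that with the returned positions, and compares it with sort_index() on the Series, which sorts index and values together. It uses np.arange(12.0) rather than random data, and the variable names are illustrative.

```python
import numpy as np
import pandas as pd

obj = pd.Series(
    np.arange(12.0),
    index=[list('aaabbbcccddd'), [0, 1, 2] * 4],
)
swapped = obj.swaplevel()

# sortlevel() returns the sorted index and the positions of the
# sorted entries in the original (unsorted) index.
sorted_index, indexer = swapped.index.sortlevel()

# Applying those positions reorders the values to match.
reordered = swapped.iloc[indexer]

# sort_index() on the Series performs both steps at once.
direct = swapped.sort_index()

# Descending order on every level.
descending = swapped.sort_index(ascending=False)

print(reordered.equals(direct))
print(descending.head(3))
```

When only the sorted data is needed, sort_index() is usually the shorter route; sortlevel() is useful when the positions themselves are of interest.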