
news 2025/11/16 4:52:33

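# pandas Introductory Training

## Introduction to pandas

- Official site: http://pandas.pydata.org/
- The name pandas comes from "panel data" and "data analysis".
- pandas is a data analysis package for Python. It was originally developed as a tool for financial data analysis, so it has very good support for time series analysis.

Basic features:

- Data structures with automatic or explicit data alignment along an axis
- Integrated time series functionality
- Data structures that handle both time series and non-time-series data
- Mathematical operations and reductions (such as summing along an axis) that can be carried out according to different metadata (axis labels)
- Flexible handling of missing data
- Merging and other relational operations found in common databases (SQL-based ones, for example)

## Data structures

### Data structures: Series

- A Series is a one-dimensional array-like object. It consists of a set of data (of any NumPy data type) and an associated set of data labels, the index.
- The string representation of a Series shows the index on the left and the values on the right.

The code covers:

- Creating a Series
  - from a list
  - from a dict
- Reading and writing a Series
- Series arithmetic

```python
# -*- coding: utf-8 -*-
from pandas import Series
# from __future__ import print_function

print 'Create a Series from an array'
obj = Series([4, 7, -5, 3])  # build a Series from a list
print obj
print obj.values
print obj.index
print

print 'Specify the index of a Series'
obj2 = Series([4, 7, -5, 3], index = ['d', 'b', 'a', 'c'])  # the index keyword sets the labels
print obj2
print obj2.index
print obj2['a']
obj2['d'] = 100  # modify one element through its label
print obj2[['c', 'a', 'd']]  # select by label, in the order given
print obj2[obj2 > 0]  # elements greater than 0
print 'b' in obj2  # check whether a label exists
print 'e' in obj2
print

print 'Create a Series from a dict'
sdata = {'Ohio':10000, 'Texas':20000, 'Oregon':16000, 'Utah':5000}
obj3 = Series(sdata)  # build a Series from a dict
print obj3
print

print 'Create a Series from a dict with an explicit index: unmatched labels get NaN, keys not in the index are dropped'
states = ['California', 'Ohio', 'Oregon', 'Texas']
obj4 = Series(sdata, index = states)  # index selects and orders the labels
print obj4
print

print 'Adding Series: matching labels are added, the rest become NaN, and the result index is the union'
print obj3 + obj4
print

print 'Name the Series and its index'
obj4.name = 'population'  # name of the Series
obj4.index.name = 'state'  # name of the row index
print obj4
print

print 'Replace the index'
obj.index = ['Bob', 'Steve', 'Jeff', 'Ryan']
print obj
```

Output:

```
Create a Series from an array
0    4
1    7
2   -5
3    3
dtype: int64
[ 4  7 -5  3]
RangeIndex(start=0, stop=4, step=1)

Specify the index of a Series
d    4
b    7
a   -5
c    3
dtype: int64
Index([u'd', u'b', u'a', u'c'], dtype='object')
-5
c      3
a     -5
d    100
dtype: int64
d    100
b      7
c      3
dtype: int64
True
False

Create a Series from a dict
Ohio      10000
Oregon    16000
Texas     20000
Utah       5000
dtype: int64

Create a Series from a dict with an explicit index: unmatched labels get NaN, keys not in the index are dropped
California        NaN
Ohio          10000.0
Oregon        16000.0
Texas         20000.0
dtype: float64

Adding Series: matching labels are added, the rest become NaN, and the result index is the union
California        NaN
Ohio          20000.0
Oregon        32000.0
Texas         40000.0
Utah              NaN
dtype: float64

Name the Series and its index
state
California        NaN
Ohio          10000.0
Oregon        16000.0
Texas         20000.0
Name: population, dtype: float64

Replace the index
Bob      4
Steve    7
Jeff    -5
Ryan     3
dtype: int64
```

### Data structures: DataFrame

- A DataFrame is a tabular data structure containing an ordered collection of columns, each of which can hold a different value type (numeric, string, boolean, and so on).
- A DataFrame has both a row index and a column index. It can be thought of as a dict of Series that share one index.
- Data that can be passed to the DataFrame constructor

The code covers:

- Creation
- Reading and writing

```python
# -*- coding: utf-8 -*-
import numpy as np
from pandas import Series, DataFrame

print 'Create a DataFrame from a dict; the keys become the column names.'
data = {'state':['Ohio', 'Ohio', 'Ohio', 'Nevada', 'Nevada'],  # dict keys become the column index
        'year':[2000, 2001, 2002, 2001, 2002],
        'pop':[1.5, 1.7, 3.6, 2.4, 2.9]}
print DataFrame(data)
print DataFrame(data, columns = ['year', 'state', 'pop'])  # column order (columns = columns, index = rows)
print

print 'Specify the index; a requested column that does not exist is filled with NaN.'
frame2 = DataFrame(data,
                   columns = ['year', 'state', 'pop', 'debt'],  # column index
                   index = ['one', 'two', 'three', 'four', 'five'])  # row index
print frame2
print frame2['state']  # the 'state' column
print frame2.year  # the 'year' column
print frame2.ix['three']  # ix selects by row label
frame2['debt'] = 16.5  # assign a whole column
print frame2
frame2.debt = np.arange(5)  # assign from a NumPy array
print frame2
print

print 'Assign a Series: values are matched by label, unspecified rows get NaN.'
val = Series([-1.2, -1.5, -1.7], index = ['two', 'four', 'five'])  # sets rows two, four, five; one and three become NaN
frame2['debt'] = val
print frame2
print

print 'Assign to a new column'
frame2['eastern'] = (frame2.state == 'Ohio')  # new column, True where state equals Ohio
print frame2
print frame2.columns
print

print 'Transpose a DataFrame'
pop = {'Nevada':{2001:12.4, 2002:2.9},
       'Ohio':{2000:1.5, 2001:1.7, 2002:3.6}}
frame3 = DataFrame(pop)  # build a DataFrame from a dict of dicts
print 'frame3'
print frame3
print frame3.T
print

print 'Specify the index order, and initialise from slices.'
print DataFrame(pop, index = [2001, 2002, 2003])
pdata = {'Ohio':frame3['Ohio'][:-1], 'Nevada':frame3['Nevada'][:2]}
print DataFrame(pdata)
print

print 'Name the index and the columns'
frame3.index.name = 'year'
frame3.columns.name = 'state'
print frame3
print frame3.values
print frame2.values
```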
Output:

```
Create a DataFrame from a dict; the keys become the column names.
   pop   state  year
0  1.5    Ohio  2000
1  1.7    Ohio  2001
2  3.6    Ohio  2002
3  2.4  Nevada  2001
4  2.9  Nevada  2002
   year   state  pop
0  2000    Ohio  1.5
1  2001    Ohio  1.7
2  2002    Ohio  3.6
3  2001  Nevada  2.4
4  2002  Nevada  2.9

Specify the index; a requested column that does not exist is filled with NaN.
       year   state  pop debt
one    2000    Ohio  1.5  NaN
two    2001    Ohio  1.7  NaN
three  2002    Ohio  3.6  NaN
four   2001  Nevada  2.4  NaN
five   2002  Nevada  2.9  NaN
one        Ohio
two        Ohio
three      Ohio
four     Nevada
five     Nevada
Name: state, dtype: object
one      2000
two      2001
three    2002
four     2001
five     2002
Name: year, dtype: int64
year     2002
state    Ohio
pop       3.6
debt      NaN
Name: three, dtype: object
       year   state  pop  debt
one    2000    Ohio  1.5  16.5
two    2001    Ohio  1.7  16.5
three  2002    Ohio  3.6  16.5
four   2001  Nevada  2.4  16.5
five   2002  Nevada  2.9  16.5
       year   state  pop  debt
one    2000    Ohio  1.5     0
two    2001    Ohio  1.7     1
three  2002    Ohio  3.6     2
four   2001  Nevada  2.4     3
five   2002  Nevada  2.9     4

Assign a Series: values are matched by label, unspecified rows get NaN.
       year   state  pop  debt
one    2000    Ohio  1.5   NaN
two    2001    Ohio  1.7  -1.2
three  2002    Ohio  3.6   NaN
four   2001  Nevada  2.4  -1.5
five   2002  Nevada  2.9  -1.7

Assign to a new column
       year   state  pop  debt  eastern
one    2000    Ohio  1.5   NaN     True
two    2001    Ohio  1.7  -1.2     True
three  2002    Ohio  3.6   NaN     True
four   2001  Nevada  2.4  -1.5    False
five   2002  Nevada  2.9  -1.7    False
Index([u'year', u'state', u'pop', u'debt', u'eastern'], dtype='object')

Transpose a DataFrame
frame3
      Nevada  Ohio
2000     NaN   1.5
2001    12.4   1.7
2002     2.9   3.6
        2000  2001  2002
Nevada   NaN  12.4   2.9
Ohio     1.5   1.7   3.6

Specify the index order, and initialise from slices.
      Nevada  Ohio
2001    12.4   1.7
2002     2.9   3.6
2003     NaN   NaN
      Nevada  Ohio
2000     NaN   1.5
2001    12.4   1.7

Name the index and the columns
state  Nevada  Ohio
year
2000      NaN   1.5
2001     12.4   1.7
2002      2.9   3.6
[[ nan  1.5]
 [12.4  1.7]
 [ 2.9  3.6]]
[[2000 'Ohio' 1.5 nan True]
 [2001 'Ohio' 1.7 -1.2 True]
 [2002 'Ohio' 3.6 nan True]
 [2001 'Nevada' 2.4 -1.5 False]
 [2002 'Nevada' 2.9 -1.7 False]]
```

The run also emits a warning for the use of `.ix`:

```
/Users/robot1/wfy/soft/anconda/anaconda2/lib/python2.7/site-packages/ipykernel_launcher.py:22: DeprecationWarning:
.ix is deprecated. Please use
.loc for label based indexing or
.iloc for positional indexing

See the documentation here:
http://pandas.pydata.org/pandas-docs/stable/indexing.html#ix-indexer-is-deprecated
```
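The deprecation warning above names `.loc` and `.iloc` as the replacements for `.ix`. The following is a minimal sketch, written for Python 3 and a current pandas release (an assumption; the tutorial itself runs on Python 2.7), of how the `frame2.ix['three']` lookup can be expressed with each of them. The variable names `row_by_label` and `row_by_position` are introduced here for illustration only.

```python
import pandas as pd

data = {"state": ["Ohio", "Ohio", "Ohio", "Nevada", "Nevada"],
        "year": [2000, 2001, 2002, 2001, 2002],
        "pop": [1.5, 1.7, 3.6, 2.4, 2.9]}
frame2 = pd.DataFrame(data,
                      columns=["year", "state", "pop", "debt"],
                      index=["one", "two", "three", "four", "five"])

# Label-based selection: plays the role of frame2.ix['three']
row_by_label = frame2.loc["three"]

# Position-based selection: the row labelled 'three' sits at position 2
row_by_position = frame2.iloc[2]

print(row_by_label)
print(row_by_position["state"], row_by_position["year"])
```

Keeping the two accessors separate removes the guesswork that `.ix` had to do when an index contained integers: `.loc` always means labels and `.iloc` always means positions.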
### Data structures: index objects

- pandas index objects manage axis labels and other metadata (such as axis names). Any array or other sequence of labels used when constructing a Series or DataFrame is converted to an Index.
- Index objects are immutable, so users cannot modify them. Immutability matters because it lets Index objects be shared safely among several data structures.
- The main Index classes in pandas
- Index methods and properties I
- Index methods and properties II

```python
# -*- coding: utf-8 -*-
import numpy as np
import pandas as pd
import sys
from pandas import Series, DataFrame, Index

print 'Get the index'
obj = Series(range(3), index = ['a', 'b', 'c'])
index = obj.index  # the row index of the Series
print index[1:]
try:
    index[1] = 'd'  # an Index is read-only, so assignment fails
except:
    print sys.exc_info()[0]
print

print 'Use an Index object'
index = Index(np.arange(3))  # build a row index
obj2 = Series([1.5, -2.5, 0], index = index)
print obj2
print obj2.index is index
print

print 'Check whether a column or index label exists'
pop = {'Nevada':{20001:2.4, 2002:2.9},  # key 20001 as in the original run (probably meant 2001)
       'Ohio':{2000:1.5, 2001:1.7, 2002:3.6}}
frame3 = DataFrame(pop)
print frame3
print 'Ohio' in frame3.columns  # is the label in the column index?
print 2003 in frame3.index  # is the label in the row index?
```

Output:

```
Get the index
Index([u'b', u'c'], dtype='object')
<type 'exceptions.TypeError'>

Use an Index object
0    1.5
1   -2.5
2    0.0
dtype: float64
True

Check whether a column or index label exists
       Nevada  Ohio
2000      NaN   1.5
2001      NaN   1.7
2002      2.9   3.6
20001     2.4   NaN
True
False
```

## Basic functionality

### Basic functionality: reindexing

- `reindex` creates a new object that conforms to a new index. Calling `reindex` on a Series rearranges the data according to the new index and introduces missing values for any label that is not currently present.
- For ordered data such as time series, some filling or interpolation may be needed when reindexing. The `method` option does this.
- Arguments of the `reindex` function

```python
# -*- coding: utf-8 -*-
import numpy as np
from pandas import DataFrame, Series

print 'Reindex with a new label order'
obj = Series([4.5, 7.2, -5.3, 3.6], index = ['d', 'b', 'a', 'c'])
print obj
obj2 = obj.reindex(['a', 'b', 'd', 'c', 'e'])  # missing labels are filled with NaN by default
print obj2
print obj.reindex(['a', 'b', 'd', 'c', 'e'], fill_value = 0)  # fill value for missing labels
print

print 'Reindex with a fill method'
obj3 = Series(['blue', 'purple', 'yellow'], index = [0, 2, 4])
print obj3
print obj3.reindex(range(6), method = 'ffill')  # fill from the previous value
print

print 'Reindex a DataFrame'
frame = DataFrame(np.arange(9).reshape(3, 3),
                  index = ['a', 'c', 'd'],
                  columns = ['Ohio', 'Texas', 'California'])
print frame
frame2 = frame.reindex(['a', 'b', 'c', 'd'])  # rows are reindexed by default
print frame2
print

print 'Reindex the columns'
states = ['Texas', 'Utah', 'California']
print frame.reindex(columns = states)  # set the column order
print frame

print 'Reindex a DataFrame with a fill method'
print frame.reindex(index = ['a', 'b', 'c', 'd'],
                    method = 'ffill')
#                   columns = states)
print frame.ix[['a', 'b', 'd', 'c'], states]  # ix with row labels and column labels
```

Output:

```
Reindex with a new label order
d    4.5
b    7.2
a   -5.3
c    3.6
dtype: float64
a   -5.3
b    7.2
d    4.5
c    3.6
e    NaN
dtype: float64
a   -5.3
b    7.2
d    4.5
c    3.6
e    0.0
dtype: float64

Reindex with a fill method
0      blue
2    purple
4    yellow
dtype: object
0      blue
1      blue
2    purple
3    purple
4    yellow
5    yellow
dtype: object

Reindex a DataFrame
   Ohio  Texas  California
a     0      1           2
c     3      4           5
d     6      7           8
   Ohio  Texas  California
a   0.0    1.0         2.0
b   NaN    NaN         NaN
c   3.0    4.0         5.0
d   6.0    7.0         8.0

Reindex the columns
   Texas  Utah  California
a      1   NaN           2
c      4   NaN           5
d      7   NaN           8
   Ohio  Texas  California
a     0      1           2
c     3      4           5
d     6      7           8
Reindex a DataFrame with a fill method
   Ohio  Texas  California
a     0      1           2
b     0      1           2
c     3      4           5
d     6      7           8
   Texas  Utah  California
a    1.0   NaN         2.0
b    NaN   NaN         NaN
d    7.0   NaN         8.0
c    4.0   NaN         5.0
```
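The two ways of handling labels that `reindex` introduces, a fill method and a constant fill value, can be compared side by side. This is a small sketch in Python 3 syntax (an assumption; the tutorial code is Python 2), reusing the `obj3` Series from the text. The placeholder string `"unknown"` is made up for the example.

```python
import pandas as pd

obj3 = pd.Series(["blue", "purple", "yellow"], index=[0, 2, 4])

# Forward fill: each new label takes the value of the nearest earlier label
filled = obj3.reindex(range(6), method="ffill")

# Constant fill: each new label takes the given placeholder instead of NaN
constant = obj3.reindex(range(6), fill_value="unknown")

print(filled.tolist())
print(constant.tolist())
```

Forward filling only makes sense when the index is ordered, which is why the text associates it with time series.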
### Basic functionality: dropping entries from an axis

- Dropping one or more entries from an axis is easy if you have an index array or list. Because it involves some data munging and set logic, the `drop` method returns a new object with the specified values removed from the specified axis.

```python
# -*- coding: utf-8 -*-
import numpy as np
from pandas import Series, DataFrame

# print 'Drop Series elements by index label'
# obj = Series(np.arange(5.), index = ['a', 'b', 'c', 'd', 'e'])
# new_obj = obj.drop('c')  # drop one row by its label
# print new_obj
# obj = obj.drop(['d', 'c'])
# print obj
# print

print 'Drop DataFrame entries; rows or columns can be specified.'
data = DataFrame(np.arange(16).reshape((4, 4)),
                 index = ['Ohio', 'Colorado', 'Utah', 'New York'],
                 columns = ['one', 'two', 'three', 'four'])
print data
print data.drop(['Colorado', 'Ohio'])
print data.drop('two', axis = 1)  # axis = 1 drops columns
print data.drop(['two', 'four'], axis = 1)
```

Output:

```
Drop DataFrame entries; rows or columns can be specified.
          one  two  three  four
Ohio        0    1      2     3
Colorado    4    5      6     7
Utah        8    9     10    11
New York   12   13     14    15
          one  two  three  four
Utah        8    9     10    11
New York   12   13     14    15
          one  three  four
Ohio        0      2     3
Colorado    4      6     7
Utah        8     10    11
New York   12     14    15
          one  three
Ohio        0      2
Colorado    4      6
Utah        8     10
New York   12     14
```

### Basic functionality: indexing, selection, and filtering

- Series indexing (`obj[...]`) works like NumPy array indexing, except that the index values need not be integers.
- Slicing with labels differs from ordinary Python slicing: the endpoint is inclusive, so the interval is closed.
- Indexing a DataFrame retrieves one or more columns.
- To index the rows of a DataFrame by label, the special indexing field `ix` is provided.
- DataFrame indexing options

The code covers list indexing, slice indexing, row/column indexing, and conditional indexing.

```python
# -*- coding: utf-8 -*-
import numpy as np
from pandas import Series, DataFrame

print 'Series indexing; default integer positions also work.'
obj = Series(np.arange(4.), index = ['a', 'b', 'c', 'd'])
print obj['b']
print obj[3]
print obj[[1, 3]]  # a list selects several elements: obj[1] and obj[3]
print obj[obj < 2]  # elements less than 2
print

print 'Series slicing'
print obj['b':'d']  # closed interval [b, d]
obj['b':'c'] = 5
print obj
print

print 'DataFrame indexing'
data = DataFrame(np.arange(16).reshape((4, 4)),
                 index = ['Ohio', 'Colorado', 'Utah', 'New York'],
                 columns = ['one', 'two', 'three', 'four'])
print data
print data['two']  # subscripting selects a column by default
print data[['three', 'one']]  # a list selects several columns
print data[:2]
print data.ix['Colorado', ['two', 'three']]  # row label and column labels through ix
print data.ix[['Colorado', 'Utah'], [3, 0, 1]]
print data.ix[2]  # row 2, counting from 0
print data.ix[:'Utah', 'two']  # rows from the start through Utah, column 'two'
print

print 'Select by condition'
print data[data.three > 5]
print data < 5  # prints True or False for each element
data[data < 5] = 0
print data
```

### Basic functionality: arithmetic and data alignment

- Arithmetic between objects with different indexes
- Automatic data alignment introduces NA values where the indexes do not overlap, and missing values propagate through arithmetic.
- For a DataFrame, alignment happens on both rows and columns.
- The `fill_value` argument
- Operations between a DataFrame and a Series

```python
# -*- coding: utf-8 -*-
import numpy as np
from pandas import Series, DataFrame

print 'Addition'
s1 = Series([7.3, -2.5, 3.4, 1.5], index = ['a', 'c', 'd', 'e'])
s2 = Series([-2.1, 3.6, -1.5, 4, 3.1], index = ['a', 'c', 'e', 'f', 'g'])
print s1
print s2
print s1 + s2  # matching labels are added, the rest become NaN; the result index is the union
print

print 'DataFrame addition; both index and columns must match.'
df1 = DataFrame(np.arange(9.).reshape((3, 3)),
                columns = list('bcd'),
                index = ['Ohio', 'Texas', 'Colorado'])
df2 = DataFrame(np.arange(12).reshape((4, 3)),
                columns = list('bde'),
                index = ['Utah', 'Ohio', 'Texas', 'Oregon'])
print df1
print df2
print df1 + df2  # aligns on rows and columns; cells without a match on both sides become NaN
print

print 'Filling values'
df1 = DataFrame(np.arange(12.).reshape((3, 4)), columns = list('abcd'))
df2 = DataFrame(np.arange(20.).reshape((4, 5)), columns = list('abcde'))
print df1
print df2
print 'df1 + df2'
print df1 + df2
print df1.add(df2, fill_value = 0)  # add() with fill_value gives a different result from +
print df1.reindex(columns = df2.columns, fill_value = 0)  # take df2's columns; missing ones are filled with 0
print

print 'Operations between DataFrame and Series'
arr = np.arange(12.).reshape((3, 4))
print arr
print arr[0]
print arr - arr[0]
frame = DataFrame(np.arange(12).reshape((4, 3)),
                  columns = list('bde'),
                  index = ['Utah', 'Ohio', 'Texas', 'Oregon'])
series = frame.ix[0]
print frame
print series
print frame - series  # the Series is treated as a single row and subtracted from every row
series2 = Series(range(3), index = list('bef'))
print frame + series2
series3 = frame['d']
print frame.sub(series3, axis = 0)  # subtract column-wise, matching on the row index
```

Output:

```
Addition
a    7.3
c   -2.5
d    3.4
e    1.5
dtype: float64
a   -2.1
c    3.6
e   -1.5
f    4.0
g    3.1
dtype: float64
a    5.2
c    1.1
d    NaN
e    0.0
f    NaN
g    NaN
dtype: float64

DataFrame addition; both index and columns must match.
            b    c    d
Ohio      0.0  1.0  2.0
Texas     3.0  4.0  5.0
Colorado  6.0  7.0  8.0
        b   d   e
Utah    0   1   2
Ohio    3   4   5
Texas   6   7   8
Oregon  9  10  11
            b   c     d   e
Colorado  NaN NaN   NaN NaN
Ohio      3.0 NaN   6.0 NaN
Oregon    NaN NaN   NaN NaN
Texas     9.0 NaN  12.0 NaN
Utah      NaN NaN   NaN NaN

Filling values
     a    b     c     d
0  0.0  1.0   2.0   3.0
1  4.0  5.0   6.0   7.0
2  8.0  9.0  10.0  11.0
      a     b     c     d     e
0   0.0   1.0   2.0   3.0   4.0
1   5.0   6.0   7.0   8.0   9.0
2  10.0  11.0  12.0  13.0  14.0
3  15.0  16.0  17.0  18.0  19.0
df1 + df2
      a     b     c     d   e
0   0.0   2.0   4.0   6.0 NaN
1   9.0  11.0  13.0  15.0 NaN
2  18.0  20.0  22.0  24.0 NaN
3   NaN   NaN   NaN   NaN NaN
      a     b     c     d     e
0   0.0   2.0   4.0   6.0   4.0
1   9.0  11.0  13.0  15.0   9.0
2  18.0  20.0  22.0  24.0  14.0
3  15.0  16.0  17.0  18.0  19.0
     a    b     c     d  e
0  0.0  1.0   2.0   3.0  0
1  4.0  5.0   6.0   7.0  0
2  8.0  9.0  10.0  11.0  0

Operations between DataFrame and Series
[[ 0.  1.  2.  3.]
 [ 4.  5.  6.  7.]
 [ 8.  9. 10. 11.]]
[0. 1. 2. 3.]
[[0. 0. 0. 0.]
 [4. 4. 4. 4.]
 [8. 8. 8. 8.]]
        b   d   e
Utah    0   1   2
Ohio    3   4   5
Texas   6   7   8
Oregon  9  10  11
b    0
d    1
e    2
Name: Utah, dtype: int64
        b  d  e
Utah    0  0  0
Ohio    3  3  3
Texas   6  6  6
Oregon  9  9  9
          b   d     e   f
Utah    0.0 NaN   3.0 NaN
Ohio    3.0 NaN   6.0 NaN
Texas   6.0 NaN   9.0 NaN
Oregon  9.0 NaN  12.0 NaN
        b  d  e
Utah   -1  0  1
Ohio   -1  0  1
Texas  -1  0  1
Oregon -1  0  1
```
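The difference between the `+` operator and `add(..., fill_value=0)` is easiest to see on the two small Series from the addition example. The sketch below uses Python 3 syntax (an assumption; the tutorial code is Python 2); the names `plain` and `filled` are introduced here for illustration.

```python
import pandas as pd

s1 = pd.Series([7.3, -2.5, 3.4, 1.5], index=["a", "c", "d", "e"])
s2 = pd.Series([-2.1, 3.6, -1.5, 4, 3.1], index=["a", "c", "e", "f", "g"])

# Operator form: a label missing on either side produces NaN
plain = s1 + s2

# Method form: a label missing on one side is treated as 0 before adding
filled = s1.add(s2, fill_value=0)

print(plain)
print(filled)
```

In both cases the result index is the union of the two indexes; only the treatment of one-sided labels changes.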
### Basic functionality: function application and mapping

- NumPy ufuncs (element-wise array methods)
- The DataFrame `apply` method
- The `applymap` method (so named because Series already has a `map` method for element-wise application)
- Every element-wise NumPy function can be applied to a pandas DataFrame

```python
# -*- coding: utf-8 -*-
import numpy as np
from pandas import Series, DataFrame

print 'Functions'
frame = DataFrame(np.random.randn(4, 3),
                  columns = list('bde'),
                  index = ['Utah', 'Ohio', 'Texas', 'Oregon'])
print frame
print np.abs(frame)  # absolute value of every element
print

print 'lambda and apply'
f = lambda x: x.max() - x.min()
print frame.apply(f)  # applied to each column by default
print frame.apply(f, axis = 1)  # applied to each row

def f(x):
    return Series([x.min(), x.max()], index = ['min', 'max'])
print frame.apply(f)
print

print 'applymap and map'
_format = lambda x: '%.2f' % x
print frame.applymap(_format)
print frame['e'].map(_format)
```

Output:

```
Functions
               b         d         e
Utah   -0.188935  0.298682  1.692648
Ohio   -0.666434 -0.102262 -0.172966
Texas  -1.103831 -1.324074 -1.024516
Oregon  1.354406 -0.564374 -0.967438
               b         d         e
Utah    0.188935  0.298682  1.692648
Ohio    0.666434  0.102262  0.172966
Texas   1.103831  1.324074  1.024516
Oregon  1.354406  0.564374  0.967438

lambda and apply
b    2.458237
d    1.622756
e    2.717164
dtype: float64
Utah      1.881583
Ohio      0.564172
Texas     0.299558
Oregon    2.321844
dtype: float64
            b         d         e
min -1.103831 -1.324074 -1.024516
max  1.354406  0.298682  1.692648

applymap and map
            b      d      e
Utah    -0.19   0.30   1.69
Ohio    -0.67  -0.10  -0.17
Texas   -1.10  -1.32  -1.02
Oregon   1.35  -0.56  -0.97
Utah       1.69
Ohio      -0.17
Texas     -1.02
Oregon    -0.97
Name: e, dtype: object
```

### Basic functionality: sorting and ranking

- Sort by the row or column index
- For a DataFrame, sort by the index on either axis
- Ascending or descending order can be specified
- Sort by value
- For a DataFrame, the columns to sort by can be specified
- The `rank` function

```python
# -*- coding: utf-8 -*-
import numpy as np
from pandas import Series, DataFrame

print 'Sort by index; for a DataFrame the axis can be specified.'
obj = Series(range(4), index = ['d', 'a', 'b', 'c'])
print obj.sort_index()  # sort by index label
frame = DataFrame(np.arange(8).reshape((2, 4)),
                  index = ['three', 'one'],
                  columns = list('dabc'))
print frame.sort_index()  # sorts the row index by default
print frame.sort_index(axis = 1)  # sort the column index
print frame.sort_index(axis = 1, ascending = False)  # descending
print

print 'Sort by value'
obj = Series([4, 7, -3, 2])
print obj.sort_values()  # order() is obsolete
print

print 'Sort a DataFrame by column'
frame = DataFrame({'b':[4, 7, -3, 2], 'a':[0, 1, 0, 1]})
print frame
print frame.sort_values(by = 'b')  # sort_index(by = ...) is obsolete
print frame.sort_values(by = ['a', 'b'])
print

print 'rank: average rank for ties (starting from 1)'
obj = Series([7, -5, 7, 4, 2, 0, 4])
# sorted positions: -5(1), 0(2), 2(3), 4(4), 4(5), 7(6), 7(7)
print obj.rank()
print obj.rank(method = 'first')  # ties ranked in order of appearance, no averaging
print obj.rank(ascending = False, method = 'max')  # descending, ties take the largest rank, so -5 has rank 7
frame = DataFrame({'b':[4.3, 7, -3, 2],
                   'a':[0, 1, 0, 1],
                   'c':[-2, 5, 8, -2.5]})
print frame
print frame.rank(axis = 1)
```

Output:

```
Sort by index; for a DataFrame the axis can be specified.
a    1
b    2
c    3
d    0
dtype: int64
       d  a  b  c
one    4  5  6  7
three  0  1  2  3
       a  b  c  d
three  1  2  3  0
one    5  6  7  4
       d  c  b  a
three  0  3  2  1
one    4  7  6  5

Sort by value
2   -3
3    2
0    4
1    7
dtype: int64

Sort a DataFrame by column
   a  b
0  0  4
1  1  7
2  0 -3
3  1  2
   a  b
2  0 -3
3  1  2
0  0  4
1  1  7
   a  b
2  0 -3
0  0  4
3  1  2
1  1  7

rank: average rank for ties (starting from 1)
0    6.5
1    1.0
2    6.5
3    4.5
4    3.0
5    2.0
6    4.5
dtype: float64
0    6.0
1    1.0
2    7.0
3    4.0
4    3.0
5    2.0
6    5.0
dtype: float64
0    2.0
1    7.0
2    2.0
3    4.0
4    5.0
5    6.0
6    4.0
dtype: float64
   a    b    c
0  0  4.3 -2.0
1  1  7.0  5.0
2  0 -3.0  8.0
3  1  2.0 -2.5
     a    b    c
0  2.0  3.0  1.0
1  1.0  3.0  2.0
2  2.0  1.0  3.0
3  2.0  3.0  1.0
```

### Basic functionality: indexes with duplicate values

- Indexing a label that occurs more than once returns a Series; indexing a label that occurs once returns a scalar.

```python
# -*- coding: utf-8 -*-
import numpy as np
from pandas import Series, DataFrame

print 'Duplicate index labels'
obj = Series(range(5), index = ['a', 'a', 'b', 'b', 'c'])
print obj
print obj.index.is_unique  # whether the index labels are unique
print obj['a'][0], obj.a[1]
df = DataFrame(np.random.randn(4, 3), index = ['a', 'a', 'b', 'b'])
print df
print df.ix['b'].ix[0]
print df.ix['b'].ix[1]
```

Output:

```
Duplicate index labels
a    0
a    1
b    2
b    3
c    4
dtype: int64
False
0 1
          0         1         2
a  1.166285  0.600093  1.043009
a  0.791440  0.764078  1.136826
b -1.624025 -0.384034  1.255976
b  0.164236 -0.181083  0.131282
0   -1.624025
1   -0.384034
2    1.255976
Name: b, dtype: float64
0    0.164236
1   -0.181083
2    0.131282
Name: b, dtype: float64
```
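The rule for duplicate labels (a repeated label gives back a Series, a unique label gives back a scalar) can be checked directly. This is a minimal sketch in Python 3 syntax using `.loc` and `.iloc` in place of the deprecated `.ix`; both the Python version and the variable names are assumptions made for the example.

```python
import pandas as pd

obj = pd.Series(range(5), index=["a", "a", "b", "b", "c"])

dupes = obj.loc["a"]    # label occurs twice, so a Series comes back
single = obj.loc["c"]   # label occurs once, so a scalar comes back

print(obj.index.is_unique)
print(type(dupes).__name__, dupes.tolist())
print(single)

# Chaining a positional lookup picks one row out of the duplicates
first_b = obj.loc["b"].iloc[0]
print(first_b)
```

Because the return type depends on the data, code that must work with either case usually checks `index.is_unique` first or always selects with a list, for example `obj.loc[["c"]]`, which returns a Series in both situations.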
## Summarizing and computing descriptive statistics

### Summarizing and computing descriptive statistics

- Common method options
- Common descriptive and summary statistics functions I
- Common descriptive and summary statistics functions II
- Differences between numeric and non-numeric data
- NA values are excluded automatically unless the `skipna` option says otherwise

```python
# -*- coding: utf-8 -*-
import numpy as np
from pandas import Series, DataFrame

print 'Sum'
df = DataFrame([[1.4, np.nan], [7.1, -4.5], [np.nan, np.nan], [0.75, -1.3]],
               index = ['a', 'b', 'c', 'd'],
               columns = ['one', 'two'])
print df
print df.sum()  # column sums, the default
print df.sum(axis = 1)  # row sums, selected with the axis keyword
print

print 'Mean'
print df.mean(axis = 1, skipna = False)  # row means without skipping NaN
print df.mean(axis = 1)  # NaN is skipped by default
print

print 'Other'
print df.idxmax()  # per column by default
print df.idxmax(axis = 1)  # per row
print df.cumsum()  # per column by default
print df.describe()  # per column by default
obj = Series(['a', 'a', 'b', 'c'] * 4)
print obj
print obj.describe()
```

Output:

```
Sum
    one  two
a  1.40  NaN
b  7.10 -4.5
c   NaN  NaN
d  0.75 -1.3
one    9.25
two   -5.80
dtype: float64
a    1.40
b    2.60
c    0.00
d   -0.55
dtype: float64

Mean
a      NaN
b    1.300
c      NaN
d   -0.275
dtype: float64
a    1.400
b    1.300
c      NaN
d   -0.275
dtype: float64

Other
one    b
two    d
dtype: object
a    one
b    one
c    NaN
d    one
dtype: object
    one  two
a  1.40  NaN
b  8.50 -4.5
c   NaN  NaN
d  9.25 -5.8
            one       two
count  3.000000  2.000000
mean   3.083333 -2.900000
std    3.493685  2.262742
min    0.750000 -4.500000
25%    1.075000 -3.700000
50%    1.400000 -2.900000
75%    4.250000 -2.100000
max    7.100000 -1.300000
0     a
1     a
2     b
3     c
4     a
5     a
6     b
7     c
8     a
9     a
10    b
11    c
12    a
13    a
14    b
15    c
dtype: object
count     16
unique     3
top        a
freq       8
dtype: object
```

### Summarizing and computing descriptive statistics: correlation and covariance

- Correlation coefficient: a statistic that reflects how closely variables are related. (Baidu Baike)
- Covariance: intuitively, covariance is the expectation of the joint deviation of two variables from their means. If the two variables tend to move together, that is, when one is above its expected value the other is also above its own, the covariance is positive. If they tend to move in opposite directions, that is, when one is above its expected value the other is below its own, the covariance is negative.

```python
# -*- coding: utf-8 -*-
import numpy as np
# from pandas_datareader import data , web
import pandas.io.data as web
from pandas import DataFrame

print 'Correlation and covariance'
# Covariance: https://zh.wikipedia.org/wiki/%E5%8D%8F%E6%96%B9%E5%B7%AE
all_data = {}
for ticker in ['AAPL', 'IBM', 'MSFT', 'GOOG']:
    all_data[ticker] = web.get_data_yahoo(ticker, '4/1/2016', '7/15/2015')

price = DataFrame({tic: data['Adj Close'] for tic, data in all_data.iteritems()})
volume = DataFrame({tic: data['Volume'] for tic, data in all_data.iteritems()})
returns = price.pct_change()
print returns.tail()
print returns.MSFT.corr(returns.IBM)
print returns.corr()  # correlation; a column's correlation with itself is always 1
print returns.cov()  # covariance
print returns.corrwith(returns.IBM)
print returns.corrwith(volume)
```

This cell does not run in the environment used here; the import fails:

```
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-61-a72f5c63b2a8> in <module>()
      3 import numpy as np
      4 # from pandas_datareader import data , web
----> 5 import pandas.io.data as web
      6 from pandas import DataFrame
      7

/Users/robot1/wfy/soft/anconda/anaconda2/lib/python2.7/site-packages/pandas/io/data.py in <module>()
      1 raise ImportError(
----> 2     "The pandas.io.data module is moved to a separate package "
      3     "(pandas-datareader). After installing the pandas-datareader package "
      4     "(https://github.com/pydata/pandas-datareader), you can change "
      5     "the import ``from pandas.io import data, wb`` to "

ImportError: The pandas.io.data module is moved to a separate package (pandas-datareader). After installing the pandas-datareader package (https://github.com/pydata/pandas-datareader), you can change the import ``from pandas.io import data, wb`` to ``from pandas_datareader import data, wb``.
```

### Summarizing and computing descriptive statistics: unique values and membership

- Common methods

```python
# -*- coding: utf-8 -*-
import numpy as np
import pandas as pd
from pandas import Series, DataFrame

print 'Unique values'
obj = Series(['c', 'a', 'd', 'a', 'a', 'b', 'b', 'c', 'c'])
print obj
print obj.unique()  # the distinct values
print obj.value_counts()  # how often each value occurs
print

print 'Membership test'
mask = obj.isin(['b', 'c'])
print mask
print obj[mask]  # only the elements equal to b or c
data = DataFrame({'Qu1':[1, 3, 4, 3, 4],
                  'Qu2':[2, 3, 1, 2, 3],
                  'Qu3':[1, 5, 2, 4, 4]})
print data
print data.apply(pd.value_counts).fillna(0)
print data.apply(pd.value_counts, axis = 1).fillna(0)
```

Output:

```
Unique values
0    c
1    a
2    d
3    a
4    a
5    b
6    b
7    c
8    c
dtype: object
['c' 'a' 'd' 'b']
c    3
a    3
b    2
d    1
dtype: int64

Membership test
0     True
1    False
2    False
3    False
4    False
5     True
6     True
7     True
8     True
dtype: bool
0    c
5    b
6    b
7    c
8    c
dtype: object
   Qu1  Qu2  Qu3
0    1    2    1
1    3    3    5
2    4    1    2
3    3    2    4
4    4    3    4
   Qu1  Qu2  Qu3
1  1.0  1.0  1.0
2  0.0  2.0  1.0
3  2.0  2.0  0.0
4  2.0  0.0  2.0
5  0.0  0.0  1.0
     1    2    3    4    5
0  2.0  1.0  0.0  0.0  0.0
1  0.0  0.0  2.0  0.0  1.0
2  1.0  1.0  0.0  1.0  0.0
3  0.0  1.0  1.0  1.0  0.0
4  0.0  0.0  1.0  2.0  0.0
```

## Handling missing data

### Handling missing data

- NA handling methods
- NaN (Not a Number) represents missing data in both floating-point and non-floating-point arrays
- None is also treated as NA

```python
# -*- coding: utf-8 -*-
import numpy as np
from pandas import Series

print 'Values treated as null'
string_data = Series(['aardvark', 'artichoke', np.nan, 'avocado'])
print string_data
print string_data.isnull()  # test for missing values
string_data[0] = None
print string_data.isnull()
```

Output:

```
Values treated as null
0     aardvark
1    artichoke
2          NaN
3      avocado
dtype: object
0    False
1    False
2     True
3    False
dtype: bool
0     True
1    False
2     True
3    False
dtype: bool
```

### Handling missing data: filtering out missing data

- `dropna`
- Boolean indexing
- For a DataFrame, `dropna` by default drops every row that contains a missing value
- The `how` argument controls the behavior, `axis` selects the axis, and `thresh` sets how many non-NA values a row needs in order to be kept

```python
# -*- coding: utf-8 -*-
import numpy as np
from numpy import nan as NA
from pandas import Series, DataFrame

# print 'Drop NA'
# data = Series([1, NA, 3.5, NA, 7 , None])
# print data.dropna()  # remove NA values from the Series
# print data[data.notnull()]
# print

print 'How a DataFrame drops NA'
data = DataFrame([[1., 6.5, 3.], [1., NA, NA],
                  [NA, NA, NA], [NA, 6.5, 3.]])
print data
print data.dropna()  # by default a row with any NA is dropped
print data.dropna(how = 'all')  # drop a row only if every value is NA
data[4] = NA  # add a new column
print data.dropna(axis = 1, how = 'all')  # rows by default; axis = 1 works on columns
data = DataFrame(np.random.randn(7, 3))
data.ix[:4, 1] = NA
data.ix[:2, 2] = NA
print data
print data.dropna(thresh = 2)  # keep rows with at least 2 non-NA values
```

Output:

```
How a DataFrame drops NA
     0    1    2
0  1.0  6.5  3.0
1  1.0  NaN  NaN
2  NaN  NaN  NaN
3  NaN  6.5  3.0
     0    1    2
0  1.0  6.5  3.0
     0    1    2
0  1.0  6.5  3.0
1  1.0  NaN  NaN
3  NaN  6.5  3.0
     0    1    2
0  1.0  6.5  3.0
1  1.0  NaN  NaN
2  NaN  NaN  NaN
3  NaN  6.5  3.0
          0         1         2
0 -0.181398       NaN       NaN
1 -1.153083       NaN       NaN
2 -0.072996       NaN       NaN
3  0.783739       NaN  0.324288
4 -1.277365       NaN -1.683068
5  2.305280  0.082071  0.175902
6 -0.167521 -0.043577 -0.959134
          0         1         2
3  0.783739       NaN  0.324288
4 -1.277365       NaN -1.683068
5  2.305280  0.082071  0.175902
6 -0.167521 -0.043577 -0.959134
```
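The three `dropna` controls (the default, `how`, and `thresh`) differ only in how many missing values a row may contain before it is removed. The sketch below, in Python 3 syntax (an assumption; the tutorial code is Python 2), applies each to the fixed four-row frame from the text and prints which row labels survive.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame([[1.0, 6.5, 3.0],
                     [1.0, np.nan, np.nan],
                     [np.nan, np.nan, np.nan],
                     [np.nan, 6.5, 3.0]])

any_na = data.dropna()            # drop a row if any value is missing
all_na = data.dropna(how="all")   # drop a row only if every value is missing
two_ok = data.dropna(thresh=2)    # keep a row only if it has at least 2 non-missing values

print(any_na.index.tolist())
print(all_na.index.tolist())
print(two_ok.index.tolist())
```

Row 1 has a single valid value, so it survives `how="all"` but not `thresh=2`; that is the case that separates the two options.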
### Handling missing data: filling in missing data

- `fillna`
- The `inplace` argument controls whether a new object is returned or the existing one is modified in place

```python
# -*- coding: utf-8 -*-
import numpy as np
from numpy import nan as NA
import pandas as pd
from pandas import Series, DataFrame, Index

print 'Fill with 0'
df = DataFrame(np.random.randn(7, 3))
print df
df.ix[:4, 1] = NA
df.ix[:2, 2] = NA
print df
print df.fillna(0)
df.fillna(0, inplace = False)  # leaves the original object unchanged
df.fillna(0, inplace = True)  # modifies the original object
print df
print

print 'Fill different columns with different values'
print df.fillna({1:0.5, 3:-1})  # column 3 does not exist
print

print 'Different fill methods'
df = DataFrame(np.random.randn(6, 3))
df.ix[2:, 1] = NA
df.ix[4:, 2] = NA
print df
print df.fillna(method = 'ffill')
print df.fillna(method = 'ffill', limit = 2)
print

print 'Fill with a statistic'
data = Series([1., NA, 3.5, NA, 7])
print data.fillna(data.mean())
```

Output:

```
Fill with 0
          0         1         2
0 -0.747530  0.733795  0.207921
1  0.329993 -0.092622 -0.274532
2 -0.498705  1.097721 -0.248666
3 -1.072368  1.281738  1.143063
4 -0.838184 -1.229197 -1.588577
5  0.386622 -1.056740  0.120941
6 -0.104685  0.062590 -0.682652
          0        1         2
0 -0.747530      NaN       NaN
1  0.329993      NaN       NaN
2 -0.498705      NaN       NaN
3 -1.072368      NaN  1.143063
4 -0.838184      NaN -1.588577
5  0.386622 -1.05674  0.120941
6 -0.104685  0.06259 -0.682652
          0        1         2
0 -0.747530  0.00000  0.000000
1  0.329993  0.00000  0.000000
2 -0.498705  0.00000  0.000000
3 -1.072368  0.00000  1.143063
4 -0.838184  0.00000 -1.588577
5  0.386622 -1.05674  0.120941
6 -0.104685  0.06259 -0.682652
          0        1         2
0 -0.747530  0.00000  0.000000
1  0.329993  0.00000  0.000000
2 -0.498705  0.00000  0.000000
3 -1.072368  0.00000  1.143063
4 -0.838184  0.00000 -1.588577
5  0.386622 -1.05674  0.120941
6 -0.104685  0.06259 -0.682652

Fill different columns with different values
          0        1         2
0 -0.747530  0.00000  0.000000
1  0.329993  0.00000  0.000000
2 -0.498705  0.00000  0.000000
3 -1.072368  0.00000  1.143063
4 -0.838184  0.00000 -1.588577
5  0.386622 -1.05674  0.120941
6 -0.104685  0.06259 -0.682652

Different fill methods
          0         1         2
0  0.037005 -0.554357 -0.968951
1  0.600986 -0.564576 -0.718096
2  1.268549       NaN  1.006229
3  0.813411       NaN  0.451489
4  0.097840       NaN       NaN
5 -1.944482       NaN       NaN
          0         1         2
0  0.037005 -0.554357 -0.968951
1  0.600986 -0.564576 -0.718096
2  1.268549 -0.564576  1.006229
3  0.813411 -0.564576  0.451489
4  0.097840 -0.564576  0.451489
5 -1.944482 -0.564576  0.451489
          0         1         2
0  0.037005 -0.554357 -0.968951
1  0.600986 -0.564576 -0.718096
2  1.268549 -0.564576  1.006229
3  0.813411 -0.564576  0.451489
4  0.097840       NaN  0.451489
5 -1.944482       NaN  0.451489

Fill with a statistic
0    1.000000
1    3.833333
2    3.500000
3    3.833333
4    7.000000
dtype: float64
```
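The effect of `limit` on forward filling is hard to read off random data, so here is a sketch on a short fixed Series. It is written in Python 3 syntax and uses the `ffill()` method rather than `fillna(method='ffill')`, on the assumption that a recent pandas release is installed (recent releases deprecate the `method` keyword of `fillna`). The Series values are made up for the example.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, np.nan, np.nan, 5.0])

forward = s.ffill()            # carry the last valid value forward through every gap
limited = s.ffill(limit=2)     # fill at most 2 consecutive missing values
by_mean = s.fillna(s.mean())   # replace gaps with the mean of the valid values

print(forward.tolist())
print(limited.tolist())
print(by_mean.tolist())
```

With `limit=2` the third missing value in the run stays NaN, which matches the behaviour of column 1 in the "Different fill methods" output of the tutorial.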
## Hierarchical indexing

- Hierarchical indexing lets you have several (two or more) index levels on a single axis. Put abstractly, it lets you work with higher-dimensional data in a lower-dimensional form.
- A DataFrame can be reshaped with `stack` and `unstack`.

```python
# -*- coding: utf-8 -*-
import numpy as np
from pandas import Series, DataFrame, MultiIndex

# print 'Hierarchical index on a Series'
# data = Series(np.random.randn(10),
#               index = [['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'd', 'd'],
#                        [1, 2, 3, 1, 2, 3, 1, 2, 2, 3]])
# print data
# print data.index
# print data.b
# print data['b':'c']
# print data[:2]
# print data.unstack()
# print data.unstack().stack()
# print

print 'Hierarchical index on a DataFrame'
frame = DataFrame(np.arange(12).reshape((4, 3)),
                  index = [['a', 'a', 'b', 'b'], [1, 2, 1, 2]],
                  columns = [['Ohio', 'Ohio', 'Colorado'], ['Green', 'Red', 'Green']])
print frame
frame.index.names = ['key1', 'key2']
frame.columns.names = ['state', 'color']
print frame
print frame.ix['a', 1]
print frame.ix['a', 2]['Colorado']
print frame.ix['a', 2]['Ohio']['Red']
print

print 'Create a hierarchical index directly with MultiIndex'
# 'Gree' is spelled as in the original run (probably meant 'Green')
print MultiIndex.from_arrays([['Ohio', 'Ohio', 'Colorado'], ['Gree', 'Red', 'Green']],
                             names = ['state', 'color'])
```

Output:

```
Hierarchical index on a DataFrame
     Ohio     Colorado
    Green Red    Green
a 1     0   1        2
  2     3   4        5
b 1     6   7        8
  2     9  10       11
state      Ohio     Colorado
color     Green Red    Green
key1 key2
a    1        0   1        2
     2        3   4        5
b    1        6   7        8
     2        9  10       11
state     color
Ohio      Green    0
          Red      1
Colorado  Green    2
Name: (a, 1), dtype: int64
color
Green    5
Name: (a, 2), dtype: int64
4

Create a hierarchical index directly with MultiIndex
MultiIndex(levels=[[u'Colorado', u'Ohio'], [u'Gree', u'Green', u'Red']],
           labels=[[1, 1, 0], [0, 2, 1]],
           names=[u'state', u'color'])
```
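Selection on a hierarchical index can also be written with `.loc` and tuples instead of chained `.ix` calls. The following sketch rebuilds the same frame in Python 3 syntax (an assumption; the tutorial code is Python 2). The names `row`, `cell`, and `block` are introduced here for illustration.

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame(np.arange(12).reshape((4, 3)),
                     index=[["a", "a", "b", "b"], [1, 2, 1, 2]],
                     columns=[["Ohio", "Ohio", "Colorado"],
                              ["Green", "Red", "Green"]])
frame.index.names = ["key1", "key2"]
frame.columns.names = ["state", "color"]

# A tuple names one label per index level and returns that row
row = frame.loc[("a", 1)]

# A row tuple plus a column tuple addresses a single cell
cell = frame.loc[("a", 2), ("Ohio", "Red")]

# Giving only the outer label returns every inner row beneath it
block = frame.loc["b"]

print(row.tolist())
print(cell)
print(block.shape)
```

Addressing a cell in one `.loc` call avoids the intermediate objects created by chained lookups such as `frame.ix['a', 2]['Ohio']['Red']`.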
### Hierarchical indexing: reordering levels

- Swapping index levels
- Sorting by index level

```python
# -*- coding: utf-8 -*-
import numpy as np
from pandas import Series, DataFrame

print 'Swap index levels'
frame = DataFrame(np.arange(12).reshape((4, 3)),
                  index = [['a', 'a', 'b', 'b'], [1, 2, 1, 2]],
                  columns = [['Ohio', 'Ohio', 'Colorado'], ['Green', 'Red', 'Green']])
frame.index.names = ['key1', 'key2']
frame_swapped = frame.swaplevel('key1', 'key2')
print frame_swapped
print frame_swapped.swaplevel(0, 1)
print

print 'Sort by index level'
print frame.sortlevel('key2')
print frame.swaplevel(0, 1).sortlevel(0)
```

Output:

```
Swap index levels
           Ohio     Colorado
          Green Red    Green
key2 key1
1    a        0   1        2
2    a        3   4        5
1    b        6   7        8
2    b        9  10       11
           Ohio     Colorado
          Green Red    Green
key1 key2
a    1        0   1        2
     2        3   4        5
b    1        6   7        8
     2        9  10       11

Sort by index level
           Ohio     Colorado
          Green Red    Green
key1 key2
a    1        0   1        2
b    1        6   7        8
a    2        3   4        5
b    2        9  10       11
           Ohio     Colorado
          Green Red    Green
key2 key1
1    a        0   1        2
     b        6   7        8
2    a        3   4        5
     b        9  10       11
```

Both `sortlevel` calls (lines 17 and 18 of the cell) emit `FutureWarning: sortlevel is deprecated, use sort_index(level= ...)`.

### Hierarchical indexing: summary statistics by level

- Specify the index level and the axis

```python
# -*- coding: utf-8 -*-
import numpy as np
from pandas import DataFrame

print 'Compute statistics for a given key'
frame = DataFrame(np.arange(12).reshape((4, 3)),
                  index = [['a', 'a', 'b', 'b'], [1, 2, 1, 2]],
                  columns = [['Ohio', 'Ohio', 'Colorado'], ['Green', 'Red', 'Green']])
frame.index.names = ['key1', 'key2']
print frame
print frame.sum(level = 'key2')
```

Output:

```
Compute statistics for a given key
           Ohio     Colorado
          Green Red    Green
key1 key2
a    1        0   1        2
     2        3   4        5
b    1        6   7        8
     2        9  10       11
      Ohio     Colorado
     Green Red    Green
key2
1        6   8       10
2       12  14       16
```

### Hierarchical indexing: using a DataFrame's columns

- Turn the specified columns into the index
- Remove or keep the columns
- `reset_index` restores them

```python
# -*- coding: utf-8 -*-
import numpy as np
from pandas import DataFrame

print 'Build a hierarchical index from columns'
frame = DataFrame({'a':range(7),
                   'b':range(7, 0, -1),
                   'c':['one', 'one', 'one', 'two', 'two', 'two', 'two'],
                   'd':[0, 1, 2, 0, 1, 2, 3]})
print frame
print frame.set_index(['c', 'd'])  # turn columns c and d into the index
print frame.set_index(['c', 'd'], drop = False)  # keep the columns as well
frame2 = frame.set_index(['c', 'd'])
print frame2.reset_index()
```

Output:

```
Build a hierarchical index from columns
   a  b    c  d
0  0  7  one  0
1  1  6  one  1
2  2  5  one  2
3  3  4  two  0
4  4  3  two  1
5  5  2  two  2
6  6  1  two  3
       a  b
c   d
one 0  0  7
    1  1  6
    2  2  5
two 0  3  4
    1  4  3
    2  5  2
    3  6  1
       a  b    c  d
c   d
one 0  0  7  one  0
    1  1  6  one  1
    2  2  5  one  2
two 0  3  4  two  0
    1  4  3  two  1
    2  5  2  two  2
    3  6  1  two  3
     c  d  a  b
0  one  0  0  7
1  one  1  1  6
2  one  2  2  5
3  two  0  3  4
4  two  1  4  3
5  two  2  5  2
6  two  3  6  1
```

## Other topics

### Other topics: integer indexing

- How the ambiguity arises
- Reliable position-based indexing that does not depend on the index type

```python
# -*- coding: utf-8 -*-
import numpy as np
import sys
from pandas import Series, DataFrame

print 'Integer indexing'
ser = Series(np.arange(3.))
print ser
try:
    print ser[-1]  # ambiguous: label or position?
except:
    print sys.exc_info()[0]
ser2 = Series(np.arange(3.), index = ['a', 'b', 'c'])
print ser2[-1]
ser3 = Series(range(3), index = [-5, 1, 3])
print ser3.iloc[2]  # avoids the ambiguity of a plain [2]
print

print 'Integer indexing on a DataFrame'
frame = DataFrame(np.arange(6).reshape((3, 2)), index = [2, 0, 1])
print frame
print frame.iloc[0]
print frame.iloc[:, 1]
```

Output:

```
Integer indexing
0    0.0
1    1.0
2    2.0
dtype: float64
<type 'exceptions.KeyError'>
2.0
2

Integer indexing on a DataFrame
   0  1
2  0  1
0  2  3
1  4  5
0    0
1    1
Name: 2, dtype: int64
2    1
0    3
1    5
Name: 1, dtype: int64
```

### Other topics: Panel data

- Create a Panel object from a three-dimensional ndarray
- Select the data you need with `ix[...]`
- Access order: item -> major -> minor
- Present panel data with `stack`

```python
# -*- coding: utf-8 -*-
import numpy as np
import pandas as pd
import pandas.io.data as web
from pandas import Series, DataFrame, Index, Panel

pdata = Panel(dict((stk, web.get_data_yahoo(stk, '1/1/2016', '1/15/2016'))
                   for stk in ['AAPL', 'GOOG', 'BIDU', 'MSFT']))
print pdata
pdata = pdata.swapaxes('items', 'minor')
print pdata
print

print 'Access order: Item -> Major -> Minor'
print pdata['Adj Close']
print pdata[:, '1/5/2016', :]
print pdata['Adj Close', '1/6/2016', :]
print

print 'Convert between Panel and DataFrame'
stacked = pdata.ix[:, '1/7/2016':, :].to_frame()
print stacked
print stacked.to_panel()
```

This cell fails at the same `pandas.io.data` import as the correlation example:

```
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-83-82a16090a331> in <module>()
      3 import numpy as np
      4 import pandas as pd
----> 5 import pandas.io.data as web
      6 from pandas import Series, DataFrame, Index, Panel
      7

ImportError: The pandas.io.data module is moved to a separate package (pandas-datareader). After installing the pandas-datareader package (https://github.com/pydata/pandas-datareader), you can change the import ``from pandas.io import data, wb`` to ``from pandas_datareader import data, wb``.
```
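`Panel` is no longer part of pandas (it was removed in version 0.25), so the item -> major -> minor layout described above is usually expressed today as a DataFrame with a hierarchical index. The sketch below shows that idea in Python 3 syntax. It does not download anything: the tickers `AAA` and `BBB` and all the numbers are made up to stand in for the Yahoo data, and the variable names are illustrative.

```python
import pandas as pd

# Made-up prices and volumes standing in for downloaded data
dates = pd.date_range("2016-01-04", periods=3)
frames = {
    "AAA": pd.DataFrame({"Close": [10.0, 11.0, 12.0],
                         "Volume": [100, 110, 120]}, index=dates),
    "BBB": pd.DataFrame({"Close": [20.0, 19.0, 21.0],
                         "Volume": [200, 190, 210]}, index=dates),
}

# Stack the per-ticker frames; the dict keys become the outer index level
long = pd.concat(frames, names=["ticker", "date"])

# One "item": every row for a single ticker
aaa = long.loc["AAA"]

# One field across all tickers, reshaped to a date x ticker table
close = long["Close"].unstack("ticker")

print(long.shape)
print(close.loc[pd.Timestamp("2016-01-05")].tolist())
```

The `unstack` call plays the role that `to_panel` and axis swapping played for `Panel`: it moves one index level into the columns so that a two-dimensional slice of the data can be inspected.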
