Getting Started with pandas (Part 5)

Date: 2019-10-15 01:06:08

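1. How do you split one attribute (column) into multiple columns with a pivot, so that each value becomes a vector that can be used to compute correlations?

Example: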

   class_            timestamp  count
0       1  2019-01-20 13:23:00      1
1       2  2019-01-20 13:24:00      2
2       3  2019-01-20 13:25:00      2
3       4  2019-01-20 13:26:00      1
4      10  2019-01-20 13:27:00      2

Convert to:

class_                 1    2    3    4   10
timestamp
2019-01-20 13:23:00  1.0  NaN  NaN  NaN  NaN
2019-01-20 13:24:00  NaN  2.0  NaN  NaN  NaN
2019-01-20 13:25:00  NaN  NaN  2.0  NaN  NaN
2019-01-20 13:26:00  NaN  NaN  NaN  1.0  NaN
2019-01-20 13:27:00  NaN  NaN  NaN  NaN  2.0

Solution:

import pandas as pd
from pandas import Timestamp

info = {'class_': {0: 1, 1: 2, 2: 3, 3: 4, 4: 10},
        'timestamp': {0: Timestamp('2019-01-20 13:23:00'),
                      1: Timestamp('2019-01-20 13:24:00'),
                      2: Timestamp('2019-01-20 13:25:00'),
                      3: Timestamp('2019-01-20 13:26:00'),
                      4: Timestamp('2019-01-20 13:27:00')},
        'count': {0: 1, 1: 2, 2: 2, 3: 1, 4: 2}}
df = pd.DataFrame(info)

# Variant that replaces the missing cells with 0:
# df.pivot(index='timestamp', columns='class_', values='count').fillna(0)
# One row per timestamp, one column per class_ value, cells taken from count.
df.pivot(index='timestamp', columns='class_', values='count')
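
The question also mentions computing correlations, which the solution does not show. A minimal sketch of that step follows, under two assumptions that are not in the original: a missing cell is treated as a count of 0, and Pearson correlation (the default of `DataFrame.corr`) is the measure wanted. The names `wide` and `corr` are illustrative.

```python
import pandas as pd

# Same sample data as in the solution, written as plain columns.
df = pd.DataFrame({
    "class_": [1, 2, 3, 4, 10],
    "timestamp": pd.to_datetime([
        "2019-01-20 13:23:00",
        "2019-01-20 13:24:00",
        "2019-01-20 13:25:00",
        "2019-01-20 13:26:00",
        "2019-01-20 13:27:00",
    ]),
    "count": [1, 2, 2, 1, 2],
})

# One column per class_, one row per timestamp.
# Assumption: "no record" means a count of zero, so NaN is filled with 0.
wide = df.pivot(index="timestamp", columns="class_", values="count").fillna(0)

# Pairwise Pearson correlation between the class_ columns.
corr = wide.corr()
print(corr)
```

Each column of `wide` is the vector for one `class_` value, and `corr` is a square table indexed by `class_` on both axes. Note that `pivot` raises an error if the same (`timestamp`, `class_`) pair occurs more than once; `pivot_table` with an aggregation function is the usual choice in that case.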

2. How do you merge the multiple columns of one attribute (each row holding a single value) back into one column?

Example:

class_                 1    2    3    4   10
timestamp
2019-01-20 13:23:00  1.0  NaN  NaN  NaN  NaN
2019-01-20 13:24:00  NaN  2.0  NaN  NaN  NaN
2019-01-20 13:25:00  NaN  NaN  2.0  NaN  NaN
2019-01-20 13:26:00  NaN  NaN  NaN  1.0  NaN
2019-01-20 13:27:00  NaN  NaN  NaN  NaN  2.0

Convert to:

   class_            timestamp  count
0       1  2019-01-20 13:23:00      1
1       2  2019-01-20 13:24:00      2
2       3  2019-01-20 13:25:00      2
3       4  2019-01-20 13:26:00      1
4      10  2019-01-20 13:27:00      2

Solution:

import pandas as pd
from pandas import Timestamp

info = {'class_': {0: 1, 1: 2, 2: 3, 3: 4, 4: 10},
        'timestamp': {0: Timestamp('2019-01-20 13:23:00'),
                      1: Timestamp('2019-01-20 13:24:00'),
                      2: Timestamp('2019-01-20 13:25:00'),
                      3: Timestamp('2019-01-20 13:26:00'),
                      4: Timestamp('2019-01-20 13:27:00')},
        'count': {0: 1, 1: 2, 2: 2, 3: 1, 4: 2}}
df = pd.DataFrame(info)

# Build the wide (pivoted) frame that serves as the starting point.
# Variant that drops every row containing a NaN:
# df1 = df.pivot(index='timestamp', columns='class_', values='count').dropna()
df1 = df.pivot(index='timestamp', columns='class_', values='count')
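
The listing stops once the wide frame `df1` is built, so the step that folds the columns back into one is not shown. The sketch below is one possible way to do it with `DataFrame.melt`, not necessarily the author's method. The name `long_df`, the cast of `count` back to integers, and the final column order are assumptions chosen to match the target table.

```python
import pandas as pd

# Rebuild the wide frame from the same sample data.
df = pd.DataFrame({
    "class_": [1, 2, 3, 4, 10],
    "timestamp": pd.to_datetime([
        "2019-01-20 13:23:00",
        "2019-01-20 13:24:00",
        "2019-01-20 13:25:00",
        "2019-01-20 13:26:00",
        "2019-01-20 13:27:00",
    ]),
    "count": [1, 2, 2, 1, 2],
})
df1 = df.pivot(index="timestamp", columns="class_", values="count")

# melt() turns every class_ column into (class_, count) rows;
# the NaN cells carry no information, so they are dropped afterwards.
long_df = (
    df1.reset_index()
       .melt(id_vars="timestamp", var_name="class_", value_name="count")
       .dropna(subset=["count"])
       .sort_values("timestamp")
       .reset_index(drop=True)
)

# Restore integer dtypes and the column order of the target table.
long_df = long_df.astype({"class_": int, "count": int})
long_df = long_df[["class_", "timestamp", "count"]]
print(long_df)
```

`df1.stack()` followed by `reset_index` is another route. Whether `stack` drops the NaN cells on its own varies across pandas versions, so an explicit `dropna()` is the safer choice there as well.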

Original: https://www.cnblogs.com/spaceapp/p/11674966.html
