数据分析-day04-pandas-dataFrame中group by分组与聚合

本文主要是介绍数据分析-day04-pandas-dataFrame中group by分组与聚合，希望对大家解决编程问题提供一定的参考价值，需要的开发者们随着小编来一起学习吧！

源数据:

分组后:

grouped = df.groupby(by="columns_name")

grouped是一个DataFrameGroupBy对象，是可迭代的

grouped中的每一个元素是一个元组 ,元组里面是（索引(分组的值)，分组之后的DataFrame）

#!usr/bin/env python
#-*- coding:utf-8 _*-
'''
@author:Administrator
@file: pandas_dataframe_group_demo.py
@time: 2020-01-05 上午 9:27
'''
import pandas as pd;
import numpy as np
from matplotlib import pyplot as plt
df=pd.read_csv("../data/starbucks_store_worldwide.csv");
df=df.head(1000);
#以country分组，组成类似map的数据类型，key=国家名称，values=dataframe（关于key代表国家的所有信息）
grouped = df.groupby(by="Country");
print(grouped)
#遍历查看内容for m,n in grouped:print(m)print("===")print(n)#查看所有等于cA的数据
r=df[df["Country"]=="CA"];
#print(r)
#调用聚合方法
country_count = grouped["Brand"].count()
print(country_count)
print(country_count["AE"])
#统计中国每个省店铺的数量
china_data = df[df["Country"] =="CN"]
grouped = china_data.groupby(by="State/Province")["Brand"].count()
print(grouped)
#数据按照多个条件进行分组,返回Series
grouped = df["Brand"].groupby(by=[df["Country"],df["State/Province"]]).count()
print(grouped)
print(type(grouped))
#数据按照多个条件进行分组,返回DataFrame，df["Brand"]再嵌套一层[],变为df[["Brand"]]
grouped1 = df[["Brand"]].groupby(by=[df["Country"],df["State/Province"]]).count()
grouped2= df.groupby(by=[df["Country"],df["State/Province"]])[["Brand"]].count()
grouped3 = df.groupby(by=[df["Country"],df["State/Province"]]).count()[["Brand"]]