主体的统计资料¶

In [1]:

Copied!





%load_ext autoreload
%autoreload 2
%matplotlib inline
%config InlineBackend.figure_format = 'retina'
from IPython.core.interactiveshell import InteractiveShell

InteractiveShell.ast_node_interactivity = "all"

from china_water_use import ChineseWater
from hydra import compose, initialize
import os

with initialize(version_base=None, config_path="../../config"):
    cfg = compose(config_name="processing")
os.chdir(cfg.root)
%load_ext autoreload
%autoreload 2
%matplotlib inline
%config InlineBackend.figure_format = 'retina'
from IPython.core.interactiveshell import InteractiveShell

InteractiveShell.ast_node_interactivity = "all"

from china_water_use import ChineseWater
from hydra import compose, initialize
import os

with initialize(version_base=None, config_path="../../config"):
    cfg = compose(config_name="processing")
os.chdir(cfg.root)

框定空间范围¶

我们使用来自 Zhou 等人（2020）的地级市尺度水资源利用数据。

选择与黄河流域的空间范围重叠面积超 $95\%$ 的地级市，即为本研究所考虑的地级市。

In [2]:

Copied!





import geopandas as gpd

gdf = gpd.read_file(cfg.ds.cities.shp)
CITIES = dict(gdf.set_index("City_ID")["Province_n"])
CITIES = [f"C{city}" for city in CITIES.keys()]
len(CITIES)
import geopandas as gpd

gdf = gpd.read_file(cfg.ds.cities.shp)
CITIES = dict(gdf.set_index("City_ID")["Province_n"])
CITIES = [f"C{city}" for city in CITIES.keys()]
len(CITIES)

Out[2]:

一共有59个符合条件的地级市，分属8个不同的省。

灌溉用水强度¶

用水强度数据也直接来自于 Zhou 等人（2020），单位是 mm。

这里的灌溉用水强度是单位面积为灌溉而提取的水量，包括在输送和田间应用过程中的损失。

来源于统计数据，统计方法是地区水资源管理单位按照经验统计上报的用水量，不区分用水的来源。

注意这里，用水强度的单位是 mm，而 ChineseWater 包目前还没有更新，仍然是用 mm/kha 进行预处理。需要找机会更新一下包。

In [4]:

Copied!

from src.ci.processing import clean_water_use_intensity, choose_by_cities

cwu = choose_by_cities(cities=CITIES)
irr_wui = clean_water_use_intensity(cwu=cwu, plot=True)
irr_wui.head()
from src.ci.processing import clean_water_use_intensity, choose_by_cities

cwu = choose_by_cities(cities=CITIES)
irr_wui = clean_water_use_intensity(cwu=cwu, plot=True)
irr_wui.head()

No description has been provided for this image

Out[4]:

	City_ID	Year	Province_n	Wheat	Maize	Rice
1274	C27	1965	Gansu	646.00	823.68	965.44
1275	C27	1966	Gansu	655.15	835.35	979.12
1276	C27	1967	Gansu	651.07	830.14	973.01
1277	C27	1968	Gansu	655.08	835.26	979.01
1278	C27	1969	Gansu	663.94	846.55	992.25

可以看出，三种主要粮食作物都是在宁夏、内蒙的单位面积耗水量大，即用水效率低，单位面积的水稻在宁夏尤其用水量极高。

结合 AquaCrop 模型估算的作物净耗水量，可以估算农田灌溉系数 $\eta$。

灌溉面积¶

模拟中，主体将拥有明确的灌溉面积。

我们利用地级市尺度上的灌溉面积数据 $A_{p, c}$，将面积用均匀分布分配给改地级市的所有主体集合 $V_c$ 加总，应保证符合历史时期各年的实际统计数据：

$$ A_{p, c} = \sum_{v \in V}^{N_{p, c}} A_{v, c}$$

In [6]:

Copied!

cwu = ChineseWater()

# 看面积
cwu.update_scope("measurements", "Magnitude")

# 只用用水强度
cwu.update_scope("sectors", "IRR")

# 仅关注三种作物
cwu.update_scope("items", include=[f"Irrigated area: {crop}" for crop in crops])
cwu = ChineseWater()

# 看面积
cwu.update_scope("measurements", "Magnitude")

# 只用用水强度
cwu.update_scope("sectors", "IRR")

# 仅关注三种作物
cwu.update_scope("items", include=[f"Irrigated area: {crop}" for crop in crops])

这里按省份分组，统计所有地级市的灌溉面积之和。

In [7]:

Copied!





all_cities_irr_area = (
    cwu.data.groupby(["Province_n", "Year"]).sum(numeric_only=True).sum(axis=1)
)
all_cities_irr_area.head()
all_cities_irr_area = (
    cwu.data.groupby(["Province_n", "Year"]).sum(numeric_only=True).sum(axis=1)
)
all_cities_irr_area.head()

Out[7]:

Province_n  Year
Anhui       1965     953.10
            1966     940.24
            1967   1,107.62
            1968   1,191.92
            1969   1,270.53
dtype: float64

再筛选黄河流域的省份和地区，使用研究区的灌溉面积数据，除以各省总灌溉面积，得到各地级市的灌溉面积占比。这个比例将被用来与每个省的村庄数量相乘，作为省$P$创建主体的数量$N_P$的标准：

$$ N_{P} = N_{villages} * \alpha_{A_{p}} $$

In [8]:

Copied!





# 使用黄河流域的省份/地区
cwu.update_scope("City_ID", CITIES)

# 单位转化
irr_wu = cwu.units(converter="ha", dequantify=True)

# 表头转换
irr_wu.rename(
    {col: col.replace("Irrigated area: ", "") for col in irr_wu.columns},
    axis=1,
    inplace=True,
)
irr_wu.head()
# 使用黄河流域的省份/地区
cwu.update_scope("City_ID", CITIES)

# 单位转化
irr_wu = cwu.units(converter="ha", dequantify=True)

# 表头转换
irr_wu.rename(
    {col: col.replace("Irrigated area: ", "") for col in irr_wu.columns},
    axis=1,
    inplace=True,
)
irr_wu.head()

Out[8]:

	City_ID	Year	Province_n	Wheat	Rice	Maize
1274	C27	1965	Gansu	16,089.68	391.45	1,152.31
1275	C27	1966	Gansu	16,485.68	383.84	1,434.74
1276	C27	1967	Gansu	17,803.30	416.67	1,442.82
1277	C27	1968	Gansu	18,863.37	437.43	1,514.68
1278	C27	1969	Gansu	19,700.68	447.62	1,549.92

In [9]:

Copied!





selected_cities_irr_area = (
    cwu.data.groupby(["Province_n", "Year"]).sum(numeric_only=True).sum(axis=1)
)
selected_cities_irr_area.head()
selected_cities_irr_area = (
    cwu.data.groupby(["Province_n", "Year"]).sum(numeric_only=True).sum(axis=1)
)
selected_cities_irr_area.head()

Out[9]:

Province_n  Year
Gansu       1965   198.77
            1966   209.47
            1967   223.77
            1968   236.77
            1969   246.53
dtype: float64

In [10]:

Copied!





ratio = (
    selected_cities_irr_area / all_cities_irr_area.loc[selected_cities_irr_area.index]
)

irr_area_ratio = (
    ratio.reset_index()
    .rename({0: "Ratio"}, axis=1)
    .pivot_table(index="Year", columns="Province_n", values="Ratio")
)
irr_area_ratio
ratio = (
    selected_cities_irr_area / all_cities_irr_area.loc[selected_cities_irr_area.index]
)

irr_area_ratio = (
    ratio.reset_index()
    .rename({0: "Ratio"}, axis=1)
    .pivot_table(index="Year", columns="Province_n", values="Ratio")
)
irr_area_ratio

Out[10]:

Province_n	Gansu	Henan	Neimeng	Ningxia	Qinghai	Shaanxi	Shandong	Shanxi
Year
1965	0.76	0.31	0.34	1.00	0.88	0.72	0.61	0.66
1966	0.76	0.31	0.33	1.00	0.88	0.72	0.61	0.67
1967	0.76	0.31	0.34	1.00	0.88	0.72	0.61	0.66
1968	0.76	0.31	0.34	1.00	0.88	0.72	0.61	0.66
1969	0.76	0.31	0.34	1.00	0.88	0.72	0.61	0.66
1970	0.76	0.31	0.35	1.00	0.88	0.72	0.61	0.66
1971	0.76	0.31	0.34	1.00	0.88	0.71	0.61	0.66
1972	0.76	0.31	0.34	1.00	0.88	0.72	0.60	0.66
1973	0.76	0.31	0.34	1.00	0.88	0.72	0.61	0.66
1974	0.77	0.31	0.34	1.00	0.88	0.72	0.61	0.66
1975	0.77	0.31	0.34	1.00	0.88	0.72	0.61	0.67
1976	0.76	0.31	0.34	1.00	0.88	0.72	0.61	0.66
1977	0.76	0.31	0.35	1.00	0.88	0.72	0.61	0.67
1978	0.76	0.31	0.35	1.00	0.88	0.72	0.61	0.67
1979	0.76	0.31	0.34	1.00	0.88	0.72	0.61	0.67
1980	0.76	0.31	0.34	1.00	0.88	0.72	0.61	0.67
1981	0.76	0.31	0.34	1.00	0.88	0.72	0.61	0.67
1982	0.77	0.31	0.35	1.00	0.88	0.72	0.61	0.66
1983	0.77	0.31	0.35	1.00	0.88	0.72	0.61	0.66
1984	0.77	0.31	0.35	1.00	0.88	0.72	0.61	0.66
1985	0.77	0.32	0.35	1.00	0.88	0.72	0.61	0.65
1986	0.77	0.31	0.35	1.00	0.88	0.72	0.61	0.66
1987	0.77	0.31	0.34	1.00	0.88	0.72	0.61	0.66
1988	0.76	0.31	0.34	1.00	0.88	0.72	0.61	0.66
1989	0.76	0.31	0.34	1.00	0.88	0.72	0.61	0.66
1990	0.76	0.32	0.34	1.00	0.88	0.72	0.61	0.66
1991	0.77	0.31	0.34	1.00	0.88	0.72	0.61	0.66
1992	0.77	0.32	0.34	1.00	0.88	0.72	0.61	0.66
1993	0.77	0.31	0.34	1.00	0.88	0.72	0.58	0.66
1994	0.77	0.31	0.34	1.00	0.88	0.70	0.58	0.67
1995	0.77	0.31	0.33	1.00	0.88	0.71	0.59	0.68
1996	0.77	0.31	0.33	1.00	0.88	0.70	0.60	0.68
1997	0.78	0.31	0.32	1.00	0.88	0.71	0.60	0.68
1998	0.79	0.31	0.32	1.00	0.88	0.70	0.60	0.68
1999	0.78	0.32	0.32	1.00	0.88	0.70	0.61	0.69
2000	0.79	0.31	0.32	1.00	0.88	0.70	0.60	0.68
2001	0.79	0.31	0.30	1.00	0.88	0.74	0.60	0.65
2002	0.80	0.31	0.30	1.00	0.88	0.76	0.60	0.65
2003	0.78	0.31	0.27	1.00	0.88	0.77	0.62	0.66
2004	0.78	0.31	0.30	1.00	0.88	0.75	0.62	0.67
2005	0.77	0.30	0.30	1.00	0.88	0.76	0.61	0.67
2006	0.77	0.30	0.30	1.00	0.88	0.76	0.62	0.68
2007	0.78	0.30	0.31	1.00	0.88	0.77	0.63	0.68
2008	0.78	0.30	0.29	1.00	0.88	0.77	0.62	0.68
2009	0.79	0.30	0.29	1.00	0.88	0.77	0.63	0.67
2010	0.79	0.30	0.30	1.00	0.88	0.77	0.61	0.66
2011	0.79	0.30	0.28	1.00	0.88	0.76	0.63	0.67
2012	0.79	0.30	0.29	1.00	0.88	0.76	0.63	0.66
2013	0.79	0.29	0.29	1.00	0.87	0.74	0.62	0.65

In [11]:

Copied!

import pandas as pd
import numpy as np

county_num = pd.read_csv(cfg.ds.county_num, index_col=0)
county_num
import pandas as pd
import numpy as np

county_num = pd.read_csv(cfg.ds.county_num, index_col=0)
county_num

Out[11]:

	Shanxi	Neimeng	Shandong	Henan	Shaanxi	Gansu	Qinghai	Ningxia
Year
1985	1439	1347	1881	1938	2260	1398	428	256
1986	1442	1331	1684	1896	2239	1361	429	247
1987	1441	1308	1670	1826	2231	1354	404	246
1988	1435	1301	1650	1786	2232	1356	429	246
1989	1437	1302	1619	1767	2232	1357	403	247
1990	1436	1298	1591	1741	2227	1358	403	248
1991	1436	1298	1553	1743	2228	1368	403	248
1992	1435	1299	1495	1739	2204	1365	403	248
1993	1432	1298	1383	1718	2204	1362	404	241
1994	1431	1288	1268	1655	2201	1359	404	236
1995	1419	1281	1128	1586	2149	1360	404	238
1996	1417	1281	1051	1495	1637	1360	404	241
1997	1410	1262	979	1449	1283	1352	404	239
1998	1409	1253	934	1386	1116	1351	402	239
1999	1402	1189	923	1346	1105	1342	390	238
2000	1388	1107	781	1289	1089	1315	390	245
2001	629	733	335	1252	1068	1253	349	233
2002	713	713	302	1242	684	1152	285	234
2003	714	711	296	1243	682	1081	285	100
2004	717	703	295	1241	679	928	285	99
2005	717	553	277	1232	678	800	283	97
2006	724	236	276	1105	675	774	241	97
2007	732	190	270	1089	671	769	238	94
2008	633	182	270	1033	672	763	229	93
2009	633	179	269	978	649	762	229	93
2010	633	179	156	929	648	761	229	93
2011	632	192	128	852	82	759	229	92
2012	632	277	113	827	80	758	228	92
2013	632	275	91	755	74	750	228	92
2014	632	275	81	718	74	702	225	91
2015	632	275	75	703	23	600	225	90
2016	632	272	73	682	23	487	225	90
2017	632	272	70	640	23	413	223	90
2018	632	270	68	618	21	343	223	90

人口数据导出

In [12]:

Copied!

agents_num = (county_num * irr_area_ratio).bfill(axis=0).loc[1980:2013]
if cfg.save:
    agents_num.astype(int).to_csv(cfg.out.agents_num)
agents_num = (county_num * irr_area_ratio).bfill(axis=0).loc[1980:2013]
if cfg.save:
    agents_num.astype(int).to_csv(cfg.out.agents_num)

灌溉面积导出

In [13]:

Copied!

if cfg.save:
    irr_wu.to_csv(cfg.out.irr_area_ha)
if cfg.save:
    irr_wu.to_csv(cfg.out.irr_area_ha)

计算配额权重¶

黄河水资源分配方案所计划的水量包括：

所有产业（农业和工业）
所有类型灌溉耕地（粮食作物、经济作物、林地）

所以分水额度需要进行处理：

各市三种作物总用水 / 该省总用水量

假设水量分配是按灌溉面积进行分配的，则对属于$P$省的地级市$p$在第$j$年的灌溉用水配额$Q_{p, j}$可估计为：

\begin{equation} Q_{p} = Q_{P} * \frac{\sum_{c \in crops} WU_{c, p}}{WU_{P}} \end{equation}

In [14]:

Copied!





cwu = ChineseWater()

# 使用黄河流域的省份/地区
cwu.update_scope("City_ID", CITIES)


# 农业所有作物的用水量
cwu.update_scope("measurements", "WU")
cwu.update_scope("sectors", "IRR")
# 加总
wu_irr_sum = cwu.data.sum(axis=1, numeric_only=True)
wu_irr_sum.name = "IRR_WU_sum"

cwu.data.head()
cwu = ChineseWater()

# 使用黄河流域的省份/地区
cwu.update_scope("City_ID", CITIES)


# 农业所有作物的用水量
cwu.update_scope("measurements", "WU")
cwu.update_scope("sectors", "IRR")
# 加总
wu_irr_sum = cwu.data.sum(axis=1, numeric_only=True)
wu_irr_sum.name = "IRR_WU_sum"

cwu.data.head()

Out[14]:

	City_ID	Year	Province_n	Irrigation WU: Rice	Irrigation WU: Maize	Irrigation WU: Vegetables and fruits	Irrigation WU: Others	Irrigation WU: Wheat
1274	C27	1965	Gansu	377.92	949.13	497.56	17,833.27	10,393.89
1275	C27	1966	Gansu	375.82	1,198.51	562.30	19,422.23	10,800.66
1276	C27	1967	Gansu	405.43	1,197.74	611.83	20,200.17	11,591.15
1277	C27	1968	Gansu	428.25	1,265.15	680.57	20,537.95	12,357.03
1278	C27	1969	Gansu	444.15	1,312.09	673.74	21,063.94	13,080.03

In [15]:

Copied!





# 仅关注三种作物用水
cwu.update_scope("items", include=[f"Irrigation WU: {crop}" for crop in crops])

# 加总
wu_crops_sum = cwu.data.sum(axis=1, numeric_only=True)
wu_crops_sum.name = "IRR_WU_crops_sum"
cwu.data.head()
# 仅关注三种作物用水
cwu.update_scope("items", include=[f"Irrigation WU: {crop}" for crop in crops])

# 加总
wu_crops_sum = cwu.data.sum(axis=1, numeric_only=True)
wu_crops_sum.name = "IRR_WU_crops_sum"
cwu.data.head()

Out[15]:

	City_ID	Year	Province_n	Irrigation WU: Rice	Irrigation WU: Maize	Irrigation WU: Wheat
1274	C27	1965	Gansu	377.92	949.13	10,393.89
1275	C27	1966	Gansu	375.82	1,198.51	10,800.66
1276	C27	1967	Gansu	405.43	1,197.74	11,591.15
1277	C27	1968	Gansu	428.25	1,265.15	12,357.03
1278	C27	1969	Gansu	444.15	1,312.09	13,080.03

In [16]:

Copied!

crops_ratio = wu_crops_sum / wu_irr_sum
crops_ratio.name = "Ratio_crops"
crops_ratio.head()
crops_ratio = wu_crops_sum / wu_irr_sum
crops_ratio.name = "Ratio_crops"
crops_ratio.head()

Out[16]:

1274   0.43
1275   0.42
1276   0.42
1277   0.43
1278   0.44
Name: Ratio_crops, dtype: float64

In [17]:

Copied!





SUMMARY = {
    "Total water use": "总用水量",
    "IND": "工业总用水",
    "IRR": "灌溉总用水",
    "RUR": "农村总用水",
    "URB": "城市总用水",
}
# @property
# def total(self) -> pd.DataFrame:
#     """根据当前的范围过滤后的总结数据"""
#     time = self.origin["Year"].isin(self.time)
#     cities = self.origin["City_ID"].isin(self.cities)
#     cols = SUMMARY.keys()
#     return self.origin.loc[(time & cities), [*GENERAL_COLUMNS, *cols]]

cwu.total
SUMMARY = {
    "Total water use": "总用水量",
    "IND": "工业总用水",
    "IRR": "灌溉总用水",
    "RUR": "农村总用水",
    "URB": "城市总用水",
}
# @property
# def total(self) -> pd.DataFrame:
#     """根据当前的范围过滤后的总结数据"""
#     time = self.origin["Year"].isin(self.time)
#     cities = self.origin["City_ID"].isin(self.cities)
#     cols = SUMMARY.keys()
#     return self.origin.loc[(time & cities), [*GENERAL_COLUMNS, *cols]]

cwu.total

Out[17]:

	City_ID	Year	Province_n	Total water use	IND	IRR	RUR	URB
1274	C27	1965	Gansu	0.33	0.01	0.30	0.01	0.01
1275	C27	1966	Gansu	0.35	0.01	0.32	0.01	0.01
1276	C27	1967	Gansu	0.37	0.02	0.34	0.01	0.01
1277	C27	1968	Gansu	0.39	0.02	0.35	0.01	0.01
1278	C27	1969	Gansu	0.41	0.02	0.37	0.01	0.01
...	...	...	...	...	...	...	...	...
13029	C266	2009	Shaanxi	1.06	0.19	0.71	0.05	0.11
13030	C266	2010	Shaanxi	1.06	0.18	0.69	0.06	0.12
13031	C266	2011	Shaanxi	1.04	0.20	0.65	0.06	0.13
13032	C266	2012	Shaanxi	1.07	0.19	0.67	0.06	0.15
13033	C266	2013	Shaanxi	1.08	0.19	0.65	0.06	0.18

2891 rows × 8 columns

In [18]:

Copied!

import seaborn as sns

irr_ratio = cwu.total["IRR"] / cwu.total["Total water use"]
irr_ratio.name = "IRR ratio"
sns.histplot(irr_ratio)
import seaborn as sns

irr_ratio = cwu.total["IRR"] / cwu.total["Total water use"]
irr_ratio.name = "IRR ratio"
sns.histplot(irr_ratio)

Out[18]:

<Axes: xlabel='IRR ratio', ylabel='Count'>

绘制出各地级市灌溉用水配额权重的分布比例。

In [19]:

Copied!

quota_ratio = irr_ratio * crops_ratio
quota_ratio.name = "Ratio"
sns.histplot(quota_ratio)
quota_ratio = irr_ratio * crops_ratio
quota_ratio.name = "Ratio"
sns.histplot(quota_ratio)

Out[19]:

<Axes: xlabel='Ratio', ylabel='Count'>

In [20]:

Copied!

import pandas as pd

ratio_data = pd.concat([cwu.index, quota_ratio], axis=1)
ratio_data.head()
import pandas as pd

ratio_data = pd.concat([cwu.index, quota_ratio], axis=1)
ratio_data.head()

Out[20]:

	City_ID	Year	Province_n	Ratio
1274	C27	1965	Gansu	0.39
1275	C27	1966	Gansu	0.38
1276	C27	1967	Gansu	0.38
1277	C27	1968	Gansu	0.39
1278	C27	1969	Gansu	0.39

In [21]:

Copied!





ratio_pivot = ratio_data.pivot_table(
    index="Year", columns="Province_n", values="Ratio", aggfunc="mean"
)
ratio_pivot.head()
ratio_pivot = ratio_data.pivot_table(
    index="Year", columns="Province_n", values="Ratio", aggfunc="mean"
)
ratio_pivot.head()

Out[21]:

Province_n	Gansu	Henan	Neimeng	Ningxia	Qinghai	Shaanxi	Shandong	Shanxi
Year
1965	0.40	0.45	0.31	0.63	0.31	0.45	0.34	0.30
1966	0.40	0.48	0.33	0.61	0.31	0.44	0.34	0.31
1967	0.40	0.50	0.34	0.62	0.32	0.45	0.35	0.31
1968	0.40	0.51	0.33	0.61	0.32	0.45	0.34	0.29
1969	0.40	0.51	0.33	0.61	0.32	0.44	0.35	0.29

In [23]:

Copied!

cfg.save
cfg.save

Out[23]:

True

In [22]:

Copied!

if cfg.save:
    ratio_pivot.to_csv(cfg.out.irr_ratio)
if cfg.save:
    ratio_pivot.to_csv(cfg.out.irr_ratio)