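A cross-tabulation (crosstab) shows the frequency distribution of the values of two variables and can help reveal whether the two are associated. pandas.crosstab() computes the cross-tabulation and returns it as a neatly formatted table (a DataFrame).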
pandas.crosstab(
    index,
    columns,
    values=None,
    rownames=None,
    colnames=None,
    aggfunc=None,
    margins=False,
    margins_name='All',
    dropna=True,
    normalize=False
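)

index: the values to group by in the rows. Accepts an array-like, a Series, or a list of arrays/Series.
columns: the values to group by in the columns. Accepts an array-like, a Series, or a list of arrays/Series.
margins: when True, adds row and column subtotals, labeled with margins_name ('All' by default).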
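To make these three parameters concrete, here is a minimal sketch on made-up data. The `size` and `color` Series are hypothetical and are not part of the dataset used in the example below; they only illustrate where `index`, `columns`, and `margins` end up in the result.

```python
import pandas as pd

# Hypothetical toy data, invented for illustration only.
size = pd.Series(['S', 'M', 'S', 'L', 'M', 'S'], name='size')
color = pd.Series(['red', 'blue', 'red', 'red', 'blue', 'blue'], name='color')

# `index` values become the row labels, `columns` values become the column labels.
# margins=True appends an 'All' row and an 'All' column holding the subtotals.
table = pd.crosstab(index=size, columns=color, margins=True)
print(table)
```

Each cell counts how many observations share that pair of values, and the 'All' cell in the bottom-right corner is the total number of observations.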
Example
The data below comes from Women Entrepreneurship and Labor Force; we use only a subset of it.
import pandas as pd
import numpy as np

df = pd.DataFrame(
    np.array([
        ['Austria', 'Developed', 'Member', 'Euro'],
        ['Spain', 'Developed', 'Member', 'Euro'],
        ['Japan', 'Developed', 'Not Member', 'National Currency'],
        ['Argentina', 'Developing', 'Not Member', 'National Currency'],
        ['Bolivia', 'Developing', 'Not Member', 'National Currency'],
        ['Taiwan', 'Developed', 'Not Member', 'National Currency'],
    ]),
    columns=['Country', 'Level of development', 'European Union Membership', 'Currency']
)
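# Count countries by level of development (rows) and EU membership (columns).
pd.crosstab(df['Level of development'], df['European Union Membership'])

# Same table, with an 'All' row and column holding the subtotals.
pd.crosstab(df['Level of development'], df['European Union Membership'], margins=True)

# Passing a list to `columns` produces hierarchical (two-level) column labels.
pd.crosstab(df['Level of development'], [df['European Union Membership'], df['Currency']])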
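Raw counts can be hard to compare when the groups differ in size. As a possible follow-up, the sketch below uses the `normalize` parameter from the signature above, which the examples in this article do not otherwise use. It rebuilds only the two relevant columns from the six rows shown earlier so that it runs on its own; the names `subset` and `shares` are introduced here for illustration.

```python
import pandas as pd

# The two columns of interest, copied from the six rows of the example data.
subset = pd.DataFrame({
    'Level of development': ['Developed', 'Developed', 'Developed',
                             'Developing', 'Developing', 'Developed'],
    'European Union Membership': ['Member', 'Member', 'Not Member',
                                  'Not Member', 'Not Member', 'Not Member'],
})

# normalize='index' divides each row by its row total,
# so every row shows proportions that sum to 1.
shares = pd.crosstab(
    subset['Level of development'],
    subset['European Union Membership'],
    normalize='index',
)
print(shares)
```

If the rows show clearly different proportions, that hints at an association between the two variables, although six rows are far too few to draw any conclusion.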




