Dataquest mission 113 - FiveThirtyEight college-majors
An analysis of the college-majors data set published by FiveThirtyEight.
Loading the data
import pandas as pd
# Load both the all-ages data and the recent-graduates data
all_ages = pd.read_csv('all-ages.csv')
recent_grads = pd.read_csv('recent-grads.csv')
recent_grads.head(5)
| | Rank | Major_code | Major | Major_category | Total | Sample_size | Men | Women | ShareWomen | Employed | ... | Part_time | Full_time_year_round | Unemployed | Unemployment_rate | Median | P25th | P75th | College_jobs | Non_college_jobs | Low_wage_jobs |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | 2419 | PETROLEUM ENGINEERING | Engineering | 2339 | 36 | 2057 | 282 | 0.120564 | 1976 | ... | 270 | 1207 | 37 | 0.018381 | 110000 | 95000 | 125000 | 1534 | 364 | 193 |
1 | 2 | 2416 | MINING AND MINERAL ENGINEERING | Engineering | 756 | 7 | 679 | 77 | 0.101852 | 640 | ... | 170 | 388 | 85 | 0.117241 | 75000 | 55000 | 90000 | 350 | 257 | 50 |
2 | 3 | 2415 | METALLURGICAL ENGINEERING | Engineering | 856 | 3 | 725 | 131 | 0.153037 | 648 | ... | 133 | 340 | 16 | 0.024096 | 73000 | 50000 | 105000 | 456 | 176 | 0 |
3 | 4 | 2417 | NAVAL ARCHITECTURE AND MARINE ENGINEERING | Engineering | 1258 | 16 | 1123 | 135 | 0.107313 | 758 | ... | 150 | 692 | 40 | 0.050125 | 70000 | 43000 | 80000 | 529 | 102 | 0 |
4 | 5 | 2405 | CHEMICAL ENGINEERING | Engineering | 32260 | 289 | 21239 | 11021 | 0.341631 | 25694 | ... | 5180 | 16697 | 1672 | 0.061098 | 65000 | 50000 | 75000 | 18314 | 4440 | 972 |
5 rows × 21 columns
Graduates per major category: all ages vs. recent graduates
# Build the list of major categories (roughly equivalent to departments)
major_categories = all_ages['Major_category'].value_counts().index
# Total number of people per major category
all_ages_major_categories = {}
recent_grads_major_categories = {}
# Compute the total number of people per major category for the given DataFrame
def calc_total_for_major_category(df):
    major_categories = df['Major_category'].value_counts().index
    totals = {}
    for major_category in major_categories:
        totals[major_category] = df[df['Major_category'] == major_category]['Total'].sum()
    return totals
# Compute the per-category totals for both the all-ages data and the recent-graduates data
all_ages_major_categories = calc_total_for_major_category(all_ages)
recent_grads_major_categories = calc_total_for_major_category(recent_grads)
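The same per-category totals can also be obtained with `groupby`, which avoids filtering the frame once per category. The sketch below runs on a small made-up frame (the values are illustrative, not taken from the data set), and the helper name `total_by_major_category` is hypothetical.

```python
import pandas as pd

# Toy stand-in for all_ages / recent_grads; the numbers are made up for illustration.
sample = pd.DataFrame({
    'Major_category': ['Engineering', 'Engineering', 'Arts', 'Business'],
    'Total': [100, 250, 40, 300],
})

def total_by_major_category(df):
    """Return a {major_category: total} dict, like calc_total_for_major_category."""
    return df.groupby('Major_category')['Total'].sum().to_dict()

totals = total_by_major_category(sample)
print(totals)
```

Applied to the real frames, this would be called as `total_by_major_category(all_ages)` and `total_by_major_category(recent_grads)`.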
What share of degree holders end up in low-wage jobs?
low_wage_percent = recent_grads['Low_wage_jobs'].sum() / recent_grads['Total'].sum()
print(low_wage_percent)
0.0985254607612
=> Roughly 10%
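This figure is a ratio of sums, so large majors weigh more than small ones; it is not the average of the per-major shares. A minimal sketch with made-up numbers (not taken from the data set) shows how far the two can diverge.

```python
import pandas as pd

# Made-up numbers for illustration only: one large major and one small major.
toy = pd.DataFrame({
    'Major': ['BIG MAJOR', 'SMALL MAJOR'],
    'Total': [1000, 100],
    'Low_wage_jobs': [50, 40],
})

# Ratio of sums: every graduate counts equally (the calculation used above).
overall_share = toy['Low_wage_jobs'].sum() / toy['Total'].sum()

# Mean of per-major ratios: every major counts equally.
mean_of_shares = (toy['Low_wage_jobs'] / toy['Total']).mean()

print(overall_share)
print(mean_of_shares)
```

Which of the two is appropriate depends on whether the question is about graduates or about majors; the heading above asks about degree holders, which matches the ratio of sums.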
Has the unemployment rate per major increased?
majors = recent_grads['Major'].value_counts().index
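# Sort both frames by major and reset the index so the two Series compare row by row.
# Without the reset, recent pandas versions refuse to compare Series whose labels differ.
# This assumes both files list the same majors.
all_ages_ordered = all_ages.sort_values('Major').reset_index(drop=True)
recent_grads_ordered = recent_grads.sort_values('Major').reset_index(drop=True)
# True where one group has the strictly lower unemployment rate for a major
all_ages_are_better = all_ages_ordered['Unemployment_rate'] < recent_grads_ordered['Unemployment_rate']
recent_grads_are_better = all_ages_ordered['Unemployment_rate'] > recent_grads_ordered['Unemployment_rate']
# Summing a boolean Series counts its True values
all_ages_lower_unemp_count = all_ages_are_better.sum()
recent_grads_lower_unemp_count = recent_grads_are_better.sum()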
print(all_ages_lower_unemp_count)
print(recent_grads_lower_unemp_count)
128
43
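=> Recent graduates have the lower unemployment rate in only 43 majors; in 128 majors the all-ages rate is lower. The situation has worsened.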
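The comparison above pairs rows by position after sorting. A variant that pairs rows explicitly by the `Major` column with `merge` does not depend on row order. The sketch below uses made-up rates and hypothetical frame names (`all_ages_toy`, `recent_toy`); it assumes major names are spelled identically in both frames.

```python
import pandas as pd

# Made-up unemployment rates for illustration only.
all_ages_toy = pd.DataFrame({
    'Major': ['A', 'B', 'C'],
    'Unemployment_rate': [0.05, 0.08, 0.04],
})
recent_toy = pd.DataFrame({
    'Major': ['C', 'A', 'B'],  # deliberately in a different order
    'Unemployment_rate': [0.04, 0.07, 0.06],
})

# Pair the rows by major name; the suffixes tell the two rate columns apart.
merged = all_ages_toy.merge(recent_toy, on='Major', suffixes=('_all', '_recent'))

rate_all = merged['Unemployment_rate_all']
rate_recent = merged['Unemployment_rate_recent']

all_ages_lower = (rate_all < rate_recent).sum()
recent_lower = (rate_all > rate_recent).sum()
ties = (rate_all == rate_recent).sum()

print(all_ages_lower, recent_lower, ties)
```

Counting ties separately makes it explicit that the two "lower" counts need not add up to the number of majors.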