[Python][Data Science Basics] Transforming DataFrames

코딩고치 2020. 6. 3. 00:04

Transforming a DataFrame

import pandas as pd
miracle_df = pd.read_csv('test1.csv', index_col=0)
miracle_df
  FP_cost Slots_used Faith_Required
Way of White Corona 15 1 18
Projected Heal 55 1 28
Lighting Arrow 19 1 35
Heal Aid 27 1 8
Soothing Sunlight 80 1 45
Replenishment 30 1 15
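The test1.csv file itself is not shown in the post, so here is a minimal sketch that assumes its contents match the table above and reads them from an in-memory string instead of a file. The names csv_text, demo_df, and plain_df are made up for this illustration. It shows what index_col=0 does: the first CSV column becomes the row labels.

```python
import io
import pandas as pd

# Assumed contents of test1.csv; the real file is not shown in the post.
csv_text = """,FP_cost,Slots_used,Faith_Required
Way of White Corona,15,1,18
Projected Heal,55,1,28
Lighting Arrow,19,1,35
Heal Aid,27,1,8
Soothing Sunlight,80,1,45
Replenishment,30,1,15
"""

# index_col=0 uses the first CSV column (the miracle names) as row labels.
demo_df = pd.read_csv(io.StringIO(csv_text), index_col=0)

# Without index_col, the names stay an ordinary column and rows are numbered from 0.
plain_df = pd.read_csv(io.StringIO(csv_text))

print(demo_df.index.tolist())
print(plain_df.index.tolist())
```

With the names as the index, rows can be addressed by label through .loc, which is what the rest of this post relies on.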

Changing values

  • Change a single value
miracle_df.loc['Heal Aid', 'Faith_Required'] = 6
miracle_df
  FP_cost Slots_used Faith_Required
Way of White Corona 15 1 18
Projected Heal 55 1 28
Lighting Arrow 19 1 35
Heal Aid 27 1 6
Soothing Sunlight 80 1 45
Replenishment 30 1 15
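As a self-contained sketch of the single-value update, the block below rebuilds the same table inline (demo_df is a hypothetical stand-in for the CSV data) and sets one cell with .loc. It also uses .at, pandas' accessor for a single scalar cell, which can be used the same way here.

```python
import pandas as pd

# Stand-in for the CSV data: the six rows shown in the post, built inline.
demo_df = pd.DataFrame(
    {
        "FP_cost": [15, 55, 19, 27, 80, 30],
        "Slots_used": [1, 1, 1, 1, 1, 1],
        "Faith_Required": [18, 28, 35, 8, 45, 15],
    },
    index=["Way of White Corona", "Projected Heal", "Lighting Arrow",
           "Heal Aid", "Soothing Sunlight", "Replenishment"],
)

# .loc[row_label, column_label] addresses exactly one cell.
demo_df.loc["Heal Aid", "Faith_Required"] = 6

# .at is the scalar-only accessor; the value 25 is just an example.
demo_df.at["Heal Aid", "FP_cost"] = 25

print(demo_df.loc["Heal Aid"])
```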
  • Replace an entire row or column
miracle_df.loc['Heal Aid'] = [20, 2, 4]
miracle_df
  FP_cost Slots_used Faith_Required
Way of White Corona 15 1 18
Projected Heal 55 1 28
Lighting Arrow 19 1 35
Heal Aid 20 2 4
Soothing Sunlight 80 1 45
Replenishment 30 1 15
miracle_df['Slots_used'] = [2, 2, 2, 1, 2, 2]
miracle_df
  FP_cost Slots_used Faith_Required
Way of White Corona 15 2 18
Projected Heal 55 2 28
Lighting Arrow 19 2 35
Heal Aid 20 1 4
Soothing Sunlight 80 2 45
Replenishment 30 2 15
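One thing to watch when replacing a whole row or column: values written as strings such as '2' are stored as text, not numbers, even though the printed table looks the same. The sketch below is an illustration under that assumption, with demo_df and text_df as hypothetical names, and uses pd.to_numeric to convert text back to numbers.

```python
import pandas as pd

# Stand-in for the CSV data: the six rows shown in the post, built inline.
demo_df = pd.DataFrame(
    {
        "FP_cost": [15, 55, 19, 27, 80, 30],
        "Slots_used": [1, 1, 1, 1, 1, 1],
        "Faith_Required": [18, 28, 35, 8, 45, 15],
    },
    index=["Way of White Corona", "Projected Heal", "Lighting Arrow",
           "Heal Aid", "Soothing Sunlight", "Replenishment"],
)

# Replace one row: one value per column, in column order.
demo_df.loc["Heal Aid"] = [20, 2, 4]

# Replace one column: one value per row, in row order.
demo_df["Slots_used"] = [2, 2, 2, 1, 2, 2]
numeric_ok = pd.api.types.is_numeric_dtype(demo_df["Slots_used"])

# The same column given as strings is stored as text.
text_df = demo_df.copy()
text_df["Slots_used"] = ["2", "2", "2", "1", "2", "2"]
is_text = not pd.api.types.is_numeric_dtype(text_df["Slots_used"])

# pd.to_numeric turns numeric-looking text back into numbers.
text_df["Slots_used"] = pd.to_numeric(text_df["Slots_used"])

print(numeric_ok, is_text, text_df["Slots_used"].sum())
```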
  • Set every value in a column to the same value
miracle_df['Slots_used'] = 3
miracle_df
  FP_cost Slots_used Faith_Required
Way of White Corona 15 3 18
Projected Heal 55 3 28
Lighting Arrow 19 3 35
Heal Aid 20 3 4
Soothing Sunlight 80 3 45
Replenishment 30 3 15
  • Change multiple columns or rows at once
miracle_df[['FP_cost', 'Faith_Required']] = 'x'
miracle_df
  FP_cost Slots_used Faith_Required
Way of White Corona x 3 x
Projected Heal x 3 x
Lighting Arrow x 3 x
Heal Aid x 3 x
Soothing Sunlight x 3 x
Replenishment x 3 x
miracle_df.loc[['Way of White Corona', 'Soothing Sunlight']] = 'O'
miracle_df
  FP_cost Slots_used Faith_Required
Way of White Corona O O O
Projected Heal x 3 x
Lighting Arrow x 3 x
Heal Aid x 3 x
Soothing Sunlight O O O
Replenishment x 3 x
miracle_df.loc['Projected Heal':'Heal Aid'] = 'O'
miracle_df
  FP_cost Slots_used Faith_Required
Way of White Corona O O O
Projected Heal O O O
Lighting Arrow O O O
Heal Aid O O O
Soothing Sunlight O O O
Replenishment x 3 x
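The three assignments above (a list of columns, a list of row labels, and a label slice) can be sketched on a fresh copy of the table. This version assigns numbers rather than text so the columns stay numeric, and demo_df is a hypothetical stand-in for the CSV data. It also checks that a .loc label slice includes its end label, unlike a positional iloc slice.

```python
import pandas as pd

# Stand-in for the CSV data: the six rows shown in the post, built inline.
demo_df = pd.DataFrame(
    {
        "FP_cost": [15, 55, 19, 27, 80, 30],
        "Slots_used": [1, 1, 1, 1, 1, 1],
        "Faith_Required": [18, 28, 35, 8, 45, 15],
    },
    index=["Way of White Corona", "Projected Heal", "Lighting Arrow",
           "Heal Aid", "Soothing Sunlight", "Replenishment"],
)

# A label slice with .loc includes both endpoints: three rows here.
by_label = demo_df.loc["Projected Heal":"Heal Aid"]
# The equivalent positional slice excludes its stop position, so it needs 1:4.
by_position = demo_df.iloc[1:4]
same_rows = by_label.equals(by_position)

# A list of column names selects several columns; the scalar fills every row.
demo_df[["FP_cost", "Faith_Required"]] = 0

# A list of row labels selects several rows; the scalar fills every column.
demo_df.loc[["Way of White Corona", "Soothing Sunlight"]] = 1

# A label slice selects a contiguous run of rows, end label included.
demo_df.loc["Projected Heal":"Heal Aid"] = 2

print(demo_df)
```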
miracle_df2 = pd.read_csv('test2.csv', index_col=0)
miracle_df2
  FP_cost Slots_used Faith_Required
Way of White Corona 15 1 18
Projected Heal 55 1 28
Lighting Arrow 19 1 35
Heal Aid 27 1 8
Soothing Sunlight 80 1 45
Replenishment 30 1 15
miracle_df2.loc[miracle_df2['FP_cost'] > 20]
  FP_cost Slots_used Faith_Required
Projected Heal 55 1 28
Heal Aid 27 1 8
Soothing Sunlight 80 1 45
Replenishment 30 1 15
miracle_df2.loc[miracle_df2['FP_cost'] > 20] = 'No'
miracle_df2
  FP_cost Slots_used Faith_Required
Way of White Corona 15 1 18
Projected Heal No No No
Lighting Arrow 19 1 35
Heal Aid No No No
Soothing Sunlight No No No
Replenishment No No No
miracle_df2.iloc[[1, 2], [2, 1]] = 'Yes'
miracle_df2
  FP_cost Slots_used Faith_Required
Way of White Corona 15 1 18
Projected Heal No Yes Yes
Lighting Arrow 19 Yes Yes
Heal Aid No No No
Soothing Sunlight No No No
Replenishment No No No
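The last two examples select rows by a condition and cells by position. A minimal sketch of both follows; unlike the post, it limits the conditional update to one column and writes numbers instead of text. The names demo_df and mask, and the values 2 and 0, are arbitrary choices for this illustration.

```python
import pandas as pd

# Stand-in for the CSV data: the six rows shown in the post, built inline.
demo_df = pd.DataFrame(
    {
        "FP_cost": [15, 55, 19, 27, 80, 30],
        "Slots_used": [1, 1, 1, 1, 1, 1],
        "Faith_Required": [18, 28, 35, 8, 45, 15],
    },
    index=["Way of White Corona", "Projected Heal", "Lighting Arrow",
           "Heal Aid", "Soothing Sunlight", "Replenishment"],
)

# The comparison yields one True/False per row.
mask = demo_df["FP_cost"] > 20

# .loc[mask, column] updates only that column, only in the matching rows.
demo_df.loc[mask, "Slots_used"] = 2

# .iloc works by position: rows 1 and 2, columns 2 and 1 (both counted from 0).
demo_df.iloc[[1, 2], [2, 1]] = 0

print(demo_df)
```

Passing a column label as the second argument to .loc keeps the other columns of the matching rows unchanged, which is usually what a conditional update needs.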

Adding and deleting values

miracle_df3 = pd.read_csv('test3.csv', index_col=0)
miracle_df3
  FP_cost Slots_used Faith_Required
Way of White Corona 15 1 18
Projected Heal 55 1 28
Lighting Arrow 19 1 35
Heal Aid 27 1 8
Soothing Sunlight 80 1 45
Replenishment 30 1 15
  • Assigning to a row or column label that does not exist creates a new row or column
miracle_df3.loc['Bountiful Sunlight'] = [70, 2, 35]
miracle_df3
  FP_cost Slots_used Faith_Required
Way of White Corona 15 1 18
Projected Heal 55 1 28
Lighting Arrow 19 1 35
Heal Aid 27 1 8
Soothing Sunlight 80 1 45
Replenishment 30 1 15
Bountiful Sunlight 70 2 35
miracle_df3['User'] = 'Ashenone'
miracle_df3
  FP_cost Slots_used Faith_Required User
Way of White Corona 15 1 18 Ashenone
Projected Heal 55 1 28 Ashenone
Lighting Arrow 19 1 35 Ashenone
Heal Aid 27 1 8 Ashenone
Soothing Sunlight 80 1 45 Ashenone
Replenishment 30 1 15 Ashenone
Bountiful Sunlight 70 2 35 Ashenone
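Because assigning to an unknown label adds data instead of raising an error, a misspelled label adds a new row without any warning. The sketch below repeats the two additions on an inline copy of the table and shows a membership check that can guard against that. demo_df and typo_exists are hypothetical names, and 'heal aid' is a made-up typo.

```python
import pandas as pd

# Stand-in for the CSV data: the six rows shown in the post, built inline.
demo_df = pd.DataFrame(
    {
        "FP_cost": [15, 55, 19, 27, 80, 30],
        "Slots_used": [1, 1, 1, 1, 1, 1],
        "Faith_Required": [18, 28, 35, 8, 45, 15],
    },
    index=["Way of White Corona", "Projected Heal", "Lighting Arrow",
           "Heal Aid", "Soothing Sunlight", "Replenishment"],
)

# A row label that does not exist yet: .loc appends it as a new row.
demo_df.loc["Bountiful Sunlight"] = [70, 2, 35]

# Labels are case-sensitive, so checking membership first catches typos.
typo_exists = "heal aid" in demo_df.index

# A column name that does not exist yet: the scalar fills every row.
demo_df["User"] = "Ashenone"

print(demo_df.shape, typo_exists)
```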
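  • Deleting values
    • miracle_df3.drop('Bountiful Sunlight', axis='index', inplace=False)
      • Deletes the row labeled Bountiful Sunlight
      • axis='index' deletes a row; axis='columns' deletes a column
      • inplace=False leaves the original DataFrame untouched and returns a modified copy
      • Use inplace=True to modify the original itself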
miracle_df3.drop('Bountiful Sunlight', axis='index', inplace=False)
  FP_cost Slots_used Faith_Required User
Way of White Corona 15 1 18 Ashenone
Projected Heal 55 1 28 Ashenone
Lighting Arrow 19 1 35 Ashenone
Heal Aid 27 1 8 Ashenone
Soothing Sunlight 80 1 45 Ashenone
Replenishment 30 1 15 Ashenone
miracle_df3
  FP_cost Slots_used Faith_Required User
Way of White Corona 15 1 18 Ashenone
Projected Heal 55 1 28 Ashenone
Lighting Arrow 19 1 35 Ashenone
Heal Aid 27 1 8 Ashenone
Soothing Sunlight 80 1 45 Ashenone
Replenishment 30 1 15 Ashenone
Bountiful Sunlight 70 2 35 Ashenone
miracle_df3.drop('Bountiful Sunlight', axis='index', inplace=True)
miracle_df3
  FP_cost Slots_used Faith_Required User
Way of White Corona 15 1 18 Ashenone
Projected Heal 55 1 28 Ashenone
Lighting Arrow 19 1 35 Ashenone
Heal Aid 27 1 8 Ashenone
Soothing Sunlight 80 1 45 Ashenone
Replenishment 30 1 15 Ashenone
miracle_df3.drop('User', axis='columns', inplace=True)
miracle_df3
  FP_cost Slots_used Faith_Required
Way of White Corona 15 1 18
Projected Heal 55 1 28
Lighting Arrow 19 1 35
Heal Aid 27 1 8
Soothing Sunlight 80 1 45
Replenishment 30 1 15
  • Delete multiple rows
miracle_df3.drop(['Heal Aid', 'Replenishment'], axis='index', inplace=True)
miracle_df3
  FP_cost Slots_used Faith_Required
Way of White Corona 15 1 18
Projected Heal 55 1 28
Lighting Arrow 19 1 35
Soothing Sunlight 80 1 45
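The deletions in this section can also be written without inplace=True by assigning the result of drop back to the variable. This is a sketch of that alternative style, not the approach used in the post. It relies on the index= and columns= keywords of drop, which stand in for the axis argument, and on errors='ignore', which skips labels that are not present. demo_df, without_row, and the label 'No Such Miracle' are hypothetical.

```python
import pandas as pd

# Stand-in for miracle_df3 after the extra row and the User column were added.
demo_df = pd.DataFrame(
    {
        "FP_cost": [15, 55, 19, 27, 80, 30, 70],
        "Slots_used": [1, 1, 1, 1, 1, 1, 2],
        "Faith_Required": [18, 28, 35, 8, 45, 15, 35],
        "User": ["Ashenone"] * 7,
    },
    index=["Way of White Corona", "Projected Heal", "Lighting Arrow", "Heal Aid",
           "Soothing Sunlight", "Replenishment", "Bountiful Sunlight"],
)

# drop returns a new DataFrame by default; the original keeps the row.
without_row = demo_df.drop("Bountiful Sunlight", axis="index")
still_there = "Bountiful Sunlight" in demo_df.index

# Reassigning the result has the same effect as inplace=True.
demo_df = demo_df.drop(index="Bountiful Sunlight")
demo_df = demo_df.drop(columns="User")

# A list of labels removes several rows in one call.
demo_df = demo_df.drop(index=["Heal Aid", "Replenishment"])

# errors="ignore" skips labels that do not exist instead of raising KeyError.
demo_df = demo_df.drop(index="No Such Miracle", errors="ignore")

print(demo_df)
```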

Renaming columns and the index

miracle_df4 = pd.read_csv('test3.csv', index_col=0)
miracle_df4
  FP_cost Slots_used Faith_Required
Way of White Corona 15 1 18
Projected Heal 55 1 28
Lighting Arrow 19 1 35
Heal Aid 27 1 8
Soothing Sunlight 80 1 45
Replenishment 30 1 15
  • Rename columns with miracle_df4.rename(columns={'old name': 'new name'})
miracle_df4.rename(columns={'FP_cost': 'FP'})
  FP Slots_used Faith_Required
Way of White Corona 15 1 18
Projected Heal 55 1 28
Lighting Arrow 19 1 35
Heal Aid 27 1 8
Soothing Sunlight 80 1 45
Replenishment 30 1 15
miracle_df4
  FP_cost Slots_used Faith_Required
Way of White Corona 15 1 18
Projected Heal 55 1 28
Lighting Arrow 19 1 35
Heal Aid 27 1 8
Soothing Sunlight 80 1 45
Replenishment 30 1 15
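  • rename does not modify the original DataFrame; it returns a renamed copy
  • To change the original DataFrame, use miracle_df4.rename(columns={'old name': 'new name'}, inplace=True)
  • The dictionary can contain several old/new pairs at once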
miracle_df4.rename(columns={'FP_cost': 'FP'}, inplace=True)
miracle_df4
  FP Slots_used Faith_Required
Way of White Corona 15 1 18
Projected Heal 55 1 28
Lighting Arrow 19 1 35
Heal Aid 27 1 8
Soothing Sunlight 80 1 45
Replenishment 30 1 15
miracle_df4.rename(columns={'Slots_used': 'Slots', 'Faith_Required': 'Faith'}, inplace=True)
miracle_df4
  FP Slots Faith
Way of White Corona 15 1 18
Projected Heal 55 1 28
Lighting Arrow 19 1 35
Heal Aid 27 1 8
Soothing Sunlight 80 1 45
Replenishment 30 1 15
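The rename behavior described above can be checked on an inline copy of the table. In this sketch demo_df, renamed, original_columns, and unchanged are hypothetical names, and 'No_such_column' is a made-up key. It confirms that the original keeps its column names until the result is assigned back, and that a key matching no column is skipped by default.

```python
import pandas as pd

# Stand-in for the CSV data: the six rows shown in the post, built inline.
demo_df = pd.DataFrame(
    {
        "FP_cost": [15, 55, 19, 27, 80, 30],
        "Slots_used": [1, 1, 1, 1, 1, 1],
        "Faith_Required": [18, 28, 35, 8, 45, 15],
    },
    index=["Way of White Corona", "Projected Heal", "Lighting Arrow",
           "Heal Aid", "Soothing Sunlight", "Replenishment"],
)

# rename returns a new DataFrame; demo_df itself keeps its column names.
renamed = demo_df.rename(columns={"FP_cost": "FP"})
original_columns = list(demo_df.columns)

# Keys that match no column are skipped by default (errors="ignore").
unchanged = demo_df.rename(columns={"No_such_column": "X"})

# Several columns can be renamed in one call; assigning back replaces inplace=True.
demo_df = demo_df.rename(
    columns={"FP_cost": "FP", "Slots_used": "Slots", "Faith_Required": "Faith"}
)

print(list(renamed.columns), original_columns, list(demo_df.columns))
```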
  • miracle_df4.index.name: gives the index a name
miracle_df4.index.name = 'Miracle'
miracle_df4
  FP Slots Faith
Miracle      
Way of White Corona 15 1 18
Projected Heal 55 1 28
Lighting Arrow 19 1 35
Heal Aid 27 1 8
Soothing Sunlight 80 1 45
Replenishment 30 1 15
  • Changing the index
    1. Save the existing index in a new column (set_index would otherwise discard it)
      • miracle_df4['Miracle Name'] = miracle_df4.index
    2. Set a different column as the index
      • miracle_df4.set_index('Faith', inplace=True)
  • An index works best when its values are unique
miracle_df4['Miracle Name'] = miracle_df4.index
miracle_df4
  FP Slots Faith Miracle Name
Miracle        
Way of White Corona 15 1 18 Way of White Corona
Projected Heal 55 1 28 Projected Heal
Lighting Arrow 19 1 35 Lighting Arrow
Heal Aid 27 1 8 Heal Aid
Soothing Sunlight 80 1 45 Soothing Sunlight
Replenishment 30 1 15 Replenishment
miracle_df4.set_index('Faith', inplace=True)
miracle_df4
  FP Slots Miracle Name
Faith      
18 15 1 Way of White Corona
28 55 1 Projected Heal
35 19 1 Lighting Arrow
8 27 1 Heal Aid
45 80 1 Soothing Sunlight
15 30 1 Replenishment
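To illustrate why a unique index is preferable, the sketch below compares Faith (all values distinct) with Slots (all values equal) as the index. Instead of copying the index into a column by hand as the post does, it uses reset_index, which moves a named index into a regular column in one step. demo_df, flat, by_faith, and by_slots are hypothetical names.

```python
import pandas as pd

# Stand-in for miracle_df4 after its columns were renamed.
demo_df = pd.DataFrame(
    {
        "FP": [15, 55, 19, 27, 80, 30],
        "Slots": [1, 1, 1, 1, 1, 1],
        "Faith": [18, 28, 35, 8, 45, 15],
    },
    index=["Way of White Corona", "Projected Heal", "Lighting Arrow",
           "Heal Aid", "Soothing Sunlight", "Replenishment"],
)
demo_df.index.name = "Miracle"

# reset_index turns the named index into a regular "Miracle" column.
flat = demo_df.reset_index()

# Faith values are all different, so each label identifies exactly one row.
by_faith = flat.set_index("Faith")
faith_unique = by_faith.index.is_unique

# Slots is 1 for every miracle, so the label 1 matches all six rows.
by_slots = flat.set_index("Slots")
slots_unique = by_slots.index.is_unique
rows_for_one = by_slots.loc[1]

print(faith_unique, slots_unique, len(rows_for_one))
```

With a duplicated index, a label lookup returns several rows rather than one, which makes single-row updates through .loc ambiguous.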