Hospital

2022. 9. 14. 12:18

① Load dataset

import pandas as pd

df=pd.read_csv("/content/ADMISSIONS.csv", delimiter=',')
df.set_index('row_id', inplace=True)
df.head()

df.set_index('row_id', inplace=True)

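⇨ Use 'row_id' as the index.

⇨ inplace=True: modify the original object (df) itself instead of returning a new DataFrame.

 

A DataFrame is already a table, but by default its rows are labeled only by position (0, 1, 2, ...). Setting the index labels each row with a meaningful key, here row_id.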

 

✔️ Setting and resetting the index: set_index (set the index from a column in the dataframe)

dataframe.set_index(keys, drop=True, append=False, inplace=False)

1) keys: label of the column to use as the index; pass a list of labels to select several columns

2) drop: whether to remove the column used as the index from the dataframe's columns (optional, default True); use drop=False to keep it

3) append: whether to keep the existing index and add the new one as an extra level (optional, default False); use append=True to keep the existing index

4) inplace: whether to modify the original object (optional, default False); use inplace=True to apply the change to the original dataframe (that is, keep working under the same name df)
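The four parameters above are easier to see on a small table. The sketch below uses a made-up three-row DataFrame (the values and the `subject_id` and `admission_type` columns are illustrative assumptions, not taken from ADMISSIONS.csv) and applies each option in turn.

```python
import pandas as pd

# Toy data for illustration only; not real rows from ADMISSIONS.csv.
toy = pd.DataFrame({
    "row_id": [1, 2, 3],
    "subject_id": [10, 10, 11],
    "admission_type": ["EMERGENCY", "ELECTIVE", "EMERGENCY"],
})

# Default (drop=True, inplace=False): returns a new DataFrame and
# removes 'row_id' from the regular columns.
by_row = toy.set_index("row_id")

# drop=False: 'row_id' becomes the index and also stays as a column.
kept = toy.set_index("row_id", drop=False)

# append=True: keeps the existing index and adds another level,
# which produces a MultiIndex (row_id, subject_id).
multi = by_row.set_index("subject_id", append=True)

# inplace=True: changes 'toy' itself and returns None.
result = toy.set_index("row_id", inplace=True)

print(by_row.columns.tolist())
print(kept.columns.tolist())
print(list(multi.index.names))
print(result, toy.index.name)
```

Because inplace=True returns None, writing `df = df.set_index(..., inplace=True)` would overwrite df with None; use either the assignment form or inplace, not both.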

 

② Total number of patients

print(df.shape)
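df.shape prints (number of rows, number of columns), so the first value counts rows. If the file has one row per admission, a patient admitted twice is counted twice. A minimal sketch of counting distinct patients instead, assuming the table has a patient identifier column (called `subject_id` here, which is an assumption about the file), on made-up data:

```python
import pandas as pd

# Toy stand-in for the admissions table; 'subject_id' is an assumed column name
# and the values are invented for illustration.
toy = pd.DataFrame(
    {"subject_id": [10, 10, 11]},
    index=pd.Index([1, 2, 3], name="row_id"),
)

n_admissions = toy.shape[0]               # number of rows
n_patients = toy["subject_id"].nunique()  # number of distinct patient ids

print("rows:", n_admissions)
print("distinct patients:", n_patients)
```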

 

③ Average length of stay (days)

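import datetime
import numpy as np

# Format of the admittime/dischtime strings: year-month-day hour:minute:second
dateformat="%Y-%m-%d %H:%M:%S"
admit_dur=[]

# Walk through the admission and discharge times row by row
for start_datetime, end_datetime in zip(df['admittime'], df['dischtime']):
  # Parse the strings into datetime objects
  start_convert=datetime.datetime.strptime(start_datetime, dateformat)
  end_convert=datetime.datetime.strptime(end_datetime, dateformat)

  # .days keeps only the whole days of the difference (hours are dropped)
  admit_day=(end_convert-start_convert).days
  admit_dur.append(admit_day)

print("Average length of stay (days): ", np.mean(admit_dur))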
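The same calculation can also be done without a Python loop by letting pandas parse the whole column at once. This is a sketch on made-up timestamps, not the author's method; it also shows how much the result depends on whether partial days are truncated.

```python
import pandas as pd

# Toy timestamps for illustration only; not real rows from ADMISSIONS.csv.
toy = pd.DataFrame({
    "admittime": ["2022-01-01 10:00:00", "2022-02-10 23:30:00"],
    "dischtime": ["2022-01-04 09:00:00", "2022-02-11 00:30:00"],
})

fmt = "%Y-%m-%d %H:%M:%S"
admit = pd.to_datetime(toy["admittime"], format=fmt)
disch = pd.to_datetime(toy["dischtime"], format=fmt)

stay = disch - admit  # Series of Timedelta values

# Whole days only, matching (end - start).days in the loop version.
whole_days = stay.dt.days

# Fractional days, keeping the hours that .days throws away.
fractional_days = stay.dt.total_seconds() / 86400

print("mean of whole days:", whole_days.mean())
print("mean of fractional days:", fractional_days.mean())
```

With these two toy stays (2 days 23 hours, and 1 hour), truncating gives a mean of 1.0 day while the fractional version gives 1.5 days, so it is worth deciding which definition of "length of stay" is wanted.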

 

 

 

 
