R: Add a column to a data.frame to classify values into low, medium, and high ranges
Problem description
I have a data.frame of galaxies and their distances (z):
> head(sdss16, 10)
SDSS RAJ2000 DEJ2000 MJD Class QSO z umag gmag rmag imag zmag e_umag e_gmag e_rmag e_imag e_zmag
1 000000.15+353104.2 0.000629 35.517841 58402 0 1 0.845435 18.9640 18.6307 18.4295 18.4118 18.2555 0.0248228 0.0138142 0.0173684 0.0171765 0.0281816
2 000000.33+310325.3 0.001415 31.057048 58073 0 1 2.035491 22.0825 21.7871 21.5621 21.3595 20.9340 0.1381920 0.0461832 0.0504525 0.0603687 0.1857780
3 000000.36+070350.8 0.001535 7.064129 58449 0 1 1.574227 22.5173 22.1028 21.8542 21.6380 21.8888 0.2093710 0.0641275 0.0674263 0.0829677 0.2956540
4 000000.36+274356.2 0.001526 27.732283 57654 0 1 1.770552 22.3475 21.9031 21.7528 21.6635 21.9946 0.1889810 0.0556878 0.0731551 0.0841880 0.3567380
5 000000.45+092308.2 0.001914 9.385637 58450 0 1 2.024146 18.7664 18.6627 18.4998 18.3365 18.1586 0.0261839 0.0309531 0.0179315 0.0260643 0.0214897
6 000000.45+174625.4 0.001898 17.773739 56945 3 1 2.309000 22.4403 21.9089 22.0700 21.9268 21.3725 0.2871240 0.0677072 0.1153900 0.1489100 0.3854550
7 000000.47-002703.9 0.001978 -0.451088 55477 3 1 0.250000 21.6832 21.1946 20.5092 20.1535 19.8793 0.1288200 0.0415909 0.0301123 0.0290315 0.0765198
8 000000.57+055630.8 0.002375 5.941903 57367 0 1 2.102771 22.3606 21.6176 21.3399 21.2840 20.7872 0.3101850 0.0539608 0.0710789 0.1014390 0.2420300
9 000000.62+311944.3 0.002595 31.328982 58073 0 1 1.991313 19.6818 19.4060 19.3189 19.0364 18.8358 0.0299476 0.0160732 0.0150661 0.0247494 0.0376382
10 000000.66+145828.8 0.002756 14.974675 56268 3 1 2.497000 21.9420 21.2236 20.8861 20.7823 20.6592 0.1638730 0.0360871 0.0372218 0.0509094 0.2107500
I want to add a new column which describes the z as 'Low', 'Medium', or 'High' based on which quantile the galaxy is in:
summary(z)
Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
-0.002643 1.177832 1.692103 1.740606 2.260000 7.023917 4
I can subset the data by quartile with:
lowz <- sdss16 %>% filter(z < quantile(z, 0.25))
midz <- sdss16 %>% filter(z >= quantile(z, 0.25) & z < quantile(z, 0.75))
hiz <- sdss16 %>% filter(z >= quantile(z, 0.75))
So my question is: how can I add a new column based on the quartiles, as described?
Recommended answer
Maybe this works?
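library(tidyverse)
# z contains NA values (see the summary above), so quantile() needs na.rm = TRUE;
# rows with a missing z get NA as their category
sdss16 %>%
  mutate(z_category = case_when(
    z < quantile(z, 0.25, na.rm = TRUE) ~ "Low",
    z >= quantile(z, 0.25, na.rm = TRUE) & z <= quantile(z, 0.75, na.rm = TRUE) ~ "Medium",
    z > quantile(z, 0.75, na.rm = TRUE) ~ "High"
  ))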
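As a possible alternative to the case_when() call, base R's cut() can assign the three labels in one step. The following is only a minimal sketch: the data frame `toy` and its values are made up for illustration (standing in for sdss16), and the left-closed bins are an assumption chosen to match the filter() calls in the question.

```r
# Hypothetical toy data standing in for sdss16; the z values are made up
toy <- data.frame(z = c(0.25, 0.85, 1.57, 1.77, 1.99, 2.02, 2.10, 2.31, 2.50, NA))

# First and third quartile of z, ignoring missing values
q <- quantile(toy$z, probs = c(0.25, 0.75), na.rm = TRUE)

# Left-closed bins: Low is z < Q1, Medium is Q1 <= z < Q3, High is z >= Q3
toy$z_category <- cut(toy$z,
                      breaks = c(-Inf, q, Inf),
                      labels = c("Low", "Medium", "High"),
                      right = FALSE)

# Count the rows in each category, including the NA row
print(table(toy$z_category, useNA = "ifany"))
```

Rows whose z is NA keep NA as their category. cut() returns a factor whose levels are in the order Low, Medium, High, which can be convenient for tables and plots.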