Factors with forcats :: CHEAT SHEET

The forcats package provides tools for working with factors, which are R's data structure for categorical data.

Factors

R represents categorical data with factors. A factor is an integer vector with a levels attribute that stores a set of mappings between integers and categorical values. When you view a factor, R displays not the integers, but the values associated with them.

Create a factor with factor()
factor(x = character(), levels, labels = levels, exclude = NA, ordered = is.ordered(x), nmax = NA) Convert a vector to a factor. Also as_factor.
f <- factor(c("a", "c", "b", "a"), levels = c("a", "b", "c"))

Return its levels with levels()
levels(x) Return/set the levels of a factor.
levels(f); levels(f) <- c("x","y","z")

Use unclass() to see its structure

Change the order of levels

fct_relevel(.f, ..., after = 0L) Manually reorder factor levels.
fct_relevel(f, c("b", "c", "a"))

fct_infreq(f, ordered = NA) Reorder levels by the frequency in which they appear in the data (highest frequency first).
f3 <- factor(c("c", "c", "a"))
fct_infreq(f3)

fct_inorder(f, ordered = NA) Reorder levels by order in which they appear in the data.
fct_inorder(f2)

Change the value of levels

fct_recode(.f, ...) Manually change levels. Also fct_relabel, which obeys purrr::map syntax to apply a function or expression to each level.
fct_recode(f, v = "a", x = "b", z = "c")
fct_relabel(f, ~ paste0("x", .x))

fct_anon(f, prefix = "") Anonymize levels with random integers.
fct_anon(f)

fct_collapse(.f, ...) Collapse levels into manually defined groups.
fct_collapse(f, x = c("a", "b"))
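
The storage model described above (integer codes plus a levels attribute) can be checked directly. The following is a minimal sketch, assuming the forcats package is installed; the variable names g and h are illustrative only and do not appear in the cheat sheet. It suggests that reordering levels changes the stored integers but not the displayed values, while recoding levels does the opposite.

```r
library(forcats)

f <- factor(c("a", "c", "b", "a"), levels = c("a", "b", "c"))

# The stored integers are positions in the levels attribute.
codes <- as.integer(f)          # 1 3 2 1
shown <- levels(f)[codes]       # "a" "c" "b" "a"

# Reordering levels changes the integer codes, not the displayed values.
# (g is a hypothetical name used only for this sketch.)
g <- fct_relevel(f, "b", "c", "a")
levels(g)                       # "b" "c" "a"
as.integer(g)                   # 3 2 1 3
as.character(g)                 # "a" "c" "b" "a"

# Recoding levels changes the displayed values, not the integer codes.
# (h is a hypothetical name used only for this sketch.)
h <- fct_recode(f, v = "a", x = "b", z = "c")
levels(h)                       # "v" "x" "z"
as.integer(h)                   # 1 3 2 1
as.character(h)                 # "v" "z" "x" "v"
```

Comparing as.integer() and as.character() side by side is one simple way to tell whether a forcats function touched the order of the levels or their values.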
Change the order of levels (continued)

fct_rev(f) Reverse level order.
f4 <- factor(c("a","b","c"))
fct_rev(f4)

fct_shift(f) Shift levels to left or right, wrapping around end.
fct_shift(f4)

Change the value of levels (continued)

fct_lump(f, n, prop, w = NULL, other_level = "Other", ties.method = c("min", "average", "first", "last", "random", "max")) Lump together least/most common levels into a single level. Also fct_lump_min.
fct_lump(f, n = 1)

Inspect Factors

fct_count(f, sort = FALSE) Count the number of values with each level.
fct_count(f)
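
As a rough illustration of reversing, counting, and lumping, the sketch below reuses the cheat sheet's f and f4 vectors and assumes forcats is installed. The names counts and lumped are hypothetical helpers introduced only for this example.

```r
library(forcats)

f  <- factor(c("a", "c", "b", "a"), levels = c("a", "b", "c"))
f4 <- factor(c("a", "b", "c"))

# Reverse the level order; the values themselves are unchanged.
reversed <- fct_rev(f4)
levels(reversed)             # "c" "b" "a"

# Count values per level: "a" appears twice, "b" and "c" once each.
counts <- fct_count(f)
counts$n                     # 2 1 1

# Keep the single most common level and lump the others together.
lumped <- fct_lump(f, n = 1)
as.character(lumped)         # "a" "Other" "Other" "a"
levels(lumped)               # "a" "Other"
```

Running fct_count() before fct_lump() is a convenient way to see which levels are rare enough to end up in the lumped group.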

fct_unique(f) Return the unique values, removing duplicates.
fct_unique(f)

Change the order of levels (continued)

fct_shuffle(f, n = 1L) Randomly permute order of factor levels.
fct_shuffle(f4)

Change the value of levels (continued)

fct_other(f, keep, drop, other_level = "Other") Replace levels with "other."
fct_other(f, keep = c("a", "b"))
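
The three functions above can be tried together in a short sketch, again assuming forcats is installed. The seed value is arbitrary, and uniq, shuffled, and other are hypothetical names used only here. Because fct_shuffle is random, the sketch only checks properties that hold for any permutation.

```r
library(forcats)

f  <- factor(c("a", "c", "b", "a"), levels = c("a", "b", "c"))
f4 <- factor(c("a", "b", "c"))

# One entry per level, with duplicates removed.
uniq <- fct_unique(f)
as.character(uniq)           # "a" "b" "c"

# Shuffling permutes the level order only; the values stay as they were.
set.seed(1)                  # arbitrary seed, for repeatability of this sketch
shuffled <- fct_shuffle(f4)
as.character(shuffled)       # "a" "b" "c"
sort(levels(shuffled))       # "a" "b" "c"

# Keep "a" and "b"; every other level is replaced by "Other".
other <- fct_other(f, keep = c("a", "b"))
as.character(other)          # "a" "Other" "b" "a"
levels(other)                # "a" "b" "Other"
```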
Change the order of levels (continued)

fct_reorder(.f, .x, .fun=median, ..., .desc = FALSE) Reorder levels by their relationship with another variable.
boxplot(data = iris, Sepal.Width ~ fct_reorder(Species, Sepal.Width))

fct_reorder2(.f, .x, .y, .fun = last2, ..., .desc = TRUE) Reorder levels by their final values when plotted with two other variables.
ggplot(data = iris, aes(Sepal.Width, Sepal.Length, color = fct_reorder2(Species, Sepal.Width, Sepal.Length))) + geom_smooth()

Combine Factors

fct_c(…) Combine factors with different levels.
f1 <- factor(c("a", "c"))
f2 <- factor(c("b", "a"))
fct_c(f1, f2)

fct_unify(fs, levels = lvls_union(fs)) Standardize levels across a list of factors.
fct_unify(list(f2, f1))

Add or drop levels

fct_drop(f, only) Drop unused levels.
f5 <- factor(c("a","b"),c("a","b","x"))
f6 <- fct_drop(f5)

fct_expand(f, …) Add levels to a factor.
fct_expand(f6, "x")

fct_explicit_na(f, na_level="(Missing)") Assigns a level to NAs to ensure they appear in plots, etc.
fct_explicit_na(factor(c("a", "b", NA)))
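
To see how combining, unifying, dropping, and expanding affect the levels, the following sketch runs the cheat sheet's own vectors and inspects the results. It assumes forcats is installed and uses the built-in iris data; combined, unified, and by_width are hypothetical names introduced for illustration.

```r
library(forcats)

f1 <- factor(c("a", "c"))
f2 <- factor(c("b", "a"))

# Combining keeps every value and takes the union of the levels,
# in the order the levels are first encountered.
combined <- fct_c(f1, f2)
as.character(combined)       # "a" "c" "b" "a"
levels(combined)             # "a" "c" "b"

# Unifying gives each factor in the list the same set of levels.
unified <- fct_unify(list(f2, f1))
levels(unified[[1]])         # "a" "b" "c"
levels(unified[[2]])         # "a" "b" "c"

# Drop the unused level "x", then add it back.
f5 <- factor(c("a", "b"), c("a", "b", "x"))
f6 <- fct_drop(f5)
levels(f6)                   # "a" "b"
levels(fct_expand(f6, "x"))  # "a" "b" "x"

# Reorder Species by the median Sepal.Width of each species (ascending).
by_width <- fct_reorder(iris$Species, iris$Sepal.Width)
levels(by_width)             # "versicolor" "virginica" "setosa"
```

Checking levels() after each call is usually enough to confirm what changed, since none of these functions alter the underlying values.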
RStudio® is a trademark of RStudio, Inc. • CC BY SA RStudio • info@rstudio.com • 844-448-1212 • rstudio.com • Learn more at forcats.tidyverse.org • Diagrams inspired by @LVaudor ! • forcats 0.3.0 • Updated: 2019-02
