Handling categorical variables in R

  • Thread starter fog37
In summary, in R, nominal categorical variables must be converted into factors and then to dummy variables before using them in a statistical model. The lm() function in R automatically does this conversion, but it may not apply to other models. Python does not have factors, so the intermediate "factor" step does not apply. It is possible to convert categorical variables directly to dummy variables in R without the factor step, but this may limit the ability to choose different contrasts.
  • #1
fog37
Hello R users,

My general understanding is that, in R, nominal categorical variables (with 2 or more levels) must be first converted into factors and THEN to dummy variables (k-1 dummy variables for k levels). Is that correct?
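
To make the "factor step" concrete, here is a minimal sketch of going from a character variable to a factor and then to dummy columns. The variable `color` and its values are made up for illustration; `factor()` and `model.matrix()` are base R functions, and the result assumes R's default treatment contrasts.

```r
# Hypothetical nominal variable with k = 3 levels (illustrative data only)
color <- c("red", "green", "blue", "green", "red", "blue")

# Step 1: convert the character vector to a factor
color_f <- factor(color)
levels(color_f)   # levels are sorted alphabetically by default

# Step 2: expand the factor into dummy (indicator) columns.
# With an intercept and the default treatment contrasts, model.matrix()
# produces k - 1 = 2 dummy columns; the first level is the reference.
X <- model.matrix(~ color_f)
colnames(X)
```

The reference level ("blue" here) gets no column of its own; it is absorbed into the intercept.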

Once we have gone from categorical variable -> factor -> dummy variables, we can use the dummy variables as independent or dependent variables in a statistical model. (P.S.: the function ##lm()## does the dummy variable conversion automatically, but I am not sure that is true for other models.)
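
As a rough check of the automatic conversion, the sketch below fits the same model twice: once with a raw character column and once with an explicit factor. The data frame `d` and its column names are hypothetical. `glm()` is included only as one example of another base R function that uses the same formula interface; it is not a claim about every modelling function.

```r
# Hypothetical data: a numeric response and a character-valued group
d <- data.frame(
  y     = c(5.1, 6.0, 7.2, 5.3, 6.1, 7.0),
  group = c("a", "b", "c", "a", "b", "c"),
  stringsAsFactors = FALSE
)

# lm() is given the raw character column; the formula machinery
# treats it as a factor and builds the dummy columns itself
fit_chr <- lm(y ~ group, data = d)

# Same model with an explicit factor() conversion first
d$group_f <- factor(d$group)
fit_fac <- lm(y ~ group_f, data = d)

# The design matrix lm() built has k - 1 = 2 dummy columns plus the intercept
colnames(model.matrix(fit_chr))

# glm() goes through the same formula interface, so it expands
# the variable in the same way
fit_glm <- glm(y ~ group, data = d, family = gaussian())
```

Functions that expect a numeric matrix instead of a formula do not do this expansion, so for those the dummy columns have to be built beforehand, for example with `model.matrix()`.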

What if we converted the categorical variable to dummy variables without the intermediate factor step? Would that still work in R?
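
For reference, a sketch of what skipping the factor step could look like: the dummy columns are built by hand as plain 0/1 numbers and passed to ##lm()## directly. The data and the column names `group_b` and `group_c` are hypothetical, and "a" is arbitrarily chosen as the reference level.

```r
# Hypothetical data with a three-level character variable
d <- data.frame(
  y     = c(5.1, 6.0, 7.2, 5.3, 6.1, 7.0),
  group = c("a", "b", "c", "a", "b", "c"),
  stringsAsFactors = FALSE
)

# Build k - 1 = 2 dummy columns by hand, with "a" as the reference level
d$group_b <- as.numeric(d$group == "b")
d$group_c <- as.numeric(d$group == "c")

# Fit using the hand-made numeric dummies; no factor is involved
fit_manual <- lm(y ~ group_b + group_c, data = d)

# Fit using a factor and let lm() create the dummies
fit_factor <- lm(y ~ factor(group), data = d)

# Compare the estimated coefficients of the two fits
all.equal(unname(coef(fit_manual)), unname(coef(fit_factor)))
```

The two fits differ only in how the coefficients are named, because the hand-made dummies reproduce the default treatment coding.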

Python does not have factors so that intermediate "factor" step does not apply...

Thanks!
 
  • #2
Can you give a code example? I'm not sure what the factor step is but seeing what's actually called might help.
 
  • #3
fog37 said:
What if we converted the categorical variable to dummy variables without the intermediate factor step? Would that still work in R?
I have never tried this, but from my experience I would think that yes, you could do that. You would lose the ability to choose different contrasts, since your hand-made dummy variables would themselves fix the coding. But I don't see why it wouldn't work.
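
To illustrate the point about contrasts, here is a small sketch of what a factor lets you switch and hand-made dummies do not. The factor `g` is hypothetical; `contr.treatment` and `contr.sum` are two of the contrast functions that ship with base R.

```r
# Hypothetical three-level factor
g <- factor(c("a", "b", "c", "a", "b", "c"))

# Treatment (dummy) coding: 0/1 columns, first level is the reference
X_treat <- model.matrix(~ g, contrasts.arg = list(g = "contr.treatment"))

# Sum-to-zero coding: same factor, different columns, no re-coding by hand
X_sum <- model.matrix(~ g, contrasts.arg = list(g = "contr.sum"))

# The contrast can also be attached to the factor itself,
# after which model.matrix() picks it up without contrasts.arg
contrasts(g) <- contr.sum(3)
X_attr <- model.matrix(~ g)

# First three rows, one per level, under each coding
X_treat[1:3, ]
X_sum[1:3, ]
```

With hand-made 0/1 columns, switching from one coding to another means rebuilding the columns yourself.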
 
