• r

Using SQL in R: sqldf

Use sqldf! Data must be in a data frame first before using sqldf. Sqldf does not accept “.” in column names, because the “.” is used to join data frames, so be sure to rename columns to get around this. In addition to joining data frames, sqldf can select columns, rename columns, add columns, and update columns. Will be adding more to this as I continue using sqldf.

library(sqldf)

Update column based on condition.

df<-sqldf(c("update df
	set CatScore=16
	where CakesEaten='TooMany'",
	"select * from df"))

Update column based on multiple conditions.

df<-sqldf(c(
	"update df
	set CatStatus='SuperCat'
	where CakesEaten in ('TooMany','WayTooMuchCake','CakeOverload','OhMyGodMoarCake')
	and TimeToFinish<=650",
	"select * from df"))

Select columns conditionally, create new sum column. Be sure to group by factors.

data<-sqldf(
	"select weight,meows,purrs,sum(cats) as cats
	from data
	where meows in ('cake','icecream','fudge')
	group by weight,meows,purrs")

or, if the reference list is in a data frame:

list<-as.data.frame(c('cake','icecream','fudge'))
colnames(list)<-"yums"
data<-sqldf(
	"select weight,meows,purrs,sum(cats) as cats
	from data
	where meows in (select yums from 'list')
	group by weight,meows,purrs")

Create new column and assign value conditionally.

df<-sqldf("select *,
	case when chocolate like '%dark%' then 1
	else 0
	end as ChocoType
	from df")