Learning SQL FTW
I’ve been handling my data in R solely. Since sql is highly recommended by experts I followed on twitter, I think it’s worth a shot.
So I went to datacamp and started the intro tutorial and finished in about 4 hrs.
Notes
After learning about the basic grammars. It seems that dplyr offers all the features for data manipulation.
Here is what I learned:
- A database is a list of data.frames
- basic usage
- construct a query to the database
- return as a data.frame
- manipulations equivalent to dplyr functions:
SELECT XXX from YYY
=YYY %>% select(XXX)
WHERE condition
=%>% filter(condition)
SELECT operation(XXX) FROM YYY GROUP BY ZZZ
=YYY %>% group_by(ZZZ) %>% summarise(operation(XXX))
SELECT op(XXX) AS NNN
=mutate(NNN = op(XXX))
- construct a query to the database
- strangely the tutorial in datacamp put join in another full tutorial… don’t know why.
The full grammar is sth like this: pretty simple and straightforward
SELECT column
FROM table
WHERE condition
GROUP BY group condition
HAVING grouped condition
Summary
At least next time I saw some sql query, I could understand what’s going on.