Learning SQL FTW

I’ve been handling my data in R solely. Since sql is highly recommended by experts I followed on twitter, I think it’s worth a shot.

So I went to datacamp and started the intro tutorial and finished in about 4 hrs.


Notes

After learning about the basic grammars. It seems that dplyr offers all the features for data manipulation.

Here is what I learned:

  • A database is a list of data.frames
  • basic usage
    • construct a query to the database
    • return as a data.frame
    • manipulations equivalent to dplyr functions:
      • SELECT XXX from YYY = YYY %>% select(XXX)
      • WHERE condition = %>% filter(condition)
      • SELECT operation(XXX) FROM YYY GROUP BY ZZZ = YYY %>% group_by(ZZZ) %>% summarise(operation(XXX))
      • SELECT op(XXX) AS NNN = mutate(NNN = op(XXX))
  • strangely the tutorial in datacamp put join in another full tutorial… don’t know why.

The full grammar is sth like this: pretty simple and straightforward

SELECT column
FROM table
WHERE condition
GROUP BY group condition
HAVING grouped condition

Summary

At least next time I saw some sql query, I could understand what’s going on.