Sunday, May 9, 2021

Educational Resources & Tech Tools 05/10/2021

      • There is a great pair of keyboard shortcuts that work together to make sure you’ve captured the important parts of your code in the editor:

        1. Press Cmd/Ctrl + Shift + F10 to restart the R session in RStudio.
        2. Press Cmd/Ctrl + Shift + S to rerun the current script.

        I use this pattern hundreds of times a week.

    • You should never use absolute paths in your scripts, because they hinder sharing: no one else will have exactly the same directory configuration as you.
    • All R statements where you create objects, assignment statements, have the same form:

       
      object_name <- value
       

      When reading that code say “object name gets value” in your head.

       

    • We recommend snake_case where you separate lowercase words with _.
    • Tidying your data means storing it in a consistent form that matches the semantics of the dataset with the way it is stored. In brief, when your data is tidy, each column is a variable, and each row is an observation.
    • A good visualisation will show you things that you did not expect, or raise new questions about the data.
    • Visualisations can surprise you, but don’t scale particularly well because they require a human to interpret them.
    • Models are complementary tools to visualisation. Once you have made your questions sufficiently precise, you can use a model to answer them.
    • By its very nature, a model cannot question its own assumptions. That means a model cannot fundamentally surprise you.
    • There’s a rough 80-20 rule at play; you can tackle about 80% of every project using the tools that you’ll learn in this book, but you’ll need other tools to tackle the remaining 20%.
      • The complement of hypothesis generation is hypothesis confirmation. Hypothesis confirmation is hard for two reasons:

         
           
        1. You need a precise mathematical model in order to generate falsifiable predictions. This often requires considerable statistical sophistication.
        2. You can only use an observation once to confirm a hypothesis.

      • Hypothesis generation and confirmation
    • Models are often used for exploration, and with a little care you can use visualisation for confirmation. The key difference is how often you look at each observation: if you look only once, it’s confirmation; if you look more than once, it’s exploration.
      • Models for exploration, visualizations for confirmation
    • The goal of data exploration is to generate many promising leads that you can later explore in more depth.
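
A few of the highlights above (assignment with <-, snake_case names, tidy data, and avoiding absolute paths) can be sketched together in base R. This is only a minimal illustration: the data values, the object names, and the "data" folder are made up for the example and do not come from the highlighted text.

```r
# Assignment: read this as "case_counts gets value".
# The name uses snake_case; the numbers are invented for illustration.
case_counts <- data.frame(
  country = c("A", "A", "B", "B"),
  year    = c(1999, 2000, 1999, 2000),
  cases   = c(10, 20, 30, 40)
)

# Tidy layout: each column is a variable and each row is one observation
# (here, one country in one year).
n_observations <- nrow(case_counts)   # 4
variable_names <- names(case_counts)  # "country" "year" "cases"

# A relative path built with file.path(); "data" is a hypothetical folder
# inside the project directory, so the script does not depend on one
# person's directory layout.
output_path <- file.path("data", "case_counts.csv")
```

Because the path is relative to the project, anyone who copies the project folder can run the script unchanged.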

Posted from Diigo. The rest of my favorite links are here.
