IST 387 Fall 2020


True or False: You need to "library" a package each time you start a new R session.
True. A package is installed once with install.packages(), but library() must be called again in every new session.



IST 387 Fall 2020 - Details

Questions: 20
How do you declare a variable in R?
Assign a value to a name with the assignment operator <-:
my_value <- 5
my_str <- "Hello world"
my_vector <- c(5, 65, 23, 1)
names <- c("Ann", "Bob", "Clyde", "Lu")
my_df <- data.frame(names, my_vector)
my_df$names <- as.character(my_df$names)
What is a factor variable and how can you create one in R?
A factor variable is a variable that can take on a limited number of discrete values, i.e. a categorical variable. Create one with as.factor():
mtcars$gear_factor <- as.factor(mtcars$gear)
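A minimal sketch of the factor card, using only base R. The cars copy and the "automatic"/"manual" labels are illustrative assumptions, not part of the original card.

```r
# mtcars ships with R (datasets package); work on a copy so the original stays untouched.
cars <- mtcars
cars$gear_factor <- as.factor(cars$gear)

levels(cars$gear_factor)   # the distinct categories, stored as character labels
table(cars$gear_factor)    # number of cars in each category

# factor() also lets you name the categories explicitly (labels chosen for illustration).
cars$am_factor <- factor(cars$am, levels = c(0, 1), labels = c("automatic", "manual"))
levels(cars$am_factor)
```

as.factor() is enough when the existing values make good labels; factor() is the more flexible form when you want to control level order or names.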
What is the difference between a histogram and bar chart?
Histograms are used for continuous variables; bar charts are used for discrete/categorical variables.
library(ggplot2)
ggplot(data = mtcars, aes(x = mpg)) + geom_histogram()
ggplot(data = mtcars, aes(x = gear)) + geom_bar()
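The difference can also be seen in the numbers behind each chart. The sketch below uses base R only (no ggplot2) and assumes the built-in mtcars data: a histogram first groups a continuous variable into bins, while a bar chart simply counts each distinct value.

```r
# Continuous variable: hist() bins the values; plot = FALSE returns the bins without drawing.
h <- hist(mtcars$mpg, plot = FALSE)
h$breaks   # bin edges chosen by R
h$counts   # number of cars falling into each bin

# Discrete variable: table() counts each distinct value, which is what a bar chart displays.
gear_counts <- table(mtcars$gear)
gear_counts
```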
How do you create a boxplot in R?
boxplot(mpg ~ am, data = mtcars)
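To see what a boxplot actually draws, the same call can be asked to return its summary instead of a picture. This is a sketch with base R and the built-in mtcars data; the variable name b is an arbitrary choice.

```r
# plot = FALSE skips the drawing and returns the statistics behind each box.
b <- boxplot(mpg ~ am, data = mtcars, plot = FALSE)

b$names   # one box per value of am
b$n       # number of cars in each group
b$stats   # one column per box: lower whisker, lower hinge, median, upper hinge, upper whisker
```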
What are scatterplots useful for and how can you create one in R?
Scatterplots show the relationship between two numeric variables.
Method 1, using base graphics:
plot(airquality$Ozone, airquality$Wind)
Method 2, using ggplot2:
library(ggplot2)
ggplot(airquality, aes(x = Ozone, y = Wind)) + geom_point()
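A scatterplot is often paired with a correlation coefficient that summarises the same relationship in one number. The sketch below assumes the built-in airquality data; the names complete and r are illustrative.

```r
# airquality has missing Ozone readings; keep only rows where both variables are present.
complete <- airquality[!is.na(airquality$Ozone) & !is.na(airquality$Wind), ]

# Pearson correlation between the two variables plotted above.
r <- cor(complete$Ozone, complete$Wind)
r   # negative: higher wind tends to go with lower ozone
```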
What packages can be used for data mining in R?
ggplot2: visualization
tm: text mining
stats (the lm() function, included with base R): linear regression
arules: association rules mining
caret: machine learning
Name an R package which can be used for data imputation.
imputeTS (for time series data); imputeR
How do you install a package in R?
install.packages("name_of_package")
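One common pattern is to install a package only when it is missing and then load it. The sketch below is one way to do that; needs_install is a hypothetical helper name, and the install line is left commented out so the example runs without network access.

```r
# Hypothetical helper: TRUE when a package is not yet installed on this machine.
needs_install <- function(pkg) {
  !requireNamespace(pkg, quietly = TRUE)
}

needs_install("stats")   # stats ships with R, so there is nothing to install

# Typical use (commented out here):
# if (needs_install("ggplot2")) install.packages("ggplot2")
# library(ggplot2)   # loading is still required in every new session
```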
What is R?
R is an open-source language for statistical computing and data science. It can be used interactively at the command line or by running R scripts, either in its stand-alone version (base R) or through an integrated development environment (IDE) such as RStudio. RStudio is also available on the cloud - RStudio Cloud.
What is the basic syntax in R?
<- is the "assignment operator," used to create new variables and assign values to them (technically, = can be used for assignment too).
# at the beginning of a line of code marks that line as a comment (aka "commenting it out").
name_of_function() - you can identify functions in R by the parentheses following them. The function arguments, i.e. what you want to apply the function to, go inside the parentheses. For example, mean(name_of_df_column) applies the mean() function to all numbers in a dataframe column and returns a single value, the average of those numbers.
new_df <- df[df$likelihood_to_recommend == 8, ] - this is a typical way of "subsetting" a dataframe called df. Inside the square brackets, the comma separates the rows we want (before the comma) from the columns we want (after the comma). Here new_df contains all of df's columns, because nothing follows the comma, but only the rows for which the likelihood_to_recommend column in df has a value of exactly 8. You can modify this condition: e.g. change == to >, and only rows with likelihood_to_recommend values greater than 8 will be included in the new dataframe.
$ - this operator is used for "getting inside" a dataframe. E.g. df$likelihood_to_recommend accesses the likelihood_to_recommend column in the df dataframe, and df$text accesses another column in that dataframe, the column called "text."
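The pieces of syntax described in this card can be tried together on a small example. The data frame below is made up for illustration; only the column names likelihood_to_recommend and text come from the card.

```r
# Hypothetical survey data; values are invented for this sketch.
df <- data.frame(
  likelihood_to_recommend = c(8, 10, 8, 6, 9),
  text = c("ok", "great", "fine", "poor", "good"),
  stringsAsFactors = FALSE
)

avg <- mean(df$likelihood_to_recommend)          # function applied to one column

new_df  <- df[df$likelihood_to_recommend == 8, ] # rows equal to 8, all columns
high_df <- df[df$likelihood_to_recommend > 8, ]  # rows greater than 8, all columns
text_8  <- df[df$likelihood_to_recommend == 8, "text"]  # same rows, one column only
```

Leaving the position after the comma empty keeps every column; naming a column there, as in the last line, returns just that column.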
What are some of the advantages and disadvantages of R?
+ Open-source
+ Runs on all major platforms
+ Large and active R user community = ample online resources
+ Developed by statisticians specifically for data analysis
+ One of the top programming languages for data science
- Its performance depends on your machine's memory resources (in particular, your RAM)
- Because of that, it may be slower than Python for data-intensive operations
- Some of us experienced difficulties loading certain packages - package compatibility issues and conflicts between different packages (e.g. tidyverse and ggplot2) are a drawback
What are some common data types in R?
Logical (TRUE or FALSE)
Numeric (e.g. 5, 0.643, 1e+9)
Character (e.g. "a", "abc", "Hello", "This is my code")
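The type of any value can be checked with class(). A short sketch, using the example values from the card plus one integer literal (5L) added for contrast:

```r
class(TRUE)      # "logical"
class(5)         # "numeric"
class(0.643)     # "numeric"
class(1e+9)      # "numeric"
class("Hello")   # "character"
class(5L)        # "integer" - the L suffix asks for a whole-number type

# is.*() functions test for a type and return TRUE or FALSE.
is.numeric(0.643)
is.character("abc")
```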
What are some common data objects in R?
Single data values (e.g. 6, 23455, "What is this?", y)
Vectors
Data frames
Matrices
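A minimal sketch that builds one of each object listed above. All names and values are illustrative.

```r
y <- 6                                  # a single value (in R, a vector of length 1)
v <- c(5, 65, 23, 1)                    # vector: several values of the same type
m <- matrix(1:6, nrow = 2)              # matrix: 2 rows x 3 columns, one type throughout
d <- data.frame(id = 1:4, value = v)    # data frame: columns may have different types

length(v)   # number of elements
dim(m)      # rows, columns
dim(d)      # rows, columns
```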
Why is R useful for data science?
R was created specifically for statistical analysis, which makes it a strong candidate for data science work: it offers extensive functionality for data cleaning, model building and evaluation, and data visualization. There are also R packages geared specifically towards data science, such as caret.
How do you get the name of the current working directory in R?
The working directory is the folder on your computer where R looks for a file whenever you import data. For example, you can set your Downloads folder as your working directory, and then you only need to supply the name of the file you want to import (read_csv() comes from the readr package):
df <- read_csv("myFile.csv")
instead of the full path to that file:
df <- read_csv("C:\\User\\Downloads\\myFile.csv")
To see what your current working directory is, type:
getwd()
And to change it:
setwd("path\\to\\new\\working\\directory")
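The effect of the working directory can be tried safely in a temporary folder. This sketch uses base R's read.csv() rather than readr's read_csv() so that it needs no extra packages; the file contents are made up, and old_wd is an illustrative name.

```r
old_wd <- getwd()      # remember where we started
setwd(tempdir())       # switch to R's per-session temporary folder

# Write a small file into the working directory, then read it back by name only.
write.csv(data.frame(x = 1:3), "myFile.csv", row.names = FALSE)
df <- read.csv("myFile.csv")

setwd(old_wd)          # restore the original working directory
```

Restoring the original directory at the end keeps later relative paths in the same session from silently pointing at the temporary folder.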