xtabs: cross table
The data contains three variables X, Y and Z. To tabulate the sum of Z over the categories of X and Y:
xtabs(Z ~ X + Y, data = data)
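As a small illustration of the formula interface, here is a sketch on made-up data. The data frame and its values are hypothetical; only the names X, Y and Z come from the note above.

```r
# Hypothetical toy data; X, Y, Z follow the names used in the note.
data <- data.frame(
  X = c("a", "a", "b", "b", "a"),
  Y = c("u", "v", "u", "v", "u"),
  Z = c(1, 2, 3, 4, 5)
)

# Sum Z within each combination of X and Y.
tab <- xtabs(Z ~ X + Y, data = data)
print(tab)

# With an empty left-hand side, xtabs counts rows instead of summing.
counts <- xtabs(~ X + Y, data = data)
print(counts)
```

The left-hand side of the formula decides what is accumulated in each cell, so dropping it turns the sum into a plain frequency table.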
ftable: flat table
cut2, from the Hmisc library, cuts a numeric variable into intervals. With g it returns the requested number of quantile groups as a factor.
groups <- cut2(data$var, g = number_of_groups)
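A minimal sketch of cut2 on a plain numeric vector, assuming the Hmisc package is installed. The vector x is hypothetical, and the exact interval labels depend on the data, so none are shown here.

```r
# Assumes Hmisc is installed: install.packages("Hmisc")
library(Hmisc)

# Hypothetical numeric vector to be grouped.
x <- c(1, 3, 5, 7, 9, 11, 13, 15)

# Ask for four quantile groups; the result is a factor of interval labels.
groups <- cut2(x, g = 4)

# Count how many observations fall in each interval.
print(table(groups))
```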
The dplyr library provides a set of functions for working with a data frame much like a database table.
df <- tbl_df(origin_df) # deprecated in recent dplyr versions; as_tibble(origin_df) is the replacement
select(df, var2:var4)
select(df, -(var2:var4)) # drop columns var2 through var4
filter(df, var1==1)
filter(df, a <= "3" | b == "IN") # | means OR
filter(df, !is.na(var1)) # keep rows where var1 is not missing
arrange(df, var1)
arrange(df,desc(var1))
mutate(df, new_var1=var1+var2, new_var2=new_var1^2)
summarize(df, avg_var=mean(var))
summarize(df,sd_var=sd(var))
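The verbs above can be tried on a small data frame. This is a sketch under the assumption that dplyr is installed; the data values are made up, and the column names simply mirror the placeholders used in the notes.

```r
library(dplyr)

# Hypothetical toy data frame.
df <- data.frame(
  var1 = c(1, 2, 1, NA),
  var2 = c(10, 20, 30, 40),
  var3 = c("a", "b", "c", "d"),
  var4 = c(0.5, 1.5, 2.5, 3.5)
)

cols   <- select(df, var2:var4)        # keep columns var2, var3, var4
kept   <- filter(df, !is.na(var1))     # drop the row where var1 is missing
sorted <- arrange(kept, desc(var2))    # largest var2 first
added  <- mutate(kept,
                 new_var1 = var1 + var2,
                 new_var2 = new_var1^2) # new_var1 is usable immediately
stats  <- summarize(kept,
                    avg_var2 = mean(var2),
                    sd_var2  = sd(var2))
print(stats)
```

Each verb returns a new data frame and leaves its input unchanged, which is why the intermediate results are assigned to separate names here.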
group_by defines groups within a data frame, so that a following summarize is computed once per group.
by_var <- group_by(data, var)
summarize(by_var, avg_var2= mean(var2))
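To make the per-group behaviour concrete, here is a sketch on hypothetical data, assuming dplyr is installed. The extra count column uses dplyr's n() helper, which is not mentioned in the notes.

```r
library(dplyr)

# Hypothetical data: two groups, "a" and "b".
data <- data.frame(
  var  = c("a", "a", "b", "b", "b"),
  var2 = c(1, 3, 10, 20, 30)
)

by_var <- group_by(data, var)

# One output row per group: the mean of var2 and the group size.
result <- summarize(by_var, avg_var2 = mean(var2), n = n())
print(result)
```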
arrange(data, desc(var))
desc represents descending order.
We can also use the pipe operator %>% to chain these steps into a single data flow.
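A short sketch of the same kind of workflow written with %>%, assuming dplyr is installed. The data frame is hypothetical. Each step receives the result of the previous one as its first argument.

```r
library(dplyr)

# Hypothetical data with one missing value.
data <- data.frame(
  var  = c("a", "a", "b", "b", "b"),
  var2 = c(1, 3, 10, NA, 30)
)

result <- data %>%
  filter(!is.na(var2)) %>%             # drop missing values first
  group_by(var) %>%                    # then group
  summarize(avg_var2 = mean(var2)) %>% # one mean per group
  arrange(desc(avg_var2))              # largest mean first
print(result)
```

Written this way, the steps read top to bottom in the order they run, without intermediate variables.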
tidyr is a library for tidying datasets.
gather: collapse several columns into key-value pairs
separate: split one column into several
$ selects a column of a data frame by name.
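A minimal sketch of these three on made-up data, assuming tidyr is installed. The column names (id, score_math, score_art) are hypothetical. Newer tidyr releases recommend pivot_longer over gather, but gather is still available.

```r
library(tidyr)

# Hypothetical wide data: one column per subject.
wide <- data.frame(
  id         = c(1, 2),
  score_math = c(90, 80),
  score_art  = c(70, 60)
)

# gather: stack the two score columns into key/value pairs.
long <- gather(wide, key = "key", value = "value", score_math, score_art)

# separate: split "score_math" into "score" and "math" at the underscore.
tidy <- separate(long, key, into = c("measure", "subject"), sep = "_")

# $ pulls out a single column by name.
print(tidy$subject)
```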