If you use these functions, please star the repo, or cite via DOI. Thanks!

CodeAndRoll

CodeAndRoll is a collection of custom R functions. It works with MarkdownReports and SeuratUtils, but also as a standalone set of more than 200 productivity tools. Many of my other repos/libraries may depend on these functions. Source: own work + web (source referenced in description and/or source code). Intended for my personal use, shared because others may find (parts of) it useful.

News

CodeAndRoll (v1) repository is decommissioned.

Use the packages below:

Package Reorganisation Diagram

Install

1.) Download CodeAndRoll.R, save it as a local .R file, and run source("~/path/to/CodeAndRoll.R").

2.) Directly source from the web:

source("https://raw.githubusercontent.com/vertesy/CodeAndRoll/master/CodeAndRoll.R")
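If you would rather not add 200+ functions to your global workspace, one option is to source the script into its own environment. This is a minimal sketch using base R's `sys.source()`; the environment name `car` and the one-function stand-in script are placeholders (assumptions for illustration), written to a temporary file so the example runs without a download:

```r
# Stand-in for a downloaded CodeAndRoll.R (hypothetical one-function script).
script_path <- file.path(tempdir(), "CodeAndRoll_demo.R")
writeLines('hello <- function() "sourced"', script_path)

# Source into a dedicated environment instead of the global workspace.
car <- new.env()
sys.source(script_path, envir = car)

car$hello()   # call a function from that environment
ls(car)       # list everything the script defined
```

With the real file, replace `script_path` with the path to your local copy of CodeAndRoll.R.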

Troubleshooting

If you encounter a bug, or something doesn’t work or is unclear, please let me know by raising an issue on CodeAndRoll. Please check first whether it has already been asked.

Usage

After source("~/path/to/CodeAndRoll.R") you can use any of the functions listed below. Some of the functions have a minimal example in the .R script, just below the function's definition.

Chapters

The script is roughly organised in the following sections / categories:

File handling, export, import [read & write]
Clipboard interaction (OS X)
Reading files in
Writing files out
Vector operations
Vector filtering
Matrix operations
List operations
Set operations
Math and stats
String operations
Plotting and Graphics
Read and write plotting functions READ
Generic
Plots
New additions

List of Functions

Note that this library is under continuous development. Thus some functions listed here may no longer be in CodeAndRoll.R, and vice versa, new functions in CodeAndRoll.R may not be listed here. Backward compatibility is usually, but not always, taken care of. See other files in the repo if you are missing a function.

String operations

ppp():

Paste by point
pps():

Paste by (forward) slash
ppu():

Paste by underscore
ppd():

Paste by dash
kpp():

kollapse by point
kppu():

kollapse by underscore
kppd():

kollapse by dash
stry():

Silent try
say():

Use system voice to notify (after a long task is done)
sayy():

Use system voice to notify (after a long task is done)
grepv():

grep returning the value.
unload():

Unload a package. Source: stackoverflow.
clip2clip.vector():

Copy from the clipboard (e.g. Excel), convert to an R-formatted vector, and copy the result back to the clipboard.
clip2clip.commaSepString():

read a comma separated string (e.g. list of gene names) and properly format it for R.
read.simple.vec():

Read each line of a file to an element of a vector (read in new-line separated values, no header!).
read.simple():

It is essentially read.table() with file/path parsing.
read.simple_char_list():

Read in a file.
read.simple.table():

Read in a file. default: header defines colnames, no rownames. For rownames give the col nr. with rownames, eg. 1 The header should start with a TAB / First column name should be empty.
FirstCol2RowNames():

Set First Col to Row Names
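The paste helpers in this section read as thin wrappers around base R's `paste()`, and `grepv()` as `grep()` with `value = TRUE`. The following is a sketch of that idea, not the CodeAndRoll source: the `_sketch` names are hypothetical, and the argument handling in CodeAndRoll.R may differ.

```r
# Hypothetical stand-ins illustrating "paste by" vs. "kollapse by".
ppp_sketch <- function(...) paste(..., sep = ".")          # paste by point
ppu_sketch <- function(...) paste(..., sep = "_")          # paste by underscore
kpp_sketch <- function(...) paste(c(...), collapse = ".")  # kollapse by point

# grep() that returns the matching values instead of their indices.
grepv_sketch <- function(pattern, x, ...) grep(pattern, x, value = TRUE, ...)

ppp_sketch("sample", 1:2)                           # "sample.1" "sample.2"
kpp_sketch("sample", 1:2)                           # "sample.1.2"
grepv_sketch("^Hox", c("Hoxa1", "Sox2", "Hoxb4"))   # "Hoxa1" "Hoxb4"
```

The difference to note: "paste by" is vectorised and returns one string per element, while "kollapse by" always returns a single string.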

File handling, export, import

read.simple.tsv():

Read in a file with excel style data: rownames in col1, headers SHIFTED. The header should start with a TAB / First column name should be empty.
read.simple.csv():

Read in a file with excel style data: rownames in col1, headers SHIFTED. The header should start with a TAB / First column name should be empty.
read.simple.ssv():

Space separated values. Read in a file with excel style data: rownames in col1, headers SHIFTED. The header should start with a TAB / First column name should be empty.
read.simple.tsv.named.vector():

Read in a file with excel style named vectors, names in col1, headers SHIFTED. The header should start with a TAB / First column name should be empty.
convert.tsv.data():

Fix NA issue in dataframes imported by the new read.simple.tsv. Set na_rep to NA if you want to keep NA-s
read.simple.xls():

Read multi-sheet excel files. row_namePos = NULL for automatic names
write.simple():

Write out a matrix-like R-object to a file as tab separated values (.tsv). Your output filename will be the variable’s name. The output file will be located in “OutDir” specified by you at the beginning of the script, or under your current working directory. You can pass the PATH and VARIABLE separately (in order); they will be concatenated to the filename.
write.simple.vec():

Write out a vector-like R-object to a file as newline separated values (.vec). Your output filename will be the variable’s name. The output file will be located in “OutDir” specified by you at the beginning of the script, or under your current working directory. You can pass the PATH and VARIABLE separately (in order); they will be concatenated to the filename.
write.simple.xlsx():

Write out a list of matrices / data frames WITH ROW- AND COLUMN- NAMES to an Excel (.xlsx) file. Your output filename will be the variable’s name. The output file will be located in “OutDir” specified by you at the beginning of the script, or under your current working directory. You can pass the PATH and VARIABLE separately (in order); they will be concatenated to the filename.
write.simple.append():

Append an R-object WITHOUT ROWNAMES to an existing .tsv file with the same number of columns. Your output filename will be the variable’s name. The output file will be located in “OutDir” specified by you at the beginning of the script, or under your current working directory. You can pass the PATH and VARIABLE separately (in order); they will be concatenated to the filename.
sstrsplit():

Alias for str_split_fixed in the stringr package
topN.dfCol():

Find the n highest values in a named vector
bottomN.dfCol():

Find the n lowest values in a named vector
as.named.vector():

Convert a dataframe column or row into a vector, keeping the corresponding dimension name.
col2named.vector():

Convert a dataframe column into a vector, keeping the corresponding dimension name.
row2named.vector():

Convert a dataframe row into a vector, keeping the corresponding dimension name.
as.numeric.wNames():

Converts any vector into a numeric vector, and puts the original character values into the names of the new vector, unless it already has names. Useful for coloring a plot by categories, name-tags, etc.
as.numeric.wNames.old():

Converts any vector into a numeric vector, and puts the original character values into the names of the new vector, unless it already has names. Useful for coloring a plot by categories, name-tags, etc.
as.character.wNames():

Converts your input vector into a character vector, and puts the original character values into the names of the new vector, unless it already has names.
rescale():

linear transformation to a given range of values
flip_value2name():

Flip the values and the names of a vector with names
sortbyitsnames():

Sort a vector by the alphanumeric order of its names (instead of its values).
any.duplicated():

How many entries are duplicated
which.duplicated():

orig =rownames(sc@expdata)
which.NA():

orig =rownames(sc@expdata)
which_names():

Return the names where the input vector is TRUE. The input vector is converted to logical.
which_names_grep():

Return the vector elements whose names are partially matched
na.omit.mat():

Omit rows with NA values from a matrix. Rows with any, or full of NA-s
inf.omit():

Omit infinite values from a vector.
zero.omit():

Omit zero values from a vector.
pc_TRUE():

Percentage of true values in a logical vector, parsed as text (useful for reports.)
NrAndPc():

Summary stat. text formatting for logical vectors (%, length)
pc_in_total_of_match():

Percentage of a certain value within a vector or table.
filter_survival_length():

Parse a sentence reporting the % of filter survival.
remove_outliers():

Remove values that fall outside the trailing N % of the distribution.
simplify_categories():

Replace every entry that is found in “replaceit”, by a single value provided by “to”
rotate():

rotate a matrix 90 degrees.
sortEachColumn():

Sort each column of a numeric matrix / data frame.
rowMedians():

Calculates the median of each row of a numeric matrix / data frame.
colMedians():

Calculates the median of each column of a numeric matrix / data frame.
rowGeoMeans():

Calculates the geometric mean of each row of a numeric matrix / data frame.
colGeoMeans():

Calculates the geometric mean of each column of a numeric matrix / data frame.
rowCV():

Calculates the CV of each ROW of a numeric matrix / data frame.
colCV():

Calculates the CV of each column of a numeric matrix / data frame.
rowVariance():

Calculates the variance of each ROW of a numeric matrix / data frame.
colVariance():

Calculates the variance of each column of a numeric matrix / data frame.
rowMin():

Calculates the minimum of each row of a numeric matrix / data frame.
colMin():

Calculates the minimum of each column of a numeric matrix / data frame.
rowMax():

Calculates the maximum of each row of a numeric matrix / data frame.
colMax():

Calculates the maximum of each column of a numeric matrix / data frame.
rowSEM():

Calculates the SEM of each row of a numeric matrix / data frame.
colSEM():

Calculates the SEM of each column of a numeric matrix / data frame.
rowSD():

Calculates the standard deviation (SD) of each row of a numeric matrix / data frame.
colSD():

Calculates the standard deviation (SD) of each column of a numeric matrix / data frame.
rowIQR():

Calculates the interquartile range (IQR) of each row of a numeric matrix / data frame.
colIQR():

Calculates the interquartile range (IQR) of each column of a numeric matrix / data frame.
rowquantile():

Calculates the quantiles of each row of a numeric matrix / data frame.
colquantile():

Calculates the quantiles of each column of a numeric matrix / data frame.
row.Zscore():

Calculate Z-score over rows of data frame.
rowACF():

RETURNS A LIST. Calculates the autocorrelation of each row of a numeric matrix / data frame.
colACF():

RETURNS A LIST. Calculates the autocorrelation of each column of a numeric matrix / data frame.
acf.exactLag():

Autocorrelation with exact lag
rowACF.exactLag():

RETURNS A Vector for the “lag” based autocorrelation. Calculates the autocorrelation of each row of a numeric matrix / data frame.
colACF.exactLag():

RETURNS A Vector for the “lag” based autocorrelation. Calculates the autocorrelation of each column of a numeric matrix / data frame.
colDivide():

divide by column
rowDivide():

divide by row
sort.mat():

Sort a matrix. ALTERNATIVE: dd[with(dd, order(-z, b)), ]. Source: stackoverflow.
rowNameMatrix():

Create a copy of your matrix, where every entry is replaced by the corresponding row name. Useful if you want to color by row name in a plot (where you have different number of NA-values in each row).
colNameMatrix():

Create a copy of your matrix, where every entry is replaced by the corresponding column name. Useful if you want to color by column name in a plot (where you have different number of NA-values in each column).
colsplit():

split a data frame by a factor corresponding to columns.
rowsplit():

split a data frame by a factor corresponding to rows.
TPM_normalize():

normalize each column to 1 million
median_normalize():

normalize each column to the median of all the column-sums
mean_normalize():

normalize each column to the mean of all the column-sums
rownames.trimws():

trim whitespaces from the rownames
select.rows.and.columns():

Subset rows and columns. It checks if the selected dimension names exist and reports if any of them aren’t found.
getRows():

Get the subset of rows with existing rownames, report how many it could not find.
getCols():

Get the subset of cols with existing colnames, report how many it could not find.
get.oddoreven():

Get odd or even columns or rows of a data frame
combine.matrices.intersect():

combine matrices by rownames intersect
merge_dfs_by_rn():

Merge any data frames by rownames. Requires the plyr package.
merge_numeric_df_by_rn():

Merge 2 numeric data frames by rownames
attach_w_rownames():

Take a data frame (of e.g. metadata) from your memory space, split it into vectors so you can directly use them. E.g.: Instead of metadata$color[blabla] use color[blabla]
panel.cor.pearson():

A function to display correlation values for pairs() function. Default is pearson correlation, that can be set to “kendall” or “spearman”.
panel.cor.spearman():

A function to display correlation values for pairs() function. Default is pearson correlation, that can be set to “kendall” or “spearman”.
remove.na.rows():

cols have to be a vector of numbers corresponding to columns
remove.na.cols():

cols have to be a vector of numbers corresponding to columns
intersect.ls():

Intersect any number of lists.
union.ls():

Union of any number of list elements. Faster than reduce.
unlapply():

lapply, then unlist
list.wNames():

create a list with names from ALL variables you pass on to the function
as.list.df.by.row():

Split a dataframe into a list by its rows. omit.empty for the list elements; na.omit and zero.omit are applied on entries inside each list element.
as.list.df.by.col():

Split a dataframe into a list by its columns. omit.empty for the list elements; na.omit and zero.omit are applied on entries inside each list element.
reorder.list():

reorder elements of lists in your custom order of names / indices.
range.list():

range of values in whole list
intermingle2lists():

Combine 2 lists (of the same length) so that they form every odd and every even element of a unified list. Useful for side-by-side comparisons, e.g. in wstripchart_list().
as.listalike():

convert a vector to a list with certain dimensions, taken from the list it should resemble
list2fullDF.byNames():

Convert a list to a full matrix. Rows = names(union.ls(your_list)), i.e. all names within the list elements; columns = names(your_list).
list2fullDF.presence():

Convert a list to a full matrix. Designed for occurrence counting, think of table(). Rows = all ENTRIES within your list, columns = names(your_list).
splitbyitsnames():

split a list by its names
splititsnames_byValues():

split a list by its names
intermingle2vec():

Combine 2 vectors (of the same length) so that they form every odd and every even element of a unified vector.
intermingle.cbind():

Combine 2 data frames (of the same length) so that they form every odd and every even element of a unified list. Useful for side-by-side comparisons, e.g. in wstripchart_list().
pad.na():

Fill up a vector to a given length with NA-values at the end.
clip.values():

Signal clipping. Cut values above or below a threshold.
clip.outliers():

Signal clipping based on the input data’s distribution. It clips values above or below the extreme N% of the distribution.
ls2categvec():

Convert a list to a vector repeating list-element names, while vector names are the list elements
symdiff():

Quasi symmetric difference of any number of vectors
sem():

Calculates the standard error of the mean (SEM) for a numeric vector (it excludes NA-s by default)
fano():

Calculates the fano factor on a numeric vector (it excludes NA-s by default)
geomean():

Calculates the geometric mean of a numeric vector (it excludes NA-s by default)
mean_of_log():

Calculates the mean of the log_k of a numeric vector (it excludes NA-s by default)
movingAve():

Calculates the moving / rolling average of a numeric vector.
movingAve2():
movingSEM():

Calculates the moving / rolling standard error of the mean (SEM) on a numeric vector.
imovingSEM():

Calculates the moving / rolling standard error of the mean (SEM). It calculates it to the edge of the vector with incrementally smaller window-size.
eval_parse_kollapse():

evaluate and parse (dyn_var_caller)
lookup():

Awesome pattern matching for a set of values in another set of values. Returns a list with all kinds of results.
richColors():

Alias for rich.colors in gplots
Color_Check():

Display the colors encoded by the numbers / color-ID-s you pass on to this function
colSums.barplot():

Draw a barplot from ColSums of a matrix.
lm_equation_formatter():

Renders the lm() function’s output into a human readable text. (e.g. for subtitles)
lm_equation_formatter2():

Renders the lm() function’s output into a human readable text. (e.g. for subtitles)
lm_equation_formatter3():

Renders the lm() function’s output into a human readable text. (e.g. for subtitles)
hist.XbyY():

Split one variable by another. Calculates equal bins in splitby, and returns a list of the corresponding values in toSplit.
flag.name_value():

Returns the name and its value, if it's not FALSE.
flag.nameiftrue():

Returns the name and its value, if it's TRUE.
flag.names_list():

Returns the name and value of each element in a list of parameters.
param.list.flag():

Returns the name and value of each element in a list of parameters.
quantile_breaks():

Quantile breakpoints in any data vector. Source: slowkow.com.
vec.fromNames():

create a vector from a vector of names
list.fromNames():

create list from a vector with the names of the elements
matrix.fromNames():

Create a matrix from 2 vectors defining the row- and column names of the matrix. Default fill value: NA.
matrix.fromVector():

Create a matrix from values in a vector repeated for each column / each row. Similar to rowNameMatrix and colNameMatrix.
array.fromNames():

create an N-dimensional array from N vectors defining the row-, column, etc names of the array
what():

A better version of is(). It can print the first “printme” elements.
idim():

A dim() function that can handle if you pass on a vector: then, it gives the length.
idimnames():

A dimnames() function that can handle if you pass on a vector: it gives back the names.
table_fixed_categories():

generate a table() with a fixed set of categories. It fills up the table with missing categories, that are relevant when comparing to other vectors.
stopif2():

Stop script if the condition is met. You can parse anything (e.g. variables) in the message
most_frequent_elements():

Show the most frequent elements of a table
top_indices():

Returns the position / index of the n highest values. For equal values, it maintains the original order
percentile2value():

Calculate the actual value of the N-th percentile in a distribution or set of numbers. Useful for calculating cutoffs, and displaying them by whist()’s “vline” parameter.
MaxN():

find second (third…) highest/lowest value in vector
hclust.getOrder.row():

Extract ROW order from a pheatmap object.
hclust.getOrder.col():

Extract COLUMN order from a pheatmap object.
hclust.getClusterID.row():

Extract cluster ID’s for ROWS of a pheatmap object.
hclust.getClusterID.col():

Extract cluster ID’s for COLUMNS of a pheatmap object.
hclust.ClusterSeparatingLines.row():

Calculate the position of ROW separating lines between clusters in a pheatmap object.
hclust.ClusterSeparatingLines.col():

Calculate the position of COLUMN separating lines between clusters in a pheatmap object.
Gap.Postions.calc.pheatmap():

calculate gap positions for pheatmap, based on a sorted annotation vector of categories
matlabColors.pheatmap():

Create a Matlab-like color gradient using “colorRamps”.
annot_col.create.pheatmap.vec():

For VECTORS. Auxiliary function for pheatmap. Prepares the 2 variables needed for “annotation_col” and “annotation_colors” in pheatmap
annot_col.create.pheatmap.df():

For data frames. Auxiliary function for pheatmap. Prepares the 2 variables needed for “annotation_col” and “annotation_colors” in pheatmap
annot_col.fix.numeric():

fix class and color annotation in pheatmap annotation data frames and lists.
annot_row.create.pheatmap.df():

For data frames. Auxiliary function for pheatmap. Prepares the 2 variables needed for “annotation_col” and “annotation_colors” in pheatmap
wPairConnector():

Connect Pairs of datapoints with a line on a plot.
numerate():

numerate from x to y with additional zero-padding
printEveryN():

Report at every e.g. 1000
zigzagger():

mix entries so that they differ
irequire():

Load a package. If it does not exist, try to install it from CRAN.
IfExistsAndTrue():

Internal function. Checks if a variable is defined, and its value is TRUE.
filter_InCircle():

Find points in/out-side of a circle.
cumsubtract():

Cumulative subtraction, opposite of cumsum()
trail():

A combination of head() and tail() to see both ends.
sort.decreasing():

Sort in decreasing order.
list.2.replicated.name.vec():

Convert a list to a vector, with each list element's name replicated as many times as that element has entries.
idate():

Parse current date, dot separated.
view.head():

view the head of an object by console.
view.head2():

view the head of an object by View().
iidentical.names():

Test whether the names of two objects are exactly equal.
iidentical():

Test whether two objects are exactly equal.
iidentical.all():

Test whether two objects are exactly equal.
parsepvalue():

Parse p-value from a number to a string.
shannon.entropy():

Calculate shannon entropy
id2titlecaseitalic():

Convert a gene ID to title case italic
id2titlecaseitalic.sp():

Convert a gene ID to italic
id2name():

Convert a gene ID to a gene name (symbol). From / for RaceID.
id2chr():

Convert a gene ID to the chromosome. From / for RaceID.
name2id():

Convert a name to gene ID. From / for RaceID.
name2id.toClipboard():

Convert a name to gene ID, and copy to clipboard. From / for RaceID.
name2id.fast():

Convert a name to gene ID. From / for RaceID.
legend.col():

Legend color. Source: aurelienmadouasse.wordpress.com.
copy.dimension.and.dimnames():

copy dimension and dimnames
mdlapply():

lapply for multidimensional arrays
arr.of.lists.2.df():

simplify 2D-list-array to a DF
mdlapply2df():

multi dimensional lapply + arr.of.lists.2.df (simplify 2D-list-array to a DF)
memory.biggest.objects():

Show distribution of the largest objects and return their names
na.omit.strip():

Calls na.omit() and returns a clean vector
md.LinkTable():

Take a dataframe where every entry is a string containing an html link, parse and write out
link_google():

Parse google search query links to your list of gene symbols. Strings “prefix” and “suffix” will be searched for together with each gene (“Human ID4 neurons”). See many additional services in DatabaseLinke.R.
link_bing():

Parse bing search query links to your list of gene symbols. Strings “prefix” and “suffix” will be searched for together with each gene (“Human ID4 neurons”). See many additional services in DatabaseLinke.R.
val2col():

This function converts a vector of values(“yourdata”) to a vector of color levels. One must define the number of colors. The limits of the color scale(“zlim”) or the break points for the color changes(“breaks”) can also be defined. When breaks and zlim are defined, breaks overrides zlim.
as.logical.wNames():

Converts your input vector into a logical vector, and puts the original character values into the names of the new vector, unless it already has names.
iterBy.over():

Iterate over a vector by every N-th element.
sourcePartial():

Source parts of another script. Source: stackoverflow.
oo():

Open current working directory.
jjpegA4():

Setup an A4 size jpeg
param.list.2.fname():

Take a list of parameters and parse a string from their names and values.
GC_content():

GC-content of a string (frequency of G and C letters among all letters).
eucl.dist.pairwise():

Calculate pairwise euclidean distance
sign.dist.pairwise():

Calculate absolute value of the pairwise euclidean distance
reverse.list.hierarchy():

reverse list hierarchy
extPDF():

add pdf as extension to a file name
extPNG():

add png as extension to a file name
col2named.vec.tbl():

Convert a 2-column table (data frame) into a named vector. 1st column will be used as names.
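
Some of the math and stats helpers in this list (sem(), fano(), geomean()) correspond to standard formulas. Below is a base-R sketch under the usual textbook definitions (SEM = sd / sqrt(n), geometric mean = exp(mean(log(x))), Fano factor = variance / mean). The `_sketch` names and the `na.rm = TRUE` default are assumptions based on the descriptions, not the CodeAndRoll source:

```r
# Standard error of the mean; NA-s are dropped before counting n.
sem_sketch <- function(x, na.rm = TRUE) {
  if (na.rm) x <- x[!is.na(x)]
  sd(x) / sqrt(length(x))
}

# Geometric mean via the mean of logs (defined for positive values).
geomean_sketch <- function(x, na.rm = TRUE) exp(mean(log(x), na.rm = na.rm))

# Fano factor: variance divided by mean.
fano_sketch <- function(x, na.rm = TRUE) {
  var(x, na.rm = na.rm) / mean(x, na.rm = na.rm)
}

# Row-wise versions follow the same pattern via apply().
row_sem_sketch <- function(mat) apply(mat, 1, sem_sketch)

x <- c(1, 2, 4, 8, NA)
sem_sketch(x)
geomean_sketch(x)
fano_sketch(x)
row_sem_sketch(matrix(1:6, nrow = 2, byrow = TRUE))
```

The column-wise variants would use `apply(mat, 2, ...)` in the same way.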

Get CodeAndRoll. Vertesy, 2020.
