| Title: | massSpectrometryR |
|---|---|
| Description: | Provides calculations, plotting etc for chemistry & mass spectrometry. |
| Authors: | Ben Bruyneel <[email protected]> |
| Maintainer: | Ben Bruyneel <[email protected]> |
| License: | GPL (>= 3) |
| Version: | 0.6.5 |
| Built: | 2026-05-22 07:46:21 UTC |
| Source: | https://github.com/BenBruyneel/massSpectrometryR |
Custom operator for subtracting one formula from another, which makes calculations with formulas a little clearer
formula1 %f-% formula2
formula1 |
named numeric vector, example c(O = 2, C = 1); formula to be subtracted from |
formula2 |
named numeric vector, example c(H = 2, S = 1); formula to subtract |
c(H = 2, O = 1) %f-% c(H = 1)
c(H = 2, O = 1) %f-% c(S = 1, O = 2)
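The subtraction described above can be sketched in base R. This is only an illustration of the idea, not the package's implementation: `subtract_formulas` is a hypothetical helper, and the choice to keep elements that end up negative or zero is an assumption.

```r
# Hypothetical helper (not part of massSpectrometryR): subtract formula2
# from formula1 over the union of their elements.
subtract_formulas <- function(formula1, formula2) {
  elements <- sort(union(names(formula1), names(formula2)))
  # start with zero atoms for every element that occurs in either formula
  result <- setNames(numeric(length(elements)), elements)
  result[names(formula1)] <- formula1
  result[names(formula2)] <- result[names(formula2)] - formula2
  result
}

subtract_formulas(c(H = 2, O = 1), c(H = 1))
subtract_formulas(c(H = 2, O = 1), c(S = 1, O = 2))
```

Elements missing from one formula are treated as zero, so subtracting sulfur from water gives a negative sulfur count rather than an error.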
Custom operator for adding up formulas, which makes calculations with formulas a little clearer
formula1 %f+% formula2
formula1 |
named numeric vector, example c(O = 2, C = 1) |
formula2 |
named numeric vector, example c(H = 2, S = 1) |
waterFormula() %f+% protonFormula()
waterFormula() %f+% c(C=1, O = 2)
c(H = 2, O = 1) %f+% c(S = 1, O = 2)
Adds up two formulas, taking into account elements that are present in only one of them
addFormulas(formula1, formula2)
formula1 |
named numeric vector, example c(O = 2, C = 1) |
formula2 |
named numeric vector, example c(H = 2, S = 1) |
a named numeric vector (formula)
addFormulas(waterFormula(), protonFormula())
addFormulas(waterFormula(), c(C=1, O=2))
Takes a list of formulas and adds them all up
addListFormulas(formulas)
formulas |
list of formulas |
a named numeric vector (formula)
addListFormulas(list(c(H = 2, O = 1), c(H = 1), c(H = 2, O = 1), c(S = 1, O = 2)))
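Adding formulas, whether two or a whole list, comes down to summing atom counts per element. A minimal base R sketch, assuming an unnamed list of named numeric vectors; `add_formula_list` and `add_two_formulas` are hypothetical names, and the alphabetical ordering of the result is a side effect of `tapply()`, not documented package behaviour:

```r
# Hypothetical helpers (not part of massSpectrometryR)
add_formula_list <- function(formulas) {
  # flatten to one long named vector, e.g. H, O, H, H, O, S, O
  combined <- unlist(unname(formulas))
  # sum the atom counts per element name
  totals <- tapply(combined, names(combined), sum)
  setNames(as.numeric(totals), names(totals))
}

add_two_formulas <- function(formula1, formula2) {
  add_formula_list(list(formula1, formula2))
}

add_two_formulas(c(H = 2, O = 1), c(C = 1, O = 2))
add_formula_list(list(c(H = 2, O = 1), c(H = 1), c(H = 2, O = 1), c(S = 1, O = 2)))
```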
R6 Class representing a set of amino acids. It adds three functions to quickly switch between different writing 'styles' of peptides
Note: this class is meant to be used only for amino acids and such
massSpectrometryR::chemicals -> aminoacids
getName()
Function to retrieve the full name of an amino acid via its 1-letter or 3-letter code
aminoAcidClass$getName(searchString, checkCase = TRUE)
searchString |
either a 1- or 3-letter character vector |
checkCase |
default = TRUE. If FALSE, the function ignores the case of the searchString argument |
character vector, name of the amino acid
getShort()
Function to retrieve either the 1- or 3- letter code of an amino acid
aminoAcidClass$getShort(searchString, checkCase = TRUE)
searchString |
either a 1- or 3-letter character vector: if 1-letter, then the corresponding 3-letter character vector will be returned and vice versa |
checkCase |
default = TRUE. If FALSE, the function ignores the case of the searchString argument |
character vector, 1- or 3-letter code of the amino acid
translatePeptide()
Translates an amino acid sequence from 1-letter codes to 3-letter codes and vice versa
aminoAcidClass$translatePeptide(sequence, from1to3 = FALSE, splitCharacter = NA, joinCharacter = NA, checkCase = TRUE)
sequence |
character vector: amino acid sequence in 1-letter or 3-letter codes |
from1to3 |
logical vector: if TRUE, then translation will be from 1-letter code to 3-letter code. If FALSE, then vice versa. Default = FALSE |
splitCharacter |
character vector specifying the character(s) between the 1- or 3-letter codes in the sequence. Default NA (same as "") |
joinCharacter |
character vector specifying the character(s) between the translated codes. Default NA (same as "") |
checkCase |
default = TRUE. If FALSE, the function ignores the case of the sequence |
character vector, sequence in either 1- or 3-letter codes
clone()
The objects of this class are cloneable with this method.
aminoAcidClass$clone(deep = FALSE)
deep |
Whether to make a deep clone. |
aminoAcidResidues()$getShort("L")
aminoAcidResidues()$getShort("Leu")
aminoAcidResidues()$getName("L")
aminoAcidResidues()$getName("Leu")
aminoAcidResidues()$translatePeptide("Asp-Arg-Val-Tyr-Ile-His-Pro-Phe-His-Leu", from1to3 = TRUE, splitCharacter ="-")
aminoAcidResidues()$translatePeptide("DRVYIHPFHL", joinCharacter = "-")
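Translating between the two writing styles is essentially a table lookup plus splitting and joining. The sketch below illustrates that idea in base R with a deliberately small lookup table; `codes`, `one_to_three` and `three_to_one` are hypothetical names and do not reflect how the class is implemented internally.

```r
# Illustrative lookup table: only the residues needed for the example peptide
codes <- c(D = "Asp", R = "Arg", V = "Val", Y = "Tyr", I = "Ile",
           H = "His", P = "Pro", F = "Phe", L = "Leu")

# 1-letter sequence -> 3-letter codes joined by joinCharacter
one_to_three <- function(sequence, joinCharacter = "-") {
  letters1 <- strsplit(sequence, "")[[1]]
  paste(unname(codes[letters1]), collapse = joinCharacter)
}

# 3-letter codes separated by splitCharacter -> 1-letter sequence
three_to_one <- function(sequence, splitCharacter = "-") {
  parts <- strsplit(sequence, splitCharacter, fixed = TRUE)[[1]]
  paste(names(codes)[match(parts, codes)], collapse = "")
}

one_to_three("DRVYIHPFHL")
three_to_one("Asp-Arg-Val-Tyr-Ile-His-Pro-Phe-His-Leu")
```

Codes that are missing from the lookup table come back as NA in this sketch; a real implementation would need the full residue table and explicit error handling.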
Returns a pre-defined object which contains info on some common amino acid modifications
aminoAcidModifications()
An object of class modifications containing info on amino acid modifications
The resulting modification table cannot be used immediately: it contains two fixed modifications for cysteine. Remove one of them to prevent errors in peptide calculations
print(aminoAcidModifications())
Generates a pre-defined object which contains info on 'normal' amino acid residues
aminoAcidResidues()
an R6 object of class 'chemicals'
The formulas in the object are amino acid residues as they are present in proteins. To get the actual formula of the amino acid in its 'free' form, add c(H=2, O=1) (water)
this object is used in all protein calculations in this package
print(aminoAcidResidues())
Calculates the measured mass (Da) that deviates by a given number of ppm (parts per million) from the reference mass (Da).
calculate.Measured.mz(referenceMz, ppm)
referenceMz |
reference mass or m/z in Da. Usually a theoretical mass calculated from a formula |
ppm |
deviation (in ppm) from the reference mass |
a numeric vector
calculate.Measured.mz(465.6025, 5)
calculate.Measured.mz(465.6025, 0)
calculate.Measured.mz(465.6025, -5)
calculates the mass deviation in ppm (parts per million) between the measured mass or m/z and a reference mass or m/z (both in Da).
calculate.ppm(referenceMz, measuredMz)
referenceMz |
reference mass or m/z in Da. Usually a theoretical mass calculated from a formula |
measuredMz |
measured mass or m/z in Da. |
a numeric vector
calculate.ppm(465.6025, 465.6028)
calculate.ppm(465.6025, 465.6025)
calculate.ppm(465.6025, 465.7025)
Calculates the reference mass (Da) from a measured mass (Da) that deviates from it by a certain amount (ppm). A somewhat odd calculation that is usually not needed in the lab, but is sometimes needed in theoretical situations or in programming.
calculate.Reference.mz(measuredMz, ppm)
measuredMz |
measured mass or m/z in Da. |
ppm |
deviation (in ppm) that measuredMz differs from the reference mass to be calculated |
a numeric vector
calculate.ppm(465.6025, 465.6028)
calculate.Measured.mz(465.6025, 0.644)
calculate.Reference.mz(465.6028, 0.644)
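The three ppm functions above are rearrangements of one relation, ppm = (measured - reference) / reference * 1e6. The base R sketch below assumes that sign convention (positive when the measured value is higher), which matches the 0.644 ppm used in the examples; the function names are hypothetical stand-ins, not the package functions themselves.

```r
# Hypothetical stand-ins for the three ppm calculations
ppm_deviation <- function(referenceMz, measuredMz) {
  (measuredMz - referenceMz) / referenceMz * 1e6
}
measured_from_ppm <- function(referenceMz, ppm) {
  referenceMz * (1 + ppm / 1e6)
}
reference_from_ppm <- function(measuredMz, ppm) {
  measuredMz / (1 + ppm / 1e6)
}

ppm <- ppm_deviation(465.6025, 465.6028)
ppm
measured_from_ppm(465.6025, ppm)   # recovers the measured value
reference_from_ppm(465.6028, ppm)  # recovers the reference value
```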
Every chemical inside the object has a name, letter, short and a formula. The first 3 can be any length of string (though the letter and short field should be maximum length (nchar) 1 and 2-4 respectively). Formula should be in the form of a named numeric with the names representing elements and the values themselves being the number of atoms of that element, eg c(C = 3, H = 5, N = 1, O = 1, S = 0)
Note: this class is meant to be used for classes of compounds, eg amino acids
Also: this class is meant as a base class to be expanded via inheritance
Warning: all chemicals inside this object should be unique (names, letters & shorts)
number |
retrieves the number of compounds present in the object, read only |
letters |
to access the letters of the compounds in the object |
names |
to access the names of the compounds in the object |
shorts |
to access the shorts of the compounds in the object |
formulas |
to access the formulas of the compounds in the object |
table |
retrieves all info on the compounds in data.frame format, read only |
new()
Create a new chemicals object
chemicals$new(letters, shorts, names, formulas)
letters |
character vector specifying the letters (or numbers or whatever) for the chemicals. In case of amino acids it should be eg "A" for Alanine, "G" for Glycine, etc |
shorts |
character vector specifying the short names for the chemicals, eg Ala for Alanine |
names |
character vector specifying the names of the chemicals |
formulas |
list of named numeric vectors specifying the formulas of the chemicals, eg c(C = 6, H = 12, N = 4, O = 1, S = 0) for Arginine |
a new 'chemicals' object
print()
For printing purposes: prints a table of the chemicals with columns letter, name & short
chemicals$print(...)
... |
no arguments, the function takes care of printing |
getFormula()
Retrieves the formula of one of the compounds in the object
chemicals$getFormula(which1)
which1 |
specifies which chemical should be retrieved. Either the number (row number in the chemicals table), or the name, letter or short as a character vector. It does not matter whether capital or non-capital letters are used, since everything is converted to upper case before comparing with what's in the chemicals table |
a formula in the shape of a named numeric vector, eg c(C = 6, H = 12, N = 4, O = 1, S = 0)
clone()
The objects of this class are cloneable with this method.
chemicals$clone(deep = FALSE)
deep |
Whether to make a deep clone. |
estrogens <- chemicals$new(
  letters = c("1","2","3","4"),
  shorts = c("E1","E2","E3","E4"),
  names = c("Estrone","Estradiol", "Estriol","Estetrol"),
  formulas = list(c(C=18, H=22, O=2), c(C=18, H=24, O=2), c(C=18, H=24, O=3), c(C=18, H=24, O=4))
)
Digests an amino acid sequence and returns the resulting peptides as a data.frame
digest(sequence, enzyme = "trypsin", missed = 0)
sequence |
character vector representing the amino acid sequence to be digested. Note: the letters in sequence will be changed to upper case. |
enzyme |
character string specifying the enzyme to be used for the digestion. Default is 'trypsin'. Other options are 'trypsin.strict', 'pepsin', 'chymotrypsin','chymotrypsin.strict' and 'Glu-C' |
missed |
integer vector: the maximum number of allowed missed cleavages |
data.frame with the columns 'peptide', 'start', 'stop' and 'mc' (missed cleavages)
This function is a modified version of the Digest function found in the package 'OrgMassSpecR'
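To make the output columns concrete, here is a minimal base R sketch of an in-silico tryptic digest. It assumes the common rule "cleave after K or R, but not before P"; whether the package's 'trypsin' or 'trypsin.strict' option uses exactly this rule is not stated above, and `digest_sketch` is a hypothetical name.

```r
# Hypothetical sketch: tryptic digest with up to `missed` missed cleavages
digest_sketch <- function(sequence, missed = 0) {
  sequence <- toupper(sequence)
  n <- nchar(sequence)
  # cleavage sites: K or R that is not followed by P
  sites <- gregexpr("[KR](?!P)", sequence, perl = TRUE)[[1]]
  stops <- if (sites[1] == -1) integer(0) else as.integer(sites)
  # every fully cleaved peptide ends at a site or at the end of the sequence
  stops <- unique(c(stops[stops < n], n))
  starts <- c(1L, stops[-length(stops)] + 1L)
  rows <- list()
  for (mc in 0:missed) {
    for (i in seq_along(starts)) {
      j <- i + mc  # skipping mc cleavage sites joins mc + 1 fragments
      if (j > length(stops)) break
      rows[[length(rows) + 1]] <- data.frame(
        peptide = substr(sequence, starts[i], stops[j]),
        start = starts[i], stop = stops[j], mc = mc,
        stringsAsFactors = FALSE
      )
    }
  }
  do.call(rbind, rows)
}

digest_sketch("AKRPGKCDR", missed = 1)
```

In the example the R at position 3 is followed by P and is therefore not a cleavage site, so the fully cleaved peptides are AK, RPGK and CDR.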
generates a pre-defined formula for electron
electronFormula()
a named numeric vector (formula)
this is used for calculations
print(electronFormula())
Every element inside the object has a name, a short and a mass. The first 2 can be any length of string, mass should be a numeric
Note: this class is meant to be used for elements.
Warning: all elements inside this object should be unique (names & shorts, not mass)
number |
retrieves the number of elements present in the object, read only |
names |
to access the names of the elements in the object |
shorts |
to access the shorts of the elements in the object |
mass |
to access the masses of the elements in the object |
table |
retrieves all info on the elements in data.frame format, read only |
new()
creates a new elements object
elements$new(shorts, names, mass)
shorts |
character vector specifying the short names for the elements, eg Hg for Mercury |
names |
character vector specifying the names of the elements |
mass |
numeric vector specifying the masses of the elements |
a new 'elements' object
addElement()
adds one or more elements to the object. Elements to be added must have unique names and shorts.
elements$addElement(shorts, names, mass)
shorts |
character vector specifying the short names for the elements, eg Hg for Mercury |
names |
character vector specifying the names of the elements |
mass |
numeric vector specifying the masses of the elements |
nothing
print()
For printing purposes: prints a table of the chemicals with columns letter, name & short
elements$print(...)
... |
no arguments, the function takes care of printing |
getMass()
Retrieves the mass of one of the elements in the object
elements$getMass(which1)
which1 |
specifies which element should be retrieved. Either the number (row number in the elements table), or the name or short as a character vector. It does not matter whether capital or non-capital letters are used, since everything is converted to upper case before comparing with what's in the elements table |
a numeric value
clone()
The objects of this class are cloneable with this method.
elements$clone(deep = FALSE)
deep |
Whether to make a deep clone. |
randomElements <- elements$new(
  shorts = c("X1","X2","X3"),
  names = c("Secret Element 1", "Secret Element 2", "Secret Element 3"),
  mass = c(301, 312, 323)
)
generates a pre-defined object which contains info on elements, mass values are average masses (weighted mean mass of elements based on their natural occurrence)
elementsAverage()
an elements object
electron is not really meant to be used in formulas, but is needed to calculate mass & m/z of ions
print(elementsAverage())
retrieves the names of the elements in a formula
elementsInFormula(formula, removeZero = FALSE)
formula |
named numeric vector, example c(O = 2, C = 1) |
removeZero |
logical flag on how to deal with elements which are zero |
character vector
glucose = c(C=6, H=12, O=6, S=0)
elementsInFormula(glucose)
elementsInFormula(glucose, removeZero = TRUE)
Retrieves the names of all elements present in two formulas. This is especially important when formula1 contains elements that formula2 does not contain and vice versa. Note: the result is also sorted
elementsInFormulas(formula1, formula2, decrease = FALSE)
formula1 |
named numeric vector, example c(O = 2, C = 1) |
formula2 |
named numeric vector, example c(H = 2, S = 1) |
decrease |
logical flag on how to sort, default = FALSE: increasing |
a character vector
elementsInFormulas(c(O = 2, C = 1), c(H = 2, S = 1))
generates a pre-defined R6 elements object which contains info on elements, mass values are monoisotopic
elementsMonoisotopic()
an elements object
this object is (by default) used in all chemical calculations in this package
electron is not really meant to be used in formulas, but is needed to calculate mass & m/z of ions
print(elementsMonoisotopic())
generates an empty pre-defined formula
emptyFormula()
a named numeric vector (formula)
this can be used for calculations and setup of unknowns
print(emptyFormula())
Translates regular formula format into a character vector, eg C6H12O6
formulaString(formula, removeSingle = FALSE, useMarkdown = FALSE)
formula |
named numeric vector, example c(O = 2, C = 1) |
removeSingle |
if TRUE then for elements that are present in the formula only a single time, the number (1) will not be included. Default is FALSE. See also examples |
useMarkdown |
default = FALSE. If TRUE, then it will use HTML/Markdown <sub> tags in the formulas, which can be used with the library 'gt' to generate 'proper' notation for chemical formulas (numbers in subscript) |
character vector
formulaString(c(C=6, H=12, O=6))
formulaString(c(H=3,O=4,P=1))
formulaString(c(H=3,O=4,P=1), removeSingle = TRUE)
formulaString(c(H=2, O=1))
formulaString(c(H=2, O=1), removeSingle = TRUE)
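The effect of removeSingle is easiest to see in a small base R sketch. `formula_string` below is a hypothetical simplification: it keeps the element order of the input and does not handle the useMarkdown option or zero counts, which the package function may treat differently.

```r
# Hypothetical sketch: named numeric vector -> formula string
formula_string <- function(formula, removeSingle = FALSE) {
  counts <- as.character(formula)
  # optionally drop the "1" for elements that occur only once
  if (removeSingle) counts[formula == 1] <- ""
  paste0(names(formula), counts, collapse = "")
}

formula_string(c(C = 6, H = 12, O = 6))                      # "C6H12O6"
formula_string(c(H = 3, O = 4, P = 1))                       # "H3O4P1"
formula_string(c(H = 3, O = 4, P = 1), removeSingle = TRUE)  # "H3O4P"
```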
calculates the neutral mono-isotopic mass of a formula
formulaToMass(formula = NULL, removeNA = FALSE, elementsInfo = elementsMonoisotopic(), enviPat = FALSE, exact = TRUE)
formula |
named numeric vector, example c(O = 2, C = 1) |
removeNA |
logical vector: what to do if any of the elements is NA. If TRUE, then remove before calculation, if FALSE, then do not remove |
elementsInfo |
elements masses to be used, needs to be of class elements, default is elementsMonoisotopic(). The elementsAverage() function does not give 100% accurate average masses for larger molecules, because of the complex isotope patterns that emerge. If average masses are needed, it is better to use the enviPat option |
enviPat |
logical argument that determines if the enviPat based calculations should be used. Default is FALSE. For monoisotopic masses there is no difference, but for average masses of larger molecules (with complicated isotope patterns) it's highly recommended to use enviPat = TRUE with exact = FALSE |
exact |
determines if the exact (TRUE, default) or the average (FALSE) mass is calculated (ignored if enviPat is FALSE) |
numeric vector
formulaToMass(c(H=2, O=1))
formulaToMass(c(H=2, O=1), elementsInfo = elementsAverage())
formulaToMass(c(C = 50, H=102))
formulaToMass(c(C = 50, H=102), elementsInfo = elementsAverage())
formulaToMass(c(C = 50, H=102), enviPat = TRUE)
formulaToMass(c(C = 50, H=102), enviPat = TRUE, exact = FALSE)
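Without the enviPat option, the monoisotopic mass of a formula is simply the sum of atom count times element mass. The sketch below shows this in base R with a small hand-written mass table (standard monoisotopic masses, rounded to nine decimals); `monoisotopic` and `formula_to_mass` are hypothetical names, and the package's own element table may carry more digits.

```r
# Monoisotopic masses (Da) of a few common elements; illustrative table only
monoisotopic <- c(H = 1.007825032, C = 12, N = 14.003074004,
                  O = 15.994914620, S = 31.972071174)

# Hypothetical sketch: sum of (number of atoms * element mass)
formula_to_mass <- function(formula, masses = monoisotopic) {
  unknown <- setdiff(names(formula), names(masses))
  if (length(unknown) > 0) {
    stop("no mass available for: ", paste(unknown, collapse = ", "))
  }
  sum(formula * masses[names(formula)])
}

formula_to_mass(c(H = 2, O = 1))           # water, about 18.0106 Da
formula_to_mass(c(C = 6, H = 12, O = 6))   # glucose, about 180.0634 Da
```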
Calculates the m/z value of a charged/adducted ion
massToMz(mass, adducts = 0, adductFormula = electronFormula(), adductCharge = -1, elementsInfo = elementsMonoisotopic(), enviPat = FALSE, exact = TRUE)
mass |
numeric vector, (neutral) mass of the molecule |
adducts |
numeric vector, number of adducts 'attached to' or 'removed from' the (originally neutral) molecule |
adductFormula |
formula (named numeric vector) of the adduct |
adductCharge |
numeric vector indicating the actual charge per adduct |
elementsInfo |
elements masses to be used, needs to be of class elements, default is elementsMonoisotopic() |
enviPat |
logical argument that determines if the enviPat based calculations should be used. Default is FALSE. For monoisotopic masses there is no difference, but for average masses of larger molecules (with complicated isotope patterns) it's highly recommended to use enviPat = TRUE with exact = FALSE |
exact |
determines if the exact (TRUE, default) or the average (FALSE) mass is calculated (ignored if enviPat is FALSE) |
numeric vector
# amino acid residue lysine + water
lysineMass <- formulaToMass(aminoAcidResidues()$getFormula("K") %f+% waterFormula())
lysineMass
# M+H+ : adducts = 1, adductFormula = protonFormula(), adductCharge = 1
# singly charged/protonated ion (ESI)
massToMz(lysineMass, adducts = 1, adductFormula = protonFormula(), adductCharge = 1)
# Doubly charged/protonated ion (ESI)
massToMz(lysineMass, adducts = 2, adductFormula = protonFormula(), adductCharge = 1)
# M-H- : single, negatively charged
massToMz(lysineMass, adducts = -1, adductFormula = protonFormula(), adductCharge = 1)
# M+ : singly positively charged (molecular) ion (EI)
massToMz(lysineMass, adducts = -1, adductFormula = electronFormula(), adductCharge = 1)
# M- : singly negatively charged (molecular) ion (EI)
massToMz(lysineMass, adducts = 1, adductFormula = electronFormula(), adductCharge = 1)
A wrapper around massToMz for positively charged, protonated ions in ESI
massToMzH(mass, charge = 1, elementsInfo = elementsMonoisotopic(), enviPat = FALSE, exact = TRUE)
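The arithmetic behind an adduct-based m/z calculation can be sketched as: add (or remove) the adduct mass the given number of times, then divide by the absolute total charge. The base R sketch below assumes exactly that relation and takes the adduct mass as a plain number instead of a formula; `mass_to_mz` and `proton_mass` are hypothetical names, not the package API.

```r
proton_mass <- 1.007276467  # mass of a proton in Da

# Hypothetical sketch: m/z = (M + n * adduct mass) / |n * charge per adduct|
mass_to_mz <- function(mass, adducts = 1, adductMass = proton_mass,
                       adductCharge = 1) {
  (mass + adducts * adductMass) / abs(adducts * adductCharge)
}

lysine <- 146.105528  # monoisotopic mass of lysine, C6H14N2O2

mass_to_mz(lysine, adducts = 1)   # [M+H]+,   about 147.1128
mass_to_mz(lysine, adducts = 2)   # [M+2H]2+, about 74.0600
mass_to_mz(lysine, adducts = -1)  # [M-H]-,   about 145.0983
```

A negative number of adducts removes the adduct from the molecule, which is how the deprotonated ion is obtained here.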
mass |
numeric vector, (neutral) mass of the molecule |
charge |
charge state |
elementsInfo |
elements masses to be used, needs to be of class elements, default is elementsMonoisotopic() |
enviPat |
logical argument that determines if the enviPat based calculations should be used. Default is FALSE. For monoisotopic masses there is no difference, but for average masses of larger molecules (with complicated isotope patterns) it's highly recommended to use enviPat = TRUE with exact = FALSE |
exact |
determines if the exact (TRUE, default) or the average (FALSE) mass is calculated (ignored if enviPat is FALSE) |
numeric vector
# amino acid residue lysine + water
lysineMass <- formulaToMass(aminoAcidResidues()$getFormula("K") %f+% waterFormula())
lysineMass
# M+H+ : adducts = 1, adductFormula = protonFormula(), adductCharge = 1
# singly charged/protonated ion (ESI)
massToMz(lysineMass, adducts = 1, adductFormula = protonFormula(), adductCharge = 1)
massToMzH(lysineMass)
Every modification inside the object has a name, position, fixed (flag), gain (formula), loss (formula) and category.
'name' is a character vector.
Position is a character vector specifying the amino acids which always have the modification (fixed = TRUE) or can have the modification (fixed = FALSE). More than one amino acid can be specified, eg NQ (for Asparagine & glutamine). Please note that currently the package does NOT support 'exotic' amino acids (eg Selenocysteine) or 'combination' letters, such as 'J' (Leucine or Isoleucine). For the C- and N-terminus, use 'C_Term' or 'N_Term' for position.
Gain and loss specify what is lost and/or gained when an amino acid is modified. For example, Carbamidomethylation of Cysteine has both a loss formula c(H=1) and a gain formula c(C=2, H=4, N=1, O=1); this could equally be defined as: loss formula = emptyFormula(), gain formula = c(C=2, H=3, N=1, O=1).
For the category field (character vector) there is no real 'rule' on how to classify modifications. I usually stick to the categorisation of Mascot or Sequest.
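That the two ways of writing the Carbamidomethylation gain and loss are equivalent can be checked with a few lines of base R. The sketch computes the net formula change (gain minus loss) for both variants and the resulting monoisotopic mass shift; `net_change` and the small mass table are hypothetical helpers using standard element masses, not package functions.

```r
# Monoisotopic masses (Da); illustrative table only
monoisotopic <- c(H = 1.007825032, C = 12, N = 14.003074004, O = 15.994914620)

# Hypothetical helper: net atoms added by a modification (gain minus loss)
net_change <- function(gain, loss) {
  combined <- c(gain, -loss)
  totals <- tapply(combined, names(combined), sum)
  setNames(as.numeric(totals), names(totals))
}

with_loss <- net_change(gain = c(C = 2, H = 4, N = 1, O = 1), loss = c(H = 1))
without_loss <- net_change(gain = c(C = 2, H = 3, N = 1, O = 1), loss = numeric(0))

identical(with_loss, without_loss)                 # both give C2 H3 N1 O1
sum(with_loss * monoisotopic[names(with_loss)])    # mass shift, about 57.0215 Da
```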
number |
retrieves the number of modifications present in the object, read only |
fixed |
retrieves a table of the fixed modifications in the modification table, read only |
variable |
retrieves a table of the variable modifications in the modification table, read only |
table |
to access the table of modifications directly |
new()
Create a new modifications object
modifications$new(data = NA)
data |
default = NA. If not NA, then should be a tibble with 6 columns: name, position, fixed, gain, loss and category. This is checked, but the contents of each column are not checked. |
print()
For printing purposes: prints a table of the modifications
modifications$print(...)
... |
no arguments, the function takes care of printing |
add()
Adds a single modification
modifications$add(name, position, fixed, gain, loss, category)
name |
character vector |
position |
character vector, should be a valid amino acid residue |
fixed |
logical vector, specifies whether the modification is fixed (TRUE) or dynamic (FALSE) |
gain |
named numeric vector (formula) that specifies what (atoms) are gained when a modification is applied to an amino acid |
loss |
named numeric vector (formula) that specifies what (atoms) are lost when a modification is applied to an amino acid |
category |
character vector. Not rigidly defined: for user to be able to select/filter etc which type of modifications to use |
nothing
addTable()
Add (a set of) modifications via a tibble
modifications$addTable(data = NA)
data |
default = NA. If not NA, then should be a tibble with 6 columns: name, position, fixed, gain, loss and category. This is checked, but the contents of each column are not checked. |
nothing
clone()
The objects of this class are cloneable with this method.
modifications$clone(deep = FALSE)
deep |
Whether to make a deep clone. |
the logical vector fixed is very important. If TRUE, then a modification is considered to be always present, if FALSE then its presence is optional.
aaModifications <- modifications$new()
aaModifications$addTable(
  tibble::tibble(
    name = c("Carbamidomethyl (C)", "Carboxymethyl (C)", "Oxidation (M)"),
    position = c("C","C","M"),
    fixed = c(TRUE,TRUE,FALSE),
    gain = list(c(C = 2, H = 4, N = 1, O = 1), c(C = 2, H = 3, N = 0, O = 2), c(C = 0, H = 0, N = 0, O = 1)),
    loss = list(c(protonFormula()), c(protonFormula()), c(emptyFormula())),
    category = c("Cys-state","Cys-state","Preparation Artefact")
  )
)
aaModifications
aaModifications$add(name = "Deamidation", position = "NQ", fixed = FALSE, gain = c(O = 1, H = 1), loss = c(N = 1, H = 2), category = "Preparation Artefact")
aaModifications
A wrapper around mzToMass for positively charged, protonated ions in ESI
mzHToMass(mz, charge = 1, elementsInfo = elementsMonoisotopic(), enviPat = FALSE, exact = TRUE)
mz |
numeric vector, mass to charge ratio of the ion |
charge |
charge state |
elementsInfo |
elements masses to be used, needs to be of class elements, default is elementsMonoisotopic() |
enviPat |
logical argument that determines if the enviPat based calculations should be used. Default is FALSE. For monoisotopic masses there is no difference, but for average masses of larger molecules (with complicated isotope patterns) it's highly recommended to use enviPat = TRUE with exact = FALSE |
exact |
determines if the exact (TRUE, default) or the average (FALSE) mass is calculated (ignored if enviPat is FALSE) |
numeric vector
massToMzH(mass = 174.1117, charge = 2) |> mzHToMass(charge = 2)
massToMzH(mass = 174.1117, charge = 1) |> mzHToMass(charge = 1)
Calculates the neutral mass from the m/z value of a charged/adducted ion; essentially the reverse of massToMz
mzToMass(mz, adducts = 0, adductFormula = electronFormula(), adductCharge = -1, elementsInfo = elementsMonoisotopic(), enviPat = FALSE, exact = TRUE)
mz |
mass to charge ratio of the ion |
adducts |
numeric vector, number of adducts 'attached to' or 'removed from' the (originally neutral) molecule |
adductFormula |
formula (named numeric vector) of the adduct |
adductCharge |
numeric vector indicating the actual charge per adduct |
elementsInfo |
elements masses to be used, needs to be of class elements, default is elementsMonoisotopic() |
enviPat |
logical argument that determines if the enviPat based calculations should be used. Default is FALSE. For monoisotopic masses there is no difference, but for average masses of larger molecules (with complicated isotope patterns) it's highly recommended to use enviPat = TRUE with exact = FALSE |
exact |
determines if the exact (TRUE, default) or the average (FALSE) mass is calculated (ignored if enviPat is FALSE) |
numeric vector
massToMz(mass = 174.1117, adductFormula = c(e=1), adducts = 2, adductCharge = -1) |> mzToMass(adductFormula = c(e=1), adducts = 2, adductCharge = -1)
massToMz(mass = 174.1117, adductFormula = c(H=1), adducts = 1, adductCharge = 1) |> mzToMass(adductFormula = c(H=1), adducts = 1, adductCharge = 1)
Translates a Proteome Discoverer (Thermo Scientific) elements formula string to a formula as used by this package
pdToFormula(pdFormula)
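Going from m/z back to the neutral mass means multiplying by the absolute total charge and then taking the adduct masses off again. A minimal base R sketch of that inverse, assuming the relation m/z = (M + n * adduct mass) / |n * charge per adduct|; `mz_to_mass` is a hypothetical name and takes the adduct mass as a plain number instead of a formula.

```r
proton_mass <- 1.007276467  # mass of a proton in Da

# Hypothetical sketch: M = m/z * |n * charge per adduct| - n * adduct mass
mz_to_mass <- function(mz, adducts = 1, adductMass = proton_mass,
                       adductCharge = 1) {
  mz * abs(adducts * adductCharge) - adducts * adductMass
}

# doubly protonated lysine, [M+2H]2+ at m/z 74.060040
mz_to_mass(74.060040, adducts = 2)   # about 146.1055 (neutral lysine)
# singly protonated lysine, [M+H]+ at m/z 147.112804
mz_to_mass(147.112804, adducts = 1)  # about 146.1055
```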
pdFormula |
a character vector. Formula in the format used by the Proteome Discoverer software |
formula of format c(H=2, O=1)
glucose <- pdToFormula("C(6) H(12) O(6)")
glucose
water <- pdToFormula("H(2) O")
water
Contains two character vectors: one representing the amino acid sequence, and a second containing info on the positions of 'variable' modifications. The object also contains a modification table specifying the 'fixed' and 'variable' modifications.
sequence |
returns the amino acid sequence as a character vector, can be set but is not checked against the length of the modifications string |
length |
returns the length of the peptide (read only) |
modifications |
returns the modifications string, can be set but is not checked against the length of the sequence string |
modificationsTable |
returns the modification table, can be modified. Note: 'variable' modifications should match the modifications string |
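The string format in the examples, space-separated elements with an optional count in parentheses, can be parsed with a few regular expressions. The base R sketch below handles only that simple pattern; `pd_to_formula` is a hypothetical name, and other notations Proteome Discoverer may produce (such as isotope labels) are not covered.

```r
# Hypothetical sketch: "C(6) H(12) O(6)" -> c(C = 6, H = 12, O = 6)
pd_to_formula <- function(pdFormula) {
  tokens <- strsplit(trimws(pdFormula), "\\s+")[[1]]
  # element symbol: everything before an optional "("
  elements <- sub("\\(.*$", "", tokens)
  # count: the number between parentheses, or 1 when there are none
  counts <- ifelse(grepl("\\(", tokens),
                   sub("^[^(]*\\((-?[0-9]+)\\)$", "\\1", tokens),
                   "1")
  setNames(as.numeric(counts), elements)
}

pd_to_formula("C(6) H(12) O(6)")
pd_to_formula("H(2) O")
```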
new()
Create a new peptide object
peptide$new(sequence = "", modificationTable = NA, variableModifications = NA)
sequence |
character vector, the amino acid sequence of the peptide |
modificationTable |
the table from an R6 'modifications' object containing the variable and fixed modifications present in the amino acid sequence. Important: do NOT pass on an R6 modifications object, the function can use a table from such an object, but not the object itself! |
variableModifications |
character vector specifying the position of variable modifications. The length of this vector must be the same as the length of the sequence. Each character specifies the modification at that position, eg "00010" means that positions 1, 2, 3 & 5 are unmodified, while position 4 has the first variable modification in the modification table. Note that the numbering follows the original row order of the modification table (fixed modifications filtered out). Additions to a modification table should not be a problem, but deletions or edits can cause problems, as the object currently cannot deal with these itself. If this character vector is NA, then a character vector of "0"'s will be created (with the same length as the sequence) |
a new 'peptide' object
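The relation between the sequence and the variable modification string can be illustrated with a small base R sketch. The helper `decodeVariableModifications()` is hypothetical and not part of the package; it only assumes what is described above, namely one character per position with "0" meaning unmodified.

```r
# Hypothetical helper (not part of massSpectrometryR): list the modified
# positions encoded in a variable modification string such as "0010000".
decodeVariableModifications <- function(sequence, variableModifications) {
  residues <- strsplit(sequence, "", fixed = TRUE)[[1]]
  codes <- strsplit(variableModifications, "", fixed = TRUE)[[1]]
  # Both strings must describe the same number of positions
  stopifnot(length(residues) == length(codes))
  modified <- which(codes != "0")
  data.frame(
    position = modified,
    residue = residues[modified],
    modification = codes[modified],
    stringsAsFactors = FALSE
  )
}

decodeVariableModifications("SAMPLER", "0010000")
```

The resulting table has one row per modified position; which modification a given code refers to still has to be looked up in the modification table.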
print()
For printing purposes: prints the sequence string, the variable modifications string and the modification table
peptide$print(...)
... |
no arguments, the function takes care of printing |
sequence.part()
Retrieve part of the amino acid sequence. Note: intended for internal use
peptide$sequence.part(startSeq = 1L, endSeq = 1L)
startSeq |
integer vector, specifies the start of the part of the amino acid sequence to retrieve |
endSeq |
integer vector, specifies the end of the part of the amino acid sequence to retrieve |
character vector
modifications.part()
Retrieve part of the variable modification string. Note: intended for internal use
peptide$modifications.part(startSeq = 1L, endSeq = 1L)
startSeq |
integer vector, specifies the start of the part of the variable modification string to retrieve |
endSeq |
integer vector, specifies the end of the part of the variable modification string to retrieve |
character vector
modifications.formula.part()
Determines the gain & loss formulas for a part of the peptide (the variable modification string and the modification table are used for this): adds up all the losses and gains. If the position of a variable modification in the variable modification string does not match the amino acid in the modification table, then a warning is produced
peptide$modifications.formula.part(startSeq = 1L, endSeq = 1L, Nterminal = TRUE, Cterminal = TRUE)
startSeq |
integer vector, specifies the start of the part of the variable modification string to retrieve |
endSeq |
integer vector, specifies the end of the part of the variable modification string to retrieve |
Nterminal |
logical vector; if TRUE then N-terminal modifications are included (if the N-terminus is present in the part selected by startSeq and endSeq) |
Cterminal |
logical vector; if TRUE then C-terminal modifications are included (if the C-terminus is present in the part selected by startSeq and endSeq) |
a list of 2 formulas: the summed gain formulas & the summed loss formulas which are present in the part selected by startSeq and endSeq
modifications.formula()
Determines the gain & loss formulas for the full length of the peptide sequence. Essentially a wrapper for modifications.formula.part
peptide$modifications.formula(Nterminal = TRUE, Cterminal = TRUE)
Nterminal |
logical vector; if TRUE then N-terminal modifications are included |
Cterminal |
logical vector; if TRUE then C-terminal modifications are included |
a list of 2 formulas: the summed gain formulas & the summed loss formulas for the full length of the peptide sequence
formula.part()
Determines the chemical formula of part of the peptide with or without the modifications.
peptide$formula.part(startSeq = 1, endSeq = 1, ignoreModifications = FALSE, Nterminal = TRUE, Cterminal = TRUE)
startSeq |
integer vector, specifies the start of the part of the peptide sequence |
endSeq |
integer vector, specifies the end of the part of the peptide sequence |
ignoreModifications |
if FALSE then modifications (both fixed & variable) are taken into account when calculating the chemical formula of the peptide. Note: if TRUE then the 'Nterminal' and 'Cterminal' parameters are ignored |
Nterminal |
logical vector; if TRUE then N-terminal modifications are included (if the N-terminus is present in the part selected by startSeq and endSeq) |
Cterminal |
logical vector; if TRUE then C-terminal modifications are included (if the C-terminus is present in the part selected by startSeq and endSeq) |
a named numeric vector, eg: c(C=6, H=12, O=6)
formula()
Determines the chemical formula of the full length peptide with or without modifications. Essentially a wrapper around 'formula.part'
peptide$formula(ignoreModifications = FALSE, Nterminal = TRUE, Cterminal = TRUE)
ignoreModifications |
if FALSE then modifications (both fixed & variable) are taken into account when calculating the chemical formula of the peptide. Note: if TRUE then the 'Nterminal' and 'Cterminal' parameters are ignored |
Nterminal |
logical vector; if TRUE then N-terminal modifications are included |
Cterminal |
logical vector; if TRUE then C-terminal modifications are included |
a named numeric vector, eg: c(C=6, H=12, O=6)
mass.part()
Calculate the mass of part of the peptide with or without modifications
peptide$mass.part(startSeq = 1, endSeq = 1, ignoreModifications = FALSE, Nterminal = TRUE, Cterminal = TRUE, elementsInfo = elementsMonoisotopic())
startSeq |
integer vector, specifies the start of the part of the peptide sequence |
endSeq |
integer vector, specifies the end of the part of the peptide sequence |
ignoreModifications |
if FALSE then modifications (both fixed & variable) are taken into account when calculating the chemical formula of the peptide. Note: if TRUE then the 'Nterminal' and 'Cterminal' parameters are ignored |
Nterminal |
logical vector; if TRUE then N-terminal modifications are included (if the N-terminus is present in the part selected by startSeq and endSeq) |
Cterminal |
logical vector; if TRUE then C-terminal modifications are included (if the C-terminus is present in the part selected by startSeq and endSeq) |
elementsInfo |
element masses to be used, needs to be of class elements, default is elementsMonoisotopic() |
numeric vector
mass()
Calculate the mass of the full length peptide with or without modifications
peptide$mass(ignoreModifications = FALSE, Nterminal = TRUE, Cterminal = TRUE, elementsInfo = elementsMonoisotopic())
ignoreModifications |
if FALSE then modifications (both fixed & variable) are taken into account when calculating the chemical formula of the peptide. Note: if TRUE then the 'Nterminal' and 'Cterminal' parameters are ignored |
Nterminal |
logical vector; if TRUE then N-terminal modifications are included |
Cterminal |
logical vector; if TRUE then C-terminal modifications are included |
elementsInfo |
element masses to be used, needs to be of class elements, default is elementsMonoisotopic() |
numeric vector
mz.part()
Calculate the m/z of part of the peptide (as an ion) with or without modifications
peptide$mz.part(startSeq = 1, endSeq = 1, ignoreModifications = FALSE, Nterminal = TRUE, Cterminal = TRUE, elementsInfo = elementsMonoisotopic(), adducts = 1, adductFormula = protonFormula(), adductCharge = 1)
startSeq |
integer vector, specifies the start of the part of the peptide sequence |
endSeq |
integer vector, specifies the end of the part of the peptide sequence |
ignoreModifications |
if FALSE then modifications (both fixed & variable) are taken into account when calculating the chemical formula of the peptide. Note: if TRUE then the 'Nterminal' and 'Cterminal' parameters are ignored |
Nterminal |
logical vector; if TRUE then N-terminal modifications are included (if the N-terminus is present in the part selected by startSeq and endSeq) |
Cterminal |
logical vector; if TRUE then C-terminal modifications are included (if the C-terminus is present in the part selected by startSeq and endSeq) |
elementsInfo |
element masses to be used, needs to be of class elements, default is elementsMonoisotopic() |
adducts |
numeric vector, number of adducts 'attached to' or 'removed from' the (originally neutral) peptide |
adductFormula |
formula (named numeric vector) of the adduct |
adductCharge |
numeric vector indicating the actual charge per adduct |
numeric vector
mz()
Calculate the m/z of the full length peptide (as an ion) with or without modifications
peptide$mz(ignoreModifications = FALSE, Nterminal = TRUE, Cterminal = TRUE, elementsInfo = elementsMonoisotopic(), adducts = 1, adductFormula = protonFormula(), adductCharge = 1)
ignoreModifications |
if FALSE then modifications (both fixed & variable) are taken into account when calculating the chemical formula of the peptide. Note: if TRUE then the 'Nterminal' and 'Cterminal' parameters are ignored |
Nterminal |
logical vector; if TRUE then N-terminal modifications are included |
Cterminal |
logical vector; if TRUE then C-terminal modifications are included |
elementsInfo |
element masses to be used, needs to be of class elements, default is elementsMonoisotopic() |
adducts |
numeric vector, number of adducts 'attached to' or 'removed from' the (originally neutral) peptide |
adductFormula |
formula (named numeric vector) of the adduct |
adductCharge |
numeric vector indicating the actual charge per adduct |
numeric vector
mzH.part()
Calculate the m/z of part of the peptide (as a protonated ion) with or without modifications
peptide$mzH.part(startSeq = 1, endSeq = 1, ignoreModifications = FALSE, Nterminal = TRUE, Cterminal = TRUE, charge = 1, elementsInfo = elementsMonoisotopic())
startSeq |
integer vector, specifies the start of the part of the peptide sequence |
endSeq |
integer vector, specifies the end of the part of the peptide sequence |
ignoreModifications |
if FALSE then modifications (both fixed & variable) are taken into account when calculating the chemical formula of the peptide. Note: if TRUE then the 'Nterminal' and 'Cterminal' parameters are ignored |
Nterminal |
logical vector; if TRUE then N-terminal modifications are included (if the N-terminus is present in the part selected by startSeq and endSeq) |
Cterminal |
logical vector; if TRUE then C-terminal modifications are included (if the C-terminus is present in the part selected by startSeq and endSeq) |
charge |
charge state |
elementsInfo |
element masses to be used, needs to be of class elements, default is elementsMonoisotopic() |
numeric vector
mzH()
Calculate the m/z of the full length peptide (as a protonated ion) with or without modifications
peptide$mzH(charge = 1, ignoreModifications = FALSE, Nterminal = TRUE, Cterminal = TRUE, elementsInfo = elementsMonoisotopic())
charge |
charge state |
ignoreModifications |
if FALSE then modifications (both fixed & variable) are taken into account when calculating the chemical formula of the peptide. Note: if TRUE then the 'Nterminal' and 'Cterminal' parameters are ignored |
Nterminal |
logical vector; if TRUE then N-terminal modifications are included |
Cterminal |
logical vector; if TRUE then C-terminal modifications are included |
elementsInfo |
element masses to be used, needs to be of class elements, default is elementsMonoisotopic() |
numeric vector
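The arithmetic behind the m/z of a protonated ion can be sketched in base R. The helper `neutralMassToMzH()` below is hypothetical and is not the package's implementation; it assumes a neutral monoisotopic mass as input and adds one proton per charge. The mass used in the calls is an assumed value for illustration only.

```r
# Hypothetical helper (not part of massSpectrometryR): m/z of an
# [M + zH]^z+ ion, starting from the neutral mass M.
protonMass <- 1.007276  # mass of a proton in Da (rounded)

neutralMassToMzH <- function(mass, charge = 1) {
  stopifnot(charge >= 1)
  # add one proton per charge, then divide by the charge
  (mass + charge * protonMass) / charge
}

neutralMass <- 802.4007  # assumed neutral monoisotopic mass (illustration)
neutralMassToMzH(neutralMass, charge = 1)
neutralMassToMzH(neutralMass, charge = 2)
```

Using the proton mass rather than the mass of a hydrogen atom accounts for the missing electron of the cation.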
fragments.part()
generates a table of fragments which could arise from fragmenting part of the peptide. The ion series generated are: a, a-H2O, a-NH3, b, b-H2O, b-NH3, b+H2O, c, x, y, y-H2O, y-NH3, z. Please note that the calculation is relatively 'dumb': it does NOT check whether a fragment is possible at all. A prime example is the b+H2O ion series: these fragment ions can only occur if certain conditions are met, and this function currently does not check those conditions/assumptions
peptide$fragments.part(startSeq = 1, endSeq = 1, ignoreModifications = FALSE, onlyIons = TRUE, chargeState = 1, returnFormulas = FALSE, formulaIncludeChargeProtons = FALSE)
startSeq |
integer vector, specifies the start of the part of the peptide sequence |
endSeq |
integer vector, specifies the end of the part of the peptide sequence |
ignoreModifications |
if FALSE then modifications (both fixed & variable) are taken into account when calculating the chemical formula of the peptide |
onlyIons |
default = TRUE, only information on the 13 (earlier mentioned) ion series is generated. If FALSE then an additional 10 columns are generated with info on the ion series |
chargeState |
charge state of the ions in the generated table |
returnFormulas |
default = FALSE, if TRUE then instead of numerical values the table will be populated by the chemical formulas of the neutral fragments or charged fragment ions |
formulaIncludeChargeProtons |
default = FALSE, if TRUE then protons will be included in the formulas (ignored when returnFormulas = FALSE) |
a data.frame with fragment information
fragments()
generates a table of fragments which could arise from fragmenting the full sequence of the peptide. The ion series generated are: a, a-H2O, a-NH3, b, b-H2O, b-NH3, b+H2O, c, x, y, y-H2O, y-NH3, z. Please note that the calculation is relatively 'dumb': it does NOT check whether a fragment is possible at all. A prime example is the b+H2O ion series: these fragment ions can only occur if certain conditions are met, and this function currently does not check those conditions/assumptions
peptide$fragments(ignoreModifications = FALSE, onlyIons = TRUE, chargeState = 1, returnFormulas = FALSE, formulaIncludeChargeProtons = FALSE)
ignoreModifications |
if FALSE then modifications (both fixed & variable) are taken into account when calculating the chemical formula of the peptide |
onlyIons |
default = TRUE, only information on the 13 (earlier mentioned) ion series is generated. If FALSE then an additional 10 columns are generated with info on the ion series |
chargeState |
charge state of the ions in the generated table |
returnFormulas |
default = FALSE, if TRUE then instead of numerical values the table will be populated by the chemical formulas of the neutral fragments or charged fragment ions |
formulaIncludeChargeProtons |
default = FALSE, if TRUE then protons will be included in the formulas (ignored when returnFormulas = FALSE) |
a data.frame with fragment information
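As a rough illustration of two of the ion series named above, the following base R sketch computes singly charged b and y ions for an unmodified peptide. It is not the package's implementation: `simpleByIons()` and the small `residueMass` table are hypothetical, the table only covers the residues of the example, and the other ion series and charge states are left out.

```r
# Monoisotopic residue masses (Da), only for the amino acids used below
residueMass <- c(S = 87.03203, A = 71.03711, M = 131.04049, P = 97.05276,
                 L = 113.08406, E = 129.04259, R = 156.10111)
waterMass <- 18.01056
protonMass <- 1.00728

# Hypothetical helper: singly charged b and y ions of an unmodified peptide
simpleByIons <- function(sequence) {
  residues <- strsplit(sequence, "", fixed = TRUE)[[1]]
  masses <- unname(residueMass[residues])
  stopifnot(!anyNA(masses))
  n <- length(masses)
  # b(i): the first i residues plus a proton
  b <- cumsum(masses)[-n] + protonMass
  # y(i): the last i residues plus water plus a proton
  y <- cumsum(rev(masses))[-n] + waterMass + protonMass
  data.frame(index = seq_len(n - 1), b = b, y = y)
}

simpleByIons("SAMPLER")
```

Like the method described above, this sketch does not check whether a given fragment is chemically plausible.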
fragments.part.immoniumIons()
generates a numeric vector containing 'expected' immonium ions based on the amino acid content of part of the peptide. Please note that this function does NOT take into account possible (fixed or variable) modifications
peptide$fragments.part.immoniumIons(startSeq = 1, endSeq = 1)
startSeq |
integer vector, specifies the start of the part of the peptide sequence |
endSeq |
integer vector, specifies the end of the part of the peptide sequence |
numeric vector
fragments.immoniumIons()
generates a numeric vector containing 'expected' immonium ions based on the amino acid content of the full sequence of the peptide. Please note that this function does NOT take into account possible (fixed or variable) modifications
peptide$fragments.immoniumIons()
numeric vector
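Immonium ion m/z values follow from the residue masses: the residue loses CO and gains a proton. The sketch below shows this in base R under that assumption; `simpleImmoniumIons()` and the `residueMass` table are hypothetical and not taken from the package, and modifications are ignored, as in the methods above.

```r
# Monoisotopic residue masses (Da), only for the amino acids used below
residueMass <- c(S = 87.03203, A = 71.03711, M = 131.04049, P = 97.05276,
                 L = 113.08406, E = 129.04259, R = 156.10111)
carbonMonoxideMass <- 27.99491  # CO
protonMass <- 1.00728

# Hypothetical helper: immonium ions expected for the residues in a peptide
simpleImmoniumIons <- function(sequence) {
  residues <- unique(strsplit(sequence, "", fixed = TRUE)[[1]])
  masses <- residueMass[residues]
  stopifnot(!anyNA(masses))
  # immonium ion: residue minus CO, plus a proton
  sort(masses - carbonMonoxideMass + protonMass)
}

simpleImmoniumIons("SAMPLER")
```

The result is a named numeric vector, sorted by m/z, with one entry per distinct amino acid.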
clone()
The objects of this class are cloneable with this method.
peptide$clone(deep = FALSE)
deep |
Whether to make a deep clone. |
testPeptide <- peptide$new(sequence = "SAMPLER", modificationTable = aminoAcidModifications()$table, variableModifications = "0010000")
testPeptide
testPeptide$formula()
counts the occurrences of an amino acid (sequence) in another amino acid sequence
peptideCount(thePeptide = NA, searchPeptide = NA, doNotSplice = TRUE, upper = TRUE)
thePeptide |
character vector, the peptide to be searched |
searchPeptide |
character vector, the amino acid sequence to search for |
doNotSplice |
if FALSE then all characters in the searchPeptide are searched individually. If TRUE then the searchPeptide is searched as a whole. Default = TRUE |
upper |
convert both thePeptide & searchPeptide to uppercase before searching |
numeric vector
peptideCount("SAMPLER", "P")
peptideCount("SAMPLER", "PLER", doNotSplice = TRUE)
peptideCount("SAMPLER", "PLER", doNotSplice = FALSE)
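The difference between searching as a whole and searching character by character can be sketched in base R. `countOccurrences()` is a hypothetical stand-in, not the package function; in particular, returning one count per character when doNotSplice = FALSE is an assumption of this sketch, and overlapping matches are not counted.

```r
# Hypothetical stand-in for the behaviour described above
countOccurrences <- function(thePeptide, searchPeptide, doNotSplice = TRUE,
                             upper = TRUE) {
  if (upper) {
    thePeptide <- toupper(thePeptide)
    searchPeptide <- toupper(searchPeptide)
  }
  # Search either the whole string or each of its characters separately
  patterns <- if (doNotSplice) {
    searchPeptide
  } else {
    strsplit(searchPeptide, "", fixed = TRUE)[[1]]
  }
  vapply(patterns, function(pattern) {
    hits <- gregexpr(pattern, thePeptide, fixed = TRUE)[[1]]
    # gregexpr() returns -1 when there is no match
    sum(hits > 0)
  }, numeric(1))
}

countOccurrences("SAMPLER", "PLER", doNotSplice = TRUE)
countOccurrences("SAMPLER", "PLER", doNotSplice = FALSE)
```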
gives formula of a peptide string
peptideFormula(peptide, aminoAcids = aminoAcidResidues())
peptide |
character vector specifying the sequence of amino acids in a peptide |
aminoAcids |
R6 object of type 'chemicals' with the amino acid info, default = aminoAcidResidues() |
a numeric vector
does not check for non-amino acid letters, modifications cannot be specified
peptideFormula("SAMPLER")
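The underlying idea, summing the residue formulas and adding one water, can be sketched in base R without the package. Everything named here (`residueFormula`, `sumFormulas()`, `simplePeptideFormula()`) is hypothetical and only covers the residues of the example; the package itself takes its residue information from aminoAcidResidues().

```r
# Elemental compositions of amino acid residues (amino acid minus water),
# only for the residues used below
residueFormula <- list(
  S = c(C = 3, H = 5, N = 1, O = 2),
  A = c(C = 3, H = 5, N = 1, O = 1),
  M = c(C = 5, H = 9, N = 1, O = 1, S = 1),
  P = c(C = 5, H = 7, N = 1, O = 1),
  L = c(C = 6, H = 11, N = 1, O = 1),
  E = c(C = 5, H = 7, N = 1, O = 3),
  R = c(C = 6, H = 12, N = 4, O = 1)
)

# Hypothetical helper: add up a list of named numeric vectors by element
sumFormulas <- function(formulas) {
  elements <- sort(unique(unlist(lapply(formulas, names))))
  total <- setNames(numeric(length(elements)), elements)
  for (f in formulas) {
    total[names(f)] <- total[names(f)] + f
  }
  total
}

# Hypothetical helper: residues of the peptide plus one water for the termini
simplePeptideFormula <- function(peptide) {
  residues <- strsplit(peptide, "", fixed = TRUE)[[1]]
  stopifnot(all(residues %in% names(residueFormula)))
  sumFormulas(c(residueFormula[residues], list(c(H = 2, O = 1))))
}

simplePeptideFormula("SAMPLER")
```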
Generates a pre-defined (incomplete) table of names of fragments for the ions resulting when fragmenting a peptide in MS
peptideFragments()
a data.frame of two columns: 'name' and 'series name' of fragments
will possibly be removed in the future
gives ion m/z of the protonated peptide
peptideMzH(peptide, charge = 1, aminoAcids = aminoAcidResidues(), elementsInfo = elementsMonoisotopic())
peptide |
character vector specifying the sequence of amino acids in a peptide |
charge |
numeric vector specifying the charge of the peptide ion |
aminoAcids |
R6 object of type 'chemicals' with the amino acid info, default = aminoAcidResidues() |
elementsInfo |
R6 object of type 'elements' with the elements masses info, default = elementsMonoisotopic() |
a numeric vector
does not check for non-amino acid letters, modifications cannot be specified
peptideMzH("SAMPLER")
peptideMzH("SAMPLER", charge = 2)
peptideMzH("SAMPLER", elementsInfo = elementsAverage())
generates a pre-defined formula for proton
protonFormula()
a named numeric vector (formula)
print(protonFormula())
translates a cdkFormula object to a 'regular' formula format
rcdkFormula(cdkformula)
cdkformula |
an object of type rcdkFormula |
formula of format c(H=2, O=1)
This function does not deal with the charge state which is possibly defined in the rcdkFormula object
glucose <- rcdk::get.formula("C6H12O6")
rcdkFormula(glucose)
glucoseAdductIon <- rcdk::get.formula("C6H12O6Na1", charge = 1)
glucoseAdductIon
# to get to the same m/z value
glucoseAdductIon |> massSpectrometryR::rcdkFormula() |> formulaToMass() |> massToMz(adducts = -1)
removes elements that have number zero
removeZeros(formula)
formula |
named numeric vector, example c(O = 2, C = 1) |
named numeric vector (formula)
glucose = c(O=6, H=12, C=6)
glucose %f+% emptyFormula()
removeZeros(glucose %f+% emptyFormula())
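Since a formula is a plain named numeric vector, dropping zero counts amounts to logical subsetting. The one-line helper below is a hypothetical base R equivalent, not the package's code.

```r
# Hypothetical base R equivalent: keep only elements with a non-zero count
dropZeroElements <- function(formula) formula[formula != 0]

padded <- c(O = 6, H = 12, C = 6, N = 0, S = 0)
dropZeroElements(padded)
```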
sorts the elements of a formula in alphabetical order (increasing/decreasing)
sortFormula(formula, decrease = FALSE)
formula |
named numeric vector, example c(O = 2, C = 1) |
decrease |
logical flag on how to sort, default = FALSE: increasing |
named numeric vector (formula)
glucose = c(O=6, H=12, C=6)
glucose
sortFormula(glucose)
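Sorting by element name can be expressed with order() on the names of the vector. `sortElements()` below is a hypothetical base R sketch of that idea, not the package's implementation.

```r
# Hypothetical base R sketch: reorder a formula by its element names
sortElements <- function(formula, decrease = FALSE) {
  formula[order(names(formula), decreasing = decrease)]
}

glucose <- c(O = 6, H = 12, C = 6)
sortElements(glucose)
sortElements(glucose, decrease = TRUE)
```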
Translates a character vector formula, eg 'C6H12O6' to a regular formula c(C=6, H=12, O=6)
stringFormula(string)
string |
character vector, format eg: 'C6H12O6' |
formula of format c(H=2, O=1)
it's imperative that every element has a number (count), otherwise this function is highly likely to malfunction and return NA
stringFormula("H3O4P1")
stringFormula("C6H12O6")
Translates a character vector formula, eg 'C6H12O6' to a regular formula c(C=6, H=12, O=6)
stringToFormula(string, removeCharacters = c("\\[", "\\]"))
string |
character vector, format eg: 'C6H12O6' |
removeCharacters |
character vector. Defines which characters should be removed from the element names. Regular expressions can be used. Default is removal of the brackets used for isotopes. If NA, then nothing will be removed, see examples |
formula of format c(H=2, O=1)
this function is an improved version of stringFormula(). Every element with count 1 can now have the number omitted. However, the function depends on 'correct' element symbols (first letter is uppercase, second letter is lowercase). This function also allows for the presence of isotopes, eg '[13]C' or '[2]H2O'
stringToFormula("H3O4P1")
stringToFormula("C6H12O6")
stringToFormula("C6H5Br")
stringToFormula("[13]C6H12O5[18]O")
stringToFormula("[13]C6H12O5[18]O", removeCharacters = NA)
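One way to parse such strings is a regular expression with one token per element. The base R sketch below illustrates this; `parseFormulaString()` is hypothetical and not the package's implementation. It assumes element symbols of one uppercase letter optionally followed by one lowercase letter, and it keeps isotope brackets in the names instead of removing them.

```r
# Hypothetical parser: "C6H5Br" -> named numeric vector of element counts
parseFormulaString <- function(string) {
  # One token per element: optional "[isotope]", symbol, optional count
  tokens <- regmatches(
    string,
    gregexpr("(\\[[0-9]+\\])?[A-Z][a-z]?[0-9]*", string)
  )[[1]]
  # Strip the (isotope and) symbol to leave the count; the rest is the name
  counts <- sub("^(\\[[0-9]+\\])?[A-Z][a-z]?", "", tokens)
  elements <- substr(tokens, 1, nchar(tokens) - nchar(counts))
  # An omitted count means one atom
  counts[counts == ""] <- "1"
  counts <- as.numeric(counts)
  # Add up repeated elements, e.g. the two O tokens in "CH3COOH"
  totals <- tapply(counts, elements, sum)
  setNames(as.numeric(totals), names(totals))
}

parseFormulaString("C6H5Br")
parseFormulaString("[13]C6H12O5[18]O")
```

Because tapply() groups by name, the elements come back in sorted order rather than in the order of the input string.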
subtracting one formula from another, taking into account possible differing elements
subtractFormulas(formula1, formula2)
formula1 |
named numeric vector, example c(O = 2, C = 1); formula to be subtracted from |
formula2 |
named numeric vector, example c(H = 2, S = 1); formula to subtract |
a named numeric vector (formula)
There are no checks for negative values!
subtractFormulas(c(H = 2, O = 1), c(H = 1))
subtractFormulas(c(H = 2, O = 1), c(S = 1, O = 2))
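Handling differing elements comes down to working over the union of the element names. The base R sketch below shows this and also why negative counts can appear; `subtractElementCounts()` is hypothetical, not the package's implementation, and assumes that element names are not repeated within a vector.

```r
# Hypothetical sketch: subtract formula2 from formula1, element by element
subtractElementCounts <- function(formula1, formula2) {
  elements <- union(names(formula1), names(formula2))
  result <- setNames(numeric(length(elements)), elements)
  result[names(formula1)] <- formula1
  # Elements missing from formula1 start at zero and can go negative
  result[names(formula2)] <- result[names(formula2)] - formula2
  result
}

subtractElementCounts(c(H = 2, O = 1), c(H = 1))
subtractElementCounts(c(H = 2, O = 1), c(S = 1, O = 2))
```

The second call yields negative counts for O and S, which is the situation the note above warns about.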
checks if formula is valid
validFormula(formula, string = FALSE)
formula |
character vector or named numeric vector, representing the formula to be checked |
string |
logical vector specifying if the formula is a character vector or not |
logical vector
This is done via the check_chemform function from the package enviPat
glucose <- c(C=6, H=12, O=6)
validFormula(glucose)
formulaString(glucose)
validFormula(formulaString(glucose), string = TRUE)
generates a pre-defined formula for water
waterFormula()
a named numeric vector (formula)
print(waterFormula())