Package: proteinDiscover 0.11.0

proteinDiscover: ProteinDiscover

Provides an interface to the data contained in Proteome Discoverer (Thermo Scientific) results.

Authors: Ben Bruyneel

proteinDiscover_0.11.0.tar.gz
proteinDiscover_0.11.0.zip (r-4.5), proteinDiscover_0.11.0.zip (r-4.4), proteinDiscover_0.11.0.zip (r-4.3)
proteinDiscover_0.11.0.tgz (r-4.4-any), proteinDiscover_0.11.0.tgz (r-4.3-any)
proteinDiscover_0.11.0.tar.gz (r-4.5-noble), proteinDiscover_0.11.0.tar.gz (r-4.4-noble)
proteinDiscover_0.11.0.tgz (r-4.4-emscripten), proteinDiscover_0.11.0.tgz (r-4.3-emscripten)
proteinDiscover.pdf | proteinDiscover.html
proteinDiscover/json (API)

# Install 'proteinDiscover' in R:
install.packages('proteinDiscover', repos = c('https://benbruyneel.r-universe.dev', 'https://cloud.r-project.org'))

Bug tracker: https://github.com/benbruyneel/proteindiscover/issues

mass-spectrometry, proteomics, proteomics-data-analysis

82 exports 1 stars 0.36 score 36 dependencies 2 scripts

Last updated 6 months ago from:04b6e22d41. Checks: OK: 7. Indexed: yes.

Target | Result | Date
Doc / Vignettes | OK | Aug 20 2024
R-4.5-win | OK | Aug 20 2024
R-4.5-linux | OK | Aug 20 2024
R-4.4-win | OK | Aug 20 2024
R-4.4-mac | OK | Aug 20 2024
R-4.3-win | OK | Aug 20 2024
R-4.3-mac | OK | Aug 20 2024

Exports: allNodesTable, analysisDefinition, blobLength, calcAllIFIs, calcData, calcIFIs, columnSpecials, createDiagrammeRString, dbClose, dbGetAnnotatedProteins, dbGetAnnotationGroups, dbGetAnnotationGroupsFiltered, dbGetConsensusIDs, dbGetConsensusTable, dbGetMassSpectrumItems, dbGetModificationPeptideIDs, dbGetModificationsSitesIDs, dbGetModificationsTable, dbGetMSnSpectrumInfo, dbGetPeptideIDs, dbGetPeptideTable, dbGetProteinAnnotationGroupIDs, dbGetProteinFiltered, dbGetProteinGroupIDs, dbGetProteinGroups, dbGetProteinIDs, dbGetProteins, dbGetProteinTable, dbGetProteinUniqueSequenceIDs, dbGetPsmIDs, dbGetPsmTable, dbGetQuanSpectrumIDs, dbGetQuanSpectrumInfoTable, dbGetTable, dbOpen, determineBlobTypes, df_replace, dfTransformRaws, getAcquistionDate, getAcquistionDateTime, getBlobs, getPeptideInfo, getPeptideInfoRaw, getProteinInfo, getProteinInfoRaw, isMasterProtein, knockOutProteins, MSfileInfo, na.date, nodes, nodeTable, pQuanInfo, proteinIDTypes, psmAmbiguity, quanInfo, quanInfoDetails, replacementStrings, SearchInfo, spectrum.centroid, spectrum.header, spectrum.precursor.additionalInfo, spectrum.precursor.centroid, spectrum.precursor.header, spectrum.precursor.info, spectrum.precursor.profile, spectrum.precursor.scanEvent, spectrum.profile, spectrum.scanEvent, studyDefinitionExtensions, studyDefinitionExtensionSettings, studyDefinitionFactors, studyDefinitionFileSets, studyDefinitionQuanMethods, studyDefinitionSamples, system.date, tableNames, thermo.date, tmt10Channels, tmt11Channels, totalSearchTime, transformSpectrumRaw, workflowInfo

Dependencies: bit, bit64, blob, cachem, cli, cpp11, DBI, dplyr, fansi, fastmap, generics, glue, later, lifecycle, lubridate, magrittr, memoise, pillar, pkgconfig, plogr, pool, purrr, R6, Rcpp, rlang, RSQLite, stringi, stringr, tibble, tidyr, tidyselect, timechange, utf8, vctrs, withr, XML

manual

Rendered from manual.Rmd using knitr::rmarkdown on Aug 20 2024.

Last update: 2022-10-15
Started: 2022-01-02

Readme and manuals

Help Manual

Help page | Topics
Helper function that takes the result of the 'nodes' function, a named list of parameter tables (from a processing or consensus workflow), and combines it into a single table with the names of the nodes as an extra column | allNodesTable
Gets the first element of the AnalysisDefinitionXML column from the AnalysisDefinition table in a .pdResult file | analysisDefinition
Attempts to determine the length (in bytes) of the individual elements of a blob-type column of a data.frame. It should return an integer value, as all elements are supposed to have the same length. If all elements of the column are NA, the result will be NaN | blobLength
Wrapper function that uses 'tmt11Channels' to calculate the IFIs for a set of (knockout) protein channels | calcAllIFIs
Helper function to calculate a row-wise function (such as mean or median) across a data.frame | calcData
Calculates the IFI (interference-free index) of a protein entry in the protein table of a .pdResult file. Note that this can only be calculated for the knockout proteins in the TKO control sample: see 'tmt10Channels' or 'tmt11Channels' for the eligible proteins | calcIFIs
Specials are not numeric or integer, but have chunks of a certain size. All specials encountered in Proteome Discoverer are actually booleans with a value of 0 (FALSE), 1 (TRUE) or NA | columnSpecials
Creates a DiagrammeR string that can be used by DiagrammeR::grViz() to plot a visual representation of the workflow | createDiagrammeRString
Wrapper around pool::poolClose(): closes an open database (normally opened earlier via e.g. dbOpen()) | dbClose
Gets the UniqueSequenceIDs of proteins that are in a protein annotation group. Essentially does the reverse of 'dbGetProteinAnnotationGroupIDs'. The output of this function can serve as the input for 'dbGetProteins' | dbGetAnnotatedProteins
Gets the info for (protein) annotation groups. Takes e.g. the output of 'dbGetProteinAnnotationGroupIDs' as input | dbGetAnnotationGroups
Gets group annotation information from the AnnotationProteinGroups table. This can be done via the GroupAnnotationAccession or via the description of an annotation. When using the description, it is possible to use the SQL 'like' operator | dbGetAnnotationGroupsFiltered
Gets the ConsensusIDs from (a set of) PeptideGroupIDs | dbGetConsensusIDs
Gets the Consensus Features table belonging to the ConsensusIDs | dbGetConsensusTable
Gets the MassSpectrumItems info from (a set of) PeptideIDs | dbGetMassSpectrumItems
Gets the PeptideIDs 'belonging' to a modification site | dbGetModificationPeptideIDs
Gets the modification site IDs from (a set of) proteinUniqueIDs | dbGetModificationsSitesIDs
Gets data from the ModificationSites table using the modification site IDs | dbGetModificationsTable
Gets the MSnSpectrumInfo from (a set of) PeptideIDs | dbGetMSnSpectrumInfo
Gets the PeptideIDs from (a set of) proteinGroupIDs | dbGetPeptideIDs
Gets the peptide table defined by PeptideIDs or proteinGroupIDs | dbGetPeptideTable
Gets the IDs of the functional annotation groups for proteins. This function does essentially the reverse of 'dbGetAnnotatedProteins'. The output of this function can serve as the input for 'dbGetAnnotationGroups' | dbGetProteinAnnotationGroupIDs
A slightly more advanced version of 'dbGetProteinTable' that allows for filtering (via SQL). Note that filtering on raw columns (BLOBs) will not work properly | dbGetProteinFiltered
Retrieves the ProteinGroupIDs of proteins via their UniqueSequenceIDs | dbGetProteinGroupIDs
Gets the ProteinGroup information from the TargetProteinGroups table | dbGetProteinGroups
Gets proteinUniqueIDs from (a set of) protein group IDs (e.g. from a protein group table, or from dbGetProteinGroupIDs). This allows for getting all proteins (including non-master proteins) that together make up a protein group. Normally only the master protein is shown in a protein table | dbGetProteinIDs
Gets protein information from the TargetProteins table on the basis of their UniqueSequenceID | dbGetProteins
Gets the protein table from a .pdResult file (essentially a wrapper around dbGetTable()) | dbGetProteinTable
Retrieves the UniqueSequenceIDs based on the accession field of the protein table. Essentially a wrapper for 'dbGetProteinFiltered' | dbGetProteinUniqueSequenceIDs
Gets the PsmIDs from (a set of) PeptideGroupIDs | dbGetPsmIDs
Gets the PSM table belonging to the PsmIDs | dbGetPsmTable
Gets the SpectrumIDs from (a set of) PeptideIDs | dbGetQuanSpectrumIDs
Gets the QuanSpectrumInfo table belonging to the SpectrumIDs | dbGetQuanSpectrumInfoTable
Gets a table from a .pdResult file | dbGetTable
Wrapper around pool::dbPool(): opens a database | dbOpen
Attempts to assign types and sizes to the blob-type columns in a table. The result of this function can be used in the dfTransformRaws function | determineBlobTypes
Replaces (parts of) strings in a data.frame according to a provided table of replacements | df_replace
Converts raw columns in a data.frame to the correct data types | dfTransformRaws
Retrieves the acquisition date of the files used to generate the .pdResult file | getAcquistionDate
Retrieves the acquisition date and time of the files used to generate the .pdResult file | getAcquistionDateTime
Determines which columns in a table are of the blob (raw) type | getBlobs
Gets peptide information from the peptide table of a .pdResult file based on the provided protein accession (UniProt) codes. Raw columns are "translated" | getPeptideInfo
Gets peptide information from the peptide table of a .pdResult file based on the provided protein accession (UniProt) codes. Raw columns are not "translated" | getPeptideInfoRaw
Gets protein info (with translation of columns) for a list of protein accessions (UniProt codes). Essentially a wrapper function for 'getProteinInfoRaw' | getProteinInfo
Gets protein info (without translation of columns) for a list of protein accessions (UniProt codes). Essentially a wrapper function for 'dbGetTable' | getProteinInfoRaw
'Translates' the isMasterProtein values (0..4) in the protein table to words (as in Proteome Discoverer) | isMasterProtein
Helper function to generate a data.frame of protein info for other functions | knockOutProteins
Gets the table with info on the files used in the search from the database | MSfileInfo
Fake converter for when no conversion is wanted or needed | na.date
Takes a (xmlToList-type) workflow and returns a list of nodes | nodes
Displays an overview table of the processing/consensus workflows in the nodeInfo coming out of the workflowInfo function | nodeTable
'Translates' the QuanInfos values in the PSM and peptide tables to words (as in Proteome Discoverer) | pQuanInfo
Gets the names of the identification types (Sequest HT etc.) used in the database | proteinIDTypes
'Translates' the psmAmbiguity values (1..5) in the PSM table to words (as in Proteome Discoverer). <...> means not encountered/undefined/no inference | psmAmbiguity
'Translates' the QuanInfo values in the QuanSpectrumInfo table to words (as in Proteome Discoverer) | quanInfo
'Translates' the QuanInfoDetails values in the QuanSpectrumInfo table to words (as in Proteome Discoverer) | quanInfoDetails
Generates the default data.frame for the function df_replace() | replacementStrings
Gets the table with info on the search itself from the database | SearchInfo
Gets the info from the list object returned by 'transformSpectrumRaw': centroided spectrum | spectrum.centroid
Gets the info from the list object returned by 'transformSpectrumRaw': spectrum header | spectrum.header
Gets the info from the list object returned by 'transformSpectrumRaw': parent additional info | spectrum.precursor.additionalInfo
Gets the info from the list object returned by 'transformSpectrumRaw': parent centroided spectrum | spectrum.precursor.centroid
Gets the info from the list object returned by 'transformSpectrumRaw': parent header | spectrum.precursor.header
Gets the info from the list object returned by 'transformSpectrumRaw': parent monoisotopic peak | spectrum.precursor.info
Gets the info from the list object returned by 'transformSpectrumRaw': parent profile spectrum | spectrum.precursor.profile
Gets the info from the list object returned by 'transformSpectrumRaw': parent scan event | spectrum.precursor.scanEvent
Gets the info from the list object returned by 'transformSpectrumRaw': profile spectrum | spectrum.profile
Gets the info from the list object returned by 'transformSpectrumRaw': spectrum scan event | spectrum.scanEvent
Extracts information on isotope corrections (if available) | studyDefinitionExtensions
Extracts sample/factor/ratio/replicate information | studyDefinitionExtensionSettings
Extracts the factors used in the study to generate the .pdResult file. The result contains some internal info in the form of columns named id (identifiers) | studyDefinitionFactors
Extracts file information on the original .raw files used to generate the .pdResult file. Information includes the original file name, location and size. It also contains some internal info in the form of columns named id (identifiers) | studyDefinitionFileSets
Extracts quantification method information if a quantification method was used to generate the .pdResult file | studyDefinitionQuanMethods
Extracts sample information. The information seems somewhat redundant, as it is also found in other tables | studyDefinitionSamples
Converts a character string date into date/time format | system.date
Internal helper function that saves having to remember the somewhat long names of the most used tables | tableNames
Converts a character string date into date/time format | thermo.date
Helper function to generate a data.frame of TMT knockout strain (TKO) info for other functions. This function generates a data.frame based on the 10-plex TMT TKO knockout (the original TMT knockout digest available) | tmt10Channels
Helper function to generate a data.frame of TMT knockout strain (TKO) info for other functions. This function generates a data.frame based on the 11-plex TMT TKO knockout | tmt11Channels
Gets the total search time from the database | totalSearchTime
Transforms a spectrum from the 'MassSpectrumItems' table into an R-compatible list | transformSpectrumRaw
Gets the workflow information from a .pdResult file | workflowInfo
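
The help pages above describe a recurring pattern: open the database through pool, read a table, find the blob (raw) columns, convert them to ordinary values, and close the database again. The sketch below mimics those steps with pool, DBI, RSQLite and base R only. It does not call proteinDiscover itself, because the argument lists of its functions are not shown on this page. The file, the table name 'ToyProteins', the column names and the blob layout (two little-endian 8-byte doubles per row) are all assumptions made up for illustration and may differ from what a real .pdResult file contains.

```r
# Sketch only: stand-in data, not a real .pdResult file and not the package API.

# Hypothetical stand-in for a result file.
dbFile <- tempfile(fileext = ".sqlite")

# Open: the kind of call that 'dbOpen' is described as wrapping.
db <- pool::dbPool(drv = RSQLite::SQLite(), dbname = dbFile)

# Build a toy table with one blob column (assumed layout: two doubles per row).
toy <- data.frame(Accession = c("P00001", "P00002"))
toy$Abundances <- list(
  writeBin(c(1.5, 2.5), raw(), size = 8, endian = "little"),
  writeBin(c(3.0, 4.0), raw(), size = 8, endian = "little")
)
DBI::dbWriteTable(db, "ToyProteins", toy)

# Read the table back; blob columns come back as list-like columns of raw vectors.
tbl <- DBI::dbReadTable(db, "ToyProteins")

# Find the blob columns (the idea behind 'getBlobs').
isBlob <- vapply(tbl, is.list, logical(1))
blobColumns <- names(tbl)[isBlob]

# Length in bytes of each blob element (the idea behind 'blobLength').
blobBytes <- unique(lengths(tbl$Abundances))

# Convert the raw bytes to doubles (the idea behind 'dfTransformRaws').
decoded <- lapply(tbl$Abundances, readBin,
                  what = "double", n = 2, size = 8, endian = "little")

# Close: the kind of call that 'dbClose' is described as wrapping.
pool::poolClose(db)
unlink(dbFile)
```

With a real result file, the element size and type of each blob column would have to be determined first, which is what 'determineBlobTypes' is described as attempting.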