Extracting biological names and relations from texts - PowerPoint PPT Presentation

Transcript and Presenter's Notes

Title: Extracting biological names and relations from texts


1
Extracting biological names and relations from
texts
  • Ting-Yi Sung
  • Bioinformatics Program, TIGP
  • Institute of Information Science
  • Academia Sinica
  • 2004/12/16

2
Motivation
  • To automatically extract information from natural
    language text.
  • The need arises from the rapid accumulation of
    biomedical literature.
  • Expedite survey efforts
  • Support database curation (automatically
    associate papers with database records)

3
Targets of Information Extraction
  • Protein-Protein interaction/binding/inhibition
  • Protein-Small Molecules
  • Gene-Gene regulation
  • Gene-Gene Product interaction
  • Gene-Drug relation
  • Protein-Subcellular location
  • Amino Acid-Protein relation
  • Example relationships between genes and drugs
  • The gene is the drug target
  • The gene confers resistance to the drug
  • The gene metabolizes the drug

4
Information Extraction Tasks
Identify Target Named Entities
Identify Relations among Named Entities
Identify Relations among Events and Named Entities
Associate Results with existing database records
5
Outline
  • NER (named entity recognition) in biomedical
    domain
  • Challenges in biomedical NER
  • State of progress in NER
  • Abbreviation disambiguation
  • Future works

6
What is NER?
  • NER
  • Named Entity Recognition
  • Including two tasks
  • Identification of proper names in text
  • Classification of proper names in text
  • Newswire Domain
  • Person, Location, Organization
  • Biomedical Domain
  • Protein, DNA, RNA, Body Part, Cell Type, Lipid,
    etc.

7
Example of NER - Biomedical
[Figure: an example sentence with named entities highlighted and labeled as Protein, Tissue, and Disease]
8
NER in biomedical domain
  • BioNER aims to recognize the following names
  • First Priority
  • Protein name, DNA name, RNA name
  • Second Priority
  • cell type, other organic compound, cell line,
    lipid, multi-cell, virus, cell component, body
    part, tissue, amino acid monomer, polynucleotide,
    mono-cell, inorganic, peptide, nucleotide, atom,
    other artificial source, carbohydrate, organic

9
The Overall Spectrum
  • BioNER is only the starting point of biological
    information extraction
  • A whole suite of NLP techniques is needed to
    handle relations and events in literature mining
  • Techniques developed for BioNER should be
    adaptable to problems in later stages,
    e.g., NE relation recognition

10
Intrinsic Features of BioNER
  • Unknown words
  • Long compound words
  • Variations of expressions
  • Nested NEs

11
Unknown Words
  • Words containing hyphens, digits, letters, Greek
    letters, or Roman numerals
  • Alpha B1
  • Adenylyl cyclase 76E
  • Latent membrane protein 1
  • 4-mycarosyl isovaleryl-CoA transferase
  • oligodeoxyribonucleotide
  • 18-deoxyaldosterone
  • Abbreviation and Acronym
  • IL, TECd, IFN, TPA

12
Long Compound words
  • interleukin 1 (IL-1)-responsive kinase
  • interleukin 1-responsive kinase
  • epidermal growth factor receptor
  • SH2 domain containing tyrosine kinase Syk
  • SH2 domain (GENIA example)

13
Various expressions of the same NE
  • Spelling variation
  • N-acetylcysteine, N-acetyl-cysteine,
    NAcetylCysteine
  • Word permutation
  • beta-1 integrin, integrin beta-1
  • Ambiguous expressions
  • epidermal growth factor receptor, EGF receptor,
    EGFR
  • c-jun, c-Jun, c jun
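A minimal sketch of one way to collapse such spelling variants to a single lookup key (the rules here are illustrative, not those of any particular system; word permutations such as "integrin beta-1" additionally need token reordering, and ambiguous expressions still need context):

```python
import re

def normalize_name(name: str) -> str:
    """Map spelling variants of a biomedical name to one canonical key."""
    key = name.lower()                   # c-Jun, c-jun, c jun -> same case
    key = re.sub(r"[-\s]+", "", key)     # drop hyphens/spaces: N-acetyl-cysteine -> nacetylcysteine
    return key

# The three spelling variants from the slide collapse to one key.
variants = ["N-acetylcysteine", "N-acetyl-cysteine", "NAcetylCysteine"]
assert len({normalize_name(v) for v in variants}) == 1
```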

14
Various expressions: the name explains its function
  • the Ras guanine nucleotide exchange factor Sos
  • the Ras guanine nucleotide releasing protein Sos
  • the Ras exchanger Sos
  • the GDP-GTP exchange factor Sos
  • Sos(mSos), a GDP/GTP exchange protein for Ras

15
Various expressions: the name includes a preposition
and/or conjunction (ambiguity of dependencies)
  • p85 alpha subunit of PI 3-kinase
  • SH2 and SH3 domains of Src
  • NF-AT1 , AP-1 , and NF-kB sites
  • E2F1 and -3
  • Residues 432, 435, 437, 438, and 440

16
Nested Named Entity
  • An NE embedded in another NE.
  • "IL-2" (protein)
  • "IL-2 gene" (gene)
  • "CBP/p300 associated factor" (protein)
  • "CBP/p300 associated factor binding promoter" (DNA)

17
Outline
  • NER (named entity recognition) in biomedical
    domain
  • Challenges in biomedical NER
  • State of progress in NER
  • Abbreviation disambiguation
  • Future works

18
Challenges of NER
  • Unknown word identification
  • Named entity boundary detection
  • Class disambiguation

19
Challenges
  • Unknown word identification
  • t(10;11)(p13;q14)
  • DNA methyltransferase
  • 73 kDa protein
  • interleukin 1 (IL-1)-responsive kinase (NE may
    contain an abbreviation within it.)
  • Some unknown words occur very few times in the
    corpus and are therefore hard to recognize.

20
Challenges (cont'd)
  • NE boundary detection
  • The boundary can be a regular English word, an
    unknown word, a Roman numeral, or a digit.
  • MHC Class II
  • latent protein 1 (The left boundary is an
    adjective)
  • cyclin-like UDG gene product
  • Conjunction (and, or, ...)
  • alpha- and beta-globin
  • human and mouse gene
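One common remedy for the hyphen-coordination case above ("alpha- and beta-globin") is to expand the conjunction before boundary detection. A rough sketch (a regex-only simplification; the "human and mouse gene" case needs POS information and is not handled here):

```python
import re

def expand_coordination(phrase: str) -> list[str]:
    """Expand 'X- and Y-Z' into ['X-Z', 'Y-Z'] (e.g., 'alpha- and beta-globin')."""
    m = re.match(r"^(\w+)- (?:and|or) (\w+)-(\w+)$", phrase)
    if not m:
        return [phrase]
    left, right, head = m.groups()
    return [f"{left}-{head}", f"{right}-{head}"]

print(expand_coordination("alpha- and beta-globin"))  # ['alpha-globin', 'beta-globin']
```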

21
Challenges (cont'd)
  • Classification of abbreviations
  • NF-AT
  • Full name: nuclear factor of activated T cells
  • Class: Protein
  • HTLV-I
  • Full name: Human T cell lymphotropic virus I
  • Class: Virus
  • TCDD
  • Full name: 2,3,7,8-tetrachlorodibenzo-p-dioxin
  • Class: Other Organic
  • GRE
  • Full name: glucocorticoid response element
  • Class: DNA

22
Outline
  • NER (named entity recognition) in biomedical
    domain
  • Challenges in biomedical NER
  • State of progress in NER
  • Abbreviation disambiguation
  • Future works

23
State-of-the-art Systems on NER: Two evaluation contests
  • BioCreative 2004 (March)
  • Critical Assessment of Information Extraction
    Systems in Biology
  • Task 1: Entity extraction
  • Target: genes (or proteins, where there is
    ambiguity)
  • 10,000 sentences from MEDLINE as training data
    and 5,000 sentences as testing data
  • BioNLP 2004 (August)
  • GENIA corpus as training data and 404 abstracts
    as testing data
  • Target: 5 classes, i.e., protein, DNA, RNA, cell
    line, and cell type.
  • Both use exact match scoring.
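Exact-match scoring credits a predicted entity only if both its boundaries and its class agree with a gold annotation. A minimal sketch of that metric, with entities as (start, end, class) tuples (an illustrative simplification of the official evaluation scripts):

```python
def exact_match_prf(gold: set, predicted: set):
    """Precision/recall/F1 where an entity counts only on an exact (start, end, class) match."""
    tp = len(gold & predicted)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

gold = {(0, 3, "protein"), (10, 12, "DNA")}
pred = {(0, 3, "protein"), (10, 13, "DNA")}   # second span is off by one token -> no credit
print(exact_match_prf(gold, pred))             # (0.5, 0.5, 0.5)
```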

24
BioNLP 2004 Datasets
Data set                 # of abstracts   # of sentences        # of tokens
Training Set                      2,000   20,546 (10.27/abs)    472,006 (236.00/abs, 22.97/sen)
Test Set (Total)                    404    4,260 (10.54/abs)     96,780 (239.55/abs, 22.72/sen)
Test Set (1978-1989)                104      991 ( 9.53/abs)     22,320 (214.62/abs, 22.52/sen)
Test Set (1990-1999)                106    1,115 (10.52/abs)     25,080 (236.60/abs, 22.49/sen)
Test Set (2000-2001)                130    1,452 (11.17/abs)     33,380 (256.77/abs, 22.99/sen)
Test Set (S/1998-2001)              204    2,254 (11.05/abs)     51,628 (253.08/abs, 22.91/sen)
25
R/P/F   1978-1989 set  1990-1999 set  2000-2001 set  S/1998-2001 set  Total
Zho04 75.3 / 69.5 / 72.3 77.1 / 69.2 / 72.9 75.6 / 71.3 / 73.8 75.8 / 69.5 / 72.5 76.0 / 69.4 / 72.6
Fin04 66.9 / 70.4 / 68.6 73.8 / 69.4 / 71.5 72.6 / 69.3 / 70.9 71.8 / 67.5 / 69.6 71.6 / 68.6 / 70.1
Set04 63.6 / 71.4 / 67.3 72.2 / 68.7 / 70.4 71.3 / 69.6 / 70.5 71.3 / 68.8 / 70.1 70.3 / 69.3 / 69.8
Son04 60.3 / 66.2 / 63.1 71.2 / 65.6 / 68.2 69.5 / 65.8 / 67.6 68.3 / 64.0 / 66.1 67.8 / 64.8 / 66.3
Zha04 63.2 / 60.4 / 61.8 72.5 / 62.6 / 67.2 69.1 / 60.2 / 64.7 69.2 / 60.3 / 64.4 69.1 / 61.0 / 64.8
Rös04 59.2 / 60.3 / 59.8 70.3 / 61.8 / 65.8 68.4 / 61.5 / 64.8 68.3 / 60.4 / 64.1 67.4 / 61.0 / 64.0
Par04 62.8 / 55.9 / 59.2 70.3 / 61.4 / 65.6 65.1 / 60.4 / 62.7 65.9 / 59.7 / 62.7 66.5 / 59.8 / 63.0
Lee04 42.5 / 42.0 / 42.2 52.5 / 49.1 / 50.8 53.8 / 50.9 / 52.3 52.3 / 48.1 / 50.1 50.8 / 47.6 / 49.1
BL 47.1 / 33.9 / 39.4 56.8 / 45.5 / 50.5 51.7 / 46.3 / 48.8 52.6 / 46.0 / 49.1 52.6 / 43.6 / 47.7
26
Current Methods
  • Machine Learning
  • HMM, SVM, ME (Maximum Entropy), CRF (Conditional
    Random Field)
  • Hybrid methods
  • Dictionary Based
  • Approximate String matching algorithm
  • Naming Rules
  • Dynamic Programming

27
Features for Machine Learning Methods
  • Morphological Features
  • Orthographical Features
  • POS Features
  • Genia POS tagger
  • Semantic Trigger Features
  • Head-noun Features
  • NF-kappaB consensus site
  • IL-2 gene

28
Morphological Features
Prefix/Suffix                Examples
-cin, -mide, -zole           actinomycin, cycloheximide, sulphamethoxazole
lipid, -rogen, vitamin       phospholipids, estrogen, dihydroxyvitamin
-blast, -cyte, -phil         erythroblast, thymocyte, eosinophil
phosph-, methyl-, immuno-    phosphorylation, methyltransferase, immunomodulator
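A minimal sketch of turning such affix cues into binary token features (the affix lists below are just the examples from the table, not a system's full lists):

```python
SUFFIXES = ["cin", "mide", "zole", "rogen", "blast", "cyte", "phil"]
PREFIXES = ["phosph", "methyl", "immuno", "lipid", "vitamin"]

def morphological_features(token: str) -> dict:
    """Binary prefix/suffix indicator features for one token."""
    t = token.lower()
    feats = {f"suffix={s}": 1 for s in SUFFIXES if t.endswith(s)}
    feats.update({f"prefix={p}": 1 for p in PREFIXES if t.startswith(p)})
    return feats

print(morphological_features("thymocyte"))         # {'suffix=cyte': 1}
print(morphological_features("phosphorylation"))   # {'prefix=phosph': 1}
```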
29
Orthographical Features
Orthographical Feature    Example
AllCaps                   EBNA, NFAT
AlphaDigit                p50, p65
AlphaDigitAlpha           IL23R, E1A
ATGCSequence              CCGCCC
CapLowAlpha               Src, Ras, Epo
CapMixAlpha               NFkappaB
CapsAndDigits             IL2, STAT4, SH2
DigitAlpha                2xNFkappaB
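A sketch of testing a few of these orthographic classes with regular expressions (illustrative patterns; pattern order matters, and each system defines the classes slightly differently):

```python
import re

ORTHO_PATTERNS = [
    ("ATGCSequence",    re.compile(r"^[ACGT]+$")),               # CCGCCC
    ("AlphaDigitAlpha", re.compile(r"^[A-Za-z]+\d+[A-Za-z]+$")),  # IL23R, E1A
    ("CapsAndDigits",   re.compile(r"^[A-Z]+\d+[A-Z\d]*$")),      # IL2, STAT4, SH2
    ("AllCaps",         re.compile(r"^[A-Z]+$")),                 # EBNA, NFAT
    ("AlphaDigit",      re.compile(r"^[a-z]+\d+$")),              # p50, p65
    ("DigitAlpha",      re.compile(r"^\d+[A-Za-z]+.*$")),         # 2xNFkappaB
]

def orthographic_feature(token: str) -> str:
    """Return the first matching orthographic class, or 'Other'."""
    for name, pattern in ORTHO_PATTERNS:
        if pattern.match(token):
            return name
    return "Other"

print(orthographic_feature("STAT4"))   # CapsAndDigits
print(orthographic_feature("p65"))     # AlphaDigit
```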
30
Head Nouns
Unigram: factor, protein, receptor, alpha, NF-kappaB, IL-2, cytokine, kinase, transcription, domain, complex, TNF-alpha, Nuclear, p50, CD28, TNF, molecule, subunit, cell, STAT3, family, tumor, factor-alpha, expression, interleukin
Bigram: NF-kappa B, transcription factor, I kappa, nuclear factor, protein kinase, B alpha, kinase C, tumor necrosis, T cell, glucocorticoid receptor, binding protein, factor alpha, adhesion molecule, monoclonal antibody, gene product, binding domain
31
Additional features used by Manning's group: local features
  • Clues within a sentence
  • Include
  • Previous NEs
  • Abbreviations: is the token an abbreviation, a long form, or neither
  • Parenthesis-matching
  • etc.

32
External resources used by Manning's group
  • Motivation
  • Contextual clues do not provide sufficient
    evidence for confident classification.
  • May be vulnerable to incompleteness, noise, and
    ambiguity.
  • Web
  • Least vulnerable to incompleteness, highly
    vulnerable to noise.
  • Prepare patterns for each class
  • For genes: "X gene", "X antagonist", "X mutation"
  • For RNA: "X mRNA", ...
  • For proteins: "X ligation", ...
  • Features: web-protein, web-RNA, O-web, ...
  • Does not work well in the BioNLP task.

33
External resources (2)
  • Gazetteers (dictionaries)
  • Are arguably subject to all three (incompleteness,
    noise, and ambiguity), yet have been used
    successfully in some systems.
  • Compiled a list of gene names from databases
    (e.g., LocusLink), GO, and the data from
    BioCreative Tasks 1A and 1B.
  • Filtering
  • Single-character entries (e.g., A, 1) and entries
    containing only digits or symbols and digits
    (e.g., 37, 3-1)
  • Entries containing only words that can be found
    in an English dictionary (CELEX), e.g., abnormal,
    brain tumor
  • 1,731,581 entries
  • Larger context
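A sketch of the filtering rules described above (CELEX is a licensed lexicon; the small ENGLISH_WORDS set below is an illustrative stand-in, not what Manning's group used):

```python
import re

# Stand-in for an English dictionary such as CELEX (assumption for illustration).
ENGLISH_WORDS = {"abnormal", "brain", "tumor"}

def keep_gazetteer_entry(entry: str) -> bool:
    """Drop single characters, digit/symbol-only entries, and plain-English entries."""
    if len(entry) == 1:                                # e.g., "A", "1"
        return False
    if re.fullmatch(r"[\d\W]+", entry):                # e.g., "37", "3-1"
        return False
    words = re.findall(r"[A-Za-z]+", entry)
    if words and all(w.lower() in ENGLISH_WORDS for w in words):   # e.g., "brain tumor"
        return False
    return True

for e in ["A", "37", "3-1", "brain tumor", "IL-2", "NF-kappaB"]:
    print(e, keep_gazetteer_entry(e))
```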

34
State-of-the-art approaches
  • Machine learning + Post-processing
  • Our method (BioKDD2004)
  • Maximum entropy
  • Post-processing
  • Boundary extension
  • Re-classification

35
Zhou et al. approach
  • HMM + SVM
  • Post-processing
  • Rule-based, used to resolve nested named entities.
  • Top 1 in the NLPBA task, F = 72.5

36
Manning et al.'s method
  • Machine learning
  • ME + Markov model
  • Local features
  • External resources and larger context
  • Post-processing
  • To correct gene boundaries (mainly for the
    BioCreative task)
  • Top 1 in BioCreative, F = 83.2
  • Top 2 in the NLPBA task, F = 70.1

37
Our Method Overview
[Flowchart]
Training phase: knowledge input is used to construct the boundary word lists and the dictionary; the training data are mapped to features (using the dictionary and boundary word lists) and fed to ME learning.
Testing phase: the testing data are tagged by the trained ME model and then post-processed (boundary extension followed by re-classification) to produce the final NEs.
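A very small sketch of the train-and-tag loop implied by this pipeline, using scikit-learn's LogisticRegression as a stand-in for the maximum-entropy learner (an assumption for illustration; the real feature set, dictionary, and boundary-word lookups are only hinted at, and the post-processing steps appear on later slides):

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression

def token_features(tokens, i):
    """Features for token i: the word itself plus a little local context."""
    feats = {"word": tokens[i].lower(),
             "is_upper": int(tokens[i].isupper()),
             "has_digit": int(any(c.isdigit() for c in tokens[i]))}
    feats["prev"] = tokens[i - 1].lower() if i > 0 else "<BOS>"
    feats["next"] = tokens[i + 1].lower() if i + 1 < len(tokens) else "<EOS>"
    return feats

# Toy training data: tokenized sentences with BIO labels (illustrative only).
sentences = [(["IL-2", "gene", "expression", "is", "regulated"],
              ["B-DNA", "I-DNA", "O", "O", "O"]),
             (["NF-kappaB", "binds", "DNA"],
              ["B-protein", "O", "O"])]

X = [token_features(toks, i) for toks, tags in sentences for i in range(len(toks))]
y = [tag for _, tags in sentences for tag in tags]

vec = DictVectorizer()
clf = LogisticRegression(max_iter=1000).fit(vec.fit_transform(X), y)

test = ["IL-2", "gene"]
print(clf.predict(vec.transform([token_features(test, i) for i in range(len(test))])))
```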
38
Experimental Results
ME-based NER (baseline)
NE identification (P/R/F): 0.56 / 0.589 / 0.574
NE recognition (P/R/F): 0.512 / 0.538 / 0.525
39
Post-Processing
  • Nested Named Entity
  • Example: CIITA mRNA
  • Nested annotation: <RNA><DNA>CIITA</DNA> mRNA</RNA>
  • ME sometimes recognizes only CIITA, as DNA
  • 16.57% of NEs in GENIA 3.02 contain one or more
    shorter NEs [Zhang, 2003]
  • Post-processing method
  • Boundary Extension
  • Re-classification

40
Boundary Extension (1)
  • Boundary extension for nested NEs
  • Extend the R-boundary repeatedly if the NE is
    followed by another NE, a head noun, or an
    R-boundary word with a valid POS tag.
  • Extend the left boundary repeatedly if the NE is
    preceded by an L-boundary word with a valid POS
    tag.
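A minimal sketch of the right-boundary rule above (the word list, head-noun set, and POS filter are illustrative stand-ins rather than the resources actually learned in the training phase; the "followed by another NE" case is omitted):

```python
# Illustrative stand-ins for resources built in the training phase.
R_BOUNDARY_WORDS = {"surface", "protein", "gene", "mRNA", "receptor"}
HEAD_NOUNS = {"protein", "gene", "receptor", "factor"}
VALID_POS = {"NN", "NNS", "NNP"}

def extend_right(tokens, pos_tags, ne_end):
    """Repeatedly absorb the next token while it is a head noun or an
    R-boundary word with a valid POS tag; return the new (exclusive) end index."""
    while ne_end < len(tokens):
        word, tag = tokens[ne_end], pos_tags[ne_end]
        if word in HEAD_NOUNS or (word in R_BOUNDARY_WORDS and tag in VALID_POS):
            ne_end += 1
        else:
            break
    return ne_end

tokens = ["ICAM-1", "surface", "protein", "is", "expressed"]
pos    = ["NN",     "NN",      "NN",      "VBZ", "VBN"]
# The ME tagger recognized only "ICAM-1" (end index 1); extension absorbs "surface protein".
print(tokens[0:extend_right(tokens, pos, 1)])   # ['ICAM-1', 'surface', 'protein']
```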

41
Example
  • ICAM-1 surface protein
  • ME result: ICAM-1/1U, surface/unknown,
    protein/unknown (1 = protein, U = single)
  • Boundary extension
  • "surface" is in the R-boundary word list and has a valid POS tag
  • Extension: ICAM-1 surface
  • "protein" is in the R-boundary word list and has a valid POS tag
  • Extension: ICAM-1 surface protein

42
Boundary extension (2)
  • Boundary extension for NEs containing brackets or
    slashes
  • Merge "NE ( NE )" followed by another NE, a head
    noun, or an R-boundary word with a valid POS tag
    into one NE
  • Merge "NE / NE" or "NE ( / NE )" followed by
    another NE, a head noun, or an R-boundary word
    with a valid POS tag into one NE
  • Example
  • granulocyte-macrophage colony-stimulating factor
    ( GM-CSF ) gene
  • ME result: granulocyte-macrophage
    colony-stimulating factor; GM-CSF
  • Extension: granulocyte-macrophage
    colony-stimulating factor ( GM-CSF ) gene
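A rough sketch of the "NE ( NE )" rule over token indices (an illustrative simplification; spans are half-open (start, end) indices on the token list, and the head-noun set is a stand-in):

```python
HEAD_NOUNS = {"gene", "protein", "mRNA"}

def merge_bracketed(tokens, spans):
    """Merge adjacent spans matching 'NE ( NE )' [+ head noun] into one span.
    spans: sorted list of (start, end) half-open token indices."""
    merged, i = [], 0
    while i < len(spans):
        s1, e1 = spans[i]
        if (i + 1 < len(spans) and tokens[e1:e1 + 1] == ["("]
                and spans[i + 1][0] == e1 + 1
                and tokens[spans[i + 1][1]:spans[i + 1][1] + 1] == [")"]):
            end = spans[i + 1][1] + 1                  # include the closing ")"
            if end < len(tokens) and tokens[end] in HEAD_NOUNS:
                end += 1                               # absorb a trailing head noun
            merged.append((s1, end))
            i += 2
        else:
            merged.append((s1, e1))
            i += 1
    return merged

tokens = "granulocyte-macrophage colony-stimulating factor ( GM-CSF ) gene".split()
spans = [(0, 3), (4, 5)]     # the ME tagger found the long form and "GM-CSF" separately
print(merge_bracketed(tokens, spans))   # [(0, 7)]
```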

43
Re-classification
  • Use dictionary lookup
  • Use R-boundary word
  • CIITA mRNA → RNA class
  • granulocyte-macrophage colony-stimulating factor
    ( GM-CSF ) gene → DNA class
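A minimal sketch of the two re-classification rules: first a dictionary lookup, then a fallback to the class suggested by the entity's final (R-boundary) word. The two lookup tables below are illustrative stand-ins, not the resources actually used.

```python
# Illustrative resources; the real system builds these during training.
NE_DICTIONARY = {"nf-kappab": "protein", "il-2": "protein"}
R_BOUNDARY_CLASS = {"mrna": "RNA", "gene": "DNA", "promoter": "DNA", "protein": "protein"}

def reclassify(ne_tokens, current_class):
    """Re-assign the class of a recognized NE via dictionary lookup, then the last word."""
    key = " ".join(ne_tokens).lower()
    if key in NE_DICTIONARY:                       # RC-1: dictionary lookup
        return NE_DICTIONARY[key]
    last = ne_tokens[-1].lower()
    if last in R_BOUNDARY_CLASS:                   # RC-2: R-boundary word
        return R_BOUNDARY_CLASS[last]
    return current_class

print(reclassify(["CIITA", "mRNA"], "DNA"))        # RNA
print(reclassify(["GM-CSF", "gene"], "protein"))   # DNA
```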

44
Experimental Results: NE Identification
Config     BE-1   BE-2   BE-3   NE Identification (P/R/F)
Baseline                        0.56 / 0.589 / 0.574
Conf1       ✓                   0.582 / 0.597 / 0.594
Conf2              ✓            0.591 / 0.6 / 0.595
Conf3                     ✓     0.757 / 0.746 / 0.751
Conf4       ✓      ✓      ✓     0.776 / 0.763 / 0.769
BE-1: boundary extension for nested NEs
BE-2: boundary extension for brackets and slashes
BE-3: with human name filter
45
Experimental Results: NE Recognition
Config     BE-1   BE-2   BE-3   RC-1   RC-2   NE Recognition (P/R/F)
Baseline                                      0.512 / 0.538 / 0.525
Conf4       ✓      ✓      ✓                   0.645 / 0.634 / 0.639
Conf5       ✓      ✓      ✓      ✓            0.67 / 0.658 / 0.664
Conf6       ✓      ✓      ✓             ✓     0.707 / 0.695 / 0.701
Conf7       ✓      ✓      ✓      ✓      ✓     0.727 / 0.715 / 0.721
RC-1: re-classification using dictionary lookup
RC-2: re-classification using R-boundary words
46
Experimental Results
GENIA v3.02 (10-fold CV). Recently, Zhou improved
the F-measure of his HMM model to 0.712 by
combining it with an SVM.
System Overall Protein DNA RNA
Our System 0.721 0.785 0.700 0.752
Zhou et al. (Bioinformatics, 2004) 0.666 0.758 0.633 0.612
47
Error Analysis
  • GENIA: inconsistent annotation
  • IL-2 gene expression
  • <DNA>IL-2 gene</DNA> expression
  • <othername><DNA>IL-2 gene</DNA>
    expression</othername>
  • Conjunction
  • Human and mouse gene
  • Boundary detection error (boundary not in
    boundary word file)
  • Squirrel, manic, bursal

48
Error Analysis
  • Abbreviation classification
  • Orthographical form fits into at least two
    classes.
  • Protein: SOS1, FLICE, GAG
  • Other Organic: CD336
  • False negative
  • A number of errors are due to low-frequency words
    or words not encountered in the training data.
  • False positive
  • Ellipsis
  • Many inflammatory cytokine genes including TNF,
    IL-1, and IL-6

49
Outline
  • NER (named entity recognition) in biomedical
    domain
  • Challenges in biomedical NER
  • Current methods and our method
  • State of progress in NER
  • Future works

50
Manning's conclusion (I): Key factor for low performance
  • Task difficulty does not appear to be the primary
    factor leading to low performance.
  • BioCreative: 1 class; BioNLP: 5 classes
  • Key factor: quality of the training and
    evaluation data
  • Higher inconsistency in the annotation of the
    BioNLP data.
  • Two of the authors independently reviewed 50 of
    the system's errors; 34-35 of them are attributed
    to annotation.
  • The authors do not think the annotation
    inconsistencies are due to biological subtleties.

51
Manning's Conclusion (II)
  • To improve biomedical annotation
  • BioNLP organizers emphasized that participants
    should focus on deep knowledge sources
  • coreference resolution and use of dependency
    relations over widely used lexical-level features
    (POS, morphological, orthographical, etc.)
  • Proper exploitation of external resources
  • In both tasks, external resources led to an
    improvement of only 1-2%.
  • Consistent annotation might have led to a 70%
    reduction in error rate.

52
Outline
  • NER (named entity recognition) in biomedical
    domain
  • Challenges in biomedical NER
  • State of progress in NER
  • Abbreviation disambiguation
  • Future works

53
Disambiguation of abbreviation
54
Motivation (I)
  • Named entity (NE) recognition (NER) is the first
    step of information extraction.
  • NER contains two steps
  • NE identification: extract named entities from text
  • NE classification: classify a given NE into a
    specific class.

55
Motivation (II)
  • Since many protein or gene names are long
    compound names, they are usually represented by
    abbreviations.
  • A2M: Alpha-2-macroglobulin
  • A4GALT: alpha 1,4-galactosyltransferase
  • EGFR: epidermal growth factor receptor, EGF
    receptor
  • NF-AT: nuclear factor of activated T cells
  • HTLV-I: Human T cell lymphotropic virus I
  • TCDD: 2,3,7,8-tetrachlorodibenzo-p-dioxin
  • GRE: glucocorticoid response element

56
Motivation (III)
  • Abbreviation identification task
  • It is easier than the classification task.
  • Abbreviations often have some orthographical
    clues.
  • All capital letters, alphabet-digit hybrids, etc.
  • Abbreviation classification task
  • In some situations, it is hard to disambiguate an
    abbreviation's class.
  • Example: only the abbreviation is mentioned,
    without the full name

57
Challenges of abbreviations
  • Two cases
  • Case 1: the sentence contains both the
    abbreviation and the full name
  • Human immunodeficiency virus type 2 (HIV-2), like
    HIV-1, causes AIDS and is associated with AIDS
    cases primarily in West Africa.
  • Case 2: the sentence contains only the
    abbreviation
  • HIV-1 and HIV-2 display significant differences
    in nucleic acid sequence and in the natural
    history of clinical disease.

58
Case 1
  • Case 1 is easier than Case 2
  • The classification can be solved by the following
    steps
  • Abbreviation-full name association
  • Disambiguate the full name's class
  • Assign the full name's class to the abbreviation
  • The challenge has shifted from abbreviation
    classification to abbreviation-full name
    association

59
Example of Case 1
  • Sentence
  • Human immunodeficiency virus type 2 (HIV-2), like
    HIV-1, causes AIDS and is associated with AIDS
    cases primarily in West Africa.
  • Step 1: Abbreviation-full name association
  • (Full name, Abbreviation) = (Human
    immunodeficiency virus type 2, HIV-2)
  • Step 2: Full name class assignment
  • Name: Human immunodeficiency virus type 2
  • Class: Virus
  • Step 3: Abbreviation class assignment
  • Abbreviation: HIV-2
  • Class: Virus

60
A solution method to Case 1
  • Schwartz and Hearst, PSB 2003.
  • Identify <long form, short form> pairs.
  • Both the long form and the short form occur in
    the same sentence.
  • "long form ( short form )" occurs more frequently
    than "short form ( long form )"

61
Algorithm: Identify long form ( short form )
  • Identify long form and short form candidates
    (using adjacency to parentheses).
  • Identify correct long form.
  • Starting from the end of both candidates, move
    right to left, trying to find the shortest long
    form that matches the short form.
  • Every character in the short form must match a
    character in the long form.
  • The matched characters in the long form must be
    in the same order as the characters in the short
    form.
  • <HSF, Heat shock transcription factor>
  • <TTF-1, Thyroid transcription factor 1>: fails
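The matching step described above can be written down compactly. A sketch following the pseudocode published by Schwartz and Hearst (2003): scan both strings right to left, require every alphanumeric character of the short form to appear in order in the long form, and require the short form's first character to start a word of the long form.

```python
def find_best_long_form(short_form: str, long_form: str):
    """Return the shortest word-aligned suffix of long_form matching short_form, or None."""
    s = len(short_form) - 1
    l = len(long_form) - 1
    while s >= 0:
        c = short_form[s].lower()
        if not c.isalnum():              # skip punctuation in the short form (e.g., the '-')
            s -= 1
            continue
        # Move left in the long form until this character is found; the short form's
        # first character must additionally sit at the start of a word.
        while l >= 0 and (long_form[l].lower() != c or
                          (s == 0 and l > 0 and long_form[l - 1].isalnum())):
            l -= 1
        if l < 0:
            return None
        s -= 1
        l -= 1
    return long_form[long_form.rfind(" ", 0, l + 1) + 1:]

print(find_best_long_form("HSF", "Heat shock transcription factor"))
# The TTF-1 case from the slide: the greedy right-to-left match stops too early and
# returns only "transcription factor 1", missing "Thyroid" -> the failure noted above.
print(find_best_long_form("TTF-1", "Thyroid transcription factor 1"))
```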

62
Error analysis
  • Unused characters, e.g., <CNS1, cyclophilin seven
    suppressor>
  • No recoverable pattern between the long form and
    the short form, e.g., <ATN, anterior thalamus>
  • Partial matching
  • The long form includes additional words to the
    left of the match, e.g., <Pol I, RNA polymerase I>
  • Out-of-order mapping
  • First character matches to the internal character
    (of the long form).
  • Non-continuous long form.
  • Transformation in the mapping (2D →
    two-dimensional)
  • Short form of only one character.

63
Other types of abbreviations
  • Schwartz and Hearst's algorithm only considers
    candidates in parentheses.
  • Challenge: finding all possible pairs is a more
    difficult problem.

64
Example of Case 2
  • It is hard to disambiguate an abbreviation's
    class, even with context information.
  • Example
  • HIV-1 and HIV-2 display significant differences
    in nucleic acid sequence and in the natural
    history of clinical disease.
  • HIV-1 and HIV-2 are both viruses, but if we
    replace HIV-1 and HIV-2 with IL-2 and IL-10, the
    sentence still makes sense.
  • IL-2 and IL-10 display significant differences in
    nucleic acid sequence and in the natural history
    of clinical disease.
  • IL-2 and IL-10: gene names

65
Case 2
  • Left for future work.
  • Clues
  • Statistical methods
  • Dictionary-based methods

66
Outline
  • NER (named entity recognition) in biomedical
    domain
  • Challenges in biomedical NER
  • State of progress in NER
  • Abbreviation disambiguation
  • Future works

67
What's next after NER is solved?
  • Named entity relation recognition (NERR)
  • Protein-Protein interaction/binding/inhibition
  • Protein-Small Molecules
  • Gene-Gene regulation
  • Gene-Gene Product interaction
  • Gene-Drug relation
  • Protein-Subcellular location
  • Amino Acid-Protein relation

68
Identify Relations among Named Entities
  • Target: Extract relations between various
    biological named entities.

Here we demonstrate that the c-myb proto-oncogene
product, which is itself a DNA-binding protein,
and transcriptional transactivator, can interact
synergistically with Z.
Relation (Subject, Action, Object): (c-myb
proto-oncogene product, interact, Z)
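As a very rough illustration of this target output, a trigger-word sketch that emits (subject, action, object) triples when two recognized NEs flank an interaction verb in the same sentence (a deliberately naive baseline, not the method proposed here; as the later slides note, real NERR needs parsing and anaphora resolution):

```python
import re

TRIGGERS = {"interact", "interacts", "bind", "binds", "inhibit", "inhibits",
            "regulate", "regulates"}

def extract_relations(sentence: str, entities: list[str]):
    """Emit (subject, action, object) for entity pairs separated by a trigger verb."""
    triples = []
    for i, subj in enumerate(entities):
        for obj in entities[i + 1:]:
            try:
                between = sentence[sentence.index(subj) + len(subj):sentence.index(obj)]
            except ValueError:          # entity string not found in the sentence
                continue
            for word in re.findall(r"[a-z]+", between.lower()):
                if word in TRIGGERS:
                    triples.append((subj, word, obj))
                    break
    return triples

sent = ("the c-myb proto-oncogene product, which is itself a DNA-binding protein, "
        "and transcriptional transactivator, can interact synergistically with Z")
print(extract_relations(sent, ["c-myb proto-oncogene product", "Z"]))
# [('c-myb proto-oncogene product', 'interact', 'Z')]
```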
69
Future works
  • Few papers have been published on the following
    specific challenging topics of NER.
  • Automated corpus correction
  • Disambiguation of abbreviations (Schwartz and
    Hearst, 2003)
  • Conjunction
  • NERR (difficult)
  • parser
  • Pronoun and anaphora resolution

70
Acknowledgements
  • Bioinformatics: Yi-Feng Lin, Wen-Chi Chou
  • NLP: Tzong-Han Tsai, Cheng-Wei Lee
  • Postdoc: Kuen-Pin Wu
  • Colleague: Wen-Lian Hsu (Fu Chang is jumping on
    the bandwagon now.)
    the bandwagon now.)

71
Lab Introduction
72
Research topics
  • Protein structure prediction
  • Secondary structure prediction
  • Tertiary structure prediction: local structure
  • Members: Hsin-Nan Lin, Caster Chen, Jia-Ming
    Chang
    Chang
  • Protein structure determination based on NMR data
  • Backbone assignment
  • Side chain assignment
  • RDC
  • Jia-Ming Chang, Caster Chen, Philip Chen
  • Collaborator: Prof. T. H. Huang, IBMS

73
Research topics
  • Mass spectrometry based proteomics
  • Protein quantification
  • Protein identification for modification study
  • Yi-Hwa Yian, Wen-Ting Lin, Jacky Chou, Wei-Nung
    Hung
  • Collaborator: Prof. Y. R. Chen, Institute of Chemistry
  • Biological literature mining
  • NER, NERR
  • Yi-Feng Lin, Jacky Chou, Richard Tsai

74
Faculty
  • PI: Wen-Lian Hsu, Ting-Yi Sung
  • Post-doc: Kuen-Pin Wu