10_Question_Answering
Card Set Details
Cards | 13 |
---|---|
Language | English |
Category | Computer Science |
Level | University |
Created / Updated | 07.02.2018 / 09.02.2018 |
Licensing | Not specified |
Weblink | https://card2brain.ch/box/20180207_10questionanswering |
Question Answering vs. Information Retrieval
- INPUT:
  - Natural-language questions, not keyword-based queries:
    - QA:
      - How long do polar bears live?
    - IR:
      - polar bears life span
- OUTPUT:
  - Precise and concise answers, not whole documents:
    - QA:
      - In the wild, polar bears live an average of 15 to 18 years, although biologists have tagged a few bears in their early 30s. In captivity, they may live until their mid- to late 30s. One zoo bear in London lived to be 41.
    - IR:
      - www.gotpetsonline.com/polar-bear/bear-habitat-polar/polar-bear-life-span.html
      - www.starbus.com/polarbear/aboutpb.htm
      - www.polarbearsinternational.org/faq
Question Processing (including Question Type)
- Goal: extract clues from the question
  - Question type
    - Factual answers → Factoid questions
      - How long do polar bears live?
    - Definitional answers → Definition questions
      - Who is Britney Spears?
    - Opinionated answers → Opinion questions
      - What do you think of Britney Spears' last album?
    - Different taxonomies of question types exist, e.g. Example, Comparison, Quantification etc.
      - Li & Roth: two-layered taxonomy with 6 coarse and 50 fine classes
        - Abbreviation:expression, Entity:animal, Description:def, Human:individual, Location:country, Numeric:date
    - More difficult types such as How and Why require complex answers
      - TREC-QA: main question types
        - Factoid
        - List
        - Definition
        - Other
  - Expected answer type(s)
    - The answer type is the semantic category of the expected answer
      - Country, author etc.
  - Named Entities
  - Interesting terms used to query the search engine
  - Focus
  - Topic
  - Prediction of the question difficulty
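As a rough illustration of the clues listed above, the sketch below bundles a few of them into one record. The `ProcessedQuestion` class, the `ANSWER_TYPE_BY_WH` table, and the tiny stop-word set are assumptions made for this example and are not taken from any particular QA system. The answer-type labels borrow only the coarse Li & Roth class names.

```python
from dataclasses import dataclass, field

# Hypothetical mapping from question word to a coarse expected answer type.
ANSWER_TYPE_BY_WH = {
    "who": "Human",
    "where": "Location",
    "when": "Numeric",
}

# Tiny illustrative stop-word list; a real system would use a larger one.
STOP_WORDS = {"is", "was", "are", "were", "the", "a", "an", "of", "do", "does", "did", "in"}


@dataclass
class ProcessedQuestion:
    text: str
    question_word: str
    answer_type: str                      # "unknown" when no rule applies
    keywords: list = field(default_factory=list)


def process_question(text: str) -> ProcessedQuestion:
    tokens = text.rstrip("?").split()
    question_word = tokens[0].lower() if tokens else ""
    answer_type = ANSWER_TYPE_BY_WH.get(question_word, "unknown")
    # Keep content words as the terms later sent to the search engine.
    keywords = [t for t in tokens[1:] if t.lower() not in STOP_WORDS]
    return ProcessedQuestion(text, question_word, answer_type, keywords)
```

Named entities, focus, topic, and difficulty prediction are left out here because they need linguistic tooling beyond a few lines of standard-library code.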
Question Classification
- Rule-based
  - Biographical questions
    - Who {is | was | are | were} <person name(s)>?
  - Definition questions etc.
  - Pros & Cons
    - Very powerful
    - Cumbersome to create
    - Do not generalise well
- Machine Learning
  - Trained on hand-labeled questions, such as the corpus of Li & Roth
  - Question features
    - Tokens
    - Named Entities
    - POS tags
    - Chunks
    - N-grams
    - Question word
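The two approaches above can be sketched in a few lines. This is a minimal example under stated assumptions: the regular expression approximates a person name as a run of capitalised words, and `classify_rule_based` and `extract_features` are hypothetical helper names. Only the surface features (tokens, n-grams, question word) are computed, since named entities, POS tags, and chunks would require an NLP toolkit.

```python
import re

# Rule for biographical questions: Who {is | was | are | were} <person name(s)>?
# Assumption: a person name is approximated as one or more capitalised words.
BIOGRAPHICAL = re.compile(r"^Who (is|was|are|were) ((?:[A-Z][\w.'-]*\s?)+)\?$")


def classify_rule_based(question: str) -> str:
    """Return 'biographical' if the hand-written rule fires, else 'unknown'."""
    return "biographical" if BIOGRAPHICAL.match(question) else "unknown"


def extract_features(question: str, n: int = 2) -> dict:
    """Surface features that could feed an ML classifier."""
    tokens = re.findall(r"\w+", question.lower())
    ngrams = [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return {
        "tokens": tokens,
        "ngrams": ngrams,
        "question_word": tokens[0] if tokens else "",
    }
```

The rule fires on "Who is Britney Spears?" but not on "Who is the president?", which illustrates why hand-written rules do not generalise well.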
Document Retrieval
- Identify the N most relevant documents given an input question
- For this, the question has to be reformulated as a query:
  - Removal of stop words
  - Stemming or lemmatization
  - Query expansion
  - Apply query reformulation rules
  - …
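The first three reformulation steps might look roughly like this. It is a sketch only: the stop-word set, the `naive_stem` suffix stripper, and the `EXPANSIONS` table are illustrative stand-ins (assumptions) for a real stop-word list, a real stemmer or lemmatizer, and a thesaurus lookup.

```python
import re

# Illustrative stop-word list (assumption); real systems use larger lists.
STOP_WORDS = {"how", "do", "does", "did", "is", "are", "was", "were",
              "the", "a", "an", "of", "in", "what", "where", "who"}

# Hypothetical expansion table standing in for a thesaurus lookup.
EXPANSIONS = {"live": ["life span", "lifespan"]}


def naive_stem(token: str) -> str:
    """Very crude suffix stripping, a stand-in for a real stemmer."""
    for suffix in ("ing", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def question_to_query(question: str) -> list:
    tokens = re.findall(r"[a-z0-9]+", question.lower())
    content = [t for t in tokens if t not in STOP_WORDS]   # stop-word removal
    stems = [naive_stem(t) for t in content]                # stemming
    expanded = list(stems)
    for stem in stems:                                      # query expansion
        expanded.extend(EXPANSIONS.get(stem, []))
    return expanded
```

For "How long do polar bears live?" this yields the terms `long`, `polar`, `bear`, `live`, plus the two hypothetical expansions.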
Passage Retrieval and Scoring
- Aim: return the N most relevant passages from the top-ranked documents
- Passages: sentences, paragraphs, sections / topical segments
- Passages are ranked based on:
  - the number of Named Entities of the right type
  - the number of question keywords in the passage
  - the rank of the document from which the passage was extracted
  - …
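One simple way to combine the three signals above is a weighted sum. The sketch below assumes that each passage already carries the types of its named entities and the rank of its source document. The `Passage` class, the equal default weights, and the reciprocal-rank bonus are choices made for this example, not a prescribed scoring formula.

```python
import re
from dataclasses import dataclass


@dataclass
class Passage:
    text: str
    doc_rank: int          # 1 = best-ranked source document
    entity_types: list     # types of the named entities found in the passage


def score_passage(passage, keywords, answer_type, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three ranking signals; the weights are hypothetical."""
    w_ne, w_kw, w_rank = weights
    words = set(re.findall(r"\w+", passage.text.lower()))
    n_entities = sum(1 for t in passage.entity_types if t == answer_type)
    n_keywords = sum(1 for k in keywords if k.lower() in words)
    rank_bonus = 1.0 / passage.doc_rank   # higher-ranked documents score more
    return w_ne * n_entities + w_kw * n_keywords + w_rank * rank_bonus


def top_passages(passages, keywords, answer_type, n=5):
    """Return the n best-scoring passages, best first."""
    ranked = sorted(passages,
                    key=lambda p: score_passage(p, keywords, answer_type),
                    reverse=True)
    return ranked[:n]
```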
Answer Identification
- Find the best answer to the question
- Two types of methods:
  - Pattern extraction, using regular expression patterns corresponding to the expected answer type
  - Redundancy-based approach
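Both methods can be combined in a small sketch: a regular expression tied to the expected answer type proposes candidates, and the candidate that occurs most often across passages wins. The `ANSWER_PATTERNS` table and the function names are assumptions for illustration, and the single pattern shown only covers four-digit years.

```python
import re
from collections import Counter

# Hypothetical patterns keyed by expected answer type.
ANSWER_PATTERNS = {
    "Numeric:date": re.compile(r"\b(?:1[0-9]{3}|20[0-9]{2})\b"),  # four-digit years only
}


def extract_candidates(passages, answer_type):
    """Pattern extraction: collect every match of the answer-type pattern."""
    pattern = ANSWER_PATTERNS[answer_type]
    candidates = []
    for passage in passages:
        candidates.extend(m.group(0) for m in pattern.finditer(passage))
    return candidates


def most_redundant_answer(candidates):
    """Redundancy-based choice: the most frequent candidate, or None."""
    if not candidates:
        return None
    return Counter(candidates).most_common(1)[0][0]
```

With passages that mention a birth year several times and other years less often, the redundancy step selects the year that is repeated most.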
Query reformulation
- Aim: automatically generate answer paraphrases for a given question
- Formulate multiple queries for each question and retrieve the 100 best matching pages for each
  - Rewrite rules are simple string-based manipulations
    - Question: “Where is the Louvre located?”
    - Rewritten queries:
      - “+the Louvre +is located”
      - “+the Louvre +is +in”
      - “+the Louvre +is near”
  - Use of a search engine (Google) to find answers on the Web
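The Louvre example can be reproduced with a single string-based rule. This is a minimal sketch, assuming the rule only targets questions of the form "Where is X located?". The `WHERE_IS` pattern and the `rewrite` function are hypothetical names, and sending the queries to a search engine is not shown.

```python
import re

# Rule for questions of the form "Where is <X> located?" (assumed template).
WHERE_IS = re.compile(r"^Where is (?P<subject>.+?) located\?$", re.IGNORECASE)


def rewrite(question: str) -> list:
    """Turn a matching question into answer-like query strings."""
    match = WHERE_IS.match(question)
    if match is None:
        return []          # no rule applies to this question
    subject = match.group("subject")
    return [
        f"+{subject} +is located",
        f"+{subject} +is +in",
        f"+{subject} +is near",
    ]
```

Calling `rewrite("Where is the Louvre located?")` returns the three rewritten queries listed above.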