CV Chapter 6 Categorization 2
Questions about the lecture 'Computer Vision' of the RWTH Aachen Chapter 6 Categorization 2
Deck details

| Cards | 50 |
|---|---|
| Language | English |
| Category | Computer science |
| Level | University |
| Created / Updated | 04.02.2017 / 21.02.2017 |
| Weblink | https://card2brain.ch/box/20170204_cv_chapter_6_categorization_2 |
Name three object detectors based on convolutional neural network (CNN)? [3]
1. R-CNN 2. Fast R-CNN and 3. Faster R-CNN // R stands for regions
What are the steps of R-CNN? [3]
1. Extract ~2k region proposals from input image // Selective search
2. Compute CNN features out of warped region with pre-trained/fine-tuned network // AlexNet, VGGNet
3. Classify regions
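The three steps above can be sketched as a minimal numpy pipeline. All components here are stand-ins, not the real algorithms: random boxes replace selective search, a flatten replaces the AlexNet/VGGNet features, and a linear score replaces the trained SVMs.

```python
import numpy as np

def propose_regions(image, n=5, seed=0):
    # Stand-in for selective search: sample n random boxes (x, y, w, h).
    rng = np.random.default_rng(seed)
    H, W = image.shape[:2]
    boxes = []
    for _ in range(n):
        x, y = rng.integers(0, W // 2), rng.integers(0, H // 2)
        w, h = rng.integers(8, W - x), rng.integers(8, H - y)
        boxes.append((int(x), int(y), int(w), int(h)))
    return boxes

def warp(image, box, size=16):
    # Nearest-neighbour resize of the crop to a fixed input size,
    # mimicking the warping of each region before the CNN.
    x, y, w, h = box
    crop = image[y:y + h, x:x + w]
    ys = np.arange(size) * h // size
    xs = np.arange(size) * w // size
    return crop[ys][:, xs]

def cnn_features(patch):
    # Stand-in for fc7 features: just flatten the warped patch.
    return patch.astype(float).ravel()

def classify(feat, weights):
    # One linear score per class, argmax wins (SVM stand-in).
    return int(np.argmax(weights @ feat))

image = np.arange(64 * 64, dtype=float).reshape(64, 64)
boxes = propose_regions(image)
weights = np.ones((3, 16 * 16))  # 3 hypothetical classes
labels = [classify(cnn_features(warp(image, b)), weights) for b in boxes]
```

With identical weights for every class, all regions trivially receive class 0; the point is only the shape of the pipeline: proposals → warp → features → linear classification.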
How are regions classified for R-CNN? [2]
1. Linear SVMs and 2. bounding box regressors (Bbox reg)
How does the linear SVM in R-CNN work? [2]
1. f_c(x_fc7) = w_c^T · x_fc7
2. With x_fc7 the features from fully-connected layer 7 and c the object class
How does the bbox reg of R-CNN work? [2]
1. Predict a corrected 2D box, since the proposal region may be off
2. Learn weights that predict the corrected x*, y*, w* and h*
How are the weights of the bbox reg of R-CNN computed? [5]
1. w_c,x^T · x_pool5 = (x* − x) / w
2. w_c,y^T · x_pool5 = (y* − y) / h
3. w_c,w^T · x_pool5 = ln(w* / w)
4. w_c,h^T · x_pool5 = ln(h* / h)
5. With x_pool5 the features from the pool5 layer
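The four regression targets above (normalised translations, log-space scales) can be computed directly. A minimal sketch, assuming (x, y, w, h) describe the proposal box and the starred values the ground-truth box:

```python
import numpy as np

def bbox_targets(proposal, gt):
    """Regression targets (t_x, t_y, t_w, t_h) that the pool5-based
    regressor is trained to output: translations normalised by the
    proposal size, scale changes in log space."""
    x, y, w, h = proposal
    xs, ys, ws, hs = gt  # the starred ground-truth box
    return np.array([(xs - x) / w,
                     (ys - y) / h,
                     np.log(ws / w),
                     np.log(hs / h)])
```

For example, a proposal (10, 10, 20, 20) against a ground truth (12, 14, 20, 40) yields targets (0.1, 0.2, 0, ln 2): small shifts and a doubling of the height.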
What are the problems of R-CNN? [5]
1. Fine tune network with softmax classifier (log loss)
2. Train post-hoc linear SVMs (hinge loss)
3. Train post-hoc bounding box regressors (squared loss)
4. Slow training (~3 days) and slow detection (47 s per image)
5. High memory consumption (Easily 200GB)
What are the techniques of fast R-CNN? [2]
1. Forward pass with RoI pooling over shared convolutional features
2. Backward pass through the RoI pooling layer (end-to-end training)
What are the characteristics of region proposal networks (RPN) for CNN? [4]
1. Remove dependence on external region proposal algorithm
2. Get region proposal from same CNN
3. Use feature sharing
4. Single pass object detection becomes possible
What is the definition of faster R-CNN?
Fast R-CNN + RPN
What are the losses of proposals and RoI pooling for faster R-CNN? [3]
1. Four losses in total
2. Classification loss (for the RPN and for the final RoI classifier)
3. Bounding box regression loss (likewise for both stages)
What is the definition of fully convolutional networks (FCNs)? [4]
1. All operations formulated as convolutions
2. Can process arbitrarily sized images
3. Forward pass performs inference
4. Backward pass performs learning
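"All operations formulated as convolutions" includes the fully-connected layers: an FC layer is equivalent to a convolution whose kernel covers the entire input feature map. A minimal numpy check (using cross-correlation at the single valid position, as CNN "convolutions" do):

```python
import numpy as np

rng = np.random.default_rng(0)
feat = rng.standard_normal((6, 6))   # a 6x6 feature map
wfc = rng.standard_normal(36)        # fully-connected weights

# Fully-connected layer: flatten the map and take a dot product.
fc_out = wfc @ feat.ravel()

# Same operation as a convolution: one 6x6 kernel applied at the
# single valid position of the 6x6 input.
kernel = wfc.reshape(6, 6)
conv_out = np.sum(feat * kernel)

assert np.isclose(fc_out, conv_out)
```

Because the "FC" layer is now a convolution, the same network can slide over arbitrarily sized inputs and produce a spatial map of outputs instead of a single vector.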
How does semantic image segmentation work with FCNs? [3]
1. Perform pixel-wise prediction
2. Sliding-window classification producing heatmap of scores
3. Avoid low resolution with up-sampling and skip connections
How does human pose estimation work with FCNs? [2]
1. Choose key-points with target disk (r) for skeleton joints
2. Each disk has ground-truth label of 1
How does face verification with an embedding space work with FCNs? [3]
1. Use triplet network with negatives, anchors and positives
2. Learn function grouping positives closer to anchors
3. Vector arithmetic is possible due to linear regularities
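The triplet objective in cards 1–2 can be written as a hinge on embedding distances; a minimal sketch (the margin value 0.2 is an assumption, not taken from the lecture):

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    # Pull the positive closer to the anchor than the negative,
    # by at least `margin` (squared Euclidean distances).
    d_pos = np.sum((anchor - positive) ** 2)
    d_neg = np.sum((anchor - negative) ** 2)
    return max(0.0, d_pos - d_neg + margin)

a = np.array([0.0, 0.0])   # anchor embedding
p = np.array([0.1, 0.0])   # positive: same identity, close
n = np.array([1.0, 0.0])   # negative: different identity, far
```

Here `triplet_loss(a, p, n)` is zero because the constraint is already satisfied; swapping positive and negative produces a positive loss, which is the gradient signal that groups positives closer to their anchors.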
What is the difference of categorization to local feature matching?
Objects of a category no longer have exact correspondences, only local ones
Name models for object categorization 2? [3]
1. Part-based models
2. Implicit shape models (ISM)
3. Deformable part-based model
What is the idea of part-based models for classification 2? [2]
1. Parts are 2D image fragments
2. Structure is configuration of parts
Name connectivity structures for part-based models for categorization 2? [7]
1. Bag of visual words with O(N)
2. Constellation with O(N^k)
3. Star shape with O(N²)
4. Tree with O(N²)
5. k-fan with O(N³)
6. Hierarchy
7. Sparse flexible model
What is the idea of implicit shape models (ISM) for categorization 2? [4]
1. Learn appearance codebook and star topology structural model
2. Features are considered independent given object center
3. Use visual vocabulary with displacement vectors to index votes
4. Robust to clutter, occlusion, noise and low contrast
What changed for the probabilistic generalized Hough transform? [5]
1. Exact correspondence → probabilistic match
2. NN matching → soft matching
3. Feature location on object → part location distribution
4. Uniform votes → probabilistic vote weighting
5. Quantized Hough array → continuous Hough space
How does recognition work for implicit shape models (ISM) for categorization 2? [3]
1. Extract interest points and image features f
2. Compare them with codebook entries and cast probabilistically weighted votes
3. Locate the object position and back-project the hypothesis
How does segmentation work for implicit shape models (ISM) for categorization 2? [2]
1. Find pixel contributions with meta information from hypothesis
2. Perform segmentation
What is the definition of the scale invariant votes? [3]
1. x_vote = x_img – x_occ*(s_img/s_occ)
2. y_vote = y_img – y_occ*(s_img/s_occ)
3. s_vote = (s_img/s_occ)
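The three vote equations above translate directly into code; a minimal sketch:

```python
def cast_vote(x_img, y_img, s_img, x_occ, y_occ, s_occ):
    """Scale-invariant Hough vote: the stored occurrence displacement
    (x_occ, y_occ), learned at scale s_occ, is rescaled to the scale
    s_img of the observed feature before voting for the object center."""
    scale = s_img / s_occ
    return (x_img - x_occ * scale,
            y_img - y_occ * scale,
            scale)
```

For example, a feature observed at (100, 50) at scale 2, matched to an occurrence with displacement (10, 5) learned at scale 1, votes for the center (80, 40) at scale 2.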
What is the idea of the deformable part-based model for categorization 2?
Each component has global template plus deformable parts // Bike
What is the definition of the deformable part-based model for categorization 2? [3]
1. Use HOG sliding-window detector
2. Score is dot product of filter and vectors in window specified by p
3. Score of an object hypothesis is the sum of filter scores minus deformation costs: s(p_0, …, p_n) = Sum_i F_i · phi(H, p_i) − Sum_i d_i · (dx_i², dy_i²)
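The hypothesis score above can be evaluated once the filter responses F_i · phi(H, p_i) are given; a minimal sketch with precomputed filter scores standing in for the HOG dot products:

```python
import numpy as np

def dpm_score(filter_scores, deform_weights, displacements):
    """Score of an object hypothesis: sum of part filter responses
    minus the quadratic deformation costs d_i . (dx_i^2, dy_i^2)."""
    appearance = sum(filter_scores)  # Sum_i F_i . phi(H, p_i)
    deformation = sum(d @ np.array([dx * dx, dy * dy])
                      for d, (dx, dy) in zip(deform_weights, displacements))
    return appearance - deformation
```

With two parts scoring 2.0 and 1.5, deformation weights (0.1, 0.1) and (0.2, 0.2), and displacements (1, 0) and (0, 2) from the parts' anchor positions, the hypothesis scores 3.5 − (0.1 + 0.8) = 2.6: good appearance evidence, mildly penalised for parts that moved.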
What is used for image classification for categorization 2?
Bag-of-words model
What is the difference of traditional recognition approaches compared to deep learning? [4]
1. Traditional recognition approaches use hand-designed feature extraction
2. Open question there: build better classifiers or more features?
3. Deep learning instead learns features from the pixel level and feeds them into a simple classifier
4. Inspired by biological neurons
What are the characteristics of perceptrons for deep learning? [3]
1. Multiple inputs x_1,… ,x_d
2. Multiple weights w_1,… ,w_d
3. Single output sigma(w*x+b) with sigma(t)=1/(1+exp(-t))
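The perceptron output above is a one-liner; a minimal sketch using the logistic sigmoid from the card:

```python
import math

def perceptron(x, w, b):
    """Single perceptron output sigma(w . x + b), with the logistic
    sigmoid sigma(t) = 1 / (1 + exp(-t))."""
    t = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-t))
```

For a zero pre-activation the output is exactly 0.5, and large positive pre-activations saturate towards 1, which is what lets the unit act as a soft threshold.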
How does the layer structure of multi-layer neural networks look for deep learning?
Input, hidden and output layer