CV Chapter 6 Categorization 2
Questions about the lecture 'Computer Vision' of the RWTH Aachen Chapter 6 Categorization 2
Set of flashcards Details
Flashcards | 50 |
---|---|
Language | English |
Category | Computer Science |
Level | University |
Created / Updated | 04.02.2017 / 21.02.2017 |
Weblink | https://card2brain.ch/box/20170204_cv_chapter_6_categorization_2 |
How does categorization differ from local feature matching?
Objects to be recognized no longer have exact correspondences, only local ones
Name models for object categorization 2? [3]
1. Part-based models
2. Implicit shape models (ISM)
3. Deformable part-based model
What is the idea of part-based models for categorization 2? [2]
1. Parts are 2D image fragments
2. Structure is configuration of parts
Name connectivity structures for part-based models for categorization 2? [7]
1. Bag of visual words with O(N)
2. Constellation with O(N^k)
3. Star shape with O(N²)
4. Tree with O(N²)
5. k-fan with O(N³)
6. Hierarchy
7. Sparse flexible model
What is the idea of implicit shape models (ISM) for categorization 2? [4]
1. Learn appearance codebook and star topology structural model
2. Features are considered independent given object center
3. Use visual vocabulary with displacement vectors to index votes
4. Robust to clutter, occlusion, noise and low contrast
What changed for the probabilistic generalized Hough transform? [5]
1. Exact correspondence → probabilistic match
2. NN matching → soft matching
3. Feature location on object → part location distribution
4. Uniform votes → probabilistic vote weighting
5. Quantized Hough array → continuous Hough space
How does recognition work for implicit shape models (ISM) for categorization 2? [3]
1. Extract interest points and their image features f
2. Match them against the codebook entries, yielding probabilistic vote weights
3. Locate the object position and back-project the hypothesis
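The voting step above can be sketched as follows. This is a minimal illustration, not the lecture's implementation: the names `cast_votes` and `best_center`, the toy codebook and the match probabilities are all made up, scale is ignored, and a simple fixed-radius support count stands in for a proper mode search in the continuous voting space.

```python
def cast_votes(features, codebook):
    """Cast weighted votes for the object center.

    features: list of (x, y, {codebook_entry: match_probability})
    codebook: entry -> list of (x_occ, y_occ, occurrence_probability),
              where (x_occ, y_occ) is the feature position relative to the center.
    """
    votes = []
    for x, y, matches in features:
        for entry, p_match in matches.items():          # soft matching
            for x_occ, y_occ, p_occ in codebook[entry]:
                # vote weight = match probability * occurrence probability
                votes.append((x - x_occ, y - y_occ, p_match * p_occ))
    return votes


def best_center(votes, radius):
    """Return the vote location with the largest total weight within `radius`."""
    def support(center):
        cx, cy = center
        return sum(w for x, y, w in votes
                   if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2)
    return max(((x, y) for x, y, _ in votes), key=support)


# Hypothetical toy codebook and detections
codebook = {
    "wheel": [(-30.0, 10.0, 0.5), (30.0, 10.0, 0.5)],
    "handlebar": [(0.0, -20.0, 1.0)],
}
features = [
    (70.0, 60.0, {"wheel": 1.0}),
    (130.0, 60.0, {"wheel": 1.0}),
    (100.0, 30.0, {"handlebar": 0.8}),
]
votes = cast_votes(features, codebook)
center = best_center(votes, radius=5.0)
print(center)
```

Each wheel feature is ambiguous on its own (it votes for two centers), but only the votes at (100, 50) agree across all three features.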
How does segmentation work for implicit shape models (ISM) for categorization 2? [2]
1. Find pixel contributions with meta information from hypothesis
2. Perform segmentation
What is the definition of the scale invariant votes? [3]
1. x_vote = x_img – x_occ*(s_img/s_occ)
2. y_vote = y_img – y_occ*(s_img/s_occ)
3. s_vote = (s_img/s_occ)
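The three vote equations above translate directly into code. A minimal sketch, assuming (x_occ, y_occ, s_occ) is a stored occurrence from training and (x_img, y_img, s_img) is the matched feature in the test image; the function name `scale_invariant_vote` is hypothetical.

```python
def scale_invariant_vote(x_img, y_img, s_img, x_occ, y_occ, s_occ):
    """Return the (x, y, scale) vote cast by one matched feature."""
    ratio = s_img / s_occ            # test-image scale relative to the stored occurrence
    x_vote = x_img - x_occ * ratio   # stored offset is rescaled before voting
    y_vote = y_img - y_occ * ratio
    s_vote = ratio
    return x_vote, y_vote, s_vote


vote = scale_invariant_vote(x_img=120.0, y_img=80.0, s_img=4.0,
                            x_occ=10.0, y_occ=-5.0, s_occ=2.0)
print(vote)
```

Here the feature appears at twice the training scale, so the stored offset is doubled before it is subtracted from the image position.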
What is the idea of the deformable part-based model (DPM) for categorization 2?
Each component has a global template plus deformable parts // e.g. bike
What is the definition of the deformable part-based model (DPM) for categorization 2? [3]
1. Use HOG sliding-window detector
2. The score of a filter is the dot product of the filter and the feature vectors in the window specified by p
3. The score of an object hypothesis is the sum of the filter scores minus the deformation costs: s(p_0,…,p_n) = Sum_{i=0..n} F_i*phi(H,p_i) - Sum_{i=1..n} d_i*(dx_i²,dy_i²)
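The hypothesis score above can be sketched numerically. This is an illustration under stated assumptions: filters and features are given as flat lists (index 0 is the root filter), each d_i is a pair of weights for (dx_i², dy_i²), and the function name `hypothesis_score` and all numbers are made up.

```python
def dot(u, v):
    """Dot product of two equally long lists."""
    return sum(a * b for a, b in zip(u, v))


def hypothesis_score(filters, features, deform_weights, displacements):
    """Sum of filter responses minus quadratic deformation costs.

    filters[i], features[i]: F_i and phi(H, p_i) for i = 0..n (0 is the root).
    deform_weights[k], displacements[k]: d_i and (dx_i, dy_i) for parts i = 1..n.
    """
    appearance = sum(dot(f, phi) for f, phi in zip(filters, features))
    deformation = sum(d[0] * dx ** 2 + d[1] * dy ** 2
                      for d, (dx, dy) in zip(deform_weights, displacements))
    return appearance - deformation


score = hypothesis_score(
    filters=[[1.0, 0.5], [0.2, 0.1]],       # root filter and one part filter
    features=[[2.0, 4.0], [5.0, 10.0]],     # features at p_0 and p_1
    deform_weights=[(0.1, 0.2)],            # d_1
    displacements=[(1.0, 2.0)],             # (dx_1, dy_1)
)
print(score)
```

The appearance terms contribute 4 + 2 = 6 and the displaced part costs 0.1·1² + 0.2·2² = 0.9.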
What is used for image classification for categorization 2?
Bag-of-words model
What is the difference of traditional recognition approaches compared to deep learning? [4]
1. The traditional recognition approach uses hand-designed feature extraction
2. Its open question: build better classifiers or more features?
3. Deep learning instead learns the features starting from the pixel level and feeds them to a simple classifier
4. Inspired by biological neurons
What are the characteristics of perceptrons for deep learning? [3]
1. Multiple inputs x_1,… ,x_d
2. Multiple weights w_1,… ,w_d
3. Single output sigma(w*x+b) with sigma(t)=1/(1+exp(-t))
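The perceptron output above is a weighted sum passed through the sigmoid. A minimal sketch; the function names and the example numbers are illustrative only.

```python
import math


def sigmoid(t):
    """Logistic function sigma(t) = 1 / (1 + exp(-t))."""
    return 1.0 / (1.0 + math.exp(-t))


def perceptron(x, w, b):
    """Single output sigma(w*x + b) for inputs x_1..x_d and weights w_1..w_d."""
    activation = sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
    return sigmoid(activation)


y = perceptron(x=[1.0, 2.0], w=[0.5, -0.25], b=0.0)
print(y)
```

In this example the weighted sum is 0.5 - 0.5 = 0, so the unit sits exactly at the sigmoid's midpoint.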
What does the layer structure of multi-layer neural networks look like for deep learning?
Input, hidden and output layers
What is the definition of multi-layer neural networks for deep learning? [3]
1. Find weights W minimizing the error between true labels t_n and estimated labels f(x_n;W), with E(W) = Sum_n L(t_n, f(x_n;W))
2. Minimization with gradient descent if f is differentiable
3. Training with error back-propagation
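Minimizing E(W) by gradient descent can be sketched on the smallest possible case. The assumptions here are loud ones: a single sigmoid unit rather than a multi-layer network, a squared loss L = 0.5*(f - t)², a made-up 1D dataset and a hand-picked learning rate. The chain-rule line is the one-layer special case of error back-propagation.

```python
import math


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def error(data, w, b):
    """E(W) = Sum_n L(t_n, f(x_n; W)) with squared loss (an assumption)."""
    return sum(0.5 * (sigmoid(w * x + b) - t) ** 2 for x, t in data)


def gradient_step(data, w, b, lr):
    """One gradient-descent update of the weight w and bias b."""
    grad_w = grad_b = 0.0
    for x, t in data:
        y = sigmoid(w * x + b)
        delta = (y - t) * y * (1.0 - y)   # chain rule: dL/dy * dy/da
        grad_w += delta * x               # da/dw = x
        grad_b += delta                   # da/db = 1
    return w - lr * grad_w, b - lr * grad_b


data = [(-2.0, 0.0), (-1.0, 0.0), (1.0, 1.0), (2.0, 1.0)]   # (x_n, t_n), toy data
w, b = 0.0, 0.0
initial_error = error(data, w, b)
for _ in range(200):
    w, b = gradient_step(data, w, b, lr=0.5)
final_error = error(data, w, b)
print(initial_error, final_error)
```

The update only works because f is differentiable in w and b, which is the condition named in point 2.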
What is the definition of the Hubel/Wiesel architecture (Nobel prize 1981) in deep learning?
Visual cortex consists of simple, complex and hyper-complex cells
What are the working steps of a convolutional neural network (CNN)? [6]
1. Input image
2. Convolution
3. Non-linearity
4. Spatial pooling
5. Normalization
6. Feature maps
Trained by back-propagating the classification error
What are the three network options for a 1k² (1000×1000) image with 1M hidden units for convolutional neural network (CNN)? [3]
1. Fully connected network
2. Locally connected net
3. Convolutional net
What is the characteristic of the fully connected network for a 1k² image with 1M hidden units for convolutional neural network (CNN)?
Requires 1T parameters (10⁶ pixels × 10⁶ units)
What is the characteristic of the locally connected network for a 1k² image with 1M hidden units for convolutional neural network (CNN)?
With 10×10 receptive fields it requires 100M parameters (10⁶ units × 100 weights)
What are the characteristics of the convolutional network for a 1k² image with 1M hidden units for convolutional neural network (CNN)? [3]
1. Shares parameters across different locations
2. With 100 filters of size 10×10 it requires 10k parameters
3. Result is a response map of size 1000x1000x100 (which has to be held in memory)
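The parameter counts for the three network types follow from simple multiplication. A short sketch that reproduces them, ignoring bias terms (an assumption; the cards do not mention biases):

```python
image_pixels = 1000 * 1000      # 1k x 1k input image
hidden_units = 1_000_000        # 1M hidden units
receptive_field = 10 * 10       # 10 x 10 local window
num_filters = 100

# Fully connected: every hidden unit has one weight per pixel.
fully_connected = image_pixels * hidden_units

# Locally connected: every hidden unit has its own 10 x 10 weights.
locally_connected = hidden_units * receptive_field

# Convolutional: the 10 x 10 weights of each filter are shared across all locations.
convolutional = num_filters * receptive_field

# The response map still has one value per location and filter.
response_map_values = 1000 * 1000 * num_filters

print(fully_connected, locally_connected, convolutional, response_map_values)
```

Weight sharing shrinks the parameter count by eight orders of magnitude, while the activations (10⁸ values) remain the memory cost.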
What has to be considered for an assumed eye detector on the convolutional network for a 1k² image with 1M hidden units for convolutional neural network (CNN)? [2]
1. Problem: the detection should be robust to the exact location of the eye
2. Solution: pool (e.g. max or average) the filter responses over neighboring locations
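Why pooling makes a detector robust to the exact location can be shown on a 1D toy example. A minimal sketch, assuming non-overlapping max pooling; the response values are made up and `max_pool_1d` is a hypothetical helper.

```python
def max_pool_1d(values, size):
    """Non-overlapping max pooling; assumes len(values) is a multiple of size."""
    return [max(values[i:i + size]) for i in range(0, len(values), size)]


# Filter responses of an "eye detector"; the peak shifts by one position.
response_a = [0.1, 0.0, 0.9, 0.1, 0.0, 0.2, 0.1, 0.0]
response_b = [0.1, 0.0, 0.1, 0.9, 0.0, 0.2, 0.1, 0.0]

pooled_a = max_pool_1d(response_a, size=4)
pooled_b = max_pool_1d(response_b, size=4)
print(pooled_a, pooled_b)
```

Both pooled outputs are identical as long as the peak stays inside the same pooling window; a shift across a window boundary would still change the result.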
What are the characteristics of layers for convolutional neural network (CNN)? [2]
1. Each hidden neuron connects to a local spatial region of the input, covering its full depth
2. Multiple neurons looking at the same input region are stacked in depth
What are the characteristics of filters for convolutional neural network (CNN)? [2]
1. The output of one filter is a so-called depth slice or activation map
2. Successive layers yield low-, mid- and high-level features before the classifier
Name three non-linearities g(a) for convolutional neural network (CNN)? [3]
1. Sigmoid with g(a) = sigma(a) = 1/(1+exp(-a))
2. Hyperbolic tangent with g(a) = tanh(a) = 2*sigma(2a) – 1
3. Rectified linear unit (ReLU) with g(a) = max{0,a}
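The three non-linearities, including the identity tanh(a) = 2*sigma(2a) - 1, can be checked numerically. A small sketch using only the standard library; the function names are illustrative.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def tanh_via_sigmoid(a):
    """tanh expressed through the sigmoid: 2*sigma(2a) - 1."""
    return 2.0 * sigmoid(2.0 * a) - 1.0


def relu(a):
    return max(0.0, a)


for a in [-2.0, -0.5, 0.0, 0.5, 2.0]:
    # The difference to math.tanh should be at floating-point noise level.
    print(a, sigmoid(a), tanh_via_sigmoid(a) - math.tanh(a), relu(a))
```

The identity shows that tanh is a rescaled and shifted sigmoid with output range (-1, 1) instead of (0, 1).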
List possible CNN architectures? [5]
1. LeNet (1998)
2. AlexNet (2012)
3. VGGNet (2014/15)
4. GoogLeNet (2014)
5. Residual networks ResNet (2015)
What are the characteristics of LeNet as a possible CNN architecture? [4]
1. Early convolution architecture
2. 2 convolutional and pooling layers
3. Fully connected NN layers for classification
4. Successfully used for handwritten digit recognition (MNIST)
What are the characteristics of AlexNet as a possible CNN architecture? [8]
1. Similar to LeNet
2. 7 hidden layers, 650k units and 60M parameters
3. 11×11 filters with stride 4 in the first layer
4. More training data (10⁶ images instead of 10³)
5. GPU implementation
6. Better regularization and up-to-date training tricks such as dropout
7. Cut the ILSVRC error rate by more than a third (16.4% vs 26.2%) // Revolution
8. Its authors' company was acquired by Google and the approach was deployed to Google+ in 2013
What are the characteristics of VGGNet as a possible CNN architecture? [3]
1. Deeper network with stacked convolutional layers // 19 layers
2. 3×3 filters with stride 1, resulting in fewer parameters
3. Improved ILSVRC top-5 error to 6.7%
What are the characteristics of GoogLeNet as a possible CNN architecture? [2]
1. Uses inception modules // 22 layers
2. ILSVRC performance similar to VGGNet
What are the characteristics of ResNet as a possible CNN architecture? [3]
1. Skip connections that let the signal bypass layers
2. Better propagation to deeper layers // 152 layers
3. Improved ILSVRC top-5 error to 3.57%
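The skip-connection idea can be illustrated in a few lines. This sketch assumes the common residual formulation y = F(x) + x, which the cards do not spell out; `residual_block` and the toy functions passed to it are hypothetical.

```python
def residual_block(x, residual_fn):
    """Return F(x) + x, where residual_fn computes the learned residual F."""
    fx = residual_fn(x)
    return [f_i + x_i for f_i, x_i in zip(fx, x)]


x = [1.0, -2.0, 3.0]

# If the learned residual is zero, the block simply passes its input through.
passthrough = residual_block(x, lambda v: [0.0 for _ in v])

# A non-zero residual only has to model the change relative to the input.
adjusted = residual_block(x, lambda v: [0.5 * v_i for v_i in v])
print(passthrough, adjusted)
```

Because the input is added back unchanged, the signal (and its gradient) has a direct path to deeper layers.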
What are possible applications for convolutional neural network (CNN)? [4]
1. Transfer of generic learned features to other datasets, e.g. Caltech-256
2. Detection
3. Semantic segmentation
4. Face verification
How does detection with convolutional neural network (CNN) work? [4]
1. Extract region proposals
2. Compute CNN features
3. Classify regions
4. Improves accuracy from ~35% to ~50%
How did the accuracy of object detection increase with convolutional neural network (CNN)?
From 40% before (with sliding-window detectors) to 75% with deep CNNs
Name three object detectors based on convolutional neural network (CNN)? [3]
1. R-CNN 2. Fast R-CNN and 3. Faster R-CNN // R stands for regions
What are the steps of R-CNN? [3]
1. Extract ~2k region proposals from input image // Selective search
2. Compute CNN features out of warped region with pre-trained/fine-tuned network // AlexNet, VGGNet
3. Classify regions
How are regions classified for R-CNN? [2]
1. Linear SVMs and 2. bounding box regressors (Bbox reg)
How does the linear SVM on R-CNN work? [2]
1. f_c(x_fc7) = w_c^T * x_fc7
2. With x_fc7 the features from fully-connected layer 7 and c the object class
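The per-class scoring rule above is one dot product per object class. A toy sketch: the real fc7 feature vector is much longer, and the 3-dimensional features, the class names and the weights below are made up for illustration.

```python
def svm_scores(x_fc7, class_weights):
    """Return f_c(x_fc7) = w_c^T * x_fc7 for every class c."""
    return {c: sum(w_i * x_i for w_i, x_i in zip(w_c, x_fc7))
            for c, w_c in class_weights.items()}


# Hypothetical weight vectors, one linear SVM per object class
class_weights = {
    "cat": [1.0, -1.0, 0.5],
    "dog": [-0.5, 2.0, 0.0],
}
region_features = [2.0, 1.0, 4.0]   # stand-in for the fc7 features of one region

scores = svm_scores(region_features, class_weights)
best_class = max(scores, key=scores.get)
print(scores, best_class)
```

Each region proposal is scored by every class-specific SVM, so the classification step stays cheap once the CNN features are computed.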
How does the bounding box regression (bbox reg) of R-CNN work? [2]
1. Predict a corrected 2D box, since the proposal region may be badly localized
2. Learn regression weights that yield the new x*, y*, w* and h*
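How predicted offsets turn a proposal into a new box x*, y*, w*, h* can be sketched as below. The parameterization is an assumption not stated on the cards: it follows the one commonly given for R-CNN (center shifts scaled by the proposal size, log-space scaling of width and height). In practice the offsets would come from learned regressors on CNN features; here they are made-up numbers.

```python
import math


def apply_bbox_regression(proposal, deltas):
    """Map a proposal (center x, center y, width, height) to a refined box."""
    p_x, p_y, p_w, p_h = proposal
    d_x, d_y, d_w, d_h = deltas
    x_new = p_w * d_x + p_x        # shift the center, relative to the proposal width
    y_new = p_h * d_y + p_y        # shift the center, relative to the proposal height
    w_new = p_w * math.exp(d_w)    # scale the width in log space
    h_new = p_h * math.exp(d_h)    # scale the height in log space
    return x_new, y_new, w_new, h_new


proposal = (100.0, 50.0, 40.0, 20.0)
deltas = (0.1, -0.2, 0.0, math.log(2.0))   # hypothetical regressor outputs
box = apply_bbox_regression(proposal, deltas)
print(box)
```

Zero offsets leave the proposal unchanged, so the regressor only has to learn the correction.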