| Dataset | # subjects / # sessions | Links | Year | Spoof attacks | Publish Time |
|---|---|---|---|---|---|
| NUAA | 15/3 | Link | 2010 | Print | 2010 |
| CASIA-MFSD | 50/3 | Link | 2012 | Print, Replay | 2012 |
| Replay-Attack | 50/1 | Link | 2012 | Print, 2 Replay | 2012 |
| MSU-MFSD | 35/1 | Link | 2015 | Print, 2 Replay | 2015 |
| MSU-USSA | 1140/1 | Link | 2016 | 2 Print, 6 Replay | 2016 |
| Oulu-NPU | 55/3 | Link | 2017 | 2 Print, 6 Replay | 2017 |
| SiW | 165/4 | Link | 2018 | 2 Print, 4 Replay | 2018 |
| ROSE-Youtu | 25/* | Link | 2018 | Print, Replay, Mask | 2018 |

Details:

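NUAA: contains 12,641 still images;
CASIA-MFSD: contains 600 videos of 50 subjects, covering three attack types (photo, cut photo, video). For each subject, the genuine face and the faces captured under the three attack types are recorded at three image qualities (low, normal, high);
Replay-Attack Dataset: contains 1,300 videos of 50 subjects. For each subject there are two recording conditions (controlled and adverse), three attack types (print, digital photo, and video), and two attack modes (fixed and hand-held).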
MSU_USSA: http://biometrics.cse.msu.edu/Publications/Databases/MSU_USSA/
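Oulu-NPU: contains 4,950 real and attack videos. Capture devices: Samsung Galaxy S6 edge, HTC Desire EYE, MEIZU X5, ASUS Zenfone Selfie, Sony XPERIA C5 Ultra Dual, and OPPO N3.
SiW: videos of 165 subjects. Each subject has 8 live and up to 20 spoof videos, 4,478 videos in total. Each video is recorded at 30 fps, lasts about 15 seconds, and is 1080p HD. The live videos cover variations in distance, pose, illumination, and expression. The spoof videos cover printed-paper and replay attacks.
ROSE-Youtu: contains 25 subjects and 4,225 videos in total (3,350 videos of 20 subjects are publicly available, 5.45 GB in size). Each subject has 150 to 200 video clips of about 10 seconds each. Capture devices: Hasee, HUAWEI, iPad 4, iPhone 5s, ZTE. The face-to-camera distance is 30 to 50 cm.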

Deformable ConvNet V1 & V2

Posted on 2018-11-30 | In Network Structures

| Type | Formula |
|---|---|
| Regular Convolution | $y(p) = \sum_{k=1}^{K} w_k \cdot x(p + p_k)$ |
| Deformable ConvNet v1 | $y(p) = \sum_{k=1}^{K} w_k \cdot x(p + p_k + \Delta p_k)$ |
| Deformable ConvNet v2 | $y(p) = \sum_{k=1}^{K} w_k \cdot x(p + p_k + \Delta p_k) \cdot \Delta m_k$ |
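
To make the three formulas concrete, here is a minimal 1-D sketch in plain Python. It is not the implementation from the Deformable ConvNet papers or from any framework: the names `sample_linear` and `deform_conv1d_at` are hypothetical, linear interpolation stands in for the bilinear interpolation used in 2-D, and positions outside the signal are assumed to read as zero. Here $p_k$ is a position on the regular sampling grid, $\Delta p_k$ is the offset, and $\Delta m_k$ is the modulation scalar; in a real network both are learned, while here they are passed in by hand.

```python
import math

def sample_linear(x, pos):
    """Read the 1-D signal x at a fractional position (zero outside x)."""
    lo = math.floor(pos)
    frac = pos - lo

    def at(i):
        return x[i] if 0 <= i < len(x) else 0.0

    return (1.0 - frac) * at(lo) + frac * at(lo + 1)

def deform_conv1d_at(x, w, p, offsets=None, masks=None):
    """y(p) = sum_k w_k * x(p + p_k + dp_k) * dm_k for one output position p."""
    K = len(w)
    grid = [k - K // 2 for k in range(K)]                 # p_k, e.g. [-1, 0, 1]
    offsets = [0.0] * K if offsets is None else offsets   # dp_k (v1 and v2)
    masks = [1.0] * K if masks is None else masks         # dm_k (v2 only)
    return sum(
        w[k] * sample_linear(x, p + grid[k] + offsets[k]) * masks[k]
        for k in range(K)
    )

x = [0.0, 1.0, 2.0, 3.0, 4.0]
w = [1.0, 1.0, 1.0]
regular = deform_conv1d_at(x, w, p=2)                      # no offsets, no masks
v1 = deform_conv1d_at(x, w, p=2, offsets=[0.5, 0.0, 0.5])  # shifted sampling points
v2 = deform_conv1d_at(x, w, p=2, offsets=[0.5, 0.0, 0.5],
                      masks=[1.0, 0.5, 0.0])               # shifted and re-weighted
```

With all offsets at zero and all masks at one, the v1 and v2 calls reduce to the regular convolution, which is the relationship the table expresses.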

Tricks in MXNet & Gluon

Posted on 2018-10-23 | In Deep Learning Frameworks

Gluon

Error:

`RuntimeError: Parameter 'stn_conv0_weight' was not initialized on context gpu(0). It was only initialized on [gpu(0)].`

Solution:

Do the parameter setup (judging by the error, the initialization) before creating `gluon.Trainer(net.collect_params(), opt.optimizer)`.



## Extracting the output of intermediate feature layers

### Method 1: subclass Block and implement forward

... to be continued

### Method 2: use SymbolBlock

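```python
import mxnet as mx
from mxnet import gluon

ctx = mx.cpu()
# densenet121 is an assumption; the original note did not name a concrete variant.
net = gluon.model_zoo.vision.densenet121(pretrained=True, ctx=ctx)
# To use local weights instead: net.load_parameters("./densenet.params", ctx=ctx)
data = mx.sym.var('data')
internals = net(data).get_internals()  # symbols of all intermediate outputs
out_list = [internals['densenet0_stage3_conv13_fwd_output']]
feat_net = gluon.SymbolBlock(out_list, data, params=net.collect_params())
```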

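## Freezing layers

Option 1: run the frozen layers outside the `autograd.record()` scope, so that they only do a forward pass and no gradient is propagated back through them.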

## Setting different learning rates for different layers

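```python
import mxnet as mx
from mxnet import gluon

# densenet121 is an assumption; the original note did not name a concrete variant.
net = gluon.model_zoo.vision.densenet121(pretrained=True, ctx=mx.cpu())
for name, param in net.features.collect_params().items():
    param.lr_mult = 0.1  # train the pretrained feature layers at a reduced rate
```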

The learning rate of the selected layers becomes `base_learning_rate * param.lr_mult`.
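
The rule above is plain multiplication. As a small illustration in plain Python (no MXNet needed; `effective_learning_rates` and the parameter names are hypothetical), with a base learning rate of 0.01 the backbone parameter below would train at about 0.001 while the new head keeps 0.01:

```python
def effective_learning_rates(base_lr, lr_mults):
    """Map each parameter name to base_lr * lr_mult."""
    return {name: base_lr * mult for name, mult in lr_mults.items()}

lr_mults = {
    "features_conv0_weight": 0.1,  # pretrained backbone, fine-tuned slowly
    "output_dense0_weight": 1.0,   # freshly initialized head, full rate
}
rates = effective_learning_rates(0.01, lr_mults)
```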

## Gradient clipping

```python
# The optimizer options are named 'learning_rate' and 'clip_gradient'.
gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': 1e-2, 'clip_gradient': 2})
```
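
As far as I know, MXNet's `clip_gradient` optimizer option clips every gradient element to the range `[-clip_gradient, clip_gradient]` before the update. The sketch below mimics that for plain SGD in pure Python. `clip_elements` and `sgd_step` are hypothetical helper names, and momentum, weight decay, and gradient rescaling are left out.

```python
def clip_elements(grads, clip):
    """Clamp every gradient element to [-clip, clip]."""
    return [max(-clip, min(clip, g)) for g in grads]

def sgd_step(weights, grads, lr, clip):
    """One plain SGD update using element-wise clipped gradients."""
    clipped = clip_elements(grads, clip)
    return [w - lr * g for w, g in zip(weights, clipped)]

grads = [-5.0, 0.5, 3.0]
clipped = clip_elements(grads, 2)   # large entries are cut to +/-2, 0.5 is kept
new_weights = sgd_step([1.0, 1.0, 1.0], grads, lr=1e-2, clip=2)
```

Note that this is element-wise clipping, not rescaling by the global gradient norm, so the direction of the gradient vector can change.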

MXNet

API

reshape

mxnet.ndarray.reshape(data=None, shape=_Null, reverse=_Null, target_shape=_Null, keep_highest=_Null, out=None, name=None, **kwargs)
"""
Some dimensions of the shape can take special values from the set {0, -1, -2, -3, -4}. The significance of each is explained below:
0 copy this dimension from the input to the output shape.
-1 infers the dimension of the output shape by using the remainder of the input dimensions keeping the size of the new array same as that of the input array. At most one dimension of shape can be -1.
-2 copy all/remainder of the input dimensions to the output shape.
-3 use the product of two consecutive dimensions of the input shape as the output dimension.
-4 split one dimension of the input into two dimensions passed subsequent to -4 in shape (can contain -1)
"""
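
The special values are easier to follow with a worked example. The function below is a pure-Python sketch that emulates the documented rules to compute the output shape. It is not MXNet's implementation: `infer_reshape` is a hypothetical name, and the `reverse` option and error checking are not handled. The sample calls are the ones I remember from the MXNet documentation for an input of shape `(2, 3, 4)`.

```python
from functools import reduce
from operator import mul

def infer_reshape(in_shape, spec):
    """Resolve a reshape spec containing 0, -1, -2, -3, -4 into an output shape."""
    out = []
    i = 0            # index of the next unconsumed input dimension
    j = 0            # index into spec
    infer_at = None  # position in `out` of the single -1, if present
    while j < len(spec):
        s = spec[j]
        if s > 0:        # explicit size
            out.append(s)
            i += 1
        elif s == 0:     # copy this input dimension
            out.append(in_shape[i])
            i += 1
        elif s == -1:    # infer later from the remaining elements
            infer_at = len(out)
            out.append(1)
            i += 1
        elif s == -2:    # copy all remaining input dimensions
            out.extend(in_shape[i:])
            i = len(in_shape)
        elif s == -3:    # merge two consecutive input dimensions
            out.append(in_shape[i] * in_shape[i + 1])
            i += 2
        elif s == -4:    # split one input dimension into the next two spec values
            d, a, b = in_shape[i], spec[j + 1], spec[j + 2]
            if a == -1:
                a = d // b
            elif b == -1:
                b = d // a
            out.extend([a, b])
            i += 1
            j += 2
        j += 1
    if infer_at is not None:
        total = reduce(mul, in_shape, 1)
        known = reduce(mul, out, 1)  # the -1 slot currently holds 1
        out[infer_at] = total // known
    return tuple(out)

print(infer_reshape((2, 3, 4), (4, 0, 2)))           # (4, 3, 2)
print(infer_reshape((2, 3, 4), (6, 1, -1)))          # (6, 1, 4)
print(infer_reshape((2, 3, 4), (-2, 1, 1)))          # (2, 3, 4, 1, 1)
print(infer_reshape((2, 3, 4), (0, -3)))             # (2, 12)
print(infer_reshape((2, 3, 4), (-4, 1, 2, -2)))      # (1, 2, 3, 4)
```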

Debug

Error message:
ValueError: You created Module with Module(..., data_names=['data']) but input with name 'data' is not found in symbol.list_arguments(). Did you mean one of: data0

Solution:
mod = mx.mod.Module(symbol=sym, context=ctx, data_names=('data0',), label_names=None)

Error message:
DeferredInitializationError: Parameter disnet0_conv5_weight has not been initialized yet because initialization was deferred. Actual initialization happens during the first forward pass. Please pass one batch of data through the network before accessing Parameters. You can also avoid deferred initialization by specifying in_units, num_features, etc., for network layers.

Solution:

A layer was defined in the `__init__` of the network class but never used in `forward`. Removing it fixes the error.

Pedestrian Attribute Notes

Posted on 2018-09-13 | In Attribute Recognition

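Pedestrian attribute recognition usually covers several attributes at once, such as gender, age, coat, trousers, and luggage. In real projects we found that, compared with face attributes, pedestrian attribute recognition has its own characteristics and is harder:

- Some attributes relate only to part of the body; for example, long versus short sleeves depends only on the upper body;
- The training samples are hard, mainly because of partial bodies and occlusion (for example, only the upper or lower half of a body is visible, or several people appear in the same person bbox);
- Wrong and ambiguous labels make up a large share (> 30% in real scenarios);
- Some samples have missing labels;
- The class distribution of the samples is imbalanced;
- Some attributes are multi-class and others are binary, and a multi-class problem cannot simply be turned into binary ones;
- Some attributes occupy a very small region of the image, such as sunglasses, or a backpack seen from the front.

In summary, the challenges in pedestrian attribute recognition can be described with the following keywords: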

- Occlusion
- Truncation
- Imbalance
- Noisy Label
- Label Missing

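A simple, brute-force design is to use an independent model for each attribute. With many attributes, however, even small models can be more than a high-end GPU can handle. A more scalable approach is to treat the problem as multi-task or multi-label learning. In particular, when some attributes have several classes ($\geq 3$), multi-task learning is the more reasonable choice.

Attribute recognition based on multi-task learning faces many challenges, such as at which layer the shared layers should end and the branches begin, and how to weight the losses of the different tasks. In summary, the challenges fall into the following areas: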

- Network Structure
- Adaptive Loss
- Class Imbalance
- Missing Labels

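Papers have proposed solutions to some of the challenges listed above. So far, however, I have not seen one that addresses all of them together. Perhaps this is just an engineering problem and the experts do not bother with such tedious details :yum::yum::yum:. But these problems have to be solved before a product can ship.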
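
As one possible way to combine two of the points above (loss weighting and missing labels), here is a minimal sketch in plain Python. It is an illustration under my own assumptions, not a method taken from a specific paper: each attribute is a separate softmax task, a missing label is encoded with the hypothetical marker `MISSING = -1` and skipped, and the per-task weights are fixed by hand instead of being learned adaptively.

```python
import math

MISSING = -1  # hypothetical marker for an attribute that has no label

def softmax_cross_entropy(logits, label):
    """Cross-entropy of one sample, computed with the log-sum-exp trick."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]

def multitask_loss(task_logits, task_labels, task_weights):
    """Weighted mean of per-task losses, skipping tasks whose label is missing."""
    total, weight_sum = 0.0, 0.0
    for name, logits in task_logits.items():
        label = task_labels.get(name, MISSING)
        if label == MISSING:
            continue  # no supervision for this attribute on this sample
        w = task_weights.get(name, 1.0)
        total += w * softmax_cross_entropy(logits, label)
        weight_sum += w
    return total / weight_sum if weight_sum > 0 else 0.0

logits = {"gender": [0.0, 0.0], "coat": [0.0, 0.0, 0.0]}
weights = {"gender": 1.0, "coat": 2.0}
loss_full = multitask_loss(logits, {"gender": 1, "coat": 0}, weights)
loss_partial = multitask_loss(logits, {"gender": 1, "coat": MISSING}, weights)
```

Class imbalance could be handled in the same place, for example by scaling each sample's loss with a per-class weight, but that is left out to keep the sketch short.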

CV&DL Awesome

Posted on 2018-09-13

This is a curated list of repositories about machine learning, computer vision, and related topics.