1. Introduction
Pedestrian attributes help infer high-level semantic knowledge, improving the performance of pedestrian tracking, retrieval,
re-identification, etc. However, current pedestrian databases mainly target pedestrian detection or tracking applications, and
semantic attribute annotations related to pedestrians are rarely provided. To address this issue, we construct the Attributed Pedestrians
in Surveillance (APiS) 1.0 database with various scenes. Moreover, we develop an evaluation protocol that researchers can use to evaluate pedestrian attribute classification algorithms.
Figure 1. Some pedestrians with semantic annotations in the APiS 1.0 database: (a) binary attribute annotations;
(b) upper-body clothing color annotations; (c) lower-body clothing color annotations.
2. Database Composition
The APiS 1.0 database includes 3661 pedestrian images with 11 binary and 2 multi-class (color)
attribute annotations. Figure 1 shows some pedestrians with semantic attribute annotations in the APiS 1.0
database. The cropped pedestrian images collected in the APiS database come from four sources: the
KITTI database, the CBCL Street Scenes database, the INRIA database, and the SVS database (Surveillance Video Sequences
at a train station collected by ourselves). The APiS 1.0 database includes the following contents:
- Pedestrian bounding boxes. We use a pedestrian detector to locate pedestrians with bounding boxes. Because of copyright issues, we are
unable to provide all cropped images directly. You can download the raw data of KITTI and CBCL Street Scenes from their websites, and then use our bounding-box
information to crop the pedestrian images.
- Attribute annotations. Each cropped pedestrian image is first resized to 128x48 and then annotated with 11 binary
attributes and 2 multi-class attributes. The attribute statistics of the APiS 1.0 database are listed in Table 1.
- Evaluation protocols. The protocols cover two aspects: one for the evaluation of binary
attribute classification and the other for multi-class attribute classification.
Table 1. The attribute statistics of the APiS 1.0 database.
3. Evaluation Protocols
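Since the cropped images cannot be redistributed, the bounding-box cropping step above can be sketched in code. The snippet below is a minimal pure-Python illustration assuming an (x, y, width, height) bounding-box format and images represented as lists of pixel rows; the actual annotation file format may differ:

```python
def crop_and_resize(image, bbox, out_h=128, out_w=48):
    """Crop a bounding box from an image (a list of pixel rows) and
    resize the crop to out_h x out_w with nearest-neighbor sampling,
    matching the 128x48 size used in APiS 1.0."""
    x, y, w, h = bbox  # top-left corner plus width/height (assumed format)
    crop = [row[x:x + w] for row in image[y:y + h]]
    resized = []
    for i in range(out_h):
        src_i = min(h - 1, i * h // out_h)  # nearest source row
        row = []
        for j in range(out_w):
            src_j = min(w - 1, j * w // out_w)  # nearest source column
            row.append(crop[src_i][src_j])
        resized.append(row)
    return resized
```

In practice one would use an image library for I/O and interpolation; this sketch only illustrates the geometry of the cropping step.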
We evaluate the performance of each attribute classification
with 5-fold cross-validation. That is, we provide a sample index that separates the APiS 1.0
database into 5 equal-sized subsets, and each attribute classification is then evaluated
with the same sample index. The results from the 5 folds are averaged to
produce a single performance report.
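The cross-validation procedure above can be sketched as follows, assuming a per-sample fold index in {0, ..., 4} and a user-supplied evaluation function (both names are illustrative, not part of the released protocol files):

```python
def cross_validate(samples, fold_index, evaluate, k=5):
    """Evaluate an attribute classifier with k-fold cross-validation.

    fold_index[i] in {0, ..., k-1} assigns sample i to a fold;
    evaluate(train, test) returns a scalar performance number.
    The k per-fold results are averaged into a single report."""
    scores = []
    for fold in range(k):
        train = [s for s, f in zip(samples, fold_index) if f != fold]
        test = [s for s, f in zip(samples, fold_index) if f == fold]
        scores.append(evaluate(train, test))
    return sum(scores) / k
```

Using the same provided sample index for every algorithm keeps the comparison fair, since all methods see identical train/test splits.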
In the evaluation of the binary attribute classification,
samples with ambiguous annotations are excluded. Two performance measures, the recall rate
and the false positive rate, are applied for evaluation. The recall rate is the fraction of
positive samples that are correctly detected, and the false positive rate is the fraction of
negative samples that are mis-classified as positive. The Receiver Operating Characteristic (ROC) curve is also adopted to compare different
algorithms: an ROC curve is drawn by plotting the recall rate vs. the false positive rate
at various threshold settings. Since our evaluation is based on cross-validation, we report
the performance with the average ROC curve. To make the performance report more intuitive,
the Area Under the average ROC Curve (AUC) is also used for evaluation; the larger the AUC,
the better the classification performance.
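The ROC construction described above can be illustrated with a short sketch that sweeps the decision threshold over the classifier scores and integrates the resulting curve with the trapezoidal rule (this is a generic illustration, not the exact evaluation script shipped with the database):

```python
def roc_points(scores, labels):
    """Sweep the decision threshold over every distinct score and
    return (false positive rate, recall) points, starting at (0, 0)."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = []
    for t in [float("inf")] + sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under the ROC curve via the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area
```

A perfect classifier, whose positive scores all exceed its negative scores, reaches recall 1.0 at false positive rate 0.0 and therefore attains AUC 1.0.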
In the evaluation of the multi-class attribute classification,
samples with ambiguous or occluded annotations are excluded. To handle unseen colors beyond the
training data, we design an open-set identification experiment to evaluate the performance of the
multi-class attribute classification. In our open-set identification experiment, the defined colors are
used as the gallery classes, while samples of undefined colors are used as negative samples beyond the defined colors.
Therefore, in the testing phase, the undefined samples should be rejected as not belonging to the gallery,
indicating that their colors lie beyond the defined set. To evaluate the open-set identification
performance, we adopt the detection & identification rate Pdi
and the false positive rate Pfp defined by Phillips et al.
Assume that G represents the gallery set, and that QG and
QN represent two probe sets: QG consists of images whose classes are
present in G (but the images themselves differ from those in G), while
QN contains classes that are not present in G.
Then, Pdi and Pfp are formulated as:
Pdi(t) = |{q : q ∈ QG, s(q) ≥ t, cid(q) = 1}| / |QG|
Pfp(t) = |{q : q ∈ QN, s(q) ≥ t}| / |QN|
where s(q) is the decision score function that decides whether q
is a defined sample, t is the decision threshold,
and cid(q) is the classification indicator,
which equals 1 if and only if q is correctly
classified. We also use the average ROC curve to report the open-set identification performance,
where the ROC curve plots the detection & identification rate vs. the false positive rate
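Computing Pdi and Pfp at a given threshold can be sketched as below, assuming each probe is summarized by its decision score s(q), a flag for whether it was correctly classified (cid(q) = 1), and whether its class belongs to the gallery G (this probe representation is an assumption for illustration):

```python
def open_set_rates(probes, t):
    """Compute the detection & identification rate Pdi over the
    gallery probes QG and the false positive rate Pfp over the
    non-gallery probes QN at decision threshold t.

    Each probe is (score, correctly_classified, in_gallery)."""
    qg = [p for p in probes if p[2]]       # probes of gallery classes
    qn = [p for p in probes if not p[2]]   # probes of undefined classes
    pdi = sum(1 for s, cid, _ in qg if s >= t and cid) / len(qg)
    pfp = sum(1 for s, _, _ in qn if s >= t) / len(qn)
    return pdi, pfp
```

Sweeping t over all probe scores and collecting the (Pfp, Pdi) pairs yields the open-set ROC curve described in the text.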
at various threshold settings. In order to show the difference in performance of each defined color,
we further separate the query images into several subsets, with each subset containing a single
defined color and the undefined color. Then the above evaluation protocol is applied to each
subset to draw a performance curve with respect to the defined color of that subset.
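The per-color subset construction above can be sketched as follows, assuming query samples are labeled with a color string and undefined-color samples carry a sentinel label (both assumptions for illustration):

```python
def per_color_subsets(query, defined_colors, undefined="undefined"):
    """Split query samples into one subset per defined color, each
    containing that color's samples plus all undefined-color samples,
    so a separate ROC curve can be drawn per defined color.

    query is a list of (sample_id, color_label) pairs."""
    negatives = [q for q in query if q[1] == undefined]
    return {
        color: [q for q in query if q[1] == color] + negatives
        for color in defined_colors
    }
```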
4. Baseline Performance Report
Baseline performance report available on results page.
To download the database, please follow the steps below:
- Download and print the agreement document
for using the APiS 1.0 database.
- Sign the agreement and send it to email@example.com.
- If your application is approved, check your email to find a login account
and a password for our website after one day.
- Download the APiS 1.0 database from our website with the authorized account.
The database is released for research
and educational purposes only. We hold no liability for any
undesirable consequences of using the database. All rights to
the APiS 1.0 database are reserved. No person or organization
is permitted to distribute, publish, copy, or disseminate this database.
References
Jianqing Zhu, Shengcai Liao, Zhen Lei, Dong Yi and Stan Z. Li, "Pedestrian Attribute Classification in Surveillance: Database and Evaluation".
In ICCV Workshop on Large-Scale Video Search and Mining (LSVSM'13), Sydney, December 2013. [pdf] [code and baseline results]
A. Geiger, P. Lenz, and R. Urtasun, "Are we ready for autonomous driving? The KITTI vision benchmark suite". In CVPR, 2012. http://www.cvlibs.net/datasets/kitti
S. M. Bileschi and L. Wolf, "CBCL street scenes", 2006. http://cbcl.mit.edu/software-datasets/streetscenes
N. Dalal and B. Triggs, "Histograms of oriented gradients for human detection". In CVPR, 2005.
J. Yan, Z. Lei, D. Yi, and S. Z. Li, "Multi-pedestrian detection in crowded scenes: A global view". In CVPR, 2012.
P. J. Phillips, P. Grother, and R. Micheals, "Evaluation methods in face recognition". In Handbook of Face Recognition, pages 551–574. Springer, 2011.
S. Liao, A. K. Jain, and S. Z. Li, "Partial face recognition: Alignment-free approach". IEEE Transactions on Pattern Analysis and Machine Intelligence, pages 1193–1205, 2013.