Center for Biometrics and Security Research
CASIA WebFace Database

Driven by big data and deep convolutional neural networks (CNNs), face recognition performance is becoming comparable to that of humans. Using private large-scale training datasets, several groups have achieved very high accuracy on LFW, i.e., 97% to 99%. While there are many open-source implementations of CNNs, no large-scale face dataset is publicly available. In the current state of the field, data matters more than algorithms. To address this problem, we propose a semi-automatic way to collect face images from the Internet and build a large-scale dataset containing 10,575 subjects and 494,414 images, called CASIA-WebFace. To the best of our knowledge, this dataset ranks second in size in the literature, smaller only than Facebook's private SFC dataset. We encourage researchers working on data-hungry methods to train on this dataset and report their performance on LFW.

The statistics of the CASIA-WebFace dataset are shown in Table 1. Apart from Facebook's SFC dataset, CASIA-WebFace is the largest of the listed datasets. Because of user privacy concerns, SFC may never be released to the research community. The features of Microsoft's WDRef dataset have been publicly available since 2012, but feature-only data is inflexible for advanced research. Among the datasets listed in the table, CASIA-WebFace+LFW is the most suitable combination for large-scale face recognition in the wild. If you feel that accuracy on LFW has been saturated by current state-of-the-art methods, BLUFR is a more challenging protocol for reporting your results.

Table 1. Statistics of CASIA-WebFace compared with other large-scale face datasets.

Dataset          #Subjects   #Images      Availability
LFW [1]          5,749       13,233       Public
WDRef [2]        2,995       99,773       Public (feature only)
CelebFaces [3]   10,177      202,599      Private
SFC [4]          4,030       4,400,000    Private
CACD [5]         2,000       163,446      Public (partially annotated)
CASIA-WebFace    10,575      494,414      Public
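
As a quick sanity check on the figures in Table 1, the following minimal Python sketch computes the average number of images per subject and ranks the datasets by image count. The numbers are copied from the table; the names DATASETS, images_per_subject and rank_by_images are illustrative only and are not part of any CASIA-WebFace tooling.

```python
# Figures copied from Table 1: dataset name -> (number of subjects, number of images).
DATASETS = {
    "LFW": (5_749, 13_233),
    "WDRef": (2_995, 99_773),
    "CelebFaces": (10_177, 202_599),
    "SFC": (4_030, 4_400_000),
    "CACD": (2_000, 163_446),
    "CASIA-WebFace": (10_575, 494_414),
}


def images_per_subject(name):
    """Return the average number of images per subject for one dataset."""
    subjects, images = DATASETS[name]
    return images / subjects


def rank_by_images():
    """Return dataset names sorted from most images to fewest."""
    return sorted(DATASETS, key=lambda name: DATASETS[name][1], reverse=True)


if __name__ == "__main__":
    for name in rank_by_images():
        print(f"{name:<14} {images_per_subject(name):8.1f} images/subject")
```

With these figures, CASIA-WebFace averages roughly 47 images per subject, against roughly 2 for LFW, and sits second to SFC when the datasets are ranked by image count.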

Publication and Results:
To illustrate the quality of CASIA-WebFace, we trained a deep CNN on it and compared its accuracy with state-of-the-art methods such as DeepFace and DeepID2. Please refer to the following technical report for details.
♦ Dong Yi, Zhen Lei, Shengcai Liao and Stan Z. Li, "Learning Face Representation from Scratch". arXiv preprint arXiv:1411.7923, 2014.

The above reference should be cited in all documents and papers that report experimental results based on the CASIA WebFace database.

Download Instructions:
To apply for the database, please follow the steps below:

  1. Download and print the document "Agreement for using CASIA WebFace database".
  2. Sign the agreement. (The agreement must be signed by the director or a delegate of the university department. Applications from individuals are not accepted.)
  3. Send the signed agreement to cbsr-request@authenmetric.com
  4. If your application has been approved, check your email after one day for a login account and password for our website.
  5. Download the CASIA WebFace database from our website using the authorized account within 48 hours.

Copyright Note and Contacts:
The database is released for research and educational purposes. We hold no liability for any undesirable consequences of using the database. All rights of the CASIA WebFace database are reserved.

References:
[1] LFW, http://vis-www.cs.umass.edu/lfw/
[2] D. Chen, X. Cao, L. Wang, F. Wen, and J. Sun. "Bayesian face revisited: A joint formulation". In ECCV 2012, pages 566–579. Springer, 2012.
[3] Y. Sun, X. Wang, and X. Tang. "Deep learning face representation by joint identification-verification". arXiv preprint arXiv:1406.4773, 2014.
[4] Y. Taigman, M. Yang, M. Ranzato, and L. Wolf. "Deepface: Closing the gap to human-level performance in face verification". In Computer Vision and Pattern Recognition (CVPR), 2014 IEEE Conference on, pages 1701–1708. IEEE, 2014.
[5] CARC, http://bcsiriuschen.github.io/CARC/

Copyright 2005 Center for Biometrics and Security Research. All rights reserved.
Center for Biometrics and Security Research, 12th Floor, Institute of Automation, Chinese Academy of Sciences
P.O. Box 2728, Beijing 100080, P.R. China
Tel: 010-62632259 Fax: 010-62632259 E-mail: hjzhang@cbsr.ia.ac.cn