------------ status: published title: Brainwash Dataset desc: Brainwash is a dataset of webcam images taken from the Brainwash Cafe in San Francisco subdesc: It includes 11,917 images of "everyday life of a busy downtown cafe" and is used for training face and head detection algorithms caption: One of the 11,917 images in the Brainwash dataset captured from the Brainwash Cafe in San Francisco slug: brainwash cssclass: dataset image: assets/background.jpg year: 2015 published: 2019-4-18 updated: 2019-6-02 authors: Adam Harvey ------------ # Brainwash Dataset Update: In response to the publication of this report, the Brainwash dataset has been "removed from access at the request of the depositor." ### sidebar + Press coverage: New York Times, De Tijd ### end sidebar Brainwash is a dataset of livecam images taken from San Francisco's Brainwash Cafe. It includes 11,917 images of "everyday life of a busy downtown cafe"[^readme] captured at 100 second intervals throughout the day. The Brainwash dataset includes 3 full days of webcam images taken on October 27, November 13, and November 24 in 2014. According the author's [research paper](https://www.semanticscholar.org/paper/End-to-End-People-Detection-in-Crowded-Scenes-Stewart-Andriluka/1bd1645a629f1b612960ab9bba276afd4cf7c666) introducing the dataset, the images were acquired with the help of Angelcam.com. [^end_to_end] The Brainwash dataset is unique because it uses images from a publicly available webcam that records people inside a privately owned business without their consent. No ordinary cafe customer could ever suspect that their image would end up in dataset used for surveillance research and development, but that is exactly what happened to customers at Brainwash Cafe in San Francisco. Although Brainwash appears to be a less popular dataset, it was notably used in 2016 and 2017 by researchers affiliated with the National University of Defense Technology in China for two [research](https://www.semanticscholar.org/paper/Localized-region-context-and-object-feature-fusion-Li-Dou/b02d31c640b0a31fb18c4f170d841d8e21ffb66c) [projects](https://www.semanticscholar.org/paper/A-Replacement-Algorithm-of-Non-Maximum-Suppression-Zhao-Wang/591a4bfa6380c9fcd5f3ae690e3ac5c09b7bf37b) on advancing the capabilities of object detection to more accurately isolate the target region in an image. [^localized_region_context] [^replacement_algorithm] The [National University of Defense Technology](https://en.wikipedia.org/wiki/National_University_of_Defense_Technology) is controlled by China's top military body, the Central Military Commission. The Brainwash dataset also appears in a 2018 research paper affiliated with Megvii (Face++) that used images from Brainwash cafe "to validate the generalization ability of [their] CrowdHuman dataset for head detection."[^crowdhuman]. Megvii is the parent company of Face++, who has provided surveillance technology to [monitor Uighur Muslims](https://www.nytimes.com/2019/04/14/technology/china-surveillance-artificial-intelligence-racial-profiling.html) in Xinjiang and may be [blacklisted](https://www.bloomberg.com/news/articles/2019-05-22/trump-weighs-blacklisting-two-chinese-surveillance-companies) in the United States. #### Updates Since [posting](https://twitter.com/adamhrv/status/1132201604999000065) about this dataset and [showing](https://www.ft.com/content/cf19b956-60a2-11e9-b285-3acd5d43599e) its connections to the National Unviversity of Defense Technology in China, the Brainwash dataset is no longer available for download. As of June 2, 2019 it has been "removed from access at the request of the depositor." The two papers associated with the National University of Defense Technology in China have also been affected. The citations linking back to the Brainwash dataset paper no longer appear in the Semantic Scholar API search results. The citation references on the pages for [NUDT citation 1](https://www.semanticscholar.org/paper/A-Replacement-Algorithm-of-Non-Maximum-Suppression-Zhao-Wang/591a4bfa6380c9fcd5f3ae690e3ac5c09b7bf37b) and [NUDT citation 2](https://www.semanticscholar.org/paper/Localized-region-context-and-object-feature-fusion-Li-Dou/b02d31c640b0a31fb18c4f170d841d8e21ffb66c) now display the text "Sorry, this paper is not in our corpus", no longer linking back to the [original Brainwash paper](https://www.semanticscholar.org/paper/End-to-End-People-Detection-in-Crowded-Scenes-Stewart-Andriluka/1bd1645a629f1b612960ab9bba276afd4cf7c666), effectively censoring the NUDT connections from API search results.   {% include 'dashboard.html' %} {% include 'supplementary_header.html' %}  ### Press Coverage - New York Times: [Facial Recognition Tech Is Growing Stronger, Thanks to Your Face](https://www.nytimes.com/2019/07/13/technology/) - De Tijd: [Brainwash](https://www.tijd.be/dossier/legrandinconnu/brainwash/10136670.html) {% include 'cite_our_work.html' %} #### Citing Brainwash Dataset If you use any data from the Brainwash dataset, please follow their [license](https://opendatacommons.org/licenses/pddl/summary/index.html) and cite their work as:
@article{Stewart2016EndtoEndPD,
title={End-to-End People Detection in Crowded Scenes},
author={Russell Stewart and Mykhaylo Andriluka and Andrew Y. Ng},
journal={2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
year={2016},
pages={2325-2333}
}
### Footnotes
[^readme]: "readme.txt" https://exhibits.stanford.edu/data/catalog/sx925dc9385.
[^end_to_end]: Stewart, Russel. Andriluka, Mykhaylo. "End-to-end people detection in crowded scenes". 2016.
[^localized_region_context]: Li, Y. and Dou, Y. and Liu, X. and Li, T. Localized Region Context and Object Feature Fusion for People Head Detection. ICIP16 Proceedings. 2016. Pages 594-598.
[^replacement_algorithm]: Zhao. X, Wang Y, Dou, Y. A Replacement Algorithm of Non-Maximum Suppression Base on Graph Clustering.
[^crowdhuman]: Shuai Shao, Zijian Zhao, Boxun Li, Tete Xiao, Gang Yu, Xiangyu Zhang, and Jian Sun. CrowdHuman: Benchmark for Detecting Human in a Crowd. 2018. http://arxiv.org/abs/1805.00123