RCPD Dataset
The region-based annotated child pornography dataset (RCPD) is a private database that belongs to the Brazilian Federal Police.
The paper "A Benchmark Methodology for Child Pornography Detection" [link] describes the structure of the dataset.
The aim of the dataset is to assess and compare the performance of child pornography detection methods. Its files are not available to the public by any means, but researchers can submit their child pornography detection methods to be evaluated against the RCPD dataset following the instructions below.
Evaluation Instructions
To submit your method, send an email to macedo.jjmn@dpf.gov.br with a zip file containing your method and a script for building a Docker container that defines its libraries and dependencies. Alternatively, you can email a link to these files.
You should also provide the code that runs your method on the test images. It must meet the following requirements:
- assume that the images are located in a folder named "data", under your method's directory;
- output a CSV file named results.txt, with one line per tested image containing the image's filename and your prediction ('True' or 'False') for child pornography, for example:
    filename001.jpg,True
    filename002.jpg,False
    filename003.jpg,False
The following script shows an example of this code in Python.
    import os
    import random
    from os.path import abspath, dirname, isfile, join


    # Predicts whether an image is related to child pornography.
    # Returns True or False.
    def predict(file_path):
        # Call your method here; the random choice is only a placeholder.
        return random.choice([True, False])


    if __name__ == "__main__":
        # Both the "data" folder and results.txt live next to this script.
        script_dir = dirname(abspath(__file__))
        dset = join(script_dir, "data")
        results = join(script_dir, "results.txt")

        # Keep regular files only; subdirectories are skipped.
        fnames = [f for f in os.listdir(dset) if isfile(join(dset, f))]

        # Write one "filename,prediction" line per image.
        with open(results, "w") as output:
            for f in fnames:
                res = predict(join(dset, f))
                output.write(f + "," + str(res) + "\n")
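Before submitting, it may help to check that results.txt follows the format described above. The following is a minimal sketch and not part of the official evaluation: the function name `check_results` and the duplicate-filename check are assumptions made for illustration only.

```python
import csv
import io


def check_results(lines):
    """Return a list of problems found in results.txt-style rows.

    `lines` is any iterable of text lines, such as an open file.
    Hypothetical helper: the checks are assumptions, not official rules.
    """
    problems = []
    seen = set()
    for rowno, row in enumerate(csv.reader(lines), start=1):
        # Each row must hold exactly a filename and a prediction.
        if len(row) != 2:
            problems.append(f"row {rowno}: expected 2 fields, got {len(row)}")
            continue
        name, pred = row
        # Predictions must be the literal strings 'True' or 'False'.
        if pred not in ("True", "False"):
            problems.append(f"row {rowno}: prediction must be True or False, got {pred!r}")
        # Assumption: each image should be listed only once.
        if name in seen:
            problems.append(f"row {rowno}: duplicate filename {name!r}")
        seen.add(name)
    return problems


if __name__ == "__main__":
    sample = io.StringIO("filename001.jpg,True\nfilename002.jpg,maybe\n")
    for problem in check_results(sample):
        print(problem)
```

To check a real file, pass an open file object instead of the in-memory sample, for example `check_results(open("results.txt", newline=""))`.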
Scoreboard
Results of methods and tools evaluated against the RCPD dataset:
Method/Tool | Accuracy (%) | Precision (%) | Recall (%) | F1-Score (%) |
Macedo et al. [Link] | 79.84 | 68.64 | 64.61 | 66.56 |
LED | 76.47 | 75.34 | 57.21 | 66.30 |
NuDetective | 57.43 | 78.74 | 41.24 | 54.13 |
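This page does not state how the scoreboard figures are derived from a submitted results.txt. As a rough sketch, assuming the standard binary-classification definitions with 'True' (child pornography) as the positive class, the four metrics could be computed as shown below. The function name `binary_metrics` and the dictionary inputs are hypothetical, and the sample labels are made up purely for illustration.

```python
def binary_metrics(truth, predicted):
    """Compute accuracy, precision, recall and F1 as fractions in [0, 1].

    `truth` and `predicted` map each filename to a bool.
    Assumption: True is the positive class.
    """
    tp = fp = tn = fn = 0
    for name, actual in truth.items():
        guess = predicted[name]
        if guess and actual:
            tp += 1
        elif guess and not actual:
            fp += 1
        elif not guess and actual:
            fn += 1
        else:
            tn += 1

    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Made-up labels, unrelated to the RCPD dataset.
    truth = {"a.jpg": True, "b.jpg": True, "c.jpg": False, "d.jpg": False}
    predicted = {"a.jpg": True, "b.jpg": False, "c.jpg": True, "d.jpg": False}
    for metric, value in binary_metrics(truth, predicted).items():
        print(f"{metric}: {100 * value:.2f}")
```

The values are returned as fractions and multiplied by 100 only when printed, to match the percentage columns of the table.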
Citation
The dataset is described in the paper "A Benchmark Methodology for Child Pornography Detection".
If you mention the RCPD dataset or submit your method for evaluation, please use the following reference:
    @inproceedings{macedo2018rcpd,
      title={A Benchmark Methodology for Child Pornography Detection},
      author={J. Macedo and F. Costa and J. A. dos Santos},
      booktitle={Graphics, Patterns and Images (SIBGRAPI), 2018 31st SIBGRAPI Conference on},
      year={2018},
      organization={IEEE}
    }