RCPD Dataset


The region-based annotated child pornography dataset (RCPD) is a private database that belongs to the Brazilian Federal Police.
The paper "A Benchmark Methodology for Child Pornography Detection" [link] describes the structure of the dataset.
The aim of the dataset is to assess and compare the performance of child pornography detection methods. Its files are not available to the public by any means, but researchers can submit their child pornography detection methods to be evaluated against the RCPD dataset following the instructions below.

Evaluation Instructions

To submit your method, send an email to macedo.jjmn@dpf.gov.br with a zip file containing your method and a script for a Docker container that defines the required libraries and dependencies. Alternatively, you can email a link to these files.

You should also implement the code to test your method, with the following requirements:

  • assume that the images are located in a folder named "data", under your method's directory;
  • output a CSV file named results.txt, containing one line per tested image with the image's file name and your prediction ('True' or 'False') for child pornography, for example:

filename001.jpg,True
filename002.jpg,False
filename003.jpg,False

The following script shows an example of this code in Python.

 
import os
import random
from os.path import abspath, dirname, isfile, join


def predict(file_path):
    """Predict whether the image at file_path is related to child pornography.

    Returns True or False. The random choice is only a placeholder:
    call your method here instead.
    """
    return random.choice([True, False])


if __name__ == "__main__":
    # Images are expected in a "data" folder next to this script.
    script_dir = dirname(abspath(__file__))
    dset = join(script_dir, "data")
    results = join(script_dir, "results.txt")
    # Only regular files are tested; subdirectories are skipped.
    fnames = sorted(f for f in os.listdir(dset) if isfile(join(dset, f)))
    # Write one "filename,prediction" line per image.
    with open(results, "w") as output:
        for f in fnames:
            res = predict(join(dset, f))
            output.write(f + "," + str(res) + "\n")
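
Before submitting, it may help to confirm that results.txt follows the format required above. The following is a minimal sketch of such a check, not part of the official evaluation: the function name check_results and the specific checks (two fields per line, a 'True'/'False' prediction, no repeated file names) are assumptions derived from the requirements listed in this section.

```python
import csv

# The only prediction values accepted by the format described above.
VALID_LABELS = {"True", "False"}


def check_results(lines):
    """Return a list of problems found in results.txt-style lines.

    `lines` is any iterable of strings, e.g. an open file object.
    This helper is hypothetical; it is not provided by the RCPD maintainers.
    """
    problems = []
    seen = set()
    for lineno, row in enumerate(csv.reader(lines), start=1):
        if not row:
            # Ignore blank lines (e.g. a trailing newline).
            continue
        if len(row) != 2:
            problems.append(f"line {lineno}: expected 2 fields, got {len(row)}")
            continue
        name, label = row
        if label not in VALID_LABELS:
            problems.append(f"line {lineno}: prediction must be True or False, got {label!r}")
        if name in seen:
            problems.append(f"line {lineno}: duplicate file name {name!r}")
        seen.add(name)
    return problems


if __name__ == "__main__":
    with open("results.txt", newline="") as f:
        for problem in check_results(f):
            print(problem)
```

An empty list means no problems were detected. Note that the example script writes file names without quoting, so a file name containing a comma would be reported here as a line with more than two fields.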

Scoreboard

Results of methods and tools evaluated against the RCPD dataset:

 

Method/Tool           Accuracy (%)  Precision (%)  Recall (%)  F1-Score (%)
Macedo et al. [Link]  79.84         68.64          64.61       66.56
LED                   76.47         75.34          57.21       66.30
NuDetective           57.43         78.74          41.24       54.13
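
The four columns are standard binary-classification metrics. As a rough sketch of how such values are commonly computed from predictions and ground-truth labels, assuming that 'True' is the positive class: the function binary_metrics and the sample labels below are hypothetical, and the exact evaluation procedure used for the scoreboard is not described on this page.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 as percentages.

    y_true and y_pred are equal-length sequences of booleans,
    where True is treated as the positive class (an assumption).
    """
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth and pred:
            tp += 1  # true positive
        elif truth and not pred:
            fn += 1  # false negative
        elif not truth and pred:
            fp += 1  # false positive
        else:
            tn += 1  # true negative

    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0

    return {
        "accuracy": 100 * accuracy,
        "precision": 100 * precision,
        "recall": 100 * recall,
        "f1": 100 * f1,
    }


if __name__ == "__main__":
    # Made-up labels for illustration only; the RCPD labels are private.
    truth = [True, True, False, False]
    preds = [True, False, True, False]
    print(binary_metrics(truth, preds))
```

Under these definitions, the F1-Score of the first row can be reproduced from its precision and recall: 2 × 68.64 × 64.61 / (68.64 + 64.61) ≈ 66.56.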

Citation

The dataset is described in the paper A Benchmark Methodology for Child Pornography Detection.

If you mention the RCPD dataset or submit your method for evaluation, please use the following reference:

 
@inproceedings{macedo2018rcpd,
  title={A Benchmark Methodology for Child Pornography Detection},
  author={J. Macedo and F. Costa and J. A. dos Santos},
  booktitle={Graphics, Patterns and Images (SIBGRAPI), 2018 31st SIBGRAPI Conference on},
  year={2018},
  organization={IEEE}
}