Region-based Annotated Child Pornography Dataset (RCPD)

This dataset is a private database that belongs to the Brazilian Federal Police. Its structure is described in the paper “A Benchmark Methodology for Child Pornography Detection”. The aim of the dataset is to assess and compare the performance of child pornography detection methods. Its files are not available to the public by any means, but researchers can submit their detection methods to be evaluated against RCPD by following the instructions below.

Evaluation Instructions

To submit your method, send an email to macedo.jjmn@dpf.gov.br with a zip file containing your method and a script for building a Docker container that defines the required libraries and dependencies. Alternatively, you can email a link to these files. You should also implement the code that tests your method, meeting the following requirements:

Assume that the images are located in a folder named “data”, under your method’s directory.
Write a CSV file named results.txt with one line per tested image, containing the image's file name and your prediction (‘True’ or ‘False’) for child pornography:


filename001.jpg,True
filename002.jpg,False
filename003.jpg,False

The following script shows an example of this code in Python:


import os
import random
from os.path import abspath, dirname, isfile, join

# Predicts whether an image is related to child pornography.
# Returns True or False.
def predict(file_path):
    # Call your method here; the random choice is only a placeholder.
    return random.choice([True, False])

if __name__ == "__main__":
    script_dir = dirname(abspath(__file__))
    dset = join(script_dir, "data")
    results = join(script_dir, "results.txt")
    # Keep regular files only, in a stable (sorted) order.
    fnames = sorted(f for f in os.listdir(dset) if isfile(join(dset, f)))
    # The with-block closes the file even if predict() raises.
    with open(results, "w") as output:
        for f in fnames:
            res = predict(join(dset, f))
            # One line per image: <file name>,<True|False>
            output.write(f + "," + str(res) + "\n")
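
Before submitting, it may help to check that the generated file follows the format described above. The sketch below is not part of the official evaluation procedure; the function name validate_results and the specific checks (two fields per line, a prediction of exactly True or False, no repeated file names) are assumptions derived from the example lines shown earlier.

```python
import csv


def validate_results(lines):
    """Return a list of problems found in results.txt-style lines.

    `lines` is any iterable of strings, for example an open file object.
    An empty list means that no problem was detected.
    """
    problems = []
    seen = set()
    for lineno, row in enumerate(csv.reader(lines), start=1):
        if not row:
            # Blank lines are skipped (an assumption of this sketch).
            continue
        if len(row) != 2:
            problems.append(f"line {lineno}: expected 2 fields, found {len(row)}")
            continue
        name, prediction = row
        if not name:
            problems.append(f"line {lineno}: empty file name")
        if prediction not in ("True", "False"):
            problems.append(f"line {lineno}: prediction must be True or False")
        if name in seen:
            problems.append(f"line {lineno}: duplicate file name {name}")
        seen.add(name)
    return problems
```

To check a generated file, open it and pass the file object to the function, for example `with open("results.txt", newline="") as fh: print(validate_results(fh))`.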

Scoreboard

Results of methods and tools evaluated against the RCPD dataset:

Method/Tool	Accuracy (%)	Precision (%)	Recall (%)	F1-Score (%)
Macedo et al.	79.84	68.64	64.61	66.56
LED	76.47	75.34	57.21	66.30
NuDetective	57.43	78.74	41.24	54.13
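
The four scoreboard columns can be estimated on your own labelled validation data before submitting. The sketch below assumes the standard binary-classification definitions of accuracy, precision, recall and F1-score, with the positive class being True; the exact computation used by the evaluators is described in the paper and may differ. The function names and the dictionary-based input format are hypothetical.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both on the same scale)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def binary_metrics(truth, predicted):
    """Compute accuracy, precision, recall and F1-score as percentages.

    `truth` and `predicted` map each file name to a bool.
    Every file name in `truth` must also be present in `predicted`.
    """
    tp = fp = tn = fn = 0
    for name, actual in truth.items():
        guess = predicted[name]
        if guess and actual:
            tp += 1
        elif guess and not actual:
            fp += 1
        elif not guess and actual:
            fn += 1
        else:
            tn += 1
    total = tp + fp + tn + fn
    accuracy = 100.0 * (tp + tn) / total if total else 0.0
    precision = 100.0 * tp / (tp + fp) if tp + fp else 0.0
    recall = 100.0 * tp / (tp + fn) if tp + fn else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1_score(precision, recall),
    }
```

As a quick check of the F1 definition, a precision of 68.64 and a recall of 64.61 give a harmonic mean of about 66.56, which agrees with the first row of the table.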

Please cite this dataset as:

Macedo, Joao, Filipe Costa, and Jefersson A. dos Santos. “A Benchmark Methodology for Child Pornography Detection.” 2018 31st IEEE SIBGRAPI Conference on Graphics, Patterns and Images (SIBGRAPI). 2018.


@inproceedings{macedo2018rcpd,
  title={A Benchmark Methodology for Child Pornography Detection},
  author={Macedo, Joao and Costa, Filipe and dos Santos, Jefersson A.},
  booktitle={Graphics, Patterns and Images (SIBGRAPI), 2018 31st SIBGRAPI Conference on},
  year={2018},
  organization={IEEE}
}