Zusammenfassung: | The Ministry of Justice (MoJ) Data First Synthetic Data Project aims to improve engagement with Data First datasets by making synthetic versions of their content available, to enable more rapid development of research proposals and thereby enhance the potential for linked administrative data to improve understanding and outcomes across justice systems. The project has led the development of two components: a dataset generation platform and an initial release of low-fidelity synthetic data tables. This study includes a synthetically generated version of the Ministry of Justice Data First Crown Court datasets. Synthetic versions of all 43 tables in the MoJ Data First data ecosystem have been created. These versions can be used and joined in the same way as the real datasets. As well as underpinning training, synthetic datasets should enable researchers to explore research questions and to design research proposals before submitting them for approval. The code created during this exploration and design process should then enable initial results to be obtained as soon as data access is granted. The Ministry of Justice Data First Crown Court defendant dataset provides data on defendants’ appearances in criminal cases before the Crown Court in England & Wales from 2013. It has been extracted from the XHIBIT management information system, used by His Majesty’s Courts and Tribunals Service (HMCTS) to manage cases within the Crown Court. Please note: recent Trial and Sentencing cases are now usually recorded on a new case management system, Common Platform. These cases are not included, so coverage of Crown Court cases in this dataset will decrease over time (particularly for cases received from mid-2021), although the majority of cases disposed of during 2021 and 2022 are captured. Appropriate coverage and time period will be considered in assessing applications to use this data.
Information on defendants’ characteristics, the main offence charged, key case dates, processes and outcomes is included: for example, age, gender, ethnicity, offence category, hearings, plea, conviction and sentencing. Cases heard at the Crown Court for Trial, Sentencing or Appeal are included. Each record in the dataset gives information about a single person and case. One table gives a case summary based on the principal offence, and another holds a record for each offence within the case. As part of Data First, records have been deidentified and deduplicated using our probabilistic record linking package, Splink, so that a unique identifier is assigned to all records believed to relate to the same person. This allows for longitudinal analysis and investigation of repeat appearances, and opens up the potential to better understand court users and to build evidence on, for example, patterns associated with prolific offending and what works to reduce reoffending. The Ministry of Justice Data First linking dataset can be used in combination with this and other Data First datasets to join up administrative records about people from across justice services, increasing understanding of users’ interactions, pathways and outcomes. Cases can also be linked directly to cases appearing in the Data First magistrates’ courts defendant dataset.
|