Summary: | The Ministry of Justice (MoJ) Data First Synthetic Data Project aims to improve engagement with Data First datasets by making synthetic versions of their content available. This enables more rapid development of research proposals and thereby enhances the potential for linked administrative data to improve understanding and outcomes across justice systems. The project has led the development of two components: a dataset generation platform and an initial release of low-fidelity synthetic data tables. This study includes a synthetically generated version of the Ministry of Justice Data First Magistrates' Courts dataset. Synthetic versions of all 43 tables in the MoJ Data First data ecosystem have been created. These versions can be used and joined in the same way as the real datasets. As well as underpinning training, synthetic datasets should enable researchers to explore research questions and to design research proposals before submitting them for approval. The code created during this exploration and design process should then enable initial results to be obtained as soon as data access is granted. The Ministry of Justice Data First magistrates' court defendant dataset provides data on defendants' appearances in criminal cases before magistrates' courts (including Youth Courts) in England and Wales from 2011. It has been extracted from the LIBRA management information system, used by His Majesty's Courts and Tribunals Service (HMCTS) to manage cases within magistrates' courts. Please note: recent cases are now usually recorded on a new case management system, Common Platform. These cases are not included, and therefore coverage of magistrates' cases in this dataset will decrease substantially over time (particularly throughout 2021 and 2022). No Single Justice Procedure cases are included in this dataset. Appropriate coverage and time period will be considered in assessing research project applications.
Information on defendants' characteristics, the offences charged, key case dates, processes and outcomes is included: for example, age, gender, ethnicity, offence category, hearings, plea, conviction, sentencing and committal to Crown Court. Each record in the dataset gives information about a single person and case. There are two tables: one gives a case summary based on the principal offence, and the other has a record for each offence within the case. Information on magistrates' cases which do not concern a criminal offence in law (such as some breaches and applications for civil orders) is included. As part of Data First, records have been deidentified and deduplicated using our probabilistic record linkage package, Splink, so that a unique identifier is assigned to all records believed to relate to the same person, allowing for longitudinal analysis and investigation of repeat appearances. This opens up the potential to better understand court users and to build evidence on, for example, patterns associated with prolific offending and what works to reduce reoffending. The Ministry of Justice Data First linking dataset can be used in combination with this and other Data First datasets to join up administrative records about people from across justice services, increasing understanding of users' interactions, pathways and outcomes. Cases can also be linked directly to cases appearing in the Data First Crown Court defendant dataset.
|