The Research, Analysis & Evaluation Unit found that one of the first hurdles in analyzing the data is distinguishing the number of individuals offending from the number of cases reported. The lack of a unique client identifier in the Connecticut Criminal Justice System creates serious problems with data accuracy and appropriate record keeping for sentencing purposes. Although the identification of individuals going through the criminal justice system is less than perfect, if fingerprints have been taken and the person has served jail time, the SPBI number provides a positive identification, even when later records contain misspellings or incorrect dates of birth (DOB).

When an arrest involves no fingerprints and the individual already has an SPBI number, records with the same name and DOB will appear in the file, but without fingerprints the identification is not positive. Until fingerprinting, together with assignment of a new SPBI number or attachment to an existing one, is a routine part of every arrest, relying on SPBI alone will lead to underreporting of arrest activity.
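The underreporting effect can be illustrated with a minimal sketch. The records, field names, and fallback rule below are hypothetical and are not taken from any agency file layout; the sketch only shows how counting by SPBI alone drops arrests that lack a fingerprint-backed identifier, and how a name/DOB fallback yields person groups that are not positively identified.

```python
from collections import defaultdict

# Hypothetical arrest records; field names are illustrative only.
arrests = [
    {"case": "A1", "spbi": "100234", "name": "John Smith", "dob": "1980-03-12"},
    {"case": "A2", "spbi": "100234", "name": "Jon Smith", "dob": "1980-03-21"},
    {"case": "A3", "spbi": None, "name": "John Smith", "dob": "1980-03-12"},
    {"case": "A4", "spbi": None, "name": "Maria Lopez", "dob": "1975-07-01"},
]

def person_key(record):
    """Return (key, is_positive): SPBI when present, else a name/DOB fallback."""
    if record["spbi"]:
        return ("spbi", record["spbi"]), True
    return ("name_dob", record["name"].strip().lower(), record["dob"]), False

def summarize(records):
    """Group case numbers by person key and note which keys are positive IDs."""
    cases_by_person = defaultdict(list)
    positive = {}
    for record in records:
        key, is_positive = person_key(record)
        cases_by_person[key].append(record["case"])
        positive[key] = is_positive
    return cases_by_person, positive

cases_by_person, positive = summarize(arrests)
total_cases = len(arrests)
# Arrests that would be counted if only SPBI-identified records were used.
spbi_only_cases = sum(len(c) for k, c in cases_by_person.items() if positive[k])
print(total_cases, spbi_only_cases, len(cases_by_person))
```

In this toy data, SPBI alone accounts for two of the four arrests. Case A3 may well belong to the same person as the SPBI-identified cases, but the sketch keeps it in a separate, non-positive group because nothing ties it to the fingerprint record.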

Based on the Burglary Conviction data provided to date from Judicial-Court Operations System (CRMVS)...
Over the past five years:
  1. How many individuals have been charged with burglary?
  2. How many have been convicted of burglary?
  3. Of those convicted, what was the degree of burglary?
  4. How many have prior convictions?
  5. How many have prior convictions for burglary?
  6. What was their age at first conviction?
  7. What was their age at the current conviction?
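As a rough sketch of how these questions could be answered once records are tied to individuals, the example below works on a small hypothetical table with one row per charge. The field names, the five-year window dates, and the rule that the "current" conviction is the most recent burglary conviction in the window are all assumptions made for illustration, not the CRMVS layout or the Unit's method.

```python
from collections import Counter
from datetime import date

# Assumed five-year analysis window (hypothetical dates).
WINDOW_START, WINDOW_END = date(2005, 1, 1), date(2009, 12, 31)

# Hypothetical data: one row per charge, "date" is the disposition date.
records = [
    {"person": "P1", "dob": date(1980, 5, 1), "offense": "larceny",
     "degree": None, "convicted": True, "date": date(1999, 6, 15)},
    {"person": "P1", "dob": date(1980, 5, 1), "offense": "burglary",
     "degree": 3, "convicted": True, "date": date(2003, 2, 10)},
    {"person": "P1", "dob": date(1980, 5, 1), "offense": "burglary",
     "degree": 2, "convicted": True, "date": date(2008, 9, 20)},
    {"person": "P2", "dob": date(1988, 11, 20), "offense": "burglary",
     "degree": 3, "convicted": True, "date": date(2007, 3, 5)},
    {"person": "P3", "dob": date(1970, 1, 15), "offense": "burglary",
     "degree": 1, "convicted": False, "date": date(2006, 8, 1)},
]

def age_on(dob, when):
    """Age in completed years on a given date."""
    return when.year - dob.year - ((when.month, when.day) < (dob.month, dob.day))

def in_window(r):
    return WINDOW_START <= r["date"] <= WINDOW_END

def burglary_summary(rows):
    burglary = [r for r in rows if r["offense"] == "burglary" and in_window(r)]
    charged = {r["person"] for r in burglary}
    profiles = {}
    for person in {r["person"] for r in burglary if r["convicted"]}:
        # All convictions for this person, oldest first, inside or outside the window.
        convictions = sorted(
            (r for r in rows if r["person"] == person and r["convicted"]),
            key=lambda r: r["date"],
        )
        # Current conviction: latest burglary conviction inside the window.
        current = max(
            (r for r in convictions if r["offense"] == "burglary" and in_window(r)),
            key=lambda r: r["date"],
        )
        priors = [r for r in convictions if r["date"] < current["date"]]
        profiles[person] = {
            "degree": current["degree"],
            "prior_convictions": len(priors),
            "prior_burglary_convictions": sum(r["offense"] == "burglary" for r in priors),
            "age_first_conviction": age_on(current["dob"], convictions[0]["date"]),
            "age_current_conviction": age_on(current["dob"], current["date"]),
        }
    return charged, profiles

charged, profiles = burglary_summary(records)
degree_counts = Counter(p["degree"] for p in profiles.values())
print(len(charged), len(profiles), dict(degree_counts))
```

The counts are per person rather than per case, which is why a reliable person identifier matters: if P1's three rows were split across two identities, both the prior-conviction counts and the age at first conviction would be wrong.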
Currently we are evaluating different statistical and analytics software tools to determine standards for how we analyze data.
DMHAS recently used Link King, a free module, to link five years of DOC releases, arrest records, and substance abuse treatment clients in its recidivism study. Link King is a SAS tool that requires a base SAS product license (version 9.0 or higher), but no SAS programming experience is necessary. Link King uses a number of person-identifying elements to make a probabilistic estimate of whether two or more records belong to the same person. The results depend on which data items are available and on the threshold (cut point) chosen to set the tolerance for how many matches are made. Record pairs that score in the mid-range of the threshold can be examined in a deterministic manner (i.e., visual inspection); for instance, the DOB differs but appears to be transposed (21 vs. 12), or gender is switched (male vs. female). This review is very time consuming, so the mid-range needs to be kept tight, within 2 to 3 points. See documentation for more details on Link King.
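The general idea of scoring a record pair and routing a narrow mid-range band to manual review can be sketched as follows. This is not Link King's algorithm: the weights, cut point, band width, and the half credit for a transposed DOB are hypothetical values chosen only to illustrate the approach, and name similarity uses Python's standard `difflib.SequenceMatcher` as a stand-in for whatever comparison a production tool applies.

```python
from difflib import SequenceMatcher

# Hypothetical weights and cut points, for illustration only.
WEIGHTS = {"last_name": 4.0, "first_name": 3.0, "dob": 5.0, "gender": 1.0}
MATCH_CUTOFF = 11.0   # at or above this score: accept as a match
REVIEW_BAND = 2.0     # scores within this many points below the cutoff: manual review

def is_transposition(a, b):
    """True if b equals a with exactly one pair of adjacent characters swapped."""
    if len(a) != len(b) or a == b:
        return False
    diffs = [i for i in range(len(a)) if a[i] != b[i]]
    return (len(diffs) == 2 and diffs[1] == diffs[0] + 1
            and a[diffs[0]] == b[diffs[1]] and a[diffs[1]] == b[diffs[0]])

def name_similarity(a, b):
    """Similarity in [0, 1] between two names, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def score(rec_a, rec_b):
    total = 0.0
    for field in ("last_name", "first_name"):
        total += WEIGHTS[field] * name_similarity(rec_a[field], rec_b[field])
    if rec_a["dob"] == rec_b["dob"]:
        total += WEIGHTS["dob"]
    elif is_transposition(rec_a["dob"], rec_b["dob"]):
        total += WEIGHTS["dob"] * 0.5   # partial credit for a likely keying error
    if rec_a["gender"] == rec_b["gender"]:
        total += WEIGHTS["gender"]
    return total

def classify(rec_a, rec_b):
    s = score(rec_a, rec_b)
    if s >= MATCH_CUTOFF:
        return "match"
    if s >= MATCH_CUTOFF - REVIEW_BAND:
        return "review"
    return "non-match"

a = {"last_name": "Smith", "first_name": "John", "dob": "19800312", "gender": "M"}
b = {"last_name": "Smith", "first_name": "John", "dob": "19800321", "gender": "M"}
c = {"last_name": "Lopez", "first_name": "Maria", "dob": "19750701", "gender": "F"}

print(classify(a, dict(a)), classify(a, b), classify(a, c))
```

With these assumed values, the pair whose DOB differs only by a transposed day (12 vs. 21) scores 10.5 and lands in the review band, while an exact duplicate is accepted and an unrelated record is rejected. Widening `REVIEW_BAND` sends more pairs to visual inspection, which is the cost the text warns about.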
DMHAS also uses SPSS, a predictive analytics package, to anticipate change, manage both daily operations and special initiatives more effectively, and realize positive, measurable benefits. By incorporating predictive analytics into daily operations, the agency is able to direct and automate decisions to meet business goals and achieve measurable competitive advantage. This tool requires less programming skill and can be used to derive results quickly.
CSSD uses a “home-grown” matching algorithm that is likely very similar to “Link King” and has used it successfully in importing both DOC and DPS files.  The document below describes the methods used in association with the export of criminal history data from the CT Department of Public Safety, including ingest and normalization of data, algorithms used to associate records with CSSD clients, and automation of data retrieval for analysis.
