QLFS data processing
The data processing pipeline has three stages, each written in Python:
- extract: where we get a copy of the file from an appropriate source
- transform: where we convert it into a simpler form by selecting rows, filtering columns, and converting formats to suit our needs
- prepare: where we build files which will directly drive our visualisations. These may be summarised or transposed data, or in a completely different format (e.g. JSON).
Data sources
The source for this data is the Labour Market Survey. It is pulled from NOMIS, via the LMS extract in the YFF Data Pipelines repository.
We extract the data of interest to a monthly rolling file.
Employment status processing
For the A06 figures, the transform script extracts the following measures from the 'People' sheet (i.e. not split by gender):
JN6B
: Total not in full-time education, 16-24, SA

AGNJ
: Employed level not in FTE, 16-24, SA

AGOL
: Unemployed level not in FTE, 16-24, SA

AGPM
: Economically inactive level not in FTE, 16-24, SA

AIWI
: Employed rate not in FTE, 16-24, SA

AIXT
: Unemployed rate not in FTE, 16-24, SA

AIYU
: Economically inactive rate not in FTE, 16-24, SA

JN62
: Total in full-time education, 16-24, SA

AGNT
: Employed level in FTE, 16-24, SA

AGOU
: Unemployed level in FTE, 16-24, SA

AGPV
: Economically inactive level in FTE, 16-24, SA

AIXB
: Employed rate in FTE, 16-24, SA

AIYC
: Unemployed rate in FTE, 16-24, SA

AIZD
: Economically inactive rate in FTE, 16-24, SA
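As a rough illustration of the column selection, the sketch below keeps the period plus a handful of series and renames them. It assumes the sheet has already been read into a list of dicts keyed by series code; the `period` key, the output column names and the sample values are all hypothetical, and only three of the codes are shown.

```python
# Hypothetical mapping from series code to a readable column name.
# Only three codes are listed here; the real script would cover them all.
MEASURES = {
    "JN6B": "not_in_fte_total",
    "AGNJ": "not_in_fte_employed_level",
    "AGOL": "not_in_fte_unemployed_level",
}


def select_measures(rows, measures=MEASURES, period_key="period"):
    """Keep the period column plus the listed series, renamed via `measures`."""
    return [
        {
            period_key: row[period_key],
            **{name: row[code] for code, name in measures.items()},
        }
        for row in rows
    ]


# Illustrative values only, not real survey figures.
sheet_rows = [
    {"period": "Jan-Mar 2023", "JN6B": 10, "AGNJ": 7, "AGOL": 1, "XXXX": 99},
]
selected_rows = select_measures(sheet_rows)
```

Any series not named in the mapping (here the made-up `XXXX`) is dropped.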
We then select every third period, starting at the most recent line, to avoid overlapping quarters. We also convert the quarter from a string to the date representing the start of the quarter (e.g. Jan-Mar 2023 is converted to a datetime object for 1 Jan 2023).
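The two steps above could look something like the following sketch, which uses only the standard library. The function names are hypothetical, rows are assumed to be ordered oldest first, and for a quarter that spans a year boundary (e.g. Nov-Jan 2023) the year in the label is assumed to belong to the final month.

```python
from datetime import datetime

MONTHS = {
    "Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6,
    "Jul": 7, "Aug": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dec": 12,
}


def quarter_start(label):
    """Convert a label such as 'Jan-Mar 2023' to the first day of the quarter."""
    months, year = label.split()
    start, end = (MONTHS[name] for name in months.split("-"))
    # Assumption: the year refers to the end month, so a quarter that
    # wraps around December starts in the previous year.
    start_year = int(year) - 1 if start > end else int(year)
    return datetime(start_year, start, 1)


def non_overlapping(rows):
    """Keep every third row counting back from the last (most recent) row."""
    return rows[::-1][::3][::-1]


labels = [
    "Nov-Jan 2023", "Dec-Feb 2023", "Jan-Mar 2023",
    "Feb-Apr 2023", "Mar-May 2023", "Apr-Jun 2023",
]
selected = non_overlapping(labels)
dates = [quarter_start(label) for label in selected]
```

With these six rolling periods, only Jan-Mar 2023 and Apr-Jun 2023 are kept, so no month is counted twice.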
Finally, we save a CSV of unemployment by education status for further processing.
Long-term unemployment status processing
This processing is similar, with the following alterations:
We extract the levels of unemployment, unemployment for 6 to 12 months and unemployment over 12 months, as well as the over-12-month rate, for both ages 16-17 and ages 18-24.
YBVH
: Age 16 to 17 unemployed level, SA

YBXG
: Age 16 to 17 unemployed 6 to 12 months level, SA

YBXJ
: Age 16 to 17 unemployed over 12 months level, SA

YBXM
: Age 16 to 17 unemployed over 12 months rate, SA

YBVN
: Age 18 to 24 unemployed level, SA

YBXV
: Age 18 to 24 unemployed 6 to 12 months level, SA

YBXY
: Age 18 to 24 unemployed over 12 months level, SA

YBYB
: Age 18 to 24 unemployed over 12 months rate, SA
We convert the quarters as described above, and then combine the total unemployment and over-12-month levels across the two age ranges to produce aggregated figures for ages 16-24. We then calculate the resulting rate by simple division.
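A minimal sketch of the aggregation follows. The field names are hypothetical, and it assumes the rate is the over-12-month level as a percentage of the total unemployed level; the actual script may define or scale it differently.

```python
def aggregate_16_to_24(row):
    """Combine the 16-17 and 18-24 levels and derive a 16-24 long-term rate."""
    total = row["unemployed_16_17"] + row["unemployed_18_24"]
    over_12 = row["over_12_months_16_17"] + row["over_12_months_18_24"]
    # Assumption: rate = over-12-month level / total unemployed level, in percent.
    rate = 100 * over_12 / total if total else None
    return {
        "unemployed_16_24": total,
        "over_12_months_16_24": over_12,
        "over_12_months_rate_16_24": rate,
    }


# Illustrative values only, not real survey figures.
example = aggregate_16_to_24({
    "unemployed_16_17": 100,
    "unemployed_18_24": 400,
    "over_12_months_16_17": 10,
    "over_12_months_18_24": 90,
})
```

Note that the combined rate has to be recomputed from the summed levels; averaging the two published age-band rates would weight the smaller group too heavily.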
Finally, we save the last three years to a CSV of long-term unemployment data as before.
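One way to trim to the last three years and write the CSV is sketched below. The column names are hypothetical, "last three years" is assumed to be measured back from the most recent quarter in the data, and the output goes to an in-memory buffer here rather than a real file path.

```python
import csv
import io
from datetime import datetime


def last_three_years(rows, date_key="quarter_start"):
    """Keep rows dated within three years of the most recent row."""
    latest = max(row[date_key] for row in rows)
    # Quarter starts fall on the 1st of a month, so replace() is safe here.
    cutoff = latest.replace(year=latest.year - 3)
    return [row for row in rows if row[date_key] > cutoff]


def write_csv(rows, handle, fieldnames):
    """Write rows to an open text handle, with dates in ISO format."""
    writer = csv.DictWriter(handle, fieldnames=fieldnames)
    writer.writeheader()
    for row in rows:
        out = dict(row)
        out["quarter_start"] = row["quarter_start"].date().isoformat()
        writer.writerow(out)


# Illustrative quarterly rows from Jan 2019 to Apr 2023 with a dummy rate.
all_rows = [
    {"quarter_start": datetime(year, month, 1), "long_term_unemployment_rate": 0.0}
    for year in range(2019, 2024)
    for month in (1, 4, 7, 10)
    if (year, month) <= (2023, 4)
]
kept = last_three_years(all_rows)

buffer = io.StringIO()
write_csv(kept, buffer, ["quarter_start", "long_term_unemployment_rate"])
csv_text = buffer.getvalue()
```

With non-overlapping quarters, this keeps twelve rows.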