Data issues in Wave 5
Issue paper No. 9
The first four waves of LSAC data included geography items such as postcodes and various levels of the Australian Standard Geographical Classification (ASGC) that were generated from geocoding of the residential addresses of study families. In Wave 1 the geocodes were based on global positioning system (GPS) coordinates obtained by I-view interviewers at the time of interview, while Waves 2 to 4 were based on residential addresses collected by Australian Bureau of Statistics (ABS) interviewers.
In July 2011 the ABS introduced a new statistical geography framework called the Australian Statistical Geography Standard (ASGS) to replace the ASGC. The main purpose of the ASGS is to disseminate geographically classified statistics. It provides a common framework of statistical geography enabling the publication of statistics that are comparable and spatially integrated.
Improved data sources and technology have allowed the ABS to create a geography that is better optimised for the release of ABS statistics. The new, robust and stable structure means that changes over time are minimised, which assists in maintaining quality time series data. In addition, the ASGS, together with improved methods of calculation, allows for more accurate correspondences to translate ABS data to non-ABS administrative and geographic regions.
For further information on this new standard, refer to ABS catalogue no. 1270.0.55.001, Australian Statistical Geography Standard (ASGS): Volume 1 - Main Structure and Greater Capital City Statistical Areas, July 2011.
To take advantage of this more comprehensive, flexible and consistent way of defining Australia's statistical geography, the ASGS will be included from Wave 5 onwards. To ensure that there is a common geographical standard across waves, the decision was made to:
- dual-code Wave 5 residential addresses to ASGC and ASGS, enabling comparison of old and new classifications; and
- back-code Waves 1-4 residential addresses to the new standard ASGS.
The new variables added to the general release file for each wave are shown in Table 1.
|Without age variable name||Label|
|gccsa||Australian Statistical Geography Standard (ASGS)-Edition 2011-Greater Capital City Statistical Area Structure|
|sos||Australian Statistical Geography Standard (ASGS)-Edition 2011-Section of State|
|sa22011||Australian Statistical Geography Standard (ASGS)-Edition 2011-SA2|
|sa32011||Australian Statistical Geography Standard (ASGS)-Edition 2011-SA3|
|sa42011||Australian Statistical Geography Standard (ASGS)-Edition 2011-SA4|
|absra||Australian Statistical Geography Standard (ASGS)-Edition 2011-Remoteness Area (ABS)|
Most addresses were auto-coded using ASGS address coders, which link addresses to geographical areas. However, some addresses were incomplete, contained spelling errors or, more rarely, were identical to another address in the same suburb. These addresses were manually cleaned to reduce the number of records with missing geocodes. Even after these steps, some records could not be geocoded to the ASGS at SA2 level. The numbers for Waves 1-5 are provided in Table 2.
|Wave||Number of responding records not coded to SA2|
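A tally of the kind reported in Table 2 can, in principle, be reproduced by counting the responding records in each wave that have no SA2 code. The sketch below is illustrative only: it assumes a long-format list of records with hypothetical fields `wave` and `sa22011`, and assumes that a missing geocode is stored as `None`. The records shown are invented, and the actual LSAC files may use a different layout and missing-value code.

```python
from collections import Counter

def uncoded_by_wave(records):
    """Count records per wave that have no SA2 code (None = not geocoded)."""
    counts = Counter()
    for rec in records:
        if rec["sa22011"] is None:
            counts[rec["wave"]] += 1
    return dict(counts)

# Invented example records, not LSAC data; the SA2 values are placeholders.
records = [
    {"wave": 1, "sa22011": "SA2-A"},
    {"wave": 1, "sa22011": None},
    {"wave": 2, "sa22011": None},
    {"wave": 2, "sa22011": None},
    {"wave": 3, "sa22011": "SA2-B"},
]
missing = uncoded_by_wave(records)
```

Waves with no uncoded records simply do not appear in the result, so a caller wanting a full table would need to fill in zeros for them.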
To enable accurate coding to the ASGS, many addresses needed cleaning. As a result, some records now have Statistical Local Areas (SLAs) where there were none previously, and others have been coded to a different SLA.
The 2011 Census and SEIFA data are available in the new ASGS classifications. However, while it is possible to provide ASGS classifications for Waves 1 to 5, Census and SEIFA data for 2001 and 2006 are not available for the new ASGS classifications.
The first four waves of LSAC data include variables for the occupation of Parent 1 (P1) and Parent 2 (P2). In recent waves, the occupations of the Parent Living Elsewhere (PLE) and of the parents of P1, P2 and the PLE (i.e., the study child's grandparents) are also included. These were coded using the Australian Standard Classification of Occupations (ASCO), an occupational classification system that classifies jobs according to skill level and skill specialisation. The ANU4 scale, a scale of occupational status calculated using ASCO, is also provided to data users for Waves 1 to 4.
Since Wave 2, LSAC occupation data has also been coded to the newer occupation standard, which is the Australian and New Zealand Standard Classification of Occupations (ANZSCO). ANZSCO was introduced in 2006 and was a product of a development program between the ABS, Statistics New Zealand and the Australian Government Department of Employment and Workplace Relations.
For further information on this standard, refer to ABS catalogue no. 1220.0, ANZSCO - Australian and New Zealand Standard Classification of Occupations, First Edition, 2006.
The most recent edition of ASCO was released in 1997, which limits its applicability to the current Australian workforce. Therefore, from Wave 5 onwards, only ANZSCO codes will be produced. To enable the transition to ANZSCO, the study has:
- added ANZSCO codes to the Waves 2-4 data files, as these codes were already generated during these waves, and is investigating the possibility of providing ANZSCO for Wave 1 through correction code;
- replaced the ANU4 scale from Wave 5 onwards with the Australian Socioeconomic Index 2006 (AUSEI06), the latest in the series of occupational status scales developed by the ANU (see footnote 1); and
- provided AUSEI06 for Waves 2-4 and is investigating the possibility of adding to Wave 1 through correction code.
The new variables added to the general release file are in Table 3.
|pw08_5||Current occupation (ANZSCO code)|
|pw08_6||Current or most recent occupation (ANZSCO code)|
|pw08_7||Current occupation (AUSEI06 code)|
The SEP variable (Z score for socioeconomic position among all LSAC families) was calculated for Waves 1 to 4 using ASCO classifications. Because ASCO is unavailable for Wave 5, the SEP variable has not been calculated and is therefore not available in the Wave 5 dataset. Further work will investigate ways of calculating the SEP using the ANZSCO classifications, and a new or revised SEP variable may become available in the future.
ACIR data issue (all waves)
Analysis of the Australian Childhood Immunisation Register (ACIR) data previously supplied showed that immunisation rates in LSAC did not reflect national rates. Investigation with the data provider found that the previous data extraction had not captured all required records. This is currently being rectified, but the corrected data will not be ready before the Wave 5 release. The ACIR dataset will therefore not be included in the Wave 5 release, and data users are asked not to use any previous version of the ACIR data.
Changes to household files
Addition of Person Type to the files
In Wave 5, Person Type (f21a) is available on the Waves 1 to 5 files for the first time, with a code attached to each household member and wave. This item is derived from information collected in the P1 interview and amended where needed during processing. A list of the person types and a description of each is shown in Table 4.
|1||Study child||The study children are the focus of the study, and consist of two cohorts (B cohort aged 8-9 years and K cohort aged 12-13 years in Wave 5).|
|2||Parent 1||Parent or guardian who provides the greatest role in caring for the study child and is therefore likely to be the most reliable informant on the health, development and care of the study child. Parent 1 must live with the study child.|
|3||Parent 2||Study child's other resident parent/guardian, or the married or de facto partner of Parent 1. Another person in the household can be considered as Parent 2 if they are acting as a significant parental figure who helps to care for the child, and is a stable member of the child's residential family unit.|
|4||Usual resident||A person other than the study child and the study child's resident parent(s) who usually lives in the study child's house (e.g., siblings of the study child).|
|5||Nonresident||A person other than a parent who has previously been a resident of the household, but no longer lives in the same household as the study child.|
|6||Parent living elsewhere||A parent of the study child who does not live in the same household as Parent 1 and the study child. This person may previously have been a Parent 2 (or a Parent 1).|
|7||Temporary member||Includes people who, in between waves, joined the study child's household for more than 3 months but have since left.|
|8||Empty row||In the household files row/member number 3 is always used for Parent 2 at Wave 1. When there was no P2 in the house at Wave 1, this row is left as an empty row. Also used when duplicate members are picked up.|
|9||Deceased||A person who was previously recorded as a resident of the household, but has died.|
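As an illustration of how the person type codes in Table 4 might be used, the sketch below maps each code to its label and picks out the household members who currently live with the study child. The row layout, the field name `member` and the function `current_residents` are hypothetical, and treating codes 1 to 4 as "resident" is an assumption drawn from the Table 4 descriptions rather than an official derivation.

```python
# Person type codes and labels as listed in Table 4.
PERSON_TYPE = {
    1: "Study child",
    2: "Parent 1",
    3: "Parent 2",
    4: "Usual resident",
    5: "Nonresident",
    6: "Parent living elsewhere",
    7: "Temporary member",
    8: "Empty row",
    9: "Deceased",
}

# Assumption: codes 1-4 describe people currently living with the study child.
RESIDENT_CODES = {1, 2, 3, 4}

def current_residents(household_rows):
    """Return member numbers whose person type marks them as resident."""
    return [row["member"] for row in household_rows
            if row["f21a"] in RESIDENT_CODES]

# Invented household for one wave, not LSAC data.
rows = [
    {"member": 1, "f21a": 1},
    {"member": 2, "f21a": 2},
    {"member": 3, "f21a": 8},  # empty row: no Parent 2 at Wave 1
    {"member": 4, "f21a": 4},
    {"member": 5, "f21a": 6},
]
residents = current_residents(rows)
labels = [PERSON_TYPE[row["f21a"]] for row in rows]
```

Filtering on the code rather than on row position avoids counting the reserved Parent 2 row (member number 3) when it is an empty row.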
Inclusion of two waves of household data in the PLE person grid
The person grid is a list of people associated with the study child, together with their demographic details; some members may still reside with the study child and others may have left. The Wave 5 Parent Living Elsewhere survey instrument included roll-forward person grid data from Wave 4, so two waves of household data are now available for ongoing responding PLEs. Including the Wave 4 details of a PLE's household in the survey instrument enables comparison of the PLE's household circumstances between waves.
Concordance between people on main and PLE person grids
The concordance between the main household and the PLE's household has been provided for the first time in Wave 5. This enables the identification of who is the same person between the two files, who is on the main file only, and who is on the PLE file only. Table 5 provides a list of variables provided in the concordance file.
|MID5||Wave 5 Main Household Member Number|
|PLEID5||Wave 5 PLE Household Member Number|
|HHTYPE_5||Wave 5 Household Type|
|CHHFLOOP||Wave 5 Combined Household Row Number|
The values for HHTYPE_5 are:
- 0 = Not present at Wave 5
- 1 = Wave 5 main household member only
- 2 = Wave 5 PLE household member only
- 3 = Wave 5 main and PLE household member
For example:
- If main household member number 4 was present at Wave 5, and that person was also present at Wave 5 in the PLE household, where they were recorded as member number 3, the variables that link these records would contain the following values: MID5 = 4; PLEID5 = 3; HHTYPE_5 = 3.
- If main household member number 4 was in the main household only at Wave 5, the values would be: MID5 = 4; PLEID5 = -9; HHTYPE_5 = 1.
- If PLE household member number 3 was in the PLE household only at Wave 5, the values would be: MID5 = -9; PLEID5 = 3; HHTYPE_5 = 2.
The values in MID5 and PLEID5 correspond to the member number in the data files, so this will enable you to find demographic information and link it to the files if required.
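The relationship between MID5, PLEID5 and HHTYPE_5 described above can be sketched as a small consistency check. This is a minimal illustration, not the study's processing code: the function names are hypothetical, the rows are taken from the three examples in the text, and it assumes that -9 is the only value used when a person has no member number in a household and that HHTYPE_5 = 0 corresponds to both member numbers being -9.

```python
NOT_PRESENT = -9  # value used in the examples for "no member number"

def household_type(mid5, pleid5):
    """Derive the HHTYPE_5 value implied by a pair of member numbers."""
    in_main = mid5 != NOT_PRESENT
    in_ple = pleid5 != NOT_PRESENT
    if in_main and in_ple:
        return 3  # main and PLE household member
    if in_main:
        return 1  # main household member only
    if in_ple:
        return 2  # PLE household member only
    return 0      # not present at Wave 5 (assumed: both numbers are -9)

def inconsistent_rows(rows):
    """Return rows whose stored HHTYPE_5 disagrees with the derived value."""
    return [r for r in rows
            if r["HHTYPE_5"] != household_type(r["MID5"], r["PLEID5"])]

# The three worked examples from the text.
concordance = [
    {"MID5": 4, "PLEID5": 3, "HHTYPE_5": 3},
    {"MID5": 4, "PLEID5": -9, "HHTYPE_5": 1},
    {"MID5": -9, "PLEID5": 3, "HHTYPE_5": 2},
]
problems = inconsistent_rows(concordance)
```

An empty `problems` list means every row's household type agrees with its member numbers; any rows returned would be worth inspecting before linking to the household files.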
Child report of whether at school
At the start of both the study child's audio-computer-assisted self interview (ACASI) module and the face-to-face Child Self-Report K (CSRK) module, the interviewer records whether the study child is attending school, using response options of Yes and No. If the study child doesn't attend a school, some questions about schooling are not asked. These questions are directly related to the school environment and therefore are not relevant to study children not attending school. Parent 1 is also asked a question about whether the child:
- attends a government school;
- attends a Catholic school;
- attends an independent or private school; or
- is not in school.
In total, 33 K cohort children were coded as not in school from the P1 interview, whereas 218 were coded as not at school from the child interview (either question). Table 6 shows that there were 191 records where the two interview components gave conflicting responses about whether the child was in school.
|Parent 1 (EDUC14)||Study child (ACASI02/CSRK02): In school||Not at school (either question)||No study child interview||Neither question answered||Total|
|Not at school||2||29||2||0||33|
|No P1 interview||4||0||0||0||4|
|Question not answered||1||0||1||0||2|
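The comparison behind Table 6 can be sketched as follows. This is an illustration under stated assumptions rather than the study's actual coding: the string labels stand in for the real response codes of EDUC14, ACASI02 and CSRK02, which are not given in this paper, and the function names are hypothetical. Following the "either question" column heading, the child is treated as not at school if either child module recorded "no".

```python
# Hypothetical labels for the three Parent 1 school-type answers.
P1_IN_SCHOOL = {"government", "catholic", "independent_or_private"}

def combine_child_reports(acasi02, csrk02):
    """Combine the two child modules: 'no' if either says no, else 'yes'.

    Returns None when neither question was answered.
    """
    answers = [a for a in (acasi02, csrk02) if a is not None]
    if not answers:
        return None
    return "no" if "no" in answers else "yes"

def attendance_conflict(p1_answer, child_answer):
    """Return True or False for a conflict, or None if a report is missing.

    p1_answer: a label in P1_IN_SCHOOL, or "not_in_school"
    child_answer: "yes" or "no" (attending school)
    """
    if p1_answer is None or child_answer is None:
        return None
    p1_in_school = p1_answer in P1_IN_SCHOOL
    child_in_school = child_answer == "yes"
    return p1_in_school != child_in_school

# Invented case: Parent 1 reports a government school, child module says no.
child = combine_child_reports("no", None)
conflict = attendance_conflict("government", child)
```

Returning `None` for missing reports keeps records without a P1 or child interview out of the conflict count instead of treating them as agreement.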
Table 7 crosstabulates possible reasons for the discrepancy against school type, as recorded in the P1 interview, for the 189 of these records in which Parent 1 reported a school type but the child interview recorded the child as not at school. Around 44% of the difference seems to be accounted for by the interview taking place on a weekend or in school holidays.
|School attended||Interview date in school holidays||Interview date on weekend (not school holidays)||Interview date is school day||Total|
|Independent or private school||12||7||22||41|
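The counts quoted around Tables 6 and 7 can be cross-checked with a little arithmetic. The sketch below uses only figures that appear in the text and tables, and assumes that the "No P1 interview" and "Question not answered" rows contribute no children to the "Not at school" column, as the surviving Table 6 rows show. The variable names are illustrative.

```python
# Figures from the text and the surviving rows of Table 6.
child_not_at_school_total = 218  # child interview, either question
p1_also_not_at_school = 29       # P1 "Not at school", child "Not at school"
p1_not_child_in_school = 2       # P1 "Not at school", child "In school"

# Children recorded as not at school by the child interview although
# Parent 1 reported a school type.
p1_in_child_not = child_not_at_school_total - p1_also_not_at_school

# All records where the two interview components disagree.
total_conflicts = p1_in_child_not + p1_not_child_in_school

# Table 7, "Independent or private school" row.
holidays, weekend, school_day, row_total = 12, 7, 22, 41
row_consistent = holidays + weekend + school_day == row_total

# Share of this one row interviewed on a non-school day. This is not the
# overall "around 44%" figure, which covers all school types.
non_school_day_share = (holidays + weekend) / row_total
```

Under these assumptions the subtraction gives the 189 records analysed in Table 7, and adding the two records that conflict in the other direction gives the 191 conflicting records reported for Table 6.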
To improve the quality of reporting in Wave 6, and to clear up any confusion, school attendance will be recorded in the same way in both the child interview and the Parent 1 interview. In the child interview the same response categories of government school, Catholic school, independent or private school, and not in school will be provided instead of Yes/No responses. This change is to make it clearer that the study is asking about usual school attendance and not whether school was attended on the current interview date. This point will also be further highlighted in interviewer training.
1 McMillan, J., Beavis, A., & Jones, F. L. (2009). The AUSEI06: A new socioeconomic index for Australia. Journal of Sociology, 45(2), 123-149.