Help for the Data Viewer

View these demonstrations:

Play video introduction

Play video using the Data Viewer

Play video printing from Data Viewer

Play video Sampling from Subpopulations

Data Viewer enables you to sample and analyse data from the Census At School NZ databases. It is all done online, meaning you don’t have to install any software.

Everything is embedded in the PPDAC enquiry cycle. When you first visit Data Viewer you only see the Problem, Plan, and Data sections. You do not see the Analysis options until after you have selected your data.

Follow the PPDAC cycle through:

    Problem. What is your question?

    Plan your investigation. What variables will you need to answer your question?

    Data. We’ve already done the hard part of collecting the data. Click on “Get my sample”.

    Analysis. Now that you have your sample, pick variables from the dropdown lists that will help you answer your question, then click “Do Analysis” to get plots and tables.

    Conclusion. What do those plots tell you? We’ll leave this for you to work out.

Contents of the demonstration videos

NOTES TO TEACHERS

The designers of this system foresaw it primarily being used as follows:

  • Students, either individually or in groups, would use it to perform an investigation using one of the available databases as a resource
  • Before approaching the system, the students would be aware of what variables the particular database contained and what they measured (databases and variables)
  • Sparked by this knowledge, the students would come up with their own investigative question, and state what variables in the database would enable them to address it
  • Having done this up-front thinking, they would go to the computer, take an appropriate sample, and perform an analysis aimed at answering their question
  • The webpage would capture the background thinking and the results of the analysis
  • Students would then print off the page and write a “story” interpreting what they had found
  • Because everything happens so fast, students should be able to repeat the PPDAC cycle several times in a typical lesson period

Many other forms of use are possible

  • The contents of the “I wonder” and “My variables are …” fields are not used by the system at all. Their presence relates entirely to the usage scenario above. You can do lots of analyses on the fly while ignoring these fields entirely
  • If the “I am a year 12 student” box is checked, some plots will include additional inferential annotations
  • Until students are familiar with the system, they should just take a single random sample, as in Videos 1-3, so that they do not have to learn too many things at once
  • It is possible, however, to sample from subpopulations defined in terms of up to 4 variables (see Video 4)
  • “Refresh Data” (at the bottom) repeats the last analysis using a new sample
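
As a rough illustration of what sampling from a subpopulation means, the sketch below filters a made-up set of records on a few variables and then draws a random sample from what is left. It is not the Data Viewer's own code: the function name `sample_subpopulation`, the variable names `gender`, `year` and `height_cm`, and the records themselves are all invented for this example. Only the limit of 4 defining variables comes from the notes above.

```python
import random

MAX_CONDITIONS = 4  # the notes above allow up to 4 defining variables


def sample_subpopulation(records, conditions, n, rng=None):
    """Draw a simple random sample of up to n records matching every condition.

    `conditions` maps a variable name to the value it must take.
    """
    if len(conditions) > MAX_CONDITIONS:
        raise ValueError("a subpopulation may be defined by at most 4 variables")
    rng = rng or random.Random()
    # Keep only the records that satisfy all of the conditions.
    subpopulation = [
        r for r in records
        if all(r.get(var) == value for var, value in conditions.items())
    ]
    # Sample without replacement; never ask for more than is available.
    return rng.sample(subpopulation, min(n, len(subpopulation)))


# Invented records standing in for a database (hypothetical variables).
rng = random.Random(1)
records = [
    {
        "gender": rng.choice(["boy", "girl"]),
        "year": rng.choice([9, 10, 11, 12]),
        "height_cm": round(rng.gauss(160, 10), 1),
    }
    for _ in range(1000)
]

sample = sample_subpopulation(records, {"gender": "girl", "year": 12}, n=25, rng=rng)
print(len(sample), "records sampled from the subpopulation")
```

Calling the function again with the same conditions gives a fresh sample from the same subpopulation, which is loosely what “Refresh Data” does for the last analysis.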

Why are we sampling?

  • When we are sampling, the databases are playing the role of the real world and what the students are doing is analogous to going out into the field and collecting data. Students need to be encouraged to see their involvement with Data Viewer in this light
  • With such usage students can learn about how samples can be used to make inferences about populations
  • If they need further motivation to sample rather than look at everything, one approach is to tell them that collecting data costs money and give them a budget for their investigation
  • Even following exactly the same plan, different students will obtain different samples and, therefore, different data. Teachers can use this to help students understand sampling variation and its effects
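
Sampling variation can also be shown away from the Data Viewer with a few lines of code. The sketch below is an illustration only: the “population” is a list of invented heights, and the sample size of 30 and the five repeats are arbitrary choices. Each repeat follows the same plan, yet the sample means differ.

```python
import random
import statistics


def sample_mean(population, n, rng):
    """Mean of a simple random sample of size n, drawn without replacement."""
    return statistics.mean(rng.sample(population, n))


rng = random.Random(2024)

# Invented "population": 5000 made-up heights in cm standing in for a database.
population = [rng.gauss(160, 10) for _ in range(5000)]

# Five students follow the same plan (a random sample of 30) and each gets a
# different sample, so each gets a slightly different mean.
means = [sample_mean(population, 30, rng) for _ in range(5)]

print("Population mean:", round(statistics.mean(population), 1))
for student, m in enumerate(means, start=1):
    print(f"Student {student} sample mean:", round(m, 1))
```

Increasing the sample size from 30 to, say, 300 and rerunning is a quick way to show that the sample means then sit closer together.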

Types of Plots and Summaries

  • As its name suggests, this tool is mainly about “viewing data”, i.e. using plots to see what the data are saying
  • Summary statistics can also be obtained (use the “Add Summaries” checkbox just to the right of “Do Analysis”) but this should be used only after students have spent time on reasoning from their plots
  • Data Viewer decides what type of plot to draw for a variable by checking whether it has been typed as “numeric” or “categorical” (i.e. “treat this variable as one that just defines group membership”). For example, year at school is treated as categorical although sometimes you might want to treat it as numeric. Currently it is not possible for the user to change the internal “numeric”/“categorical” coding
  • For a single numeric variable Data Viewer displays a dot plot with a box plot superimposed (dot+box plot)
  • For a single categorical variable, Data Viewer produces tables of counts and percentages and a bar chart of the percentages
  • For two numeric variables, Data Viewer produces a scatter plot
  • For a numeric variable and a categorical variable, Data Viewer produces a dot+box plot for each group defined by the categorical variable
  • For two categorical variables, Data Viewer produces tables of counts and percentages and side-by-side bar charts of the percentages
  • The usual pattern for incorporation of a third variable is to produce separate two-variable plots for every group defined by the third variable
  • The one exception to this is the case of three numeric variables where the third variable is coded onto the two-variable scatter plot using colour
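
The rules in the list above can be summarised as a small lookup. The sketch below is one way of writing them down, not the R code that Data Viewer actually runs. The function name `choose_display` and the wording of the returned descriptions are invented here. The notes do not say what happens when the third variable is numeric but the first two are not both numeric, so the sketch simply applies the “separate plots for every group” rule in that case.

```python
from typing import Sequence

VALID_TYPES = ("numeric", "categorical")


def choose_display(var_types: Sequence[str]) -> str:
    """Describe the display for one to three variables, given each one's type."""
    if not 1 <= len(var_types) <= 3:
        raise ValueError("expected one, two, or three variables")
    for t in var_types:
        if t not in VALID_TYPES:
            raise ValueError(f"unknown variable type: {t!r}")

    # The first one or two variables determine the basic display.
    main = list(var_types[:2])
    n_numeric = main.count("numeric")

    if len(main) == 1:
        base = "dot+box plot" if n_numeric == 1 else "tables and bar chart"
    elif n_numeric == 2:
        base = "scatter plot"
    elif n_numeric == 1:
        base = "dot+box plot for each group"
    else:
        base = "tables and side-by-side bar charts"

    if len(var_types) < 3:
        return base
    # The one exception: three numeric variables share a single scatter plot.
    if all(t == "numeric" for t in var_types):
        return "scatter plot with third variable shown by colour"
    # Otherwise the two-variable display is repeated for each group.
    return base + ", repeated for each group of the third variable"


print(choose_display(["numeric"]))
print(choose_display(["categorical", "numeric"]))
print(choose_display(["numeric", "numeric", "numeric"]))
```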

The Data Viewer uses a simple PHP template to output a webpage. Dynamic bits and bobs are added by using jQuery.

Behind the scenes is the CensusAtSchool New Zealand database. We use some R code written by Chris Wild, Steve Taylor, and Dineika Chandrananda to decide how to analyse the variables selected. The results are then returned to the browser for viewing.