Type: | Package |
Title: | Data Files Supporting "Scientific Research and Methodology" by Peter K. Dunn (2025) |
Version: | 1.0.1 |
Author: | Peter K. Dunn [aut, cre] |
Maintainer: | Peter K. Dunn <pdunn2@usc.edu.au> |
Description: | Provides most of the data files used in the textbook "Scientific Research and Methodology" by Dunn (2025, ISBN:9781032496726; forthcoming). |
License: | GPL-2 | GPL-3 [expanded from: GPL (≥ 2)] |
Language: | en-GB |
Encoding: | UTF-8 |
LazyData: | false |
Depends: | R (≥ 3.5.0) |
RoxygenNote: | 7.3.2 |
NeedsCompilation: | no |
Packaged: | 2025-05-26 05:29:45 UTC; pdunn2 |
Repository: | CRAN |
Date/Publication: | 2025-05-28 10:10:02 UTC |
AISsub
Description
Body measurements from athletes at the Australian Institute of Sport.
Usage
data(AISsub)
Format
A data frame with 202 rows (each athlete) and 6 columns:
- Sex
The sex of the athlete; one of F or M
- SSF
The sum of skin folds
- PBF
The percentage body fat
- Sport
The sport played by the athlete; one of BBall (basketball), Field, Gym (gymnastics), Netball, Rowing, Swim (swimming), T400m (track, further than 400m), Tennis, TPSprnt (track sprint events) or WPolo (water polo)
- Wt
The weight of the athlete, in kg
- Ht
The height, in cm
Source
OzDASL, available on-line at http://www.statsci.org/data/.
References
Telford, R. D. and Cunningham, R. B. (1991). Sex, sport, and body-size dependency of hematology in highly trained athletes. Medicine and Science in Sports and Exercise, 23(7):788–794.
Weight loss after treatment for anorexia
Description
Weight changes in girls with anorexia: two treatments.
Usage
data(Anorexia)
Format
A data frame with 72 rows and 3 columns:
- Treatment
The treatment type; one of CB (cognitive behavioural treatment), Control (the control group) or FT (family therapy)
- Before
Weight (in kg) before the anorexia treatment
- After
Weight (in kg) after the anorexia treatment
Source
D. J. Hand, F. Daly, A. D. Lunn, K. J. McConway, and E. Ostrowski (1994) A Handbook of Small Data Sets, London: Chapman and Hall. Dataset 285.
Vegetarianism and B12
Description
B12 deficiency in vegetarian and non-vegetarian women.
Usage
data(B12Diet)
Format
A data frame with 124 rows (one for each person) and 2 columns:
- B12
B12 deficiency; one of 1 (B12 deficient) or 2 (Not B12 deficient)
- Diet
The diet; one of 1 (Vegetarian) or 2 (non-vegetarian)
Source
Gammon, Cheryl S., Pamela R. von Hurst, Joan Coad, Rozanne Kruger, and Welma Stonehouse. 2012. Vegetarianism, Vitamin B12, and Insulin Resistance in a Group of Predominately Overweight/Obese South Asian Women. Nutrition 28: 20–24.
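Examples
The numeric codes described in the Format section can be replaced by readable labels before tabulating. This is a minimal sketch, not taken from the package: label_b12diet is a hypothetical helper, the label strings are paraphrased from the Format section, and the commented usage assumes the package providing B12Diet is already attached.

```r
# Hypothetical helper (not part of the package): attach labels to the
# 1/2 codes documented in the Format section.
label_b12diet <- function(df) {
  df$B12 <- factor(df$B12, levels = c(1, 2),
                   labels = c("Deficient", "Not deficient"))
  df$Diet <- factor(df$Diet, levels = c(1, 2),
                    labels = c("Vegetarian", "Non-vegetarian"))
  df
}

# Usage (assumes the package providing B12Diet is attached):
# data(B12Diet)
# labelled <- label_b12diet(B12Diet)
# table(labelled$Diet, labelled$B12)   # 2x2 table of diet by B12 status
```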
BMI of Irish patients
Description
The BMI and other health data for a number of Irish patients.
Usage
data(BMI)
Format
A data frame with 70 rows and 11 columns:
- sex
Sex of the person; one of female or male
- age
Age of person, in completed years
- edu
Level of education; one of primary, secondary, postLeaving or complete3rd
- m_card
whether the person has a medical card; one of yes or no
- smoke
smoking status; one of daily, occasionally or not at all
- drink
whether the person drinks alcohol weekly; one of yes or no
- exercise
The number of days per week the person walks or exercises for 30 minutes or more
- diet
whether the person thinks they have a healthy diet; one of yes, no or dont know
- ob_weight_kg
the observed (measured) weight, in kg
- ob_height_m
the observed (measured) height, in metres
- sr_weight_kg
the weight reported by the person, in kg
- sr_height_m
the height reported by the person, in metres
- bmi_perception
the person's perception of their BMI; one of normalweight, overweight or obese
Details
The data come from a survey.
Source
Johnson, E., Millar, S. R., & Shiely, F. (2021). The association between BMI self-selection, self-reported BMI and objectively measured BMI. HRB Open Research, 4(37), 37.
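Examples
The observed and self-reported columns allow two versions of the BMI to be computed and compared. The sketch below uses the standard definition (weight in kg divided by the square of height in metres); bmi_from is a hypothetical helper, not a function from the package, and the commented usage assumes the package providing BMI is already attached.

```r
# Hypothetical helper: body mass index from weight (kg) and height (m).
bmi_from <- function(weight_kg, height_m) {
  weight_kg / height_m^2
}

# Usage (assumes the package providing BMI is attached):
# data(BMI)
# ob_bmi <- bmi_from(BMI$ob_weight_kg, BMI$ob_height_m)   # measured
# sr_bmi <- bmi_from(BMI$sr_weight_kg, BMI$sr_height_m)   # self-reported
# summary(sr_bmi - ob_bmi)   # how far self-reports are from measurements
```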
Baby births in one day at one hospital
Description
Details of the births on one day from a Brisbane hospital.
Usage
data(BabyBoom)
Format
A data frame with 44 rows (one per birth) and 3 columns:
- Gender
The gender of the child; one of Female or Male
- Weight
The weight of the baby, in kg
- Mins.Since.Midnight
the time of birth, in minutes since midnight
Source
Steele, S. 1997. Babies by the Dozen for Christmas: 24-Hour Baby Boom. The Sunday Mail, 7.
Dunn, Peter K. 1999. A Simple Dataset for Demonstrating Common Distributions. Journal of Statistics Education, 7 (3).
Battery performance
Description
Battery life for two brands of batteries.
Usage
data(Battery)
Format
A data frame with 108 rows (one per battery) and 4 columns:
- Brand
One of Energizer or Ultracell (ALDI home brand)
- Voltage
The voltages at which times were recorded
- Time
The time taken for a 1.5V battery to reduce to the given voltage, in hours
- Battery
Which battery in the sequence
Source
Dunn, Peter K. 2013. Comparing the Lifetimes of Two Brands of Batteries. Journal of Statistics Education, 21 (1).
Bitumen content
Description
Relationship between bitumen content and percentage air voids.
Usage
data(Bitumen)
Format
A data frame with 42 rows and 2 columns:
- Bitumen
The bitumen content (by percentage weight) in the bitumen sample
- AirVoids
The percentage of air voids, by volume
Source
Panda, R. P., Sudhanshu Sekhar Das, and P. K. Sahoo. 2018. Relation Between Bitumen Content and Percentage Air Voids in Semi Dense Bituminous Concrete. Journal of The Institution of Engineers (India): Series A 99 (2): 327–32.
Body temperatures
Description
Body temperature (in degrees C and F) for people.
Usage
data(BodyTemp)
Format
A data frame with 130 rows (each person) and 4 columns:
- BodyTemp
The measured body temperature, in degrees F, as given
- Gender
One of 1 (males) or 2 (females)
- HeartRate
Heart rate, in beats per minute
- BodyTempC
The measured body temperature in degrees C; converted from degrees F
Source
Shoemaker, A. L. (1996). What's normal?–Temperature, gender, and heart rate. Journal of Statistics Education, 4(2).
References
Wunderlich, C. 1868. Das Verhalten der Eigenwärme in Krankheiten. Leipzig, Germany: Otto Wigand.
Mackowiak, Philip A., Steven S. Wasserman, and Myron M. Levine. 1992. A Critical Appraisal of 98.6 degrees F, the Upper Limit of the Normal Body Temperature, and Other Legacies of Carl Reinhold August Wunderlich. Journal of the American Medical Association 268 (12): 1578–80.
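Examples
BodyTempC is described as converted from BodyTemp. The sketch below applies the standard Fahrenheit-to-Celsius conversion; f_to_c is a hypothetical helper, and whether the package rounded the converted values is not stated, so the comparison in the commented usage allows a small tolerance (an assumption). The usage also assumes the package providing BodyTemp is already attached.

```r
# Hypothetical helper: convert degrees Fahrenheit to degrees Celsius.
f_to_c <- function(deg_f) {
  (deg_f - 32) * 5 / 9
}

# Usage (assumes the package providing BodyTemp is attached):
# data(BodyTemp)
# recomputed <- f_to_c(BodyTemp$BodyTemp)
# all.equal(recomputed, BodyTemp$BodyTempC, tolerance = 0.01)
```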
Bone quality in South Koreans
Description
Bone mass density of South Korean subjects, at three body locations.
Usage
data(BoneQuality)
Format
A data frame with 969 rows (one for each subject) and 7 columns:
- Sex
The sex of the subject; one of M (male) or F (female)
- Age
The age of the subject, in years
- Height
The height of the subject, in cm
- Weight
The weight of the subject, in kg
- LumbarBMD
The bone mass density of the lumbar spine, in g/square-cm
- HipBMD
The bone mass density of the total hip, in g/square-cm
- NeckBMD
The bone mass density of the femoral neck, in g/square-cm
Details
Bone mass density and demographic information for 969 subjects in South Korea.
Source
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0260924#sec013
References
Kim, K. Y., & Kim, K. M. (2022). Similarities and differences between bone quality parameters, trabecular bone score and femur geometry. PLOS One, 17(1), e0260924.
The impact of sugarcane borers
Description
The impact of sugarcane borers on reducing sorghum fitness and grain production.
Usage
data(Borers)
Format
A data frame with 72 rows and 8 columns:
- Hybrids
The hybrid; one of AG1090, BRS373 or DKB590
- Insecticide
Whether insecticide was used; one of with or without
- Height
The plant height, in cm
- Tunnels
The length of borer tunnels, in cm
- PanicleLength
The panicle (flower cluster) length, in cm
- PanicleWeight
The panicle (flower cluster) weight, in cm
- Infestation
The amount of infestation (the 'stem borer injury'), as a percentage
- Yield
The sorghum yield, in kg per hectare
Details
The data provide details of sorghum yield in the presence of borer infestation, from a study conducted in Brazil over three years.
Source
Souza, Camila and Souza, Bruno and Fadini, Marcos and França, Joselia and Menezes, Cícero and Nascimento, Priscilla and Mendes, Simone (2025), "What is the potential of sugarcane borer in reducing sorghum fitness and grain production?", Mendeley Data, V2, doi: 10.17632/b6s9wnxgfm.2
References
Souza, C., de Souza, B. H. S., Fadini, M. A. M., França, J. C. O., de Menezes, C. B., Nascimento, P. T., and Mendes, S. M. (2024). What is the potential of sugarcane borer in reducing sorghum fitness and grain production?. Journal of Applied Entomology, 148(7), 818–826.
The health of burros
Description
The health of female burros in the Mojave Desert.
Usage
data(Burros)
Format
A data frame with 9 rows and 3 columns:
- Status
The reproductive status of the female burro; one of 1 (barren), 2 (pregnant, but not lactating) or 3 (lactating)
- Health
The health of the burro; one of 1 (excellent), 2 (fair) or 3 (poor)
- Counts
The number of female burros in each cell
Details
The data provide the number of female burros of given health and reproductive status.
Source
Johnson, R. A., Carothers, S. W., & McGill, T. J. (1987). Demography of feral burros in the Mohave Desert. The Journal of Wildlife Management, 51(4), 916–920.
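Examples
Because each row holds a count for one cell, the data can be turned into a two-way table with xtabs. This is a minimal sketch: burro_table is a hypothetical helper, the labels are taken from the codes listed in the Format section, and the commented usage assumes the package providing Burros is already attached.

```r
# Hypothetical helper: build a Status-by-Health table from the cell counts.
burro_table <- function(df) {
  df$Status <- factor(df$Status, levels = 1:3,
                      labels = c("Barren", "Pregnant", "Lactating"))
  df$Health <- factor(df$Health, levels = 1:3,
                      labels = c("Excellent", "Fair", "Poor"))
  # xtabs sums Counts within each Status/Health combination
  xtabs(Counts ~ Status + Health, data = df)
}

# Usage (assumes the package providing Burros is attached):
# data(Burros)
# tab <- burro_table(Burros)
# prop.table(tab, margin = 1)   # health distribution within each status
```

The same pattern applies to the other datasets in this manual that store one row per cell with a Counts column.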
Captopril effectiveness
Description
Blood pressure before and after treatment with Captopril.
Usage
data(Captopril)
Format
A data frame with 30 rows (one per person) and 3 columns:
- Before
The blood pressure before taking captopril, in mm Hg
- After
The blood pressure after taking captopril, in mm Hg
- BP
The type of blood pressure measured; S for systolic and D for diastolic
Source
D. J. Hand, F. Daly, A. D. Lunn, K. J. McConway, and E. Ostrowski (1994) A Handbook of Small Data Sets, London: Chapman and Hall. Dataset 72.
References
MacGregor, Graham A., N. D. Markandu, J. E. Roulston, and J. C. Jones. 1979. Essential Hypertension: Effect of an Oral Inhibitor of Angiotensin-Converting Enzyme. British Medical Journal 2: 1106–1109.
Car crashes
Description
The number and type of car crashes, in two different years.
Usage
data(CarCrashes)
Format
A data frame with 4 rows and 3 columns:
- CrashType
Whether the crash involved pedestrians (1) or other vehicles (2)
- Year
Either 2011 or 2015
- Counts
The number of crashes in the combination defined by CrashType and Year
Details
The data provide the number of car crashes in a mountainous county in western China, some involving pedestrians and some involving other vehicles, in two years.
Source
Wang, Liyang, Ruimin Li, Changjun Wang, and Zhiyong Liu (2020). "Driver Injury Severity Analysis of Crashes in a Western China's Rural Mountainous County: Taking Crash Compatibility Difference into Consideration". Journal of Traffic and Transportation Engineering (English Edition).
Cherry Ripe weights
Description
The weight of 'Fun Size' Cherry Ripe chocolate bars.
Usage
data(CherryRipe)
Format
A data frame with 16 rows (each combination of the other variables) and 4 columns:
- TotalWeight
The weight of the wrapped bar (bar plus wrapper), in g
- WrapperWt
The weight of the wrapper only, in g
- BarWt
The weight of the chocolate bar itself, in g, by subtraction
- Year
The year; one of 2011, 2013 to 2015, or 2017 to 2019
Details
The Cherry Ripe chocolate bars were weighed as an in-class activity, usually by weighing the bar plus wrapper and then the wrapper alone (for hygiene reasons) on a set of scales. The bars were in a Fun Size pack of about 11 bars. Until 2015, the weights were listed in the nutrition panel as 18g. After 2015, this changed to 14g.
Source
Collected and weighed by Peter K. Dunn and students (who got to eat the chocolate bars).
Price of second-hand Corollas
Description
The price of second-hand Corollas advertised on Gumtree (Australia).
Usage
data(Corollas)
Format
A data frame with 45 rows (one per vehicle) and 3 columns:
- Year
the year of manufacture of the vehicle
- Price
the advertised price, in AUD
- Age
the age of the vehicle, in years
Source
Collected by Peter K. Dunn, 2014, from www.gumtree.com.au
Crab shells and anemones (2x2)
Description
The placement of anemones on their shells by hermit crabs.
Usage
data(CrabShells2)
Format
A data frame with 4 rows and 3 columns:
- ShellColumn
The column where the anemone was placed; one of 1 (Side) or 2 (Central)
- ShellRow
The row where the anemone was placed; one of 1 (Side) or 2 (Central)
- Counts
The number of anemones in the indicated sector on the shell
Details
The data provide the number of anemones placed on their shell by hermit crabs in indicated regions. Roughly, the shells are divided into a 3x3 grid of approximately equal areas (see CrabShells3), but here the 3x3 table has been collapsed to a 2x2 table.
Source
Brooks, W. R. (1989). Hermit crabs alter sea anemone placement patterns for shell balance and reduced predation. Journal of Experimental Marine Biology and Ecology, 132(2), 109–121.
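Examples
The Details describe the 2x2 table as a collapsed version of the 3x3 table. One plausible way to do this is sketched below, under the assumption (not stated in the source) that the two side codes of CrabShells3 (1 and 3) are merged into the single Side code 1 while Central stays as 2. collapse_shell_grid is a hypothetical helper, and the commented usage assumes the package providing CrabShells3 is already attached.

```r
# Hypothetical helper: collapse a 3x3 shell grid of counts to 2x2.
# Assumption: codes 1 and 3 (the two sides) both become 1 (Side);
# code 2 (Central) is unchanged.
collapse_shell_grid <- function(df3) {
  side_or_central <- function(x) ifelse(x == 2, 2, 1)
  df3$ShellColumn <- side_or_central(df3$ShellColumn)
  df3$ShellRow <- side_or_central(df3$ShellRow)
  # Add up the counts that now share the same cell
  aggregate(Counts ~ ShellColumn + ShellRow, data = df3, FUN = sum)
}

# Usage (assumes the package providing CrabShells3 is attached):
# data(CrabShells3)
# collapse_shell_grid(CrabShells3)   # compare with CrabShells2
```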
Crab shells and anemones (3x3)
Description
The placement of anemones on their shells by hermit crabs.
Usage
data(CrabShells3)
Format
A data frame with 9 rows and 3 columns:
- ShellColumn
The column where the anemone was placed; one of 1 (Side 1), 2 (Central) or 3 (Side 2)
- ShellRow
The row where the anemone was placed; one of 1 (Side 1), 2 (Central) or 3 (Side 2)
- Counts
The number of anemones in the indicated sector on the shell
Details
The data provide the number of anemones placed on their shell by hermit crabs in indicated regions. Roughly, the shells are divided into a 3x3 grid of approximately equal areas.
Source
Brooks, W. R. (1989). Hermit crabs alter sea anemone placement patterns for shell balance and reduced predation. Journal of Experimental Marine Biology and Ecology, 132(2), 109–121.
Cyclones in the Australian region
Description
The number of cyclones (severe; non-severe) and the ONI.
Usage
data(Cyclones)
Format
A data frame with 37 rows (one per year) and 8 columns:
- Year
The year
- Severe
The number of severe cyclones recorded in the Australian region
- NonSevere
The number of non-severe cyclones recorded in the Australian region
- Total
The total number of cyclones recorded in the Australian region
- JFM
the Oceanic Niño Index (ONI), averaged over the months January to March; a numeric vector
- AMJ
the Oceanic Niño Index (ONI), averaged over the months April to June; a numeric vector
- JAS
the Oceanic Niño Index (ONI), averaged over the months July to September; a numeric vector
- OND
the Oceanic Niño Index (ONI), averaged over the months October to December; a numeric vector
Source
Dunn, Peter K., and Gordon K. Smyth. 2018. Generalized Linear Models with Examples in R. Springer.
Danish lung cancer cases
Description
The number of cases of lung cancer in four Danish cities.
Usage
data(DanishLC)
Format
A data frame with 24 rows (each combination) and 4 columns:
- Cases
The number of lung cancer cases for the given age group and city
- Pop
The population for the given age group and city
- Age
The age group; one of 40-54, 55-59, 60-64, 65-69, 70-74 or >74
- City
The city; one of Fredericia, Horsens, Kolding or Vejle
Source
James K. Lindsey (1995). Modelling frequency and count data. Clarendon Press, page 157.
References
E. B. Andersen (1977). Multiplicative Poisson models with unequal cell rates. Scandinavian Journal of Statistics, 4, 153–158.
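Examples
Since both the number of cases and the population are given for each age group and city, incidence rates can be computed directly. The sketch below expresses them per 1000 people; the scaling and the helper name rate_per_1000 are illustrative choices, not something the source specifies, and the commented usage assumes the package providing DanishLC is already attached.

```r
# Hypothetical helper: cases per 1000 people.
rate_per_1000 <- function(cases, pop) {
  1000 * cases / pop
}

# Usage (assumes the package providing DanishLC is attached):
# data(DanishLC)
# DanishLC$Rate <- rate_per_1000(DanishLC$Cases, DanishLC$Pop)
# xtabs(Rate ~ Age + City, data = DanishLC)   # one rate per Age/City cell
```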
Deceleration of cars
Description
The deceleration of cars after adding additional speed signage.
Usage
data(Deceleration)
Format
A data frame with 79 rows (one per car) and 2 columns:
- When
When the deceleration is measured: Before or After the signage was added
- Deceleration
The deceleration, in metres-per-second-squared
Source
Ma, Yongfeng, Wenbo Zhang, Xin Gu, and Jiguang Zhao. 2019. Impacts of Experimental Advisory Exit Speed Sign on Traffic Speeds for Freeway Exit Ramp. PLoS One 14 (11): e0225203.
Dental statistics
Description
The data give the estimates of the mean number of decayed, missing and filled teeth (DMFT) at age 12 years, and the mean annual sugar consumption in the previous five years for 90 countries.
Usage
data(Dental)
Format
A data frame with 90 rows (one per country) and 4 columns:
- Country
the country; a factor
- Indus
whether the country is considered an industrialized country; a factor with levels Yes (industrialized) or No (not industrialized)
- Sugar
the mean annual sugar consumption in kilograms per person per year, computed over the five years (or as much as available) prior to the survey; a numeric vector
- DMFT
estimates of the mean number of decayed, missing and filled teeth at age 12; a numeric vector
Source
Woodward, M., and A. R. P. Walker. 1994. Sugar Consumption and Dental Caries: Evidence from 90 Countries. British Dental Journal 176: 297–302.
References
M. Woodward (2004) Epidemiology: Study Design and Data Analysis, second edition. Chapman and Hall.
Diabetes
Description
Blood pressure on the first and second visits.
Usage
data(Diabetes)
Format
A data frame with 403 rows (one per person) and 4 columns; many values are missing
- SBPfirst
the systolic blood pressure from the first visit, in mm Hg
- DBPfirst
the diastolic blood pressure from the first visit, in mm Hg
- SBPsecond
the systolic blood pressure from the second visit, in mm Hg
- DBPsecond
the diastolic blood pressure from the second visit, in mm Hg
Source
Originally from <http://biostat.mc.vanderbilt.edu/DataSets>, though that URL no longer works. It seems to now appear at <https://hbiostat.org/data/repo/diabetes.html>
Dog walks
Description
Dog walking in the city and country.
Usage
data(DogWalks)
Format
A data frame with 8 rows and 3 columns:
- Location
One of 1 (City) or 2 (Farm)
- WalkLength
One of 1 (Under 30 mins), 2 (30 to under 60 mins), 3 (60 to under 120 mins) or 4 (varies; mostly long walk but some shorter walks)
- Counts
The number of dogs in each cell
Details
The data provide the number of dogs being walked for given times, in the city and country.
Source
Naughton, Violetta, Teresa Grzelak, and Patrick J. Naughton. (2024). "Association Between Household Location (Urban Versus Rural) and Fundamental Care Provided to Domestic Dogs (Canis Familiaris) in Northern Ireland." In Nutrition and Metabolism of Dogs and Cats, 217–236. Springer.
Dog measurements
Description
Measurements of Phu Quoc Ridgeback dogs.
Usage
data(Dogs)
Format
A data frame with 30 rows (one per dog) and 4 columns:
- BL
Body length, in cm
- BH
Body height, in cm
- Chest
Chest measurement, in cm
- Waist
Waist measurement, in cm
Source
Quan, Quoc-Dang, Hoang-Dung Tran, and Anh-Dung Chung. 2017. The Relation of Body Score (Body Height/Body Length) and Haplotype E on Phu Quoc Ridgeback Dogs (Canis Familiaris). Journal of Entomology and Zoology Studies 5: 388–94
Lifespan of dogs
Description
The average weight of dog breeds, and the average lifespan of dog breeds, using over 50 individuals for each breed.
Usage
data(DogsLife)
Format
A data frame with 73 rows and 5 columns:
- Breed
The breed name
- Weight
The average breed weight (in kg)
- LitterSize
The average breed litter size
- BirthWeight
The average breed birthweight (in kg)
- Lifespan
The average breed lifespan (in years)
Details
The original data list many more breeds, but these are (as best as I can determine) those based on at least 50 individuals, as noted in the original article.
Source
da Silva, Jack and Cross, Bethany (2022). Data and code for: Dog lifespans and the evolution of ageing [Dataset]. Dryad https://doi.org/10.5061/dryad.wwpzgmsn6
References
da Silva, J., & Cross, B. J. (2023). Dog life spans and the evolution of aging. The American Naturalist, 201(6), E140–E152.
ED patients and welfare
Description
Welfare distribution and emergency department (ED) patients.
Usage
data(EDpatients)
Format
A data frame with 30 rows (one per day) and 2 columns:
- Days
The number of days after welfare distribution
- ED
The mean number of emergency department (ED) patients
Source
Data read from the scatterplot in Brunette, Douglas D., John Kominsky, and Ernest Ruiz. 1991. Correlation of Emergency Health Care Use, 911 Volume, and Jail Activity with Welfare Check Distribution. Annals of Emergency Medicine 20 (7): 739–42.
EV purchasing
Description
Details of people regarding the purchase of an EV.
Usage
data(EVpurchase)
Format
A data frame with 4 rows (corresponding to the 4 cells in a 2x2 table) and 3 columns:
- Education
The level of education; one of 1 (no post-graduate study) or 2 (post-graduate study)
- PurchaseEV
Whether the respondent would purchase an electric vehicle in the next 10 years; one of 1 (Yes) or 2 (No)
- Counts
The number of respondents in the given cell
Source
Egbue, Ona and Long, Suzanna (2012). Barriers to widespread adoption of electric vehicles: An analysis of consumer attitudes and perceptions. Energy Policy, 48, 717–729.
Ear infections in Sydney
Description
Ear infections for swimmers at a Sydney beach.
Usage
data(EarInfection)
Format
A data frame with 287 rows and 6 columns:
- Swimmer
The type of swimmer; one of Occasional or Frequent
- Location
The usual swimming location; one of Non-beach or Beach
- Age
The age group; one of 15 to 19, 20 to 24 or 25 to 29
- Sex
The sex of the person; one of Male or Female
- NumInfections
The number of self-reported ear infections
- Infections
Whether the person had experienced an ear infection; one of Yes or No
Source
James K. Lindsey (1995). This data file was downloaded from OzDASL (http://www.statsci.org/data/oz/earinf.html) where it was prepared by Dr Gordon Smyth from Hand et al (1994) Dataset 328.
References
D. J. Hand, F. Daly, A. D. Lunn, K. J. McConway, and E. Ostrowski (1994) A Handbook of Small Data Sets, London: Chapman and Hall. Dataset 328.
Elephant measurements
Description
Physical measurements of elephants.
Usage
data(Elephants)
Format
A data frame with 1470 rows and 5 columns:
- Sex
Sex of the elephant; one of A or B (anonymised)
- Age
Age of elephant, in completed years
- Chest
Chest girth, in cm
- Height
Height to shoulder, in cm
- Mass
Body mass, in kg
Source
Lalande, Lucas; Lummaa, Virpi; Aung, Htoo Htoo; Htut, Win; Nyein, U. Kyaw; Berger, Verane; Briga, Michael (2022). Sex-specific body mass aging trajectories in adult Asian elephants. Dryad. https://doi.org/10.5061/dryad.5dv41ns59
References
Lalande, L. D., Lummaa, V., Aung, H. H., Htut, W., Nyein, U. K., Berger, V., & Briga, M. (2022). Sex‐specific body mass ageing trajectories in adult Asian elephants. Journal of Evolutionary Biology, 35(5), 752–762.
Emerald rainfall in Augusts
Description
The total monthly rainfall in Emerald, Australia, and the average monthly SOI.
Usage
data(EmeraldAug)
Format
A data frame with 114 rows (one per August over 114 years) and 4 columns:
- Year
The year
- Rain
The rainfall in August of the given year; in mm
- SOI
The monthly average Southern Oscillation Index (SOI)
- Phase
the SOI phase (see Stone and Auliciems, 1992); a factor with these values: 1 (consistently negative), 2 (consistently positive), 3 (rapidly falling), 4 (rapidly rising), or 5 (consistently near zero)
Source
Data obtained from the Australian Bureau of Meteorology (<http://www.bom.gov.au>) and iri/ldeo Climate Data Library (<http://www.longpaddock.qld.gov.au/seasonalclimateoutlook/southernoscillationindex/soidatafiles/index.php>) on 21 December 2010, then compiled. The SOI values used here are those used by LongPaddock, which differ slightly from those used by the BoM (being based on a different period of standardisation); the LongPaddock values are used because the SOI phases are computed from them.
R. C. Stone and A. Auliciems (1992). SOI phase relationships with rainfall in eastern Australia, International Journal of Climatology, 12, 625–636.
References
Dunn, Peter K., and Gordon K. Smyth. 2018. Generalized Linear Models with Examples in R. Springer.
Ferritin changes
Description
Ferritin concentration changes.
Usage
data(Ferritin)
Format
A data frame with 20 rows (one per patient) and 3 columns:
- September
The patients' ferritin content (in micrograms/L) in September
- March
The patients' ferritin content (in micrograms/L) in March
- Reduction
The reduction in the patients' ferritin content (in micrograms/L) between September and the following March, during which time they had treatment
Source
Cressie, N. A. C., L. J. Sheffield, and H. J. Whitford. 1984. Use of the One Sample t-Test in the Real World. Journal of Chronic Diseases 37 (2): 107–14.
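Examples
The Reduction column is described as the fall in ferritin between September and the following March, which suggests a paired-differences analysis. The sketch below recomputes the differences and shows, in the commented usage, how a one-sample t-test might be applied to them. paired_reduction is a hypothetical helper, and the usage assumes the package providing Ferritin is already attached.

```r
# Hypothetical helper: reduction for each patient (earlier minus later).
paired_reduction <- function(before, after) {
  before - after
}

# Usage (assumes the package providing Ferritin is attached):
# data(Ferritin)
# d <- paired_reduction(Ferritin$September, Ferritin$March)
# all.equal(d, Ferritin$Reduction)   # check against the supplied column
# t.test(d)                          # one-sample t-test on the reductions
```

The same approach applies to other before/after datasets in this manual that supply a Reduction column or paired Before and After columns.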
Flowering shrubs
Description
First-flowering dates for two shrubs.
Usage
data(Flowering)
Format
A data frame with 25 rows and 4 columns:
- Willow
The (Julian) date on which flowering began for the encroaching Salix (willows)
- Skypilot
The (Julian) date on which flowering began for the native Polemonium viscosum (alpine skypilot)
- MinTemp
The minimum June temperature (in degrees C)
- Altitude
The altitude (in m)
Source
Kettenbach, Jessica A.; Miller-Struttmann, Nicole; Moffett, Zoë; Galen, Candace (2018). Data from: How shrub encroachment under climate change could threaten pollination services for alpine wildflowers: a case study using the alpine skypilot, Polemonium viscosum [Dataset]. Dryad. https://doi.org/10.5061/dryad.2p2bh
Fluoroscopic scanning
Description
The data give the total procedure time during CT fluoroscopic scanning, and the radiation dose received.
Usage
data(Fluoro)
Format
A data frame with 19 rows and 2 columns:
- Time
The total procedure time, in minutes
- Dose
The total radiation dose received, in rads
Source
Kelly H. Zou, Kemal Tuncali, and Stuart G. Silverman (2003). Correlation and simple linear regression. Radiology, 227, 617–628.
References
The data were originally used, but not given, in: S. G. Silverman, K. Tuncali, D. F. Adams, R. D. Nawfel, K. H. Zou, and P. F. Judy (1999). CT fluoroscopy-guided abdominal interventions: techniques, results, and radiation exposure. Radiology, 212, 673–681.
Forward-falling women
Description
The forward-leaning angle before women fall over.
Usage
data(ForwardFall)
Format
A data frame with 15 rows (one per patient) and 2 columns:
- LeanAngle
The angle at which patients could lean forward and still recover
- Group
The age group; 1 means 'younger women' and 2 means 'older women'
Source
Wojcik, Laura A., Darryl G. Thelen, Albert B. Schultz, James A. Ashton-Miller, and Neil B. Alexander. 1999. Age and Gender Differences in Single-Step Recovery from a Forward Fall. Journal of Gerentology 54A (1): M44–50.
McDonald's fries
Description
The weights of McDonald's large fries.
Usage
data(FriesWt)
Format
A data frame with 32 observations. The data give the weights of large fries bought from a McDonald's (target: 171g).
- FriesWt
The weight of each of the 32 large French fries orders at McDonald's, in grams
Source
The data were extracted by reading Figure 2 in: Wetzel, Nathan (2005). "McDonald's french fries: Would you like small or large fries?" STATS, 43, 12–14.
Fruit statistics from farms
Description
Details of fruit from different farms.
Usage
data(Fruit)
Format
A data frame with 37 rows (one per farm) and 11 columns:
- Farm
The farm identifier
- Flowers2014
The number of flowers in 2014
- Flowers2015
The number of flowers in 2015
- Fruit2014
The total number of fruits formed in 2014
- Fruit2015
The total number of fruits formed in 2015
- FLength2014
The fruit length (in cm) in 2014
- FLength2015
The fruit length (in cm) in 2015
- FBreadth2014
The fruit breadth (in cm) in 2014
- FBreadth2015
The fruit breadth (in cm) in 2015
- FWeight2014
The fruit weight (in g) in 2014
- FWeight2015
The fruit weight (in g) in 2015
Source
Ronita Mukherjee, Rittik Deb and Soubadra Devy (2020). Diversity matters: effects of density compensation in pollination service during rainfall shift [Dataset]. Dryad. https://doi.org/10.5061/dryad.0n5v168
References
Mukherjee, Ronita; Deb, Rittik; Devy, Soubadra (2020). Diversity matters: Effects of density compensation in pollination service during rainfall shift. Ecology and Evolution, 9(17), 9701–9711.
Chest-beating rates in gorillas
Description
Chest-beating rates in gorillas.
Usage
data(Gorillas)
Format
A data frame with 25 rows (one per gorilla) and 7 columns:
- Male
An identifier
- NoChestBeats
The number of chest beats
- FocalTime
The focal time in hours (i.e., time spent watching gorilla)
- ChestBeatRate
The rate of chest beating, in beats per 10 hours
- BackBreadth
The breadth of the gorilla's back, in cm
- Age
Mean age during the study period, in years
- Age20
Whether the gorilla is aged under 20 or not; one of Younger or Older
Source
Wright, Edward, Sven Grawunder, Eric Ndayishimiye, Jordi Galbany, Shannon C. McFarlin, Tara S. Stoinski, and Martha M. Robbins. 2021. Chest Beats as an Honest Signal of Body Size in Male Mountain Gorillas (Gorilla Beringei Beringei). Scientific Reports 11 (1): 6879.
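Examples
From the column descriptions, ChestBeatRate (beats per 10 hours) should follow from NoChestBeats and FocalTime (hours). The sketch below shows that calculation; beat_rate_per_10h is a hypothetical helper, and any rounding applied in the supplied column is unknown, so the commented comparison uses a tolerance (an assumption). The usage assumes the package providing Gorillas is already attached.

```r
# Hypothetical helper: chest beats per 10 hours of observation.
beat_rate_per_10h <- function(n_beats, focal_hours) {
  10 * n_beats / focal_hours
}

# Usage (assumes the package providing Gorillas is attached):
# data(Gorillas)
# rate <- beat_rate_per_10h(Gorillas$NoChestBeats, Gorillas$FocalTime)
# all.equal(rate, Gorillas$ChestBeatRate, tolerance = 0.01)
```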
Horseshoe crabs
Description
The number of male crabs attached to female horseshoe crabs.
Usage
data(HCrabs)
Format
A data frame with 173 rows (each crab) and 5 columns:
- Col
The female's carapace colour; one of LM (light medium), M (medium), DM (dark medium) or D (dark)
- Spine
The female's spine condition; one of BothOK, OneOK or NoneOK
- Width
The female's carapace width, in cm
- Wt
The weight of the female, in grams
- Sat
The number of male crabs attached ('satellites')
Source
H. J. Brockmann (1996) Satellite male groups in horseshoe crabs, Limulus polyphemus. Ethology, 102(1), 1–21.
Wearing hats and sunglasses
Description
The wearing of hats and sunglasses in Brisbane.
Usage
data(HatSunglasses)
Format
A data frame with 16 rows (each combination of the other variables) and 5 columns:
- Gender
Gender of person; one of Male or Female
- Hat
Whether the person was wearing a hat; one of Yes or No
- Sunglasses
Whether the person was wearing sunglasses; one of Yes or No
- Phone
Whether the person had easy access to their phone; one of Easy or Not easy
- Count
The number of people meeting the given combination
Source
Dexter, Ben, Rachel King, Simone L. Harrison, Alfio V. Parisi, and Nathan J. Downs. 2019. A Pilot Observational Study of Environmental Summertime Health Risk Behavior in Central Brisbane, Queensland: Opportunities to Raise Sun Protection Awareness in Australia’s Sunshine State. Photochemistry and Photobiology 95 (2): 650–55
IgE concentrations
Description
IgE concentration before and after intervention.
Usage
data(IgE)
Format
A data frame with 11 rows (one per child) and 3 columns:
- Before
IgE (before intervention), in micrograms/L
- After
IgE (after intervention), in micrograms/L
- Reduction
The reduction in IgE, in micrograms/L
Source
Lothian, James B. and Grey, Vijaylaxmi and Lands, Larry C. (2006). "Effect of whey protein to modulate immune response in children with atopic asthma", International Journal of Food Science and Nutrition, 57 (3/4), 204–211.
Insulation and energy
Description
Energy consumption before and after adding insulation.
Usage
data(Insulation)
Format
A data frame with 10 rows (each house) and 2 columns:
- Before
Energy consumption before adding insulation, in MWh
- After
Energy consumption after adding insulation, in MWh
Source
D. J. Hand, F. Daly, A. D. Lunn, K. J. McConway, and E. Ostrowski (1994) A Handbook of Small Data Sets, London: Chapman and Hall. Dataset 86.
References
Originally from: The Open University. 1983. MDST242 Statistics in Society, Unit A0: Introduction. The Open University.
Jeans' pockets
Description
Measurements of pockets in men's and women's jeans.
Usage
data(Jeans)
Format
A data frame with 80 rows (each pair of jeans) and 14 columns:
- Brand
The brand of jeans; 22 brands are represented
- Style
The style of jeans; one of boot-cut, regular, skinny, slim or straight
- Sex
Whether the jeans are men's or women's jeans; one of men or women
- Price
The price, in US dollars
- MaxHeightFront
The height (in cm) of the front pocket from the top of the highest rivet to the lowest point of the pocket (along the left-hand side or zipper side)
- MinHeightFront
The height (in cm) of the front pocket from the top of the highest rivet to the lowest point of the pocket (along the right-hand side or non-zipper side)
- MaxWidthFront
The width (in cm) from the widest point of the front pocket
- MinWidthFront
The width (in cm) from the highest rivet to the right or non-zipper side of the pocket
- MaxHeightBack
The height (in cm) from the deepest point of the back pocket (usually in the pocket's center) to the top of the pocket
- MinHeightBack
The height (in cm) from the shallowest point of the back pocket to the top of the pocket
- MaxWidthBack
The width of the pocket at the very top (opening)
- MinWidthBack
The width of the pocket at its narrowest (just before the pocket tapers to a point)
- Area
The area of the pocket, from the pocket's measurements (in square cm)
- Style2
The style, where skinny now means Style %in% c("skinny", "slim") and where straight means Style %in% c("straight", "boot-cut")
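The recoding described for Style2 could be sketched as follows. The vector names style and style2 are hypothetical, the labels are illustrative rather than taken from the data, and the treatment of regular is an assumption, since the description does not say how that style is handled.

```r
# Hypothetical style labels for illustration; not taken from the Jeans data.
style <- c("skinny", "slim", "straight", "boot-cut", "regular")

# Collapse the styles as described for Style2.
# Assumption: "regular" is left unchanged, as the description does not say
# how it is treated.
style2 <- style
style2[style %in% c("skinny", "slim")] <- "skinny"
style2[style %in% c("straight", "boot-cut")] <- "straight"

# Cross-tabulate the original and collapsed styles to check the recoding
table(style, style2)
```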
Note
The GitHub source contains a diagram explaining the pocket measurements more clearly.
All jeans that were measured have a 32-inch waistband, as indicated by the brand.
Source
https://github.com/the-pudding/data/tree/master/pockets (used with permission).
References
Diehm, Jan & Thomas, Amber (August 2018). Women's pockets are inferior. The Pudding.
Length and width of jellyfish
Description
Width and length of jellyfish at two locations.
Usage
data(Jellyfish)
Format
A data frame with 46 rows (one per jellyfish) and 3 columns:
- Location
The location of the jellyfish; one of Dangar (Dangar Island) or Salamander (Salamander Bay)
- Width
The width (breadth) of the jellyfish, in mm
- Length
The length of the jellyfish, in mm
Source
D. J. Hand, F. Daly, A. D. Lunn, K. J. McConway, and E. Ostrowski (1994) A Handbook of Small Data Sets, London: Chapman and Hall. Dataset 72.
References
Lunn, A. D. and McNeil, D. R. (1991). Computer-Interactive Data Analysis, Chichester: John Wiley and Sons.
Jumping and footwear
Description
Double-leg jumping distance, wearing shoes and barefoot.
Usage
data(Jumping)
Format
A data frame with 80 rows (each person) and 2 columns:
- Shoes
The jumping distance, while wearing shoes, in cm
- Barefoot
The jumping distance, while barefoot, in cm
Source
Hébert-Losier, K., Boswell-Smith, C., & Hanzlíková, I. (2023). Effect of Footwear Versus Barefoot on Double-Leg Jump-Landing and Jump Height Measures: A Randomized Cross-Over Study. International Journal of Sports Physical Therapy, 18(4), 845.
Kidney stone treatments
Description
Treatment of kidney stones, and the result.
Usage
data(KStones)
Format
A data frame with 8 rows (each variable combination) and 4 columns:
- Counts
The number of people with the combination of the other variables
- Size
The kidney stone size; one of Small or Large
- Method
The method used; one of Method A or Method B
- Result
The result of the procedure; one of Success or Failure
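Because each row holds a count for one combination of Size, Method and Result, the data frame can be turned into a contingency table with xtabs(). The sketch below uses a hypothetical data frame named stones with made-up counts (not the KStones data) laid out in the same way.

```r
# Hypothetical counts for illustration only; these are NOT the KStones data.
stones <- data.frame(
  Counts = c(10, 2, 8, 4, 6, 5, 9, 3),
  Size   = rep(c("Small", "Large"), each = 4),
  Method = rep(rep(c("Method A", "Method B"), each = 2), times = 2),
  Result = rep(c("Success", "Failure"), times = 4)
)

# Cross-tabulate the counts: Method by Result, within each stone Size
tab <- xtabs(Counts ~ Method + Result + Size, data = stones)
tab

# Proportion of each Result within each Method and Size (margins 1 and 3)
prop <- prop.table(tab, margin = c(1, 3))
round(prop, 3)
```

Comparing success proportions within each Size, as well as overall, shows whether the stone size changes the comparison between the two methods.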
Source
Charig, C. R., D. R. Webb, S. R. Payne, and J. E. A. Wickham. 1986. Comparison of Treatment of Renal Calculi by Open Surgery, Percutaneous Nephrolithotomy, and Extracorporeal Shockwave Lithotripsy. British Medical Journal 292: 879–82.
Accuracy of scientific instruments
Description
Measurements of LH at two concentration levels (high and middle), for two instruments.
Usage
data(LHconc)
Format
A data frame with 36 rows and 4 columns:
- High1
Instrument 1 measurement of luteotropic hormone (LH) concentrations at a high level, in mIU/ml
- Mid1
Instrument 1 measurement of LH concentrations at a middle level, in mIU/ml
- High2
Instrument 2 measurement of LH concentrations at a high level, in mIU/ml
- Mid2
Instrument 2 measurement of LH concentrations at a middle level, in mIU/ml
Note
The known values are, respectively, 64.31, 19.24, 64.97 and 19.40 mIU/ml.
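One way to compare an instrument's measurements with a known value is a one-sample t-test. The sketch below assumes that "respectively" follows the column order, so that 64.31 mIU/ml is the known value for High1; the vector high1 contains hypothetical measurements, not the LHconc data.

```r
# Hypothetical measurements for illustration only; NOT the LHconc data.
high1 <- c(64.1, 64.5, 63.9, 64.8, 64.2, 64.4)

# One-sample t-test of the mean measurement against the known value.
# Assumption: 64.31 mIU/ml is the known value for the High1 column.
res <- t.test(high1, mu = 64.31)

res$estimate  # the sample mean
res$p.value   # the two-sided P-value
```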
Source
Feng, Yang-chun and Huang, Yan-chun and Ma, Xiu-min. 2017. The application of Student's t-test in internal quality control of clinical laboratory. Frontiers in Laboratory Medicine 1 (3): 125–128.
Lime tree foliage
Description
The foliage biomass of small-leaved lime trees of different origins.
Usage
data(Lime)
Format
A data frame with 385 rows (each tree) and 4 columns:
- Foliage
The oven-dried foliage biomass, in kg
- DBH
The diameter at breast height, in cm
- Age
The age of the tree, in years
- Origin
The origin of the tree; one of Coppice, Natural or Planted
Source
Schepaschenko, Dmitry; Shvidenko, Anatoly; Usoltsev, Vladimir A; Lakyda, Petro; Luo, Yunjian; Vasylyshyn, Roman; Lakyda, Ivan; Myklush, Yuriy; See, Linda; McCallum, Ian; Fritz, Steffen; Kraxner, Florian; Obersteiner, Michael (2017). Biomass tree data base. doi:10.1594/PANGAEA.871491
In supplement to: Schepaschenko, D et al. (2017): A dataset of forest biomass structure for Eurasia. Scientific Data, 4, 170070, doi:10.1038/sdata.2017.70
Extracted from <https://doi.pangaea.de/10.1594/PANGAEA.871491>
References
The source (Schepaschenko et al.) obtains the data from various sources, which are given there.
Lung capacity
Description
The lung capacity of children.
Usage
data(LungCap)
Format
A data frame with 654 rows (each child) and 5 columns:
- Age
The age of the child, in years
- FEV
The forced expiratory volume, in litres
- Ht
The height, in inches
- Gender
The gender of the child; one of F or M
- Smoke
Whether the child is a smoker; one of 0 (non-smoker) or 1 (smoker)
Source
Kahn, M. (2003) Data Sleuth, STATS, 37, 24.
Ira B. Tager, Scott T. Weiss, Alvaro Munoz, Bernard Rosner, and Frank E. Speizer (1983). Longitudinal study of the effects of maternal smoking on pulmonary function in children. New England Journal of Medicine, 309(12):699–703.
References
Kahn, Michael (2005). An exhalent problem for teaching statistics. The Journal of Statistical Education, 13(2). Available on-line.
Mandible lengths
Description
The mandible length and gestational age for 167 foetuses from the 12th week of gestation onwards.
Usage
data(Mandible)
Format
A data frame with 167 rows (each foetus) and 2 columns:
- Age
The foetus age, in weeks
- Length
The mandible length, in mm
Source
Patrick Royston and Douglas G. Altman (1994). Regression using fractional polynomials of continuous covariates: Parsimonious parametric modelling. Applied Statistics, 43(3), 429–467.
Mary River stream flow
Description
The mean daily stream flow from the Mary River.
Usage
data(MaryRiver)
Format
A data frame with 21,659 rows and 3 columns:
- Month
The month (where 1 means January, etc.)
- Year
The year
- Mean
The mean stream flow recording for given date, in ML
Source
Originally sourced from <http://watermonitoring.dnrm.qld.gov.au/cgi/webhyd.pl?rsdf_org=138110A&cat=rs&lvl=1&0>, although the website address changes frequently.
When last checked, the address was <https://water-monitoring.information.qld.gov.au>; then select "Streamflow data", "Mary Basin" and "Mary River at Bellbird Creek" (i.e., station 138110A).
Mumps and isolating
Description
Whether students complied with isolation orders during a mumps outbreak.
Usage
data(Mumps)
Format
A data frame with 8 rows and 3 columns:
- AgeGroup
One of 1 (18 to 19), 2 (20 to 21) or 3 (Older than 22)
- Compliance
One of 1 (complied with isolation order) or 2 (did not comply)
- Counts
The number of students in each cell
Details
The data provide the number of students complying and not complying with an isolation order during a mumps outbreak in Kansas in 2006.
Source
Soud, F. A., M. M. Cortese, A. T. Curns, P. J. Edelson, R. H. Bitsko, H. T. Jordan, A. S. Huang, J. M. Villalon-Gomez, and G. H. Dayan. (2009). "Isolation Compliance Among University Students During a Mumps Outbreak, Kansas 2006". Epidemiology & Infection, 137(1): 30–37.
Noisy miner (birds)
Description
The number of noisy miners detected in various 2-hectare transects in buloke woodland patches within the Wimmera Plains of western Victoria, Australia.
Usage
data(NMiner)
Format
A data frame with 31 rows (each transect) and 2 columns:
- Eucs
The number of eucalypt trees in the transect
- Minerab
The number of noisy miners ('abundance') in three 20 minute surveys in each transect
Source
Personal communication from Martine Maron.
References
Martine Maron (2007). Threshold effect of eucalypt density on an aggressive avian competitor. Biological Conservation, 136, 100–107.
Obstructive sleep apnea (OSA)
Description
Sleeping information for adults with Down Syndrome.
Usage
data(OSA)
Format
A data frame with 60 rows (each patient) and 7 columns:
- ID
An identifier
- Age
The age of the patient, in years
- Gender
The gender of the patient; one of 1 (male) or 2 (female)
- BMI
The BMI of the patient
- Neck
The neck circumference of the patient, in cm
- REI
The Respiratory Event Index for the patient
- SAOS
The SAOS score; one of Severe, Moderate or Low
Source
de Carvalho, Anderson Albuquerque, Fabio Ferreira Amorim, Levy Aniceto Santana, Karlo Jozefo Quadros de Almeida, Alfredo Nicodemos Cruz Santana, and Francisco de Assis Rocha Neves. 2020. STOP-Bang Questionnaire Should Be Used in All Adults with Down Syndrome to Screen for Moderate to Severe Obstructive Sleep Apnea. PloS ONE 15 (5): e0232596.
The data are given at: <https://figshare.com/articles/dataset/Raw_database_and_statistical_analysis_results-STOP-Bang_questionnaire_should_be_used_in_all_adults_with_Down_Syndrome_to_screen_for_moderate_to_severe_obstructive_sleep_apnea_OSA_/9788903/1>
Orthoses for children
Description
Details of children fitted with orthoses.
Usage
data(Orthoses)
Format
A data frame with 15 rows and 5 columns:
- Gender
The gender of the child; one of M (male) or F (female)
- Age
The age of the child, in years
- Height
The height of the child, in cm
- Weight
The weight of the child, in kg
- GMFCS
The value of the ordinal Gross Motor Function Classification System describing the impact of cerebral palsy on their motor function; lower levels mean better functionality; one of 1 or 2
Source
Swinnen, Eva, Jean-Pierre Baeyens, Benjamin Van Mulders, Julian Verspecht, and Marc Degelaen (2017). "The Influence of the Use of Ankle-Foot Orthoses on Thorax, Spine, and Pelvis Kinematics During Walking in Children with Cerebral Palsy". Prosthetics and Orthotics International. 42(2), 208–213.
Pain relief for mothers
Description
Pain relief for birthing mothers.
Usage
data(PainRelief)
Format
A data frame with 912 rows (228 mothers, with four rows (one per Time) for each) and 8 columns:
- ID
The patient ID; an integer from 1 to 228
- Time
The time point of the measurement; one of 1 (0 minutes), 2 (after 20 mins), 3 (after 40 mins) or 4 (after 60 mins)
- Score
Pain score
- Group
The type of pain-relief used; one of palacetamol or coldpack
- Age
The age of the mother, in years
- Parity
Which number child is this (e.g., 1 means this is the mother's first child)
- ChildSex
The sex of the baby; one of female or male
- Birthweight
The birthweight of the baby, in kg, to the nearest 0.5kg
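The data are in long format, with one row per mother per time point. As a minimal sketch, base R's reshape() can produce a wide layout with one row per mother. The data frame pain_long and its scores are hypothetical (not the PainRelief data), and only three of the eight columns are mimicked.

```r
# Hypothetical scores for two mothers; these are NOT the PainRelief data.
pain_long <- data.frame(
  ID    = rep(1:2, each = 4),
  Time  = rep(1:4, times = 2),
  Score = c(8, 6, 5, 3, 7, 7, 4, 2)
)

# Reshape from long (one row per mother per time point)
# to wide (one row per mother, one Score column per time point)
pain_wide <- reshape(
  pain_long,
  idvar     = "ID",
  timevar   = "Time",
  v.names   = "Score",
  direction = "wide"
)

pain_wide
```

The wide layout has columns Score.1 to Score.4, which is convenient for computing each mother's change in score between time points.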
Source
Augustino, J., Moshi, F., Joho, A., & Mageda, J. F. K. (2023). Dataset comparing the effectiveness of perineal cold pack application over oral paracetamol 1000mg on postpartum perineal pain among women after spontaneous vaginal delivery in Dodoma region. "Data in Brief", 109766.
Pea nutrition
Description
Nutritional content of peas.
Usage
data(Peas)
Format
A data frame with 96 rows (each seed) and 11 columns:
- Origin
The seed origin; a vector of strings listing locations
- P
The phosphorus content, in mg/g
- K
The potassium content, in mg/g
- Ca
The calcium content, in mg/g
- Mg
The magnesium content, in mg/g
- S
The sulphur content, in mg/g
- Zn
The zinc content, in mg/g
- Fe
The iron content, in mg/g
- Cu
The copper content, in mg/g
- B
The boron content, in mg/g
- Mn
The manganese content, in mg/g
Source
Hacisalihoglu, Gokhan, Nicole S. Beisel, and A. Mark Settles. 2021. Characterization of Pea Seed Nutritional Value Within a Diverse Population of Pisum Sativum. PLoS One 16 (11): e0259565.
Permeability of building materials
Description
The permeability of building materials.
Usage
data(Perm)
Format
A data frame with 81 rows (each sample) and 3 columns:
- Day
The day of the data collection; 1 to 9
- Mach
The machine; one of A, B or C
- Perm
The permeability of the sample, in seconds
Source
Bent Joergensen (1992) Exponential dispersion models and extensions: A review. International Statistical Review, 60(1), 5–20
References
A. Hald (1952). Statistical theory with engineering applications. New York: Wiley.
Pet birds
Description
Lung cancer and owning pet birds.
Usage
data(PetBirds)
Format
A data frame with 4 rows (each combination) and 3 columns:
- LC
Whether the adult had lung cancer; one of Adults with lung cancer or Adults without lung cancer
- Pets
Whether the adult kept pet birds; one of Kept pet birds or Did not keep pet birds
- Counts
The number of adults with the given combination
Source
Kohlmeier, L., G. Arminger, S. Bartolomeycik, B. Bellach, J. Rehm, and M. Thamm. 1992. Pet birds as an independent risk factor for lung cancer: case-control study. British Medical Journal 305 (6860): 986–89.
Diameters of pizzas
Description
The diameter of 12-inch pizzas from two companies.
Usage
data(PizzaSize)
Format
A data frame with 250 rows (one per pizza) and 5 columns:
- Store
The pizza chain; one of Dominos (Domino's Pizza) or EagleBoys (Eagle Boys Pizza)
- CrustDescription
The type of crust for the pizza; one of ClassicCrust, DeepPan, MidCrust, ThinCrust or ThinNCrispy (some unique to one pizza company)
- Topping
The type of pizza topping; one of BBQMeatlovers, Hawaiian, SuperSupremo or Supreme (some unique to one pizza company)
- Diameter
The pizza diameter, in cm
- DiameterInches
The pizza diameter, in inches (converted from cm)
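The conversion from centimetres to inches can be sketched as below, assuming the standard factor of 2.54 cm per inch (the source does not state the factor or any rounding that was used). The diameters and variable names are hypothetical and are not taken from the PizzaSize data.

```r
# Hypothetical diameters in cm; these are NOT taken from the PizzaSize data.
diameter_cm <- c(29.4, 27.1, 30.48)

# Assumption: the conversion uses 1 inch = 2.54 cm
diameter_in <- diameter_cm / 2.54

round(diameter_in, 2)
```

On this scale, a pizza advertised as 12 inches corresponds to a diameter of 30.48 cm.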
Source
P. K. Dunn. Assessing claims made by a pizza chain. Journal of Statistical Education, 20(1), 2012.
Placebos and pain relief
Description
Pain relief from analgesics and placebos.
Usage
data(Placebos)
Format
A data frame with 7 rows (each time point) and 6 columns:
- Time
The time after taking the treatment, in hours
- Placebo
The mean pain relief score for 22 patients given placebos
- Distr
The mean pain relief score for 22 patients given distalgesics
- Asp
The mean pain relief score for 22 patients given aspirin
- Codis
The mean pain relief score for 22 patients given codis
- PlaceboRed
The mean pain relief score for 22 patients given red placebos
Source
Read from Figures 3 and 4 of Huskisson, E. C. 1974. Simple Analgesics for Arthritis. British Medical Journal 4: 196–200.
Possum weights
Description
Sex and weight of possums at various elevations.
Usage
data(Possums)
Format
A data frame with 135 rows (each possum) and 3 columns:
- Sex
The sex of the possum; one of Female or Male
- Wgt
The weight of the possum, in g
- DEM
The elevation, in m, where the possum is found
Source
Williams, Jessica L., Dan Harley, Darcy Watchorn, Lachlan McBurney, and David B. Lindenmayer. 2022. Relationship Between Body Weight and Elevation in Leadbeater's Possum (Gymnobelideus Leadbeateri). Australian Journal of Zoology 69 (5): 167–74
Premier league results
Description
Premier League football (soccer) results from 2019 to 2020.
Usage
data(PremierL)
Format
A data frame with 208 rows (games) and 6 columns:
- Date
The date of the game
- HomeTeam
The name of the home team; for example Liverpool or Man United
- AwayTeam
The name of the away team; for example Wolves or West Ham
- HomeGoals
The number of goals scored by the home team
- AwayGoals
The number of goals scored by the away team
- Result
The result; one of H (a win for the home team), A (a win for the away team) or D (a draw)
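The relationship between the goals columns and Result can be sketched as follows, assuming Result is determined solely by comparing HomeGoals and AwayGoals. The data frame games and its scores are hypothetical and are not the PremierL data.

```r
# Hypothetical scores for illustration; these are NOT the PremierL data.
games <- data.frame(
  HomeGoals = c(2, 0, 1),
  AwayGoals = c(1, 3, 1)
)

# Assumption: Result is determined solely by comparing the goals scored
games$Result <- ifelse(
  games$HomeGoals > games$AwayGoals, "H",
  ifelse(games$HomeGoals < games$AwayGoals, "A", "D")
)

games
```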
Source
The website https://sports-statistics.com/sports-data/soccer-datasets/
Queensland school children
Description
The number of four-year-old students enrolled at school in Queensland (Australia), classified by sex, school type and whether the students are First Nations students (in 2019).
Usage
data(QSchools)
Format
A data frame with 8 rows and 4 columns:
- Sex
Sex of the student; one of F (female) or M (male)
- FNations
The First Nations status; one of Yes (First Nations students) or No (non-First Nations students)
- School
The school type; one of Government or Non-government
- Counts
The number of four-year-old students meeting the designated criteria
Source
Collated by Peter K. Dunn, obtained from data at the Australian Bureau of Statistics, web page (https://www.abs.gov.au) in 2023.
References
Peter K. Dunn. Generalized linear models. In R. J. Tierney, F. Rizvi, and K. Erkican, editors, International Encyclopedia of Education, pages 583–589. Elsevier, 2023.
Reaction times when driving
Description
Reaction times when driving, when using and not using a mobile phone.
Usage
data(ReactionTime)
Format
A data frame with 64 rows (each student) and 2 columns:
- Reaction
The reaction time, in milliseconds
- Group
Which group the student was in; one of Phone or Control
Source
Reported by: Agresti, Alan, and Christine A. Franklin. 2007. Statistics: The Art and Science of Learning from Data.
Agresti & Franklin claim the data comes from: Strayer, David L., and William A. Johnston. 2001. Driven to Distraction: Dual-Task Studies of Simulated Driving and Conversing on a Cellular Telephone. Psychological Science 12 (6):462–66
Molar weights of red deer
Description
The age and weight of molars in male red deer.
Usage
data(RedDeer)
Format
A data frame with 78 rows (each deer) and 2 columns:
- Age
The age of the deer, in years
- Weight
The weight of the first molar tooth, in g
Source
D. J. Hand, F. Daly, A. D. Lunn, K. J. McConway, and E. Ostrowski (1994) A Handbook of Small Data Sets, London: Chapman and Hall. Dataset 170.
References
The data originally come from: Holgate, P. 1965. Fitting a Straight Line to Data from a Truncated Population. Biometrics 21(3): 715–20
Biofiltration removal efficiency
Description
The removal efficiency in biofiltration.
Usage
data(Removal)
Format
A data frame with 32 rows (each experiment) and 2 columns:
- Removal
The removal efficiency, in percent
- Temp
The inlet temperature, in degrees C
Source
Exercise 12.109 in Devore, Jay L., and Kenneth N. Berk. 2007. Modern Mathematical Statistics with Applications. Thomson Higher Education
References
The data originally come from: Chitwood, Derek E., and Joseph S. Devinny. 2001. Treatment of Mixed Hydrogen Sulfide and Organic Vapors in a Rock Medium Biofilter. Water Environment Research 73 (4): 426–35.
Rip identification
Description
Whether people of given age groups can correctly identify ocean rips.
Usage
data(RipsID)
Format
A data frame with 8 rows and 3 columns:
- AgeGroup
The age group of the person; one of 1 (18 to 24), 2 (25 to 34), 3 (35 to 50) or 4 (51 to 65)
- Identification
Whether the person correctly identified a rip from a picture; one of 1 (correctly) or 2 (incorrectly)
- Counts
The number of people in each cell
Details
The data provide the number of people correctly identifying a rip from a photo, by age group.
Source
Diez-Fernández, P., Ruibal-Lista, B., Lobato-Alejano, F., & López-García, S. (2023). 'Rip current knowledge: do people really know its danger? Do lifeguards know more than the general public?'. Heliyon, 9(7).
Running data
Description
The reliability of vertical oscillation measurements in wearable devices for running.
Usage
data(Running)
Format
A data frame with 150 rows (15 participants by 10 reps each) and 8 columns:
- ID
The participant ID
- Trial
Which trial; one of 1 to 5
- Speed
The average running speed, in km/h
- HRM
The vertical oscillation (VO) as measured by the Garmin Heart Rate Monitor-Pro (HRM), in cm
- NOVA
The VO as measured by the INCUS NOVA device, in cm
- RDP
The VO as measured by the Garmin Running Dynamics Pod (RDP), in cm
- Footpod
The VO as measured by the Stryd Running Power Meter Footpod (Footpod), in cm
- Video
The VO as measured by video analysis, in cm
Source
From Tables 1 and 5 of:
Smith, Craig P. and Fullerton, Elliott and Walton, Liam and Funnell, Emelia and Pantazis, Dimitrios and Lugo, Heinz (2022). The validity and reliability of wearable devices for the measurement of vertical oscillation for running. Plos One, 17 (11), p. e0277810.
Soft drink delivery
Description
The time taken to deliver soft drinks to vending machines.
Usage
data(SDrink)
Format
A data frame with 25 rows (each delivery) and 3 columns:
- Time
The time taken to service the vending machine, in minutes
- Cases
The number of cases of soft drink stocked
- Distance
The distance walked by the driver to service the vending machine, in feet
Source
The data were obtained electronically from OzDASL <http://www.statsci.org/data/>.
References
D. C. Montgomery and E. A. Peck (1992). Introduction to Regression Analysis. Wiley, New York. Example 4.1
Sand dollars
Description
Details about reproduction of sand dollars.
Usage
data(Sanddollars)
Format
A data frame with 36 rows (each experiment) and 4 columns:
- SD.temperatures
The temperature, in degrees C, where the sand dollar is located
- SD.fertilization
Sand dollar fertilization rates, in percent
- SD.speeds
Sperm swimming velocities, in micrometres per second
- SD.motility
Sperm motility
Source
Leuchtenberger, Sara Grace, Maris Daleo, Peter Gullickson, Andi Delgado, Carly Lo, and Michael T. Nishizaki. 2022. The Effects of Temperature and pH on the Reproductive Ecology of Sand Dollars and Sea Urchins: Impacts on Sperm Swimming and Fertilization. PLoS One 17 (12): e0276134
The data are available directly from: Nishizaki, Michael T., Sara Grace Leuchtenberger, Maris Daleo, Peter Gullickson, Andi Delgado, and Carly Lo. 2022. Echinoderm Sperm Swimming and Fertilization. Dryad. <https://doi.org/10.5061/dryad.jwstqjqbz>
Scar heights
Description
Scar heights for men and women.
Usage
data(ScarHeight)
Format
A data frame with 4 rows (each combination) and 3 columns:
- Counts
The number of people with the given combination
- Gender
The gender of the person; one of Women or Men
- ScarHt
The scar height; one of 0mm (i.e., smooth) or 1mm (i.e., 0mm to 1mm)
Source
Wallace, Hilary J., Mark W. Fear, Margaret M. Crowe, Lisa J. Martin, and Fiona M. Wood. 2017. Identification of Factors Predicting Scar Outcome After Burn in Adults: A Prospective Case-Control Study. Burns 43: 1271–83
Shopping bags
Description
Age of people, and whether they bring their own shopping bags.
Usage
data(ShoppingBags)
Format
A data frame with 6 rows and 3 columns:
- AgeGroup
The age group: 1 means '30 and under'; 2 means '31 to 40'; 3 means 'Over 40'
- BringBags
Whether people bring their own shopping bags or not; y means they do; n means they do not
- Counts
The number of people in each designated category
Source
From Tables 1 and 5 of: Choon, S. W., Tan, S. H., & Chong, L. L. (2017). The perception of households about solid waste management issues in Malaysia. Environment, Development and Sustainability, 19, 1685–1700.
Six-minute walk time tests
Description
Six-minute walk time data for two different walkway lengths.
Usage
data(SixMWT)
Format
A data frame with 50 rows (one per subject) and 3 columns:
- Dist20
The 6MWT distance in a 20m corridor, in m
- Dist30
The 6MWT distance in a 30m corridor, in m
- Age
The age of the subject, in completed years
Source
Saiphoklang, N., Pugongchai, A., & Leelasittikul, K. (2022). Comparison between 20 and 30 meters in walkway length affecting the 6-minute walk test in patients with chronic obstructive pulmonary disease: A randomized crossover study. Plos One, 17(1), e0262238.
Snakes
Description
Measurements of snakes, some of which eat crayfish, and some of which do not.
Usage
data(Snakes)
Format
A data frame with 28 rows (each snake) and 4 columns:
- Crayfish
Whether the snake lives in a crayfish region or not; one of Cfish or NoCfish
- Sex
The snake sex; one of male or female
- SVL
The snout-vent length, in cm
- Teeth
The number of maxillary teeth
Source
Javier Manjarrez, Constantino Macías Garcia, Hugh Drummond (2018). Data from: Morphological convergence in a Mexican garter snake associated with the ingestion of a novel prey [Dataset]. Dryad. https://doi.org/10.5061/dryad.mg152
References
Manjarrez, J., Macias Garcia, C., & Drummond, H. (2017). Morphological convergence in a Mexican garter snake associated with the ingestion of a novel prey. Ecology and Evolution, 7(18), 7178–7186.
Soil carbon and nitrogen
Description
Percentage of carbon and nitrogen in irrigated and non-irrigated plots.
Usage
data(SoilCN)
Format
A data frame with 28 rows (each plot) and 4 columns:
- IrrigatedC
The percentage carbon, in a paired irrigated plot
- NonirrigatedC
The percentage carbon, in a paired non-irrigated plot
- IrrigatedN
The percentage nitrogen, in a paired irrigated plot
- NonirrigatedN
The percentage nitrogen, in a paired non-irrigated plot
Source
Lambie, S. M., Mudge, P. L., & Stevenson, B. A. (2021). Microbial community composition and activity in paired irrigated and non-irrigated pastures in New Zealand. Soil Research, 60(4), 337–348.
Soil properties
Description
Properties of soil and the California Bearing Ratio.
Usage
data(Soils)
Format
A data frame with 16 rows (each sample) and 12 columns:
- Sample
An identifier
- Gravel
The percentage of gravel in the sample
- Sand
The percentage of sand in the sample
- Clay
The percentage of clay in the sample
- PI
Plasticity index (PI), a measure of the plasticity of the soil
- CBR
The California Bearing Ratio, a measure of flexibility, as a percentage
Source
Talukdar, Dilip Kumar. 2014. A Study of Correlation Between California Bearing Ratio (CBR) Value with Other Properties of Soil. International Journal of Emerging Technology and Advanced Engineering 4 (1): 559–62
Speed of vehicles
Description
Speeds of vehicles before and after adding additional signage.
Usage
data(Speed)
Format
A data frame with 79 rows (each vehicle) and 2 columns:
- When
When the speed is measured, relative to the new signage being added; one of Before or After
- Speed
The measured speed, in km/h
Source
Ma, Yongfeng, Wenbo Zhang, Xin Gu, and Jiguang Zhao. 2019. Impacts of Experimental Advisory Exit Speed Sign on Traffic Speeds for Freeway Exit Ramp. PLoS One 14 (11):e0225203
Stress before surgery
Description
Stress at two time-points before surgery.
Usage
data(Stress)
Format
A data frame with 19 rows and 2 columns:
- BeforeHours
beta-endorphin concentrations measured 12–14 hours before surgery, in fmol/ml
- BeforeMins
beta-endorphin concentrations measured 10 minutes before surgery, in fmol/ml
Source
D. J. Hand, F. Daly, A. D. Lunn, K. J. McConway, and E. Ostrowski (1994) A Handbook of Small Data Sets, London: Chapman and Hall. Dataset 232.
References
The original source is given as Hoaglin, D. C., Mosteller, F. and Tukey, J. W. 1985. Exploring data tables, trends and shapes. New York: John Wiley & Sons.
Students' weight changes
Description
Weights of students from Week 1 to Week 12 of semester.
Usage
data(StudentWt)
Format
A data frame with 68 rows (each student) and 4 columns:
- Student
An identifier
- WtWk1
The student's weight in Week 1, in kg
- WtWk12
The student's weight in Week 12, in kg
- GainWt
The student's weight gain, in kg
Source
David. n.d. DASL: Data and Story Library. <https://dasl.datadescription.com/datafile/freshman-15/>
References
Levitsky, D. A., Halbmaier, C. A., & Mrdjenovic, G. (2004). The freshman weight gain: a model for the study of the epidemic of obesity. International Journal of Obesity, 28(11), 1435–1442.
Students eating habits
Description
Where students live and where they eat most of their meals.
Usage
data(StudentsEat)
Format
A data frame with 183 rows (each student) and 2 columns:
- Meals
Where the student eats most of their meals; one of Most off-campus or Most on-campus
- Live
Where the student lives; one of Living with parents or Not living with parents
Source
Mann, Linda, and Karen Blotnicky. 2017. Influences of Physical Environments on University Student Eating Behaviors. International Journal of Health Sciences 5 (2): 42–52
Kinesio tape use
Description
The use of tapes to reduce pain.
Usage
data(Tape)
Format
A data frame with 16 rows (each individual) and 18 columns:
- Age
The age of the participant, in years
- Sex
The sex of the participant; one of 1 or 2, but what these values refer to is unknown
- Pre.Left.KT.NoTension
The pressure pain threshold (PPT) in the left arm, using Kinesio tape (KT), applied without tension: The level of pressure where pain was felt, in kPa
- Pre.Right.KT.NoTension
The PPT, in the right arm, using KT, 5 mins before application of KT, applied without tension: The level of pressure where pain was felt, in kPa
- Post1.Left.KT.NoTension
The PPT, in the left arm, using KT, 5 mins after application of KT, applied without tension: The level of pressure where pain was felt, in kPa
- Post1.Right.KT.NoTension
The PPT, in the right arm, using KT, 5 mins after application of KT, applied without tension: The level of pressure where pain was felt, in kPa
- Post2.Left.KT.NoTension
The PPT, in the left arm, using KT, 15–20 mins after application of KT, applied without tension: The level of pressure where pain was felt, in kPa
- Post2.Right.KT.NoTension
The PPT, in the right arm, using KT, 15–20 mins after application of KT, applied without tension: The level of pressure where pain was felt, in kPa
- Post1.Left.75KT.Tension
The PPT, in the left arm, using KT, 5 mins after application of KT, applied with 75% tension: The level of pressure where pain was felt, in kPa
- Post1.Right.75KT.Tension
The PPT, in the right arm, using KT, 5 mins after application of KT, applied with 75% tension: The level of pressure where pain was felt, in kPa
- Post2.Left.75KT.Tension
The PPT, in the left arm, using KT, 15–20 mins after application of KT, applied with 75% tension: The level of pressure where pain was felt, in kPa
- Post2.Right.75KT.Tension
The PPT, in the right arm, using KT, 15–20 mins after application of KT, applied with 75% tension: The level of pressure where pain was felt, in kPa
- Pre.Left.NoTape
The PPT, in the left arm, using no tape: The level of pressure where pain was felt, in kPa
- Pre.Right.NoTape
The PPT, in the right arm, using no tape: The level of pressure where pain was felt, in kPa
- Post1.Left.NoTape
The PPT, in the left arm, using no tape, 10 minutes after first test: The level of pressure where pain was felt, in kPa
- Post1.Right.NoTape
The PPT, in the right arm, using no tape, 10 minutes after first test: The level of pressure where pain was felt, in kPa
- Post2.Left.NoTape
The PPT, in the left arm, using no tape, 20–35 minutes after first test: The level of pressure where pain was felt, in kPa
- Post2.Right.NoTape
The PPT, in the right arm, using no tape, 20–35 minutes after first test: The level of pressure where pain was felt, in kPa
Source
Naugle, K. E., Hackett, J., Aqeel, D., & Naugle, K. M. (2021). "Effect of different Kinesio tape tensions on experimentally-induced thermal and muscle pain in healthy adults." PloS One, 16(11), e0259433.
Throttles
Description
Throttle and manifold air pressure.
Usage
data(Throttle)
Format
A data frame with 68 rows (each observation) and 2 columns:
- ThrottleAngle
The throttle angle, in degrees
- MAPvalue
The manifold air pressure, as a fraction of the maximum value
Source
Amin, Arslan Ahmed, and Khalid Mahmood-ul-Hasan. 2019. Robust Active Fault-Tolerant Control for Internal Combustion Gas Engine for Air-Fuel Ratio Control with Statistical Regression-Based Observer Model. Measurement and Control, 0020294018823031
Turbine fissures
Description
Fissure cracks appearing in turbines.
Usage
data(Turbines)
Format
A data frame with 4 rows and 3 columns:
- Hours
The approximate number of hours run by these turbines
- Turbines
The number of turbines run for the indicated number of hours
- Fissures
The number of fissure cracks in the turbines
Details
The data provide the number of turbines, and the number of those with fissure cracks, for an approximate number of hours of run-time.
A two-way table of the data as given is not appropriate; Turbines includes all turbines, including those given in Fissures.
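Because Turbines counts all turbines, the number without fissures must be found by subtraction before a two-way table can be formed. A minimal sketch follows, using a hypothetical data frame turb with made-up counts (not the Turbines data) in the same layout; the names NoFissures, two_way and prop_fissures are also hypothetical.

```r
# Hypothetical counts for illustration only; these are NOT the Turbines data.
turb <- data.frame(
  Hours    = c(500, 1000, 1500, 2000),
  Turbines = c(40, 50, 30, 70),
  Fissures = c(1, 5, 6, 21)
)

# Turbines counts ALL turbines, so the number without fissures
# is found by subtraction
turb$NoFissures <- turb$Turbines - turb$Fissures

# An appropriate two-way table: run-time (rows) by fissure status (columns)
two_way <- as.matrix(turb[, c("Fissures", "NoFissures")])
rownames(two_way) <- turb$Hours
two_way

# Proportion of turbines with fissures at each run-time
prop_fissures <- turb$Fissures / turb$Turbines
round(prop_fissures, 3)
```

Each row of two_way then sums to the total number of turbines run for that number of hours.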
Source
Raymond H. Myers, Douglas C. Montgomery, and G. Geoffrey Vining (2002). Generalized linear models with applications in engineering and the sciences, Wiley.
Turtle nests
Description
Infected and non-infected turtle nests, and whether the nests were relocated.
Usage
data(TurtleNests)
Format
A data frame with 4 rows and 3 columns:
- Infected
Whether the nest was infected with fungi or bacteria; one of 0 (not infected) or 1 (infected)
- Nest
Whether the nest was relocated; one of 0 (natural, not relocated) or 1 (relocated)
- Counts
The number of nests in the combination defined by Infected and Nest
Details
The data provide the number of nests from Mediterranean loggerhead turtles that had fungal or bacterial infections. Some nests are relocated because of the risk of tidal inundation; the researchers were interested in whether relocation was related to the probability of infection.
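Because the data are stored as one row per combination of Infected and Nest, a two-way table can be rebuilt from the Counts column. The sketch below assumes this layout; `nests` is a hypothetical stand-in whose counts are invented placeholders, not the package data.

```r
# Hypothetical stand-in with the same columns as TurtleNests;
# the counts are invented placeholders, NOT the package data.
nests <- data.frame(
  Infected = c(0, 0, 1, 1),
  Nest     = c(0, 1, 0, 1),
  Counts   = c(10, 12, 8, 15)
)

# Rebuild the two-way table from the counts:
# rows are infection status, columns are nest type.
tab <- xtabs(Counts ~ Infected + Nest, data = nests)
tab

# Proportion infected within each nest type (each column sums to 1).
prop.table(tab, margin = 2)
```

With the package data loaded through data(TurtleNests), the same calls should apply with TurtleNests in place of `nests`.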
Source
Candan, Ahmet Yavuz, Katilmis, Yusuf and Ergin, Cagri (2021). "First report of Fusarium species occurrence in loggerhead sea turtle (Caretta caretta) nests and hatchling success in Iztuzu Beach, Turkey". Biologia, 76, 565–573.
Typing speeds
Description
Typing speeds and accuracy.
Usage
data(Typing)
Format
A data frame with 1301 rows (one for each student) and 5 columns:
- Subject
The subject number
- mTS
The mean typing speed (wpm) for each subject
- mAcc
The mean typing accuracy for each subject
- Age
The age, in completed years
- Sex
The sex of the subject; one of female or male
Details
Typing speeds measured online for students.
Source
https://osf.io/v92fy/files/osfstorage?view_only=87885752038b4be190d532143fdedb07
References
Pinet, Svetlana, Christelle Zielinski, F.-Xavier Alario, and Marieke Longcamp. Typing Expertise in a Large Student Population. Cognitive Research: Principles and Implications 7, no. 1 (August 5, 2022): 77. https://doi.org/10.1186/s41235-022-00424-3.
Wheelchair tennis
Description
The push time for wheelchair tennis players, with and without holding a racquet.
Usage
data(WCTennis)
Format
A data frame with 13 rows (each player) and 3 columns:
- Person
The person
- PTwith
The push time, when holding a racquet; in seconds
- PTwithout
The push time, without holding a racquet; in seconds
Source
I. Alberca, 2016, Kinetic and temporal parameters calculated from raw data collected via wireless instrumented wheel for measuring 3D pushrim kinetics of a racing wheelchair, https://doi.org/10.17026/dans-xjf-bs8v, DANS Data Station Life Sciences, V1.
References
Alberca, I., Chénier, F., Astier, M., Watelain, E., Vallier, J. M., Pradon, D., & Faupin, A. (2022). Sprint performance and force application of tennis players during manual wheelchair propulsion with and without holding a tennis racket. PLoS ONE, 17(2), e0263392.
Water access
Description
Water access for households in West Cameroon.
Usage
data(WaterAccess)
Format
A data frame with 150 rows (15 participants by 10 reps each) and 12 columns:
- Region
The region; one of Mbeng, Mbih or Ntsingbeu
- Age
The age of the woman in the household, in years
- Education
The level of education of the woman; one of Primary or less, or Secondary or higher
- SourceDistance
The distance to the water source; one of Under 100m, 100m to 1000m or Over 1000m
- SourceQueueTime
The queuing time at the water source; one of Under 5 min, 5 to 15 min or Over 15 min
- HasGarden
Whether the household has a farming garden; one of Y or N
- HasLivestock
Whether the household keeps livestock; one of Y or N
- HouseholdPeople
The number of people in the household
- HouseholdUnder5s
The number of people under 5 in the household
- WaterSource
The water source; one of Tap, Bore, Well or River
- WashContainer
How often the water container is washed; one of Before each fill, Once per week or Once per month
- Diarrhea
Whether a child has had diarrhoea in the last two weeks; one of Y or N
Source
Nounkeu, C. D., Metapi, Y. D., Ouabo, F. K., Kamguem, A. S. T., Nono, B., Azza, N., Leumeni, P., Nguefack-Tsague, G., Todem, D., Dharod, J. M., & Kuate, D. (2022). "Assessment of drinking water access and household water insecurity: A cross sectional study in three rural communities of the Menoua division, West Cameroon". PLOS Water, 1(8), e0000029.
Windmill and current
Description
The amount of direct current (DC) output from windmills for varying wind velocities.
Usage
data(Windmill)
Format
A data frame with 25 rows (each windmill) and 2 columns:
- Wind
The wind velocity, in miles per hour
- DC
The DC output
Source
G. Joglekar, J. H. Schuenemeyer and V. LaRicca (1989) Lack-of-fit testing when replicates are not available. American Statistician, 43, 135–143.
References
D. J. Hand, F. Daly, A. D. Lunn, K. J. McConway, and E. Ostrowski (1994). A Handbook of Small Data Sets, London: Chapman and Hall. Dataset 271.
D. C. Montgomery and E. A. Peck (1982). Introduction to Linear Regression Analysis. New York: John Wiley.
Yield of onions
Description
The mean yields per plant for three onion varieties.
Usage
data(YieldDen)
Format
A data frame with 30 rows (each plant) and 3 columns:
- Yield
The yield per plant, in grams
- Dens
The planting density, in plants per square foot
- Var
The variety; one of 1, 2 or 3
Source
R. Mead (1970). Plant density and crop yield. Applied Statistics, 19(1), 64–81.