help dthaz
-------------------------------------------------------------------------------------------------------------------------------------------------------------
Title
dthaz -- Discrete-time hazard and survival probability estimates
Syntax
dthaz [indepvars] [if] [in] [weight] [, options]
options               description
-------------------------------------------------------------------------------------------------------------------------------------------------------
Model
  specify(numlist)    specify values for predicted population values
  tpar(#)             select alternative parameterizations of time
  truncate(#)         truncate the maximum length of time to event
  pretrunc(#)         ignore some initial time periods in the model
  link(linkname)      link function; default is logit hazard
SE/Robust
  cluster(varname)    adjust standard errors for intragroup correlation
Reporting
  display(#)          limit the maximum displayed period
  level(#)            set confidence level; default is level(95)
  model               output model estimate
  suppress            switch off dthaz output
Graph options
  graph(#)            conditional hazard, survival, or cumulative incidence curves
  twoway_options      graph twoway options
Miscellaneous
  copyleft            display license information
-------------------------------------------------------------------------------------------------------------------------------------------------------
linkname description
-------------------------------------------------------------------------------------------------------------------------------------------------------
logit logit hazard
probit probit hazard
cloglog complimentary log log hazard
-------------------------------------------------------------------------------------------------------------------------------------------------------
fweights, iweights, and pweights are allowed; see weight.
Description
dthaz estimates the hazard and survival probabilities of the population, given the specified model, by means of a logit link (default), a probit link,
or a complementary log-log link. This program requires data in person-period format; person-period variables may be created using prsnperd.
Typed with no varlist and with no tpar() option, dthaz estimates baseline conditional hazard (h) and survival probabilities (S) for the sample. These
estimates correspond exactly with actuarial estimates of sample hazard and sample survival functions. Specifying numeric predictors in varlist and the
required set of associated values with the specify() option adds them to the model as follows (for logit hazard):
h_i = 1/(1+e^-(a_i*d_i + BX_i))
Where:
a_i is the effect of the ith time period, d_i,
B is a vector of effects for a vector of predictors X_i during the ith time period, and
S_i = (1-h_1)*(1-h_2)* ... *(1-h_i).
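As an illustration of the logit hazard and survival formulas above, the following Python sketch works through the arithmetic for three periods. Python is used here only to show the calculation; it is not part of dthaz. The period effects a, the predictor effect B, and the predictor value x are hypothetical values chosen for the example, not dthaz output.

```python
import math

def logit_hazard(a_i, B, x):
    """h_i = 1/(1 + e^-(a_i + B.x)) for the period whose indicator d_i equals 1."""
    linear = a_i + sum(b * v for b, v in zip(B, x))
    return 1.0 / (1.0 + math.exp(-linear))

def survival(hazards):
    """S_i = (1-h_1)*(1-h_2)*...*(1-h_i), returned for every period i."""
    out, running = [], 1.0
    for h_i in hazards:
        running *= 1.0 - h_i
        out.append(running)
    return out

# Hypothetical period effects and one predictor effect (assumed values).
a = [-2.0, -1.5, -1.0]
B = [0.5]
x = [1.0]  # the value that would be passed to specify()

h = [logit_hazard(a_i, B, x) for a_i in a]
S = survival(h)
for i, (h_i, S_i) in enumerate(zip(h, S), start=1):
    print(f"period {i}: h = {h_i:.4f}, S = {S_i:.4f}")
```

Because each S_i multiplies in one more factor (1-h_i) that lies between 0 and 1, the survival probabilities can never increase from one period to the next.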
The reported conditional hazard and survival probabilities are accompanied by standard errors approximated using a first order application of the delta
method (Dinno and Kim, 2011). The current survival confidence intervals give suboptimal confidence coverage, and revised intervals will soon be
provided in an update. The normally approximated confidence intervals drawn using the graph() option are obtained by application of these standard
errors with the alpha specified by level().
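A normally approximated interval of the kind described above can be sketched as estimate +/- z*SE, where z is the standard normal quantile implied by level(). The Python fragment below is only an illustration of that arithmetic under stated assumptions: the estimate and standard error are made-up numbers, and the sketch does not clip the interval to [0, 1], which dthaz may or may not do.

```python
from statistics import NormalDist

def normal_ci(estimate, se, level=95):
    """Return (lower, upper) for estimate +/- z * se, with z set by level."""
    alpha = 1.0 - level / 100.0
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return estimate - z * se, estimate + z * se

# Hypothetical survival estimate and standard error (not dthaz output).
lower, upper = normal_ci(0.80, 0.03, level=95)
print(f"95% CI: [{lower:.4f}, {upper:.4f}]")
```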
Options
+-------+
----+ Model +------------------------------------------------------------------------------------------------------------------------------------------
specify(numlist) The user must specify the category of population members for which the hazard and survival estimates are to be calculated. Currently, if
specifications are made with this option, they must be made for each of the variables specified in varlist. Specifications may be separated by
spaces, commas or both.
tpar(#) The user may select alternative parameterizations of time. Such time parameterizations allow a parsimonious smoothing of the effects of time,
and are as follows:
-1 Fully discrete time parameterization. This setting is the default, and reflects unique effects of time for each period.
0 Constant time parameterization. This model constrains the effect of time to be constant across all periods. The model includes a prespecified
constant term, which is also used in the following models and permits model nesting.
N Polynomial time parameterization (N > 0). This model constrains the effect of time as a polynomial function of order N. If the representation of
time is over-specified (i.e., has more predictors than the number of periods in the dataset, or than the number the analysis has been truncated to),
then the user will be warned and the parameterization will be reset to its maximum. Lower order models nest within higher order ones.
-2 Root time parameterization. This model constrains the effect of time as a square-root function of period (plus constant plus linear terms).
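To make the four parameterizations concrete, the Python sketch below builds the time predictors for a single person-period row. The exact variable layout dthaz uses internally is not documented in this help file, so the layouts here are assumptions chosen to match the descriptions above: one indicator per period for -1, a constant alone for 0, a constant plus powers of the period up to N for a polynomial, and a constant plus linear plus square-root terms for -2. The function name time_terms is hypothetical, and the automatic reset for over-specified models is not reproduced.

```python
import math

def time_terms(period, tpar, n_periods):
    """Time predictors for one person-period row under a tpar()-style code.

    Assumed layouts (illustrative only, not taken from the dthaz source):
      -1 -> one indicator per period, no constant
       0 -> constant only
       N -> constant, t, t^2, ..., t^N   (N > 0)
      -2 -> constant, t, sqrt(t)
    """
    if tpar == -1:
        return [1.0 if period == j else 0.0 for j in range(1, n_periods + 1)]
    if tpar == 0:
        return [1.0]
    if tpar == -2:
        return [1.0, float(period), math.sqrt(period)]
    if tpar > 0:
        return [float(period) ** k for k in range(tpar + 1)]
    raise ValueError("unsupported tpar value")

# Show the predictors for period 4 of 5 under each parameterization.
for code in (-1, 0, 2, -2):
    print(code, time_terms(period=4, tpar=code, n_periods=5))
```

Under these assumed layouts the nesting mentioned above is visible directly: the terms for order N are a prefix of the terms for order N+1, and the constant-only model is a prefix of every polynomial.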
truncate(#) The user may truncate the maximum length of time to event to this number. Estimation censors data for time periods beyond this point.
Negative values and values greater than the maximum period value are ignored.
Note: Specifying this option for the baseline model will produce exactly the same estimates as for the untruncated model for the given periods, since
baseline estimates are always equal to the sample hazard and sample survival functions.
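The note above can be checked by hand on a small example. The Python sketch below computes the sample hazard in each period as events divided by persons at risk, then repeats the calculation after dropping person-period rows beyond a truncation point. The data are hypothetical, and treating truncation as simply discarding later rows is an assumption about how the censoring is carried out.

```python
from collections import defaultdict

def sample_hazard(rows):
    """Events divided by persons at risk, per period, from person-period rows."""
    at_risk, events = defaultdict(int), defaultdict(int)
    for _pid, period, event in rows:
        at_risk[period] += 1
        events[period] += event
    return {p: events[p] / at_risk[p] for p in sorted(at_risk)}

def truncate_rows(rows, max_period):
    """Drop person-period rows after max_period, censoring later time."""
    return [r for r in rows if r[1] <= max_period]

# Hypothetical person-period data: (id, period, event indicator).
rows = [
    (1, 1, 0), (1, 2, 0), (1, 3, 1),
    (2, 1, 1),
    (3, 1, 0), (3, 2, 1),
    (4, 1, 0), (4, 2, 0), (4, 3, 0),
]

full = sample_hazard(rows)
cut = sample_hazard(truncate_rows(rows, 2))
print("untruncated:", full)
print("truncated at 2:", cut)
```

The hazards for periods 1 and 2 are identical in both runs, since dropping later rows changes neither the number at risk nor the number of events in the retained periods.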
pretrunc(#) The user may discard early time periods from the new dataset. For example, when pre-truncating with a value of 2, the period that would be
indicated by _d3 becomes _d1 instead, and the value of _period would be decreased by 2. The dataset is preserved when using this option.
Note: Specifying values of truncate greater than the maximum value of length-to-event minus one (or specifying negative values) produces the same
dataset as one with no value of truncate specified. Also, truncate and pretrunc cannot be combined when their values would result in fewer than two
periods. Discrete-time survival analyses conducted upon pre-truncated datasets are, in effect, analyses conducted upon separate populations from the not
pre-truncated datasets if the conditional hazard during the pre-truncated periods is greater than zero. The author suggests that an analyst may desire to
perform a pre-truncated analysis either because there are no events during initial periods, or because she is interested in analyzing a surviving
sub-population at a later starting period. However, in cases where events occurred during the pre-truncated periods, a survival analysis cannot be said
to generalize to the population of the not pre-truncated dataset. In cases where events occur in initial periods, but are too few to
provide reliable estimates for these periods, the analyst should both employ a sensitivity analysis to describe differences between models on
pre-truncated and not pre-truncated datasets, and examine the characteristics of anomalous individuals; qualitative data may particularly help
illuminate how these persons differ from the majority of individuals who remain in the pre-truncated dataset.
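The renumbering performed by pretrunc(), and the reason a pre-truncated analysis describes a different population, can be sketched in a few lines of Python. The data are hypothetical and the function name pretruncate is made up for the illustration; the sketch assumes pre-truncation simply discards rows from the first k periods and shifts the remaining period values down by k.

```python
def pretruncate(rows, k):
    """Discard the first k periods and renumber the rest to start at 1."""
    return [(pid, period - k, event) for pid, period, event in rows if period > k]

# Hypothetical person-period data: (id, period, event indicator).
rows = [
    (1, 1, 0), (1, 2, 0), (1, 3, 1),
    (2, 1, 1),
    (3, 1, 0), (3, 2, 1),
    (4, 1, 0), (4, 2, 0), (4, 3, 0),
]

kept = pretruncate(rows, 2)
ids_before = {pid for pid, _period, _event in rows}
ids_after = {pid for pid, _period, _event in kept}

print("rows after pre-truncating 2 periods:", kept)
print("persons no longer represented:", sorted(ids_before - ids_after))
```

In this example the old period 3 becomes period 1, and the two persons whose events fell in the discarded periods drop out entirely, so the remaining rows describe only those who survived past period 2.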
link(linkname) switches between different models of the hazard function: logit, probit, or complementary log-log hazards. The default logit hazard model
is described above. The general discrete-time probit hazard model is:
h_i = Phi(a_i*d_i + BX_i)
Where Phi() is the cumulative distribution function of the standard normal distribution, and the parameters follow the same
conventions described for the logit hazard model above.
The cloglog link option produces estimates under an assumption of proportional hazards. The general discrete-time complementary log-log hazard model
is:
h_i = 1-exp(-exp(a_i*d_i + BX_i))
Where the parameters follow the same conventions described for the logit hazard model above.
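The three hazard models differ only in the function that maps the linear predictor a_i*d_i + BX_i to a probability. The Python sketch below evaluates all three for a few hypothetical values of the linear predictor, with Phi taken as the standard normal cumulative distribution function. It illustrates the formulas only and makes no claim about how dthaz fits the models.

```python
import math
from statistics import NormalDist

def hazard(linear, link="logit"):
    """Map a linear predictor to a conditional hazard under the named link."""
    if link == "logit":
        return 1.0 / (1.0 + math.exp(-linear))
    if link == "probit":
        return NormalDist().cdf(linear)  # Phi: standard normal CDF
    if link == "cloglog":
        return 1.0 - math.exp(-math.exp(linear))
    raise ValueError(f"unknown link: {link}")

# Hypothetical linear predictor values (assumed, not estimated).
for linear in (-2.0, -1.0, 0.0):
    values = {name: hazard(linear, name) for name in ("logit", "probit", "cloglog")}
    print(linear, {name: round(v, 4) for name, v in values.items()})
```

At a linear predictor of zero the logit and probit hazards both equal 0.5, while the complementary log-log hazard equals 1 - exp(-1), which reflects the asymmetry of that link.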
+-----------+
----+ SE/Robust +--------------------------------------------------------------------------------------------------------------------------------------
cluster(varname) The user may adjust the standard errors of the estimates for person-level (between person) variance in repeated measures designs by
specifying the id variable used to construct the person-period dataset.
+-----------+
----+ Reporting +--------------------------------------------------------------------------------------------------------------------------------------
display(#) The user may limit the maximum period for which hazard and survival probabilities are displayed to this number. This option only affects
which values are displayed. The estimated values returned in e(Hazard) remain as for the maximum period of the person-period dataset. Negative values
and values greater than the maximum period value are ignored.
level(#); see [R] estimation options.
model This option includes the estimated model in the output.
suppress Switches off dthaz output. Graphs still display if selected. The estimated model is displayed if the model option is turned on.
+---------------+
----+ Graph options +----------------------------------------------------------------------------------------------------------------------------------
graph(#) Users may opt to graph conditional hazard probabilities (1), survival probabilities (2), both (3), or cumulative incidence probabilities (4)
(i.e., 1 - survival) against discrete time periods. Graphing options available to grtwoway are available. The default setting is no graph.
Note: the graph() option does not yet plot confidence intervals in Stata 7.
+---------------+
----+ Miscellaneous +----------------------------------------------------------------------------------------------------------------------------------
copyleft dthaz is free software, licensed under the GPL. The copyleft option displays the copying permission statement for dthaz. The full license can
be obtained by typing:
. net describe dthaz, from (http://www.alexisdinno.com/stata)
and clicking on the "click here to get" link for the ancillary file.
Examples
. dthaz
. dthaz sex region, specify(0 6) truncate(6)
. dthaz sex educate class, sp(1, 12, 0) gr(3)
. dthaz party age, sp(0 1) model link(cloglog)
. dthaz, tp(3)
Author
Alexis Dinno
Portland State University
alexis dot dinno at pdx dot edu
Please contact me with any questions, bug reports or suggestions for improvement.
My thanks to Dr. Suzanne Graham.
References
Dinno A and Kim JS. 2011. "Approximating Confidence Intervals About Discrete-Time Survival/Cumulative Incidence Estimates Using the Delta Method."
Unpublished (manuscript available on request).
Singer JD and Willett JB. 2003. Applied Longitudinal Data Analysis: Modeling Change and Event Occurrence. Oxford, UK: Oxford University Press. 672 pages.
Willett JB and Singer JD. 1991. "From Whether to When: New Methods for Studying Student Dropout and Teacher Attrition." Review of Educational Research.
61: 407-450.
Singer JD and Willett JB. 1991. "Modeling the Days of Our Lives: Using Survival Analysis When Designing and Analyzing Longitudinal Studies of Duration
and Timing of Events." Psychological Bulletin. 110: 268-290.
Saved results
In addition to the results returned by the estimation commands logistic, probit and cloglog, dthaz saves the following in e():
Matrices
e(Hazard) Conditional hazard vector for the specified group
e(HazardSE) Standard error vector for the conditional hazards
e(Survival) Survival probability vector for the specified group
e(SurvivalSE) Standard error vector for the survival probabilities
Also See