Stata Notes

Variable Management

Summary Example Command Notes
Renaming variables
rename q8 uid
Renames the variable q8 to uid
Deleting Variables
drop q8
 
View typing info
describe uid
 
Specify missing values
mvdecode *, mv(-1)
Recodes every -1 in every variable as a missing value

Only applies to numeric variables; string variables are skipped

View frequency of data values
codebook numrecalledtask
Mostly of interest for non-string variables
Create a set of labels for a variable's values
label define percent 0 "a" 1 "b" 2 "c"
Defines a set of labels named "percent". Once the set is applied to a variable, the value 0 is shown as "a" in all printouts.
Destroy a label set
label drop percent
 
Apply labels to a variable's values 
label values percentagecovered percent
 
Remove labels from a variable's values
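label values percentagecovered .
The period detaches whatever label set is attached to percentagecovered. The label set itself still exists until it is dropped.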
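The label commands above fit together as define, apply, detach, drop. The following is a minimal sketch of that sequence, assuming the auto dataset that ships with Stata; the variable expensive and the label set yesno are made up for illustration.

```stata
* Load a dataset that ships with Stata
sysuse auto, clear

* Hypothetical 0/1 variable: 1 when the car costs more than 6000
generate byte expensive = price > 6000

* Define a label set, then attach it to the variable
label define yesno 0 "no" 1 "yes"
label values expensive yesno

* The table now shows "no"/"yes" instead of 0/1
tabulate expensive

* Detach the label set from the variable, then destroy the set
label values expensive .
label drop yesno
```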

Altering Data

Summary Example Command Notes
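Saving changes
save, replace
Writes the data in memory back to the file it was loaded from. Run this before leaving Stata or loading other data, or the changes will be lost. preserve, by contrast, does not save anything to disk: it takes a temporary snapshot that is restored automatically when the "do" file ends, which undoes the changes made after it.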
Changing values
replace q8 = "3268" in 11
Sets q8 to "3268" in the 11th observation. The quotes are needed because q8 is a string variable here.
Combining Datasets
merge uid using "2nd.dta" "3rd.dta"
One dataset (the master) must be loaded before merging with other datasets.  With this syntax, the master and every using dataset must be sorted by the merge variable (e.g. "uid") before executing the merge.

The merge variable should have a compatible type (e.g. "int") in all datasets; in particular, a string variable cannot be matched against a numeric one.

Merging creates a variable named _merge with this meaning:

_merge==1 obs. from master data 
_merge==2 obs. from only one using dataset 
_merge==3 obs. from at least two datasets, master or using
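To make the _merge codes concrete, here is a small sketch with one using dataset. The data, the tempfile name second, and the variables name and score are all invented for illustration. It uses the same older merge syntax as the notes; Stata 11 and later prefer the form shown in the comment.

```stata
* Build a small "using" dataset, sort it by the merge variable, and save it
clear
input int uid str5 name
1 "ann"
2 "bob"
3 "cat"
end
sort uid
tempfile second
save `second'

* Build the master dataset and sort it by the same variable
clear
input int uid score
2 10
3 20
4 30
end
sort uid

* Old-style match merge, as in the notes above.
* In Stata 11 and later the preferred form is: merge 1:1 uid using `second'
merge uid using `second'

* uid 4 exists only in the master, uid 1 only in the using data,
* and uid 2 and 3 exist in both
list uid name score _merge
tabulate _merge
```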

Sorting
sort uid
Converting from a string
destring q8, generate(uid)
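The new variable is stored in the smallest numeric type that holds the values, so it is not necessarily int.  The data must already be entirely numeric: without the force option, a variable containing nonnumeric characters is left unconverted.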
Converting from a string, with nonnumerical data replaced by missing values
destring q65, generate(foo) force
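A minimal sketch of what force does, using a made-up string variable with one nonnumeric entry (the values are invented for illustration):

```stata
* Hypothetical string variable with one entry that is not a number
clear
input str6 q65
"12"
"7"
"n/a"
"40"
end

* Without force this variable would be left as a string.
* With force, "n/a" becomes a missing value in the new variable foo.
destring q65, generate(foo) force

list q65 foo
describe foo
```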

Viewing Data

Summary Example Command Notes
Query variable values
list uid if _merge == 1
Conditional query example: lists uid only for observations where _merge equals 1
Count the number of observations
count
Counts all observations
Count the number of observations satisfying a given condition
count if _merge1 > 1
Counts the observations where _merge1 is greater than 1
     

Statistics

Summary Example Command Notes
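Chi-squared goodness-of-fit test
csgof q34, expperc(6.7 24.2 38.2 24.2 6.7)
The expperc() option gives the expected percentage of observations in each category; the values shown are the percentages expected under a normal distribution split into five categories. csgof is a user-written command, so it must be installed before use.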
Frequency counts
tabulate q34
Provides frequency counts, percents, and cumulative percents
Return values
return list
ereturn list
sreturn list
Lists the values stored by the last calculation: return list for r-class (general) commands, ereturn list for e-class (estimation) commands, and sreturn list for s-class commands
Retrieve a return value
display r(chi2)
Shows a single stored result, here the chi-squared statistic from the preceding command
Correlate two variables
corr q72 q79
 
Chi-square test for independence
tabulate q62 q67, chi2
 
Regress one variable on another
reg a b
a is the dependent variable, b the predictor
Regress with a categorical variable, expanding it into binary (indicator) variables
xi: reg a i.b
 
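Return values are easiest to understand right after the command that produces them. This sketch assumes the auto dataset that ships with Stata, not the survey data in these notes, and shows the chi-square test followed by a look at its stored results.

```stata
* Load a dataset that ships with Stata
sysuse auto, clear

* Chi-square test for independence between two categorical variables
tabulate foreign rep78, chi2

* Show everything the tabulate command left behind in r()
return list

* Pull out individual results: the statistic and its p-value
display r(chi2)
display r(p)
```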
     

Programming

Summary Example Command Notes
Repeating a command over several variables
foreach x of varlist q18 q19 q20 q21 q22 q23 q24 q25 q26 q27 q28 q29 q31 q32 q33 q34 q35 q36 q41 q42 q43 q44 q45 q46 q47 q48 q49 q50 q51 q52 q113 q114 q115 q116 q117 q118 q119 q120 q121 q122 q123 q124 {
    csgof `x', expperc(6.7 24.2 38.2 24.2 6.7)
}
Repeats the command csgof once for each variable in the list; inside the loop, `x' stands for the current variable.
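Suppress the output of a calculation
quietly count if `x' == .
Comparisons use ==, not a single =. The count is still stored in r(N) even though nothing is printed.
Display a string
display "\begin{tabular}"
 
Format a value
display %9.2f 22/7
Shows 22/7 (an approximation of pi) rounded to two decimal places, in a field nine characters wide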
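The loop, quietly, and display pieces combine naturally into a short report. A minimal sketch, assuming the auto dataset that ships with Stata rather than the survey variables used in these notes:

```stata
* Load a dataset that ships with Stata
sysuse auto, clear

* For each variable, count missing values silently and print one line
foreach x of varlist price mpg rep78 {
    quietly count if `x' == .
    display "`x': " r(N) " missing"
}

* Format a number: two decimal places in a field nine characters wide
display %9.2f 22/7
```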

 

Using other packages

Summary Example Command Notes
open a connection to download packages
net from URL
the site needs
Install a package from the current connection
net install tab_chi
 
List installed packages
ado
 