What is Stata?

Stata is a general-purpose statistical package that provides data management, statistical analysis, graphics, simulations, regression, and custom programming.

Stata is licensed by StataCorp (www.stata.com). ITS subscribes to a Stata/MP license to support the research projects of HKU staff and students, so that their Stata computations can be sped up with multicore parallelism on the HPC2021 cluster system. See the Stata/MP Performance Report for a complete assessment of Stata/MP’s performance, including command-by-command statistics.

Permission to use Stata on HPC2021

To run Stata on the HPC systems, a user must be a member of the “stata” group. Please send your request to group-its-hpc@hku.hk to arrange access.

Note on the Stata license: ITS HPC has purchased a license for the HPC2021 cluster only, and the cluster is for research work only (i.e., not for group projects or coursework). We DO NOT have a license for other computers (e.g. your personal computer or computers owned by other HKU departments) for any purpose, research or otherwise. For those cases, please contact your course teacher or departmental IT administrator.

 

Using Stata on the HPC system

To use Stata on the HPC cluster system, you must first load the module to set up the required environment variables.

System     Stata version    Command
HPC2021    16.1             module load stata/16.1
HPC2021    17.0             module load stata/17.0
HPC2021    18.0             module load stata/18.0
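
As a minimal sketch of the step above, the following loads one of the listed versions and confirms that the Stata/MP binary is on the PATH. The STATA_VERSION variable is a hypothetical convenience introduced here, not something the cluster defines, and the fallback message is only there so the snippet fails gracefully off the cluster.

```shell
#!/bin/bash
# STATA_VERSION is a hypothetical variable for this sketch; pick 16.1, 17.0 or 18.0.
STATA_VERSION="${STATA_VERSION:-18.0}"

if command -v module >/dev/null 2>&1; then
    module load "stata/${STATA_VERSION}"     # set up the Stata environment variables
    module list 2>&1 | grep -i stata         # confirm the module is loaded
    command -v stata-mp                      # print the path of the multicore CLI binary
else
    echo "module command not found: run this on the HPC2021 cluster"
fi
```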

Stata Commands

Command      Description
stata        The command-line interface (CLI) version of Stata
stata-mp     The multicore CLI version of Stata
xstata       The graphical user interface (GUI) version of Stata
xstata-mp    The multicore GUI version of Stata

The Stata/MP16 3-user network license allows at most 3 concurrent users to run Stata simulations, each with a maximum of 16 cores per session. Users must request a license either from the command line (interactive session) or as part of the batch script (batch mode). If no license is available, the job is held pending until one becomes available.
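
For the interactive case, one possible way to request a license from the command line is sketched below. It assumes that interactive sessions on the cluster are started with srun and that the license is named stata, as in the batch script in this guide; check with ITS if your site uses a different procedure. SRUN_OPTS is a hypothetical variable used only to keep the options in one place.

```shell
#!/bin/bash
# Options mirror the batch script in this guide: 1 node, 16 cores, 1 Stata license.
# SRUN_OPTS is a hypothetical helper variable for this sketch.
SRUN_OPTS="--nodes=1 --ntasks-per-node=16 --licenses=stata:1"

if command -v srun >/dev/null 2>&1; then
    scontrol show licenses            # show total, used and free counts per license
    # Word splitting of $SRUN_OPTS is intentional here.
    srun $SRUN_OPTS --pty bash        # waits until a license and the cores are free
else
    echo "SLURM not found: run this on the HPC2021 cluster"
fi
```

Once the interactive shell starts on a compute node, load the Stata module and run stata-mp there.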

  1. Running Stata jobs with SLURM (batch mode)

    Do not run computationally intensive processes on the frontend nodes. Submit Stata simulations through the job scheduling system instead. Sample SLURM scripts for submitting Stata jobs are available at /share1/stata/sample/.

    #!/bin/bash
    #SBATCH --job-name=StataMP
    #SBATCH --nodes=1
    #SBATCH --tasks-per-node=16      # for Stata MP16 (max. 16 cores)
    #SBATCH --output=%x.out.%j       # Standard output
    #SBATCH --error=%x.err.%j        # Standard error
    #SBATCH --licenses=stata:1       # Use 1 Stata/MP license

    module load stata
    stata-mp -b do stata_parallel.do
    
  2. Graphical User Interface
    $ ssh -X <username>@hpc2021-io1.hku.hk
    $ module load stata/17.0
    $ xstata my-stata.do

    Stata is also available through the HPC-one web portal. Please visit https://hpc.hku.hk/guide/hpc-one-web-portal/ for details.
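
To make the batch mode in item 1 more concrete, here is a minimal sketch that writes a small do-file and submits the job. The do-file contents are illustrative only (they use Stata's bundled auto dataset), and stata_job.sh is a hypothetical name for the file in which you saved the SLURM script shown in item 1.

```shell
#!/bin/bash
# Write an illustrative do-file; the analysis itself is only a placeholder.
cat > stata_parallel.do <<'EOF'
* Report how many cores this Stata/MP session is using
display "Cores in use: " c(processors)
sysuse auto, clear
regress price mpg weight
EOF

# stata_job.sh is a hypothetical file name for the SLURM script from item 1.
if command -v sbatch >/dev/null 2>&1 && [ -f stata_job.sh ]; then
    sbatch stata_job.sh
    squeue -u "$USER"                 # check whether the job is pending or running
else
    echo "sbatch or stata_job.sh not found: nothing submitted"
fi
```

In batch mode (-b), Stata writes its results to a log file named after the do-file, here stata_parallel.log, in the directory from which the job was started.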

 

Package Management

Installing Packages

Stata users on HPC2021 should install packages from the IO nodes (i.e., hpc2021-io1 or hpc2021-io2), because the compute nodes are not allowed to connect to the Internet. Packages from the widely used SSC (Statistical Software Components) repository can be installed directly with the ssc command (note that the “.” is the Stata command prompt, not part of the command). For example, to install a package called summtab, type:

. ssc install summtab

By default, packages are installed into your home directory (/home/username/ado/).

Besides SSC, you may find code on GitHub that you want to use. In that case, consider installing a module named github, which provides a command of the same name:

. net install github, from("https://haghish.github.io/github/")

After this, you can install packages with the following syntax (for example, the parallel module at https://github.com/gvegayon/parallel):

. github install gvegayon/parallel

The github module provides its own package management capabilities. Refer to its documentation for details.

Not all Stata developers who publish their code on GitHub follow this practice, so if the github command does not work for your package, you can use the more verbose approach. First locate the package’s stata.toc file (for example, under https://github.com/mcaceresb/stata-gtools/tree/master/build ), then use Stata’s built-in net command (note the change from github.com to raw.githubusercontent.com):

. net install gtools, from("https://raw.githubusercontent.com/mcaceresb/stata-gtools/master/build/")

Listing Installed Packages

Installed packages can be listed with the Stata built-in command ado.

Removing Installed Packages

Installed packages can be removed with the Stata built-in command ado uninstall modulename.
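
The package commands in this section can also be collected in a do-file and run non-interactively on an IO node. The sketch below is one way to do that; install_pkgs.do is a hypothetical file name, and summtab is simply the example package used earlier in this section.

```shell
#!/bin/bash
# Write a do-file that installs, lists and describes packages.
# install_pkgs.do is a hypothetical file name for this sketch.
cat > install_pkgs.do <<'EOF'
sysdir                          // show the directories Stata searches, including PLUS
ssc install summtab, replace    // install (or update) a package from SSC
ado dir                         // list the installed packages
ado describe summtab            // show the details of one installed package
* ado uninstall summtab         // remove the leading "*" to uninstall the package
EOF

# Run on an IO node (hpc2021-io1 or hpc2021-io2), which has Internet access.
if command -v stata >/dev/null 2>&1; then
    stata -b do install_pkgs.do       # output goes to install_pkgs.log
else
    echo "stata not found: load the Stata module on an IO node first"
fi
```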

Additional Information

Stata website: https://www.stata.com/
Stata official documentation: https://www.stata.com/features/documentation/
Stata webinars: https://www.stata.com/training/webinar/
Stata tutorials on YouTube: https://www.youtube.com/user/statacorp